A way forward with the JsonProvider #1478
FWIW, I've seen comments elsewhere about breaking the type providers for different data types into different packages, so maybe that's a place to start. Otherwise, just creating something new might help to avoid any legacy stuff altogether?
I went ahead and started a Here are some progress screenshots and side-by-side comparisons with the original An array is an array. It does not depend on the number of items it contains: |
@mlaily I love this work! Is your JsonProvider2 working? I guess these changes won't be merged here soon. Could you publish your own NuGet package?
@goswinr thanks for your interest. This new JSON provider is still a prototype: a lot of things are still changing or are not properly tested and can crash. My branch is available, but it's way too early for a release. Handling null or missing values properly is proving challenging, but I'm learning a lot and making progress. Ideally, I would also like to use System.Text.Json, but for now all my changes are based on the existing JsonValue parser. I'll worry about how to release it when it works well enough for production use :)
I just learnt about empty strings being inferred as null. Using this file as a sample:
{
"Lines": [
" **",
" *",
" ",
" ",
" ",
" ",
" *",
" *",
" ***** *",
" ******* ",
" ***** *",
" *",
" ",
" ",
" ",
" ",
" "
]
}
Now reading this file using the provider results in an array for
Very surprising and interesting to me that I now find this newly created issue. Are the nuances of the JsonProvider documented somewhere? For example, I don't find any mention of |
@dlidstrom does |
@goswinr It doesn't seem to help. Here is my test program and files:
#r "nuget: FSharp.Data"
open FSharp.Data
type Provider = JsonProvider<"sample.json", InferenceMode = InferenceMode.NoInference>
let sample = Provider.Load "test.json"
// show the Lines from json value and from generated type
printfn
    "%A\n%A"
    (sample.JsonValue.Properties() |> Array.find (fst >> ((=) "Lines")) |> snd)
    sample.Lines
sample.json:
{
"Lines": [
""
]
}
test.json:
{
"Lines": [
" **",
" *",
" ",
" ",
" ",
" ",
" *",
" *",
" ***** *",
" ******* ",
" ***** *",
" *",
" ",
" ",
" ",
" ",
" "
]
}
It looks like the provider reads all lines, as can be seen when inspecting the
PS C:\> dotnet fsi .\Scratch.fsx
[
" **",
" *",
" ",
" ",
" ",
" ",
" *",
" *",
" ***** *",
" ******* ",
" ***** *",
" *",
" ",
" ",
" ",
" ",
" "
]
[|" **"; " *"; " *"; " *";
  " ***** *"; " ******* "; " ***** *"; " *"|]
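One possible workaround, sketched here as an assumption rather than a recommended fix, is to bypass the generated property and read the strings from the underlying JsonValue. The inline sample and the name rawLines are made up for illustration; GetProperty, AsArray and AsString are the untyped JsonValue extension methods from FSharp.Data.

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// Illustrative inline document; in the thread this would be test.json.
let json = JsonValue.Parse """{ "Lines": [" *", " ", ""] }"""

// Walk the untyped tree instead of the provided `Lines` property,
// so that no inference-driven filtering is applied to the elements.
let rawLines =
    json.GetProperty("Lines").AsArray()
    |> Array.map (fun line -> line.AsString())

printfn "%A" rawLines
```

With the type provider, the same idea would start from sample.JsonValue instead of JsonValue.Parse.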
Yes, it seems Microsoft puts quite a lot of effort into System.Text.Json (Newtonsoft.Json would probably also be faster than a custom implementation). The System.Text.Json source code is open (https://github.com/dotnet/runtime/tree/main/src/libraries/System.Text.Json/src/System/Text/Json), and one of the maintainers, @eiriktsarpalis, has been active in the F# community. I don't know whether there would be any quick wins without rewriting a lot, or whether we even want FSharp.Data to depend on packages like System.Memory (to use Spans).
His advice was:
This makes sense: And now that FSharp.Data is split into multiple NuGet packages, maybe having FSharp.Data.Json.Core depend on System.Text.Json would be acceptable?
I haven't worked on it for a while but I actually have started to use System.Text.Json on my branch. I opted to use the It seemed the most fitting for a balance between (user)-convenience and performance for a general use library. The parsing somewhat works (not for all cases yet I think), but serialization is currently non-existent. One big problem I had that I decided to ignore for now is that I didn't manage to make the dependency on System.Text.Json work everywhere when using the compiled provider. Another thing is my Anyway if you are curious I pushed my current version to my branch (still in a somewhat PoC-state):
(side note: the |
I'm trying to get this working. So far:
Now I'm this far:
I forgot to mention another interesting thing about this: if you |
As far as I know: System.Text.Json is released on multiple frameworks: A type provider uses the runtime code at compile time. What happens after compilation is the easy part: if the user has referenced System.Text.Json in their project (or another dependency has), it will probably pick the correct runtime version anyway.
I tried to play with this version, but I think trying to expose the System.Text.Json types at design time gets very complex. What I would prefer is to keep using the current "obsolete" JSON-model mapping and add 2 methods:
That way, System.Text.Json types would not be exposed to the outside at all. Design time would work as it does now, without System.Text.Json, and runtime users would have the option of faster serialization if needed. Backward compatibility is kept. Just map the current JSON types to the Utf8JsonReader types with a simple match expression. And if even more independence is wanted, System.Text.Json.dll could be loaded via reflection, so that the methods simply throw an error if it is missing. That way no dependency on the actual library would be needed at the NuGet package level at all.
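As a rough illustration of the "simple match" idea for the serialization direction, here is a minimal sketch. The helpers writeJsonValue and toUtf8String are hypothetical names, not part of FSharp.Data, and the sketch assumes Utf8JsonWriter is the counterpart used for writing (the comment above names Utf8JsonReader, which covers the parsing direction). It also references System.Text.Json directly instead of loading it via reflection.

```fsharp
#r "nuget: FSharp.Data"

open System.IO
open System.Text
open System.Text.Json
open FSharp.Data

/// Hypothetical helper: walks a JsonValue tree and emits the
/// equivalent tokens on a Utf8JsonWriter, one match case per node kind.
let rec writeJsonValue (writer: Utf8JsonWriter) (value: JsonValue) : unit =
    match value with
    | JsonValue.Null -> writer.WriteNullValue()
    | JsonValue.Boolean b -> writer.WriteBooleanValue(b)
    | JsonValue.Number n -> writer.WriteNumberValue(n) // decimal overload
    | JsonValue.Float f -> writer.WriteNumberValue(f)  // double overload
    | JsonValue.String s -> writer.WriteStringValue(s)
    | JsonValue.Array items ->
        writer.WriteStartArray()
        for item in items do
            writeJsonValue writer item
        writer.WriteEndArray()
    | JsonValue.Record props ->
        writer.WriteStartObject()
        for (name, v) in props do
            writer.WritePropertyName(name)
            writeJsonValue writer v
        writer.WriteEndObject()

/// Hypothetical entry point: serializes a JsonValue to a compact JSON string.
let toUtf8String (value: JsonValue) : string =
    use stream = new MemoryStream()
    use writer = new Utf8JsonWriter(stream, JsonWriterOptions())
    writeJsonValue writer value
    writer.Flush()
    Encoding.UTF8.GetString(stream.ToArray())

let roundTripped =
    JsonValue.Parse """{ "a": [1, true, null, "x"] }""" |> toUtf8String
```

Because only JsonValue appears in the signature of toUtf8String, callers never see a System.Text.Json type, which is the property the proposal is after.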
I think exposing the underlying I don't like the idea of choosing an implementation based on a method name but without actually seeing anything else change. I agree adding STJ as a dependency of FSharp.Data isn't that great, but I don't see a good path around that... |
Hmm, maybe both ideas are doable. Maybe the "faster serializer" I was thinking of (not as good as raw STJ, as you pointed out) could be a separate NuGet package depending on both FSharp.Data and System.Text.Json. That way the current provider can stay as it is, and old projects could benefit with minimal migration.
Yeah, that would probably be ideal if we could make the providers independent. Even though I'm aware some work has already been done in that direction, I'm not sure how easy it will be.
Hello!
The more I use the JsonProvider and the more I work inside its code base, the more I feel it needs a big update.
I think some kind of "strict" mode with less (surprising/bad) magic is much needed.
(For instance, inferring empty strings as null made sense with other providers that don't have types other than strings, but I don't think it makes much sense with json. In the same vein, multiplicity inference for arrays doesn't feel natural at all with json...)
I also think it might be a good idea to use System.Text.Json types as the erased implementation instead of the custom parser it currently uses.
This is easier said than done of course, but first this raises a few questions.
What about backward compatibility?!
What would be the preferred approach for such a venture?
I'm thinking it might be best to create a new provider instead of trying to fit these changes into the existing one.
What's your opinion about that?
And generally speaking, do you have any recommendations regarding this?
Thanks.