Machine-readable dialect (not vocabulary) definition document #1423
Comments
You wanted me to respond specifically to the
paragraph here right? I don't think any changes need to be done for historical drafts as they work the way they currently work, any changes strictly apply to future releases. Isn't it a bit too early to think about precisely how to restructure the suite until we nail down how we want the feature to work (which is the purpose of this issue otherwise, no)? But if you're asking for agreement strictly on whether I think it's useful for us to have a way to have tests which specifically indicate they're testing validity of schemas, yes that I certainly already agree with, as we already have a use and need for such a thing given that the definitions of "valid schema in version XYZ" and "schema valid under XYZ's metaschema" are not the same (this was an old discussion about it, though we never made progress), and in general implementations don't do anything besides the latter because the former is complex and has no test cases -- so I think we already have this need today, and definitely am ok with it if it's even more needed in the future. Lemme know if I missed the point though on what you were hoping for feedback on. |
(Posting this separately from the above as it's not related to testing) but it seems odd to me at first glance to put validation in So for The rest of the ideas here I have to mull on I suppose -- I don't know that I understand from first read what advantage there is in changing EDIT: ok, I think you're trying to answer the latter with:
|
There was a purpose to decoupling the semantics of a vocabulary from its syntax. For example, if you want to create a dialect that uses the Moving the meta-schema into the vocabulary definition affects this property of the vocabulary system. It can still be done, but it would be a little different. You would need to include your custom syntax directly in the meta-schema. The result would be that both the default and the custom schema are applied. That can lead to some duplication, but might also be a good thing because it makes it impossible to define a syntax that contradicts the default (for example, using a type the original syntax doesn't support). (Personally, I always thought things like that should just be defined as distinct keywords. So, I'm not concerned if we end up losing the semantics/syntax decoupling in the end.)
I don't see I think having the schema in the vocabulary definition would effectively be the same thing except that I should mention that it's already the case that we need to handle validating a schema against a meta-schema differently than a normal instance against a schema. If validating a Compound Schema Document that includes an embedded schema with a different dialect than the parent, simple validation against a meta-schema doesn't work. You need to disassemble the bundle and validate each Schema Resource individually. One very important thing that I think the Vocabulary System is missing is the ability to declare the use of a keyword or vocabulary in the schema without needing to construct a whole new dialect. Constructing a custom dialect is too much to ask of users who just want to use one keyword in one schema. I'm not sure that introducing that functionality is something we can fit into the current vocabulary system. It might need drastic changes. So, my concern is, is this proposal an incremental improvement to a system that's ultimately a dead end? I think we need to take a step back, identify all the things we want out of a vocabulary system, and determine if the current approach is viable or we need to try something different. If we determine that it is viable, then I'd feel a lot better about working on incremental changes like this. |
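A rough Python sketch of the bundle point above: a Compound Schema Document is taken apart so that each Schema Resource can be validated against its own dialect's meta-schema. This is only an illustration under stated assumptions, not how any particular implementation does it. `split_resources` and its helpers are hypothetical names, every nested object carrying `$id` is treated as an embedded resource, and the set of subschema locations is hard-coded (a real implementation would derive it from the dialect in effect).

```python
# Assumption: only these keywords are treated as subschema locations.
MAP_KEYWORDS = {"$defs", "properties"}           # name -> schema
LIST_KEYWORDS = {"allOf", "anyOf", "oneOf"}      # [schema, ...]
SINGLE_KEYWORDS = {"items", "not", "additionalProperties"}


def _strip(schema, dialect, out):
    """Copy `schema`, replacing embedded resources with a $ref and collecting them."""
    if not isinstance(schema, dict):
        return schema  # boolean schemas have nothing embedded
    result = {}
    for key, value in schema.items():
        if key in MAP_KEYWORDS and isinstance(value, dict):
            result[key] = {n: _child(s, dialect, out) for n, s in value.items()}
        elif key in LIST_KEYWORDS and isinstance(value, list):
            result[key] = [_child(s, dialect, out) for s in value]
        elif key in SINGLE_KEYWORDS:
            result[key] = _child(value, dialect, out)
        else:
            result[key] = value
    return result


def _child(sub, dialect, out):
    if isinstance(sub, dict) and "$id" in sub:
        # Embedded resource: it is recorded separately with its own dialect.
        child_dialect = sub.get("$schema", dialect)
        out.append((child_dialect, _strip(sub, child_dialect, out)))
        return {"$ref": sub["$id"]}
    return _strip(sub, dialect, out)


def split_resources(document, default_dialect):
    """Return [(dialect_uri, resource), ...] with the root resource first."""
    out = []
    dialect = document.get("$schema", default_dialect)
    root = _strip(document, dialect, out)
    return [(dialect, root)] + out


bundle = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/parent",
    "properties": {"child": {"$ref": "https://example.com/child"}},
    "$defs": {
        "child": {
            "$schema": "http://json-schema.org/draft-07/schema#",
            "$id": "https://example.com/child",
            "type": "string",
        }
    },
}
resources = split_resources(bundle, "https://json-schema.org/draft/2020-12/schema")
for dialect, resource in resources:
    print(dialect, "->", resource["$id"])
```

Each `(dialect, resource)` pair could then be handed to whatever validates a schema against that dialect's meta-schema, which is the step plain instance validation of the whole bundle cannot do.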
Including that information is definitely part of this proposal; it's just an undefined part right now. But, yes, it definitely needs to be included in the vocab file. I think performing this as a multi-step process is a good thing (iterative changes and all that). If you think defining keyword meta-data and moving the keyword meta-schemas need to be separate steps, I'm okay with that. |
Would people feel any better about this if, instead of changing I think that "vocabulary" is an overloaded term at this point, anyway. Really, because we determined that a dialect is defined by a collection of vocabularies, what the |
I'd like to leave this here, just to record it, but ultimately, I think we need to discuss it elsewhere. I just want to wrap up the larger conversation before opening a new issue for this. @jdesrosiers and I were chatting over DMs where he proposed the idea of breaking this up further so that each keyword has its own file. If we do this, then a vocabulary is just a collection of keyword file IDs (and probably a description, etc.). Doing this would potentially allow individual keywords to be added directly into new-form- This would mean that vocabularies are convenient groupings of common keywords, and individual keywords can still be added to extend the vocabularies. There was also some discussion around potentially being able to add keyword file references directly into the schemas that needed to use them via a |
The more I let this sit, the more I like this idea.
Concept
Core meta-schema
{
"$schema": "https://json-schema.org/meta/schema",
"$id": "https://json-schema.org/meta/schema",
"$dialect": [
"https://json-schema.org/dialects/core",
"https://json-schema.org/dialects/applicator",
"https://json-schema.org/dialects/unevaluated",
"https://json-schema.org/dialects/validation",
"https://json-schema.org/dialects/meta-data",
"https://json-schema.org/dialects/format-annotation",
"https://json-schema.org/dialects/content"
],
"$dynamicAnchor": "meta",
"title": "Core and Validation specifications meta-schema",
"type": ["object", "boolean"]
}
Core dialect
{
// do we need `$schema` here? maybe (read on)
"$id": "https://json-schema.org/dialects/core",
"$keywords": [
"https://json-schema.org/keywords/$id",
"https://json-schema.org/keywords/$schema",
"https://json-schema.org/keywords/$ref",
"https://json-schema.org/keywords/$anchor",
"https://json-schema.org/keywords/$dynamicRef",
"https://json-schema.org/keywords/$dynamicAnchor",
"https://json-schema.org/keywords/$dialect",
"https://json-schema.org/keywords/$comment",
"https://json-schema.org/keywords/$defs"
],
"title": "Meta-schema core dialect"
}
{
// do we need `$schema` here? maybe (read on)
"$id": "https://json-schema.org/keywords/properties",
"roles": [ "applicator", "annotation", "assertion" ], // because it does all three
"name": "properties", // maybe implicit by the id?
"type": "object",
"additionalProperties": {
"$dynamicRef": "#meta"
},
"default": {}
}
What does this look like for an author that wants a custom assertion keyword?
They'd have to create a keyword file:
{
"$id": "https://my-company.com/keywords/minDate",
"roles": [ "assertion" ],
"name": "minDate",
"type": "string",
"format": "date-time"
}
(Note that this doesn't tell an implementation what to do with the keyword, just how to validate that it's being used correctly. The keyword still needs logic written to support it in an implementation.)
then a dialect file:
{
"$id": "https://my-company.com/dialects/dates",
"$keywords": [
"https://my-company.com/keywords/minDate",
"https://my-company.com/keywords/maxDate"
],
"title": "Date/Time support"
}
then a meta-schema:
{
"$schema": "https://my-company.com/meta/schema",
"$id": "https://my-company.com/meta/schema",
"$dialect": [
"https://json-schema.org/dialects/core",
"https://json-schema.org/dialects/applicator",
"https://json-schema.org/dialects/unevaluated",
"https://json-schema.org/dialects/validation",
"https://json-schema.org/dialects/meta-data",
"https://json-schema.org/dialects/format-annotation",
"https://json-schema.org/dialects/content",
"https://my-company.com/dialects/dates"
],
"$dynamicAnchor": "meta",
"title": "Core and Validation specifications meta-schema",
"type": ["object", "boolean"]
}
If we allow implicit references, the
Custom meta-schema with inlined dialect
{
"$schema": "https://my-company.com/meta/schema",
"$id": "https://my-company.com/meta/schema",
"$dialect": [
"https://json-schema.org/dialects/core",
"https://json-schema.org/dialects/applicator",
"https://json-schema.org/dialects/unevaluated",
"https://json-schema.org/dialects/validation",
"https://json-schema.org/dialects/meta-data",
"https://json-schema.org/dialects/format-annotation",
"https://json-schema.org/dialects/content",
{
"$id": "https://my-company.com/dialects/dates",
"$keywords": [
{
"$id": "https://my-company.com/keywords/minDate",
"roles": [ "assertion" ],
"name": "minDate",
"type": "string",
"format": "date-time"
},
{
"$id": "https://my-company.com/keywords/maxDate",
"roles": [ "assertion" ],
"name": "maxDate",
"type": "string",
"format": "date-time"
}
],
"title": "Date/Time support"
}
],
"$dynamicAnchor": "meta",
"title": "Core and Validation specifications meta-schema",
"type": ["object", "boolean"]
}
What needs to be added to JSON Schema to do this?
We need two keywords, I don't think it'd be too hard to define schemas for these keywords. I'd expect they'd be somewhat more restrictive than just the meta-schema. For example, |
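To make the three-level layout above concrete (meta-schema lists dialects, dialects list keyword files), here is a minimal Python sketch of how an implementation might resolve it into a map of keyword names. Everything here is an assumption for illustration: `resolve`, `collect_keywords`, and the in-memory `registry` keyed by `$id` are hypothetical, and the rule that two different files may not claim the same keyword name is my own guess, not something the proposal states. Entries may be URIs or inlined objects, as in the inlined example.

```python
def resolve(entry, registry):
    """An entry is either a URI to look up or an inlined definition object."""
    if isinstance(entry, str):
        return registry[entry]  # KeyError means the referenced file is unknown
    return entry


def collect_keywords(meta_schema, registry):
    """Map keyword name -> keyword file for every dialect the meta-schema lists."""
    keywords = {}
    for dialect_entry in meta_schema.get("$dialect", []):
        dialect = resolve(dialect_entry, registry)
        for keyword_entry in dialect.get("$keywords", []):
            keyword = resolve(keyword_entry, registry)
            name = keyword["name"]
            # Assumption: the same name coming from two different files is an error.
            if name in keywords and keywords[name]["$id"] != keyword["$id"]:
                raise ValueError(f"keyword {name!r} is defined by two different files")
            keywords[name] = keyword
    return keywords


registry = {
    "https://my-company.com/keywords/minDate": {
        "$id": "https://my-company.com/keywords/minDate",
        "roles": ["assertion"],
        "name": "minDate",
        "type": "string",
        "format": "date-time",
    },
    "https://my-company.com/keywords/maxDate": {
        "$id": "https://my-company.com/keywords/maxDate",
        "roles": ["assertion"],
        "name": "maxDate",
        "type": "string",
        "format": "date-time",
    },
    "https://my-company.com/dialects/dates": {
        "$id": "https://my-company.com/dialects/dates",
        "$keywords": [
            "https://my-company.com/keywords/minDate",
            "https://my-company.com/keywords/maxDate",
        ],
        "title": "Date/Time support",
    },
}

meta_schema = {
    "$id": "https://my-company.com/meta/schema",
    "$dialect": ["https://my-company.com/dialects/dates"],
}

keywords = collect_keywords(meta_schema, registry)
print(sorted(keywords))
```

The resulting map is what an implementation would consult both to validate a keyword's value and to decide which handler logic to run for it.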
This will need to be moved into whatever vocabularies end up being. See #1510. |
IMPORTANT: This changes how meta-schemas are organized but not really how they work.
Relevant to this discussion:
I've been thinking about all of these ☝️ things together to get a larger picture of where vocabularies could go. The discussions I've been a part of have all described a vocabulary definition file as serving several purposes:
`properties` functions as all of these
Impact to the Meta-Schema
The ⭐ in particular is where the meta-schema is changed. Currently the schema for a keyword's value is contained in the meta-schema body, generally under a `properties` keyword. However, if the vocabulary definition file carries and enforces the schema for a keyword's value, then the meta-schema's entry is redundant. This means that the entire `properties` keyword for a meta-schema could be removed as it's all in the vocab files.
I don't think this is a breaking change, however. A significant reorganization, sure, but the functionality is all still there. Moreover, we can make this change iteratively.
Suppose the only change we make to how the meta-schema is processed is that `$vocabulary` acquires some validation behavior, applying the keyword schemas from all of the vocabularies it lists (it becomes an in-place applicator similar to `properties`). Ideally, those keyword schemas would be the same as what's already in the meta-schema. However, even if they're not, the meta-schema is defining a dialect by virtue of declaring a set of vocabularies. In doing so, it's free to apply additional constraints to keywords.
For example, consider a modified Validation meta-schema where I've required that `enum` have unique values (which isn't a current requirement):
`enum`, as defined in the vocabulary, doesn't have the uniqueness constraint. This is actually possible now: the above meta-schema should be supported without any issues.
Now consider adding in-place-applicator / assertion functionality to `$vocabulary` which (for `enum`) enforces the `type` and `items` constraints but not `uniqueItems`. The functionality of this meta-schema is unchanged.
Going further, we could change the original Validation meta-schema to this:
We don't need `properties` because that's only defining the keywords, which are now defined in the vocabulary document identified by `https://json-schema.org/draft/2020-12/vocab/validation`, and we don't need `$defs` because that was only used to support the subschemas in `properties`.
In fact we may not even need the vocab meta-schemas anymore. Because the top-level meta-schema lists all of the vocabularies, it would automatically perform all of the validation that the vocab meta-schemas currently provide. We could remove the `allOf` making it just:
(I've also removed the deprecated keywords listing.)
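As a rough illustration of the iterative step described in this section, the Python sketch below treats `$vocabulary` as an in-place applicator that applies keyword schemas from the vocab files, while the meta-schema body remains free to add constraints such as unique `enum` values. It rests on several assumptions: the vocabulary file shape (`{"keywords": {...}}`) is hypothetical because that format is still undefined, `check` is a toy stand-in for a real validator that understands only `type: array` and `uniqueItems`, and unknown vocabularies are simply treated as contributing nothing.

```python
VOCAB_FILES = {
    # Hypothetical vocabulary definition file; the real format is not yet defined.
    "https://json-schema.org/draft/2020-12/vocab/validation": {
        "keywords": {"enum": {"type": "array"}},
    },
}


def check(value, constraints):
    """Toy validator: understands only `type: array` and `uniqueItems`."""
    errors = []
    if constraints.get("type") == "array" and not isinstance(value, list):
        errors.append("expected an array")
    if constraints.get("uniqueItems") and isinstance(value, list):
        # Note: Python's == treats 1 and True as equal, unlike JSON Schema.
        if any(value[i] == value[j] for i in range(len(value)) for j in range(i)):
            errors.append("items are not unique")
    return errors


def validate_as_meta_schema(meta_schema, schema):
    """Validate `schema`, with $vocabulary applying the vocab files' keyword schemas."""
    errors = []
    # Step 1: keyword schemas contributed by each listed vocabulary.
    for vocab_uri in meta_schema.get("$vocabulary", {}):
        vocab = VOCAB_FILES.get(vocab_uri, {"keywords": {}})
        for name, constraints in vocab["keywords"].items():
            if name in schema:
                errors += [f"{name}: {e} (vocabulary)" for e in check(schema[name], constraints)]
    # Step 2: extra constraints the meta-schema body layers on top.
    for name, constraints in meta_schema.get("properties", {}).items():
        if name in schema:
            errors += [f"{name}: {e} (meta-schema)" for e in check(schema[name], constraints)]
    return errors


VALIDATION = "https://json-schema.org/draft/2020-12/vocab/validation"
plain_meta = {"$vocabulary": {VALIDATION: True}}
strict_meta = {
    "$vocabulary": {VALIDATION: True},
    "properties": {"enum": {"uniqueItems": True}},
}

print(validate_as_meta_schema(plain_meta, {"enum": ["a", "a"]}))
print(validate_as_meta_schema(strict_meta, {"enum": ["a", "a"]}))
print(validate_as_meta_schema(plain_meta, {"enum": "a"}))
```

With `plain_meta`, duplicate `enum` values pass because the vocabulary has no uniqueness rule. `strict_meta` rejects them through its own `properties` entry, and a non-array `enum` is rejected by the vocabulary's schema in both cases.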
Adoption
First of all, we've agreed that vocabularies and the `$vocabulary` keyword are (at best) unstable, so modifying it (even in a breaking way) isn't out of the question.
Adding in-place-applicator / assertion behavior to `$vocabulary` in the way described above isn't a breaking change as long as we copy the keyword schemas correctly.
Later, once `$vocabulary` is promoted to being a stable feature, we can update the meta-schemas to remove the redundancies.
Readability and Accessibility
There is an issue of readability and accessibility when all of the keywords are defined in vocab files. While most people would be used to just looking in the meta-schema to see what keywords are available and how they're defined, now they'd have to follow another file reference to get that same information.
I don't think this is a big issue, though, and people will eventually get used to it.
On the other hand, creating a new meta-schema is immensely easier: you just list the vocabularies you want, and everything else is taken care of.
Automatic Support for Undefined Keyword Checking
With this in place, implementations will be able to look at the vocab files to see if and how a keyword is defined.
Further, the implementation would be able to detect attempts to circumvent the "keywords must be defined in vocabs" requirement by defining a new keyword directly in the meta-schema. Currently, trying to do this is troublesome for implementations (annoying but not impossible).
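A minimal Python sketch of both checks, assuming a hypothetical vocabulary file shape (`{"keywords": {...}}`) since the real format is not yet defined. `undefined_keywords` and `smuggled_keywords` are illustrative names, and only top-level keywords are inspected; walking into subschemas would require knowing which keywords are applicators.

```python
def known_keywords(meta_schema, vocab_files):
    """Union of keyword names defined by every vocabulary the meta-schema lists."""
    known = set()
    for vocab_uri in meta_schema.get("$vocabulary", {}):
        known |= set(vocab_files[vocab_uri]["keywords"])
    return known


def undefined_keywords(schema, meta_schema, vocab_files):
    """Top-level keywords in `schema` that no listed vocabulary defines."""
    known = known_keywords(meta_schema, vocab_files)
    return sorted(k for k in schema if k not in known)


def smuggled_keywords(meta_schema, vocab_files):
    """Keywords the meta-schema defines in `properties` without a backing vocabulary."""
    known = known_keywords(meta_schema, vocab_files)
    return sorted(k for k in meta_schema.get("properties", {}) if k not in known)


vocab_files = {
    # Hypothetical vocabulary file with empty placeholder keyword schemas.
    "https://example.com/vocab/basic": {"keywords": {"type": {}, "enum": {}}},
}
meta_schema = {
    "$vocabulary": {"https://example.com/vocab/basic": True},
    "properties": {
        "enum": {"uniqueItems": True},   # extra constraint on a known keyword: fine
        "minDate": {"type": "string"},   # new keyword with no vocabulary behind it
    },
}

print(undefined_keywords({"type": "string", "minDat": "2020-01-01"}, meta_schema, vocab_files))
print(smuggled_keywords(meta_schema, vocab_files))
```

The first call flags the misspelled `minDat` in a schema, and the second flags `minDate` as a keyword introduced directly in the meta-schema rather than through a vocabulary.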
(There may be some intersection here with `x-` keywords, but I haven't thought about it too hard.)
`$vocabulary` Requires Special Treatment
Currently `$vocabulary` is only to be processed when the schema that contains it is being processed as a meta-schema. I don't think this should change as it only defines what keywords the instance (another schema) can use.
In this way, maybe it does break the nice symmetry we have around "a meta-schema validating a schema" is just "a schema validating an instance." But it could be argued that such symmetry was broken when `$vocabulary` was introduced.
It may have an impact on the Test Suite since we do have a number of tests that validate schemas based on the meta-schema, and they'd need to be updated to pass along the context of "this is a meta-schema evaluation" in order to get the validation result from `$vocabulary`.
Out of scope
I haven't addressed
`$vocabulary` might change (which depends on whether optional vocabs are still worth having, see link at top)
`$ref` in some capacity?)
I'd like to get the concept defined before we start considering mechanics.