Add DID parameter for dereferencing an information resource identified by a DID #480

Merged
merged 5 commits into from
Dec 27, 2020

brentzundel
Member

@brentzundel brentzundel commented Dec 9, 2020

fixes Issue #199

Signed-off-by: Brent Zundel [email protected]


@brentzundel brentzundel changed the base branch from master to main December 9, 2020 20:15
@peacekeeper
Contributor

@brentzundel something is weird about this PR, it contains a lot of commits from other open PRs that you apparently have in your "content" branch?

@brentzundel
Member Author

@brentzundel something is weird about this PR, it contains a lot of commits from other open PRs that you apparently have in your "content" branch?

Those are the diff between master branch and main. If PR #482 gets merged, those extra commits should disappear.

@brentzundel brentzundel changed the base branch from main to master December 10, 2020 17:49
@brentzundel brentzundel changed the base branch from master to main December 10, 2020 17:49
Signed-off-by: Brent Zundel <[email protected]>
@OR13
Contributor

OR13 commented Dec 10, 2020

only comment on this is that true does not provide much information, but I don't have a better solution.

Co-authored-by: Markus Sabadello <[email protected]>
@brentzundel
Member Author

only comment on this is that true does not provide much information, but I don't have a better solution.

I agree. The spec I found said that technically parameters don't need to have a value, but some parameter processing software expects them, so rather than break things I figured I'd add 5 chars to the URL.

@peacekeeper
Contributor

technically parameters don't need to have a value

I think at some point we had the idea to introduce a parameter called "content" without a value, to fulfill this purpose. But I agree adding a "true" value is probably a good idea. We may want to add a note to the DID Parameters section to clarify this.

Contributor

@peacekeeper peacekeeper left a comment

I think this is a good way to solve the use case.

@kdenhartog
Member

Just to confirm I'm understanding the mapping properly: is did:example:123?subject=true supposed to dereference to {"hello": "world"}, or is the resource supposed to be embedded in the DID metadata object or somewhere else instead?

and

did:example:123?subject=true with "application/did+json" resolves to { "id": "did:example:123" } ?

@brentzundel
Member Author

Just to confirm I'm understanding the mapping properly: is did:example:123?subject=true supposed to dereference to {"hello": "world"}, or is the resource supposed to be embedded in the DID metadata object or somewhere else instead?

My understanding is that it would dereference to {"hello": "world"} (assuming that is the DID Subject, of course)

did:example:123?subject=true with "application/did+json" resolves to { "id": "did:example:123" } ?

I think, if you dereference did:example:123?subject=true you would need to use the MIME type of the DID Subject, whatever that may be.

@msporny
Member

msporny commented Dec 13, 2020

I think, if you dereference did:example:123?subject=true you would need to use the MIME type of the DID Subject, whatever that may be.

I think we may have to ask for TAG guidance on this... We seem to be creating a new class of information (not saying it's good or bad, and there are some rumblings of this being a correct solution for HTTP Range 14)...

Just so I understand this, we're saying:

  • If you resolve a DID, you get a DID Document (which is this ephemeral thing -- just some information is magic'd into existence when you resolve it).
  • If you dereference a DID, you could get anything (which may be something that comes from the DID Document (a key), or something that comes from something else (a schema)).

From an HTTP Range 14 perspective, I don't think there's an issue... both URLs point to different resources did:example:123 vs. did:example:123?subject=true -- both of those may or may not point to the same DID Document... or the first could point to a DID Document and the second could point to a Schema file.

In the latter case, I expect @jandrieu would still object -- because you can put all sorts of PII in the second just as you could the first, which puts us right back to the beginning of this discussion.

I'm also concerned about the mental model here -- feels really complex. It would be simpler to just put a property called schema in the DID Document and embed the data there... I would be surprised if folks would object to this solution.

I'd like the group to discuss this more deeply... I have a feeling that folks didn't have a strong opinion on the call because there was nothing concrete to look at when it was being discussed. Now that there is something concrete, we need to see what sorts of concerns it raises.

I'm certainly concerned about the complexity of this concept where there is more than just the DID Document that holds information... that now there is this arbitrary set of other things that could be on the ledger and fetched using foo=true or bar=true.

Member

@msporny msporny left a comment

I'm concerned about the complexity created by properties that allow things other than a DID Document to be returned during the dereferencing process.

@OR13
Contributor

OR13 commented Dec 17, 2020

Some thoughts, feels sorta related to services, and I would want to comment on things like this:

did:example:123?service=repo&relativeRef=/user/name.git
did:example:123?service=schemas&relativeRef=/user/name.json
did:example:123?service=foo&subject=true
did:example:123?service=schemas&relativeRef=/user/name.json
did:example:123?service=schemas&subject=true&relativeRef=/user/name.json

I don't know that it matters, managing query strings is already a complicated subject which brings into question things like URI canonicalization and order of query param processing.
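A minimal sketch of the ordering concern, assuming a hypothetical helper `canonicalDidUrl` that sorts query parameters; DID Core does not define such a canonical form, and the sketch ignores paths, fragments, and percent-encoding. It only shows that two DID URLs carrying the same parameters in a different order are different strings unless some canonicalization is agreed on.

```typescript
// Hypothetical helper, not defined by DID Core: split a DID URL into the DID
// and its query parameters, keeping the parameters in the order they appear.
function splitDidUrl(didUrl: string): { did: string; params: string[][] } {
  const queryStart = didUrl.indexOf("?");
  if (queryStart === -1) {
    return { did: didUrl, params: [] };
  }
  const params: string[][] = [];
  for (const pair of didUrl.slice(queryStart + 1).split("&")) {
    if (pair === "") continue;
    const eq = pair.indexOf("=");
    // A parameter without "=" is kept with an empty value.
    params.push(eq === -1 ? [pair, ""] : [pair.slice(0, eq), pair.slice(eq + 1)]);
  }
  return { did: didUrl.slice(0, queryStart), params };
}

// Assumed canonical form for illustration: parameters sorted by their
// "name=value" text. Other choices are equally possible.
function canonicalDidUrl(didUrl: string): string {
  const { did, params } = splitDidUrl(didUrl);
  const sorted = params.slice().sort((a, b) => {
    const left = a.join("=");
    const right = b.join("=");
    return left < right ? -1 : left > right ? 1 : 0;
  });
  const query = sorted.map((p) => p.join("=")).join("&");
  return query === "" ? did : did + "?" + query;
}

const first = "did:example:123?service=schemas&subject=true&relativeRef=/user/name.json";
const second = "did:example:123?subject=true&relativeRef=/user/name.json&service=schemas";
const sameAsStrings = first === second;
const sameAfterSorting = canonicalDidUrl(first) === canonicalDidUrl(second);
```

Here `sameAsStrings` is false while `sameAfterSorting` is true, which is the gap a processing rule for parameter order would have to close.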

@iherman
Member

iherman commented Dec 18, 2020

In any case, can we use another term and not 'subject'? It associates to the DID subject, and can be the source of confusion. What about 'resource'?

@brentzundel
Member Author

In any case, can we use another term and not 'subject'? It associates to the DID subject, and can be the source of confusion. What about 'resource'?

I like this suggestion and made the change in 1d11f7e

@brentzundel
Member Author

Some thoughts, feels sorta related to services, and I would want to comment on things like this:

did:example:123?service=repo&relativeRef=/user/name.git
did:example:123?service=schemas&relativeRef=/user/name.json
did:example:123?service=foo&subject=true
did:example:123?service=schemas&relativeRef=/user/name.json
did:example:123?service=schemas&subject=true&relativeRef=/user/name.json

I don't know that it matters, managing query strings is already a complicated subject which brings into question things like URI canonicalization and order of query param processing.

I don't think that overloading the service property to enable this use case is the right approach.

@brentzundel brentzundel changed the title Add 'subject' DID parameter Add DID parameter for dereferencing an information resource identified by a DID Dec 18, 2020
@talltree
Contributor

If a DID identifies an information resource directly (which is already becoming a common use case and one needed by the Indy DID Method), I find this approach to be a wonderfully simple and elegant mechanism for a client to request the information resource instead of the DID document.

@iherman
Member

iherman commented Dec 19, 2020

I'm concerned about the complexity created by properties that allow things other than a DID Document to be returned during the dereferencing process.

I do not understand this remark. Dereferencing is not specified to return a DID Document; that is the goal of DID resolution. The current spec says, in §8.2:

The DID URL dereferencing function dereferences a DID URL into a resource with contents depending on the DID URL's components, including the DID method, method-specific identifier, path, query, and fragment. This process depends on DID resolution of the DID contained in the DID URL.

And this proposal is perfectly aligned with this; the only thing it does is adding resource as yet another pre-defined DID parameter.

I believe this approach indeed solves the use case.
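A rough sketch of how the `resource` parameter could steer dereferencing, assuming a toy in-memory store in place of a real DID method. The names `store` and `dereferenceSketch` and the `application/json` content type of the resource are all made up for illustration; the spec leaves the actual retrieval to each DID method.

```typescript
// Hypothetical in-memory stand-in for whatever a DID method stores per DID.
interface StoredEntry {
  didDocument: { id: string };
  resource?: { contentType: string; content: string };
}

const store: Record<string, StoredEntry | undefined> = {
  "did:example:123": {
    didDocument: { id: "did:example:123" },
    resource: { contentType: "application/json", content: '{"hello": "world"}' },
  },
};

interface Dereferenced {
  kind: "didDocument" | "resource" | "notFound";
  contentType?: string;
  content?: string;
}

function dereferenceSketch(didUrl: string): Dereferenced {
  const queryStart = didUrl.indexOf("?");
  const did = queryStart === -1 ? didUrl : didUrl.slice(0, queryStart);
  const query = queryStart === -1 ? "" : didUrl.slice(queryStart + 1);
  const entry = store[did];
  if (entry === undefined) {
    return { kind: "notFound" };
  }
  // The resource=true parameter selects the resource instead of the DID document.
  if (query.split("&").indexOf("resource=true") !== -1) {
    if (entry.resource === undefined) {
      return { kind: "notFound" };
    }
    return {
      kind: "resource",
      contentType: entry.resource.contentType,
      content: entry.resource.content,
    };
  }
  return {
    kind: "didDocument",
    contentType: "application/did+json",
    content: JSON.stringify(entry.didDocument),
  };
}

const plain = dereferenceSketch("did:example:123");
const withResource = dereferenceSketch("did:example:123?resource=true");
```

In this sketch `plain` carries the DID document and `withResource` carries the stored resource, so the same DID yields two different results depending only on the DID URL's query.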

@msporny
Member

msporny commented Dec 20, 2020

I'm not opposed to this PR and won't stand in the way if that's where the WG wants to go (and agree that this is more or less the option that would probably result in the least severity of gnashing of teeth). Changing from subject to resource is an improvement.

I do think that the rest of the group needs to agree that this is the right path forward while reconsidering the more direct path.

To remind folks of the more direct path, which I believe directly contradicts what @jandrieu was arguing for... we should use type -- it's going to happen, people are going to disregard the advice given by the group because the solution proposed in this PR is going to give them typing information anyway. The resource is going to be typed (or at least, type will be easily inferred by the properties associated with the document). So, instead of using one feature type -- we are now possibly going to see DID Method Parameters per type... schema=true, image=true, foo=true -- each returning a different resource type.

The group's position on avoiding the use of type is going to backfire on it - and this PR is an example of that happening. We're bending over backwards to provide another mechanism to get a very specific document type to the requesting party instead of using a more sane feature to do it thinking that we've escaped the quandary raised by @jandrieu. I suggest that there is no escaping that quandary.

That said, I don't want to re-open that discussion based on new information (this PR is the new information that demonstrates the downside of taking the position of no type property). Defer to the group to determine if this is an issue, or to just let this PR through. To be clear -- the use case is valid no matter what direction we go and important enough that the use case will be addressed using one mechanism or another... regardless of the position the WG takes.

@agropper
Contributor

agropper commented Dec 20, 2020 via email

@msporny
Member

msporny commented Dec 21, 2020

Changing from subject to resource is an improvement.

Thinking about this a bit more... what about doing dereference=(true|false) instead? What if we made the value of dereference true by default? So, here are the possibilities:

did:example:123?dereference=false

The above would give you the DID Document.

did:example:123?dereference=true

The above would give you the resource, which would be the default. I mean, that's effectively what we're doing with this PR, right? We're just telling the resolver what to give us back -- the DID Document, or the resource (DID subject).

I guess the argument against this is that service=xyz&relative-ref=/foo/bar does dereferencing as well? So, based on the current spec, what you send into the abstract resolve / dereference functions is what determines what you get back, which raises the question of what you should get back when you call resolve on this resolve(did:example:123) vs dereference on this dereference(did:example:123).

It feels like this could be a solved problem already, or the spec is horribly broken, or the spec is so vague on this point that it's not useful. This is why I'm struggling with this PR... something feels off.

/cc @peacekeeper @jricher -- help -- what was the intent with the resolution/dereferencing sections wrt. this PR? If a DID should dereference to a data schema, does calling the dereference function on that DID give you back the data schema, or the DID Document? I would expect it to do the former, but the spec doesn't really say, does it?

@msporny
Member

msporny commented Dec 27, 2020

Normative, multiple reviews, changes requested and made, no objections, merging.

@msporny msporny merged commit b5a6b49 into w3c:main Dec 27, 2020
@jandrieu
Contributor

jandrieu commented Jan 5, 2021

Unfortunately, I believe this was merged without consensus. I understand there was the traditional time for feedback, but that was during the holidays.

Requesting dereferencing at the time of resolution is a fine parameter for the resolver.

However, making it a unique URL parameter still creates issues with herd privacy.

As pointed out by others, this requirement can already be achieved with a service endpoint. I'm also not a fan of service endpoints, but they are already part of the gestalt of how DIDs work.

We should just document in the spec that service endpoints can be used for linking to a specific resource via a DID-URL.

I believe that additionally we should make sure that the resolver contract supports a parameter for asking for dereferencing at the same time as resolution.

@talltree
Contributor

talltree commented Jan 5, 2021

@jandrieu As with the comment I left you on PR #457, insisting that herd privacy must apply to all DIDs is going too far. It's clear we need a special topic call about this, so I will leave further discussion for that special topic call.

@OR13
Contributor

OR13 commented Jan 5, 2021

PROPOSAL: We could make did:example:123/ always return a resource with an id and type member, and did:example:123 always return a resource with an id member and never a type property.

however, it's already allowed by "pure json" to include any property, including ones that are not registered and are therefore even more useful for eroding privacy....

I think folks are not seeing the big picture here...

  1. we have 3 did document representations that are supposedly "equally safe" but have different privacy properties.
  2. we have normative requirement in JSON / the ADM that preserves "unknown" properties.
  3. people can use "type" as an unknown property.
  4. @type is aliased to type in JSON-LD

As @msporny mentioned here #480 (comment)

This PR is the result of our failure to accept that type is inevitable (just like thanos).

If we are going to rehash this PR / topic, I suggest we fix the root problem regarding type and address the points I outlined above.

I find this PR language / syntax not great, but I have no objection to the concept of "type" or "dereferencing".... we should make sure that we are solving this problem in an equal security context for all representations or we should be adding security warnings to all representations that are "not equal"...

@brentzundel
Member Author

I don't understand how a service endpoint could be used to dereference a resource referred to by a DID, when the DID is the only URI for that resource.
service objects are required to contain id and serviceEndpoint properties, both of which MUST be URIs.
What URI should be provided for these properties for a resource whose only URI is the DID?

@OR13
Contributor

OR13 commented Jan 6, 2021

I agree, was convinced that overloading the service query param was a mistake... using the dangling / however, would provide at least one use for the path attribute, but the solution we chose also works for me.

@jandrieu
Contributor

jandrieu commented Jan 6, 2021

@talltree You can reject herd privacy, but you don't get to redefine it. Herd privacy only works BECAUSE it applies to all DIDs. That's the herd. When something applies to just certain DID Subjects, then that enables privacy penetration that would not be possible if true herd privacy exists. Because the herd is bifurcated or worse, into segments that are distinguishable.

For example, IP addresses have excellent herd privacy. In part because every public IP address is the same (modulo v4/v6) and is interpreted the same way. On top of that, special, functionally unique private addresses are , with three modest exceptions like the private identifier subnets like 192.168., link-local addresses like 169.254., and localhost 127.0.0.0. Yes, if the IP address is one of those exceptions, you know something about the endpoint, but NOTHING else. You don't know who owns it, what machine is running there, or what applications might be running on it.

If you want your entire DID Method to be of a particular nature, such as a did method that ALWAYS represents cars, go for it. DID Methods are free to violate herd privacy in this manner. But DID Core should not.

Returning to the IP example, there are services that will, based on publicly available information, map an IP address to a specific geographic location with some level of accuracy and precision. These services highlight a flaw in IP addresses' herd privacy that is a result of early routing simplifications that ultimately got embedded in hardware. So, in fact, the allocation strategy undermined the geographic herd privacy.

We have a moral obligation to prevent these sorts of privacy problems. Herd privacy is how we do that. And herd privacy is violated directly proportionally to the variability that can be used to separate the herd.
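One way to read the argument is in terms of anonymity sets: the more observable features split the herd, the smaller each group becomes. The sketch below is only an illustration under that reading. The function `anonymitySetSizes`, the idea of using query parameter names as the observable "shape", and the sample DID URLs are all made up for this example.

```typescript
// Group DID URLs by what an observer can see without resolving anything:
// here, only the sorted list of query parameter names (an assumed,
// simplified notion of "shape").
function anonymitySetSizes(didUrls: string[]): Record<string, number> {
  const sizes: Record<string, number> = {};
  for (const didUrl of didUrls) {
    const queryStart = didUrl.indexOf("?");
    const names: string[] = [];
    if (queryStart !== -1) {
      for (const pair of didUrl.slice(queryStart + 1).split("&")) {
        const eq = pair.indexOf("=");
        names.push(eq === -1 ? pair : pair.slice(0, eq));
      }
    }
    const shape = names.length === 0 ? "(no parameters)" : names.sort().join(",");
    sizes[shape] = (sizes[shape] || 0) + 1;
  }
  return sizes;
}

// Hypothetical observations, made up for this example.
const observedDidUrls = [
  "did:example:111",
  "did:example:222",
  "did:example:333?resource=true",
  "did:example:444",
];
const setSizes = anonymitySetSizes(observedDidUrls);
```

With these inputs, three DID URLs share the parameter-free shape and one stands alone under `resource`, so the single `resource=true` URL is distinguishable from the rest before any resolution happens.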

@jandrieu
Contributor

jandrieu commented Jan 6, 2021

@brentzundel I see your point. If the DID is the only URI that can be dereferenced then service endpoints can't solve that problem.

I see two options here.

Either you use a CID as a service endpoint, and use a content-addressable network to do dereferencing, such as pointing to an IPFS resource.

Or, you use a URL that is designed for dereferencing to return a resource, such as an http, https, or ftp URL.

Treating DIDs as dereferenceable is the layer violation here. You want DIDs to be a dereferenceable URL that returns a resource when doing resolution. But DIDs were not created to deliver resources, they were created to provide demonstrable proof of control over identifiers without dependence on a trusted third party.

Resolution should return the meta-data needed to interact with the resource. Not the resource itself.

If you can't resolve without dereferencing, you can't support herd privacy.

Using a service endpoint that points to a URL, however, WOULD allow you to combine dereferencing with resolution should the requester desire that. As such, having a parameter in the resolution contract would, IMO, address your need from the perspective of the DID-URL being able to dereference to a digital resource.

However, only by transmogrifying DIDs from an identity architecture to a distribution architecture will you be able to do what you are asking for here.

DIDs weren't designed for this use case. IMO, we shouldn't be supporting it.

@agropper
Contributor

agropper commented Jan 6, 2021 via email

@iherman
Member

iherman commented Jan 6, 2021

Resolution should return the meta-data needed to interact with the resource. Not the resource itself.

@jandrieu I'm trying to understand.

The correct proposal is not on the resolution returning the resource itself. I.e., the proposal we are talking about is not using the DID resolution to get to the resource. It is using the DID URL dereferencing for the purpose. These two notions are strictly separated in the spec as well as in the method specifications.

I wonder whether the source of disagreement is the somewhat blurry relationship between a DID and a DID URL. The current spec makes it clear that these are related but different notions, that methods interact with DIDs differently than with DID URLs, but maybe that separation is still not clear in the spec. I know I always felt uneasy about the very notion of DID URLs, but I acknowledge their usefulness in practice. But if my assumption on the basis of the disagreement is correct, maybe we will have to look at ways of separating these notions even more than we are doing now.

@agropper
Contributor

agropper commented Jan 6, 2021 via email

@msporny
Member

msporny commented Jan 6, 2021

@jandrieu wrote:

However, making it a unique URL parameter still creates issues with herd privacy.

I'm having a hard time understanding your concern, @jandrieu. Can you please provide a concrete example of exactly what happens, in which order, and the concrete outcome you are attempting to avoid? Ideally with step-by-step explanations --- A happens, then B happens, then C happens -- with code/did examples as well.

What is the concrete alternative that addresses @brentzundel's use case? What type of concrete service endpoint would need to be defined?

I'm trying to analyze the attack you're concerned about and I can't glean what the attack would look like from the text above.

@jandrieu
Contributor

jandrieu commented Jan 6, 2021

@iherman Thanks for that clarification. I definitely had misunderstood the PR. My apologies for that.

I appreciate that the layer violation I had thought was happening is maybe not. That helps. But then I don't understand how the proposal here addresses @brentzundel's goal of returning the resource itself when there is no transport URL (like http and ftp) to dereference. If there is such a URL, then a service endpoint works great. But based on @brentzundel's latest comment, I understand he wants the DID itself (without a service endpoint) to be dereferenceable to some resource WITHOUT using some other URL. Adding a "resource=true" to a DID itself feels like putting the meaning of the http accept header into a query parameter and does not provide guidance for what is ultimately going to be a method-specific dereferencing.

Can someone explain how this PR addresses @brentzundel's initial request?

But we still have a problem with how that is going to be used.

If I want to do something like this:
EXAMPLE A.1

@context : "did:example:abc?resource=true"

Or
EXAMPLE A.2

<img src="did:example:xyz?resource=true">

How is that better than these equivalents where a method provides a default dereference method:
EXAMPLE B.1

@context : "did:example:abc"

Or
EXAMPLE B.2

<img src="did:example:xyz">

These examples would need methods to provide a response for dereferencing a DID that is not the DID Document.

IMO, this may be the ideal answer. You should not be able to discern, from the DID itself, the DID-URL, or the DID Document, between a DID for a schema or image or a person or a corporation.

Any method that defines a default dereferencing could be used in this way. This option is not yet defined because we've deferred dereferencing, but I think we should add this to the resolver contract: that methods define how a DID of that method is dereferenced (rather than a DID-URL with a service reference).

Or, using service endpoints
EXAMPLE C.1

@context : "did:example:abc#context"

Or
EXAMPLE C.2

<img src="did:example:xyz#image">

Either way, the creator of those DID URLs (the author of the JSON-LD doc in Example *.1 and the html author in *.2) has to know what they expect to get back from the dereferencing process. Just like you do with http.

In the latter case, the DID Document leaks some information about what is expected to be returned, but it reveals nothing in particular about the DID Subject, as any DID could specify any type of service or return any kind of resource for each of those services. So, service endpoints are less privacy preserving, but that's a different battle. IMO, the best answer here is to ensure one and only one endpoint per DID, and dereferencing that DID means dereferencing the one and only endpoint. But, as long as we allow arbitrary service endpoints, we are leaking details that could be deferred to another layer in a more privacy-respecting architecture.

Yes, there will be situations where particular usage can lead to inferences about the DID Subject. But the opaque examples B.1 and B.2 are leaking by the nature of the use of the DID, which is ALWAYS a potential privacy leak. Anyone could use DIDs in a way that could lead to inferences, but that is fundamentally different than the DID infrastructure leading to those inferences.

So, yes, people can accumulate and publish directories of DIDs that are believed to refer to particular groups (cryptocurrency owners, sexual predators, corporations, schemas, etc.), but that is an assertion of the directory and IMO, should NOT be something that is baked into the underlying DID infrastructure. Just like you can do a geo lookup of an IP address, but IP addresses themselves don't provide geo data.

Rethinking my previous answer, if what @brentzundel wants is dereferencing without reliance on a secondary transport URL, that can be supported with something like

 "service": [{
    "id":"did:example:123#image",
    "type": "methodDereferencing",
    "serviceEndpoint": "did:example:123#image"
  },

Where "methodDereferencing" indicates that one should use the method-specific way of retrieving the resource from the registry substrate. That service endpoint needs a method-specific dereferencing algorithm, but that's what I understand @brentzundel to be asking for, because he'd rather not use a http(s) or ftp URL as a service endpoint.

Any DID Method that wants to could use this today and could attain interoperability by defining the methodDereferencing service type in the DID Spec Registries.
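A sketch of how a dereferencer might treat such a service entry, under stated assumptions: `dereferenceService` and the `MethodLookup` callback are hypothetical names, the callback stands in for whatever method-specific retrieval a DID method would define, and the `methodDereferencing` type is the proposal from this comment rather than anything registered.

```typescript
interface ServiceEntry {
  id: string;
  type: string;
  serviceEndpoint: string;
}

interface DidDocumentWithServices {
  id: string;
  service?: ServiceEntry[];
}

// Hypothetical method-specific retrieval from the registry substrate.
type MethodLookup = (endpoint: string) => string | undefined;

interface ServiceDereference {
  via: "method" | "transport";
  endpoint: string;
  content: string | undefined;
}

function dereferenceService(
  doc: DidDocumentWithServices,
  didUrl: string,
  methodLookup: MethodLookup
): ServiceDereference | undefined {
  for (const service of doc.service || []) {
    if (service.id !== didUrl) continue;
    if (service.type === "methodDereferencing") {
      // Use the DID method's own way of fetching the resource.
      return {
        via: "method",
        endpoint: service.serviceEndpoint,
        content: methodLookup(service.serviceEndpoint),
      };
    }
    // Any other type: hand the endpoint back for the caller to fetch
    // over its own transport (for example http).
    return { via: "transport", endpoint: service.serviceEndpoint, content: undefined };
  }
  return undefined;
}

const imageDoc: DidDocumentWithServices = {
  id: "did:example:123",
  service: [
    {
      id: "did:example:123#image",
      type: "methodDereferencing",
      serviceEndpoint: "did:example:123#image",
    },
  ],
};

// Placeholder lookup that returns made-up content for the one known endpoint.
const demoLookup: MethodLookup = (endpoint) =>
  endpoint === "did:example:123#image" ? "image-bytes" : undefined;

const imageResult = dereferenceService(imageDoc, "did:example:123#image", demoLookup);
```

The point of the sketch is only the branch on the service type: a `methodDereferencing` entry needs no secondary transport URL, while any other entry still points somewhere the caller has to fetch from.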

FWIW, this pattern aligns with work I'm doing elsewhere with a whole family of decentralized identifiers I am working to bring into the W3C fold.

Instead, there's a different layer violation, that of embedding claims about the Subject in the DID URL. Which is an improvement over embedding this information as a claim in the DID Document, but we can do better. Ultimately, this comes back to herd privacy. I'll pick that thread up in a separate response to @msporny's query on its own issue thread.

@agropper
Contributor

agropper commented Jan 6, 2021 via email

@brentzundel
Member Author

@jandrieu I don't understand how examples B1 and B2 would work. If only the DID is provided, wouldn't resolution just result in the DID Document? How would the relying party indicate they'd like the resource dereferenced instead of the DID Document resolved?

Apart from that, your final example using a service endpoint intrigues me, but I would like to see a PR with proposed spec changes that introduces language to the service endpoint section so I can be sure I understand how such a solution might work.

@OR13
Contributor

OR13 commented Jan 7, 2021

@brentzundel maybe I missed this before, but does the resource that is returned by this proposal have to have an id?

{
  "id": "did:example:123"
}

This is a did document ^.

And the did core spec has a lot of text dedicated to explaining how to add properties to did documents, including properties that are not verification relationships.

from a type theory perspective, a "DID Document" is a base class, which can be extended with the following structure (in typescript).

export interface DidDocument {
  id: string;
  verificationMethod?: Array<string | VerificationMethod>;
}

only the id is required, but the type system of the did document is constrained by the representation, for example, JSON types for JSON and JSON-LD and CBOR types for CBOR.

the following is totally valid according to did core today, although not recommended:

RESOLVE did:example:123#mugshot --accept application/did+json

{
  "id": "did:example:123",
  "mugshot": "data-uri.png"
}

RESOLVE did:example:123#mugshot --accept application/did+cbor

{
  "id": "did:example:123",
  "mugshot": "image.png binary"
}

fragments are to be interpreted according to the mime type, which means CBOR, JSON and JSON-LD can all decide to handle #mugshot differently, but a fragment is to refer to a part of the resource, regardless of its representation.

If the resource cannot be represented in CBOR, JSON or JSON-LD... you are out of luck for using a "DID Document" to represent it... and I would suggest that a service can be used for whatever the resource is.

I'm struggling with the conclusions folks are coming to regarding privacy and the spec.

  1. Anything that can be represented in JSON, CBOR or JSON-LD can be returned by DID Resolution or Dereferencing.
  2. fragments refer to sections of the returned resource and are interpreted in the context of the mime-type they were retrieved with.
  3. UNKNOWN PROPERTIES ARE PRESERVED

When you add these together, it means the examples above regarding mugshots are valid.... and in fact for JSON, you can add literally whatever you want to a did document, including type, taxID, SSN or naughty pictures.
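A small sketch of what "unknown properties are preserved" can look like for the JSON representation. The list of known properties mirrors only the two-member interface shown earlier in this comment, which is an assumption of the sketch and not the registered property set; `readDidDocument` and `writeDidDocument` are hypothetical names.

```typescript
// Assumed for illustration: only these two properties count as "known".
const KNOWN_PROPERTIES = ["id", "verificationMethod"];

interface LooseDidDocument {
  id: string;
  known: Record<string, unknown>;
  extra: Record<string, unknown>;
}

function readDidDocument(json: string): LooseDidDocument {
  const raw = JSON.parse(json);
  if (typeof raw !== "object" || raw === null || typeof raw.id !== "string") {
    throw new Error("a DID document needs a string id");
  }
  const known: Record<string, unknown> = {};
  const extra: Record<string, unknown> = {};
  for (const name of Object.keys(raw)) {
    if (KNOWN_PROPERTIES.indexOf(name) !== -1) {
      known[name] = raw[name];
    } else {
      // Unknown properties are kept, not rejected or dropped.
      extra[name] = raw[name];
    }
  }
  return { id: raw.id, known, extra };
}

function writeDidDocument(doc: LooseDidDocument): string {
  // Unknown properties are written back untouched.
  return JSON.stringify({ ...doc.known, ...doc.extra });
}

const parsedDoc = readDidDocument('{"id":"did:example:123","mugshot":"data-uri.png"}');
const roundTripped = writeDidDocument(parsedDoc);
```

The `mugshot` member survives the round trip even though nothing in the sketch knows what it means, which is the behavior the three points above add up to.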

A DID Method has the following structures which can control the content returned by dereferencing:

path, query, and resolver options, like accept.... once returned, the fragment can be used to further identify how the resource should be handled.

These are logically equivalent to HTTP bits, and the privacy proposals I am seeing here seem to be suggesting:

  1. HTTP URLS cannot be used to identify things.
  2. HTTP Responses cannot contain members that reduce the anonymity set of the identified thing
  3. Everyone must interpret URLS and HTTP the same way.

None of these are true of HTTP, and none are true for DIDs.... I appreciate the desire to have strong privacy language, and we should... but security folks will also be reading, and it is important we not create "privacy theater"... the normative requirements of the spec today, make the above assertions false... so lets not have normative statements that are then countered by privacy concerns which say that what we have defined in normative text, is not allowed / advised.

Looking forward to a special topic call on this.

@agropper
Contributor

agropper commented Jan 7, 2021 via email

@OR13
Contributor

OR13 commented Jan 8, 2021

I'm not sure I fall into either camp, but if I were to describe the maximalist perspective, I would do so like this:

JSON and RDF can be used to represent any information
DIDs and HTTP can be used to store any information.

whether information being stored in a DID Document or on an HTTP server is a "good idea" or "bad idea" has to do with a careful security analysis of the VDR, DID Method, cryptography etc....

Therefore DID Core doesn't have enough information to say if providing hints that reduce the anonymity set of the DID Subject is bad, because DID Core is not about specific DID Methods and there are DID Methods where such information would be "safe" to expose / where exposing it is a valuable usability feature.... in fact, DID Core doesn't have a right to talk about privacy at all, because privacy doesn't exist in "an abstract data model"... it exists in concrete systems which can be analyzed, audited, etc...

Saying there are privacy issues in an abstract data model, is like saying there are privacy issues in C# or RDF.... or like complaining that numbers can be added, or used to count dead bodies... it's the wrong place to fight for privacy, and it's a waste of time.

Because commenting on the security / design of specific DID Methods involves reviewing their source code, writing tests / attacks and otherwise checking the cryptography they rely on, and because people tend not to provide those services for free, especially to competitors, the maximalists choose to draw a line between normatively legal data models, and the continuum of terrible privacy and security engineering, leaving the former to the DID WG, and the latter to consultants, lawyers and the red team.

In the end, neither a Min nor a Max approach is appropriate without considering a concrete use case / threat environment, and since DID Core does not have a single use case / threat environment... I tend to fall closer to a Max perspective for DID Core and a Min perspective for high security DIDs that are backed by public ledgers with crypto-economic security models....

Anyone who assumes they know the threat environment for all DID Methods that will ever be created is wrong, but there are clearly some DID Methods with different privacy issues than others... evaluating specific DID Methods remains.... out of scope for this WG :)

@burnburn

burnburn commented Jan 8, 2021

@iherman Thanks for that clarification. I definitely had misunderstood the PR. My apologies for that.

I appreciate that the layer violation I had thought was happening is maybe not. That helps.

@jandrieu given your comment, perhaps it's appropriate to close this PR since it has been merged appropriately. If your comments now are around improving the text in the specification (which includes this PR), then opening a new PR with your suggestions would be clearer for reviewers. Discussion on a merged PR is difficult for others to follow.

@jandrieu
Contributor

jandrieu commented Jan 8, 2021

@burnburn absolutely not. This merging happened, I believe, with good intentions but without consensus.

  1. There was a discussion on 12/22 where no resolution was made.
  2. All regular and special topic calls were cancelled through to the new year.
  3. The PR was merged four days later on 12/26, in the midst of the end-of-year holidays

That is inappropriate.

In fact, we should revert this merge immediately while we find some sort of consensus.

There was no resolution, there was only one additional github comment after the last group discussion, and neither a regular nor a special call to establish a resolution of any kind.

@jandrieu
Contributor

jandrieu commented Jan 8, 2021

I like @agropper's description of the Min position.

Reading @OR13's latest comment, I can see at least one area of miscommunication, which may help clear up some of the contention.

It is understood that DID Method creators, and even resolver implementers, may add bespoke properties (to the DID Document and DID meta-data respectively). That is fundamental to the extensibility of DIDs. This conversation isn't about preventing what anyone might consider "bad design choices" for DID Methods. We aren't talking about any particular method-specific property. Those debates will play out in the marketplace of technology and ideas.

What we are talking about is enshrining what some of us feel are bad practices in the core specification itself.

Those elements that we enshrine in the did-core spec will establish a baseline of expected functionality and, even when optional, will endorse those practices as if everyone should do it. DID Methods that don't support those features will be perceived--rightly or wrongly--as incomplete. This places a higher burden on DID Core than on DID Methods, because what we specify there has longer term, broader impact. It is also the only DID specification that is currently on a standards track. Since we have embraced DID Method specifications as a "wild west" of innovation, it is vital that we ensure the core specification truly embodies consensus best practices.

Instead, what we keep seeing--and which I keep opposing--are implementers who find an innovative way to solve a niche need in a manner that is not inconsistent with the core spec, then arguing that this bespoke addition should be added to the core spec.

However, any additions to the core spec MUST be considered with greater scrutiny than simply checking to see if those innovations are compatible with existing documentation. We must ensure that those additions meet the higher standard of ubiquitous endorsement across the DID ecosystem. IMO, such additions MUST meet a standard of usefulness, appropriateness, and necessity.

There's no doubt that the proposed approach is useful. There is legitimate debate as to whether it is appropriate. And since this particular PR can easily be achieved as a method-specific property, it does not meet the requirement of necessity: the functionality CAN be achieved without updating DID Core.

There is an argument that if it isn't in DID Core, then there won't be interoperability. This is false. Our extensibility model uses the DID Spec Registry for interoperability specifically to allow this sort of innovation. If you want interoperability, add it to the registry and advocate for Method implementers to adopt support.

It is imperative that we avoid avoidable harms. This is the foundational argument for herd privacy and the Min perspective. If we can avoid particular harms by making better technical choices, we have a moral obligation to do so. When building a system that anticipates usage by billions of people over decades of use, even a modest risk of harm becomes a statistical certainty. And when those risks are embedded in the core spec, we dramatically reduce the opportunity for DID Method innovators to find better solutions.

@msporny
Member

msporny commented Jan 8, 2021

> This merging happened, I believe, with good intentions but without consensus.
>
>   1. There was a discussion on 12/22 where no resolution was made.
>   2. All regular and special topic calls were cancelled through to the new year.
>   3. The PR was merged four days later on 12/26, in the midst of the end-of-year holidays
>
> That is inappropriate.

No, the merge was entirely appropriate and happened per the process that the group has consensus on.

The issue is that we don't have consensus on the specification text in this PR, not that what happened was inappropriate or that there was a process violation.

To be more specific, I tagged you on December 13th as potentially being opposed to this -- that was not "during the holidays". I then tagged you again on December 20th -- there was no response that time either. We asked people to review this issue on the calls -- there were no objections. The group went to lengths to get input on this PR.

When WG Members raise the prospect that the Editors are acting inappropriately, or that there was a W3C Process violation -- it creates A LOT of work for the Editors and Chairs. We have to comb through the issues, PRs, mailing lists, and transcripts to rebuild the order of operations and defend that there wasn't a process violation. That is time spent away from dealing with the issue or working on other things that need to get done.

So, let's be clear here -- @jandrieu disagrees with the specification text that got merged, but that merge happened per the WG's established process, in addition to telecon announcements requesting review of this issue and personally tagging @jandrieu twice in this PR.

@burnburn

burnburn commented Jan 8, 2021

> @burnburn absolutely not. This merging happened, I believe, with good intentions but without consensus.
>
>   1. There was a discussion on 12/22 where no resolution was made.
>   2. All regular and special topic calls were cancelled through to the new year.
>   3. The PR was merged four days later on 12/26, in the midst of the end-of-year holidays
>
> That is inappropriate.
>
> In fact, we should revert this merge immediately while we find some sort of consensus.
>
> There was no resolution, there was only one additional github comment after the last group discussion, and neither a regular nor a special call to establish a resolution of any kind.

To add even more precision to Manu's email,

If this PR had been raised on the 22nd and then merged over the holiday period with little discussion, then I would agree with @jandrieu that we may be looking at a process violation.

The fact that this PR was open for review for a full two weeks before the working group took a holiday break, during which time @jandrieu was pinged multiple times personally to invite his feedback, makes it clear there was no process violation.

As Manu says, the process was followed correctly, and there were actually explicit efforts made to request his feedback. @jandrieu , if you would like to submit a new issue for discussion, please do so.

@agropper
Contributor

agropper commented Jan 8, 2021

I hope I've been paying adequate attention to this very important issue. I would like to continue discussing this.

Will there be a special topic call? If there is to be a special topic call, I think it would be useful for someone to respond to the points made in #480 (comment)

Should I open a separate issue?

@jandrieu
Contributor

jandrieu commented Jan 11, 2021

@msporny and @burnburn Respectfully, that was not my experience.

  1. There was contentious debate on this proposed feature from the beginning.
  2. My opposition was stated and recorded consistently throughout the discussions
  3. The minutes for 12/22 do not record that there was a Final Call for Review
    a. https://www.w3.org/2019/did-wg/Meetings/Minutes/2020-12-22-did
  4. It was recorded on 12/22 that concerns about the PR existed
    a. > Brent Zundel: Manu’s concern was mostly that others might be concerned.
  5. There was no resolution or other directive recorded in the 12/22 call notes
  6. If there HAD been a Final Call for Review on 12/22 (perhaps there was one verbally on the call), then there still should have been a 7-day period before merging. Merging on the 27th would not be consistent with that.
  7. When I voiced my surprise in the 1/5 meeting, I was assured by @brentzundel that it was "merged under the assumption of agreement by group, so please review."
  8. I did so.

To merge it over the holidays in the face of known concerns, without even a 7-day review period, does not meet my standard of collaborative consensus building. Since the chairs and editors feel otherwise, I'll move on.

As requested, I will create a new issue focused on herd privacy, which will likely impact this issue and hopefully result in a PR that reverses this change. IMO, resource has no place as a core DID-URL parameter. It should be a parameter for resolution at best.

FWIW, I had an issue partially drafted after the 1/5 call, but I got distracted by the assault on the Capitol and that draft was lost when a power surge power cycled my PC. I'll get something submitted ASAP.

@jandrieu
Contributor

To answer @brentzundel's questions:

@jandrieu I don't understand how examples B1 and B2 would work. If only the DID is provided, wouldn't resolution just result in the DID Document? How would the relying party indicate they'd like the resource dereferenced instead of the DID Document resolved?

Yes. Resolution just results in the DID Document. That's ALWAYS true. Regardless of this issue. You need to call the dereferencing function to get the resource, which is Method specific.

In examples *.1 and *.2, it is clear from context that a DID Document is not what is being referenced, just like when I use a http://joeandrieu.com as an HREF value in an HTML anchor tag, the intent is NOT to return the DNS record. I believe this conflation of resolution results and resource dereferencing is the heart of the problem here. I may have missed it in the current spec, but nowhere could I find language stating that dereferencing a naked DID returns the DID Document. We leave it up to the DID Method to define what is returned from dereferencing.

So, the function you seem to be asking for is really a resolver parameter that means ALSO dereference in the same step.

That's the key. Are you calling the resolve function (https://pr-preview.s3.amazonaws.com/brentzundel/did-spec/pull/480.html#did-resolution) or the dereference function (https://pr-preview.s3.amazonaws.com/brentzundel/did-spec/pull/480.html#did-url-dereferencing)? If you resolve, you get the DID Document (and meta-data); if you dereference, you get whatever content-stream the DID Method defines (and meta-data).
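A minimal sketch of that distinction, assuming a toy in-memory method. `ToyMethod`, its fields, and the result keys are hypothetical names chosen for illustration; they are not taken from the spec or from any real resolver library.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMethod:
    """Hypothetical in-memory DID method; not part of any real library."""
    documents: dict = field(default_factory=dict)  # DID -> DID Document
    resources: dict = field(default_factory=dict)  # DID URL -> content

    def resolve(self, did: str) -> dict:
        # Resolution: always the DID Document plus metadata.
        return {"didDocument": self.documents[did], "metadata": {}}

    def dereference(self, did_url: str) -> dict:
        # Dereferencing: whatever content stream this method defines.
        return {"contentStream": self.resources[did_url], "metadata": {}}


# Same identifier, two different functions, two different results.
method = ToyMethod(
    documents={"did:example:123": {"id": "did:example:123"}},
    resources={"did:example:123": b"<svg/>"},
)
resolved = method.resolve("did:example:123")
dereferenced = method.dereference("did:example:123")
```

In this sketch the caller picks the function; nothing in the identifier itself says which result is wanted.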

For DID-URLs with a service parameter, it is understood that dereferencing would dereference the service endpoint identified by that parameter. I also understand that many of us have assumed that dereferencing a DID without a service parameter would return the DID Document. But that is NOT in the spec. It is up to the method to define those functions. So, you are free to define any dereferencing result you want for your method.

Apart from that, your final example using a service endpoint intrigues me, but I would like to see a PR with proposed spec changes that introduces language to the service endpoint section so I can be sure I understand how such a solution might work.

No spec text changes are required. It is already supported. It would be up to the DID Method to provide a service type that would mean to retrieve an associated asset from whatever registry is in use. Users of that Method could simply put in their DID Document a service of that service type. Just as a resolver must know how to resolve the Methods it handles, so too, dereferencers must know how to dereference service types.

All you need to do is ensure that the output of resolution includes suitable information for that dereferencing to take place, either in the DID Document as a method-specific property or in one of the meta-data fields. You pass that to the dereference function, and you get back your resource.
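One way this could look, as a rough sketch only: the service type `ExampleAssetService`, the `registry` dict, and the function name below are all made up for illustration, and a real Method would define its own service type and lookup.

```python
def dereference_via_service(resolution_result, service_type, handlers):
    """Find a service of the given type and pass its endpoint to a handler."""
    services = resolution_result["didDocument"].get("service", [])
    for service in services:
        if service.get("type") == service_type:
            # The dereferencer must know how to handle this service type.
            return handlers[service_type](service["serviceEndpoint"])
    raise LookupError(f"no service of type {service_type!r}")


# Hypothetical registry contents and handler for the made-up service type.
registry = {"asset-42": b"example asset bytes"}
handlers = {"ExampleAssetService": lambda endpoint: registry[endpoint]}

# Hypothetical resolution output that carries the information needed.
resolution_result = {
    "didDocument": {
        "id": "did:example:123",
        "service": [
            {
                "id": "did:example:123#asset",
                "type": "ExampleAssetService",
                "serviceEndpoint": "asset-42",
            }
        ],
    },
    "metadata": {},
}

asset = dereference_via_service(
    resolution_result, "ExampleAssetService", handlers
)
```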

What also puzzles me is why we have a resource property when we already have an "accept" parameter that could be set to the mime type of the expected resource:

accept
The MIME type of the caller's preferred representation of the DID document. The DID resolver implementation SHOULD use this value to determine the representation contained in the returned did-document-stream if such a representation is supported and available. This property is OPTIONAL. It is only used if the resolveRepresentation function is called and MUST be ignored if the resolve function is called.

You should be able to just use the "accept" property to the dereference function and the dereferencer figures it out. An added benefit there is that you don't have to leak in the URL what resources might be available. Content negotiation via accept header handles all of that. *.1 (a JSON-LD context) would specify "application/ld+json" for its accept property while *.2 (an html image element) would likely use some range of appropriate image types, e.g., "image/jpeg | image/png | image/svg+xml".
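A small sketch of what that selection could look like, assuming the dereference function accepted an ordered list of acceptable media types. The function signature and the `representations` store are hypothetical; the spec text quoted above only defines `accept` for `resolveRepresentation`.

```python
def dereference(representations, did, accept):
    """Return the first stored representation whose media type is accepted."""
    available = representations.get(did, {})
    for media_type in accept:
        if media_type in available:
            return media_type, available[media_type]
    # Nothing the caller accepts is available for this DID.
    return None, None


# Hypothetical store: one DID with two representations.
representations = {
    "did:example:123": {
        "application/ld+json": b'{"@context": {}}',
        "image/png": b"\x89PNG",
    }
}

# A JSON-LD processor asks for a context document.
context = dereference(
    representations, "did:example:123", ["application/ld+json"]
)

# An HTML image element asks for any of several image types.
image = dereference(
    representations,
    "did:example:123",
    ["image/jpeg", "image/png", "image/svg+xml"],
)
```

The URL stays the bare DID in both calls; only the `accept` value differs.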

@jandrieu
Contributor

Herd Privacy discussion moved to issue #539
