Feedback #15
Hey, thanks for the thoughtful comments.
I wouldn't say browsers are the driving use case, but consistency (determinism) across browsers and desktop implementations (and even hosted services) is. For example:
There's some tension here because BLAKE3 is heavy to run in browsers but useful for large files (@bnewbold goes into more detail on atproto's divergent needs in #1).
Noted! People are saying both "this is too IPFS-centric" and "this isn't IPFS-related enough" so there's clearly something here...
Re: @Gozala's just-in-time chunking, there is an open proposal to make these merkle-aware CIDs an alternative to CIDv1s, i.e. CIDvM (not calling it v2 because I'm not sure backwards compatibility has been worked out or promised). It's not quite "ready to merge" — it would give up one of the precious single-byte multicodecs, and lots of big design questions are open — but it's something I'm tracking re: the future of multiformats, and it seems relevant here as a back-of-mind/long-tail planning consideration. On the other hand, I don't think having DASLs point to the "whole-file hash" rather than to the root CID of a presumed [recursively retrieved] tree is only about incremental verification and just-in-time chunking; it's also about cleaner interop with other use cases, like CIDs that function as NIHs/data URIs, CIDs that can be binary master keys in package managers/binary archives, etc. A better framing would be that "chunking at transport time" is one of many use cases where people want an identifier for the "whole content", not a promise that may time out in unpredictable recursive delivery 😈
This part I think I get.
Further use case details would be helpful.
As I understand it, DASL is just a subset of CID around which you propose to do interoperability. If this is the case, I'd just say that, and skip the "we're making a new standard" framing.
The reason I feel this way is that compatibility with CIDs seems to be an explicit goal. Even in this reduced spec you're willing to pay a complexity price to maintain CID compatibility (why else keep these unused bytes for multibase, version, encoding, and hash algorithm?). Is not breaking compatibility with CIDs in the future also a goal? It seems that way.
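To make the "unused bytes" point concrete, here's a rough sketch (my illustration, not anything normative from the spec) of the fixed prefix a CIDv1 carries before the digest, each byte being a multiformats table entry that a DASL-style subset would pin to a single allowed value:

```python
# Sketch: binary CIDv1 layout <version><content codec><multihash>,
# with the multihash being <hash-fn code><digest length><digest>.
import hashlib

def cidv1_bytes(data: bytes, codec: int = 0x55) -> bytes:
    # 0x01 = CIDv1, codec (0x55 raw / 0x71 dag-cbor),
    # 0x12 = sha2-256, 0x20 = 32-byte digest length
    digest = hashlib.sha256(data).digest()
    return bytes([0x01, codec, 0x12, 0x20]) + digest

cid = cidv1_bytes(b"hello world")
assert len(cid) == 4 + 32
assert cid[:4] == bytes([0x01, 0x55, 0x12, 0x20])
```

In a restricted subset those prefix bytes only ever take one or two values each, which is exactly the complexity price (and the compatibility payoff) being weighed here.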
In that case, it's a useful subset/convention, closer to something like JavaScript strict mode or using Prettier to format your code.
Again, still very useful, and especially useful if the target context you'd want to use this in is well defined (I get the feeling it's CIDs in browsers, but again, it's not 100% clear).
I think the first thing to decide is whether this is a subset that will be permanently compatible with the larger CID standard or not. If compatibility is intended, then even if you follow @b5's suggestion to "hide the IPFS", the fact that it is an explicit subset of a larger standard is something to remain explicit about. Otherwise it's implied insider knowledge.
One piece stands out as very different from the others:
"Regardless of size, resources should not be "chunked" into a DAG or Merkle tree (as historically done with UnixFS canonicalization in IPFS systems) but rather hashed in their entirety and content-addressed directly."
It stands out because it's the only thing that goes well beyond "I will limit what is allowed for CIDs". I can somewhat intuit the goal here (incremental verifiability is certainly not natively supported in browsers), but it's not explicit.
Also, as a spec, absent the context of IPFS and UnixFS annoyances, it's super unclear what constitutes linked data vs. chunked data. Clearly there are some DAGs here, unless you're saying CIDs are not allowed as fields in dCBOR42 -- which, given you're using 0x71, seems implied. I think maybe you just need to define more precisely what a "resource" is (maybe you mean an HTTP resource?).
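To illustrate the distinction the quoted sentence is drawing, here's a toy comparison (my sketch, with sha-256 standing in for both schemes, and the UnixFS side heavily simplified): hashing a resource in its entirety versus deriving a root from fixed-size chunks. The point is that in the chunked case the identifier depends on a chunk-size parameter the content itself doesn't contain:

```python
import hashlib

DATA = bytes(range(256)) * 40  # ~10 KiB stand-in "resource"

# Whole-resource addressing (the DASL rule): one digest over all bytes.
whole = hashlib.sha256(DATA).digest()

# Chunked/Merkle addressing (UnixFS-style, heavily simplified): hash
# fixed-size chunks, then hash the concatenated leaf digests for a root.
def chunked_root(data: bytes, chunk: int) -> bytes:
    leaves = [hashlib.sha256(data[i:i + chunk]).digest()
              for i in range(0, len(data), chunk)]
    return hashlib.sha256(b"".join(leaves)).digest()

# Different chunk sizes yield different identifiers for the same bytes,
# and neither matches the whole-content digest.
assert chunked_root(DATA, 1024) != chunked_root(DATA, 2048)
assert chunked_root(DATA, 1024) != whole
```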
BTW, chunked encoding is a real transport concern; I tend to feel our failure in IPFS's design was doing it BEFORE the moment of transport rather than just in time. That's why BLAKE3 can be kinda awesome -- you can chunk at transport time rather than at encoding time. There's even an interesting proposal from @Gozala to do BLAKE3 for structured IPLD data in a very interesting way: https://github.com/Gozala/merkle-reference/blob/main/docs/spec.md
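A toy sketch of why transport-time chunking works with a tree hash in the BLAKE3 style (my illustration, using sha-256 as the primitive and a much-simplified tree; real BLAKE3 differs in its compression and chaining details): because the chunk size is fixed by the hash design rather than chosen at encode time, the root identifies the whole content, and any single chunk can be verified against it with a few sibling digests:

```python
import hashlib

CHUNK = 1024  # fixed by the hash design (BLAKE3 uses 1 KiB chunks);
              # a transport can't change it, so the root is stable

def leaf(d: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + d).digest()

def node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()

def merkle_root(data: bytes) -> bytes:
    level = [leaf(data[i:i + CHUNK]) for i in range(0, max(len(data), 1), CHUNK)]
    while len(level) > 1:
        level = [node(level[i], level[i + 1]) if i + 1 < len(level) else level[i]
                 for i in range(0, len(level), 2)]
    return level[0]

data = bytes(range(256)) * 16          # 4 KiB "file" -> 4 leaves
root = merkle_root(data)
chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
l0, l1, l2, l3 = map(leaf, chunks)

# A receiver holding only the root can verify chunk 1 in isolation:
# the sender ships the chunk plus two digests (its sibling and uncle),
# and the receiver recomputes up to the known root.
proof_sibling, proof_uncle = l0, node(l2, l3)
assert node(node(proof_sibling, leaf(chunks[1])), proof_uncle) == root
```

That per-chunk verifiability is what makes "chunk just in time, at the transport" viable instead of baking the chunking into the identifier.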