Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use an invalid url with https protocol for invalid url tests. #32

Merged
merged 1 commit into from
Aug 10, 2023

Conversation

aljones15
Copy link
Contributor

@aljones15 aljones15 commented Aug 2, 2023

Addresses:

#27

Bug: it turns out between the vocab in the Vc2.0 context and the somewhat liberal way json-ld creates relative urls, our existing not-a-url tests were not failing in the way they were intended. This PR updates those tests to use a full invalid url with the https protocol. Further issues will need to be made on json-ld processors as it looks like what json-ld will turn into a relative url is kind of interesting.

@aljones15 aljones15 self-assigned this Aug 2, 2023
Copy link
Member

@msporny msporny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to use multiple different types of non-URLs, and possibly stuff that looks like a URL, but isn't (using a UTF-8 character, for instance, which is supposed to be URL-encoded).

Copy link
Member

@msporny msporny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, the existing "Not a URL" tests, were, in fact, valid URLs because of the way JSON-LD documents are processed.

Concrete change suggestion is to use the new URLs you're using /and/ add a UTF-8 charater to the URL that encodes a non-ASCII UTF-8 character.

You might also want to add a malformed URL that looks like a real URL but fails to parse as a URL in the browser.

@aljones15
Copy link
Contributor Author

aljones15 commented Aug 2, 2023

Just so this is known the new invalid url is this one:

https ://not-a-url.org

Additionally json-ld has no issues with many invalid urls such as:
"http://s[ËËËme.com"
"https://1.2.3.4.5/" (5 part ipv4 fails javascript's url parser, but passes json-ld's url parser)

Both of the above json-ld will use with no errors at all. So for instance an ipv4 url with 5 parts is kind of hard for a human to spot and an understandable mistake that a URL parser will catch, but json-ld will not catch it and allow.

Concrete change suggestion is to use the new URLs you're using /and/ add a UTF-8 charater to the URL that encodes a non-ASCII UTF-8 character.

non-ASCII characters are valid characters unless used in a protocol. So for instance a url with an umulaut in it would technically need to be translated escaped, but in practice url parsers will just automatically escape the non-ascii parts of the url:

> new URL('https://motlËycrËw.com')
URL {
  href: 'https://xn--motlycrw-w1ad.com/',
  origin: 'https://xn--motlycrw-w1ad.com',
  protocol: 'https:',
  username: '',
  password: '',
  host: 'xn--motlycrw-w1ad.com',
  hostname: 'xn--motlycrw-w1ad.com',
  port: '',
  pathname: '/',
  search: '',
  searchParams: URLSearchParams {},
  hash: ''
}

so using a non-ascii character in a url in say a VC is kind of invalid, but for convenience urls with non-ascii characters most parsers just automatically escape the url making adding a non-ascii character to a url pretty much mute outside of a protocol.

> new URL('heËyps://motlËycrËw.com')
Uncaught TypeError [ERR_INVALID_URL]: Invalid URL
    at __node_internal_captureLargerStackTrace (node:internal/errors:478:5)
    at new NodeError (node:internal/errors:387:5)
    at URL.onParseError (node:internal/url:565:9)
    at new URL (node:internal/url:641:5) {
  input: 'heËyps://motlËycrËw.com',
  code: 'ERR_INVALID_URL'
}

Copy link
Contributor

@davidlehn davidlehn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the best non-URL forms are. If these correctly fail, that's fine. jsonld.js in particular likely doesn't do full validation for performance reasons. Garbage in, garbage out. Other implementations may have similar issues. Or they may fail earlier or later.

@filip26
Copy link

filip26 commented Aug 3, 2023

just thinking loudly. A positive test, testing that an issuer/verifier accepts valid URI is important, but negative one on malformed URI? How important is this negative test? Given the fact that most strings are treated as a valid relative URI.

  • Positive tests are to confirm a conformance to the standard in terms of being able to correctly accept an input, in this case.
  • Negative tests enforce that nothing other than what's defined by the spec is accepted.

@msporny
Copy link
Member

msporny commented Aug 10, 2023

just thinking loudly. A positive test, testing that an issuer/verifier accepts valid URI is important, but negative one on malformed URI? How important is this negative test? Given the fact that most strings are treated as a valid relative URI.

Yes, we might want to remove this test if implementers feel like this test isn't adding anything worthwhile... if you use an invalid URL, well, your software won't be interoperable with other software that might not process that URL.

Let's merge this and see if implementers complain.

@aljones15 aljones15 merged commit c735b70 into main Aug 10, 2023
@aljones15 aljones15 deleted the fix-not-url-again branch August 10, 2023 21:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants