Test suite #38
Comments
Are there "test suite problems" that are not captured by the updates made as part of the rdf-tests CG? |
See RDF tests issue 51 -- w3c/rdf-tests#51 -- for previous discussions. The draft charter for the SPARQL 1.1 CG specifically recognizes liaison with the "RDF Test Curation Community Group". |
I do not know if this is the right time to discuss the test suite, but the first thing to do is to define exactly the same minimal API for all SPARQL services (#27). Once the minimal API is accepted, we can imagine a new solution to fully test each SPARQL implementation in parallel with the work of the future WG. In this new solution for testing SPARQL implementations, I would like:
I think it's time to industrialise SPARQL. If the CG officially asks me to participate in the WG to consolidate the next version of SPARQL, I can start to propose a new research project to develop this new platform. |
@kasei Good question. Nearly all files at https://github.com/w3c/rdf-tests/tree/gh-pages/sparql11 were last updated 3-4 years ago. See TFT-tests/issues: I believe the following are bugs in the tests: BorderCloud/TFT-tests#18, BorderCloud/TFT-tests#15, BorderCloud/TFT-tests#20, BorderCloud/TFT-tests#2. We even had some absurd discussions like
Sure enough, tvs02 is still not fixed. BorderCloud/TFT-tests#4 is pervasive: many tests use relative URLs but fix no base. @jeenbroekstra said that's not a bug and Karima's runner adds some base, but I think it is a bug. The other issues are more important: we need a flexible test result comparator (perhaps based on c14n), else there are many false negatives.
@afs the W3C Tests CG site has had no posts for 3.5 years (since 2015). I asked a couple of days ago "Is the activity of this group closed? TFT-tests runs continuous tests over some RDF repos, and tries to fix some of the tests. The biggest improvements needed in this suite is more flexible result comparison by the test runner". The comment is still awaiting moderation: I think that group is closed and gone.
Karima replied "I finished my thesis (in French): Karima Rafes. Le Linked Data à l'université : la plateforme LinkedWiki. Université Paris-Saclay, 2019. Français. The chapter 5 is the conclusion of this work. So, the project TFT is in standby and will disappear when W3C offer all tests with a tool such as TFT to validate the compliance with SPARQL. If the tests and the tools to run the tests becomes a prerequisite for validate the specifications, there will be less functionalities but SPARQL 1.2 will not have the interoperability problems of SPARQL 1.1. When the CG will work on the tests needed for SPARQL 1.2, I will try to work with it (if I have the time)."
Maybe I should have pressed with Ergo the point of this issue: SPARQL test suite activity needs to be restarted, and kept continuous for 3-4 years. Every SPARQL 1.2 feature must come with tests, and there should be a continuous-testing framework in place. Else there is a risk that users won't know which repo implements what and how well, and the new features won't be used much. |
If another group can take over testing that would be great. But it seems to me the W3C Tests CG is disbanded/passive. I think that together with forming this SPARQL 1.2 CG, the Tests CG must be restarted. @iherman and @gkellogg, please comment? |
The CG is not disbanded; it has been quiescent for a long time. It makes sense to have this CG drive SPARQL tests, but it may want to work out of the RDF tests CG repo. |
because there have been no fixes needed. https://github.com/w3c/rdf-tests/commits/gh-pages and https://github.com/w3c/rdf-tests/pulls?q=is%3Apr+is%3Aclosed show recent activity. Moving the work across CGs does not change the fact that someone has to do the work. Is there a barrier to contributing to RDF test CG? |
@afs then please move this task to rdf-tests (but change the title to something more descriptive). @gkellogg and @kasei and whoever else was active in rdf-tests, you'll be the best people to continue leading this work! I've long marveled at EARL and how EARL reports are used to generate Implementation Report htmls, a work of beauty. But do you agree with the more ambitious goals that Karima and I have proposed above?
Truth be told, I never tried; I didn't know it was active. I (or a QA at ONTO) would love to work with rdf-tests to eliminate false negatives. |
If that community wishes to take the issue, then fine. I do not believe pushing it at them is productive. There is RDF tests issue 51 -- w3c/rdf-tests#51 -- for previous discussions. Work on a test runner does not need any permission from anyone, but the idea of changing SPARQL to fit one particular runner seems a bad idea. Base URI handling is explained in the SPARQL test suite. RFC 3986 section 5.1 explains the general mechanism that applies to all URI resolution. |
I'm not seeking permission, I seek willingness for collaboration on this important topic. Do you think it'd be important to run a centralized continuous test runner for everyone's benefit?
Don't know what gave you that idea. I think that using relative URLs in tests without |
Base URL resolution is well defined. Beyond this issue, there have been other suggestions (e.g. in #27) to make backwards incompatible changes for the benefit of testing. I strongly agree with @afs that this sort of thing would be a bad idea. |
RDFa did something like that, which was a pain. Every implementation must maintain a service to respond to test queries. In reality, it was a lot of work. Today, you might use containerized apps, but it might be better to define a CI best practice for implementations to use to run the tests, and potentially send an updated report. Conceivably, the implementation report could be automatically updated, but it has required a lot of hand-holding in the past.
I don’t see that it would eliminate false negatives, as C14N and Isomorphism effectively allow equivalent comparisons. C14N might generate more useful diffs when results don’t compare.
Consider joining the CG. |
My continuous testing framework already works via Travis CI, and the results of the tests are collected in an RDF database via a SPARQL service. But for the federated query protocol, my first implementation is insufficient. We will have to imagine another method in the future. |
But when a test doesn't define a
I myself don't know what Karima means by #27. But don't throw away the baby with the bath water. Have you looked at http://sparqlscore.com and what do you think of it?
Most vendors (and I speak for one) have eval or free versions, that's what Karima used for her service. Vendors have an interest in perfecting their score. Karima's done a good job, but she needs the support of the RDF Test CG to keep it going and to improve it.
Have you considered the reproducibility of the Implementation Report? If I want to check all claimed results, what am I to do?
I'll speak to colleagues at ONTO.
It's easier to compare two c14n-ed result sets (the reference result and the SUT (system under test) response). The SUT response can often include extra triples, which the comparator must allow.
Yes, what to use as the counterparty server for federated queries is a difficult question.
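A minimal sketch of the "allow extra triples" comparison described above, assuming both results have already been canonicalized into ground triples (no blank nodes) represented as plain string tuples. The names `missing_triples` and `passes_allowing_extras` are hypothetical and do not come from any existing runner.

```python
from typing import Iterable, Set, Tuple

# Assumption: a triple is a (subject, predicate, object) tuple of strings
# produced by some canonicalization step that is not shown here.
Triple = Tuple[str, str, str]


def missing_triples(expected: Iterable[Triple], actual: Iterable[Triple]) -> Set[Triple]:
    """Return the expected triples that the SUT response does not contain."""
    return set(expected) - set(actual)


def passes_allowing_extras(expected: Iterable[Triple], actual: Iterable[Triple]) -> bool:
    """Pass when every expected triple is present; extra triples are tolerated."""
    return not missing_triples(expected, actual)


expected = {("http://example.org/s", "http://example.org/p", "http://example.org/o")}
actual = expected | {("http://example.org/s", "http://example.org/p", "http://example.org/extra")}
print(passes_allowing_extras(expected, actual))  # the extra triple is allowed
print(missing_triples(actual, expected))         # what a stricter check would report
```

Returning the missing triples, rather than only a boolean, gives a usable diff when a test fails. Blank nodes would need a real canonicalization or isomorphism check, which this sketch leaves out.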
|
The tests run from manifest files, which are Turtle. Suppose the manifest file is
when that is read by a Turtle parser, the RDF term for |
@afs Exactly my point: what is the actual value of @kasei comment on Protocol validation: #1 (comment). Would be great to include protocol tests in the suite. |
It is wherever the test suite resides. It is not fixed and does not need to be. This allows people to download the suite and run it locally as they have done. (After all, it is mostly the test suite for query engines.) This has been discussed at length before. What is the problem you are facing with relative URI resolution to make the test suite portable? |
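As a rough illustration of the point above, the sketch below resolves the same relative reference against two different manifest locations using Python's standard `urllib.parse.urljoin`. The manifest URLs and the file name `query.rq` are made-up examples, not paths taken from the test suite.

```python
from urllib.parse import urljoin

# Hypothetical locations of the same manifest: one hosted copy, one local download.
HOSTED_MANIFEST = "https://example.org/tests/sparql11/manifest.ttl"
LOCAL_MANIFEST = "file:///home/user/tests/sparql11/manifest.ttl"

# A relative reference as it might appear inside the manifest (illustrative name).
RELATIVE_REF = "query.rq"


def resolve(manifest_url: str, reference: str) -> str:
    """Resolve a relative reference against the URL the manifest was retrieved from."""
    return urljoin(manifest_url, reference)


print(resolve(HOSTED_MANIFEST, RELATIVE_REF))
print(resolve(LOCAL_MANIFEST, RELATIVE_REF))
```

The absolute URLs differ between the two copies, but each copy stays internally consistent, which is what lets the suite be downloaded and run locally.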
It's wrong. A unique and reproducible test suite is not an optional tool when we want to build real interoperability for the Semantic Web. I demonstrated that it is possible to use the same protocol test suite to evaluate our interoperability. It's free and reproducible online by anybody. It's excellent news for the next version of SPARQL, isn't it? It's time to use the same test suite to build real interoperability for SPARQL 1.1 and 1.2 and 2.0... |
Andy's comment "mostly the test suite for query engines" applies to the question of whether queries should specify their On the other hand, I believe that protocol tests are definitely fair game for such a test suite. |
Please update sparqlscore to work with RDF 1.1. |
I would like... but the test suite is implemented in RDF 1.0 (Turtle 1.0). I'm not sure I understood the meaning of the sentence. Sparqlscore loads the Turtle 1.0 of the official test suite (compliant in theory with 1.1). |
The issue for sparqlscore seems to be in the comparison of results. In RDF 1.1, simple strings and xsd:string are the same thing and there is a preference for omitting the datatype. When running tests, it is the comparison step that can handle this, even if a mix of simple strings and xsd:string occurs up to that point. |
@afs I dream... In a future version of the test suite, the SPARQL results should be strictly the same for the same query on the same data for any query engine (and of course with the same protocol). |
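One way a comparator could absorb this difference is to normalize literals before comparing them. The sketch below assumes literals are represented as `(lexical, datatype, lang)` tuples; that representation and the function name `normalize_literal` are illustrative assumptions, not part of sparqlscore or any specific library.

```python
from typing import Optional, Tuple

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANG_STRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"

# Assumption: a literal is modelled as (lexical form, datatype IRI, language tag).
Literal = Tuple[str, Optional[str], Optional[str]]


def normalize_literal(lexical: str,
                      datatype: Optional[str] = None,
                      lang: Optional[str] = None) -> Literal:
    """Map RDF 1.0-style literals onto their RDF 1.1 equivalents for comparison."""
    if lang is not None:
        # Language-tagged strings have datatype rdf:langString in RDF 1.1;
        # language tags are compared case-insensitively, so lower-case them.
        return (lexical, RDF_LANG_STRING, lang.lower())
    # A plain string with no datatype is an xsd:string in RDF 1.1.
    return (lexical, datatype or XSD_STRING, None)


# A simple string and an explicit xsd:string normalize to the same value.
print(normalize_literal("abc") == normalize_literal("abc", XSD_STRING))
```

With this step applied to both the expected and the actual result, an RDF 1.0 style engine and an RDF 1.1 style engine would compare equal on plain strings.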
@BorderCloud surely it's better to support the current standard than keep outdated implementations appearing to pass while ensuring new implementations appear to fail? sparqlscore.com says:
(Emphasis added.) I read that as implying the current standards, so if that's not what you're choosing to do, you might want to explicitly state as much.
The nature of scheduling different working groups and their related standards will make your dream very difficult to achieve in practice. In practice, however, I think there is already broad consensus around the test suite and what counts as a conforming implementation. |
@BorderCloud It will not invalidate a result from an RDF 1.0 based engine. |
I think that's true for everything except two tests. This rdf-tests commit explains the reasoning, and removes the old tests from the manifest list. |
@afs @kasei The attribute "datatype" seems to be required (for RDF 1.0 or 1.1). There is no default type when the attribute "datatype" does not exist. |
I'm not sure what the problem is. Could you provide some more context? Possibly helpful to this discussion, I'll point out that the RDF 1.1 Concepts and Abstract Syntax has this to say about literals:
|
I'm not sure this is the best place for this discussion... This is only one of the problems that still need to be explicitly specified in the next version. |
The datatype attribute was not required in SPARQL 1.0.
You are right there is no default datatype because in RDF 1.0 plain strings didn't have a datatype. |
This dream is neither realistic nor necessary. Query engines are allowed some flexibility eg
We need a more flexible comparator |
Comparing serialized results via byte-by-byte equality is brittle. Using a canonical serialization or testing result graph isomorphism helps, but as you mention above, there are still cases in which we want to give query engines more leeway. In such cases, we can define looser tests via invariants (e.g., |
Comparing two result sets: |
Unordered |
Yes - trying to avoid parsing the results in some way becomes more trouble than it's worth, effectively becoming a parser eventually. After all, XML and JSON allow layout variations and engines need room to deliver implementation choices and optimizations. Sounds to me like something to be written up as a "Practice and Experience" note. |
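To make the "parse, then compare" idea concrete, here is a small sketch that parses two documents in the SPARQL 1.1 Query Results JSON format and compares them as unordered multisets of solutions, so whitespace, key order, and row order do not matter. The function names are hypothetical. It deliberately ignores blank node relabelling and would be the wrong check for queries with ORDER BY.

```python
import json
from collections import Counter


def solution_key(binding: dict) -> frozenset:
    """Turn one solution (variable -> RDF term) into a hashable, order-free key."""
    return frozenset(
        (var, term["type"], term["value"], term.get("datatype"), term.get("xml:lang"))
        for var, term in binding.items()
    )


def same_solutions(expected_json: str, actual_json: str) -> bool:
    """Compare two SPARQL JSON result documents, ignoring layout and row order."""
    expected = json.loads(expected_json)
    actual = json.loads(actual_json)
    # The projected variables must match, but their order is not significant here.
    if set(expected["head"]["vars"]) != set(actual["head"]["vars"]):
        return False
    # Counter keeps duplicates, so the comparison respects multiset semantics.
    expected_rows = Counter(solution_key(b) for b in expected["results"]["bindings"])
    actual_rows = Counter(solution_key(b) for b in actual["results"]["bindings"])
    return expected_rows == actual_rows
```

A runner could call `same_solutions(expected_text, actual_text)` for unordered queries and fall back to a list comparison of the same keys when the query specifies an order.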
I had a chat with Nikolay Kolev, one of our leading testers.
|
@VladimirAlexiev Don't forget to also include the protocol tests. |
We have a rather basic test suite based on bash and curl: |
@namedgraph Great! |
Note that the RDF Test Suite Curation CG has taken on curation of RDF and SPARQL test suites, and there have been a number of additions and corrections. https://github.com/w3c/rdf-tests/tree/gh-pages/sparql11. Issues and PRs are welcome there. Of course, this is not official, as there is no active WG, but it has proven to be a useful resource for the community. |
Hello, I updated my work on the SPARQL test suite (maybe for the last time?). You can find here a draft report for SPARQL 1.1. With sponsors, I can develop all the tests on the protocols (query, update and error messages) and I can generate all the possible combinations between all the SPARQL services that really want to share the same protocol. With this approach, the SPARQL 1.2 working group can remove words like "should be", "Want to be", etc. from the specification and only specify the tests for each functionality in the official W3C repository. I have proven that it is possible to automate tests with the SPARQL protocol. It's time to recommend only one protocol for SPARQL 1.2. I hope my work helps you build a better SPARQL. |
The SPARQL 1.1 test suite is useful, but
@BorderCloud (Karima Rafes) has been running http://sparqlscore.com/ valiantly for 4 years (see documentation), added some tests and fixed some; and given up on others because of ambiguities in the spec.
She proposed and I support that whatever 1.2 features are standardized by this group, should have tests. I also put forward that this group should try to fix 1.1 test suite problems, and help w3c host a continuous testing harness.
The biggest improvements needed on this testing site are
Karima please add more from recent emails