-
Notifications
You must be signed in to change notification settings - Fork 188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add class FragmentID. #4607
Add class FragmentID. #4607
Conversation
This pull request has been linked to Shortcut Story #38659: Implement class FragmentURI for succinct fragment parsing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of code change here because of the removal of get_timestamp_range
. We don't want to enlarge the scope of the PR unnecessarily, but there's one deficiency that we need to correct in the PR: there are no tests with invalid URI inputs. Parsers can fail in more ways than they can succeed. On cursory examination, the parser is going to need work ensuring that it's tight. We don't need to do all that work in this PR, but we do need to make sure that we have test infrastructure now.
- Fix the defect that is
assert
in the constructor. - Write a table-driven test that ensure that invalid inputs fail.
- Populate the table with invalid inputs.
- There's one class of errors that the
assert
will find. - The empty string should be an input, as well as one-character strings of all the punctuation characters.
- It will be very ease to make fragment version 1 and 2 strings that
sscanf
chokes on.
- There's one class of errors that the
- Populate a second table with invalid inputs that the current parser does not correctly throw an exception for. Instead of fixing the parser in this PR, call this the "known failure" table and test accordingly.
I've been thinking for a while that using the term "URI" everywhere has been something of a conflation of meaning and representation. We're using URI as the means by which we're implementing identifiers, and that the result is an identifier. The upshot is some name recommendations.
class FragmentID
in namespacetiledb::sm
fragment_identifier.h
We want to have FragmentID
stand alone without need for a separate URI object.
- Construction from URI would have to call the copy constructor of
class URI
to initialize the base class. - A more typical usage would construct directly from
std::string_view
.
We need both version accessors.
tiledb/sm/array/array_directory.cc
Outdated
|
||
if (fragment_version < 12) { | ||
utils::parse::FragmentURI frag_uri{fragment_uri}; | ||
if (frag_uri.version() < 12) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an example of how the array format version is being used to stand in place of the fragment URI version. What we want in this PR is the function that we'll use to rewrite with. We don't need to rewrite this function at the time.
Also note that the function that this is used in, get_commit_uri
, itself belongs somewhere in the storage_format
directory. When we rewrite this function to use the more appropriate version number, we can move it to that directory at the same time.
Note that this function doesn't use any member variables of ArrayDirectory
, which is one of those signs that the function would be better defined elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does use ArrayDirectory::uri_
, but that could be passed in from its accessor function.
(std::string("%") + std::string(PRId64)).c_str(), | ||
(long long int*)×tamp_range->first); | ||
timestamp_range->second = timestamp_range->first; | ||
(long long int*)×tamp_range_.first); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sscanf
does not allow any decent error handling. Probably need to rewrite this.
39396d7
to
181bebe
Compare
…er, subsequent related changes and cleanup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new failure tests are exactly what's necessary. We need some success tests as well and then we'll be done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with two small caveats.
- The problem with the definition with stats needs to be addressed in the dependency. It should not be patched over at the point of use.
- Issue with optional array-format version number in fragment-format version 3.
Both of these should be addressed with followup stories and future PR.
this_target_object_libraries(baseline vfs) | ||
if(TILEDB_STATS) | ||
this_target_compile_definitions(-DTILEDB_STATS) | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definition certainly does not belong here. Fixing it here is masking a defect elsewhere in the build.
This should be on the baseline
directory. It will then propagate through the interface properties of the object library.
std::pair{1, 2}, | ||
3, | ||
5}, | ||
}}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have a test vector for a fragment-format version 3 identifier with an optional array-format identifier at the end. At least it's documented as optional.
Not making this a blocker, but it's an inconsistency between the documentation and the tests.
@@ -16,4 +16,4 @@ _Note_: The presence of `[]` is indicative of an optional parameter. | |||
| :-: | :-: | :-: | | |||
| 1 - 2 | 1.4 - 1.5 | `__uuid_t1[_t2]` | | |||
| 3 - 4 | 1.6 - 1.7 | `__t1_t2_uuid` | | |||
| 5+ | 2.0+ | `__t1_t2_uuid_[v]` | | |||
| 5+ | 2.0+ | `__t1_t2_uuid[_v]` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment about inconsistent/missing unit test below.
Add class FragmentID to validate, parse and handle components of a fragment uri.
TYPE: NO_HISTORY
DESC: Add class FragmentID.