Handle timestamp_ntz in delta conversion target #647

Draft: wants to merge 1 commit into main

vinishjail97
Contributor

What is the purpose of the pull request

Handle timestamp_ntz in the Delta target. This needs to be done in a better way, by finding the function in the Delta codebase that implements the behavior described below.
https://docs.delta.io/2.0.0/versioning.html
When creating a table, Delta Lake chooses the minimum required protocol version based on table characteristics such as the schema or table properties
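
As a rough illustration of the schema-driven choice described above, the reader version could be raised only when the schema actually contains a timestamp_ntz column. This is a minimal sketch, not the Delta function referred to above: `MinReaderVersionSketch`, `requiredReaderVersion`, and the plain string type names are hypothetical, and the only Delta-specific assumption is that the timestampNtz table feature needs reader version 3.

```java
import java.util.Collection;

// Hypothetical helper for illustration; not part of XTable or Delta Lake.
final class MinReaderVersionSketch {
  // Assumption: the timestampNtz table feature requires reader version 3.
  static final int TABLE_FEATURES_READER_VERSION = 3;

  private MinReaderVersionSketch() {}

  /**
   * Returns the reader version to request for a table.
   *
   * @param baselineReaderVersion version to use when no special types are present
   * @param columnTypeNames type names of the table's columns (illustrative strings)
   */
  static int requiredReaderVersion(
      int baselineReaderVersion, Collection<String> columnTypeNames) {
    boolean hasTimestampNtz =
        columnTypeNames.stream().anyMatch("timestamp_ntz"::equalsIgnoreCase);
    // Never lower the baseline; only raise it when the schema needs it.
    return hasTimestampNtz
        ? Math.max(baselineReaderVersion, TABLE_FEATURES_READER_VERSION)
        : baselineReaderVersion;
  }
}
```

With a baseline of 1, a schema without timestamp_ntz keeps reader version 1, so the bump to 3 applies only to tables that use the new type.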

Brief change log

  • Handle timestamp_ntz in delta target

@@ -66,9 +66,9 @@
import org.apache.xtable.spi.sync.ConversionTarget;

public class DeltaConversionTarget implements ConversionTarget {
-  private static final String MIN_READER_VERSION = String.valueOf(1);
+  private static final String MIN_READER_VERSION = String.valueOf(3);
Contributor Author

This should be handled gracefully by calling the function in the Delta codebase that upgrades the protocol version. This was a pending comment from the previous PR that wasn't addressed.

#428

Contributor

Is this change related to timestamp_ntz? Increasing the min_reader version could break certain consumers using old libraries. If it's unrelated, could we create a separate issue for this and perhaps make it configurable?
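
If the version were made configurable, one possible shape is a target property that defaults to the old value. This is a sketch only: the property key `xtable.delta.minReaderVersion` and the class below are hypothetical and do not exist in XTable or Delta today.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of XTable or Delta Lake.
final class ReaderVersionConfigSketch {
  // Hypothetical property key, invented for this example.
  static final String MIN_READER_VERSION_KEY = "xtable.delta.minReaderVersion";
  // Default matches the value used before this change.
  static final int DEFAULT_MIN_READER_VERSION = 1;

  private ReaderVersionConfigSketch() {}

  /** Reads the minimum reader version from target properties, falling back to the default. */
  static int minReaderVersion(Map<String, String> targetProperties) {
    String configured = targetProperties.get(MIN_READER_VERSION_KEY);
    if (configured == null || configured.trim().isEmpty()) {
      return DEFAULT_MIN_READER_VERSION;
    }
    // Integer.parseInt throws NumberFormatException on non-numeric input.
    int version = Integer.parseInt(configured.trim());
    if (version < 1) {
      throw new IllegalArgumentException(
          MIN_READER_VERSION_KEY + " must be at least 1, got " + version);
    }
    return version;
  }
}
```

Consumers on older libraries would keep the default of 1, and only tables that opt in would move to a higher reader version.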

Contributor Author

Delta already upgrades the protocol version automatically, based on the table's schema and other properties. If we call the same function when initializing the Delta table, it shouldn't break anything.

https://docs.delta.io/2.0.0/versioning.html
When creating a table, Delta Lake chooses the minimum required protocol version based on table characteristics such as the schema or table properties

@@ -61,6 +61,11 @@ public class DeltaSchemaExtractor {
private static final String DELTA_COLUMN_MAPPING_ID = "delta.columnMapping.id";
private static final String COMMENT = "comment";
private static final DeltaSchemaExtractor INSTANCE = new DeltaSchemaExtractor();
// Timestamps in Delta are microsecond precision by default
Contributor

Does XTable need to handle nanosecond precision?
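
To make the question concrete: if a source exposes nanosecond values, storing them at microsecond precision drops the last three digits. The sketch below shows that truncation with `java.time.Instant`. The class and method names are hypothetical, and it assumes values are represented as microseconds since the epoch.

```java
import java.time.Instant;

// Hypothetical helper for illustration; not part of XTable or Delta Lake.
final class TimestampPrecisionSketch {
  private static final long MICROS_PER_SECOND = 1_000_000L;
  private static final long NANOS_PER_MICRO = 1_000L;

  private TimestampPrecisionSketch() {}

  /** Converts an instant to microseconds since the epoch, discarding sub-microsecond digits. */
  static long toEpochMicros(Instant instant) {
    // getNano() is always in [0, 999_999_999], so this floors for pre-epoch instants too.
    return Math.addExact(
        Math.multiplyExact(instant.getEpochSecond(), MICROS_PER_SECOND),
        instant.getNano() / NANOS_PER_MICRO);
  }

  /** Rebuilds an instant from microseconds since the epoch. */
  static Instant fromEpochMicros(long epochMicros) {
    long seconds = Math.floorDiv(epochMicros, MICROS_PER_SECOND);
    long micros = Math.floorMod(epochMicros, MICROS_PER_SECOND);
    return Instant.ofEpochSecond(seconds, micros * NANOS_PER_MICRO);
  }
}
```

A round trip through `toEpochMicros` and `fromEpochMicros` returns the original instant only when its nanosecond field is a multiple of 1000. Any finer detail is lost.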
