[Feature]: Add the self-optimizing.partition-filter parameter. #3424

Open
2 of 3 tasks
lintingbin opened this issue Jan 23, 2025 · 0 comments · May be fixed by #3426
Labels
type:feature Feature Requests

Comments

@lintingbin
Contributor

Description

Add a self-optimizing.partition-filter parameter that restricts Amoro's optimization to a specified range of partitions, similar to the where parameter of Spark's rewrite_data_files procedure.

Use case/motivation

Currently, when Amoro optimizes Iceberg tables, it optimizes data in all partitions by default. In practice, this causes the following problems:

High cost of optimizing historical data: Some tables' historical data may not conform to Amoro's optimization rules, so optimizing all of it wastes resources and degrades performance.

Conflict between concurrent writing and optimization: Some historical partitions may be undergoing data repair operations that involve deletions and can run for a relatively long time. In such cases, it is preferable for Amoro to skip these partitions to avoid conflicts.

Describe the solution

Add the self-optimizing.partition-filter parameter.
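
To illustrate the intended behavior, here is a minimal sketch of how a partition filter could narrow the set of partitions considered for optimization. It is not Amoro's implementation: the `Partition` class, the `select_partitions` function, and the predicate are hypothetical names chosen for this example, and the issue does not define the actual filter syntax.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Partition:
    """Hypothetical stand-in for a table partition keyed by a date column."""
    dt: date
    small_file_count: int


def select_partitions(
    partitions: Iterable[Partition],
    partition_filter: Optional[Callable[[Partition], bool]] = None,
) -> List[Partition]:
    """Return the partitions eligible for optimization.

    With no filter, every partition is eligible (the current default
    described in this issue). With a filter, only matching partitions are.
    """
    if partition_filter is None:
        return list(partitions)
    return [p for p in partitions if partition_filter(p)]


partitions = [
    Partition(date(2024, 6, 1), 120),   # historical partition
    Partition(date(2025, 1, 20), 35),
    Partition(date(2025, 1, 22), 48),
]

# Assumed filter: only optimize partitions from 2025 onward.
recent_only = lambda p: p.dt >= date(2025, 1, 1)

selected = select_partitions(partitions, recent_only)
print([p.dt.isoformat() for p in selected])
```

Under this sketch the historical partition is skipped, which addresses both use cases above: historical data is not rewritten, and partitions under repair are left alone.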

Subtasks

  • Add the self-optimizing.partition-filter parameter. @lintingbin

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@lintingbin lintingbin added the type:feature Feature Requests label Jan 23, 2025
@lintingbin lintingbin linked a pull request Jan 24, 2025 that will close this issue