feat: Limit number of sources in merged scan task #3695

Open
colin-ho wants to merge 5 commits into main from colin/cap-scan-task-merge

@colin-ho (Contributor) commented Jan 16, 2025

Don't merge more than N (default 10) sources into a single scan task, so that small sources are not over-merged into one task.
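
To illustrate the idea, here is a minimal Python sketch of merging scan tasks with a cap on the number of sources. It is not the code in `src/daft-scan/src/scan_task_iters/mod.rs`; the names `ScanTask`, `merge_scan_tasks`, `min_size_bytes`, and `max_sources` are hypothetical, and the size-based merge rule is an assumption made only so the cap has something to limit.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class ScanTask:
    """Hypothetical stand-in for a scan task: its sources and a size estimate."""
    sources: List[str]
    size_bytes: int


def merge_scan_tasks(
    tasks: Iterable[ScanTask],
    min_size_bytes: int,
    max_sources: int = 10,
) -> Iterator[ScanTask]:
    """Greedily merge adjacent tasks until the accumulated task reaches
    min_size_bytes or merging the next task would exceed max_sources."""
    acc: Optional[ScanTask] = None
    for task in tasks:
        if acc is None:
            # Start a new accumulator from a copy of the current task.
            acc = ScanTask(list(task.sources), task.size_bytes)
            continue
        big_enough = acc.size_bytes >= min_size_bytes
        too_many_sources = len(acc.sources) + len(task.sources) > max_sources
        if big_enough or too_many_sources:
            # Emit the accumulated task and start over with the current one.
            yield acc
            acc = ScanTask(list(task.sources), task.size_bytes)
        else:
            acc.sources.extend(task.sources)
            acc.size_bytes += task.size_bytes
    if acc is not None:
        yield acc


# 25 tiny single-source tasks that would never reach the size threshold.
small_tasks = [ScanTask([f"file_{i}.parquet"], 1) for i in range(25)]
merged = list(merge_scan_tasks(small_tasks, min_size_bytes=1_000))
source_counts = [len(t.sources) for t in merged]
```

With the default cap of 10, the 25 tiny tasks above are grouped into tasks of 10, 10, and 5 sources. Without a cap they would all collapse into a single task, which is the over-merging this PR is meant to avoid.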

@github-actions bot added the feat label Jan 16, 2025

codspeed-hq bot commented Jan 16, 2025

CodSpeed Performance Report

Merging #3695 will degrade performance by 19.68%

Comparing colin/cap-scan-task-merge (a5b9f5e) with main (603199f)

Summary

⚡ 1 improvements
❌ 1 regressions
✅ 25 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
|---|---|---|---|
| test_iter_rows_first_row[100 Small Files] | 160.9 ms | 200.3 ms | -19.68% |
| test_show[100 Small Files] | 20.2 ms | 16.6 ms | +21.86% |

codecov bot commented Jan 16, 2025

Codecov Report

Attention: Patch coverage is 80.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 77.04%. Comparing base (4b67e5a) to head (a5b9f5e).
Report is 18 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/common/daft-config/src/python.rs | 57.14% | 3 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3695      +/-   ##
==========================================
- Coverage   77.84%   77.04%   -0.81%     
==========================================
  Files         732      729       -3     
  Lines       90605    92649    +2044     
==========================================
+ Hits        70534    71380     +846     
- Misses      20071    21269    +1198     

| Files with missing lines | Coverage Δ |
|---|---|
| daft/context.py | 87.65% <ø> (ø) |
| src/common/daft-config/src/lib.rs | 82.50% <100.00%> (+0.22%) ⬆️ |
| src/daft-scan/src/scan_task_iters/mod.rs | 90.98% <100.00%> (-0.31%) ⬇️ |
| src/common/daft-config/src/python.rs | 65.93% <57.14%> (-0.28%) ⬇️ |

... and 145 files with indirect coverage changes

@colin-ho requested a review from jaychia January 28, 2025 05:47