Column selection / Filtering / Projections #15

Merged: 5 commits, May 2, 2024

@mkaruza mkaruza commented Apr 22, 2024

  • Read only the columns that are needed for query execution

closes #11

@mkaruza mkaruza requested a review from Tishj April 22, 2024 10:45
@mkaruza mkaruza force-pushed the projections-selection branch from 3dab29d to 111f266 Compare April 22, 2024 11:12
src/quack_types.cpp (outdated)
auto &array_mask = duckdb::FlatVector::Validity(result);
array_mask.SetInvalid(offset);
} else {
Datum value = projections.empty() ? values[i] : values[projections[i]];

Can we add tests for the pushdown?

Both to verify that the change works and, more importantly, to serve as a regression test for any future changes.
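The projection lookup under review, and the kind of regression check being asked for, can be sketched in isolation. This is a minimal sketch, not the PR's code: `Datum` is a stand-in typedef, and `ProjectedValue` and `ProjectRow` are hypothetical helper names. It assumes, as the reviewed line suggests, that an empty projection list means "read every column in table order".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a Postgres Datum (assumption: the real type is an opaque machine word).
using Datum = long;

// Hypothetical helper: returns the value for output column `i`.
// With no projection list, output column i is table column i;
// otherwise projections[i] names the table column to read.
inline Datum ProjectedValue(const std::vector<Datum> &values,
                            const std::vector<std::size_t> &projections,
                            std::size_t i) {
    const std::size_t source = projections.empty() ? i : projections[i];
    assert(source < values.size()); // guard against an out-of-range column id
    return values[source];
}

// Hypothetical helper: builds one output row from a full tuple.
inline std::vector<Datum> ProjectRow(const std::vector<Datum> &values,
                                     const std::vector<std::size_t> &projections) {
    const std::size_t width = projections.empty() ? values.size() : projections.size();
    std::vector<Datum> row;
    row.reserve(width);
    for (std::size_t i = 0; i < width; ++i) {
        row.push_back(ProjectedValue(values, projections, i));
    }
    return row;
}
```

A regression test along these lines would compare the projected row against the expected column order for both the empty and the non-empty projection case.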

* Read columns that are needed for query
* Skip query if any table is from PG_CATALOG_NAMESPACE or PG_TOAST_NAMESPACE
* Thread safe detoasting
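The namespace check in the second bullet might look roughly like the following. This is a sketch under stated assumptions: `TableRef` and `ShouldSkipQuery` are hypothetical names, and the two constants are redefined locally so the example compiles on its own (the real `PG_CATALOG_NAMESPACE` and `PG_TOAST_NAMESPACE` come from PostgreSQL's catalog headers).

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using Oid = std::uint32_t;

// Local stand-ins for PostgreSQL's PG_CATALOG_NAMESPACE and PG_TOAST_NAMESPACE
// (assumption: values copied here only to keep the sketch self-contained).
constexpr Oid kPgCatalogNamespace = 11;
constexpr Oid kPgToastNamespace = 99;

// Hypothetical description of a table referenced by the query.
struct TableRef {
    Oid namespace_oid;
};

// Returns true when any referenced table lives in a system namespace,
// in which case the query is left to the regular Postgres executor.
inline bool ShouldSkipQuery(const std::vector<TableRef> &tables) {
    return std::any_of(tables.begin(), tables.end(), [](const TableRef &table) {
        return table.namespace_oid == kPgCatalogNamespace ||
               table.namespace_oid == kPgToastNamespace;
    });
}
```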
@mkaruza mkaruza force-pushed the projections-selection branch from 111f266 to 24c6c40 Compare April 25, 2024 11:28

Tishj commented Apr 25, 2024

I feel like the last commit should be a separate PR after this one is finished, don't you agree?

@mkaruza mkaruza changed the title Projection & column selection Column selection / Filtering / Projections Apr 26, 2024

mkaruza commented Apr 26, 2024

Yeah, I agree, but this is one big pile of features right now (probably a lot of testing and experimentation).

mkaruza commented Apr 26, 2024

The testing part is still needed, but it will be added soon.

* Filter tuple on page level
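Page-level filtering, as named in the commit above, can be illustrated with a small stand-alone sketch. It is not the PR's implementation: `Tuple`, `ColumnFilter`, `PassesFilters`, and `FilterPage` are hypothetical names, and a page is modeled as a plain vector of tuples. The idea shown is that predicates are evaluated while the page is being read, so rejected tuples never reach the output.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Datum = long;                // stand-in for a Postgres Datum (assumption)
using Tuple = std::vector<Datum>;  // one row, indexed by table column

// Hypothetical pushed-down filter: a predicate bound to one table column.
struct ColumnFilter {
    std::size_t column;
    std::function<bool(Datum)> predicate;
};

// A tuple passes only if every filter accepts it; a filter that refers to
// a column the tuple does not have rejects the tuple.
inline bool PassesFilters(const Tuple &tuple, const std::vector<ColumnFilter> &filters) {
    for (const ColumnFilter &filter : filters) {
        if (filter.column >= tuple.size() || !filter.predicate(tuple[filter.column])) {
            return false;
        }
    }
    return true;
}

// Applies the filters while walking one page, keeping only matching tuples.
inline std::vector<Tuple> FilterPage(const std::vector<Tuple> &page,
                                     const std::vector<ColumnFilter> &filters) {
    std::vector<Tuple> kept;
    for (const Tuple &tuple : page) {
        if (PassesFilters(tuple, filters)) {
            kept.push_back(tuple);
        }
    }
    return kept;
}
```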
@mkaruza mkaruza force-pushed the projections-selection branch from 6a92567 to 28e46d3 Compare April 27, 2024 08:25
mkaruza added 2 commits April 27, 2024 16:22
* COUNT(*) doesn't require any columns to be retrieved so we only count
  tuples that pass visibility without fetching.
* Fixed incorrect column id for query filtering
* Writing output vector now works with/without projection information
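The COUNT(*) shortcut described in the first bullet can be sketched as follows. This is an illustration under assumptions, not the PR's code: `TupleStub`, `PageScanResult`, and `ScanPage` are hypothetical names, visibility is reduced to a boolean flag, and a `count_only` parameter stands in for "the query needs no columns".

```cpp
#include <cstddef>
#include <vector>

// Hypothetical heap tuple: a visibility flag plus its column values.
struct TupleStub {
    bool visible;
    std::vector<long> values;
};

// Result of scanning one page: how many tuples qualified, and their values
// when the query actually needs column data.
struct PageScanResult {
    std::size_t row_count = 0;
    std::vector<std::vector<long>> rows;
};

// Counts visible tuples; column values are copied only when count_only is false.
inline PageScanResult ScanPage(const std::vector<TupleStub> &page, bool count_only) {
    PageScanResult result;
    for (const TupleStub &tuple : page) {
        if (!tuple.visible) {
            continue; // invisible tuples contribute neither rows nor counts
        }
        ++result.row_count;
        if (!count_only) {
            result.rows.push_back(tuple.values);
        }
    }
    return result;
}
```

With `count_only` set, the scan does the visibility check and nothing else, which is the saving the commit message describes.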
struct varlena *result;
int32 rawsize;

result = (struct varlena *)duckdb_malloc(VARDATA_COMPRESSED_GET_EXTSIZE(value) + VARHDRSZ);

Tishj commented Apr 29, 2024

duckdb_malloc and duckdb_free are just wrappers around malloc and free; they provide no added benefit.
Also, these functions are part of the C API, which we're not using here:

void *duckdb_malloc(size_t size) {
	return malloc(size);
}

void duckdb_free(void *ptr) {
	free(ptr);
}
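Since the wrappers add nothing over `malloc` and `free`, one option in C++ code is to call the standard allocator directly and let a smart pointer own the buffer. The sketch below is only one possible approach, not what the PR ended up doing: `FreeDeleter`, `MallocBuffer`, and `AllocateVarlenaBuffer` are hypothetical names, and `kHeaderSize` is a local stand-in for `VARHDRSZ`.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>

// Deleter that releases memory obtained from std::malloc.
struct FreeDeleter {
    void operator()(void *ptr) const noexcept { std::free(ptr); }
};

// Owning handle for a malloc'ed byte buffer; freed automatically on scope exit.
using MallocBuffer = std::unique_ptr<unsigned char, FreeDeleter>;

// Local stand-in for VARHDRSZ (assumption: a 4-byte varlena length header).
constexpr std::size_t kHeaderSize = 4;

// Allocates room for a payload plus its header, mirroring the size
// computation in the reviewed line. Returns an empty handle on failure.
inline MallocBuffer AllocateVarlenaBuffer(std::size_t payload_size) {
    void *raw = std::malloc(payload_size + kHeaderSize);
    return MallocBuffer(static_cast<unsigned char *>(raw));
}
```

The caller checks the handle against `nullptr` before use; no explicit `free` call is needed on any exit path.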

@@ -162,6 +185,24 @@ PostgresHeapSeqScan::ParallelScanState::AssignNextBlockNumber() {
return block_number;
}

void
PostgresHeapSeqParallelScanState::PrefetchNextRelationPages(Relation rel) {

This is currently unused?

Yes, it is unused. If it turns out not to be needed, it will be removed before the public release.

@mkaruza mkaruza merged commit 3cdef3a into main May 2, 2024
2 checks passed
@mkaruza mkaruza deleted the projections-selection branch May 2, 2024 06:49
Linked issue: heap scan: column selection pushdown
2 participants