Issue found on page 'Environment' #4351

Open
oldmoe opened this issue Dec 12, 2024 · 1 comment
Labels
performance benchmarking and performance


oldmoe commented Dec 12, 2024

The memory requirements are given per thread for certain query types, but they seem to ignore the dataset size completely. For example, 5GB per thread is good for aggregate queries over what data size? 10TB? 1GB? 3.1MB?

The documentation is not clear on how the data size affects the memory requirements, or whether it has any effect at all.

Page URL: https://duckdb.org/docs/guides/performance/environment.html

szarnyasg changed the title from "Issue found on page 'Environment' The memory requirements are provided as per thread memory, given certain query types, and it seems to completely ignore the dataset size. e.g. 5GB per thread is good for aggregate queries over what data size? 10TB? 1GB? 3.1MB?" to "Issue found on page 'Environment'" on Dec 12, 2024
szarnyasg added the performance (benchmarking and performance) label on Jan 7, 2025
@szarnyasg
Collaborator

Hi @oldmoe, this is a good point, but unfortunately it's very tricky to answer. Typically, the memory required to run a workload fully in memory increases with the dataset size, but it depends even more on the queries. For example, join-heavy workloads, especially those with many-to-many joins that produce large intermediate results, require more memory than, e.g., simple aggregations.
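
To illustrate why query shape can matter more than input size, here is a minimal pure-Python sketch. It is not DuckDB's actual memory model: the helper names `join_output_rows` and `aggregation_groups` and the toy data are hypothetical. It only counts rows, comparing how many rows an equi-join produces with how many groups a simple aggregation keeps state for, given the same input.

```python
from collections import Counter


def join_output_rows(left_keys, right_keys):
    """Rows produced by an equi-join: per key, left matches times right matches."""
    left_counts = Counter(left_keys)
    right_counts = Counter(right_keys)
    # Counter returns 0 for keys missing on the right, so unmatched keys add nothing.
    return sum(n * right_counts[k] for k, n in left_counts.items())


def aggregation_groups(keys):
    """Number of distinct groups a GROUP BY on this key keeps state for."""
    return len(set(keys))


# Hypothetical toy data: 1,000 rows per side but only 10 distinct keys,
# so each key matches 100 rows on each side (a many-to-many join).
left = [i % 10 for i in range(1000)]
right = [i % 10 for i in range(1000)]

print(join_output_rows(left, right))  # 10 keys * 100 * 100 = 100000
print(aggregation_groups(left))       # 10
```

With identical 1,000-row inputs, the join yields 100,000 intermediate rows while the aggregation tracks only 10 groups, which is one reason a single per-dataset-size memory figure is hard to give.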
