-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add delay to avoid rate limits #43
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThe changes introduce a new Changes
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
graphrag_sdk/kg.py (3)
137-144
: Consider adding validation for the delay parameter.While the parameter addition is good, consider adding validation to ensure the delay value is non-negative to prevent potential misuse.
def process_sources( self, sources: list[AbstractSource], instructions: str = None, delay: float = 0 ) -> None: + if delay < 0: + raise ValueError("Delay must be non-negative")
156-156
: Consider optimizing rate limit handling.The current implementation adds delay after adding sources to the set, but the rate-limited operations likely occur in
_create_graph_with_sources
. Consider:
- Moving the delay into
_create_graph_with_sources
where the actual API calls happen- Adding debug logging to help track rate limit occurrences
def _create_graph_with_sources( self, sources: list[AbstractSource] | None = None, instructions: str = None ): + logger.debug(f"Processing sources with delay: {delay} seconds") for source in sources: step = ExtractDataStep( sources=[source], ontology=self.ontology, model=self._model_config.extract_data, graph=self.graph, ) step.run(instructions) + if delay > 0: + logger.debug(f"Applying rate limit delay: {delay} seconds") + time.sleep(delay)
Consider using the
ratelimit
decorator for consistent rate limitingThe codebase already uses the
ratelimit
package with@limits
and@sleep_and_retry
decorators in bothextract_data_step.py
andcreate_ontology_step.py
for rate limiting API calls (15 calls per 60 seconds). For consistency, consider replacing the manualtime.sleep(delay)
implementation with the same decorator pattern.
graphrag_sdk/kg.py
: Replace manualtime.sleep(delay)
with@limits
and@sleep_and_retry
decorators from theratelimit
package- Consider standardizing the rate limits across the codebase (currently 15 calls/60s in other files)
🔗 Analysis chain
Line range hint
137-156
: Verify consistency with existing rate limiting patterns.Let's check if there are similar rate limiting implementations in the codebase that we could learn from or standardize with.
Let me gather more context about the other rate limiting implementations to make a proper assessment.
Let me check how the
ratelimit
decorators are being used in these files to understand the complete rate limiting pattern.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for existing rate limiting patterns echo "Searching for time.sleep usage..." rg "time\.sleep" echo "Searching for rate limit related comments..." rg -i "rate limit" echo "Searching for delay parameter patterns..." rg "delay:\s*float"Length of output: 799
Script:
#!/bin/bash # Check the implementation in extract_data_step.py and create_ontology_step.py echo "Checking extract_data_step.py implementation:" rg -A 5 -B 5 "time\.sleep\(10\)" graphrag_sdk/steps/extract_data_step.py echo -e "\nChecking create_ontology_step.py implementation:" rg -A 5 -B 5 "time\.sleep\(10\)" graphrag_sdk/steps/create_ontology_step.py # Check if there are any rate limit related imports or decorators echo -e "\nChecking for rate limit related imports:" rg "import.*rate.*limit" -i rg "from.*rate.*limit" -iLength of output: 1625
Script:
#!/bin/bash # Check how ratelimit decorators are used echo "Checking ratelimit usage in extract_data_step.py:" rg -A 10 -B 2 "@.*limits" graphrag_sdk/steps/extract_data_step.py echo -e "\nChecking ratelimit usage in create_ontology_step.py:" rg -A 10 -B 2 "@.*limits" graphrag_sdk/steps/create_ontology_step.pyLength of output: 1125
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
graphrag_sdk/kg.py
(3 hunks)
🔇 Additional comments (1)
graphrag_sdk/kg.py (1)
3-3
: LGTM: Import placement is correct.
The time
module import is appropriately placed with other standard library imports.
simple time.sleep() parameter to add delay
Summary by CodeRabbit
New Features
delay
, in the process of handling sources to manage rate limits with customizable pauses.Documentation
delay
parameter in the process_sources method.