Add Lint to CI (Ruff & Black) #63
base: main
Conversation
Walkthrough

The pull request introduces two new steps in the GitHub Actions workflow for running tests. The first addition is a step that utilizes the `astral-sh/ruff-action` to lint the codebase.
Copilot reviewed 1 out of 1 changed files in this pull request and generated no comments.
Actionable comments posted: 0
🧹 Nitpick comments (3)
.github/workflows/test.yml (3)
Lines 36-37: Consider additional Ruff configuration options

While the basic setup works, consider enhancing the linting configuration:

- Pin to a specific version (e.g., `v3.0.0`) for reproducibility
- Add a `rules` parameter to specify which checks to enable/disable
- Configure the `format` parameter for better CI output

Example configuration:

```diff
 - uses: astral-sh/ruff-action@v3
   with:
+    rules: "E,F,B,I"  # Example: Enable error, flake8, bugbear, and isort rules
+    format: "github"  # Format output for better GitHub integration
     src: >-
```
Lines 38-41: Simplify path specifications and consider adding pyproject.toml

The source paths can be simplified, and Ruff configuration could be moved to a configuration file:

- Simplify paths:

```diff
 with:
   src: >-
-    "./graphrag_sdk"
-    "./examples"
-    "./tests"
+    graphrag_sdk
+    examples
+    tests
```

- Consider adding a `pyproject.toml` file with Ruff configuration:

```toml
[tool.ruff]
target-version = "py38"  # Note: "py3" is not a valid value; use a concrete version
select = ["E", "F", "B", "I"]  # Example rules
ignore = []  # Add rules to ignore
```
Lines 42-42: Remove trailing spaces

There are trailing spaces on this line that should be removed.

🧰 Tools
🪛 yamllint (1.35.1)
[error] 42-42: trailing spaces (trailing-spaces)
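As an aside, trailing whitespace like the instance flagged above can also be stripped mechanically before committing. A minimal sketch using GNU `sed`; the sample file here is a stand-in for the real workflow file (`.github/workflows/test.yml`):

```shell
# Create a scratch file with trailing spaces/tabs to demonstrate on;
# in the repository you would target .github/workflows/test.yml instead.
printf 'jobs:  \n  test:\t\n' > /tmp/workflow-sample.yml

# GNU sed: delete any run of spaces/tabs at the end of every line, in place.
sed -i 's/[[:space:]]*$//' /tmp/workflow-sample.yml

# Confirm no line ends in whitespace anymore (grep finds nothing).
! grep -q '[[:space:]]$' /tmp/workflow-sample.yml
```

The same one-liner can be wired into a pre-commit hook so the yamllint error never reappears.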
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)

- .github/workflows/test.yml (1 hunks)
🧰 Additional context used
🪛 yamllint (1.35.1)
.github/workflows/test.yml
[error] 42-42: trailing spaces
(trailing-spaces)
🔇 Additional comments (1)
.github/workflows/test.yml (1)
Lines 36-36: Well-placed linting step!

Good placement of the Ruff linting step early in the workflow, right after Python setup. This ensures code quality issues are caught before proceeding with dependency installation and testing.
Actionable comments posted: 0
🧹 Nitpick comments (6)
graphrag_sdk/orchestrator/steps/parallel.py (2)
Lines 5-5: Optimize imports to avoid redundancy

Consider importing `StepBlockType` directly and removing the module-level import, since `PlanStep` is now directly imported. This would make the dependencies clearer and reduce namespace pollution.

```diff
-import graphrag_sdk.orchestrator.step
-from graphrag_sdk.orchestrator.step import PlanStep
+from graphrag_sdk.orchestrator.step import PlanStep, StepBlockType
```
Lines 62-62: Simplify StepBlockType usage for consistency

The class inheritance has been simplified to use the direct `PlanStep` reference, but `StepBlockType` is still being accessed through the full module path. For consistency, consider using a direct import for `StepBlockType` as suggested in the import optimization above.

```diff
 class ParallelStep(PlanStep):
     # ...
     @property
     def block(self) -> graphrag_sdk.orchestrator.step.StepBlockType:
-        return graphrag_sdk.orchestrator.step.StepBlockType.PARALLEL
+        return StepBlockType.PARALLEL
```

Also applies to: 74-75
graphrag_sdk/source.py (1)

Lines 2-2: Consider enforcing abstract methods

The `AbstractSource` class doesn't enforce any abstract methods despite being an abstract base class. Consider marking `load()` as abstract to ensure all subclasses implement it.

```diff
 from abc import ABC
+from abc import abstractmethod

 class AbstractSource(ABC):
     def __init__(self, path: str):
         self.path = path
         self.loader = None
         self.instruction = ""

+    @abstractmethod
     def load(self) -> Iterator[Document]:
-        return self.loader.load()
+        pass
```

graphrag_sdk/document_loaders/pdf.py (1)
Lines 18-22: Consider using a more readable import pattern

While using `ModuleNotFoundError` is more specific and appropriate, the `__import__` approach is less readable than the standard import pattern.

```diff
 try:
-    __import__('pypdf')
+    import pypdf  # noqa: F401
 except ModuleNotFoundError:
     raise ModuleNotFoundError(
         "pypdf package not found, please install it with "
         "`pip install pypdf`"
     )
```

🧰 Tools
🪛 Ruff (0.8.2)
20-22: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
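The two findings above (the import pattern and the B904 chaining rule) combine naturally into one guard function. A minimal sketch; `somepkg` is a hypothetical optional dependency, not a package this project actually uses:

```python
def require_somepkg() -> None:
    """Import an optional dependency, failing with an actionable message."""
    try:
        import somepkg  # noqa: F401  # hypothetical optional dependency
    except ModuleNotFoundError as err:
        # Chaining with `from err` preserves the original traceback (B904),
        # so the user sees both the missing-module error and this hint.
        raise ModuleNotFoundError(
            "somepkg package not found, please install it with `pip install somepkg`"
        ) from err
```

Called at the top of a loader's `__init__`, this fails fast with a clear install hint while keeping the underlying import error attached as `__cause__`.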
.github/workflows/test.yml (1)
Lines 36-40: Remove trailing spaces and consider adding ruff configuration

- Remove trailing spaces on line 40.
- Consider adding a `pyproject.toml` or `ruff.toml` to configure ruff rules for the project.

```diff
 - uses: astral-sh/ruff-action@v3
   with:
     src: >-
       "./graphrag_sdk"
-      
+
```

Would you like me to help you create a basic ruff configuration file with recommended settings for your project?
🧰 Tools
🪛 yamllint (1.35.1)
[error] 40-40: trailing spaces
(trailing-spaces)
graphrag_sdk/steps/extract_data_step.py (1)
Line range hint 1-20: Consider additional code quality improvements.

While the current changes improve code quality, here are some additional suggestions that align with the PR's objective:

- Consider organizing imports into standard library, third-party, and local imports
- Replace print statements in `_create_entity` and `_create_relation` with proper logging
- Consider adding more context to error messages, such as including the actual values that caused the error

Example of organized imports:

```python
# Standard library
import os
import time
import json
import logging
from threading import Lock
from uuid import uuid4
from concurrent.futures import Future, ThreadPoolExecutor

# Third-party
from tqdm import tqdm
from falkordb import Graph
from ratelimit import limits, sleep_and_retry

# Local
from graphrag_sdk.steps.Step import Step
from graphrag_sdk.source import AbstractSource
from graphrag_sdk.models.model import OutputMethod
from graphrag_sdk.helpers import extract_json, map_dict_to_cypher_properties
from graphrag_sdk.ontology import Ontology
from graphrag_sdk.models import (
    GenerativeModel,
    GenerativeModelChatSession,
    GenerationResponse,
    FinishReason,
)
from graphrag_sdk.fixtures.prompts import (
    EXTRACT_DATA_SYSTEM,
    EXTRACT_DATA_PROMPT,
    FIX_JSON_PROMPT,
    COMPLETE_DATA_EXTRACTION,
)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- poetry.lock is excluded by `!**/*.lock`
📒 Files selected for processing (15)

- .github/workflows/test.yml (1 hunks)
- graphrag_sdk/agents/agent.py (0 hunks)
- graphrag_sdk/agents/kg_agent.py (0 hunks)
- graphrag_sdk/attribute.py (0 hunks)
- graphrag_sdk/document_loaders/pdf.py (1 hunks)
- graphrag_sdk/helpers.py (1 hunks)
- graphrag_sdk/models/litellm.py (1 hunks)
- graphrag_sdk/orchestrator/__init__.py (0 hunks)
- graphrag_sdk/orchestrator/execution_plan.py (1 hunks)
- graphrag_sdk/orchestrator/orchestrator.py (0 hunks)
- graphrag_sdk/orchestrator/steps/parallel.py (3 hunks)
- graphrag_sdk/orchestrator/steps/summary.py (0 hunks)
- graphrag_sdk/source.py (1 hunks)
- graphrag_sdk/steps/create_ontology_step.py (3 hunks)
- graphrag_sdk/steps/extract_data_step.py (2 hunks)
💤 Files with no reviewable changes (6)
- graphrag_sdk/agents/kg_agent.py
- graphrag_sdk/orchestrator/orchestrator.py
- graphrag_sdk/orchestrator/steps/summary.py
- graphrag_sdk/agents/agent.py
- graphrag_sdk/attribute.py
- graphrag_sdk/orchestrator/__init__.py
✅ Files skipped from review due to trivial changes (4)
- graphrag_sdk/models/litellm.py
- graphrag_sdk/helpers.py
- graphrag_sdk/steps/create_ontology_step.py
- graphrag_sdk/orchestrator/execution_plan.py
🧰 Additional context used
🪛 yamllint (1.35.1)
.github/workflows/test.yml
[error] 40-40: trailing spaces
(trailing-spaces)
🔇 Additional comments (5)
graphrag_sdk/orchestrator/steps/parallel.py (2)
Lines 38-38: LGTM! Clean type annotation update

Good improvement replacing the string literal type with a direct reference, which is cleaner and more maintainable.

Lines 40-40: LGTM! Consistent type annotations

Good consistency maintaining the same type annotation style between the property and constructor parameter.

graphrag_sdk/source.py (1)

Line range hint 137-156: LGTM! Type hints and parameter additions look good.

The addition of type hints for `path` parameters and the new `rows_per_document` parameter in CSV and JSONL classes improves code quality and consistency.

graphrag_sdk/steps/extract_data_step.py (2)

Lines 176-176: LGTM! Simplified logging statement.

The logging statement has been simplified while maintaining its functionality, which aligns with good logging practices.

Lines 190-190: LGTM! Improved error message clarity.

The error message has been refined to be more concise while still providing the essential information about missing required fields.
Actionable comments posted: 5
🔭 Outside diff range comments (1)
graphrag_sdk/models/azure_openai.py (1)
Client initialization check required in ask() method
The review comment is correct. `self.client` is only initialized when `with_system_instruction()` is called, but `ask()` can be invoked without prior initialization, leading to a NoneType error. Consider either:

- Initializing the client in `__init__`
- Adding a client initialization check in `ask()`
🔗 Analysis chain
Line range hint 95-105: Ensure client is connected before making requests

In the `ask` method, `self.client` may not be initialized if `_connect_to_model` hasn't been called. This can lead to runtime errors when attempting to create chat completions. Verify that the client is connected before making API calls.

Run the following script to check where `_connect_to_model` is called:

🏁 Scripts executed

The following scripts were executed for the analysis:

```bash
#!/bin/bash
# Description: Find all calls to '_connect_to_model' in the codebase.
rg '_connect_to_model\(' --type py
```

Length of output: 182

```bash
#!/bin/bash
# Get the class definition and context around _connect_to_model usage
ast-grep --pattern 'class AzureOpenAI { $$$ }' graphrag_sdk/models/azure_openai.py
echo "=== Context around _connect_to_model call ==="
rg '_connect_to_model\(' -A 5 -B 5 graphrag_sdk/models/azure_openai.py
```

Length of output: 1043

```bash
#!/bin/bash
# Find the ask method implementation and its context
ast-grep --pattern 'def ask($$$) { $$$ }' graphrag_sdk/models/azure_openai.py
echo "=== Class structure ==="
ast-grep --pattern 'class AzureOpenAI { def __init__($$$) { $$$ } $$$ }' graphrag_sdk/models/azure_openai.py
```

Length of output: 259

```bash
#!/bin/bash
# Get the complete file content
echo "=== Complete file content ==="
cat graphrag_sdk/models/azure_openai.py
```

Length of output: 9160
🧹 Nitpick comments (12)
graphrag_sdk/orchestrator/step_result.py (1)
Line range hint 3-22: Well-structured abstract base class with comprehensive interface.

The `StepResult` ABC provides a robust contract for step results with proper type hints and essential methods. The design follows good practices by:

- Using property decorator for output
- Including serialization methods (to_json/from_json)
- Providing string representation methods

Consider adding docstrings to document the expected behavior of each abstract method, especially the format expected by `to_json` and `from_json`.

```diff
 class StepResult(ABC):
+    """Abstract base class defining the interface for step execution results."""

     @property
     @abstractmethod
     def output(self) -> str:
+        """Returns the output string representation of the step result."""
         pass

     @abstractmethod
     def to_json(self) -> dict:
+        """Serializes the step result to a JSON-compatible dictionary."""
         pass
```

graphrag_sdk/orchestrator/step.py (2)
Line range hint 5-23: Consider simplifying the StepBlockType.from_str method.

The current implementation using multiple if-elif statements could be simplified using a mapping or match statement.

```diff
 @staticmethod
 def from_str(text: str) -> "StepBlockType":
-    if text == StepBlockType.PARALLEL:
-        return StepBlockType.PARALLEL
-    elif text == StepBlockType.AGENT:
-        return StepBlockType.AGENT
-    elif text == StepBlockType.SUMMARY:
-        return StepBlockType.SUMMARY
-    elif text == StepBlockType.USER_INPUT:
-        return StepBlockType.USER_INPUT
-    else:
+    valid_types = {
+        StepBlockType.PARALLEL,
+        StepBlockType.AGENT,
+        StepBlockType.SUMMARY,
+        StepBlockType.USER_INPUT,
+    }
+    if text not in valid_types:
         raise ValueError(f"Unknown step block type: {text}")
+    return text
```
Lines 44-53: Enhance error handling in from_json method.

The method could benefit from more robust error handling:

- Validate required fields in JSON
- Provide more descriptive error messages

```diff
 def from_json(json: dict | str) -> "PlanStep":
     json = json if isinstance(json, dict) else loads(json)
     from graphrag_sdk.orchestrator.steps import PLAN_STEP_TYPE_MAP

+    if "block" not in json:
+        raise ValueError("Missing required field 'block' in JSON")
+
     block = StepBlockType.from_str(json["block"])
     step_type = PLAN_STEP_TYPE_MAP[block]
     if step_type is None:
-        raise ValueError(f"Unknown step block type: {block}")
+        raise ValueError(f"No implementation found for step block type: {block}")
```

graphrag_sdk/models/model.py (1)
Lines 10-14: LGTM! Consider adding docstring documentation.

The `OutputMethod` enum is well-structured and follows Python best practices. Consider adding a docstring to document the purpose and usage of each enum value.

```diff
 class OutputMethod(Enum):
+    """Defines the available output methods for model responses.
+
+    Attributes:
+        JSON: Format the response as JSON
+        DEFAULT: Use the default response format
+    """
     JSON = "json"
     DEFAULT = "default"
```

graphrag_sdk/models/azure_openai.py (1)
Line range hint 190-197: Use a constant for maximum message length

In the `send_message` method, the message is sliced to `message[:14385]`. Hardcoding this value can lead to maintenance challenges. Consider defining a constant or using the model's maximum token limit to make the code more flexible and maintainable.

Apply this diff to introduce a constant:

```diff
+MAX_MESSAGE_LENGTH = 14385  # Adjust based on model's token limit

 def send_message(
     self, message: str, output_method: OutputMethod = OutputMethod.DEFAULT
 ) -> GenerationResponse:
     """
     Send a message in the chat session and receive the model's response.
     """
-    prompt.append({"role": "user", "content": message[:14385]})
+    prompt.append({"role": "user", "content": message[:MAX_MESSAGE_LENGTH]})
```

tests/test_kg_litellm_openai.py (2)
Lines 89-89: Consider enhancing test assertions

The test only checks the number of actors but could be more comprehensive. Consider adding assertions for:
- The correctness of the answer format
- The presence of specific expected information
- Edge cases with different input data
Also applies to: 93-96
Lines 89-89: Use raw string for regex pattern

When using regex patterns, it's recommended to use raw strings (`r"pattern"`) to avoid any potential issues with escape sequences.

```diff
-    actors_count = re.findall("\d+", answer)
+    actors_count = re.findall(r"\d+", answer)
```

.github/workflows/test.yml (1)
Lines 36-44: Remove trailing whitespace in YAML

There's a trailing whitespace on line 44 that should be removed for consistency.

```diff
     src: >-
       "./graphrag_sdk"
-      
+
```

🧰 Tools
🪛 yamllint (1.35.1)
[error] 44-44: trailing spaces (trailing-spaces)
graphrag_sdk/steps/create_ontology_step.py (2)
Lines 65-69: Simplify nested context managers

Use a single `with` statement with multiple contexts for better readability.

```diff
-        with tqdm(
-            total=len(self.sources) + 1,
-            desc="Process Documents",
-            disable=self.hide_progress,
-        ) as pbar:
-            with ThreadPoolExecutor(max_workers=self.config["max_workers"]) as executor:
+        with tqdm(
+            total=len(self.sources) + 1,
+            desc="Process Documents",
+            disable=self.hide_progress,
+        ) as pbar, ThreadPoolExecutor(
+            max_workers=self.config["max_workers"]
+        ) as executor:
```

🧰 Tools
🪛 Ruff (0.8.2)
65-70: Use a single `with` statement with multiple contexts instead of nested `with` statements (SIM117)
: Consider adding a maximum retry limit constantThe retry logic uses a hardcoded value. Consider extracting this to a named constant for better maintainability.
+MAX_RESPONSE_RETRIES = 3 # Add at the top of the file with other constants + while ( responses[response_idx].finish_reason == FinishReason.MAX_TOKENS - and response_idx < retries + and response_idx < MAX_RESPONSE_RETRIES ):graphrag_sdk/kg.py (2)
Lines 92-95: Add stacklevel to warnings for better traceability.

When using warnings, specify the `stacklevel` to ensure warnings point to the correct source location.

```diff
 warnings.warn(
     "Cypher system instruction should contain {ontology}",
     category=UserWarning,
+    stacklevel=2,
 )
```

🧰 Tools
🪛 Ruff (0.8.2)
92-92: No explicit `stacklevel` keyword argument found (B028)
Lines 122-125: Add stacklevel to warnings for better traceability.

When using warnings, specify the `stacklevel` to ensure warnings point to the correct source location.

```diff
 warnings.warn(
     "Cypher generation prompt with history should contain {last_answer}",
     category=UserWarning,
+    stacklevel=2,
 )
```

🧰 Tools
🪛 Ruff (0.8.2)
122-122: No explicit `stacklevel` keyword argument found (B028)
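To see why `stacklevel` matters: with the default of 1, the warning is attributed to the `warnings.warn` call inside the library; with `stacklevel=2` it is attributed to the caller, which is where the bad argument actually originates. A minimal sketch with a hypothetical validator:

```python
import warnings


def validate_instruction(instruction: str) -> None:
    """Illustrative validator; warns when the placeholder is missing."""
    if "{ontology}" not in instruction:
        # stacklevel=2 makes the reported file/line point at the caller
        # of validate_instruction(), not at this warn() call itself.
        warnings.warn(
            "Cypher system instruction should contain {ontology}",
            category=UserWarning,
            stacklevel=2,
        )


validate_instruction("instruction with {ontology} placeholder")  # no warning
```

This is exactly what Ruff's B028 rule nudges toward: without `stacklevel`, users of `graphrag_sdk` would see the warning pinned to a line deep inside `kg.py` rather than to their own code.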
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (42)

- .github/workflows/test.yml (1 hunks)
- graphrag_sdk/__init__.py (1 hunks)
- graphrag_sdk/agents/__init__.py (1 hunks)
- graphrag_sdk/agents/agent.py (0 hunks)
- graphrag_sdk/agents/kg_agent.py (1 hunks)
- graphrag_sdk/attribute.py (1 hunks)
- graphrag_sdk/chat_session.py (6 hunks)
- graphrag_sdk/document.py (1 hunks)
- graphrag_sdk/document_loaders/jsonl.py (1 hunks)
- graphrag_sdk/document_loaders/pdf.py (3 hunks)
- graphrag_sdk/document_loaders/text.py (2 hunks)
- graphrag_sdk/document_loaders/url.py (3 hunks)
- graphrag_sdk/fixtures/regex.py (1 hunks)
- graphrag_sdk/helpers.py (5 hunks)
- graphrag_sdk/kg.py (9 hunks)
- graphrag_sdk/model_config.py (2 hunks)
- graphrag_sdk/models/__init__.py (1 hunks)
- graphrag_sdk/models/azure_openai.py (7 hunks)
- graphrag_sdk/models/gemini.py (3 hunks)
- graphrag_sdk/models/litellm.py (11 hunks)
- graphrag_sdk/models/model.py (2 hunks)
- graphrag_sdk/models/ollama.py (11 hunks)
- graphrag_sdk/models/openai.py (2 hunks)
- graphrag_sdk/orchestrator/__init__.py (1 hunks)
- graphrag_sdk/orchestrator/orchestrator.py (1 hunks)
- graphrag_sdk/orchestrator/step.py (2 hunks)
- graphrag_sdk/orchestrator/step_result.py (1 hunks)
- graphrag_sdk/orchestrator/steps/__init__.py (1 hunks)
- graphrag_sdk/orchestrator/steps/agent.py (1 hunks)
- graphrag_sdk/orchestrator/steps/parallel.py (3 hunks)
- graphrag_sdk/orchestrator/steps/user_input.py (1 hunks)
- graphrag_sdk/relation.py (1 hunks)
- graphrag_sdk/steps/create_ontology_step.py (9 hunks)
- graphrag_sdk/steps/extract_data_step.py (8 hunks)
- graphrag_sdk/steps/graph_query_step.py (1 hunks)
- tests/test_auto_create_ontology.py (1 hunks)
- tests/test_helper_validate_cypher.py (1 hunks)
- tests/test_kg_gemini.py (1 hunks)
- tests/test_kg_litellm_openai.py (2 hunks)
- tests/test_kg_ollama.py (3 hunks)
- tests/test_kg_openai.py (2 hunks)
- tests/test_multi_agent.py (0 hunks)
💤 Files with no reviewable changes (2)
- graphrag_sdk/agents/agent.py
- tests/test_multi_agent.py
✅ Files skipped from review due to trivial changes (19)
- graphrag_sdk/models/__init__.py
- graphrag_sdk/__init__.py
- graphrag_sdk/document.py
- tests/test_helper_validate_cypher.py
- graphrag_sdk/orchestrator/steps/agent.py
- graphrag_sdk/document_loaders/jsonl.py
- tests/test_kg_openai.py
- tests/test_kg_gemini.py
- graphrag_sdk/models/ollama.py
- graphrag_sdk/fixtures/regex.py
- graphrag_sdk/document_loaders/text.py
- graphrag_sdk/steps/graph_query_step.py
- graphrag_sdk/agents/__init__.py
- graphrag_sdk/chat_session.py
- graphrag_sdk/models/gemini.py
- graphrag_sdk/models/openai.py
- graphrag_sdk/relation.py
- tests/test_kg_ollama.py
- graphrag_sdk/document_loaders/url.py
🚧 Files skipped from review as they are similar to previous changes (7)
- graphrag_sdk/orchestrator/__init__.py
- graphrag_sdk/attribute.py
- graphrag_sdk/orchestrator/orchestrator.py
- graphrag_sdk/orchestrator/steps/parallel.py
- graphrag_sdk/steps/extract_data_step.py
- graphrag_sdk/helpers.py
- graphrag_sdk/agents/kg_agent.py
🧰 Additional context used
🪛 yamllint (1.35.1)
.github/workflows/test.yml
[error] 44-44: trailing spaces
(trailing-spaces)
🪛 Ruff (0.8.2)
graphrag_sdk/document_loaders/pdf.py
21-23: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
graphrag_sdk/kg.py
92-92: No explicit stacklevel
keyword argument found
(B028)
122-122: No explicit stacklevel
keyword argument found
(B028)
graphrag_sdk/models/litellm.py
239-241: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
graphrag_sdk/steps/create_ontology_step.py
65-70: Use a single with
statement with multiple contexts instead of nested with
statements
(SIM117)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: test
🔇 Additional comments (11)
graphrag_sdk/orchestrator/steps/__init__.py (1)
20-20
: LGTM! Appropriate addition to public API.Making
PLAN_STEP_TYPE_MAP
public is correct as it's a key part of the step type system.graphrag_sdk/orchestrator/steps/user_input.py (1)
51-56
: LGTM! Well-implemented string representations.The implementation follows Python's best practices for
__str__
and__repr__
methods.graphrag_sdk/models/model.py (2)
94-96
: LGTM! Clean signature formatting with proper type annotations.The method signature is well-formatted with clear type annotations and appropriate default value using the new
OutputMethod
enum.
10-14
: Verify implementations and usage of OutputMethod enum.Since we've added a new parameter to the abstract
send_message
method, we should ensure all implementations and callers are updated accordingly.Let's verify the implementations and usage:
Also applies to: 94-96
✅ Verification successful
OutputMethod enum implementation verified successfully
All implementations of
GenerativeModelChatSession
correctly handle theoutput_method
parameter with proper default values, ensuring backward compatibility. The parameter is properly used to adjust generation configuration in each implementation.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Find all implementations of GenerativeModelChatSession echo "=== Checking implementations of GenerativeModelChatSession ===" ast-grep --pattern 'class $NAME($$$GenerativeModelChatSession$$$): $$$ def send_message($$$): $$$' # Find all calls to send_message to ensure they use the enum correctly echo "=== Checking calls to send_message ===" rg "\.send_message\(" -A 2Length of output: 5333
Script:
#!/bin/bash # Find the base class definition echo "=== Checking GenerativeModelChatSession base class ===" ast-grep --pattern 'class GenerativeModelChatSession: $$$' # Find implementations with a more flexible pattern echo "=== Checking implementations with adjusted pattern ===" ast-grep --pattern 'class $NAME($$$): $$$ async def send_message($$$)' # Find the specific file content echo "=== Checking specific file content ===" cat graphrag_sdk/models/model.pyLength of output: 3915
Script:
#!/bin/bash # Find implementations with corrected pattern echo "=== Checking implementations with corrected pattern ===" ast-grep --pattern 'class $NAME($$$): $$$ def send_message($$$): $$$' # Find all files that might contain implementations echo "=== Checking files that might contain implementations ===" fd "model.*\.py" --type f --exec cat {}Length of output: 6348
Script:
#!/bin/bash # Find files containing GenerativeModelChatSession echo "=== Finding files with GenerativeModelChatSession ===" rg -l "GenerativeModelChatSession" # Check content of potential implementation files echo "=== Checking implementation files ===" fd -e py -p "gemini|openai|ollama" --type f --exec cat {}Length of output: 40363
graphrag_sdk/models/azure_openai.py (1)
Line range hint
3-3
: Verify the existence ofAzureOpenAI
in the OpenAI libraryThe import statement
from openai import AzureOpenAI
assumes that theAzureOpenAI
class is available in theopenai
package. Please verify thatAzureOpenAI
is a valid class in the OpenAI library or ensure that it is defined elsewhere in the codebase to prevent import errors.Run the following script to check if
AzureOpenAI
is defined in the codebase:tests/test_auto_create_ontology.py (1)
2-2
: Good addition ofload_dotenv()
to load environment variablesCalling
load_dotenv()
ensures that environment variables from the.env
file are loaded, providing necessary configurations for the test. This improves the reliability of the test setup.graphrag_sdk/model_config.py (1)
Line range hint
50-68
: Addition of JSON serialization methods enhances configurabilityThe
from_json
andto_json
methods enable serialization and deserialization ofKnowledgeGraphModelConfig
instances. This enhances the flexibility of the configuration management and is implemented correctly.graphrag_sdk/steps/create_ontology_step.py (1)
95-98
: Consider adding error handling for the final taskThe final task submission should include error handling to ensure the ontology is properly fixed.
graphrag_sdk/models/litellm.py (1)
258-260
: LGTM! Good practice for JSON output.Setting temperature to 0 for JSON output ensures deterministic and well-structured responses.
graphrag_sdk/kg.py (2)
118-120
: LGTM! Clear error message formatting.The multi-line formatting improves readability while maintaining a clear and descriptive error message.
229-238
: LGTM! Well-formatted constructor call.The multi-line formatting improves readability and makes the parameter passing clear.
```python
config["temperature"] = 0
config["response_format"] = {"type": "json_object"}
```
Invalid parameter `response_format` in generation config

In the `_get_generation_config` method, setting `config["response_format"] = {"type": "json_object"}` may not be supported by the Azure OpenAI API. The `chat.completions.create` method does not recognize a `response_format` parameter, which could result in errors. Please verify the API documentation and remove or replace the invalid parameter.

Apply this diff to remove the unsupported parameter:

```diff
     config["temperature"] = 0
-    config["response_format"] = {"type": "json_object"}
```
```python
# Credentials
self.api_key = os.getenv("AZURE_OPENAI_API_KEY")
self.azure_endpoint = os.getenv("AZURE_ENDPOINT")
self.api_version = os.getenv("AZURE_API_VERSION")
```
Initialize the client in the constructor to prevent `NoneType` errors

The `self.client` attribute is not initialized in the `__init__` method, but methods like `ask` rely on it being initialized. This could lead to a `NoneType` error if `ask` is called before `with_system_instruction`. To prevent runtime errors, initialize `self.client` in the constructor.

Apply this diff to initialize `self.client` in the `__init__` method:

```diff
         if not self.api_key or not self.azure_endpoint or not self.api_version:
             raise ValueError(
                 "Missing credentials in the environment: AZURE_OPENAI_API_KEY, AZURE_ENDPOINT, or AZURE_API_VERSION."
             )
+        self._connect_to_model()
```
Committable suggestion skipped: line range outside the PR's diff.
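Besides eagerly calling `_connect_to_model()` in the constructor, a lazy-initialization guard achieves the same safety without connecting until first use. A hypothetical sketch; the class and method names are illustrative, not the SDK's actual API:

```python
class ChatModel:
    """Illustrative model wrapper that guarantees a client before use."""

    def __init__(self) -> None:
        self._client = None  # not connected yet

    def _connect(self) -> object:
        # Stand-in for constructing the real API client.
        self._client = object()
        return self._client

    @property
    def client(self) -> object:
        # Lazily connect on first access instead of failing with a
        # NoneType error when ask() runs before any explicit setup call.
        if self._client is None:
            self._connect()
        return self._client

    def ask(self, message: str) -> str:
        _ = self.client  # guard: the client is always initialized here
        return f"echo: {message}"
```

Either design removes the ordering dependency between `with_system_instruction` and `ask`; the lazy variant additionally avoids paying the connection cost for objects that are never queried.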
graphrag_sdk/document_loaders/pdf.py
Outdated
```python
    __import__("pypdf")
except ModuleNotFoundError:
    raise ModuleNotFoundError(
        "pypdf package not found, please install it with " "`pip install pypdf`"
    )
```
🛠️ Refactor suggestion
Improve exception chaining in ModuleNotFoundError handling
Use exception chaining to preserve the original error context. This helps with debugging by maintaining the full error traceback.
```diff
 try:
     __import__("pypdf")
-except ModuleNotFoundError:
+except ModuleNotFoundError as err:
     raise ModuleNotFoundError(
         "pypdf package not found, please install it with " "`pip install pypdf`"
-    )
+    ) from err
```
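The effect of `from err` is directly observable: the original exception is attached as `__cause__`, so tracebacks show both errors joined by "The above exception was the direct cause...". A minimal self-contained sketch using a JSON parse failure in place of the missing import:

```python
import json


def load_config(text: str) -> dict:
    """Parse a JSON config string, re-raising parse errors with context."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # `from err` records the decode error as the cause of the new error,
        # instead of the misleading "During handling of the above exception,
        # another exception occurred" default.
        raise ValueError("config is not valid JSON") from err
```

The same shape applies verbatim to the `pypdf` import guard above: the cause of the re-raised `ModuleNotFoundError` remains available for debugging.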
🧰 Tools
🪛 Ruff (0.8.2)
21-23: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
```python
raise ValueError(
    f"Error during completion request, please check the credentials - {e}"
)
```
🛠️ Refactor suggestion
Preserve exception context when re-raising.
When re-raising exceptions, it's important to maintain the original error context using exception chaining.
```diff
 except Exception as e:
-    raise ValueError(
-        f"Error during completion request, please check the credentials - {e}"
-    )
+    raise ValueError(
+        f"Error during completion request, please check the credentials - {e}"
+    ) from e
```
🧰 Tools
🪛 Ruff (0.8.2)
239-241: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
```diff
     completion(model=model, messages=messages, max_tokens=10)
     return True
-except:
+except:  # noqa: E722
     return False
```
🛠️ Refactor suggestion
Improve exception handling by preserving the error context.
The bare except clause silently swallows all exceptions. While the intention might be to catch all errors, it's better to preserve the original error context for debugging purposes.
```diff
 try:
     completion(model=model, messages=messages, max_tokens=10)
     return True
-except:  # noqa: E722
+except Exception as e:
+    logger.debug(f"Key validation failed: {e}")
     return False
```
Committable suggestion skipped: line range outside the PR's diff.
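A sketch of the suggested pattern in isolation: a validity probe that still returns a boolean, but records why the check failed instead of swallowing it. The `check` callable stands in for the real `completion` call:

```python
import logging

logger = logging.getLogger(__name__)


def is_valid(check) -> bool:
    """Run a probe callable; log and return False on any failure."""
    try:
        check()
        return True
    except Exception as e:
        # Keep the broad catch (a probe must never raise to its caller),
        # but preserve the failure reason in the debug log.
        logger.debug(f"Key validation failed: {e}")
        return False
```

Narrowing `except:` to `except Exception` also stops the probe from silently eating `KeyboardInterrupt` and `SystemExit`, which a bare except would catch.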
Summary by CodeRabbit

- Added JSON serialization (`from_json`/`to_json`) support to `KnowledgeGraphModelConfig`.