Allow custom overrides for package resolving and optional sys.path support #601

Merged
merged 9 commits into from
Feb 26, 2025
4 changes: 4 additions & 0 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
@@ -51,6 +51,10 @@ uv run pytest tests/unit -n auto
uv run pytest tests/integration/codemod/test_codemods.py -n auto
```

> [!TIP]
>
> - If on Linux the error `OSError: [Errno 24] Too many open files` appears then you might want to increase your _ulimit_
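
The limit mentioned in the tip can also be inspected and raised from Python. The following is a minimal sketch using the standard-library `resource` module (Unix only); the helper name `raise_open_file_limit` and the target value are illustrative assumptions, not part of this repository. The shell equivalent is `ulimit -n`.

```python
import resource


def raise_open_file_limit(target: int = 4096) -> tuple[int, int]:
    """Raise the soft open-file limit towards `target`, never above the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # The soft limit may not exceed the hard limit, so clamp the target.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Only raise the limit; never lower an already generous one.
    if soft != resource.RLIM_INFINITY and soft < target:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft_limit, hard_limit = raise_open_file_limit()
print(f"open files: soft={soft_limit}, hard={hard_limit}")
```

The change only applies to the current process and its children, so it would have to run before the test session starts.
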
## Pull Request Process

1. Fork the repository and create your branch from `develop`.
5 changes: 5 additions & 0 deletions docs/building-with-codegen/imports.mdx
@@ -69,6 +69,11 @@ print(f"From file: {import_stmt.from_file.filepath}")
print(f"To file: {import_stmt.to_file.filepath}")
```

<Note>
With Python one can specify the `PYTHONPATH` environment variable which is then considered when resolving
packages.
</Note>
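
Entries from `PYTHONPATH` end up in the interpreter's `sys.path`, which is the list the sys.path-based lookup in this PR consults when `py_resolve_syspath` is enabled. The sketch below only demonstrates that standard Python behavior; the helper name `child_sys_path` is an illustrative assumption and not part of the Codegen API.

```python
import json
import os
import subprocess
import sys
import tempfile


def child_sys_path(pythonpath: str) -> list:
    """Start a fresh interpreter with PYTHONPATH set and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=pythonpath)
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


extra_dir = tempfile.mkdtemp()  # stands in for a project source root
print(extra_dir in child_sys_path(extra_dir))
```
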

## Working with External Modules

You can determine if an import references an [ExternalModule](/api-reference/core/ExternalModule) by checking the type of [Import.imported_symbol](/api-reference/core/Import#imported-symbol), like so:
5 changes: 5 additions & 0 deletions src/codegen/cli/mcp/resources/system_prompt.py
@@ -2858,6 +2858,11 @@ def validate_data(data: dict) -> bool:
print(f"To file: {import_stmt.to_file.filepath}")
```
<Note>
With Python one can specify the `PYTHONPATH` environment variable which is then considered when resolving
packages.
</Note>
## Working with External Modules
You can determine if an import references an [ExternalModule](/api-reference/core/ExternalModule) by checking the type of [Import.imported_symbol](/api-reference/core/Import#imported-symbol), like so:
2 changes: 2 additions & 0 deletions src/codegen/configs/models/codebase.py
@@ -5,7 +5,7 @@

class CodebaseConfig(BaseConfig):
    def __init__(self, prefix: str = "CODEBASE", *args, **kwargs) -> None:
        super().__init__(prefix=prefix, *args, **kwargs)

Check failure on line 8 in src/codegen/configs/models/codebase.py (GitHub Actions / mypy):
error: "__init__" of "BaseConfig" gets multiple values for keyword argument "prefix" [misc]

    debug: bool = False
    verify_graph: bool = False
@@ -17,6 +17,8 @@
    disable_graph: bool = False
    generics: bool = True
    import_resolution_overrides: dict[str, str] = Field(default_factory=lambda: {})
    py_resolve_overrides: list[str] = Field(default_factory=lambda: [])
    py_resolve_syspath: bool = False
    ts_dependency_manager: bool = False
    ts_language_engine: bool = False
    v8_ts_engine: bool = False
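
The two new fields combine into a single ordered list of search roots: explicit overrides come first, then `sys.path` if enabled. A minimal sketch of that ordering follows; `ResolveSettings` and `build_resolve_paths` are hypothetical stand-ins written for illustration, not names from the codebase.

```python
from __future__ import annotations

import sys
from dataclasses import dataclass, field


@dataclass
class ResolveSettings:
    """Hypothetical stand-in mirroring the two new CodebaseConfig fields."""

    py_resolve_overrides: list[str] = field(default_factory=list)
    py_resolve_syspath: bool = False


def build_resolve_paths(settings: ResolveSettings) -> list[str]:
    """Return search roots in priority order: overrides first, then sys.path."""
    if not settings.py_resolve_overrides and not settings.py_resolve_syspath:
        return []  # feature inactive: only the default resolution applies
    return settings.py_resolve_overrides + (sys.path if settings.py_resolve_syspath else [])


print(build_resolve_paths(ResolveSettings(py_resolve_overrides=["a/b"])))  # ['a/b']
```

Because the overrides are placed ahead of `sys.path`, a module reachable through both is resolved via the override.
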
30 changes: 30 additions & 0 deletions src/codegen/sdk/python/import_resolution.py
@@ -1,6 +1,7 @@
from __future__ import annotations

import os
import sys
from typing import TYPE_CHECKING

from codegen.sdk.core.autocommit import reader
@@ -55,7 +56,7 @@

        resolved_symbol = self.resolved_symbol
        if resolved_symbol is not None and resolved_symbol.node_type == NodeType.FILE:
            return self.alias.source

Check failure on line 59 in src/codegen/sdk/python/import_resolution.py (GitHub Actions / mypy):
error: Item "None" of "Editable[Any] | None" has no attribute "source" [union-attr]

        return None

    @property
@@ -76,9 +77,9 @@
            return []

        if not self.is_module_import():
            return [self.imported_symbol]

Check failure on line 80 in src/codegen/sdk/python/import_resolution.py (GitHub Actions / mypy):
error: List item 0 has incompatible type "Symbol[Any, Any] | ExternalModule | PyFile | Import[Any]"; expected "Exportable[Any]" [list-item]

        return self.imported_symbol.symbols + self.imported_symbol.imports

Check failures on line 82 in src/codegen/sdk/python/import_resolution.py (GitHub Actions / mypy):
error: Item "Symbol[Any, Any]" of "Symbol[Any, Any] | ExternalModule | PyFile | Import[Any]" has no attribute "symbols" [union-attr]
error: Item "ExternalModule" of "Symbol[Any, Any] | ExternalModule | PyFile | Import[Any]" has no attribute "symbols" [union-attr]
error: Item "Import[Any]" of "Symbol[Any, Any] | ExternalModule | PyFile | Import[Any]" has no attribute "symbols" [union-attr]
error: Item "Symbol[Any, Any]" of "Symbol[Any, Any] | ExternalModule | PyFile | Import[Any]" has no attribute "imports" [union-attr]
error: Item "ExternalModule" of "Symbol[Any, Any] | ExternalModule | PyFile | Import[Any]" has no attribute "imports" [union-attr]
error: Item "Import[Any]" of "Symbol[Any, Any] | ExternalModule | PyFile | Import[Any]" has no attribute "imports" [union-attr]

    @noapidoc
    @reader
@@ -104,9 +105,16 @@
                base_path,
                module_source.replace(".", "/") + "/" + symbol_name + ".py",
            )
        if file := self.ctx.get_file(filepath):

Check failure on line 108 in src/codegen/sdk/python/import_resolution.py (GitHub Actions / mypy):
error: Argument 1 to "get_file" of "CodebaseContext" has incompatible type "str"; expected "PathLike[Any]" [arg-type]

            return ImportResolution(from_file=file, symbol=None, imports_file=True)

        # =====[ Check if we are importing an entire file with custom resolve path or sys.path enabled ]=====
        if len(self.ctx.config.py_resolve_overrides) > 0 or self.ctx.config.py_resolve_syspath:
            # Handle resolve overrides first if both are set
            resolve_paths: list[str] = self.ctx.config.py_resolve_overrides + (sys.path if self.ctx.config.py_resolve_syspath else [])
            if file := self._file_by_custom_resolve_paths(resolve_paths, filepath):
                return ImportResolution(from_file=file, symbol=None, imports_file=True)

        filepath = filepath.replace(".py", "/__init__.py")
        if file := self.ctx.get_file(filepath):
            # TODO - I think this is another edge case, due to `dao/__init__.py` etc.
@@ -120,6 +128,14 @@
            symbol = file.get_node_by_name(symbol_name)
            return ImportResolution(from_file=file, symbol=symbol)

        # =====[ Check if `module.py` file exists in the graph with custom resolve path or sys.path enabled ]=====
        if len(self.ctx.config.py_resolve_overrides) > 0 or self.ctx.config.py_resolve_syspath:
            # Handle resolve overrides first if both are set
            resolve_paths: list[str] = self.ctx.config.py_resolve_overrides + (sys.path if self.ctx.config.py_resolve_syspath else [])
            if file := self._file_by_custom_resolve_paths(resolve_paths, filepath):
                symbol = file.get_node_by_name(symbol_name)
                return ImportResolution(from_file=file, symbol=symbol)

        # =====[ Check if `module/__init__.py` file exists in the graph ]=====
        filepath = filepath.replace(".py", "/__init__.py")
        if from_file := self.ctx.get_file(filepath):
@@ -148,6 +164,20 @@
        # ext = ExternalModule.from_import(self)
        # return ImportResolution(symbol=ext)

    @noapidoc
    @reader
    def _file_by_custom_resolve_paths(self, resolve_paths: list[str], filepath: str) -> SourceFile | None:
        """Check if a certain file import can be found within a set sys.path

        Returns either None or the SourceFile.
        """
        for resolve_path in resolve_paths:
            filepath_new: str = os.path.join(resolve_path, filepath)
            if file := self.ctx.get_file(filepath_new):
                return file

        return None
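
The lookup above is a first-match search over the ordered resolve paths. It can be sketched without the SDK as follows, assuming a plain set of known file paths in place of `self.ctx.get_file`. The name `find_in_resolve_paths` is hypothetical, and `posixpath.join` is swapped in for `os.path.join` only to keep the separators platform-independent.

```python
from __future__ import annotations

import posixpath


def find_in_resolve_paths(known_files: set[str], resolve_paths: list[str], filepath: str) -> str | None:
    """Return the first `resolve_path/filepath` present in `known_files`, else None."""
    for resolve_path in resolve_paths:
        candidate = posixpath.join(resolve_path, filepath)
        if candidate in known_files:
            return candidate
    return None


# File layout borrowed from the tests in this PR.
known = {"a/c/src.py", "a/b/c/src.py", "consumer.py"}

print(find_in_resolve_paths(known, ["a"], "c/src.py"))         # a/c/src.py
print(find_in_resolve_paths(known, ["a/b", "a"], "c/src.py"))  # a/b/c/src.py
print(find_in_resolve_paths(known, ["x"], "c/src.py"))         # None
```

Because the first hit wins, the order of `resolve_paths` decides which of two same-named modules is picked.
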

    @noapidoc
    @reader
    def _relative_to_absolute_import(self, relative_import: str) -> str:
5 changes: 5 additions & 0 deletions src/codegen/sdk/system-prompt.txt
@@ -2879,6 +2879,11 @@ print(f"From file: {import_stmt.from_file.filepath}")
print(f"To file: {import_stmt.to_file.filepath}")
```

<Note>
With Python one can specify the `PYTHONPATH` environment variable which is then considered when resolving
packages.
</Note>

## Working with External Modules

You can determine if an import references an [ExternalModule](/api-reference/core/ExternalModule) by checking the type of [Import.imported_symbol](/api-reference/core/Import#imported-symbol), like so:
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
import sys
from typing import TYPE_CHECKING

from codegen.sdk.codebase.factory.get_session import get_codebase_session
@@ -191,7 +192,7 @@ def update():
            "consumer.py": """
from a.b.c import src as operations

-def func_1():
+def func():
    operations.update()
""",
        },
@@ -215,6 +216,180 @@ def func_1():
        assert call_site.file == consumer_file


def test_import_resolution_file_syspath_inactive(tmpdir: str, monkeypatch) -> None:
    """Tests function.usages returns usages from file imports"""
    # language=python
    with get_codebase_session(
        tmpdir,
        files={
            "a/b/c/src.py": """
def update():
    pass
""",
            "consumer.py": """
from b.c import src as operations

def func():
    operations.update()
""",
        },
    ) as codebase:
        src_file: SourceFile = codebase.get_file("a/b/c/src.py")
        consumer_file: SourceFile = codebase.get_file("consumer.py")

        # Disable resolution via sys.path
        codebase.ctx.config.py_resolve_syspath = False

        # =====[ Imports cannot be found without sys.path being set and not active ]=====
        assert len(consumer_file.imports) == 1
        src_import: Import = consumer_file.imports[0]
        src_import_resolution: ImportResolution = src_import.resolve_import()
        assert src_import_resolution is None

        # Modify sys.path for this test only
        monkeypatch.syspath_prepend("a")

        # =====[ Imports cannot be found with sys.path set but not active ]=====
        src_import_resolution = src_import.resolve_import()
        assert src_import_resolution is None


def test_import_resolution_file_syspath_active(tmpdir: str, monkeypatch) -> None:
    """Tests function.usages returns usages from file imports"""
    # language=python
    with get_codebase_session(
        tmpdir,
        files={
            "a/b/c/src.py": """
def update():
    pass
""",
            "consumer.py": """
from b.c import src as operations

def func():
    operations.update()
""",
        },
    ) as codebase:
        src_file: SourceFile = codebase.get_file("a/b/c/src.py")
        consumer_file: SourceFile = codebase.get_file("consumer.py")

        # Enable resolution via sys.path
        codebase.ctx.config.py_resolve_syspath = True

        # =====[ Imports cannot be found without sys.path being set ]=====
        assert len(consumer_file.imports) == 1
        src_import: Import = consumer_file.imports[0]
        src_import_resolution: ImportResolution = src_import.resolve_import()
        assert src_import_resolution is None

        # Modify sys.path for this test only
        monkeypatch.syspath_prepend("a")

        # =====[ Imports can be found with sys.path set and active ]=====
        codebase.ctx.config.py_resolve_syspath = True
        src_import_resolution = src_import.resolve_import()
        assert src_import_resolution
        assert src_import_resolution.from_file is src_file
        assert src_import_resolution.imports_file is True


def test_import_resolution_file_custom_resolve_path(tmpdir: str) -> None:
    """Tests function.usages returns usages from file imports"""
    # language=python
    with get_codebase_session(
        tmpdir,
        files={
            "a/b/c/src.py": """
def update():
    pass
""",
            "consumer.py": """
from b.c import src as operations
from c import src as operations2

def func():
    operations.update()
""",
        },
    ) as codebase:
        src_file: SourceFile = codebase.get_file("a/b/c/src.py")
        consumer_file: SourceFile = codebase.get_file("consumer.py")

        # =====[ Imports cannot be found without custom resolve path being set ]=====
        assert len(consumer_file.imports) == 2
        src_import: Import = consumer_file.imports[0]
        src_import_resolution: ImportResolution = src_import.resolve_import()
        assert src_import_resolution is None

        # =====[ Imports cannot be found with custom resolve path set to invalid path ]=====
        codebase.ctx.config.py_resolve_overrides = ["x"]
        src_import_resolution = src_import.resolve_import()
        assert src_import_resolution is None

        # =====[ Imports can be found with custom resolve path set ]=====
        codebase.ctx.config.py_resolve_overrides = ["a"]
        src_import_resolution = src_import.resolve_import()
        assert src_import_resolution
        assert src_import_resolution.from_file is src_file
        assert src_import_resolution.imports_file is True

        # =====[ Imports can be found with custom resolve multi-path set ]=====
        src_import = consumer_file.imports[1]
        codebase.ctx.config.py_resolve_overrides = ["a/b"]
        src_import_resolution = src_import.resolve_import()
        assert src_import_resolution
        assert src_import_resolution.from_file is src_file
        assert src_import_resolution.imports_file is True


def test_import_resolution_file_custom_resolve_and_syspath(tmpdir: str, monkeypatch) -> None:
    """Tests function.usages returns usages from file imports"""
    # language=python
    with get_codebase_session(
        tmpdir,
        files={
            "a/c/src.py": """
def update1():
    pass
""",
            "a/b/c/src.py": """
def update2():
    pass
""",
            "consumer.py": """
from c import src as operations

def func():
    operations.update2()
""",
        },
    ) as codebase:
        src_file: SourceFile = codebase.get_file("a/b/c/src.py")
        consumer_file: SourceFile = codebase.get_file("consumer.py")

        # Ensure we don't have overrides and enable syspath resolution
        codebase.ctx.config.py_resolve_overrides = []
        codebase.ctx.config.py_resolve_syspath = True

        # =====[ Import with sys.path set can be found ]=====
        assert len(consumer_file.imports) == 1
        # Modify sys.path for this test only
        monkeypatch.syspath_prepend("a")
        src_import: Import = consumer_file.imports[0]
        src_import_resolution = src_import.resolve_import()
        assert src_import_resolution
        assert src_import_resolution.from_file.file_path == "a/c/src.py"

        # =====[ Imports can be found with custom resolve over sys.path ]=====
        codebase.ctx.config.py_resolve_overrides = ["a/b"]
        src_import_resolution = src_import.resolve_import()
        assert src_import_resolution
        assert src_import_resolution.from_file is src_file
        assert src_import_resolution.imports_file is True

def test_import_resolution_circular(tmpdir: str) -> None:
    """Tests function.usages returns usages from file imports"""
    # language=python