Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Running pants fmt with --changed-since changes result vs directly listing target #21871

Open
mattalbr opened this issue Jan 23, 2025 · 3 comments
Labels

Comments

@mattalbr
Copy link

Describe the bug

vscode ➜ /workspace (pipeline_next) $ pants lint datasci/pipeline/test/test_model_assets.py 
00:33:37.94 [INFO] Completed: Format with Black - black made no changes.
00:33:37.94 [INFO] Completed: Format with isort - isort made no changes.

✓ black succeeded.
✓ isort succeeded.
vscode ➜ /workspace (pipeline_next) $ pants fmt datasci/pipeline/test/test_model_assets.py 
00:33:48.51 [INFO] Completed: Format with Black - black made no changes.
00:33:48.51 [INFO] Completed: Format with isort - isort made no changes.

✓ black made no changes.
✓ isort made no changes.
vscode ➜ /workspace (pipeline_next) $ pants lint --changed-since=upstream/dev
00:34:19.84 [WARN] Completed: Format with isort - isort made changes.
  datasci/pipeline/delfina_data/model_assets/model_publishers.py
  datasci/pipeline/delfina_data/models.py
  datasci/pipeline/test/test_model_assets.py
00:34:19.84 [INFO] Completed: Format with Black - black made no changes.
00:34:19.84 [INFO] Completed: Format with Black - black made no changes.
00:34:19.84 [INFO] Completed: Format with isort - isort made no changes.
00:34:19.84 [INFO] Completed: Format with Black - black made no changes.
00:34:19.84 [INFO] Completed: Format with isort - isort made no changes.
00:34:19.84 [INFO] Completed: Format with Black - black made no changes.
00:34:19.84 [INFO] Completed: Format with Black - black made no changes.
00:34:19.84 [INFO] Completed: Format with isort - isort made no changes.
00:34:19.84 [INFO] Completed: Format with isort - isort made no changes.

✓ black succeeded.
✕ isort failed.

(One or more formatters failed. Run `pants fmt` to fix.)
vscode ➜ /workspace (pipeline_next) $ pants fmt --changed-since=upstream/dev
00:34:28.05 [INFO] Completed: Format with Black - black made no changes.
00:34:28.05 [INFO] Completed: Format with isort - isort made no changes.
00:34:28.05 [INFO] Completed: Format with Black - black made no changes.
00:34:28.05 [INFO] Completed: Format with isort - isort made no changes.
00:34:28.05 [INFO] Completed: Format with isort - isort made no changes.
00:34:28.05 [WARN] Completed: Format with isort - isort made changes.
  datasci/pipeline/delfina_data/model_assets/model_publishers.py
  datasci/pipeline/delfina_data/models.py
  datasci/pipeline/test/test_model_assets.py
00:34:28.05 [INFO] Completed: Format with Black - black made no changes.
00:34:28.05 [INFO] Completed: Format with Black - black made no changes.

✓ black made no changes.
+ isort made changes.

As you can see, linting the file and fmting the file both returned green, until running with --changed-since after which they made edits.

FWIW, I think the --changed-since output is correct, and I'm surprised the imports aren't getting reordered without it.

imports before:

from concurrent.futures import as_completed
from datetime import datetime
from typing import NamedTuple

import dagster as dg
import pandera.typing.polars as pt
import polars as pl
from backend import models as gaia_models
from backend import schemas as gaia_schemas
from google.cloud import pubsub_v1

from .. import decorators as dd
from ..models import make_dagster_type
from ..models import ModelResults
from ..models import PublishResults
from ..pipeline_checks.asset_checks import *

imports after:

from concurrent.futures import as_completed
from datetime import datetime
from typing import NamedTuple

import dagster as dg
import pandera.typing.polars as pt
import polars as pl
from google.cloud import pubsub_v1

from backend import models as gaia_models
from backend import schemas as gaia_schemas

from .. import decorators as dd
from ..models import make_dagster_type
from ..models import ModelResults
from ..models import PublishResults
from ..pipeline_checks.asset_checks import *

Pants version
2.23.1

OS
MacOS and Ubuntu

Additional info
[isort]
args = [
"--profile=black",
"-l",
"100",
"--force-single-line-imports",
"--force-alphabetical-sort-within-sections",
"--single-line-exclusions=typing",
]
interpreter_constraints = [
"CPython>=3.11,<4"
]
install_from_resolve = "python-default" # 5.13.2

@mattalbr mattalbr added the bug label Jan 23, 2025
@benjyw
Copy link
Contributor

benjyw commented Jan 23, 2025

I suspect this is the well-known and unfortunate issue where isort infers first- vs third-party modules by analyzing all the input path. So it is sensitive to which other files it knows about, regardless of which ones it's actually invoked on.

If so, the workaround is to explicitly set known_first_party in isort config

@mattalbr
Copy link
Author

My hunch was something around first vs third party, so what you're saying makes complete sense. Does pants batch the files passed to isort when running --changed-since, e.g. as a performance optimization? I feel like pants has the potential to pass the exact same input to isort regardless of how it's invoked, but I'm assuming there's some reasons why that doesn't work so well?

Also I'm sure this is much more effort than maybe makes sense, but given that Pants has source roots and can infer source locations, it would be cool if there were a way to leverage that when communicating with isort.

In any case, will use known_first_party for now! Thank you Benjy!

@benjyw
Copy link
Contributor

benjyw commented Jan 26, 2025

Did that work out? If so could you close this ticket?

In answer to your question - yes, Pants may batch the files, which can also cause problems, but even if it doesn't, the set of changed files (or more generally the set of files you choose to run isort on) may not contain all the information for deriving third- vs first-party package info. Pants could add more files to that set, but it would be quite confusing to a user to see logging related to import order in files they didn't ask us to run on.

I think the first/third party distinction could be inferred by Pants from source roots. The first-party packages could be set with --project on the isort cmd line, for example. There has been discussion of this, but it hasn't been picked up by anyone, and there may be unintended consequences.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants