Packages with an underscore in package name are never detected as installed. #23

Open
Peterdoo opened this issue Oct 30, 2024 · 4 comments

Comments

@Peterdoo

Pip replaces underscores with dashes. So when a package is named, for example, my_package, pip shows it as my-package in the output of the list command. IsInstalled then does not find it and returns false.

I would recommend making a small change in PyPackage.Manager.Pip.pas so that the name passed to pip uses a dash instead of an underscore:

constructor TPyPackageManagerPip.Create(const APackageName: TPyPackageName);
begin
  inherited;
  FDefs := TPyPackageManagerDefsPip.Create(APackageName.Replace('_', '-'));
  FCmd := TPyPackageManagerCmdPip.Create(FDefs);
end;

@lmbelo
Member

lmbelo commented Oct 30, 2024

@Peterdoo Why is that necessary? You only need to provide the correct package name.

@Peterdoo
Author

Peterdoo commented Oct 31, 2024

Here is an example for the package charset-normalizer:

The import does not support dashes in names:

>>> import charset-normalizer
  File "<stdin>", line 1
    import charset-normalizer
                      ^
SyntaxError: invalid syntax

After replacing the dash with an underscore, the import works and no error is reported:

>>> import charset_normalizer

So the only way to import and use the package charset-normalizer in P4D is to set the package name to charset_normalizer. When the name with the dash is used as the package name, it will not be imported.

The folder where it is installed also contains an underscore: Lib\site-packages\charset_normalizer. So this seems to be the correct name for the package.

However, pip converts all underscores to dashes, even though the folder where the package is installed is always created with the underscore:

python -m pip install charset_normalizer
Requirement already satisfied: charset_normalizer in c:\python\lib\site-packages (3.4.0)

The same happens when using a dash in the package name:

python -m pip install charset-normalizer
Requirement already satisfied: charset-normalizer in c:\python\lib\site-packages (3.4.0)

When listing packages, pip displays a dash instead of an underscore:

python -m pip list
Package                 Version
----------------------- ------------------
aiohappyeyeballs        2.4.3
aiohttp                 3.10.10
aiosignal               1.3.1
alembic                 1.13.3
antlr4-python3-runtime  4.9.3
asteroid-filterbanks    0.4.0
attrs                   24.2.0
av                      12.3.0
catalogue               2.0.10
charset-normalizer      3.4.0
typing_extensions       4.12.2

TPyPackageManagerPip.IsInstalled() uses the following condition:
Result := LStdOut.Contains(FDefs.PackageName);

FDefs.PackageName is charset_normalizer, because that is the only way the package can be used and imported.
However, the output of pip list (LStdOut) contains charset-normalizer, so IsInstalled always returns false.
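To illustrate the mismatch outside of Delphi, here is a minimal Python sketch. It assumes the name normalization described in PEP 503 (lowercase, with runs of "-", "_" and "." collapsed into a single "-"), which is also why pip install accepts both spellings above. The helper names normalize, listed_names and is_installed are hypothetical and are not part of P4D or pip. The sketch compares whole, normalized names from the first column of pip list instead of doing a substring search on the raw output.

```python
import re


def normalize(name: str) -> str:
    """Normalize a distribution name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def listed_names(pip_list_output: str) -> set:
    """Collect the normalized names from the text printed by `pip list`."""
    names = set()
    for line in pip_list_output.splitlines():
        parts = line.split()
        # Skip blank lines, the "Package Version" header and the dashed rule.
        if len(parts) < 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        names.add(normalize(parts[0]))
    return names


def is_installed(name: str, pip_list_output: str) -> bool:
    """Hypothetical check: exact match on normalized names."""
    return normalize(name) in listed_names(pip_list_output)


# Two rows taken from the listing above.
SAMPLE = """\
Package                 Version
----------------------- ------------------
charset-normalizer      3.4.0
typing_extensions       4.12.2
"""

# A plain substring test on the raw text misses the underscore spelling,
# while the normalized comparison accepts either spelling.
substring_hit = "charset_normalizer" in SAMPLE
normalized_hit = is_installed("charset_normalizer", SAMPLE)
```

Comparing whole names also avoids a second pitfall of a substring test: a short name would otherwise match any longer package name that happens to contain it.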

As an additional problem, I have just noticed that pip returns an underscore for some packages, for example typing_extensions. Maybe the correct way would be to do the replacement in the procedure TPyModuleBase.ImportModule instead of in the constructor TPyPackageManagerPip.Create.

  LImport := PyModuleName.Replace('-', '_');
  LPyParent := PyParent;
  while Assigned(LPyParent) do begin
    LImport := LPyParent.PyModuleName.Replace('-', '_') + '.' + LImport;
    LPyParent := LPyParent.PyParent;
  end;
  inherited ImportModule(LImport);

Then we would have:

For charset-normalizer:
PyModuleName: charset-normalizer
Import package name: charset_normalizer
Pip package name: charset-normalizer

For typing_extensions:
PyModuleName: typing_extensions
Import package name: typing_extensions
Pip package name: typing_extensions
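The mapping above can be sketched in Python as follows. This only mirrors the dash-to-underscore replacement from the proposed ImportModule change; the function name import_path is hypothetical and not part of P4D.

```python
def import_path(*module_names: str) -> str:
    """Join module names, ordered from outermost parent to child, into a
    dotted import path, replacing dashes with underscores in each part."""
    return ".".join(part.replace("-", "_") for part in module_names)


# The two cases listed above.
charset_import = import_path("charset-normalizer")
typing_import = import_path("typing_extensions")
```

Note that this replacement is only a heuristic. For some distributions the import name cannot be derived from the pip name at all (Pillow is imported as PIL, scikit-learn as sklearn), so an explicit, separate setting for each name remains necessary in the general case.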

@lmbelo
Member

lmbelo commented Oct 31, 2024

But that's why we have separate properties in the components. The package name is distinct from the names used in PIP or Conda packages. See an example here:

https://github.com/Embarcadero/P4D-Data-Sciences/blob/551db5923fad9a9e8384169233dd64ddf9f87d12/src/PyTorch/PyTorch.pas#L38

procedure TPyTorch.Prepare(const AModel: TPyPackageModel);
begin
  inherited;
  with AModel do begin
    PackageName := 'torch';
    PackageManagers.Add(
      TPyPackageManagerKind.pip,
      TPyPackageManagerPip.Create('torch'));
      //torch is available as pytorch under conda
//    PackageManagers.Add(
//      TPyPackageManagerType.conda,
//      TPyPackageManagerConda.Create('pytorch'));
  end;
end;

You can set charset_normalizer as the package name and charset-normalizer in the PIP package manager.

@Peterdoo
Author

Thank you. That indeed solves this case. However, I have two more examples where it does not work:

Package name: pyannote.audio
Name for import: pyannote.audio
Installation command: python -m pip install pyannote.audio --extra-index-url https://download.pytorch.org/whl/cu124
Pip list output: pyannote.audio

Package name: en_core_web_sm
Name for import: en_core_web_sm
Installation command: python -m pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.8.0/en_core_web_sm-3.8.0-py3-none-any.whl
Pip list output: en_core_web_sm

Is there a way I can give pip different names for installation and for checking in IsInstalled?
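To make the request concrete, here is a minimal Python sketch of what such a separation could look like. It is not the P4D API: the class PipPackage and its fields install_target, check_name and extra_args are hypothetical names chosen for illustration. The install target (a name or a wheel URL, plus extra arguments) is kept apart from the name that is looked up in the output of pip list.

```python
import re
import sys
from dataclasses import dataclass, field
from typing import List


def _normalize(name: str) -> str:
    """Lowercase and collapse runs of '-', '_' and '.' into '-' (PEP 503)."""
    return re.sub(r"[-_.]+", "-", name).lower()


@dataclass
class PipPackage:
    """Hypothetical container that separates installing from checking."""
    install_target: str   # name or wheel URL passed to `pip install`
    check_name: str       # name expected in the first column of `pip list`
    extra_args: List[str] = field(default_factory=list)

    def install_command(self) -> List[str]:
        """Build the argument list for `python -m pip install ...`."""
        return [sys.executable, "-m", "pip", "install",
                self.install_target, *self.extra_args]

    def is_installed(self, pip_list_output: str) -> bool:
        """Exact match of the normalized check name against `pip list` rows."""
        wanted = _normalize(self.check_name)
        for line in pip_list_output.splitlines():
            parts = line.split()
            if parts and _normalize(parts[0]) == wanted:
                return True
        return False


# The two examples from this comment.
pyannote = PipPackage(
    install_target="pyannote.audio",
    check_name="pyannote.audio",
    extra_args=["--extra-index-url", "https://download.pytorch.org/whl/cu124"],
)
spacy_model = PipPackage(
    install_target=(
        "https://github.com/explosion/spacy-models/releases/download/"
        "en_core_web_sm-3.8.0/en_core_web_sm-3.8.0-py3-none-any.whl"
    ),
    check_name="en_core_web_sm",
)
```

With this split, the wheel URL is used only for installation, while the check compares en_core_web_sm against the listed names.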
