Installation issue and setup failed #6

Closed
wenchangzhou-qtx opened this issue Sep 13, 2024 · 7 comments

wenchangzhou-qtx commented Sep 13, 2024

Hey @fjclark,

Thanks for such a wonderful automation package! I did a quick run and have hit two issues so far.

First, after installation I can only import the module from inside the installation directory, not from anywhere else. I'm not sure whether this is caused by how I installed it.

Second, after copying over the input files from a3fe/a3fe/data/example_run_dir/input and running up to calc.setup(), I got the error RuntimeError: Could not find slurm output file name in /home/softwares/a3fe/input/solvate_bound.sh. Am I missing anything there?

wenchangzhou-qtx changed the title from "Installation failed" to "Installation issue and setup failed" on Sep 13, 2024

fjclark (Collaborator) commented Sep 15, 2024

Hey @wenchangzhou-qtx,

No problem - hope it's useful!

Installation Issue: The steps to install (once you have GROMACS and SLURM) are:

git clone https://github.com/michellab/a3fe.git
cd a3fe
mamba env create -f environment.yaml
python -m pip install --no-deps .

It sounds like you've missed the last step - python -m pip install --no-deps ., which installs the package into your environment. To check this, run conda list | grep a3fe to make sure that it's installed. For example, I see:

a3fe                      0.1.1+2.g80436ae.dirty          pypi_0    pypi

If you don't see anything, run python -m pip install --no-deps . in the a3fe base directory.

RuntimeError: This sounds like an issue with run_somd.sh. The options in this script are used as a template to create all of your other SLURM scripts, such as solvate_bound.sh. The message "Could not find slurm output file name" indicates that you need to specify the SLURM output file name with #SBATCH -o in run_somd.sh. To fix this, add e.g. #SBATCH -o somd-array-gpu-%A.%a.out to run_somd.sh, and make sure that all of the other SLURM options at the top of the script are correct for your cluster.
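To illustrate the template idea, here is a minimal Python sketch. It is not a3fe's actual implementation: it simply copies the shebang and #SBATCH lines from a template script and attaches them to a different command. The helper names sbatch_header and build_script, and the command srun hypothetical_solvate_command, are made up for this example.

```python
def sbatch_header(template_text: str) -> list[str]:
    """Return the shebang and #SBATCH lines found in a SLURM script."""
    header = []
    for line in template_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#!") or stripped.startswith("#SBATCH"):
            header.append(stripped)
    return header


def build_script(template_text: str, command: str) -> str:
    """Combine the template's header with a new command (hypothetical helper)."""
    return "\n".join(sbatch_header(template_text) + ["", command]) + "\n"


# A cut-down template in the style of run_somd.sh.
template = """#!/bin/bash
#SBATCH --gres=gpu:1
#SBATCH -o somd-array-gpu-%A.%a.out

srun somd-freenrg -C somd.cfg -l $1 -p CUDA
"""

# The command below is a placeholder, not a real a3fe or SLURM command.
script = build_script(template, "srun hypothetical_solvate_command")
print(script)
```

Under this kind of scheme, any option missing from the template is also missing from every script derived from it, which is why the error is reported for solvate_bound.sh even though the fix belongs in run_somd.sh.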

Hope that helps!

wenchangzhou-qtx (Author) commented

Hey @fjclark,

Thanks for the guide and suggestions!
Regarding the installation issue, I made sure I ran the last step, and I get the same output as you when I run conda list | grep a3fe, but I still cannot import a3fe outside the installation base directory. What else should I check?

When I import a3fe inside the installation directory, I can run the submission steps up to a3.setup() but still get the same error. Here is my run_somd.sh for you to take a look at:

#!/bin/bash

#SBATCH --account=qt
#SBATCH --nodes=1
#SBATCH --time=24:00:00
#SBATCH --gres=gpu:1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=2
#SBATCH --exclude=node018
#SBATCH --mem=50G
#SBATCH --output=somd-array-gpu-%A.%a.out

lam=$1
echo "lambda is: " $lam

srun somd-freenrg -C somd.cfg -l $lam -p CUDA

fjclark added a commit that referenced this issue Sep 16, 2024
This partially addresses #6.
Now both = and spaces are correctly handled when reading the slurm
script to extract the output file name.

fjclark (Collaborator) commented Sep 16, 2024

Hey,

Thanks for the additional details. The issue with the SLURM file was a bug - apologies for this. I forgot to handle the case where "=" rather than a space is used (e.g. #SBATCH -o somd-array-gpu-%A.%a.out works but #SBATCH --output=somd-array-gpu-%A.%a.out doesn't). I've now fixed this and added corresponding tests (see cb0d532). If you pull the latest changes then your input should work - alternatively you can reformat your run_somd.sh to use spaces. Thanks for raising this and sorry for the sloppy error!
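For readers curious what handling both forms involves, below is a minimal sketch of an output-name parser. It is an illustration only and is not the code from the commit referenced in this thread; the function name find_slurm_output and the regular expression are assumptions.

```python
import re
from typing import Optional

# Accepts "#SBATCH -o NAME", "#SBATCH --output NAME" and "#SBATCH --output=NAME".
_OUTPUT_RE = re.compile(r"^\s*#SBATCH\s+(?:-o|--output)(?:\s*=\s*|\s+)(\S+)")


def find_slurm_output(script_text: str) -> Optional[str]:
    """Return the output file name from a SLURM script, or None if absent."""
    for line in script_text.splitlines():
        match = _OUTPUT_RE.match(line)
        if match:
            return match.group(1)
    return None


space_form = "#!/bin/bash\n#SBATCH -o somd-array-gpu-%A.%a.out\n"
equals_form = "#!/bin/bash\n#SBATCH --mem=50G\n#SBATCH --output=somd-array-gpu-%A.%a.out\n"

print(find_slurm_output(space_form))
print(find_slurm_output(equals_form))
```

Because the sketch scans every line, the position of the output directive among the other #SBATCH lines does not matter.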

For the installation, can you check:

  • That you are using the version of Python installed in your environment e.g. which python shows .../anaconda3/envs/mamba/envs/a3fe_test/bin/python for me.
  • That A3FE is correctly linked in the environment. Type pip show a3fe. For me, I see Location: .../anaconda3/envs/mamba/envs/a3fe_test/lib/python3.12/site-packages
  • That the install location is in the Python path. Start an ipython/python session and type
import sys
sys.path

I see .../anaconda3/envs/mamba/envs/a3fe_test/lib/python3.12/site-packages in the resultant list, matching the location given by pip show a3fe.
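The three checks above can also be run from inside Python in one go. This is a rough sketch using only the standard library; the helper name diagnose is hypothetical and not part of a3fe.

```python
import importlib.util
import sys


def diagnose(package: str) -> dict:
    """Report which interpreter is running and where `package` would load from."""
    spec = importlib.util.find_spec(package)
    return {
        "python": sys.executable,       # compare with `which python`
        "environment": sys.prefix,      # root of the active environment
        "importable": spec is not None,
        "origin": spec.origin if spec is not None else None,
    }


for key, value in diagnose("a3fe").items():
    print(f"{key}: {value}")
```

If origin points into the cloned source directory rather than the environment's site-packages, that suggests the import only succeeds because of the current working directory, and the package may not be installed in the environment that is actually active.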

Hopefully this gives us a clue as to what's wrong! If it's still not working, please post the output of the above commands.

Thanks!

wenchangzhou-qtx (Author) commented

Thanks @fjclark !

Regarding the installation, I found the cause, and it was a silly mistake on my side: I had not activated the a3fe environment before running the last step. Now it's working.

Regarding the test run, I figured the SLURM output file option has to be on the first line of run_somd.sh, with everything else placed after it. I will continue from here :)

fjclark (Collaborator) commented Sep 16, 2024

No problem!

Ah, glad you found the issue. My fault - I should update the installation notes to be explicit about this.

The SLURM output file option shouldn't have to be the first line in run_somd.sh, but before I fixed the bug it didn't work with an "=", as in e.g. #SBATCH --output=somd-array-gpu-%A.%a.out. The updated version should now work with either "=" or a space (and I now have a test which uses the run_somd.sh file you provided to check that the output file name can be read).

Great, good luck and let me know if there are any more issues!

fjclark (Collaborator) commented Sep 20, 2024

Hey @wenchangzhou-qtx,

Did it work for you? If so, feel free to close the issue. Thanks!

wenchangzhou-qtx (Author) commented

Yes, it's working for me now, thanks!
