Facing Multiple Issues While trying to run the code #3

Closed
Eshwar1502 opened this issue Feb 6, 2025 · 13 comments

Comments

@Eshwar1502

Hey @GRAPH-0, we have created a Docker container to run the code and have installed all the necessary requirements, plus some additional ones. We have tried multiple times to run it but hit a different issue every single time. Right now we are stuck at:

 root@f590fcb:/CDGS# CUDA_VISIBLE_DEVICES=0 python main.py --config configs/vp_qm9_cdgs.py --mode train --workdir exp/vpsde_qm9_cdgs
Traceback (most recent call last):
  File "main.py", line 3, in <module>
    import run_lib
  File "/CDGS/run_lib.py", line 9, in <module>
    from torch_geometric.loader import DataLoader, DenseDataLoader
ModuleNotFoundError: No module named 'torch_geometric.loader'

Please help me out with the error mentioned above. Also, PS: I am a newbie, so it would be great if you could provide a Docker image that has all the necessary requirements to run the code :) or a Colab file where I can run it.

Thanks :)

@GRAPH-0
Owner

GRAPH-0 commented Feb 8, 2025

Thanks for your interest. It looks like pyg is not installed or has the wrong version.
Sorry, we did not keep the original environment, so we cannot provide a Docker image.
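
A quick way to check the diagnosis above is to print which versions of the relevant packages are actually installed in the container. This is a minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_versions` is hypothetical, and the list of distribution names is an assumption about what this project needs, taken from the packages mentioned in this thread.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # ASSUMPTION: these are the distributions worth checking for this repo.
    names = ["torch", "torch-geometric", "torch-scatter", "torch-sparse"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

If `torch-geometric` is reported as not installed, or its version is far newer or older than the rest of the stack, that would be consistent with the import error in the traceback.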

@Eshwar1502
Author

Hey @GRAPH-0 ,

Thank you for your reply. We have installed all the dependencies based on the PyTorch version mentioned in the README file (PyTorch 1.11). All dependencies were installed using the following Docker image (Link), ensuring compatibility with PyTorch 1.11. Additionally, we downloaded the Dockerfile from the Moses GitHub repository.

Despite these efforts, we keep encountering multiple errors when trying to run it. Every time we fix one issue, new errors keep appearing.

Could you please help us resolve this?

@GRAPH-0
Owner

GRAPH-0 commented Feb 8, 2025

pyg refers to pytorch_geometric, not pytorch. Please check whether it is installed correctly.

@Eshwar1502
Author

Eshwar1502 commented Feb 9, 2025

Yes, we installed pytorch_geometric according to the documentation, using the following:

For PyTorch 1.11.* and CUDA 11.3, type:

pip install torch-scatter -f https://data.pyg.org/whl/torch-1.11.0+cu113.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-1.11.0+cu113.html
pip install torch-geometric

But even after installing the above requirements, the error below keeps popping up:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/conda/lib/python3.8/site-packages/torch_geometric/__init__.py", line 8, in <module>
    from .index import Index
  File "/opt/conda/lib/python3.8/site-packages/torch_geometric/index.py", line 461, in <module>
    @implements(aten.clone.default)
AttributeError: 'builtin_function_or_method' object has no attribute 'default'

Also, installing molsets always throws an error while building its dependency pomegranate (pomegranate==0.12.0):

 gcc: error: pomegranate/distributions/NeuralNetworkWrapper.c: No such file or directory
  error: command 'gcc' failed with exit status 1
  ----------------------------------------
  ERROR: Failed building wheel for pomegranate
  Running setup.py clean for pomegranate

Because of this I am unable to proceed further, so please help me out with this.

@GRAPH-0
Owner

GRAPH-0 commented Feb 9, 2025

PyG may have changed some interfaces in later versions, causing incompatibility. Downgrading to an older version can solve this problem (if you don't specify a version, pip installs the latest one). For details, please refer to pyg-team/pytorch_geometric#9683.

For pomegranate, just change the requirement in the source code to "pomegranate" (without the version pin); see molecularsets/moses#104.
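
The downgrade suggested above amounts to installing an exact version instead of letting pip pick the latest. A minimal sketch of building such a command follows. The helper name `pinned_install_command` is hypothetical, and the version `2.0.4` is only an illustrative assumption: the thread does not say which PyG release the repository was developed against, so check it against the README and the linked issue before using it.

```python
import sys


def pinned_install_command(package, version, find_links=None):
    """Build a `pip install` argument list for one exact package version."""
    cmd = [sys.executable, "-m", "pip", "install", f"{package}=={version}"]
    if find_links:
        # -f / --find-links points pip at an extra page of prebuilt wheels.
        cmd += ["-f", find_links]
    return cmd


if __name__ == "__main__":
    # ASSUMPTION: "2.0.4" is a placeholder version, not one confirmed here.
    cmd = pinned_install_command("torch-geometric", "2.0.4")
    print(" ".join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

The same helper can pin `torch-scatter` or `torch-sparse` by passing the wheel index URL quoted earlier in the thread as `find_links`.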

@Eshwar1502
Author

Thank you for your response, @GRAPH-0.

I have referred to the document, and installing the initial version of MolSets (molsets==0.1.0) seems to have worked. I am now able to train the model. However, during sampling, I encountered an issue with get_all_metrics. The error message is shown below:

  File "main.py", line 45, in main
    run_lib.evaluate(FLAGS.config, FLAGS.workdir, FLAGS.eval_folder)
  File "/workspace/CDGS-main/run_lib.py", line 360, in evaluate
    run_eval_dict[config.model_type](config, workdir, eval_folder)
  File "/workspace/CDGS-main/run_lib.py", line 327, in mol_sde_evaluate
    scores = get_all_metrics(gen=smile_list, k=len(smile_list), device=config.device, n_jobs=8,
TypeError: get_all_metrics() got an unexpected keyword argument 'train'

I have already checked the closed issues related to this topic but couldn't fully understand the resolution. Could you please help me resolve this issue?
Thank you for your patience.
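
One way to confirm this kind of mismatch is to inspect the signature of the function that is actually installed and see whether it accepts the keyword the caller passes. The sketch below uses the standard `inspect` module. The two `*_style_metrics` functions are hypothetical stand-ins, not the real MolSets code; their parameter lists are assumptions chosen only to show one function with a `train` keyword and one without.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with keyword argument `name`."""
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    for param in inspect.signature(func).parameters.values():
        if param.kind is inspect.Parameter.VAR_KEYWORD:
            return True  # **kwargs accepts any keyword
        if param.name == name and param.kind in keyword_kinds:
            return True
    return False


# Hypothetical stand-ins for an older and a newer version of a metrics function.
def old_style_metrics(gen, k=None, n_jobs=1, device="cpu", test=None):
    return {}


def new_style_metrics(gen, k=None, n_jobs=1, device="cpu", test=None, train=None):
    return {}


if __name__ == "__main__":
    print(accepts_keyword(old_style_metrics, "train"))  # False
    print(accepts_keyword(new_style_metrics, "train"))  # True
```

Passing the `get_all_metrics` function that run_lib.py imports to `accepts_keyword(..., "train")` would show whether the installed package version supports the call in the traceback.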

@GRAPH-0
Owner

GRAPH-0 commented Feb 10, 2025

GRAPH-0 closed this as completed Feb 12, 2025

@Eshwar1502
Author

@GRAPH-0 Thank you so much for your assistance and patience. I have a small query regarding where to place the pretrained checkpoints. There seem to be two folders: one named checkpoints and the other checkpoints-meta. Which one should I use for sampling?

@GRAPH-0
Owner

GRAPH-0 commented Feb 12, 2025

checkpoints-meta saves the latest checkpoint, which is used to resume training.
checkpoints saves checkpoints regularly based on snapshot_freq.
Either can be used for sampling, but a checkpoint from longer training will likely have learned the distribution better.
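
To see what each of the two folders currently holds, a small listing script can help. This is a sketch under assumptions: the helper name `list_checkpoint_files` is hypothetical, the workdir path is the one used in the training command earlier in this thread, and nothing is assumed about the checkpoint file names themselves.

```python
from pathlib import Path


def list_checkpoint_files(workdir):
    """Map each checkpoint folder name to the sorted file names inside it."""
    found = {}
    for folder in ("checkpoints", "checkpoints-meta"):
        path = Path(workdir) / folder
        if path.is_dir():
            found[folder] = sorted(p.name for p in path.iterdir() if p.is_file())
        else:
            found[folder] = []  # folder missing: nothing saved there yet
    return found


if __name__ == "__main__":
    for folder, files in list_checkpoint_files("exp/vpsde_qm9_cdgs").items():
        print(folder, files)
```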

@Eshwar1502
Author

Thank you for your response, @GRAPH-0. I tried sampling using the existing checkpoint from the provided Drive link. I saved the file in the checkpoints folder of vpsde_zinc_cdgs_256, which contains the following:

(checkpoints/ , checkpoints-meta/ , eval/ , samples/ , stdout.txt , tensorboard/)

I stopped the training after 500 rounds and checked both the samples and eval folders for the generated molecules, but they appeared to be empty. :/ Could you please help me out with this?

@GRAPH-0
Owner

GRAPH-0 commented Feb 13, 2025

If you use the provided checkpoint, you don't need to train.
If you train a new model, then with config.training.eval_freq = 5000 you will only obtain checkpoints after at least 5000 training iterations.
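
The arithmetic behind this can be sketched as follows, assuming the training loop saves whenever the step count is a multiple of the configured frequency. That condition is an assumption, since the actual check in run_lib.py is not shown in this thread, and `save_steps` is a hypothetical helper.

```python
def save_steps(num_train_steps, freq):
    """Steps (1-based) at which a periodic save would trigger."""
    return [step for step in range(1, num_train_steps + 1) if step % freq == 0]


if __name__ == "__main__":
    print(save_steps(500, 5000))    # [] : stopping at 500 steps saves nothing
    print(save_steps(12000, 5000))  # [5000, 10000]
```

Under that assumption, a run stopped after 500 iterations never reaches the first save point.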

@Eshwar1502
Author

Eshwar1502 commented Feb 13, 2025

Thank you for your response, @GRAPH-0. The training seems to be working fine. I was referring to the sampling after running the following command:
CUDA_VISIBLE_DEVICES=0 python main.py --config configs/vp_zinc_cdgs.py --mode eval --workdir exp/vpsde_zinc_cdgs_256 --config.eval.begin_ckpt 250 --config.eval.end_ckpt 250 --config.eval.batch_size 16 --config.model.num_scales 200

After 500 rounds of sampling, I couldn't find the generated molecules. I checked the samples folder but couldn't find them there. Am I looking in the wrong place?

@GRAPH-0
Owner

GRAPH-0 commented Feb 14, 2025

Turn this flag on (CDGS/run_lib.py, line 314 in 2d498aa):

if config.eval.save_graph:
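
The effect of such a flag can be illustrated with a small stand-in. This sketch is not the project's code: `SimpleNamespace` replaces the real config object, `maybe_save` is a hypothetical helper, and the SMILES strings are placeholder data. It only shows why nothing is written while `config.eval.save_graph` is false.

```python
from types import SimpleNamespace


def maybe_save(config, samples, sink):
    """Append samples to `sink` only when the save flag is enabled."""
    if config.eval.save_graph:
        sink.extend(samples)
    return sink


if __name__ == "__main__":
    # Stand-in config (hypothetical); the real one comes from the --config file.
    config = SimpleNamespace(eval=SimpleNamespace(save_graph=False))
    samples = ["CCO", "c1ccccc1"]  # placeholder SMILES strings

    print(maybe_save(config, samples, []))  # [] : flag off, nothing kept

    config.eval.save_graph = True
    print(maybe_save(config, samples, []))  # ['CCO', 'c1ccccc1']
```

In the real project the flag would presumably be set in the config file passed via `--config`; where exactly it is defined is not shown in this thread.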
