LatentGAN [1] uses a heteroencoder trained on ChEMBL 25 [2] that encodes SMILES strings into latent vector representations of size 512. A Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) [3] is then trained to generate latent vectors resembling those of the training set, and the generated vectors are decoded back into SMILES strings using the heteroencoder. This model uses the Deep-Drug-Coder heteroencoder implementation [4].
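As an illustration of the gradient penalty term from [3], here is a minimal, standard-library-only sketch of the WGAN-GP critic loss on 512-dimensional latent vectors. It is not the code used in this repository: the critic is a toy linear function, gradients are taken by finite differences instead of automatic differentiation, and all names (`critic`, `numerical_gradient`, `gradient_penalty`, `critic_loss`) are hypothetical. The penalty weight of 10 is the default suggested in [3].

```python
import math
import random

LATENT_DIM = 512  # latent vector size produced by the heteroencoder


def critic(x, w, b=0.0):
    """Toy linear critic f(x) = w.x + b (assumption: stands in for a neural network)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def numerical_gradient(f, x, h=1e-4):
    """Central-difference gradient of f at x (a real model would use autograd)."""
    x = list(x)  # work on a copy so the caller's vector is not modified
    grad = []
    for i in range(len(x)):
        original = x[i]
        x[i] = original + h
        up = f(x)
        x[i] = original - h
        down = f(x)
        x[i] = original
        grad.append((up - down) / (2.0 * h))
    return grad


def gradient_penalty(f, real, fake, eps, lam=10.0):
    """lam * (||grad f(x_hat)||_2 - 1)^2 at x_hat = eps*real + (1-eps)*fake."""
    x_hat = [eps * r + (1.0 - eps) * g for r, g in zip(real, fake)]
    grad = numerical_gradient(f, x_hat)
    grad_norm = math.sqrt(sum(g * g for g in grad))
    return lam * (grad_norm - 1.0) ** 2


def critic_loss(f, real, fake, eps, lam=10.0):
    """Single-sample WGAN-GP critic loss: f(fake) - f(real) + gradient penalty."""
    return f(fake) - f(real) + gradient_penalty(f, real, fake, eps, lam)


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder vectors; in LatentGAN "real" would be an encoded training
    # molecule and "fake" a generator output.
    real = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
    fake = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
    w = [rng.gauss(0.0, 0.05) for _ in range(LATENT_DIM)]
    print(critic_loss(lambda x: critic(x, w), real, fake, eps=rng.random()))
```

In practice the loss is averaged over a batch, with a fresh interpolation coefficient `eps` drawn uniformly from [0, 1] for each sample.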
Currently, Deep-Drug-Coder [4] and its dependency package molvecgen [5] are not available on PyPI; they have to be installed from their respective repositories (links provided below).
The pretrained LatentGAN models are currently not shared in this repository due to file size constraints. They will be added in the near future.
[1] A De Novo Molecular Generation Method Using Latent Vector Based Generative Adversarial Network
[2] ChEMBL
[3] Improved training of Wasserstein GANs
[4] Deep-Drug-Coder
[5] molvecgen