As @brindakv mentioned, python-ihm currently does various sanity checks to ensure the generated mmCIF is self-consistent, e.g. checking the model sequence against that in the `struct_ref` table. However, we could do additional sanity checks (perhaps as part of the `make-mmcif.py` script, or another script used as part of the deposition pipeline) that validate external resources. (I would be reluctant to have these done as part of generating every file, since they would make multiple network connections, referenced files might not exist at modeling time, and many issues might be warnings rather than errors or would need manual intervention.) For example, we could:
- Query UniProt and check to make sure that the `struct_ref` sequence matches (complication: may need to check multiple versions of the UniProt sequence since it does change).
- Ping any DOI referenced in the file to make sure it exists.
- Download any referenced external archive files and make sure that any files referenced inside those archives exist.
- Look up any accessions (e.g. SASBDB, EMDB) to make sure that a) they exist, b) they have been released and c) they match the model (e.g. by checking model fit or checking that both model and data reference the same UniProt sequence).
- Look up any PMIDs and make sure the citation matches.
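
To make the first two ideas concrete, here is a rough sketch of what such checks might look like as a standalone script. It is not part of python-ihm; the function names (`check_struct_ref`, `doi_exists`, `parse_fasta_sequence`) are made up for illustration. It assumes the UniProt REST endpoint `https://rest.uniprot.org/uniprotkb/<accession>.fasta` and the doi.org handle lookup `https://doi.org/api/handles/<doi>`, and it only looks at the current UniProt sequence, so the "multiple versions" complication is not handled. Problems are returned as warning strings rather than raised, since many of them would need manual review.

```python
import urllib.error
import urllib.parse
import urllib.request


def parse_fasta_sequence(fasta):
    """Join the sequence lines of a single-record FASTA string."""
    return "".join(line.strip() for line in fasta.splitlines()
                   if line and not line.startswith(">"))


def fetch_uniprot_sequence(accession, timeout=10):
    """Fetch the current sequence for a UniProt accession (network call).

    The URL pattern is an assumption about the UniProt REST service.
    """
    url = f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_fasta_sequence(resp.read().decode("ascii"))


def check_struct_ref(accession, model_sequence, fetch=fetch_uniprot_sequence):
    """Compare a struct_ref sequence with UniProt; return a list of warnings.

    `fetch` is injectable so the check can be exercised without a network.
    """
    try:
        reference = fetch(accession)
    except OSError as err:  # URLError, HTTPError and timeouts are OSErrors
        return [f"{accession}: could not query UniProt ({err})"]
    if reference != model_sequence:
        return [f"{accession}: struct_ref sequence differs from the "
                "current UniProt sequence"]
    return []


def doi_exists(doi, urlopen=urllib.request.urlopen, timeout=10):
    """Return True if the DOI resolves, False on HTTP 404 (network call).

    Uses the doi.org handle lookup rather than following the redirect to
    the publisher, which may reject automated requests.
    """
    url = "https://doi.org/api/handles/" + urllib.parse.quote(doi)
    try:
        with urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise
```

A deposition script could call `check_struct_ref` once per reference sequence and `doi_exists` once per cited DOI, then print the collected warnings for a human to triage instead of failing the build.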