Update split list #89

Open
wants to merge 6 commits into
base: dev-dwp-ihm

@j-s-135 j-s-135 commented Mar 5, 2025

Add hash and capitalization to the already great timestamp function.

@j-s-135 j-s-135 requested a review from piehld March 5, 2025 20:10
Collaborator

@piehld piehld left a comment

Thanks @j-s-135! I appreciate you foreseeing the potential merge conflicts with my PR and like how you've proposed reconciling them. Overall I think things look good (nice shout out to @trumbullm in your PR description, by the way 😉 ), but just had a few suggestions.

Also, since this is a pretty small set of changes, can you change the PR base to my branch (dev-dwp-ihm)? (You should just be able to click "Edit" at the top of your PR page and select my branch instead of master.) That way, we can just incorporate this along with my changes in a single version bump with my PR.

Comment on lines +99 to +100
parser.add_argument("--outputContentType", action="store_true", default=False, help="Whether output path in downstream application has prepended content type (pdb, csm) before file name")
parser.add_argument("--outputHash", action="store_true", default=False, help="Whether output path in downstream application has prepended hash before file name")

Can you change the argument names to snake case instead of camel case, just to be consistent with the other CLI argument names? (I know, it's annoying that these CLI ones are snake case while everything else is camel case, but it's too late to change everything now...)

Also, since these are flags, I'd suggest prefixing both arguments/variables with "prepend" to make it clearer that they are flags and not input parameters, e.g.:

--prepend_output_content_type (corresponding var: prependOutputContentType)
--prepend_output_hash         (corresponding var: prependOutputHash)
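
A minimal sketch of how the two renamed flags could be declared with argparse, assuming the names proposed above. The `build_parser` and `parse_flags` helpers are hypothetical and only illustrate the snake case CLI name to camel case variable mapping; they are not taken from the PR.

```python
import argparse


def build_parser():
    """Build an illustrative parser with the two flags named as suggested."""
    parser = argparse.ArgumentParser(description="Illustrative parser, not the project's actual CLI")
    parser.add_argument(
        "--prepend_output_content_type",
        action="store_true",
        default=False,
        help="Whether output path in downstream application has prepended content type (pdb, csm) before file name",
    )
    parser.add_argument(
        "--prepend_output_hash",
        action="store_true",
        default=False,
        help="Whether output path in downstream application has prepended hash before file name",
    )
    return parser


def parse_flags(argv):
    """Parse argv and map the snake case CLI names onto camel case variables."""
    args = build_parser().parse_args(argv)
    prependOutputContentType = args.prepend_output_content_type
    prependOutputHash = args.prepend_output_hash
    return prependOutputContentType, prependOutputHash
```

Because both arguments use `action="store_true"`, they take no value on the command line, which is the reason the "prepend" prefix reads more naturally than a noun-like name.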

return "csm"
return ""

def getTimeStampCheck(self, hD, targetFileDir, targetFileSuffix, databaseName, outputContentType=False, outputHash=False):

I know you didn't originally write this, but could you add a brief docstring describing what this does? From my understanding, it could say something like this (please correct me if I'm wrong):

Filters a list of structure identifiers to return for post-processing by comparing the
timestamp of the source file (based on holdings file information) with the timestamp
of the target file (based on the actual file properties). If the target file exists and
is newer than the source timestamp, then that identifier is removed from the returned
list for processing.
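
The behavior described in that proposed docstring can be sketched roughly as follows. This is a simplified stand-in, not the project's `getTimeStampCheck`: the function name, the `{identifier: datetime}` input, and the flat `<id><suffix>` file layout are all assumptions made for illustration.

```python
import datetime
import os


def filter_ids_needing_update(source_timestamps, target_dir, target_suffix):
    """Return the IDs whose target file is missing or not newer than the source.

    source_timestamps: dict mapping identifier -> timezone-aware datetime
        (hypothetical stand-in for the holdings file information).
    target_dir, target_suffix: where target files live and how they end
        (assumed flat layout: <target_dir>/<id><target_suffix>).
    """
    ids_to_process = []
    for entry_id, source_time in source_timestamps.items():
        target_path = os.path.join(target_dir, entry_id + target_suffix)
        if not os.path.exists(target_path):
            # No target file yet, so the entry still needs processing.
            ids_to_process.append(entry_id)
            continue
        target_time = datetime.datetime.fromtimestamp(
            os.path.getmtime(target_path), tz=datetime.timezone.utc
        )
        # Skip only when the target is strictly newer than the source.
        if target_time <= source_time:
            ids_to_process.append(entry_id)
    return ids_to_process
```

Identifiers with an up-to-date target file drop out of the returned list; everything else is kept for processing.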

# experimental models are stored with lower case while csms are stored with upper case (except content type)
pdbid = key.lower()
contentTypePrefix = self.getContentTypePrefix(databaseName)
hashPath = self.getPdbHash(pdbid)

Since this is specific for experimental PDBs, I'd add a conditional above here checking that, e.g. if databaseName == "pdbx_core", just like you do below for CSMs.
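
One way the suggested guard could look, as a hedged sketch. Only the `databaseName == "pdbx_core"` check comes from the comment above; the `build_entry_path` name, the pass-through fallback branch, and the hash rule (middle two characters of the ID) are assumptions made for illustration and may not match what `getPdbHash` actually does.

```python
import posixpath


def build_entry_path(entry_id, database_name, prepend_hash=False):
    """Return a relative path component for an entry (illustrative only).

    Lower-casing and the hash lookup are applied only for experimental
    structures (database_name == "pdbx_core"), as the review suggests.
    """
    if database_name == "pdbx_core":
        pdb_id = entry_id.lower()
        if prepend_hash:
            # ASSUMPTION: the hash directory is the middle two characters of the ID.
            hash_dir = pdb_id[1:3]
            return posixpath.join(hash_dir, pdb_id)
        return pdb_id
    # Other databases (e.g. CSMs) are passed through unchanged in this sketch.
    return entry_id
```

Keeping the experimental-only logic behind the conditional means identifiers from other databases never reach the lower-casing or hash step.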

@@ -26,6 +26,7 @@
# Add support for logging output to a specific file
# 25-Apr-2024 - dwp Add support for remote config file loading; use underscores instead of hyphens for arg choices
# 22-Jan-2025 - mjt Add Imgs format option flags
# 5-Mar-2025 - james smith output hash

Sorry, my internal linting is kicking in and I can't control it.

Suggested change
- # 5-Mar-2025 - james smith output hash
+ # 5-Mar-2025 - js Add support for prepending content type and directory hash for splitIdList output

@j-s-135 j-s-135 changed the base branch from master to dev-dwp-ihm March 5, 2025 21:55

2 participants