Allow returning after only first match of pattern #119
Comments
Sounds like a reasonable request and a good feature to have IMO. Putting some timings here for reference:
Could be controlled via a Note that the Any thoughts @ahl27 ? Finally, and FWIW, a word of caution about
The fact that it discards matches that overlap with a previous match is not mentioned in the documentation, which is unfortunate because this makes it unfit for searching most biological sequences. H. sessionInfo()
Seems like a good idea to me. I'm thinking it might just be better to add a I can add this to my todos for this cycle, should be fairly easy to implement.
Yep, thanks!
The *matchPattern* family of functions currently returns as many matches as possible. However, sometimes you just want to know whether there is any match between two sequences, so it is inefficient to continue searching after a match is found.
When searching a long list of sequences that are slightly longer than the pattern, vmatchPattern is about 2x slower than grepl(..., fixed=TRUE), despite grepl having to translate to characters.
I need to eke out as much performance as possible because of the volume I am working with. It would be good to have an option, similar to grepRaw, to return early as soon as a match is found.
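For illustration, here is a minimal base R sketch of the early-return behaviour I have in mind. It uses only grepl and grepRaw, not Biostrings, and the pattern and sequences are made-up example data rather than anything from my real workload.

```r
# Made-up example data (assumption: short DNA-like strings for illustration).
pattern  <- "ACGT"
subjects <- c("TTACGTACGTAA", "GGGGCCCC", "ACGTTTTT")

# grepl(fixed = TRUE) only answers "is there any match?" for each subject.
has_match <- grepl(pattern, subjects, fixed = TRUE)

# grepRaw() with all = FALSE (its default) reports at most the first match
# offset per subject; the result is integer(0) when nothing matches.
first_pos <- lapply(subjects, function(s) {
  grepRaw(charToRaw(pattern), charToRaw(s), fixed = TRUE, all = FALSE)
})

# With all = TRUE, grepRaw() reports every match offset it finds.
all_pos <- lapply(subjects, function(s) {
  grepRaw(charToRaw(pattern), charToRaw(s), fixed = TRUE, all = TRUE)
})

print(has_match)
print(first_pos)
print(all_pos)
```

An equivalent switch on the *matchPattern* functions would let callers who only need a yes/no answer, or just the first hit, skip the rest of the search.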