terminator: support multi-byte termination bytes #158
Comments
Assign this to me.
I suggest that, instead of supporting multi-byte sequences as a terminator, we generalize by supporting a rule as a terminator, so that a constant multi-byte sequence would be a particular case.
I have been working on handling multi-byte terminators. Let's suppose that the Scala changes that switch the type of terminator from int to Array[Byte] are already made. (I just replaced the int type with Array[Byte] and made some minor changes, but I would like to write a separate issue about that.) One good thing to know is the KMP algorithm, which finds matches in O(N + M), where N is the length of the pattern and M is the length of the text. Because of this, the complexity of 'read_bytes_term()' doesn't change: it stays linear in the number of bytes read. Here is the python-runtime commit with the changes: python-runtime
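To make the KMP idea concrete, here is a minimal Python sketch of a terminator search that consumes the stream one byte at a time. It is not the code from the commit mentioned above; the names `build_failure_table` and `read_bytes_term_multi`, and the `include_term` / `consume_term` parameters, are assumptions chosen for illustration, and `consume_term=False` assumes a seekable stream.

```python
import io


def build_failure_table(pattern: bytes) -> list:
    """failure[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it (the standard KMP prefix function)."""
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def read_bytes_term_multi(stream, term: bytes,
                          include_term: bool = False,
                          consume_term: bool = True) -> bytes:
    """Hypothetical helper: read until the multi-byte `term` is found.

    Each input byte is read exactly once, so the cost is O(N + M)
    for a terminator of length N and M bytes read.
    """
    if not term:
        raise ValueError("terminator must not be empty")
    failure = build_failure_table(term)
    buf = bytearray()
    matched = 0  # number of terminator bytes currently matched
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("end of stream reached, terminator not found")
        buf += chunk
        byte = chunk[0]
        # On a mismatch, fall back to the longest prefix that still matches.
        while matched > 0 and byte != term[matched]:
            matched = failure[matched - 1]
        if byte == term[matched]:
            matched += 1
        if matched == len(term):
            if not consume_term:
                # Leave the terminator unread (needs a seekable stream).
                stream.seek(-len(term), io.SEEK_CUR)
            return bytes(buf) if include_term else bytes(buf[:-len(term)])
```

For example, `read_bytes_term_multi(io.BytesIO(b"header\r\n\r\nbody"), b"\r\n\r\n")` returns `b"header"` and leaves the stream positioned at `body`.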
Any progress on this?
In JPEG Interchange Format (including JFIF and SPIFF), the scan segment includes compressed data whose length is not known until the compressed data has been fully read from the file. It is possible, however, to look for a 0xFF byte in the compressed data, which is followed by 0x00 if it is to be ignored (escaped), or by another byte (which can take multiple values) to denote the next segment of the file.

Ideally there would be a construct similar to:
Wildcard bytes, regular expressions, number ranges and other helpers could also be of assistance in defining terminators in other file formats.
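As an illustration of the kind of rule-based terminator the JPEG case needs, here is a small Python sketch that finds where the compressed scan data ends. It is not a proposed Kaitai Struct syntax; the function name `find_next_marker` and its `skip` parameter are hypothetical. The default only treats 0xFF 0x00 as escaped, as described above; a real JPEG parser would likely also want to skip restart markers (0xFF 0xD0 to 0xFF 0xD7), which can be passed in through `skip`.

```python
def find_next_marker(data: bytes, start: int = 0,
                     skip: frozenset = frozenset({0x00})) -> int:
    """Hypothetical helper: return the index of the 0xFF byte that begins
    the next marker at or after `start`, or -1 if there is none.

    A 0xFF followed by a byte in `skip` (by default only 0x00, the
    escape byte) is treated as part of the compressed data.
    """
    i = data.find(b"\xff", start)
    while i != -1 and i + 1 < len(data):
        if data[i + 1] not in skip:
            return i  # 0xFF followed by a real marker code
        # Escaped or skipped pair: continue after both bytes.
        i = data.find(b"\xff", i + 2)
    return -1


# Compressed data with one escaped 0xFF, followed by an EOI marker (0xFF 0xD9).
sample = bytes([0x12, 0xFF, 0x00, 0x34, 0xFF, 0xD9])
end = find_next_marker(sample)
scan_data = sample[:end]  # everything before the terminating marker
```

The terminator here is a two-byte pattern whose second byte is a wildcard with exclusions, which a single constant byte sequence cannot express.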