legacy writer produces invalid output for large input #156
Comments
@pierrec is there any chance you can look at this issue?
@anatol sorry for my slow follow-up on this issue.
Hi @pierrec, friendly ping on this issue. Is there any way the community could help with moving it forward?
Hi @anatol, I am sorry that I still haven't had time to look into this. In the meantime, I encourage anyone who can to do so :)
@anatol sorry about this long delay. I am now looking into this, but I cannot reproduce the issue in legacy mode. I have tried with multiple random dumps without any luck. I know the inputs have to be quite large, but is there any chance you could share one that fails on your side, please?
First generate a large input file with
    dd if=/dev/urandom of=testdata/bzImage_lz4_isolated bs=64M count=1
then run the test:
    go test -v -run TestWriterLegacy
You'll see an error message from the lz4 tool: "Stream followed by undecodable data at position 8"
Issue pierrec#156
@pierrec Yes I still see this issue at the head of
hope it helps
There is still an issue, will look into it tomorrow.
My booster test still fails with
github.com/pierrec/lz4 has numerous problems with its legacy writer (e.g. pierrec/lz4#156) that prevent its use for initramfs compression. Replace it with a wrapper around the cli tool ('lz4'). Fixes #117
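A wrapper of the kind this commit message describes could look roughly like the sketch below. It is not the actual booster code. It assumes an `lz4` binary on PATH and relies on the CLI's `-l` flag (legacy format, as used for Linux kernel compression) and `-c` flag (write to standard output). The names `legacyArgs` and `compressLegacy` are made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"os/exec"
)

// legacyArgs (hypothetical helper) returns the lz4 CLI flags used here:
// -l selects the legacy frame format, -c writes the result to stdout.
func legacyArgs() []string {
	return []string{"-l", "-c"}
}

// compressLegacy (hypothetical helper) pipes src through the lz4 CLI and
// writes the legacy-format output to dst.
func compressLegacy(dst io.Writer, src io.Reader) error {
	var stderr bytes.Buffer
	cmd := exec.Command("lz4", legacyArgs()...)
	cmd.Stdin = src // with no input file given, lz4 reads standard input
	cmd.Stdout = dst
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		return fmt.Errorf("lz4: %w: %s", err, stderr.String())
	}
	return nil
}

func main() {
	// Usage: go run . < initramfs.img > initramfs.img.lz4
	if err := compressLegacy(os.Stdout, os.Stdin); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

The tradeoff is a runtime dependency on the external tool in exchange for output produced by the reference implementation.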
There are two issues at play here: One is a bug in pierrec/lz4 when using the legacy framing format [1]. This bit us when we hit a broken size region with CL:2130, taking hours to debug. The other is the fact that the Linux LZ4 frame format has significant design issues [2], especially with concatenated initrds.

The first issue could be fixed by switching to a different LZ4 implementation (we do even have the reference impl in the monorepo), but there is no API to generate the legacy frame format, and things like [3], a patch carried by Ubuntu to fix more edge cases, just do not inspire confidence in such a solution.

Thus, this CL switches over to using zstd for compressing initrds. Zstd is slower than LZ4 for decompressing, but it still decompresses at multiple GB/s per core while having a much better compression ratio. It also doesn't have any Linux-specific bits, and Linux uses the reference implementation for decoding, which should make it much more robust. So overall I think this is a good tradeoff.

[1] pierrec/lz4#156
[2] lz4/lz4#956 (comment)
[3] https://launchpadlibrarian.net/507407918/0001-unlz4-Handle-0-size-chunks-discard-trailing-padding-.patch

Change-Id: I69cf69f2f361de325f4b39f2d3644ee729643716
Reviewed-on: https://review.monogon.dev/c/monogon/+/2313
Tested-by: Jenkins CI
Reviewed-by: Serge Bazanski <[email protected]>
Moving discussion from anatol/booster#117
If I feed a large input into the legacy writer, it produces output that neither the Linux kernel nor the
lz4
tool likes. To reproduce the problem, generate a large input, e.g. using
and then apply the patch from #151 (comment) and you'll see output like
For smaller files the output looks fine.