feat: Dropped custom image handling, added full Pillow support #118
@@ -33,6 +33,7 @@
       url='http://code.google.com/p/pyfpdf',
       license='LGPLv3+',
       download_url="https://github.com/reingart/pyfpdf/tarball/%s" % fpdf.__version__,
+      install_requires=['numpy>=1.15.4', 'Pillow>=5.3.0'],
I disagree with the numpy dependency.
Please have a look here:
Line 1945 in 08e28af
re_c = re.compile('(...).'.encode("ascii"), flags=re.DOTALL)
Here, the bytes representing RGBA pixel values are processed using regular expressions. This is extremely slow. If you care about performance when generating PDFs, this won't cut it.
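To make the point concrete, here is a minimal sketch of what that pattern does to a run of RGBA bytes, next to a plain slicing version that yields the same result. The function names `split_rgba_regex` and `split_rgba_slices` are hypothetical, and the alpha pattern `...(.)` is an assumption about the companion expression; only the colour pattern is quoted above.

```python
import re


def split_rgba_regex(pixels: bytes):
    """Split packed RGBA bytes into RGB and alpha using regular expressions."""
    # '(...).' keeps the first three bytes of every 4-byte group (R, G, B).
    re_c = re.compile(b"(...).", flags=re.DOTALL)
    # '...(.)' keeps only the fourth byte of every 4-byte group (A).
    # Assumed counterpart of the colour pattern; not quoted in this thread.
    re_a = re.compile(b"...(.)", flags=re.DOTALL)
    rgb = re_c.sub(lambda m: m.group(1), pixels)
    alpha = re_a.sub(lambda m: m.group(1), pixels)
    return rgb, alpha


def split_rgba_slices(pixels: bytes):
    """Same split using extended slicing, with no per-pixel callback."""
    alpha = pixels[3::4]      # every fourth byte, starting at index 3
    rgb = bytearray(pixels)
    del rgb[3::4]             # drop the alpha bytes in place
    return bytes(rgb), alpha


# Three pixels; the last one contains newline and carriage-return bytes,
# which is why the regex version needs re.DOTALL.
sample = bytes([1, 2, 3, 4, 5, 6, 7, 8, 10, 13, 0, 255])
```

The regex version calls a Python function once per pixel, which is where the cost comes from on large images.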
You can have all the deps you want in my fork.
Are you referring to this code?
https://github.com/alexanderankin/pyfpdf/blob/master/fpdf/image_parsing.py#L199
In any case, the way the pixels are processed (using regular expressions to separate RGBA into RGB and A) is very slow. So yes, numpy is an additional dependency, but at least image handling is much faster now.
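As a rough illustration of why numpy helps here, the RGBA split can be done with array slicing instead of a per-pixel regex callback. This is only a sketch under assumptions, not the code in this PR: the helper name `split_rgba_numpy` and its `width`/`height` parameters are hypothetical, and the input is assumed to be tightly packed 8-bit RGBA with no per-row filter bytes.

```python
import numpy as np


def split_rgba_numpy(pixels: bytes, width: int, height: int):
    """Split packed 8-bit RGBA bytes into separate RGB and alpha byte strings."""
    # View the buffer as a (height, width, 4) array without copying.
    arr = np.frombuffer(pixels, dtype=np.uint8).reshape(height, width, 4)
    rgb = arr[:, :, :3].tobytes()   # first three channels, row-major order
    alpha = arr[:, :, 3].tobytes()  # fourth channel only
    return rgb, alpha


# A 2x1 image: two RGBA pixels.
rgb, alpha = split_rgba_numpy(bytes([1, 2, 3, 4, 5, 6, 7, 8]), width=2, height=1)
```

The slicing runs in compiled code over the whole buffer, so the cost no longer scales with one Python-level call per pixel.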
Oh, I'm saying I'll accept such a PR on my fork, that's all. I agree with this move.
Actually, can you help me test it? I copied this bit into my branch. The way I have my tests set up, each test generates a PDF, computes its hash, compares the hash against the expected value, and then os.unlinks the generated PDF. However, the proposed get_img_info function inserts an extra object and stream for the thumbnail preview, which of course makes the tests fail. Should I just redo all my tests? Thoughts?
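For readers unfamiliar with this style of test, the flow described above looks roughly like the sketch below. It is an illustration only, not the fork's actual test code: `file_sha256` and `check_output` are hypothetical helpers, SHA-256 is an assumed choice of hash, and `generate` stands in for whatever writes the PDF.

```python
import hashlib
import os
import tempfile


def file_sha256(path: str) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_output(generate, expected_hash: str) -> bool:
    """Generate a file, compare its hash to the expected one, then delete it."""
    fd, path = tempfile.mkstemp(suffix=".pdf")
    os.close(fd)
    try:
        generate(path)  # hypothetical callable that writes the PDF to `path`
        return file_sha256(path) == expected_hash
    finally:
        os.unlink(path)  # always clean up the generated file
```

Because the comparison is byte-exact, any change in how images are embedded (for example an extra object or stream) changes the hash even when the rendered page looks the same, so the expected hashes have to be regenerated after such a change.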
I updated mine to 2.0.1 with all these changes, but it broke a lot of my tests :/
I pushed another fix to this branch to properly deal with unsupported extensions. Not sure about the inner workings of the test suite, but it seems to me that due to the Pillow changes the hashes of the resources changed.
>> Tests: 28
Test 1 / 28 : test_cache.py
NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES
Test 2 / 28 : test_corebox.py
NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES
Test 3 / 28 : test_e1252.py
OK OK OK OK OK OK OK OK OK OK OK OK OK
Test 4 / 28 : test_imgmask.py
HASHER SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP
Test 5 / 28 : test_invoice.py
HASHER FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL
Test 6 / 28 : test_issue14.py
HASHER FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL
Test 7 / 28 : test_issue33.py
HASHER SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP
Test 8 / 28 : test_issue35.py
OK OK OK OK OK OK OK OK OK OK OK OK OK
Test 9 / 28 : test_issue41.py
OK OK OK OK OK OK OK OK OK OK OK OK OK
Test 10 / 28 : test_issue60.py
SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP
Test 11 / 28 : test_issue62.py
NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES
Test 12 / 28 : test_issue63.py
NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES
Test 13 / 28 : test_issue70.py
OK OK OK OK OK OK OK OK OK OK OK OK OK
Test 14 / 28 : test_issue71.py
OK OK OK OK OK OK OK OK OK OK OK OK OK
Test 15 / 28 : test_issue78.py
HASHER SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP
Test 16 / 28 : test_issue82.py
NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES
Test 17 / 28 : test_jpeg.py
OK OK OK OK OK OK OK OK OK OK OK OK OK
Test 18 / 28 : test_nbpages.py
NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES
Test 19 / 28 : test_output.py
OK OK OK OK OK OK OK OK OK OK OK OK OK
Test 20 / 28 : test_page_orient.py
OK OK OK OK OK OK OK OK OK OK OK OK OK
Test 21 / 28 : test_page_size.py
OK OK OK OK OK OK OK OK OK OK OK OK OK
Test 22 / 28 : test_py3k.py
OK OK OK OK OK OK SKIP SKIP OK SKIP SKIP SKIP SKIP
Test 23 / 28 : test_simple.py
HASHER SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP
Test 24 / 28 : test_stretching.py
NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES
Test 25 / 28 : test_template.py
OK OK OK OK OK OK OK OK OK OK OK OK OK
Test 26 / 28 : test_ttfonts.py
NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES
Test 27 / 28 : test_unicode.py
NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES NORES
Test 28 / 28 : test_winfonts.py
SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP SKIP
@pennersr I tried your version and honestly it was a breakthrough. It usually takes me 230 s at best to generate a PDF, and now it takes only 7 s! Many thanks!
Note that image handling has improved a lot, using Pillow, in fpdf2.
With this PR, images are processed using PIL. This adds support for any PIL-supported image format. Additionally, numpy is used to speed up image processing. This PR effectively makes #90 and #117 obsolete.