feat: Dropped custom image handling, added full Pillow support #118

Open: 2 commits to be merged into base: master

Conversation

pennersr

With this PR, images are processed using PIL (Pillow). This adds support for any image format that PIL supports. Additionally, numpy is used to speed up image processing. This PR effectively makes #90 and #117 obsolete.

@@ -33,6 +33,7 @@
url='http://code.google.com/p/pyfpdf',
license='LGPLv3+',
download_url="https://github.com/reingart/pyfpdf/tarball/%s" % fpdf.__version__,
install_requires=['numpy>=1.15.4', 'Pillow>=5.3.0'],
Collaborator

Disagree with numpy dependency

Author

@pennersr pennersr Nov 13, 2018

Please have a look here:

re_c = re.compile('(...).'.encode("ascii"), flags=re.DOTALL)

Here, bytes representing RGBA pixel values are processed using regular expressions. This is extremely slow. If you care about performance while generating PDFs, this won't cut it.
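To make the concern concrete, here is a minimal sketch of how a pattern like the one quoted above can be used to split packed RGBA bytes into color and alpha. The helper name `split_rgba_regex` and the companion alpha pattern `re_a` are assumptions for illustration, not a verbatim copy of the pyfpdf code. The point is that the substitution callback runs once per pixel in Python.

```python
import re

# Patterns in the style quoted above: each match spans one 4-byte RGBA pixel.
# re.DOTALL is required so that "." also matches the newline byte (value 10).
re_c = re.compile('(...).'.encode("ascii"), flags=re.DOTALL)  # captures R, G, B
re_a = re.compile('...(.)'.encode("ascii"), flags=re.DOTALL)  # captures A (assumed companion pattern)


def split_rgba_regex(rgba):
    """Split packed RGBA bytes into (RGB bytes, alpha bytes) via regex substitution.

    Hypothetical helper: the lambda is called once for every pixel,
    which is what makes this approach slow on large images.
    """
    color = re_c.sub(lambda m: m.group(1), rgba)
    alpha = re_a.sub(lambda m: m.group(1), rgba)
    return color, alpha


# Two pixels: opaque red, then half-transparent blue.
pixels = bytes([255, 0, 0, 255, 0, 0, 255, 128])
color, alpha = split_rgba_regex(pixels)
```

For a 1000x1000 image this means roughly two million Python-level callback invocations (one per pixel for each of the two patterns), before any compression happens.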

have all the deps you want in my fork

Author

Are you referring to this code?

https://github.com/alexanderankin/pyfpdf/blob/master/fpdf/image_parsing.py#L199

In any case, the way the pixels are being processed (by means of regular expressions separating RGBA into RGB and A) is really slow. So, yes, numpy is an additional dependency, but at least image handling is much more performant now.
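As an illustration of the numpy side of this trade-off, here is a minimal sketch of splitting RGBA into RGB and A with array slicing. The function name `split_rgba_numpy` is hypothetical, and this is not necessarily the exact code in this PR. It only uses `np.frombuffer`, `reshape`, and `tobytes`.

```python
import numpy as np


def split_rgba_numpy(rgba):
    """Split packed RGBA bytes into (RGB bytes, alpha bytes) using array slicing.

    Hypothetical helper for illustration. Raises ValueError if the input
    length is not a multiple of 4.
    """
    # View the raw bytes as an (n_pixels, 4) array without copying.
    pixels = np.frombuffer(rgba, dtype=np.uint8).reshape(-1, 4)
    color = pixels[:, :3].tobytes()  # R, G, B columns, pixel by pixel
    alpha = pixels[:, 3].tobytes()   # A column only
    return color, alpha


# Two pixels: opaque red, then half-transparent blue.
sample = bytes([255, 0, 0, 255, 0, 0, 255, 128])
rgb_bytes, alpha_bytes = split_rgba_numpy(sample)
```

The slicing happens in compiled code over the whole buffer at once, so there is no per-pixel Python call.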

Oh, I'm saying I'll accept such a PR on my fork, that's all. I agree with this move.

Actually, can you help me test it? I copied this bit into my branch. The way I have my tests set up, each test generates a PDF, computes its hash, compares the hash, and then os.unlinks the generated PDF. However, the proposed get_img_info function inserts an extra object and stream for the thumbnail preview, which of course makes the tests fail. I should probably just redo all my tests? Thoughts?
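For readers unfamiliar with this style of test, here is a minimal sketch of the generate, hash, compare, unlink pattern described above. All names (`check_output_hash`, `fake_generate`) are hypothetical, `fake_generate` is a stand-in that writes a few bytes rather than a real PDF, and SHA-256 is an arbitrary choice of hash. It shows why any extra object in the output changes the hash and fails the comparison.

```python
import hashlib
import os
import tempfile


def check_output_hash(generate, expected_hash):
    """Run generate(path), hash the written file, and always delete it afterwards."""
    fd, path = tempfile.mkstemp(suffix=".pdf")
    os.close(fd)
    try:
        generate(path)
        with open(path, "rb") as fh:
            actual = hashlib.sha256(fh.read()).hexdigest()
        return actual == expected_hash
    finally:
        os.unlink(path)


def fake_generate(path, extra_object=False):
    """Stand-in for PDF generation (assumption: not the real fpdf output)."""
    body = b"%PDF-1.3\n1 0 obj\n<<>>\nendobj\n"
    if extra_object:
        # Simulates an additional object/stream appearing in the output.
        body += b"2 0 obj\n<<>>\nendobj\n"
    with open(path, "wb") as fh:
        fh.write(body)


# Hash recorded from the "old" output.
baseline = hashlib.sha256(b"%PDF-1.3\n1 0 obj\n<<>>\nendobj\n").hexdigest()
unchanged_ok = check_output_hash(fake_generate, baseline)
changed_ok = check_output_hash(lambda p: fake_generate(p, extra_object=True), baseline)
```

Because the comparison is byte-exact, a change like this usually means regenerating the reference hashes after checking the new output by eye.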

I updated mine to 2.0.1 with all these changes, but it broke a lot of my tests :/

Author

I pushed another fix to this branch to properly deal with unsupported extensions. Not sure about the inner workings of the test suite, but it seems to me that due to the Pillow changes the hashes of the resources changed.

>> Tests: 28
Test 1 / 28 : test_cache.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 2 / 28 : test_corebox.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 3 / 28 : test_e1252.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 4 / 28 : test_imgmask.py
HASHER  SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
Test 5 / 28 : test_invoice.py
HASHER  FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    
Test 6 / 28 : test_issue14.py
HASHER  FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    FAIL    
Test 7 / 28 : test_issue33.py
HASHER  SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
Test 8 / 28 : test_issue35.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 9 / 28 : test_issue41.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 10 / 28 : test_issue60.py
SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
Test 11 / 28 : test_issue62.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 12 / 28 : test_issue63.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 13 / 28 : test_issue70.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 14 / 28 : test_issue71.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 15 / 28 : test_issue78.py
HASHER  SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
Test 16 / 28 : test_issue82.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 17 / 28 : test_jpeg.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 18 / 28 : test_nbpages.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 19 / 28 : test_output.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 20 / 28 : test_page_orient.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 21 / 28 : test_page_size.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 22 / 28 : test_py3k.py
OK      OK      OK      OK      OK      OK      SKIP    SKIP    OK      SKIP    SKIP    SKIP    SKIP    
Test 23 / 28 : test_simple.py
HASHER  SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    
Test 24 / 28 : test_stretching.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 25 / 28 : test_template.py
OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      OK      
Test 26 / 28 : test_ttfonts.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 27 / 28 : test_unicode.py
NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   NORES   
Test 28 / 28 : test_winfonts.py
SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    SKIP    

Alilino commented Feb 2, 2019

@pennersr I tried your version and honestly it was a breakthrough. Usually it takes me 230s at best to generate a PDF, and now it is only 7s!!! Many thanks!

Lucas-C commented Jan 6, 2021

Note that image handling has improved a lot, using Pillow, in fpdf2.

5 participants