Ticket #214 (closed: fixed)

Opened 2 years ago

Last modified 2 years ago

too many inline images can't be handled

Reported by: volker Owned by: volker
Priority: immediate Component: mwlib.rl
Severity: Keywords:
Cc:

Description

reportlab does not seem to close the file handles of inline images while constructing the document. Rendering can therefore fail if the per-user limit on open files is exceeded (check with ulimit -n).
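The same limit that ulimit -n reports can also be read from inside Python. A minimal sketch, assuming a Unix system where the standard resource module is available; the helper name fd_limits is made up for illustration:

```python
import resource


def fd_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = fd_limits()
# The soft limit is the value that `ulimit -n` prints in the shell.
print("soft limit: %d, hard limit: %d" % (soft, hard))
```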

Attachments

fd-leak.zip (2.0 MB) - added by ralf 2 years ago.
zip file generated from heiko's collection
math.png (3.2 KB) - added by volker 2 years ago.
lazy-reader.diff (1.5 KB) - added by ralf 2 years ago.
use lazy loading for images

Change History

comment:1 Changed 2 years ago by jojo

Looks like reportlab.lib.utils.ImageReader never closes the file handle that it stores as self.fp in __init__().
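To illustrate the pattern described here (this is not reportlab's actual code), the sketch below uses a hypothetical stand-in class, EagerReader, that opens its file in __init__() and keeps it in self.fp. As long as the objects are referenced, each one holds a descriptor:

```python
import os


class EagerReader(object):
    """Hypothetical stand-in: opens the file in __init__ and never closes self.fp."""

    def __init__(self, path):
        self.fp = open(path, "rb")


def open_handles(readers):
    """Count how many of the given readers still hold an open file handle."""
    return sum(1 for reader in readers if not reader.fp.closed)


# Every reader kept in the list pins one file descriptor.
readers = [EagerReader(os.devnull) for _ in range(5)]
print(open_handles(readers))

# Closing explicitly releases the descriptors again.
for reader in readers:
    reader.fp.close()
```

A document that keeps one such object per inline image therefore needs one descriptor per image until the build is finished.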

comment:2 Changed 2 years ago by volker

  • Priority changed from minuscule to blocker

comment:3 Changed 2 years ago by heiko

  • Status changed from new to assigned

please check with the reportlab folks if there is any way to solve this issue.

comment:4 Changed 2 years ago by ralf

since we distribute reportlab in mwlib.ext we could also fix this ourselves.

comment:5 Changed 2 years ago by heiko

example:

mw-render -w rl -o test.pdf --keep-zip test.zip -c http://en.labs.wikimedia.org/w/ --collectionpage 'User:He!ko/Collections/This quantum world'

comment:6 Changed 2 years ago by ralf

If I add a return None in mathutils._renderMathTexvc:

def _renderMathTexvc(latex, output_path, output_mode='png'):
    """only render mode is png"""
    return None

I can render with ulimit -n 32 (i.e. 32 open files). Note that I don't have texvc installed, so this function returns None anyway on my system.

Why did you think this has something to do with reportlab? (It could still be a second bug..)

Changed 2 years ago by ralf

zip file generated from heiko's collection

comment:7 Changed 2 years ago by ralf

I've added the zip file from heiko's collection and double checked with python 2.5 (I'm using 2.6).

returning None makes it work with ulimit -n 32. Without, it fails with my standard limit of 1024 file descriptors.
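A low limit like ulimit -n 32 can also be applied from within a test script, which makes a leak show up after a few dozen files instead of a thousand. A rough sketch, assuming a Unix system and a hard limit of at least 32; the function name open_until_limit is hypothetical:

```python
import errno
import os
import resource


def open_until_limit(soft_limit=32):
    """Lower the soft fd limit, open files until EMFILE, return how many opened."""
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))
    handles = []
    try:
        while True:
            try:
                handles.append(open(os.devnull, "rb"))
            except OSError as exc:
                if exc.errno != errno.EMFILE:
                    raise
                break  # "Too many open files": the limit was reached
    finally:
        # Release everything and restore the original limit.
        for handle in handles:
            handle.close()
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))
    return len(handles)
```

The returned count is below the limit because stdin, stdout, stderr and any other descriptors already open count against it as well.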

comment:8 Changed 2 years ago by ralf

"Popen pipe file descriptor leak on OSError in init":  http://bugs.python.org/issue1751245

comment:9 Changed 2 years ago by ralf

here's another bug affecting our use of subprocess.Popen:  http://bugs.python.org/issue2791
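One way to check whether a given Python version is affected is to start a missing program repeatedly and compare the number of open descriptors before and after. A sketch under the assumption of a Linux system with /proc mounted; run_or_none and leaked_fds are illustrative names, not mwlib functions:

```python
import os
import subprocess


def run_or_none(argv):
    """Return the command's stdout, or None if it cannot be started."""
    try:
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    except OSError:
        # Raised when the executable is missing, e.g. texvc is not installed.
        return None
    out, _err = proc.communicate()  # waits for exit and closes both pipes
    return out


def leaked_fds(argv, attempts=50):
    """Count descriptors left open after repeated start attempts (Linux only)."""
    before = len(os.listdir("/proc/self/fd"))
    for _ in range(attempts):
        run_or_none(argv)
    return len(os.listdir("/proc/self/fd")) - before
```

On an interpreter without the bugs linked above, leaked_fds should report zero for a missing executable; a count that grows with the number of attempts points at the pipe leak.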

comment:10 Changed 2 years ago by ralf

I'm back to other work, you should have enough info to fix it.

comment:11 follow-up: ↓ 12 Changed 2 years ago by volker

I am pretty sure this is a reportlab issue. The number of open file handles before and after rendering the formula with renderMath does not change - there does not seem to be an fd leak here.

With the following sample code the problem can be reduced to reportlab inline images:

from reportlab.platypus.paragraph import Paragraph
from reportlab.platypus.doctemplate import SimpleDocTemplate
from reportlab.lib.styles import ParagraphStyle

import os 
doc = SimpleDocTemplate('test.pdf')
elements = []

for count in range(1024):
    print len(os.listdir("/proc/%d/fd" % os.getpid()))
    p = Paragraph('text <img src="math.png" width="2cm" height="0.5cm" />' , ParagraphStyle('Normal'))
    elements.append(p)
doc.build(elements) 

output:

...
1021
1022
1023
1024
Traceback (most recent call last):
  File "test_fd.py", line 13, in <module>
OSError: [Errno 24] Too many open files: '/proc/32376/fd'

I submitted this issue to the reportlab mailing list.

Changed 2 years ago by volker

comment:12 in reply to: ↑ 11 Changed 2 years ago by ralf

Replying to volker:

I am pretty sure this is a reportlab issue. The number of open file handles before and after rendering the formula with renderMath does not change - there does not seem to be an fd leak here.

then we have two issues here. it leaks file descriptors when texvc is not installed.

comment:13 Changed 2 years ago by ralf

btw. os.listdir("/proc/self/fd")
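Beyond counting the entries, /proc/self/fd can also tell which files are being held open, since every entry is a symlink to its target. A small Linux-only sketch; the helper name open_fd_targets is an assumption for illustration:

```python
import os


def open_fd_targets():
    """Map each open descriptor number to the path it refers to (Linux only)."""
    targets = {}
    for name in os.listdir("/proc/self/fd"):
        try:
            targets[int(name)] = os.readlink("/proc/self/fd/" + name)
        except OSError:
            # The descriptor used for the directory listing itself is already closed.
            continue
    return targets


for fd, target in sorted(open_fd_targets().items()):
    print("%d -> %s" % (fd, target))
```

If the leak comes from inline images, the same image path (math.png in the sample above) should appear many times in this listing.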

Changed 2 years ago by ralf

use lazy loading for images

comment:14 Changed 2 years ago by ralf

please try and test the above patch. I can render fd-leak.zip with ulimit -n 64 with it.
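The general idea of lazy loading can be sketched as follows. This is not the content of lazy-reader.diff, only an illustration with a hypothetical class LazyImageData: the object remembers the path, reads the bytes on first access, and closes the file right away.

```python
class LazyImageData(object):
    """Hypothetical sketch: keep only the path until the bytes are needed."""

    def __init__(self, path):
        self.path = path
        self._data = None  # nothing is opened at construction time

    @property
    def data(self):
        """Read the file on first access and close it immediately afterwards."""
        if self._data is None:
            with open(self.path, "rb") as handle:
                self._data = handle.read()
        return self._data
```

With this shape, creating thousands of objects costs no descriptors at all, and at most one file is open at any moment while the data is being read.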

comment:15 Changed 2 years ago by volker

  • Status changed from assigned to closed
  • Resolution set to fixed

works now
