Ticket #214 (closed: fixed)

Opened 20 months ago

Last modified 19 months ago

too many inline images can't be handled

Reported by: volker Owned by: volker
Priority: immediate Milestone:
Component: mwlib.rl Severity:
Keywords: Cc:

Description

reportlab does not seem to close the file handles of inline images while constructing the document. Rendering can therefore fail if the per-user limit on open files is exceeded (check the limit with ulimit -n).
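The same limit can also be read from inside a Python process. The following is a minimal sketch using the standard resource module (Unix only); the helper name open_file_limits is made up for illustration and is not part of mwlib or reportlab.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process.

    The soft limit is the value that `ulimit -n` reports by default.
    """
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
print("soft limit: %s, hard limit: %s" % (soft, hard))
```

A renderer that embeds many inline images could compare the number of images against the soft limit before building the document.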

Attachments

fd-leak.zip (2.0 MB) - added by ralf 19 months ago.
zip file generated from heiko's collection
math.png (3.2 KB) - added by volker 19 months ago.
lazy-reader.diff (1.5 KB) - added by ralf 19 months ago.
use lazy loading for images

Change History

  Changed 20 months ago by jojo

Looks like reportlab.lib.utils.ImageReader never closes the file handle that it stores as self.fp in __init__().

  Changed 20 months ago by volker

  • priority changed from minuscule to blocker

  Changed 20 months ago by heiko

  • status changed from new to assigned

Please check with the reportlab folks whether there is any way to solve this issue.

  Changed 19 months ago by ralf

since we distribute reportlab in mwlib.ext we could also fix this ourselves.

  Changed 19 months ago by heiko

example:

mw-render -w rl -o test.pdf --keep-zip test.zip -c http://en.labs.wikimedia.org/w/ --collectionpage 'User:He!ko/Collections/This quantum world'

  Changed 19 months ago by ralf

If I add a return None in mathutils._renderMathTexvc:

def _renderMathTexvc(latex, output_path, output_mode='png'):
    """only render mode is png"""
    return None

I can render with ulimit -n 32 (i.e. 32 open files). Note that I don't have texvc installed, so this function returns None anyway on my system.

Why did you think this has something to do with reportlab? (It could still be a second bug..)

Changed 19 months ago by ralf

zip file generated from heiko's collection

  Changed 19 months ago by ralf

I've added the zip file from heiko's collection and double-checked with python 2.5 (I'm using 2.6).

Returning None makes it work with ulimit -n 32. Without it, rendering fails even with my standard limit of 1024 file descriptors.

  Changed 19 months ago by ralf

"Popen pipe file descriptor leak on OSError in init":  http://bugs.python.org/issue1751245

  Changed 19 months ago by ralf

here's another bug affecting our use of subprocess.Popen:  http://bugs.python.org/issue2791
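One way to stay off the failing code path described in these reports is to not call Popen at all when the executable is missing. The sketch below assumes Python 3.3 or later (for shutil.which); run_if_available is a hypothetical helper, not the mwlib code, and it does not fix the interpreter bugs themselves.

```python
import shutil
import subprocess
import sys


def run_if_available(args):
    """Run args and return its stdout as bytes.

    Returns None if the executable cannot be found, cannot be started,
    or exits with a non-zero status.
    """
    executable = shutil.which(args[0])
    if executable is None:
        # Binary not installed: skip Popen entirely.
        return None
    try:
        proc = subprocess.Popen(
            [executable] + list(args[1:]),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
    except OSError:
        # The binary may still fail to start (e.g. removed in the meantime).
        return None
    # communicate() reads both pipes to EOF, closes them and waits for exit.
    out, _err = proc.communicate()
    return out if proc.returncode == 0 else None


missing = run_if_available(["no-such-binary-for-ticket-214"])
present = run_if_available([sys.executable, "-c", "print('ok')"])
```

A caller such as a math renderer can then treat None as "formula could not be rendered" without having touched any pipes.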

  Changed 19 months ago by ralf

I'm back to other work, you should have enough info to fix it.

follow-up: ↓ 12   Changed 19 months ago by volker

I am pretty sure this is a reportlab issue. The number of open file handles before and after rendering the formula with renderMath does not change - there does not seem to be an fd leak here.

With the following sample code the problem can be reduced to reportlab inline images:

from reportlab.platypus.paragraph import Paragraph
from reportlab.platypus.doctemplate import SimpleDocTemplate
from reportlab.lib.styles import ParagraphStyle

import os 
doc = SimpleDocTemplate('test.pdf')
elements = []

for count in range(1024):
    print len(os.listdir("/proc/%d/fd" % os.getpid()))
    p = Paragraph('text <img src="math.png" width="2cm" height="0.5cm" />' , ParagraphStyle('Normal'))
    elements.append(p)
doc.build(elements) 

output:

...
1021
1022
1023
1024
Traceback (most recent call last):
  File "test_fd.py", line 13, in <module>
OSError: [Errno 24] Too many open files: '/proc/32376/fd'

I submitted this issue to the reportlab mailing list.

Changed 19 months ago by volker

in reply to: ↑ 11   Changed 19 months ago by ralf

Replying to volker:

I am pretty sure this is a reportlab issue. The number of open file handles before and after rendering the formula with renderMath does not change - there does not seem to be an fd leak here.

Then we have two issues here: it also leaks file descriptors when texvc is not installed.

  Changed 19 months ago by ralf

btw, os.listdir("/proc/self/fd") gives the same listing without needing os.getpid().
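The fd-counting check used in this thread can be wrapped in a small helper. This is a sketch that assumes a system exposing /proc/self/fd (Linux) or /dev/fd; count_open_fds is an illustrative name. The listing itself briefly uses one descriptor, so only differences between counts are meaningful.

```python
import os


def count_open_fds():
    """Count the file descriptors currently open in this process."""
    for fd_dir in ("/proc/self/fd", "/dev/fd"):
        if os.path.isdir(fd_dir):
            return len(os.listdir(fd_dir))
    raise RuntimeError("no fd directory available on this platform")


before = count_open_fds()
# Open five handles and keep them referenced, as a leak would.
handles = [open(os.devnull, "rb") for _ in range(5)]
during = count_open_fds()
for handle in handles:
    handle.close()
after = count_open_fds()

print(before, during, after)
```

Calling the helper before and after a suspect step (rendering a formula, creating a Paragraph) shows whether that step leaves descriptors open.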

Changed 19 months ago by ralf

use lazy loading for images

  Changed 19 months ago by ralf

please try and test the above patch. I can render fd-leak.zip with ulimit -n 64 with it.
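For readers without the attachment, the general idea of lazy loading can be sketched as follows. This is not the content of lazy-reader.diff and not reportlab's ImageReader; LazyImageSource is a hypothetical class that only remembers the path and opens the file for the short moment the bytes are needed.

```python
import os
import tempfile


class LazyImageSource(object):
    """Hypothetical stand-in for an image reader that holds no open file."""

    def __init__(self, path):
        # Only the path is stored; no file is opened here.
        self.path = path

    def read_bytes(self):
        # Open, read and close in one step, so at most one descriptor
        # is in use no matter how many sources exist.
        with open(self.path, "rb") as fp:
            return fp.read()


PAYLOAD = b"placeholder bytes, not a real PNG"

# Write a throwaway file to stand in for math.png.
fd, path = tempfile.mkstemp(suffix=".png")
os.write(fd, PAYLOAD)
os.close(fd)

# More sources than a typical 1024-descriptor limit would allow as open files.
sources = [LazyImageSource(path) for _ in range(2000)]
sizes = set(len(source.read_bytes()) for source in sources)
os.remove(path)

print(sizes)
```

The trade-off is that the file is reopened on each read, which costs a little time but keeps the number of simultaneously open descriptors constant.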

  Changed 19 months ago by volker

  • status changed from assigned to closed
  • resolution set to fixed

works now
