Memory error

Subject:	Memory error
Summary:	I get an "Allowed memory size exhausted" error
Messages:	6
Author:	Carin
Date:	2016-07-18 12:16:14

1. Memory error

Carin - 2016-07-18 12:16:14

Hi,

I get this error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 7080 bytes) in PdfToText.phpclass on line 3446

It's got something to do with the images in my pdf.

How can I exclude images from the conversion? I only need the text to search through.

Thanks for this great code!

2. Re: Memory error

Christian Vigh - 2016-07-18 12:28:49 - In reply to message 1 from Carin

Wow! This is a really interesting case. I (naively) assumed that everything would fit into memory.

Your suggestion of not processing images is really good, and I will introduce an explicit flag to say that images have to be processed (the default behavior will be: do not process the images).

However, to allow me to test my modifications, would it be possible for you to send me your pdf file at the following address:

[email protected]

It would really be a great help to me!

Christian.

3. Re: Memory error

Carin - 2016-07-18 12:49:01 - In reply to message 2 from Christian Vigh

Thank you! I just emailed you.

4. Re: Memory error

Rolf Kellner - 2016-07-20 16:16:57 - In reply to message 3 from Carin

This is not a ConvertCharset.class issue. I have already converted 25 MB documents without problems, regardless of text and image size. However, the converter does require a lot of memory while converting PDFs.
Your php.ini configuration is the bottleneck. Increase the value of
memory_limit=128M
in that file, then restart your server.
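
As a rough illustration of the advice above: the 134217728 bytes in the error message is exactly 128M (128 x 1024 x 1024). When editing php.ini is not an option, the limit can also be raised for a single script with ini_set(). The sketch below assumes a target of 256M, which is only a guess and should be tuned to your files; shorthand_to_bytes() is a hypothetical helper written for this example and is not part of PdfToText.

```php
<?php
// Convert a php.ini shorthand value such as "128M" into a number of bytes.
// Hypothetical helper for this example only, not part of PdfToText.
function shorthand_to_bytes($value)
{
    $value  = trim($value);
    $unit   = strtolower(substr($value, -1));
    $number = (int) $value;              // (int) "128M" yields 128

    switch ($unit) {
        case 'g': return $number * 1024 * 1024 * 1024;
        case 'm': return $number * 1024 * 1024;
        case 'k': return $number * 1024;
        default:  return $number;        // plain byte count, or -1 (unlimited)
    }
}

$current = ini_get('memory_limit');      // e.g. "128M", or "-1" for no limit
$wanted  = '256M';                       // assumed value, adjust as needed

// Raise the limit only when one is set and it is lower than the target.
if ($current !== false && $current !== '-1'
        && shorthand_to_bytes($current) < shorthand_to_bytes($wanted)) {
    ini_set('memory_limit', $wanted);    // applies to the current script only
}
```

Unlike a php.ini change, this does not need a server restart, but it has to run before the conversion starts.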

5. Re: Memory error

Christian Vigh - 2016-07-20 16:56:22 - In reply to message 4 from Rolf Kellner

Rolf,

Yes, that's exactly the workaround I suggested until I fix the issue.

In fact there are 2 issues:
- The first one is that the previous version of my class tried to automatically extract jpeg images; before PHP 5.6, the gdlib extension regularly complained with a memory allocation error when you tried to handle jpeg images larger than (approximately) 2 MB. Carin's pdf files have images between 2 and 3 MB, and she is using PHP 5.5.12. So I disabled automatic image extraction by default.
- The second issue is due to really big character map tables in the pdf file AND to my way of storing them, which can in turn cause a memory allocation error. In this particular case, changing the memory_limit setting is the right solution.
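
To see how close a conversion comes to the limit, PHP's memory_get_peak_usage() can be read before and after the work. This is only a minimal sketch: measure_peak_memory() is a hypothetical helper, and the closure that fills an array is a stand-in for the real conversion call, which is not shown here.

```php
<?php
// Run a callable and report how much the peak memory grew while it ran.
// Hypothetical helper for this example only, not part of PdfToText.
function measure_peak_memory($task)
{
    $before = memory_get_peak_usage(true);   // bytes reserved from the system
    $result = call_user_func($task);
    $after  = memory_get_peak_usage(true);

    return array(
        'result'       => $result,
        'peak_bytes'   => $after,
        'growth_bytes' => $after - $before,
    );
}

// Stand-in workload: build a large lookup table in memory.
// Replace this closure with the actual pdf conversion you want to measure.
$report = measure_peak_memory(function () {
    $table = array();
    for ($i = 0; $i < 200000; $i++) {
        $table[$i] = $i;
    }
    return count($table);
});

printf(
    "peak: %.1f MB, growth: %.1f MB\n",
    $report['peak_bytes'] / 1048576,
    $report['growth_bytes'] / 1048576
);
```

Comparing the reported peak with the memory_limit value gives a rough idea of how much headroom is left for larger files.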

The really good news for you, Rolf, is that the sample Carin sent me, and the issue it revealed, helped me find out why there are sometimes bad Unicode conversions (well, I hope so...).

I just have to completely rework my way of handling character maps, as well as my way of handling Unicode codepoints, which is inappropriate in some cases for European languages, and almost failing for most Middle Eastern and Far Eastern languages.

A new version (which I'll call V1.3, and not V1.2.x) should be available within one or two weeks.

6. Re: Memory error

Christian Vigh - 2016-08-03 20:17:48 - In reply to message 3 from Carin

Hello Carin,

I have finally managed to optimize memory usage for samples such as the one you provided to me ("31 August 1992.pdf").

Initially it required that the memory_limit setting in php.ini be greater than 128 MB. Now it can run even if this setting is less than 32 MB.

This modification is available in the latest release, 1.2.29.

If you still have issues with files much larger than the one you sent me, please send me a sample and I will be happy to have a look at it.
