[TriLUG] To The Oracle:

Brian McCullough bdmc at buadh-brath.com
Tue Apr 29 21:42:21 EDT 2014


Greetings, all.

Once again, I have what I hope is an interesting question that some, or
many, of you can help with.


Last fall, I learned about creating PDFs from PHP code. Now I need to go
the other way and extract data from PDFs.

I have found more than one method in PHP for reading PDFs, but,
unfortunately, even the newest methods don't seem to be able to deal
with "modern" PDFs, version 1.4.

Here, instead of text with other markup, as we see in older PDFs, there
seem to be blocks of binary data intermixed with markup.


Does anybody have any suggestions for dealing with this new version of
PDF?

Although I would like to do this in PHP, I will take other languages if
necessary.



Thanks,
Brian


