The Reader class serves as an entry point for parsing a PDF file. There are three ways to start processing; which one you pick depends on personal preference and the situation.
For all examples, assume the receiver variable contains an object that will respond to various callbacks. Refer to the README and PDF::Reader::Content for more information on receivers.
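As a rough illustration of what such a receiver might look like, the sketch below defines a small class that records a few events. The class name CollectingReceiver is hypothetical, and the callback names it implements (metadata, page_count, show_text) are assumptions about what the parser sends; PDF::Reader::Content documents the callbacks that are actually available.

```ruby
# Hypothetical receiver that records a few events sent during parsing.
# The callback names below are assumptions; check PDF::Reader::Content
# for the authoritative list and argument shapes.
class CollectingReceiver
  attr_reader :info, :pages, :text

  def initialize
    @info  = nil  # document metadata, if the parser reports any
    @pages = nil  # number of pages, if the parser reports it
    @text  = []   # text fragments in the order they were received
  end

  # Assumed callback: called with the document's metadata.
  def metadata(data)
    @info = data
  end

  # Assumed callback: called with the number of pages.
  def page_count(count)
    @pages = count
  end

  # Assumed callback: called for each run of text on a page.
  # Extra arguments are accepted and ignored.
  def show_text(string, *params)
    @text << string
  end
end

# Drive the receiver by hand so the sketch runs without a PDF file.
receiver = CollectingReceiver.new
receiver.page_count(2)
receiver.show_text("Hello")
receiver.show_text("World")
puts receiver.text.join(" ")  # prints "Hello World"
```

In real use the receiver would not be driven by hand. It would be passed as the receiver argument to one of the parsing calls, which then invokes whichever of these methods the receiver responds to.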
PDF::Reader.file("somefile.pdf", receiver)
This is useful for processing a PDF that is already in memory:
PDF::Reader.string(pdf_string, receiver)
This can be a useful alternative to the first two options when you already have an IO object, such as an open File:
pdf = PDF::Reader.new
pdf.parse(File.new("somefile.pdf"), receiver)
Both PDF::Reader.file and PDF::Reader.string accept a third argument that specifies which parts of the file to process. By default, all options are enabled, so this can be useful to cut down processing time if you're only interested in, say, metadata.
As an example, the following call disables parsing of page contents but explicitly enables processing of metadata.
PDF::Reader.file("somefile.pdf", receiver, {:metadata => true, :pages => false})
Available options are currently:
:metadata
:pages
:raw_text
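When only some parts of a file are wanted, it can be less error-prone to build the options hash from a list of names. A minimal sketch follows; the helper reader_options and the constant KNOWN_OPTIONS are hypothetical and not part of PDF::Reader. The helper sets each of the three options listed above explicitly, so it does not rely on the library's defaults.

```ruby
# Hypothetical helper, not part of PDF::Reader: builds an options hash
# in which only the named parts are enabled.
KNOWN_OPTIONS = [:metadata, :pages, :raw_text].freeze

def reader_options(*wanted)
  # Reject misspelled or unsupported names early.
  unknown = wanted - KNOWN_OPTIONS
  unless unknown.empty?
    raise ArgumentError, "unknown option(s): #{unknown.inspect}"
  end

  # Set every known option explicitly to true or false.
  KNOWN_OPTIONS.each_with_object({}) do |name, opts|
    opts[name] = wanted.include?(name)
  end
end

metadata_only = reader_options(:metadata)
puts metadata_only[:metadata]  # prints "true"
puts metadata_only[:pages]     # prints "false"

# With the library loaded, the hash would be passed as the third argument:
#   PDF::Reader.file("somefile.pdf", receiver, metadata_only)
```

Spelling out all three keys makes the intent visible at the call site, at the cost of needing an update if the list of available options changes.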
Parse the file with the given name, sending events to the given receiver.