com.lowagie.text.pdf.parser
public class PdfTextExtractor extends Object
Since: 2.1.4
| Field Summary | |
|---|---|
| SimpleTextExtractingPdfContentStreamProcessor | extractionProcessor The processor that will extract the text. |
| PdfReader | reader The PdfReader that holds the PDF file. |
| Constructor Summary | |
|---|---|
| PdfTextExtractor(PdfReader reader) Creates a new Text Extractor object. | |
| Method Summary | |
|---|---|
| byte[] | getContentBytesForPage(int pageNum) Gets the content stream of a page. |
| String | getTextFromPage(int page) Gets the text from a page. |
PdfTextExtractor(PdfReader reader)
Parameters: reader - the reader with the PDF
getContentBytesForPage(int pageNum)
Parameters: pageNum - the number of the page whose content stream you want
Returns: a byte array with the content stream of a page
Throws: IOException
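The byte array returned for a page is the raw content stream, so it still contains PDF syntax. A minimal sketch of a helper for inspecting such bytes follows. `ContentStreamPreview` and `preview` are hypothetical names, not part of this package, and decoding as ISO-8859-1 is an assumption chosen only because it maps every byte to exactly one character.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of com.lowagie.text.pdf.parser.
final class ContentStreamPreview {

    /**
     * Decodes raw content stream bytes for inspection and returns at most
     * maxChars characters. ISO-8859-1 is an assumption: it maps each byte
     * to one character, so no byte is dropped or merged.
     */
    static String preview(byte[] contentBytes, int maxChars) {
        String decoded = new String(contentBytes, StandardCharsets.ISO_8859_1);
        return decoded.length() <= maxChars ? decoded : decoded.substring(0, maxChars);
    }
}
```

With the library on the classpath, the bytes would presumably come from `extractor.getContentBytesForPage(pageNum)`.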
getTextFromPage(int page)
Parameters: page - the page number of the page
Returns: a String with the content as plain text (without PDF syntax)
Throws: IOException
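Because text is extracted one page at a time, collecting a whole document means calling the method once per page. The sketch below illustrates that loop without depending on the library: `PageTextSource` and `PageTextCollector` are hypothetical names, the interface only mirrors the shape of `getTextFromPage(int)`, and the 1-based page numbering is an assumption not stated on this page.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of com.lowagie.text.pdf.parser.
final class PageTextCollector {

    /** Stand-in with the same shape as PdfTextExtractor.getTextFromPage(int). */
    interface PageTextSource {
        String getTextFromPage(int page) throws IOException;
    }

    /**
     * Calls the source once per page and returns the page texts in order.
     * Pages are assumed to be numbered from 1 to pageCount.
     */
    static List<String> collect(PageTextSource source, int pageCount) throws IOException {
        List<String> pages = new ArrayList<>();
        for (int page = 1; page <= pageCount; page++) {
            pages.add(source.getTextFromPage(page));
        }
        return pages;
    }
}
```

With the library on the classpath, a call might look like `PageTextCollector.collect(extractor::getTextFromPage, reader.getNumberOfPages())`, where `extractor` was built with `new PdfTextExtractor(reader)`.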