PDFBox Extracting Image

What is PDFBox - Extracting Image?

In the previous section, we saw how to merge multiple PDF documents into a single document. In this section, we will learn how to generate an image from a page of a PDF document.

Generating an Image from a PDF Document

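We use the PDFRenderer class provided by the PDFBox library. It renders a page of a PDF document into an AWT BufferedImage. Note that this renders the whole page as an image; it does not pull individual embedded images out of the page.

Follow the steps below to generate an image from a PDF document.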

Step 1: Loading an Existing PDF Document
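
Load the existing PDF document from disk into a PDDocument object. The sketch below is a minimal illustration that assumes PDFBox 2.x, where the static PDDocument.load(File) method is available (PDFBox 3.x uses Loader.loadPDF(File) instead). The class name LoadPdfSketch and the helper load() are hypothetical names used only for illustration.

```java
import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;

class LoadPdfSketch {
    // Opens the PDF at the given path; the caller is responsible for closing it.
    static PDDocument load(String path) throws IOException {
        // PDFBox 2.x API (assumption); in PDFBox 3.x use Loader.loadPDF(new File(path)).
        return PDDocument.load(new File(path));
    }

    public static void main(String[] args) throws IOException {
        PDDocument document = load("C:/PdfBox_Examples/sample.pdf");
        System.out.println("Pages: " + document.getNumberOfPages());
        document.close();
    }
}
```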

Step 2: Instantiating the PDFRenderer Class

The PDFRenderer class must be instantiated in order to render a PDF document into an AWT BufferedImage. Its constructor accepts the document object as a parameter.
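
As a rough sketch of this step, the renderer is created by passing the document to the PDFRenderer constructor. To keep the example self-contained, it builds a one-page blank document in memory as a stand-in for a loaded PDF; RendererSetupSketch and rendererFor() are hypothetical names.

```java
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.rendering.PDFRenderer;

class RendererSetupSketch {
    // Wraps the document in a renderer; the document must stay open while rendering.
    static PDFRenderer rendererFor(PDDocument document) {
        return new PDFRenderer(document);
    }

    public static void main(String[] args) throws IOException {
        try (PDDocument document = new PDDocument()) {
            document.addPage(new PDPage()); // blank page standing in for a loaded PDF
            PDFRenderer renderer = rendererFor(document);
            System.out.println("Renderer created: " + (renderer != null));
        }
    }
}
```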

Step 3: Rendering Image from the PDF Document

Once the class is instantiated, render the page as an image using the renderImage() method of the PDFRenderer class. Pass the zero-based index of the page you want to render.
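
The rendering call might look like the sketch below. It uses renderImageWithDPI(), a sibling of renderImage() that also takes a resolution; renderImage(pageIndex) alone renders at 72 DPI. The blank in-memory page and the names RenderPageSketch and renderPage() are assumptions made so the example runs without an input file.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.rendering.PDFRenderer;

class RenderPageSketch {
    // Renders one page (zero-based index) at the requested resolution.
    static BufferedImage renderPage(PDDocument document, int pageIndex, float dpi)
            throws IOException {
        PDFRenderer renderer = new PDFRenderer(document);
        return renderer.renderImageWithDPI(pageIndex, dpi);
    }

    public static void main(String[] args) throws IOException {
        try (PDDocument document = new PDDocument()) {
            document.addPage(new PDPage()); // blank page standing in for a loaded PDF
            BufferedImage image = renderPage(document, 0, 72f);
            System.out.println(image.getWidth() + " x " + image.getHeight() + " pixels");
        }
    }
}
```

A higher DPI value gives a sharper image at the cost of memory and rendering time.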

Step 4: Writing the Image to a File

Once the page is rendered, write the image to a file using the write() method of the ImageIO class. This method accepts three parameters:

  • The rendered image object.
  • String representing the type of the image (jpg or png).
  • File object to which you need to save the image.
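
The write step relies only on the standard javax.imageio.ImageIO class, so it can be sketched without PDFBox. In the example below, a plain white BufferedImage stands in for the rendered page, and WriteImageSketch and save() are hypothetical names. ImageIO.write() returns false when no writer supports the requested format for that image type, so the return value is worth checking.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

class WriteImageSketch {
    // Writes the image in the given format; returns false if no suitable writer exists.
    static boolean save(BufferedImage image, String format, File target) throws IOException {
        return ImageIO.write(image, format, target);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the image returned by the renderer (assumption for illustration).
        BufferedImage image = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, 200, 100);
        g.dispose();

        File target = new File("myimage.jpg");
        boolean written = save(image, "jpg", target);
        System.out.println(written ? "Image written to " + target.getAbsolutePath()
                                   : "No writer found for the requested format");
    }
}
```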

Step 5: Closing the Document

Lastly, close the document using the close() method of the PDDocument class.
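
Because PDDocument implements Closeable, one way to make sure close() is always called, even when an exception is thrown, is a try-with-resources block. The sketch below assumes PDFBox 2.x (PDDocument.load) and uses the hypothetical names CloseDocumentSketch and countPages().

```java
import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;

class CloseDocumentSketch {
    // Opens the PDF, reads the page count, and closes the document on exit.
    static int countPages(File pdf) throws IOException {
        try (PDDocument document = PDDocument.load(pdf)) { // PDFBox 2.x API (assumption)
            return document.getNumberOfPages();
        } // document.close() is invoked automatically here
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countPages(new File("C:/PdfBox_Examples/sample.pdf")));
    }
}
```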

Example

Let us consider a PDF document named sample.pdf in the path C:/PdfBox_Examples/ which contains an image on its first page.


This example converts the first page of the PDF document mentioned above into an image file and saves it as myimage.jpg. The program is saved in a file named PdfToImage.java.
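
A complete program along these lines could look like the sketch below. It assumes PDFBox 2.x, writes the image next to the input file (the output directory is an assumption), and prints "Image created" as an illustrative confirmation message; the helper renderFirstPage() is a hypothetical name.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;

class PdfToImage {
    // Renders the first page of the PDF and saves it as a JPEG file.
    static void renderFirstPage(File pdf, File output) throws IOException {
        // Step 1: load the document (PDFBox 2.x API); Step 5: closed automatically.
        try (PDDocument document = PDDocument.load(pdf)) {
            // Step 2: instantiate the renderer.
            PDFRenderer renderer = new PDFRenderer(document);
            // Step 3: render page index 0, the first page.
            BufferedImage image = renderer.renderImage(0);
            // Step 4: write the image to a file.
            if (!ImageIO.write(image, "jpg", output)) {
                throw new IOException("No JPEG writer available");
            }
        }
    }

    public static void main(String[] args) throws IOException {
        renderFirstPage(new File("C:/PdfBox_Examples/sample.pdf"),
                        new File("C:/PdfBox_Examples/myimage.jpg"));
        System.out.println("Image created");
    }
}
```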

Once the file is saved, compile it from the command prompt with javac PdfToImage.java and run it with java PdfToImage, making sure the PDFBox JAR files are on the classpath.

The program renders the first page of the given PDF document as an image and prints a confirmation message on execution.

Check the path given while saving the image, and you will find that an image file named myimage.jpg has been created.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd
