Hot questions for using PDFBox in Eclipse


I'm trying to use PDFBox in Eclipse, but I get this error when I run my program.

>Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory

>   at org.apache.pdfbox.pdfparser.BaseParser.<clinit>(

>   at com.pdf.util.PDFTextParser.<init>(

>   at com.pdf.util.PDFTextParser.main(

>Caused by: java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory

>   at Source)

>   at java.lang.ClassLoader.loadClass(Unknown Source)

>   at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)

>   at java.lang.ClassLoader.loadClass(Unknown Source)

>   ... 3 more

The program stops at this line of code:

    parser = new PDFParser(new FileInputStream(file));

PDFParser comes from PDFBox.

I'm guessing there's something wrong with how I've attached the JAR files?

  • I moved all the JAR files to a folder I created called "lib", which is part of the project.
  • I went into project Properties -> Java Build Path and clicked "Add External JARs" for every JAR file.
  • After doing this I noticed that it said "Source attachment: none" for each of the JARs, so I clicked Edit and set the destination to its location in the lib folder.
  • When I go into Run Configuration, under Classpath, I can see the JAR files listed underneath my project.


PDFBox requires Commons Logging (see this dependencies page from the project's website). You need to put that JAR on the classpath alongside the PDFBox JAR. If you use a build tool like Maven, it should download it for your project automatically.
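One quick way to confirm whether Commons Logging is actually visible at runtime is to try loading the class named in the stack trace. This is only a minimal sketch: the `ClasspathCheck` class and its `isOnClasspath` helper are made-up names for illustration and are not part of PDFBox or Commons Logging.

```java
class ClasspathCheck {
    // Hypothetical helper: true if the named class can be located by this class's loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the NoClassDefFoundError in the question
        String needed = "org.apache.commons.logging.LogFactory";
        System.out.println(needed + " on classpath: " + isOnClasspath(needed));
    }
}
```

If this prints `false` when launched from the same Eclipse run configuration, the Commons Logging JAR is probably still missing from the runtime classpath.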


I am using PDFBox as an external library in a Java project in Eclipse. Every time I use a class or method from PDFBox, a Java execution window appears, as if my program were launching another Java program (it is the same Java window that shows up when I use PDFBox from the terminal).

This does not happen with other libraries, and I suspect it slows my program down (maybe not true). I also just do not like it. Does anyone know why this happens and how to control it?

See the rightmost icon? It appears every time I run my program with PDFbox involved.

Here is the piece of code I use to extract text:

    PDDocument document = PDDocument.load(file_name);
    PDFTextStripper stripper = new PDFTextStripper();
    int num_of_pages = document.getNumberOfPages();
    int begin_page = num_of_pages - (num_of_pages/5+1);
    String all_text = stripper.getText(document);


I know this is ancient, but in case you're like me and Googling around trying to solve this: update your run configuration with the following under VM options (in Eclipse: Run Configurations -> Arguments -> VM arguments):

    -Djava.awt.headless=true -Djava.awt.headlessLib=true
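If editing the run configuration is inconvenient, the standard `java.awt.headless` property can also be set from code, as long as that happens before any AWT class is initialized. The sketch below assumes the extra window comes from AWT being started by the library; it sets only `java.awt.headless`, and the `HeadlessSetup` class name is made up for illustration.

```java
import java.awt.GraphicsEnvironment;

class HeadlessSetup {
    // Requests headless AWT. Must run before any AWT/Swing class is initialized,
    // so call it first thing in main().
    static boolean enableHeadless() {
        System.setProperty("java.awt.headless", "true");
        // Reports whether the JVM now considers itself headless
        return GraphicsEnvironment.isHeadless();
    }

    public static void main(String[] args) {
        System.out.println("headless = " + enableHeadless());
        // ... PDFBox calls such as PDDocument.load(...) would follow here ...
    }
}
```

If `enableHeadless()` returns `false`, some AWT code probably ran earlier, and the VM argument is the more reliable route.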


I am working on a plain Java project in Eclipse Juno using JRE 6/JDK 6 as the runtime and compiler. I want to use Apache PDFBox to generate some PDFs. I downloaded PDFBox 1.8.9 and added it to my build path. I then took a code sample from here and used it in my application, but it gives me multiple errors, which I think are related to an environment problem.

    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDPage;

    public class TestPdf {

        PDDocument document = new PDDocument();
        // Create a new blank page and add it to the document
        PDPage blankPage = new PDPage();
        document.addPage( blankPage );
        // Save the newly created document
        document.save("BlankPage.pdf");
        // finally make sure that the document is properly
        // closed.
        document.close();
    }

These are the errors I am getting:

Syntax error on token "blankPage", VariableDeclaratorId expected after this token
Syntax error on token ""BlankPage.pdf"", delete this token
Syntax error on token "close", Identifier expected after this token


You should create a method and move the statements inside it:

    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDPage;

    public class TestPdf {

        PDDocument document = new PDDocument();
        // Create a new blank page and add it to the document
        PDPage blankPage = new PDPage();

        public void createDocument() throws Exception {
            document.addPage( blankPage );
            // Save the newly created document
            document.save("BlankPage.pdf");
            // finally make sure that the document is properly
            // closed.
            document.close();
        }
    }

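The code you posted in your question breaks the syntax rules of the Java language: only declarations may appear directly in a class body, while statements such as method calls must sit inside a method, constructor, or initializer block. You can read more about the structure of a class here.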
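The same rule can be seen without PDFBox at all. The following is a small illustrative sketch using only the standard library; `ClassStructureDemo` and `addBlankPage` are made-up names, and the list of strings merely stands in for a document's pages.

```java
import java.util.ArrayList;
import java.util.List;

class ClassStructureDemo {
    // Allowed at class level: a field declaration with an initializer
    private final List<String> pages = new ArrayList<>();

    // NOT allowed at class level: a bare statement such as
    //     pages.add("blank");
    // The compiler rejects it with a syntax error like the ones in the question.

    // Statements belong inside a method (or constructor / initializer block)
    int addBlankPage() {
        pages.add("blank");
        return pages.size();
    }

    public static void main(String[] args) {
        System.out.println("pages: " + new ClassStructureDemo().addBlankPage());
    }
}
```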


I am using pdfbox-0.7.3.jar. I know the missing class files are related to the pdfbox-0.7.3 JAR, but even after I attach the source, Eclipse keeps reporting missing .class files. I am looking for suggestions on the error below.

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;

    import org.pdfbox.cos.COSDocument;
    import org.pdfbox.pdfparser.PDFParser;
    import org.pdfbox.pdmodel.PDDocument;
    import org.pdfbox.util.PDFTextStripper;

    public class ggg {
        public static void main(String args[]) {
            File file = new File("C:\\Users\\firstfile.pdf");
            try {
                PDFParser parser = new PDFParser(new FileInputStream(file));
                // Assumption: parse() has to run before getDocument()
                parser.parse();
                COSDocument cosDoc = parser.getDocument();
                PDFTextStripper pdfStripper = new PDFTextStripper();
                PDDocument pdDoc = new PDDocument(cosDoc);
                String parsedText = pdfStripper.getText(pdDoc);
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }

Exception in thread "main" java.lang.NoClassDefFoundError: org/fontbox/afm/FontMetric
    at org.pdfbox.pdmodel.font.PDFont.getAFM(
    at org.pdfbox.pdmodel.font.PDSimpleFont.getFontHeight(
    at org.pdfbox.util.PDFStreamEngine.showString(
    at org.pdfbox.util.operator.ShowTextGlyph.process(
    at org.pdfbox.util.PDFStreamEngine.processOperator(
    at org.pdfbox.util.PDFStreamEngine.processSubStream(
    at org.pdfbox.util.PDFStreamEngine.processStream(
    at org.pdfbox.util.PDFTextStripper.processPage(
    at org.pdfbox.util.PDFTextStripper.processPages(
    at org.pdfbox.util.PDFTextStripper.writeText(
    at org.pdfbox.util.PDFTextStripper.getText(
    at ggg.main(


It seems that you are not using a build tool.

Unfortunately, this library has additional dependencies.

org.fontbox.afm.FontMetric is a class located in fontbox-0.1.0.jar.

You can go to Maven Central - PDF Box, download all the libraries listed under its dependencies, and add them to your project.

Alternatively, you can set up a Maven project and add this dependency to your pom.xml. To do this:

  1. Install Maven.
  2. Create a project from the command line (replace YOUR_GROUP_ID with your own group ID):

    mvn -B archetype:generate \
      -DarchetypeGroupId=org.apache.maven.archetypes \
      -DgroupId=YOUR_GROUP_ID \
      -DartifactId=my-app

  3. Add the PDFBox dependency to the <dependencies> section of the pom.xml file:

    <dependency>
      <groupId>pdfbox</groupId>
      <artifactId>pdfbox</artifactId>
      <version>0.7.3</version>
    </dependency>

  4. Open the generated project as a Maven project inside your IDE (in your case, Eclipse).

  5. Refresh the project in the IDE and let Eclipse download the library with all of its dependencies for you.
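Whichever route you take, it can help to check which JAR a class is really being loaded from once the project runs. The sketch below uses only the standard library; `JarLocator` and `locationOf` are made-up names for illustration, and the class name to inspect is passed as a command-line argument.

```java
import java.security.CodeSource;
import java.util.Optional;

class JarLocator {
    // Hypothetical helper: the location a class was loaded from, if the JVM records one.
    static Optional<String> locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes that ship with the JDK itself typically have no code source
        if (source == null || source.getLocation() == null) {
            return Optional.empty();
        }
        return Optional.of(source.getLocation().toString());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass e.g. org.fontbox.afm.FontMetric as the first argument
        String name = args.length > 0 ? args[0] : "java.lang.String";
        String where = locationOf(Class.forName(name)).orElse("(no code source recorded)");
        System.out.println(name + " -> " + where);
    }
}
```

Run with `org.fontbox.afm.FontMetric` as the argument: a `ClassNotFoundException` means the FontBox JAR is still not on the runtime classpath, while a printed path shows which JAR supplied the class.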


I have these imports (among others):

    import org.apache.pdfbox.*;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDPage;

I have this dependency in my pom.xml:


I see this line in my eclipse maven dependencies:

pdfbox-2.0.4.jar - C:\Users\Paul\.m2\repository\org\apache\pdfbox\pdfbox\2.0.4\pdfbox-2.0.4.jar

I check the build path in eclipse, and see pdfbox-2.0.4.jar in the Maven Dependencies part.

I run mvn clean compile in a command prompt (Windows).

I get the error "package org.apache.pdfbox does not exist"

I run mvn dependency:build-classpath -Dmdep.outputFile=cp.txt

The following lines are listed in the class path (at the front of the class path):


I look in C:\Users\Paul\.m2\repository\org\apache\pdfbox\pdfbox\2.0.4\ and I see pdfbox-2.0.4.jar

So what am I missing? Why is the pdfbox jar not being found?


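Remove this line:

    import org.apache.pdfbox.*;

That package does indeed not exist as far as the compiler is concerned: a wildcard import only covers classes directly inside the named package, never its subpackages. The other imports (with deeper levels) are OK.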
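The non-recursive behaviour of wildcard imports is easy to reproduce with the standard library alone. In this illustrative sketch (the class and method names are made up), `java.util.*` brings in `List`, `ArrayList`, and `Map`, but `ConcurrentHashMap` lives in the subpackage `java.util.concurrent` and needs its own import.

```java
import java.util.*;                              // List, ArrayList, Map, ... (this package only)
import java.util.concurrent.ConcurrentHashMap;   // subpackage: not covered by java.util.*

class WildcardImportDemo {
    // Builds a small map so that types from both packages are actually used
    static int countEntries() {
        List<String> names = new ArrayList<>();
        names.add("pdfbox");
        Map<String, Integer> lengths = new ConcurrentHashMap<>();
        for (String name : names) {
            lengths.put(name, name.length());
        }
        return lengths.size();
    }

    public static void main(String[] args) {
        System.out.println("entries: " + countEntries());
    }
}
```

Deleting the second import makes the compiler reject `ConcurrentHashMap`, even though `java.util.*` is still imported.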