Hot questions for Using PDFBox in annotations

Question:

I would like to remove all the existing annotations in a PDF file. I could not find any direct method for this in the PDFBox annotations API. Please provide any pointers to resolve the issue.

Thanks in advance for your help.


Answer:

Assuming that you have a PDPage object, just do this:

pdPage.setAnnotations(null);

Here's the full code for the 1.8.* versions of PDFBox:

PDDocument document = PDDocument.loadNonSeq(new File(pdfFilename), null);
List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
for (PDPage pdPage : pdPages)
{ 
    pdPage.setAnnotations(null);
}
document.save(new File(...));
document.close();

Question:

I want to add a polygon to the PDF at given coordinates. I referred to this link for adding circle and rectangle annotations, but it does not contain anything for polygons. Does anyone know how to do it? Or does anyone know where I can find the full documentation for PDFBox annotations?

Here is what I've done so far, but I couldn't proceed further.

import java.io.IOException;
import java.io.File;
import java.io.FileReader;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import org.json.simple.parser.ParseException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDRectangle; 
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.graphics.color.PDColor;
import org.apache.pdfbox.pdmodel.graphics.color.PDDeviceRGB;
import org.apache.pdfbox.pdmodel.interactive.action.PDActionGoTo;
import org.apache.pdfbox.pdmodel.interactive.action.PDActionURI;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationLine;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationText; 
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationLink;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationMarkup;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationSquareCircle;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationTextMarkup;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDBorderStyleDictionary;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDPageDestination;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDPageFitWidthDestination;

 public class Polygon{

public static void main(String[] args) throws IOException {
    // TODO Auto-generated method stub

    // Loading the PDF File
    File file = new File("abc.pdf");
    PDDocument document = PDDocument.load(file);
    System.out.println("PDF Loaded.");

    PDPage page = document.getPage(0);
    List<PDAnnotation> polygon = page.getAnnotations();
    // Color of polygon
    PDColor color = new PDColor(new float[] {0, 0, 1}, PDDeviceRGB.INSTANCE);
    // Define border thickness
    PDBorderStyleDictionary thickness = new PDBorderStyleDictionary();
    thickness.setWidth((float)2);

    float[] vertices = {418, 110, 523, 110, 522, 132, 419, 133};

    PDAnnotationSquareCircle lines = new PDAnnotationSquareCircle(PDAnnotationSquareCircle.SUB_TYPE_POLYGON);
    lines.setColor(color);
    lines.setBorderStyle(thickness);

    /*****************
     * 
     *  ????
     *  *************************************/

    // Save annotations
    document.save(file);

    // Close document
    document.close();
}
}

As far as I can see, there isn't any method for adding vertices to a polygon in the PDFBox annotation classes. So is there any way to draw a polygon here?

Thanks.


Answer:

Here's some code that will soon be added to the AddAnnotations.java example from the source code download:

static final float INCH = 72;
float pw = page1.getMediaBox().getUpperRightX();
float ph = page1.getMediaBox().getUpperRightY();

PDAnnotationMarkup polygon = new PDAnnotationMarkup();
polygon.getCOSObject().setName(COSName.SUBTYPE, PDAnnotationMarkup.SUB_TYPE_POLYGON);
position = new PDRectangle();
position.setLowerLeftX(pw - INCH);
position.setLowerLeftY(ph - INCH);
position.setUpperRightX(pw - 2 * INCH);
position.setUpperRightY(ph - 2 * INCH);
polygon.setRectangle(position);
polygon.setColor(blue); // border color
polygon.getCOSObject().setItem(COSName.IC, red.toCOSArray()); // interior color
COSArray triangleVertices = new COSArray();
triangleVertices.add(new COSFloat(pw - INCH));
triangleVertices.add(new COSFloat(ph - 2 * INCH));
triangleVertices.add(new COSFloat(pw - INCH * 1.5f));
triangleVertices.add(new COSFloat(ph - INCH));
triangleVertices.add(new COSFloat(pw - 2 * INCH));
triangleVertices.add(new COSFloat(ph - 2 * INCH));
polygon.getCOSObject().setItem(COSName.VERTICES, triangleVertices);
polygon.setBorderStyle(borderThick);

annotations.add(polygon);

To adapt this to your own code, you need to adjust the rectangle and pass your vertices:

position.setLowerLeftX(418);
position.setLowerLeftY(110);
position.setUpperRightX(523);
position.setUpperRightY(133);
polygon.setRectangle(position);
float[] vertices = {418, 110, 523, 110, 522, 132, 419, 133};
COSArray verticesArray = new COSArray();
for (float v : vertices)
    verticesArray.add(new COSFloat(v));
polygon.getCOSObject().setItem(COSName.VERTICES, verticesArray);

This is for 2.0 only. In 3.0 there will be a PDAnnotationPolygon type with appropriate methods. That version will also support the construction of appearance streams, so the annotation will also show up in viewers other than Adobe Reader. Most viewers, e.g. PDF.js and PDFBox, don't build missing appearances, so without an appearance stream you'll see nothing there.

If you need the appearance for 2.0 you can try with the code in the file ShowAnnotation-6.java in https://issues.apache.org/jira/browse/PDFBOX-3353 .

To test with the 3.0 version, get the jar here: https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/3.0.0-SNAPSHOT/

To build the appearance, call polygon.constructAppearances();
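
A side note on the rectangle in the adapted snippet: its corners are just the smallest and largest x and y values among the vertices, so it can be derived instead of typed in by hand. Here is a minimal sketch in plain Java, with no PDFBox calls; the class name VertexBounds and the method boundsOf are made up for this illustration:

```java
import java.util.Arrays;

final class VertexBounds {

    /** Returns {minX, minY, maxX, maxY} for a flat x,y,x,y,... vertex array. */
    public static float[] boundsOf(float[] vertices) {
        if (vertices.length < 2 || vertices.length % 2 != 0) {
            throw new IllegalArgumentException("expected a non-empty list of x,y pairs");
        }
        float minX = vertices[0], maxX = vertices[0];
        float minY = vertices[1], maxY = vertices[1];
        // Walk the remaining pairs: even index is x, the following odd index is y.
        for (int i = 2; i < vertices.length; i += 2) {
            minX = Math.min(minX, vertices[i]);
            maxX = Math.max(maxX, vertices[i]);
            minY = Math.min(minY, vertices[i + 1]);
            maxY = Math.max(maxY, vertices[i + 1]);
        }
        return new float[] {minX, minY, maxX, maxY};
    }

    public static void main(String[] args) {
        // The vertices from the question.
        float[] vertices = {418, 110, 523, 110, 522, 132, 419, 133};
        System.out.println(Arrays.toString(boundsOf(vertices)));
        // prints [418.0, 110.0, 523.0, 133.0]
    }
}
```

The four values would then go to setLowerLeftX, setLowerLeftY, setUpperRightX and setUpperRightY on the PDRectangle, in that order.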

Question:

I want to add a freehand ink annotation to a PDF page. The annotation is added, but it is not displayed on the page. I don't understand what the issue is. Here is the code I have so far.

import java.io.IOException;
import java.io.File;
import java.util.Arrays;
import java.util.List;
import org.apache.pdfbox.cos.COSArray;
import org.apache.pdfbox.cos.COSFloat;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.color.PDColor;
import org.apache.pdfbox.pdmodel.graphics.color.PDDeviceRGB;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationMarkup;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDBorderStyleDictionary;

public class Freehand {

public static void main(String[] args) throws IOException {
    // TODO Auto-generated method stub

    File file = new File("C:/Users/sinssb/Documents/07904660.pdf");
    PDDocument document = PDDocument.load(file);
    System.out.println("PDF loaded.");

    try {
        PDPage page = document.getPage(0);
        List<PDAnnotation> annotations = page.getAnnotations();
        PDColor color = new PDColor(new float[] {0, 0, 1}, PDDeviceRGB.INSTANCE);
        PDBorderStyleDictionary thickness = new PDBorderStyleDictionary();
        thickness.setWidth((float)2);

        PDAnnotationMarkup freehand = new PDAnnotationMarkup();
        freehand.getCOSObject().setName(COSName.SUBTYPE, PDAnnotationMarkup.SUB_TYPE_INK);
        freehand.setColor(color);
        freehand.setBorderStyle(thickness);

        float[] coordinates = new float[] {86,140,85,140,83,140,81,139,79,137,76,135,73,133,71,131,69,129,68,127,67,125,67,123,67,122,67,120,67,119,67,116}; 
        PDRectangle points = new PDRectangle();

        float[] allX = new float[coordinates.length / 2];
        float[] allY = new float[coordinates.length / 2];

        int k = 0, l = 0;
        for (int j = 0; j < coordinates.length; j++) {
            if (j % 2 == 0) {
                allX[k] = coordinates[j];
                k++;
            }
            else {
                allY[l] = coordinates[j];
                l++;
            }               
        }

        Arrays.sort(allX);
        Arrays.sort(allY);

        float smallestX = allX[0];
        float smallestY = allY[0];
        float largestX = allX[allX.length - 1];
        float largestY = allY[allY.length - 1];

        points.setLowerLeftX(smallestX);
        points.setLowerLeftY(smallestY);
        points.setUpperRightX(largestX);
        points.setUpperRightY(largestY);
        freehand.setRectangle(points);
        System.out.println(points);
        freehand.setContents("Hello");

        COSArray verticesArray = new COSArray();

        for (int i = 0; i < coordinates.length; i++) {
            verticesArray.add(new COSFloat(coordinates[i]));
        }

        freehand.getCOSObject().setItem(COSName.INKLIST, verticesArray);
        annotations.add(freehand);
        System.out.println("Freehand is added.");
    } catch (Exception e) {
        // TODO: handle exception
        e.printStackTrace();
    }

    // Save the file
    document.save(file);

    // Close the document
    document.close();
}

}

This code adds the annotation: I can see the annotation and its comment in the comments pane of Acrobat Reader, but I cannot see the drawing on the page.

Thanks in advance.


Answer:

The inklist is an array of arrays (because one annotation can have several lines), so change your code like this:

COSArray verticesArray = new COSArray();

for (int i = 0; i < coordinates.length; i++) {
    verticesArray.add(new COSFloat(coordinates[i]));
}

// new / changed
COSArray verticesArrayArray = new COSArray();
verticesArrayArray.add(verticesArray);
freehand.getCOSObject().setItem(COSName.INKLIST, verticesArrayArray);
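
To illustrate the shape, here is a sketch that models the nesting with plain Java lists rather than COSArray objects. The class InkListShape, the method toInkList and the split into two strokes are hypothetical; the numbers themselves are taken from the coordinates in the question:

```java
import java.util.ArrayList;
import java.util.List;

final class InkListShape {

    /** Wraps each stroke (flat x,y,x,y,...) in its own inner list: one outer entry per stroke. */
    public static List<List<Float>> toInkList(float[]... strokes) {
        List<List<Float>> inkList = new ArrayList<>();
        for (float[] stroke : strokes) {
            if (stroke.length % 2 != 0) {
                throw new IllegalArgumentException("a stroke must consist of x,y pairs");
            }
            List<Float> path = new ArrayList<>();
            for (float v : stroke) {
                path.add(v);
            }
            inkList.add(path);
        }
        return inkList;
    }

    public static void main(String[] args) {
        // Two separate pen strokes; the split is made up for illustration.
        float[] first = {86, 140, 85, 140, 83, 140};
        float[] second = {67, 120, 67, 119, 67, 116};
        List<List<Float>> inkList = toInkList(first, second);
        System.out.println(inkList.size() + " strokes: " + inkList);
    }
}
```

With PDFBox, each inner list corresponds to one COSArray of COSFloat values, and the outer list to the COSArray stored under COSName.INKLIST. A single stroke, as in the question, still needs the outer array around it.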

Question:

I need to merge comments taken from many versions of the same pdf file but with different comments, into one PDF file containing all comments.

I take all the comments from the pages and create an arrayList of them, then I simply set this array of comments on the new pdf file and it works pretty well.

The problem is that I also need to create an Excel file listing all the comments found, together with their "status" (accepted, cancelled, rejected, etc.).

The status seems to be managed as a separate annotation/comment from PDFBox and I can't find any relation between a comment and its status.

Example:

I have a PDAnnotation object with content "COMMENT 1".

And I have another PDAnnotation object with content "Accepted by user XX" (the status of COMMENT 1).

On Acrobat Reader I see the comment "COMMENT 1" with the status set on "Accepted", so there must be a relation between the two objects, but I can't find it.

Any ideas?


Answer:

Using the PDFDebugger is a good suggestion; it should give you an overview of how objects (including PDAnnotations) are linked to each other.

Beyond that, check whether the COSDictionary of your child PDAnnotation has an IRT ("in reply to") key. Its value should be the parent annotation's dictionary.

So if you do something like this:

COSDictionary parentDict = (COSDictionary) childDict.getDictionaryObject("IRT");

you should get the parent PDAnnotation's dictionary, from which you can take all the data you need.

Please note that the cast is necessary: getDictionaryObject returns a COSBase, but the object returned for the IRT key is actually a COSDictionary.
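
Once the IRT link is available, pairing each comment with its status replies for the Excel export is a simple grouping step. Here is a minimal sketch, assuming each annotation has already been reduced to a small value object; Note, its id field and repliesByParentId are hypothetical names, not PDFBox types, and the id stands for whatever you use to identify an annotation in your own code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ReplyLinks {

    /** Hypothetical stand-in for an annotation: its id, its text, and the id its IRT entry points to. */
    public static final class Note {
        final String id;
        final String contents;
        final String inReplyTo; // null when the annotation has no IRT entry

        public Note(String id, String contents, String inReplyTo) {
            this.id = id;
            this.contents = contents;
            this.inReplyTo = inReplyTo;
        }
    }

    /** Maps each parent id to the contents of the notes that reply to it, in document order. */
    public static Map<String, List<String>> repliesByParentId(List<Note> notes) {
        Map<String, List<String>> replies = new LinkedHashMap<>();
        for (Note note : notes) {
            if (note.inReplyTo != null) {
                replies.computeIfAbsent(note.inReplyTo, k -> new ArrayList<>()).add(note.contents);
            }
        }
        return replies;
    }

    public static void main(String[] args) {
        List<Note> notes = Arrays.asList(
                new Note("c1", "COMMENT 1", null),
                new Note("s1", "Accepted by user XX", "c1"));
        System.out.println(repliesByParentId(notes));
        // prints {c1=[Accepted by user XX]}
    }
}
```

Each spreadsheet row can then be built from a top-level comment plus the replies listed under its id.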

Question:

I ran into a very tough issue. We have forms that were supposed to be filled out, but some people used free-text annotation comments in Foxit instead of filling in the form fields, so the annotations never flatten. When our rendering software generates the final document, the annotations are not included.

The solution I tried is to go through the document, get each annotation's text content, write it onto the page so that it appears in the final document, and then remove the annotation itself. The problem is that I don't know the font, line spacing, etc. that the annotation uses, so I cannot find out how to get them from PDFBox and recreate the text exactly as the annotation looks on the unflattened form. Basically, I want to flatten free-text annotations created in Foxit (the typewriter comment feature).

Here is the code. It is working, but I am struggling to figure out how to get the annotations written to my final PDF document. Flattening the AcroForm does not help because these are not AcroForm fields! The live code filters out anything that is not a FreeText annotation, but the code below should show my issue.

public static void main(String[] args)
{
    String startDoc = "C:/test2/test.pdf";
    String finalFlat = "C:/test2/test_FLAT.pdf";

    try {
        // for testing
        try {
            //BasicConfigurator.configure();
            File myFile = new File(startDoc);
            PDDocument pdDoc = PDDocument.load( myFile );
            PDDocumentCatalog pdCatalog = pdDoc.getDocumentCatalog();
            PDAcroForm pdAcroForm = pdCatalog.getAcroForm();

            // clear the NeedAppearances flag
            pdAcroForm.setNeedAppearances(false);

            // correct the missing page link for the annotations
            for (PDPage page : pdDoc.getPages()) {

                for (PDAnnotation annot : page.getAnnotations()) {
                    System.out.println(annot.getContents());
                    System.out.println(annot.isPrinted());
                    System.out.println(annot.isLocked());

                    System.out.println(annot.getAppearance().toString());
                    PDPageContentStream contentStream = new PDPageContentStream(pdDoc, page, PDPageContentStream.AppendMode.APPEND,true,true);
                    int fontHeight = 14; 
                    contentStream.setFont(PDType1Font.TIMES_ROMAN, fontHeight);

                    float height = annot.getRectangle().getLowerLeftY();

                    String s  = annot.getContents().replaceAll("\t", "    ");

                    String ss[] = s.split("\\r");
                    for(String sss : ss)
                    {
                        contentStream.beginText();  
                        contentStream.newLineAtOffset(annot.getRectangle().getLowerLeftX(),height );
                      contentStream.showText(sss);
                      height = height + fontHeight * 2 ;

                      contentStream.endText();
                    }
                      contentStream.close();
                    page.getAnnotations().remove(annot);                    
                }
            }               
            pdAcroForm.flatten();
            pdDoc.save(finalFlat);
            pdDoc.close();
        }
        catch (Exception e) {
            e.printStackTrace();
        }   

    }
    catch (Exception e) {
        System.err.println("Exception: " + e.getLocalizedMessage());
    }
}

Answer:

This was not a fun one. After a million different tests, I STILL do not understand all the nuances, but this is the version that appears to flatten all PDF files and annotations, provided they are visible in the PDF. I tested files from about half a dozen PDF creators; if an annotation is visible on a page, this hopefully flattens it. I suspect there is a better way, by pulling the matrix and transforming it and whatnot, but this is the only way I got it to work everywhere.

public static void flattenv3(String startDoc, String endDoc) {

  org.apache.log4j.Logger.getRootLogger().setLevel(org.apache.log4j.Level.INFO);
  String finalFlat = endDoc;


  try {

   try {
    //BasicConfigurator.configure();
    File myFile = new File(startDoc);
    PDDocument pdDoc = PDDocument.load(myFile);
    PDDocumentCatalog pdCatalog = pdDoc.getDocumentCatalog();
    PDAcroForm pdAcroForm = pdCatalog.getAcroForm();

    // clear the NeedAppearances flag, then flatten the real form fields
    if (pdAcroForm != null) {
     pdAcroForm.setNeedAppearances(false);
     pdAcroForm.flatten();
    }

    boolean isContentStreamWrapped;
    int ii = 0;

    for (PDPage page: pdDoc.getPages()) {
     PDPageContentStream contentStream;
     isContentStreamWrapped = false;
     List < PDAnnotation > annotations = new ArrayList < > ();

     for (PDAnnotation annotation: page.getAnnotations()) {



      if (!annotation.isInvisible() && !annotation.isHidden() && annotation.getNormalAppearanceStream() != null) {

       if (!isContentStreamWrapped) {
        contentStream = new PDPageContentStream(pdDoc, page, AppendMode.APPEND, true, true);
        isContentStreamWrapped = true;
       } else {
        contentStream = new PDPageContentStream(pdDoc, page, AppendMode.APPEND, true);
       }

       PDAppearanceStream appearanceStream = annotation.getNormalAppearanceStream();

       PDFormXObject fieldObject = new PDFormXObject(appearanceStream.getCOSObject());

       contentStream.saveGraphicsState();





       Matrix transformationMatrix = new Matrix();
       boolean transformed = false;

        float lowerLeftX = annotation.getNormalAppearanceStream().getBBox().getLowerLeftX();
        float lowerLeftY = annotation.getNormalAppearanceStream().getBBox().getLowerLeftY();
        PDRectangle bbox = appearanceStream.getBBox();
        PDRectangle fieldRect = annotation.getRectangle();

        float xScale = fieldRect.getWidth() - bbox.getWidth();

        transformed = true;

        lowerLeftX = fieldRect.getLowerLeftX();
        lowerLeftY = fieldRect.getLowerLeftY();
        if (bbox.getLowerLeftX() <= 0 && bbox.getLowerLeftY() < 0 && Math.abs(xScale) < 1) //BASICALLY EQUAL TO 0 WITH ROUNDING
        {


         lowerLeftY = fieldRect.getLowerLeftY() - bbox.getLowerLeftY();
         if (bbox.getLowerLeftX() < 0 && bbox.getLowerLeftY() < 0) //THis is for the o
         {

          lowerLeftX = lowerLeftX - bbox.getLowerLeftX(); 

         }

        } else if (bbox.getLowerLeftX() == 0 && bbox.getLowerLeftY() < 0 && xScale >= 0) {

         lowerLeftX = fieldRect.getUpperRightX();

        } else if (bbox.getLowerLeftY() <= 0 && xScale >= 0) {

         lowerLeftY = fieldRect.getLowerLeftY() - bbox.getLowerLeftY() - xScale;

        } else if (bbox.getUpperRightY() <= 0) {

         if (annotation.getNormalAppearanceStream().getMatrix().getShearY() < 0) {
          lowerLeftY = fieldRect.getUpperRightY();
          lowerLeftX = fieldRect.getUpperRightX();

         }

        } else {

        }



        transformationMatrix.translate(lowerLeftX,
         lowerLeftY);
        contentStream.transform(transformationMatrix);


       contentStream.drawForm(fieldObject);
       contentStream.restoreGraphicsState();
       contentStream.close();
      }
     }
     page.setAnnotations(annotations);
    }


    pdDoc.save(finalFlat);
    pdDoc.close();
    File file = new File(finalFlat);

    // Desktop.getDesktop().browse(file.toURI());


   } catch (Exception e) {
    e.printStackTrace();
   }

  } catch (Exception e) {
   System.err.println("Exception: " + e.getLocalizedMessage());
  }
 }

}