Imal Perera  

Create PDF from HTML – Java


iReport is one of the better tools I have found for creating PDF documents dynamically. Recently, though, I got a requirement to create a PDF file that looks exactly like a dynamically generated HTML page. Rather than building a template in iReport, I was curious whether any Java library could convert an HTML DOM structure directly into a PDF document.

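In the end I found that it is possible with iText. Note that the listing below goes through Flying Saucer's ITextRenderer (package org.xhtmlrenderer), which sits on top of the com.lowagie iText classes, rather than through iText's own XMLWorker.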

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.xhtmlrenderer.pdf.ITextRenderer;

import com.lowagie.text.DocumentException;

public class FirstDoc {

	public static void main(String[] args) throws IOException, DocumentException {

		String filecontent = readFile("C:\\signeddoc.html", StandardCharsets.UTF_8);
		/*
		 * This example reads a local HTML file whose content never changes.
		 * If the page is hosted and its content depends on the parameters
		 * passed in the URL, you can request it and use the generated HTML
		 * instead, like this (needs java.net.URL, java.net.URLConnection,
		 * java.io.BufferedReader and java.io.InputStreamReader):
		 *
		 * String findinfoabout = "google";
		 * URL domain = new URL("http://en.wikipedia.org/wiki/"+findinfoabout);
		 * URLConnection domainconnection = domain.openConnection();
		 * BufferedReader in = new BufferedReader(new InputStreamReader(domainconnection.getInputStream()));
		 * String row = null;
		 * StringBuilder builder = new StringBuilder();
		 * while ((row = in.readLine()) != null){
		 *         builder.append(row).append('\n'); // keep the line breaks
		 * }
		 * in.close();
		 * System.out.println(builder.toString());
		 * filecontent = builder.toString();
		 *
		 * The above asks Wikipedia for the article on the search term and
		 * returns it as HTML.
		 */

		String outputFile = "C:\\final.pdf";

		// try-with-resources closes the stream even if rendering fails
		try (OutputStream os = new FileOutputStream(outputFile)) {
			ITextRenderer renderer = new ITextRenderer();
			renderer.setDocumentFromString(filecontent);
			renderer.layout();
			renderer.createPDF(os);
		}

		System.out.println("PDF Document Created");
	}

	static String readFile(String path, Charset encoding) throws IOException {
		byte[] encoded = Files.readAllBytes(Paths.get(path));
		return encoding.decode(ByteBuffer.wrap(encoded)).toString();
	}
}
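
If the HTML comes from a hosted page rather than a local file, the fetch described in the comment inside main can be pulled out into a small helper. The following is only a sketch under a few assumptions: the class name HtmlFetcher and the methods readAll and fetch are made up for illustration, and the page is assumed to be served as UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original listing.
class HtmlFetcher {

	// Reads the whole stream as UTF-8 text, ending every line with '\n'.
	public static String readAll(InputStream in) throws IOException {
		StringBuilder builder = new StringBuilder();
		try (BufferedReader reader = new BufferedReader(
				new InputStreamReader(in, StandardCharsets.UTF_8))) {
			String row;
			while ((row = reader.readLine()) != null) {
				builder.append(row).append('\n');
			}
		}
		return builder.toString();
	}

	// Downloads the page at the given address (assumed to be UTF-8).
	public static String fetch(String address) throws IOException {
		try (InputStream in = new URL(address).openStream()) {
			return readAll(in);
		}
	}
}
```

The string returned by HtmlFetcher.fetch(...) can then be passed to renderer.setDocumentFromString(...) in place of the file content.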

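What you have to remember is that your HTML must be well-formed XHTML: the renderer parses it as XML, so every tag has to be closed and properly nested. Running the page through the W3C validator is an easy way to catch problems.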
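
A quick way to catch markup problems before rendering is to run the string through a standard XML parser first. The sketch below is only an illustration: the class name WellFormedCheck and the method isWellFormed are hypothetical, it uses the JDK's built-in parser, and it checks well-formedness only, not full W3C validity. If the page declares a DOCTYPE with an external DTD, the parser may try to download that DTD.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the original listing.
class WellFormedCheck {

	// Returns true if the markup parses as XML, false if the parser rejects it.
	public static boolean isWellFormed(String html) {
		try {
			DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
			// DefaultHandler throws on fatal errors without printing them to stderr
			builder.setErrorHandler(new DefaultHandler());
			builder.parse(new InputSource(new StringReader(html)));
			return true;
		} catch (SAXException e) {
			return false; // e.g. an unclosed or badly nested tag
		} catch (ParserConfigurationException | IOException e) {
			throw new IllegalStateException(e);
		}
	}
}
```

Calling WellFormedCheck.isWellFormed(filecontent) before renderer.setDocumentFromString(filecontent) lets you report bad markup with your own message instead of a parser stack trace.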

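The required JAR files are Flying Saucer (which provides the org.xhtmlrenderer classes) and iText (which provides the com.lowagie classes).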
