Create PDF from HTML – Java
iReport is one of the better tools I have found for creating PDF documents dynamically. Recently, however, I got a requirement to create a PDF file that looks exactly like a dynamically generated HTML page. Rather than building a template in iReport, I was curious whether there is a Java library that can convert an HTML DOM structure into a PDF.
Finally I found that it is possible using iText together with Flying Saucer (the org.xhtmlrenderer package imported in the code below):
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.xhtmlrenderer.pdf.ITextRenderer;

import com.lowagie.text.DocumentException;

public class FirstDoc {

    public static void main(String[] args) throws IOException, DocumentException {

        String filecontent = readFile("C:\\signeddoc.html", StandardCharsets.UTF_8);

        /*
         * In this example I'm reading a local HTML file which is always the
         * same, but let's say the page is hosted somewhere and its content
         * changes according to the parameters you pass in the URL.
         *
         * In that situation you can send a request and read the generated
         * HTML like this:
         *
         * String findinfoabout = "google";
         * URL domain = new URL("http://en.wikipedia.org/wiki/" + findinfoabout);
         * URLConnection domainconnection = domain.openConnection();
         * BufferedReader in = new BufferedReader(
         *         new InputStreamReader(domainconnection.getInputStream()));
         * String row = null;
         * StringBuilder builder = new StringBuilder();
         * while ((row = in.readLine()) != null) {
         *     builder.append(row);
         * }
         * in.close();
         * System.out.println(builder.toString());
         * filecontent = builder.toString();
         *
         * The above uses Wikipedia to look something up and returns the
         * article in HTML format.
         */

        String outputFile = "C:\\final.pdf";

        // try-with-resources closes the stream even if rendering fails
        try (OutputStream os = new FileOutputStream(outputFile)) {
            ITextRenderer renderer = new ITextRenderer();
            renderer.setDocumentFromString(filecontent); // parses the markup as XML
            renderer.layout();                           // lays the document out into pages
            renderer.createPDF(os);                      // writes the PDF to the stream
        }

        System.out.println("PDF Document Created");
    }

    // Reads the whole file into a String using the given character encoding
    static String readFile(String path, Charset encoding) throws IOException {
        byte[] encoded = Files.readAllBytes(Paths.get(path));
        return encoding.decode(ByteBuffer.wrap(encoded)).toString();
    }
}
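What you have to remember is that your HTML content must be well-formed XHTML (every tag closed, attribute values quoted), because the renderer parses the input as XML. Running the page through the W3C validator is a good way to check this.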
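If you want to catch markup problems before handing the content to the renderer, a quick check along these lines might help. This is only a sketch: the class name XhtmlCheck and the method isWellFormed are made up for illustration and are not part of iText or Flying Saucer. It uses the JDK's own XML parser and only checks well-formedness; it is not a full W3C validation. It also assumes the markup has no DOCTYPE pointing at an external DTD, since the parser may otherwise try to download it.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper for illustration; not part of iText or Flying Saucer.
class XhtmlCheck {

    // Returns true if the markup can be parsed as well-formed XML.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();

            // Rethrow parse errors instead of letting the default handler
            // print them to System.err.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignore warnings */ }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });

            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // All tags closed: well-formed
        System.out.println(isWellFormed("<html><body><p>Hello</p></body></html>"));
        // <br> is never closed, which plain HTML tolerates but XML does not
        System.out.println(isWellFormed("<html><body><p>Hello<br></p></body></html>"));
    }
}
```

In the FirstDoc example you could call XhtmlCheck.isWellFormed(filecontent) before renderer.setDocumentFromString(filecontent) and report a readable error when it returns false.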
Here are the required Jar files