Create PDF from HTML – Java
iReport is one of the good tools I have found for creating PDF documents dynamically. Recently, though, I got a requirement to create a PDF file that looks exactly like a dynamically generated HTML page. Rather than building a template in iReport, I was curious whether there is a Java library that can convert an HTML DOM structure directly into a PDF.
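Finally I found that it is possible with the iText and XML Worker jar files. The example below does the conversion through ITextRenderer from the org.xhtmlrenderer package (Flying Saucer), which uses iText (the com.lowagie classes) to write the PDF.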
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.xhtmlrenderer.pdf.ITextRenderer;
import com.lowagie.text.DocumentException;
public class FirstDoc {

    public static void main(String[] args) throws IOException, DocumentException {
        // Read the source HTML into a String, decoding it as UTF-8
        String filecontent = readFile("C:\\signeddoc.html", StandardCharsets.UTF_8);
        /*
         * This example reads a local HTML file whose content is always the
         * same. If instead the page is hosted, and its content changes with
         * the parameters you pass in the URL, you can send a request and read
         * the generated HTML like this (needs java.net.URL,
         * java.net.URLConnection, java.io.BufferedReader and
         * java.io.InputStreamReader):
         *
         * String findinfoabout = "google";
         * URL domain = new URL("http://en.wikipedia.org/wiki/" + findinfoabout);
         * URLConnection domainconnection = domain.openConnection();
         * BufferedReader in = new BufferedReader(new InputStreamReader(
         *         domainconnection.getInputStream(), StandardCharsets.UTF_8));
         * String row = null;
         * StringBuilder builder = new StringBuilder();
         * while ((row = in.readLine()) != null) {
         *     builder.append(row).append('\n'); // readLine() strips the line break
         * }
         * in.close();
         * System.out.println(builder.toString());
         * filecontent = builder.toString();
         *
         * The code above asks Wikipedia for an article and returns it as HTML.
         */
        String outputFile = "C:\\final.pdf";
        // try-with-resources closes the stream even if rendering throws
        try (OutputStream os = new FileOutputStream(outputFile)) {
            ITextRenderer renderer = new ITextRenderer();
            // The markup is parsed as XML, so it has to be well-formed
            renderer.setDocumentFromString(filecontent);
            renderer.layout();
            renderer.createPDF(os);
        }
        System.out.println("PDF Document Created");
    }

    // Reads the whole file and decodes its bytes with the given charset
    static String readFile(String path, Charset encoding) throws IOException {
        byte[] encoded = Files.readAllBytes(Paths.get(path));
        return encoding.decode(ByteBuffer.wrap(encoded)).toString();
    }
}
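The commented-out request inside the listing can also be pulled into a small helper of its own. The following is only a sketch using the standard library: the class name HtmlFetcher and its methods are made up for illustration, and it assumes the page is served as UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of iText or Flying Saucer.
final class HtmlFetcher {

    // Reads the stream line by line into one String and closes the stream.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder builder = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String row;
            while ((row = reader.readLine()) != null) {
                // readLine() drops the line terminator, so add one back
                builder.append(row).append('\n');
            }
        }
        return builder.toString();
    }

    // Opens the URL and returns the response body; UTF-8 is an assumption here.
    static String fetch(String url) throws IOException {
        return readAll(URI.create(url).toURL().openStream(), StandardCharsets.UTF_8);
    }
}
```

With this in place, filecontent could be filled with something like HtmlFetcher.fetch("http://en.wikipedia.org/wiki/google") instead of readFile(...). Keep in mind that an arbitrary web page is not guaranteed to be well-formed, so the renderer may still reject it.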
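What you have to remember is that your HTML content must be well-formed: the renderer parses it as XML, so something as small as an unclosed tag makes the conversion fail. Checking the page against the W3C validator first is a good way to catch this.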
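If you want to catch such problems before handing the markup to the renderer, one option is to run it through the JDK's own XML parser first. This is only a rough sketch: WellFormedCheck and isWellFormed are names invented here, and the check covers XML well-formedness only, not full W3C validity.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of iText or Flying Saucer.
final class WellFormedCheck {

    // Returns true if the markup parses as XML, false on any parse error.
    static boolean isWellFormed(String html) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Do not download a DTD named in a DOCTYPE; hand back an empty one
            builder.setEntityResolver((publicId, systemId) ->
                    new InputSource(new StringReader("")));
            // DefaultHandler rethrows fatal errors instead of printing them
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(html)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }
}
```

For example, isWellFormed("<p>a<br/>b</p>") passes, while the same markup with a plain <br> does not, because the br element is never closed.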
Here are the required jar files