

By Ka Wing Ho, Associate Security Consultant at PrivasecRED
In this blog, I explain the dangers of using server-side PDF generation technologies without properly sanitising user input.
Have you ever used a web application with an “Export to PDF” function? This is not to be confused with “Print to PDF”, which is simply something your web browser offers (i.e. Ctrl+P).
Chances are that you have used this functionality at some point: it could be exporting search results or, if you’re an administrative user, exporting logs and statistics about your users and traffic. Similar functionality has existed for quite a while in the form of “Export to CSV”, where users generate on-demand plaintext reports with variable data such as dates, columns and number of items.
Why use PDF then? Well… CSV isn’t exactly the most readable format.
Exporting to PDF allows users to distribute and archive data in a productive manner, trading import capability for readability.
As is the norm for our current technological age, if you’ve got a problem, you’ll probably find numerous software solutions for it! Here are a handful of PDF generation technologies that have cropped up in the past decade or so:
Software/Library | Type |
---|---|
EO.PDF | Library |
DinkToPDF | Library |
PDFKit | Library |
Node-html-pdf | Library |
Go-wkhtmltopdf | Library |
Flying Saucer | Library |
DomPDF | Library |
WeasyPrint | Library |
wkhtmltopdf | Headless Browser |
Headless Chrome | Headless Browser |
PhantomJS | Headless Browser |
MicroStrategy | Standalone Software |
PrinceXML | Standalone Software |
These technologies operate by filling in some form of HTML/XML template with whatever data you’ve queried from the backend database and then generating a nicely formatted PDF report for your end users or admins to look at! Great!
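To make the template-filling step concrete, here is a minimal sketch in Python. The template, the function names and the `name` field are hypothetical and do not come from any of the products listed above. It simply contrasts copying a user-supplied value into the HTML verbatim with escaping it first.

```python
import html

# Hypothetical report template; real products use their own template formats.
TEMPLATE = "<html><body><h1>Report for {name}</h1></body></html>"


def render_naive(name: str) -> str:
    """Insert the value verbatim: any markup in `name` becomes part of the page."""
    return TEMPLATE.format(name=name)


def render_escaped(name: str) -> str:
    """Escape HTML metacharacters so the value is rendered as plain text."""
    return TEMPLATE.format(name=html.escape(name, quote=True))


# A user-controlled value that happens to contain an HTML tag.
payload = "<iframe src=/etc/passwd>"
naive_page = render_naive(payload)      # the tag survives intact
escaped_page = render_escaped(payload)  # the tag is neutralised to &lt;iframe ...&gt;
```

Whatever HTML ends up in the page is what the PDF generator will interpret, so the naive variant hands the renderer a live tag while the escaped variant hands it inert text. Escaping on its own is unlikely to be a complete defence, since it only covers values placed in HTML text or attribute positions.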
However, if user input reaches the template unsanitised, an attacker can submit an HTML tag such as <iframe src=/etc/passwd> or an XML External Entity (XXE), and the backend PDF generation software will happily attach external resources to your generated PDF. In a twist of irony, CSV doesn’t seem like such a bad export format after all…

The impact of such functionality abuse depends on various factors such as:
Here is a list of possible impacts:
Unfortunately, this isn’t something that developers can (or will) easily fix, as there are legitimate use cases for some of these HTML/XML tags that cannot be ignored. The current approach involves offloading the responsibility onto the user to control what content can be embedded, rather than stopping the embedding of content altogether.
Let’s put it this way: Car manufacturers build cars with seatbelts in them. If you got injured because you didn’t wear your seatbelt whilst driving, do you blame the car manufacturer?
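In this analogy, wearing the seatbelt means switching on the restrictions that the software already offers. As a rough sketch, the Python snippet below builds a Prince command line with the --no-network and --no-local-files options. The executable name `prince`, the file names and the helper function are assumptions made for illustration, and the command is only constructed here, not executed.

```python
from typing import List


def build_prince_command(input_html: str, output_pdf: str) -> List[str]:
    """Return an argv list for Prince with network and local file access disabled."""
    return [
        "prince",            # assumed executable name on PATH
        "--no-network",      # disable network access
        "--no-local-files",  # disable access to local files
        input_html,
        "-o",
        output_pdf,
    ]


# Hypothetical file names, for illustration only.
command = build_prince_command("report.html", "report.pdf")

# To actually run it (requires Prince to be installed):
# import subprocess
# subprocess.run(command, check=True, timeout=30)
```

Passing the arguments as a list rather than a single shell string avoids introducing a shell injection problem alongside the one being fixed.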
Use the --no-network and --no-local-files options when executing Prince.
Use the --disable-javascript flag to prevent JavaScript execution.
Regardless of which PDF generation software you use, the following recommendations should always be followed:
Corben Leo – Discovery of vulnerability in PrinceXML products
https://www.corben.io/XSS-to-XXE-in-Prince/
Triskele Labs – Discovery of vulnerability in MicroStrategy products
https://triskelelabs.com/extracting-your-aws-access-keys-through-a-pdf-file/
AWS – Introduced IMDSv2 to mitigate SSRF attacks accessing AWS secrets
https://aws.amazon.com/blogs/security/defense-in-depth-open-firewalls-reverse-proxies-ssrf-vulnerabilities-ec2-instance-metadata-service/
PrinceXML Documentation: Security Best Practices
https://www.princexml.com/doc/server-integration/#security
MicroStrategy: Security Best Practices
https://community.microstrategy.com/s/article/Securing-PDF-and-Excel-Export-with-Whitelists
The Daily Swig – Benchmarks against several PDF generation libraries
https://portswigger.net/daily-swig/html-to-pdf-converters-open-to-denial-of-service-ssrf-directory-traversal-attacks