
Overview

XML External Entity (XXE) injection is a vulnerability that allows an attacker to interfere with an application’s processing of XML data. If a poorly configured XML parser processes user-supplied XML that declares and references an external entity, the attacker can use it to read sensitive files from the server, make the server send requests to internal systems (server-side request forgery, SSRF), or cause a denial of service (DoS).

Business Impact

XXE can be a critical vulnerability, leading to the disclosure of server-side files, including source code, configuration files with credentials, and sensitive OS files. It effectively gives an attacker read access to any file the application’s account can read, which can be a stepping stone to full system compromise.

Reference Details

CWE ID: CWE-611
OWASP Top 10 (2021): A05:2021 - Security Misconfiguration
Severity: High

Framework-Specific Analysis and Remediation

Most modern XML parsers have been made secure by default against XXE. Vulnerabilities typically exist in older applications or when a developer explicitly enables risky features like DTD (Document Type Definition) processing to support legacy formats. The universal fix is to ensure all XML parsers are configured to disable DTDs and disallow the resolution of external entities.
  • Python
  • Java
  • .NET(C#)
  • PHP
  • Node.js
  • Ruby

Framework Context

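Python’s standard library xml.etree.ElementTree does not expand external entities, so it is not vulnerable to classic XXE file disclosure. However, the more powerful and commonly used third-party library lxml resolved external entities by default in releases before 5.0 (resolve_entities defaulted to True; 5.0 changed the default to 'internal'). Django applications that parse XML with lxml should configure the parser explicitly rather than rely on the installed version's default.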
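To make the standard-library behavior concrete, the following minimal sketch feeds the classic XXE payload to xml.etree.ElementTree. The helper name parse_with_stdlib and the "rejected" label are illustrative assumptions, not part of any framework. It only demonstrates the external-entity case; entity-expansion (DoS) behavior depends on the underlying Expat version and is a separate concern.

```python
import xml.etree.ElementTree as ET

# Classic XXE payload: the internal DTD subset declares an external entity
# that points at a local file, and the document body references it.
XXE_PAYLOAD = (
    b'<?xml version="1.0"?>'
    b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    b"<foo>&xxe;</foo>"
)


def parse_with_stdlib(data: bytes) -> str:
    """Return the root element's text, or "rejected" if parsing fails.

    Hypothetical helper used only for this illustration.
    """
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        # ElementTree does not fetch the external entity; the reference
        # is reported as a parse error instead of being expanded.
        return "rejected"
    return root.text or ""


if __name__ == "__main__":
    print(parse_with_stdlib(XXE_PAYLOAD))
    print(parse_with_stdlib(b"<foo>hello</foo>"))
```

The payload is refused because the parser treats the unresolved entity reference as an error, while an ordinary document parses normally.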

Vulnerable Scenario 1: Processing a SOAP Request

A Django API view uses lxml to parse an incoming SOAP request from a legacy system.
# api/views.py
from lxml import etree
from rest_framework.views import APIView
from rest_framework.response import Response

class SoapProcessorView(APIView):
    # The view reads the raw request body itself, so DRF's parser_classes
    # play no part here; the XML handling below is what matters.

    def post(self, request):
        # DANGEROUS: No parser is passed, so lxml falls back to its default
        # parser, which resolves external entities in releases before 5.0.
        # An attacker can submit a payload like:
        # <?xml version="1.0"?><!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>
        try:
            root = etree.fromstring(request.body)
            # ... process the SOAP message ...
            return Response({"status": "success"})
        except etree.XMLSyntaxError:
            return Response({"error": "Invalid XML"}, status=400)

Vulnerable Scenario 2: A Document Upload Feature

A feature allows users to upload an XML-based document (e.g., for data import), which is then parsed on the server.
# documents/forms.py
from django import forms

class DocumentForm(forms.Form):
    xml_file = forms.FileField()

# documents/views.py
from django.http import HttpResponse
from lxml import etree

def upload_document(request):
    # ... form validation ...
    xml_content = request.FILES['xml_file'].read()
    # DANGEROUS: No parser is passed, so lxml's default parser is used
    # (it resolves external entities in releases before 5.0).
    root = etree.fromstring(xml_content)
    # ... logic to import data from the XML tree ...
    return HttpResponse("Document processed.")

Mitigation and Best Practices

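When using lxml, always pass an explicitly configured parser with entity resolution disabled instead of relying on the library default, which differs between lxml versions. Setting the options explicitly keeps parsing safe regardless of the installed release.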

Secure Code Example

# api/views.py (Secure Version)
from lxml import etree
from rest_framework.views import APIView
from rest_framework.response import Response

class SoapProcessorView(APIView):
    # ...
    def post(self, request):
        # SAFE: Create a parser that never expands entities.
        # resolve_entities=False leaves entity references unexpanded,
        # no_network=True blocks network fetches of DTDs and entities, and
        # dtd_validation=False skips validation against a DTD.
        # Well-formed XML without entities is parsed exactly as before.
        safe_parser = etree.XMLParser(resolve_entities=False, no_network=True, dtd_validation=False)

        try:
            root = etree.fromstring(request.body, parser=safe_parser)
            # ... process the SOAP message ...
            return Response({"status": "success"})
        except etree.XMLSyntaxError:
            return Response({"error": "Invalid XML"}, status=400)
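Where the application has no legitimate need for DTDs at all, a stricter option is to refuse any document that contains a DOCTYPE declaration before it reaches the main parser. The sketch below shows one way this might be done with the standard library's Expat bindings; the names DTDForbidden and parse_without_dtd are hypothetical, and the two-pass design favors clarity over efficiency. Maintained packages such as defusedxml offer a comparable option, which is usually preferable to hand-rolled checks.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration (hypothetical name)."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Expat calls this handler as soon as it encounters <!DOCTYPE ...>,
    # which happens before any entity in the document body is processed.
    raise DTDForbidden(f"DOCTYPE declarations are not allowed: {name!r}")


def parse_without_dtd(data: bytes) -> ET.Element:
    """Parse XML, refusing any document that declares a DTD."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    try:
        # First pass: scan the document only to detect a DOCTYPE.
        checker.Parse(data, True)
    except expat.ExpatError as exc:
        raise ValueError(f"Malformed XML: {exc}") from exc
    # Second pass: build the element tree from DTD-free input.
    return ET.fromstring(data)
```

A view could call parse_without_dtd(request.body) and map both DTDForbidden and ValueError to a 400 response, so that XXE payloads are rejected rather than parsed with their entities left unexpanded.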

Testing Strategy

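Write an integration test that uploads an XML file containing a malicious XXE payload. The test should assert that the application responds in a controlled way (for example, a 400 Bad Request for rejected XML, or a normal response with the entity left unexpanded) and that the external entity is never resolved. Because lxml reads files through libxml2 rather than Python's open(), patching Python-level file APIs will not observe such reads; checking the response for known file content is a more reliable signal.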
# documents/tests.py
from django.test import TestCase
from django.core.files.uploadedfile import SimpleUploadedFile
from django.urls import reverse

class XXETest(TestCase):
    def test_xxe_payload_is_rejected(self):
        xxe_payload = b'<?xml version="1.0"?><!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>'
        uploaded_file = SimpleUploadedFile("test.xml", xxe_payload, content_type="application/xml")

        # A vulnerable application might hang, crash, or return file content.
        # A secure one should reject the DTD or leave the entity unexpanded.
        response = self.client.post(reverse('upload-document'), {'xml_file': uploaded_file})

        # Accept either outcome, depending on how the view handles errors.
        self.assertIn(response.status_code, (200, 400))
        # Assert that the content of /etc/passwd is not in the response.
        # assertNotContains also requires a 200 status, so check the body directly.
        self.assertNotIn(b"root:x:0:0", response.content)
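Checking for "root:x:0:0" ties the test to the contents of /etc/passwd on the machine running it. A more portable variant, sketched below under stated assumptions, points the entity at a temporary canary file holding a random marker and checks that the marker never appears in the parser's output. The helper names leaks_external_entity and stdlib_parse_to_text are hypothetical, and the standard-library parser stands in for whatever parsing function the application actually uses.

```python
import tempfile
import uuid
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Callable


def leaks_external_entity(parse_to_text: Callable[[bytes], str]) -> bool:
    """Return True if parse_to_text exposes the content of a canary file."""
    marker = f"canary-{uuid.uuid4().hex}"
    with tempfile.TemporaryDirectory() as tmp:
        canary = Path(tmp) / "canary.txt"
        canary.write_text(marker)
        # Build an XXE payload whose external entity targets the canary file.
        payload = (
            '<?xml version="1.0"?>'
            f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{canary.as_uri()}">]>'
            "<foo>&xxe;</foo>"
        ).encode()
        try:
            output = parse_to_text(payload)
        except Exception:
            # A parser that rejects the payload outright has leaked nothing.
            return False
    return marker in output


def stdlib_parse_to_text(data: bytes) -> str:
    """Example parsing function under test: concatenate all text nodes."""
    root = ET.fromstring(data)
    return "".join(root.itertext())


if __name__ == "__main__":
    print(leaks_external_entity(stdlib_parse_to_text))
```

In a Django test suite, the same idea could be applied by uploading the canary payload through self.client.post and asserting that the marker is absent from response.content.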