XML Entity Expansion, often called an “XML Bomb” or “Billion Laughs Attack,” is a Denial of Service (DoS) vulnerability. It occurs when an XML parser resolves nested entity references defined within a Document Type Definition (DTD). An attacker can craft a small XML document in which each internal entity expands to several copies of the previous one, so the expanded size grows exponentially with the nesting depth. When the parser expands these entities, it consumes a massive amount of memory and CPU, potentially crashing the parser, the application, or even the entire server. 💣💥
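To make the exponential growth concrete, here is a minimal sketch that builds a payload of this shape and computes its fully expanded size without parsing it. The function names, the entity names (e0, e1, ...), and the default of nine levels with ten references each are illustrative assumptions modeled on the classic example, not something a particular tool requires.

```python
def build_bomb(levels: int = 9, fanout: int = 10, seed: str = "lol") -> str:
    """Return a Billion-Laughs-style document (hypothetical helper).

    Entity e0 holds the seed text; every entity e(i) references
    e(i-1) `fanout` times, and the root element references the last one.
    """
    decls = [f'<!ENTITY e0 "{seed}">']
    for i in range(1, levels + 1):
        refs = f"&e{i - 1};" * fanout
        decls.append(f'<!ENTITY e{i} "{refs}">')
    dtd = "\n  ".join(decls)
    return (
        '<?xml version="1.0"?>\n'
        f"<!DOCTYPE bomb [\n  {dtd}\n]>\n"
        f"<bomb>&e{levels};</bomb>"
    )


def expanded_chars(levels: int = 9, fanout: int = 10, seed: str = "lol") -> int:
    """Number of characters the root element holds after full expansion."""
    return len(seed) * fanout ** levels


if __name__ == "__main__":
    doc = build_bomb()
    print("document size:", len(doc), "characters")
    print("expanded size:", expanded_chars(), "characters")
```

With the defaults, a document of a few hundred characters asks the parser to materialize three billion characters. Use small values for levels when trying this against a real parser.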
Like XXE (CWE-611), this vulnerability lies in the XML parser library configuration. Parsers that process DTDs and expand internal entities are potentially vulnerable.
Key Remediation Principles:
Disable DTD Processing: This is the most effective defense, as it prevents the parser from processing the entity definitions in the first place. This also prevents most XXE attacks.
Limit Entity Expansion: If DTDs must be processed, configure the parser to limit the total size or number of entity expansions. Many modern parsers have built-in limits or flags for this.
Use Secure Parser Defaults: Keep XML parsing libraries updated, as newer versions often have safer defaults.
Resource Limits: Implement general resource limits (memory, CPU) at the application or server level as a defense-in-depth measure.
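The first principle, combined with a basic input size cap, can be sketched with the Python standard library alone. This is one possible approach under stated assumptions, not a replacement for a hardened library: parse_untrusted, DTDForbidden, and the one-megabyte cap are hypothetical names and values chosen for illustration. It relies on pyexpat's StartDoctypeDeclHandler callback, which fires when a DOCTYPE declaration begins.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Called by expat as soon as a DOCTYPE starts; raising here aborts
    # the parse before any entity declaration is processed.
    raise DTDForbidden(f"DOCTYPE {name!r} is not allowed")


def parse_untrusted(data: bytes, max_bytes: int = 1_000_000) -> ET.Element:
    """Reject oversized input and any DOCTYPE, then build the tree."""
    if len(data) > max_bytes:
        raise ValueError("XML input too large")
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(data, True)  # raises DTDForbidden or expat.ExpatError
    return ET.fromstring(data)


if __name__ == "__main__":
    print(parse_untrusted(b"<order><id>42</id></order>").tag)
    try:
        parse_untrusted(b'<!DOCTYPE a [<!ENTITY x "y">]><a>&x;</a>')
    except DTDForbidden as exc:
        print("rejected:", exc)
```

The input is parsed twice (once to check, once to build the tree), which keeps the sketch simple at some cost in speed. Because no entity can be declared without a DOCTYPE, rejecting the DOCTYPE removes the precondition for the attack.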
Vulnerable Scenario 1: ElementTree with DTD Processing
While ElementTree is generally safer against external entities by default, its handling of internal entity expansion can vary. Explicitly disabling DTDs is best.
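As a harmless illustration of the behavior in question, the sketch below feeds ElementTree a tiny document with three nested internal entities. The document and the helper name expanded_text are made up for this example; the depth is deliberately small so the expansion stays trivial.

```python
import xml.etree.ElementTree as ET

# Three nested internal entities: c -> 3 x b -> 9 x a.
SMALL_NESTED = b"""<!DOCTYPE r [
  <!ENTITY a "ha">
  <!ENTITY b "&a;&a;&a;">
  <!ENTITY c "&b;&b;&b;">
]>
<r>&c;</r>"""


def expanded_text(xml_bytes: bytes) -> str:
    """Parse with the default ElementTree settings and return the root text."""
    root = ET.fromstring(xml_bytes)
    return root.text or ""


if __name__ == "__main__":
    text = expanded_text(SMALL_NESTED)
    print(len(text), "characters after expansion")
```

The default parser expands the internal entities without any opt-in, which is why the same document shape with more levels and a larger fan-out becomes a resource problem on parsers that lack amplification limits.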
Vulnerable Scenario 2: lxml with huge_tree=True (less common)
lxml inherits libxml2's entity expansion limits, but passing huge_tree=True to the parser disables those security restrictions, so very large expansions are no longer stopped. Disabling DTDs is still preferred.
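ElementTree / minidom: The standard-library parsers do not accept a resolve_entities option (that keyword belongs to lxml's XMLParser). For untrusted input, parse with the defusedxml package instead (for example defusedxml.ElementTree.fromstring), which rejects entity declarations by default. Also keep the underlying Expat library current: Expat 2.4.1 and later include built-in protection against Billion Laughs-style amplification.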
lxml: The defaults are generally safe. For extra safety, use etree.XMLParser(resolve_entities=False), which leaves entity references unexpanded. lxml also inherits libxml2's limits on entity expansion, which usually stop the classic “Billion Laughs”. Do not pass huge_tree=True when parsing untrusted input, since it disables those limits.
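When a DOCTYPE has to be tolerated but entities are not needed, a standard-library pre-check can refuse any entity declaration before the document reaches ElementTree or minidom. The sketch below is an assumption-laden illustration of that idea; assert_no_entities and EntitiesForbidden are hypothetical names. It uses pyexpat's EntityDeclHandler callback, which is invoked for every entity declaration in the DTD.

```python
from xml.parsers import expat


class EntitiesForbidden(ValueError):
    """Raised when untrusted XML declares any entity."""


def _reject_entity(name, is_parameter_entity, value, base,
                   system_id, public_id, notation_name):
    # Fires for internal and external entity declarations alike.
    raise EntitiesForbidden(f"entity {name!r} is not allowed")


def assert_no_entities(data: bytes) -> None:
    """Scan the document once and fail on the first entity declaration."""
    checker = expat.ParserCreate()
    checker.EntityDeclHandler = _reject_entity
    checker.Parse(data, True)


if __name__ == "__main__":
    assert_no_entities(b"<!DOCTYPE a [<!ELEMENT a (#PCDATA)>]><a>ok</a>")
    try:
        assert_no_entities(b'<!DOCTYPE a [<!ENTITY x "y">]><a>&x;</a>')
    except EntitiesForbidden as exc:
        print("rejected:", exc)
```

Call the check before handing the same bytes to the real parser. A document with a DTD but no entity declarations passes, while any declared entity aborts the scan at the declaration, before it can be referenced.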
Submit the “Billion Laughs” XML payload (or variations with fewer nested entities if the full one is blocked by infrastructure) to all XML parsing endpoints. Observe the application’s response time and server resource usage (CPU, memory). A vulnerable application will likely hang, crash, or become unresponsive. Secure parsers should reject the input quickly with an error related to DTDs, entities, or resource limits.
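Before targeting a deployed endpoint, the same check can be rehearsed locally against the parsing function itself. The harness below is a rough sketch: probe, reduced_payload, and the one-second budget are hypothetical choices, and the payload is deliberately reduced (three levels of ten references, about 3,000 characters once expanded) so that even a vulnerable parser finishes instantly.

```python
import time
import xml.etree.ElementTree as ET


def reduced_payload(levels: int = 3, fanout: int = 10) -> bytes:
    """Build a small nested-entity document for safe local probing."""
    decls = ['<!ENTITY e0 "lol">']
    for i in range(1, levels + 1):
        refs = f"&e{i - 1};" * fanout
        decls.append(f'<!ENTITY e{i} "{refs}">')
    body = "\n  ".join(decls)
    return f"<!DOCTYPE t [\n  {body}\n]>\n<t>&e{levels};</t>".encode("ascii")


def probe(parse, payload: bytes, budget_s: float = 1.0) -> dict:
    """Run parse(payload), recording the outcome and the elapsed time."""
    start = time.perf_counter()
    try:
        parse(payload)
        outcome = "accepted"
    except Exception as exc:  # any rejection counts, whatever its type
        outcome = f"rejected: {type(exc).__name__}"
    elapsed = time.perf_counter() - start
    return {
        "outcome": outcome,
        "elapsed_s": elapsed,
        "within_budget": elapsed <= budget_s,
    }


if __name__ == "__main__":
    print(probe(ET.fromstring, reduced_payload()))
```

An "accepted" outcome on the reduced payload means the parser expands nested entities and deserves a closer look; a fast "rejected" outcome is the desired result. Since the probe runs in-process, deeper payloads are better tried in a separate process with a timeout and a memory limit.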
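Explicitly disable DTD processing using factory features, for example by setting the http://apache.org/xml/features/disallow-doctype-decl feature to true. This is the primary defense against both XXE and XML bombs. If DTDs are needed, additionally disable general entity expansion (setExpandEntityReferences(false)) or rely on FEATURE_SECURE_PROCESSING, which applies the parser's entity expansion limits.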
Submit the “Billion Laughs” XML payload to endpoints parsing XML. Monitor server CPU and memory usage closely. A vulnerable application will show a significant spike and likely become unresponsive. A secure application should reject the input quickly with a parsing error (e.g., DTDs disallowed, entity limits exceeded) without consuming excessive resources.
Using System.Xml.XmlDocument, System.Xml.XmlReader, System.Xml.Linq.XDocument. Modern .NET versions have secure defaults that prohibit DTDs (DtdProcessing.Prohibit) and limit entity expansion. Vulnerabilities occur in older versions or if defaults are overridden insecurely.
Vulnerable Scenario 2: XmlReader with High MaxCharactersFromEntities
Even when the DTD itself is not processed, the entity size limit is a safety net that should stay in place. It becomes the last line of defense as soon as DTD processing is enabled.
// parser/XmlReaderParser.cs
using System.IO;
using System.Xml;

public class XmlReaderParser
{
    public void ProcessXmlReaderUnsafeLimits(string xmlData)
    {
        var settings = new XmlReaderSettings();
        // The DOCTYPE is skipped with Ignore, but the limit below becomes
        // critical as soon as this is switched to DtdProcessing.Parse.
        settings.DtdProcessing = DtdProcessing.Ignore;
        // DANGEROUS: 0 removes the cap on characters produced by entity expansion.
        settings.MaxCharactersFromEntities = 0;

        using (var reader = XmlReader.Create(new StringReader(xmlData), settings))
        {
            // With DTD parsing enabled and no cap, nested internal entities
            // could expand massively while reading.
            while (reader.Read()) { /* ... */ }
        }
    }
}
Rely on the secure defaults in modern .NET: DtdProcessing = DtdProcessing.Prohibit and XmlResolver = null. Avoid overriding these unless absolutely necessary and understood. Ensure MaxCharactersFromEntities retains its default, reasonably small limit.
Submit the “Billion Laughs” payload. Verify that the parser (using secure defaults or explicit settings like DtdProcessing.Prohibit) rejects the input quickly with an appropriate XmlException (e.g., “DTD is prohibited”) without consuming excessive memory or CPU. Also confirm which .NET version is in use, since older versions have less safe defaults.
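On PHP versions before 8.0, call libxml_disable_entity_loader(true) before parsing any untrusted XML to block external entity loading. The function is deprecated as of PHP 8.0, because libxml2 2.9.0 and later disable external entity loading by default. Note that this setting targets external entities (XXE); protection against internal entity expansion comes from libxml2's built-in expansion limits, so do not pass LIBXML_PARSEHUGE, which relaxes them. Do not pass LIBXML_NOENT or LIBXML_DTDLOAD for untrusted input: despite its name, LIBXML_NOENT enables entity substitution. LIBXML_NONET remains useful as defense-in-depth.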
Submit the “Billion Laughs” payload to XML parsing endpoints. On PHP versions before 8.0, verify that libxml_disable_entity_loader(true) is called before parsing; on any version, confirm that LIBXML_NOENT and LIBXML_PARSEHUGE are not passed to the parser. The parser should reject the input quickly (often returning false or throwing an exception related to DTDs/entities) without high resource usage. Check the PHP and libxml2 versions in use.
Explicitly configure the parser to disable DTD loading and entity expansion. Check library documentation for specific flags (dtdload, noent, doctype, etc.).
Submit the “Billion Laughs” payload. Check library documentation for secure parsing options (dtdload, noent, etc.) and ensure they are used. Verify the parser rejects the malicious input quickly without consuming excessive resources.
Vulnerable Scenario 2: Nokogiri with DTD Loading Enabled
Explicitly enabling DTD loading might reintroduce risk if internal limits are bypassed.
# parser/nokogiri_parser.rb
require 'nokogiri'

def parse_nokogiri_unsafe(xml_data)
  begin
    # DANGEROUS: Explicitly enabling DTD loading. Although Nokogiri
    # has some internal limits, this increases the risk surface.
    doc = Nokogiri::XML(xml_data) do |config|
      config.strict.dtdload # Enable DTD loading
    end
    # ... process doc ...
  rescue Nokogiri::XML::SyntaxError => e
    # ...
  end
end
Submit the “Billion Laughs” payload to XML parsing endpoints. Verify REXML (with limit 0) or Nokogiri (by default) rejects the input quickly with errors related to entities or resource limits, without consuming excessive CPU/memory.