Deserialization of Untrusted Data

Overview

Insecure Deserialization occurs when an application deserializes data from an untrusted source without proper validation. Deserialization is the process of converting a byte stream or structured text (like XML/YAML) back into a live object in memory. If an attacker can control this serialized data, they can craft a malicious object that, when instantiated, can execute arbitrary code, bypass logic, or cause a denial of service.

Business Impact

This is often one of the most critical vulnerabilities, frequently leading directly to Remote Code Execution (RCE) on the application server. Exploitation involves “gadget chains”—leveraging pieces of existing, legitimate code in the application in unexpected ways to perform malicious actions during the deserialization process.

Reference Details

CWE ID: CWE-502
OWASP Top 10 (2021): A08:2021 - Software and Data Integrity Failures
Severity: Critical

Framework-Specific Analysis and Remediation

The universal and most effective mitigation is to never deserialize data from untrusted sources using native, object-oriented serialization formats. Instead, use safe, data-only formats like JSON for all data interchange. If a native format is absolutely required, use features that restrict which classes can be instantiated or use a digital signature to verify the integrity and authenticity of the serialized data before processing.
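When a native format cannot be avoided, the two fallbacks mentioned above can be combined. The following is a minimal sketch, not a definitive implementation: it uses an HMAC tag (a shared-secret integrity check standing in for a digital signature) and a `pickle.Unpickler` subclass that only resolves allow-listed classes. The names `SECRET_KEY`, `ALLOWED_GLOBALS`, `sign`, and `verify_and_load` are illustrative assumptions, not part of any framework.

```python
import hashlib
import hmac
import io
import pickle

# Assumption: a server-side secret that never leaves the application.
SECRET_KEY = b"replace-with-a-real-secret"

# Assumption: the only globals this application ever needs to unpickle.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowListUnpickler(pickle.Unpickler):
    """Refuse to resolve any global that is not explicitly allow-listed."""

    def find_class(self, module, name):
        if (module, name) not in ALLOWED_GLOBALS:
            raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")
        return super().find_class(module, name)


def sign(payload: bytes) -> bytes:
    """Prefix the payload with a hex HMAC-SHA256 tag."""
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest().encode()
    return tag + b":" + payload


def verify_and_load(blob: bytes):
    """Check the tag first, then unpickle with the allow-list unpickler."""
    tag, sep, payload = blob.partition(b":")
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest().encode()
    if not sep or not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch")
    return AllowListUnpickler(io.BytesIO(payload)).load()
```

The tag is verified before any byte reaches the unpickler, and the allow-list limits the damage if the secret ever leaks. Even so, a data-only format such as JSON remains the safer choice.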

Python

Framework Context

Python’s pickle module is the primary mechanism for native object serialization, and it is notoriously insecure. The official documentation explicitly warns against unpickling data from untrusted sources. PyYAML’s yaml.load() is also unsafe when called with yaml.Loader or yaml.UnsafeLoader, because those loaders can construct arbitrary Python objects.

Vulnerable Scenario 1: Unpickling a Session Cookie

A web application stores a user’s session object as a pickled, base64-encoded string in a cookie.

# middleware/session.py
import pickle
import base64

class PickleSessionMiddleware:
    def process_request(self, request):
        session_data = request.COOKIES.get('session')
        if session_data:
            # DANGEROUS: An attacker can replace their cookie with a malicious
            # pickled object that executes code upon deserialization.
            request.session = pickle.loads(base64.b64decode(session_data))
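
To see why the call above is dangerous, it helps to look at how pickle rebuilds objects. The sketch below is an illustration only: the `Demo` class is hypothetical, and the harmless `os.getcwd` stands in for the command an attacker would choose. It shows that `pickle.loads()` calls whatever callable the payload's `__reduce__` named.

```python
import os
import pickle


class Demo:
    """On unpickling, pickle calls the callable returned by __reduce__."""

    def __reduce__(self):
        # Harmless stand-in: an attacker would return something like
        # (os.system, ("<shell command>",)) instead.
        return (os.getcwd, ())


payload = pickle.dumps(Demo())

# loads() does not rebuild a Demo instance; it calls os.getcwd() and
# returns whatever that call produces.
result = pickle.loads(payload)
print(type(result).__name__)
```

The receiving process does not need the `Demo` class at all. The payload only names a callable and its arguments, which is why validating the data after unpickling comes too late.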

Vulnerable Scenario 2: Processing Data from a Task Queue

A Celery worker receives a task whose arguments include a YAML-serialized object.

# tasks.py
import yaml
from celery import shared_task

@shared_task
def process_report(report_data_yaml):
    # DANGEROUS: yaml.load() with yaml.Loader can construct arbitrary Python
    # objects and even call arbitrary functions.
    report = yaml.load(report_data_yaml, Loader=yaml.Loader)
    # ... process report ...

Mitigation and Best Practices

Never use pickle, or yaml.load() with an unsafe loader, for data that has passed through an untrusted environment. Use json for all data interchange. For YAML, always use yaml.safe_load(). Django’s built-in session framework serializes session data as JSON by default and signs it when the cookie-based backend is used; rely on it instead of rolling your own.
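
The difference between the loaders can be sketched as follows, assuming PyYAML is installed. The two sample documents are made up for illustration. `yaml.safe_load()` accepts plain data but refuses a document carrying a Python-specific tag, which an unsafe loader would try to resolve into a function call.

```python
import yaml

# Plain data: only mappings, lists, strings, and numbers.
plain = "title: Q3 report\nrows: [1, 2, 3]\n"

# A Python-specific tag asking the loader to call builtins.len.
# A real attack would name something like os.system instead.
tagged = "!!python/object/apply:builtins.len [[1, 2, 3]]\n"

report = yaml.safe_load(plain)

try:
    yaml.safe_load(tagged)
    rejected = False
except yaml.YAMLError:
    # safe_load has no constructor for python/* tags, so it raises
    # instead of calling anything.
    rejected = True

print(report, rejected)
```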

Secure Code Example

# middleware/session.py (Secure Version)
import json
from django.core.signing import Signer, BadSignature

class JsonSessionMiddleware:
    def process_request(self, request):
        # Default to an empty session so request.session always exists,
        # even when no cookie was sent.
        request.session = {}
        session_data = request.COOKIES.get('session')
        signer = Signer()
        if session_data:
            try:
                # SAFE: 1. Verify the signature to prevent tampering.
                #       2. Use json.loads(), which only creates simple data types.
                unsigned_data = signer.unsign(session_data)
                request.session = json.loads(unsigned_data)
            except (BadSignature, json.JSONDecodeError):
                request.session = {}  # Handle tampered/invalid data

Testing Strategy

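Testing for this is complex. It involves creating a known RCE payload for pickle, typically a small class whose __reduce__ method returns the command to run, and submitting it to the vulnerable endpoint. (Tools such as ysoserial and ysoserial.net play the same role for Java and .NET; they do not generate pickle payloads.) The test then checks for the side effect of the code execution (e.g., a file being created on the server, or a network callback).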

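# A conceptual test
import base64
import os
import pickle

from django.test import TestCase

class TouchFile:
    # Unpickling this object would run os.system("touch /tmp/pwned").
    def __reduce__(self):
        return (os.system, ("touch /tmp/pwned",))

class SessionDeserializationTests(TestCase):
    def test_session_deserialization_rce(self):
        # 1. Generate a malicious pickle payload that creates a file '/tmp/pwned'
        malicious_payload = pickle.dumps(TouchFile())
        encoded_payload = base64.b64encode(malicious_payload).decode()

        # 2. Set the cookie and make a request
        self.client.cookies['session'] = encoded_payload
        self.client.get('/')

        # 3. Assert that the file was NOT created
        self.assertFalse(os.path.exists('/tmp/pwned'))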
