# Static Application Security Testing (SAST)

> Find vulnerabilities in your source code before they reach production. 22 languages. Zero blind spots.

## For Executives

Codepure SAST scans your source code for security vulnerabilities before they reach production. It works with **22 programming languages**, including enterprise platforms like SAP ABAP, COBOL, Salesforce Apex, and Solidity smart contracts — languages that most security tools cannot read.

Every finding includes a complete attack trace that shows exactly how user input reaches a dangerous operation, so your developers know what to fix and why. The engine ships with built-in compliance mapping for NCA ECC, SAMA, PCI DSS, GDPR, and HIPAA, so your audit reports are ready from day one.

* **22 languages** — from JavaScript and Python to ABAP, COBOL, and Solidity
* **Complete attack traces** — every finding shows source to sink, not just a pattern match
* **Built-in compliance** — NCA ECC, SAMA, PCI DSS, GDPR, HIPAA mapped automatically
* **Zero infrastructure** — one binary, no dependencies, runs on-premises or air-gapped
* **Secret detection included** — 34 pattern types scan all file types for hardcoded credentials

***

## What SAST Does

Static Application Security Testing analyzes your application's source code without running it. It reads every file, builds a structured understanding of the code, tracks how data moves through the application, and flags any path where untrusted user input reaches a dangerous operation without being sanitized first.

This catches SQL injection, command injection, cross-site scripting, path traversal, server-side request forgery, insecure deserialization, XXE, hardcoded secrets, and dozens of other vulnerability types — before the code is ever deployed.

## How Codepure SAST Works

Codepure uses a taint-propagation analysis engine. Instead of looking for code that *resembles* a vulnerability, it follows the actual flow of data through the application.

**Step 1 — Source Detection**

The engine identifies where untrusted data enters the application: HTTP request parameters and headers, file uploads, environment variables, database reads, API responses, message queue payloads, and CLI arguments.

**Step 2 — Taint Propagation**

Tainted data is tracked through variable assignments, function calls, object properties, array indices, method chains, and return values. The engine follows data across file and module boundaries using inter-procedural analysis.

**Step 3 — Sanitizer Recognition**

When tainted data passes through a validation or encoding function — `htmlspecialchars()` in PHP, `$wpdb->prepare()` in WordPress, `PreparedStatement` in Java, parameterized queries in .NET — the taint is cleared. The engine understands language-specific and framework-specific sanitizers.

**Step 4 — Sink Matching**

The engine identifies dangerous operations where tainted data causes harm: SQL execution, command execution, file system operations, outbound HTTP requests, XML parsing, and deserialization. A finding is reported only when unsanitized data reaches a sink.

**Step 5 — Attack Trace Generation**

Every finding includes a complete source-to-sink chain with line numbers, code snippets, and variable names at each step. This tells developers exactly where the vulnerability is and how to fix it.
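
The five steps above can be sketched as a toy model. This is a minimal illustration under stated assumptions, not Codepure's engine: the `Step` record, the `analyze` function, and the source, sanitizer, and sink names are all hypothetical, and a real analyzer works on parsed syntax trees rather than a hand-written list of steps.

```python
from dataclasses import dataclass

# Hypothetical rule names for illustration; not Codepure's actual rule set.
SOURCES = {"request.args.get"}
SANITIZERS = {"escape_sql"}
SINKS = {"db.execute"}

@dataclass
class Step:
    line: int        # source line number
    target: str      # variable written ("" when the result is discarded)
    func: str        # function called ("" for a plain expression)
    args: list[str]  # variables read

def analyze(steps: list[Step]) -> list[dict]:
    tainted: dict[str, list[int]] = {}  # variable -> line numbers of its trace so far
    findings = []
    for s in steps:
        incoming = [tainted[a] for a in s.args if a in tainted]
        if s.func in SOURCES:                 # Step 1: untrusted data enters
            tainted[s.target] = [s.line]
        elif s.func in SANITIZERS:            # Step 3: sanitizer output is clean
            tainted.pop(s.target, None)
        elif s.func in SINKS:                 # Step 4: tainted data reaches a sink
            if incoming:
                # Step 5: record the source-to-sink trace
                findings.append({"sink": s.func, "trace": incoming[0] + [s.line]})
        elif incoming:                        # Step 2: taint propagates
            tainted[s.target] = incoming[0] + [s.line]
        else:                                 # variable overwritten with clean data
            tainted.pop(s.target, None)
    return findings

vulnerable = [
    Step(1, "name", "request.args.get", []),
    Step(2, "query", "", ["name"]),           # query = "SELECT ..." + name
    Step(3, "", "db.execute", ["query"]),
]
safe = [
    Step(1, "name", "request.args.get", []),
    Step(2, "clean", "escape_sql", ["name"]),
    Step(3, "query", "", ["clean"]),
    Step(4, "", "db.execute", ["query"]),
]

print(analyze(vulnerable))  # [{'sink': 'db.execute', 'trace': [1, 2, 3]}]
print(analyze(safe))        # []
```

The second program produces no finding because the sanitizer sits between the source and the sink, which is the behavior described in Step 3 and Step 4.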

## Vulnerability Categories

Codepure detects the following vulnerability types across its supported languages. Each category includes multiple detection rules tuned for specific frameworks and patterns.

| Category                               | CWE               | Description                                                                                                                      |
| -------------------------------------- | ----------------- | -------------------------------------------------------------------------------------------------------------------------------- |
| **SQL Injection**                      | CWE-89            | Unsanitized input in SQL queries, ORM raw queries, EF Core FromSqlRaw, Sequelize raw queries, dynamic SOQL/SOSL                  |
| **NoSQL Injection**                    | CWE-943           | User input in MongoDB/Mongoose queries, Sequelize where clauses, DynamoDB expressions                                            |
| **Command Injection**                  | CWE-78            | Unsanitized input in OS command execution — `system()`, `exec()`, `ProcessBuilder`, `child_process`, `Runtime.exec()`            |
| **Code Injection**                     | CWE-94            | User input passed to `eval()`, `Function()` constructor, `call_user_func()`, `include` with variable paths, `source` in shell    |
| **Cross-Site Scripting (XSS)**         | CWE-79            | Reflected, stored, and DOM-based XSS through unsanitized response output, `innerHTML`, `document.write()`, framework equivalents |
| **Path Traversal**                     | CWE-22            | Directory traversal through unsanitized file paths in file read/write/stream operations                                          |
| **Server-Side Request Forgery (SSRF)** | CWE-918           | User-controlled URLs in outbound HTTP requests, `URL.openStream()`, HTTP client methods, `curl`/`wget` with variable input       |
| **XML External Entity (XXE)**          | CWE-611           | XML parsing with DTD processing enabled — `XmlReader`, `DocumentBuilder`, `libxml`, `XElement.Parse`                             |
| **Insecure Deserialization**           | CWE-502           | Deserialization of untrusted data — `ObjectInputStream`, `BinaryFormatter`, `pickle.loads()`, `unserialize()`, `Yaml.load()`     |
| **LDAP Injection**                     | CWE-90            | Unsanitized input in LDAP query filters through `DirectorySearcher`, `LdapConnection`, Spring LDAP templates                     |
| **Remote File Inclusion**              | —                 | Dynamic file inclusion with user-controlled paths in PHP, .NET, and Java                                                         |
| **Local File Inclusion**               | —                 | File read operations with user-controlled paths allowing access to sensitive system files                                        |
| **Open Redirect**                      | CWE-601           | User-controlled URLs in redirect operations — `PageReference`, `redirect()`, `Response.Redirect()`                               |
| **Hardcoded Credentials**              | CWE-798           | Passwords, API keys, tokens, and secrets written directly in source code                                                         |
| **Broken Cryptography**                | CWE-327           | Use of deprecated algorithms — DES, 3DES, SHA-1, MD5 — and weak encryption modes like ECB                                        |
| **Weak Random Number Generation**      | CWE-338           | `Math.random()`, `java.util.Random`, `mt_rand()` used in security-sensitive contexts                                             |
| **Cleartext Transmission**             | CWE-319           | HTTP instead of HTTPS, unencrypted network requests, cleartext storage of sensitive data                                         |
| **Sensitive Data Exposure**            | CWE-200           | Stack traces in responses, environment variables in logs, sensitive data to clipboard, unmasked output                           |
| **Log Information Leakage**            | CWE-532           | Passwords, tokens, and sensitive data written to application logs                                                                |
| **Buffer Overflow**                    | CWE-120           | `strcpy`, `strcat`, `memcpy` without bounds checking in C/C++                                                                    |
| **Out-of-Bounds Read/Write**           | CWE-125 / CWE-787 | Unsafe array access and assignment in C/C++ kernel code                                                                          |
| **Use After Free**                     | CWE-416           | Accessing memory after `kfree`/`vfree` in C kernel code                                                                          |
| **Double Free**                        | CWE-415           | Calling `free()` twice on the same pointer                                                                                       |
| **Format String Vulnerability**        | CWE-134           | User-controlled format strings in `printf`/`fprintf`                                                                             |
| **Type Confusion**                     | CWE-843           | Unsafe type casts in C/C++ kernel code                                                                                           |
| **Privilege Escalation**               | CWE-732           | `sudo`/`su` usage in shell scripts with variable input                                                                           |
| **Insecure Direct Object Reference**   | CWE-639           | Unauthorized data access through manipulated object identifiers                                                                  |
| **Broken Access Control**              | CWE-284           | Missing authorization checks on insert/update/delete operations (Apex), unprotected administrative endpoints                     |
| **JWT Weaknesses**                     | CWE-347           | Hardcoded JWT secrets, algorithm confusion (HS256 vs RS256), `jwt.decode()` without verification                                 |
| **Insecure Certificate Validation**    | CWE-295           | Disabled TLS certificate verification in network requests                                                                        |
| **Insecure Biometric Authentication**  | CWE-287           | Weak biometric auth implementations in mobile apps (Dart/Flutter)                                                                |
| **Insufficient Session Expiration**    | CWE-613           | Sessions that do not expire or use weak expiration logic                                                                         |
| **CRLF Injection**                     | CWE-93            | User input injected into HTTP headers through carriage return/line feed sequences                                                |
| **Prototype Pollution**                | CWE-1321          | User-controlled property names in JavaScript object assignment                                                                   |
| **Server-Side Template Injection**     | CWE-94            | User input rendered directly in template engines — Twig, Jinja2, Blade, Thymeleaf, EJS                                           |
| **SOQL/SOSL Injection**                | —                 | Dynamic SOQL/SOSL queries with string concatenation in Salesforce Apex                                                           |
| **ABAP Native SQL Injection**          | —                 | Unsanitized input in `EXEC SQL` and dynamic Open SQL WHERE clauses                                                               |
| **ABAP Missing Authority Checks**      | —                 | Database operations without `AUTHORITY-CHECK` in ABAP programs                                                                   |
| **COBOL Arithmetic Overflow**          | —                 | Arithmetic operations without `ON SIZE ERROR` handling                                                                           |
| **COBOL Buffer Overflow**              | —                 | `MOVE` without length check, `UNSTRING` overflow                                                                                 |
| **CI/CD Pipeline Misconfiguration**    | —                 | Unpinned actions, hardcoded credentials in workflows, insecure OIDC config, running as root, missing approval gates              |
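
To make one row of the table concrete, here is a small example of the SQL Injection category (CWE-89) using Python's standard `sqlite3` module. The function names and the `users` table are made up for illustration; the point is the difference between building a query by string concatenation and binding the input as a parameter.

```python
import sqlite3

def find_user_unsafe(conn, name):
    # Vulnerable: user input is concatenated into the SQL text.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, name):
    # Safe: the input is bound as a parameter and never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"
print(len(find_user_unsafe(conn, payload)))  # 2 (every row leaks)
print(find_user_safe(conn, payload))         # []
```

In taint terms, `name` is the source, the concatenation propagates the taint, and `conn.execute` is the sink. The parameterized version keeps the input out of the query text entirely.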

## Supported Languages and Frameworks

Codepure scans 22 programming languages with framework-specific detection rules built from real-world code patterns.

### JavaScript / TypeScript

Node.js, Express, Fastify, Koa, Hapi, NestJS, Next.js, Nuxt, React, Angular, Vue.js, Socket.IO, Apollo GraphQL, Sequelize, Mongoose, Prisma, TypeORM

### Python

Django, Flask, FastAPI, Pyramid, Bottle, Tornado, Celery, SQLAlchemy, Pandas, Jinja2

### PHP

Laravel, Symfony, WordPress, Drupal, Magento, CodeIgniter, Slim, CakePHP, Yii, PHPMailer

### Java

Spring Boot, Spring MVC, Spring Security, Spring Data/JPA, Jakarta EE, Apache Struts, JSF (JavaServer Faces), Hibernate, MyBatis, Micronaut, Quarkus, Eclipse Vert.x, Apache Camel

### C# / .NET

ASP.NET Core MVC, ASP.NET Core Web API, ASP.NET Core Minimal APIs, ASP.NET Web Forms, ASP.NET Framework (Legacy), Blazor, Entity Framework Core, Entity Framework (Legacy), Dapper, WCF (Windows Communication Foundation), SQLite (System.Data.SQLite)

### Go

net/http (standard library), Gin, Echo, Fiber, Gorilla Mux, Chi, GORM, sqlx, gRPC

### Ruby

Ruby on Rails, Sinatra, Padrino, Devise, ActiveRecord, Sidekiq

### Kotlin

Spring Boot (Kotlin), Ktor, Android SDK, Exposed, Anko

### Scala

Play Framework, Akka, Slick, Cats, ZIO

### Groovy

Grails, Gradle Plugins, Jenkins Shared Libraries, Spock, Groovy Templates

### Swift

iOS SDK, SwiftUI, UIKit, Vapor, Kitura, Combine

### Dart / Flutter

Flutter, Dio, http package, sqflite, Firebase, Provider, Riverpod, WebView (webview\_flutter)

### Rust

Axum, Actix, Rocket, Warp, Tokio, Serde, Diesel, sqlx

### C

Linux Kernel, POSIX/glibc, Embedded Systems

### C++

STL, Boost, Qt, Embedded Systems

### Solidity

Ethereum Smart Contracts, OpenZeppelin, Hardhat, Truffle

### Salesforce Apex

Salesforce Core, Lightning Web Components (LWC), Aura Components, DML Operations

### SAP ABAP

SAP ECC, S/4HANA, BSP (Business Server Pages), Web Dynpro, RFC/BAPI

### COBOL

IBM Mainframe, CICS, DB2, VSAM

### Bash / Shell

Bash, sh, zsh, CGI Scripts, Cron Jobs

### YAML (CI/CD)

GitHub Actions, GitLab CI, Jenkins, Azure Pipelines

### Secrets (All File Types)

Configuration files, environment files, documentation, source code, commit history

## Compliance Mapping

Every Codepure finding is automatically mapped to the compliance frameworks your organization and auditors care about. The SARIF output includes control IDs and descriptions for each vulnerability.

* **Saudi NCA ECC** — Essential Cybersecurity Controls for input validation, XSS, path traversal, XXE, deserialization, SSRF, and credential management
* **Saudi SAMA** — Cybersecurity Framework for secure coding, access control, XML processing, encryption, and cryptographic algorithms
* **PCI DSS** — Payment Card Industry requirements for injection flaws, cross-site scripting, strong cryptography, and credential management
* **GDPR** — General Data Protection Regulation requirements for personal data protection, encryption in transit, and access controls
* **HIPAA** — Health Insurance Portability and Accountability Act requirements for protected health information, encryption, and audit controls

## Using Codepure SAST

### Quick Start

#### 1. Initiate Scan

Configure and launch your SAST analysis:

<Frame>
  <img src="https://mintcdn.com/codepure-033ac3c2/hhAOIxN6H9SU03Pv/assets/sast2.PNG?fit=max&auto=format&n=hhAOIxN6H9SU03Pv&q=85&s=49076456895e50158c6713fb064ab796" style={{ borderRadius: '0.5rem' }} width="680" height="321" data-path="assets/sast2.PNG" />
</Frame>

* Navigate to your project dashboard
* Select the repository or upload code
* Choose **SAST** from the scanner options
* Configure scan parameters:
  * Select rule sets (OWASP, CWE, Custom)
  * Set severity thresholds
  * Choose full or targeted scan
* Click **Start Scanning** to begin analysis

#### 2. Review Results

Access detailed vulnerability findings with contextual information:

<Frame>
  <img src="https://mintcdn.com/codepure-033ac3c2/oD0Y7wym7_JoEUVv/assets/sast-1.png?fit=max&auto=format&n=oD0Y7wym7_JoEUVv&q=85&s=004c34e13c33c4a63b58c57462eb7de0" style={{ borderRadius: '0.5rem' }} width="1918" height="1018" data-path="assets/sast-1.png" />
</Frame>

Each finding includes:

* **Severity** — Critical, High, Medium, Low, Informational
* **CWE classification** — Common Weakness Enumeration mapping
* **OWASP Top 10 category** — alignment with industry standards
* **Compliance tags** — NCA ECC, SAMA, PCI DSS, GDPR, HIPAA relevance
* **Complete attack trace** — source-to-sink chain showing exactly how user input reaches the dangerous operation, with line numbers and code at each step

Filter results by severity, vulnerability type, compliance standard, file, team, or introduction date.

#### 3. Remediate Vulnerabilities

Get actionable fix guidance with code examples:

<Frame>
  <img src="https://mintcdn.com/codepure-033ac3c2/bDuBqFJGeaQD-Edg/assets/sast-sqli.png?fit=max&auto=format&n=bDuBqFJGeaQD-Edg&q=85&s=5191ddcc0a509acc6127033a16fe2ac3" style={{ borderRadius: '0.5rem' }} width="1918" height="1013" data-path="assets/sast-sqli.png" />
</Frame>

Each finding includes a detailed description of the vulnerability, a realistic attack scenario showing how it could be exploited, vulnerable code alongside the secure pattern, and specific step-by-step remediation guidance. Re-scan to confirm the fix.

### Built-In Capabilities

#### Secret Detection

34 pattern types scan for hardcoded credentials, API keys, tokens, and private keys across all file types — not just source code. Finds secrets in configuration files, environment files, documentation, and commit history. Runs as part of the standard SAST scan — no separate tool needed.

Detects: AWS access keys and secrets, Azure connection strings and tenant IDs, database connection strings and JDBC URLs, Docker passwords, GitHub tokens and OAuth tokens, GitLab tokens, Google Cloud API keys and service accounts, Heroku API keys, JWT secrets, Mailchimp and Mailgun API keys, OpenAI API keys, RSA and SSH private keys, Slack tokens and webhooks, Stripe publishable and secret keys, Twilio API keys and account SIDs, and generic password and secret assignment patterns.
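
Pattern-based secret detection can be sketched in a few lines. The three patterns below are illustrative only and are not Codepure's 34 pattern types: they use widely documented token shapes (the `AKIA` prefix of AWS access key IDs, PEM private key headers) plus a simple password assignment heuristic. The `PATTERNS` table and `scan_text` function are hypothetical names.

```python
import re

# Illustrative patterns only; a production scanner uses many more and tunes for false positives.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(AKIA[0-9A-Z]{16})\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |OPENSSH )?PRIVATE KEY-----"),
    "generic_password_assignment": re.compile(
        r"\bpassword\s*[:=]\s*['\"][^'\"]{4,}['\"]", re.IGNORECASE
    ),
}

def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern name) for every line that matches a pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits

sample = "\n".join([
    "# config",
    "aws_key = AKIAIOSFODNN7EXAMPLE",   # AWS's documented example key ID
    'password = "hunter2!"',
    "timeout = 30",
])

print(scan_text(sample))
# [(2, 'aws_access_key_id'), (3, 'generic_password_assignment')]
```

Because the scan works on raw text rather than parsed code, the same function applies to configuration files, environment files, and documentation.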

#### CI/CD Pipeline Scanning

53 patterns check CI/CD configuration files for security misconfigurations in GitHub Actions, GitLab CI, Jenkins, and Azure Pipelines:

* Unpinned action versions
* Hardcoded credentials in workflow files
* Insecure OIDC configurations
* Containers running as root
* Missing approval gates for production deployments
* Untrusted third-party actions
* Variables exposed in logs
* Insecure artifact uploads
* `curl | bash` and `wget | bash` patterns
* Missing SAST scan steps
* World-writable permissions
* `chmod 777` usage
* Dangerous `rm -rf` patterns
* `sudo` usage in pipelines
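
As an illustration of the first check in this list, the sketch below flags GitHub Actions steps whose `uses:` reference is not pinned to a full 40-character commit SHA. It is a simplified, regex-based approximation that assumes one `uses:` per line and skips local and `docker://` references; the function name `unpinned_actions` is hypothetical and this is not Codepure's rule.

```python
import re

USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)")
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")

def unpinned_actions(workflow_text: str) -> list[tuple[int, str]]:
    """Return (line number, action reference) for actions not pinned to a commit SHA."""
    findings = []
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_RE.match(line)
        if not match:
            continue
        ref = match.group(1).strip("'\"")
        if ref.startswith("./") or ref.startswith("docker://"):
            continue  # local actions and container images are out of scope here
        _, _, version = ref.partition("@")
        if not FULL_SHA_RE.match(version):
            findings.append((lineno, ref))
    return findings

# The SHA below is a placeholder, not a real commit.
workflow = """\
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@0123456789abcdef0123456789abcdef01234567
      - run: npm test
"""

print(unpinned_actions(workflow))  # [(5, 'actions/checkout@v4')]
```

A tag such as `v4` can be moved to a different commit by the action's maintainer, which is why only a full SHA counts as pinned in this sketch.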

#### SARIF Output

Export findings in OASIS SARIF 2.1.0 format with `--format=sarif`. The output includes full rule definitions with security severity levels, complete result details with locations, partial fingerprints for deduplication, custom compliance mappings with NCA ECC/SAMA/PCI DSS control IDs, and taint traces showing the complete attack chain.

Compatible with GitHub Advanced Security, GitLab SAST, Azure DevOps, and other tools that consume SARIF 2.1.0.
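
A SARIF file is plain JSON, so findings can be post-processed with a short script. The sketch below reads the standard SARIF 2.1.0 fields (`runs`, `results`, `ruleId`, `level`, `locations`, `codeFlows`). It assumes the taint trace is carried in `codeFlows`, which is the usual SARIF mechanism but is not confirmed here for Codepure's output. The sample log, rule ID, and file paths are hand-written placeholders.

```python
import json

def summarize_sarif(sarif: dict) -> list[tuple]:
    """Flatten a SARIF 2.1.0 log into (rule, level, file, line, trace_lines) rows."""
    rows = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            uri = physical.get("artifactLocation", {}).get("uri", "?")
            line = physical.get("region", {}).get("startLine", 0)
            # Each thread flow location is one step of the trace.
            trace = [
                step.get("location", {}).get("physicalLocation", {})
                    .get("region", {}).get("startLine")
                for flow in result.get("codeFlows", [])
                for thread in flow.get("threadFlows", [])
                for step in thread.get("locations", [])
            ]
            # "warning" is the SARIF default when a result has no explicit level.
            rows.append((result.get("ruleId"), result.get("level", "warning"), uri, line, trace))
    return rows

# Hand-written sample with a hypothetical rule ID and file paths.
SAMPLE = json.loads("""
{
  "version": "2.1.0",
  "runs": [{
    "tool": {"driver": {"name": "example-scanner"}},
    "results": [{
      "ruleId": "sql-injection",
      "level": "error",
      "message": {"text": "Tainted input reaches SQL query"},
      "locations": [{"physicalLocation": {
        "artifactLocation": {"uri": "dao/user.py"},
        "region": {"startLine": 42}}}],
      "codeFlows": [{"threadFlows": [{"locations": [
        {"location": {"physicalLocation": {
          "artifactLocation": {"uri": "controllers/user.py"}, "region": {"startLine": 10}}}},
        {"location": {"physicalLocation": {
          "artifactLocation": {"uri": "dao/user.py"}, "region": {"startLine": 42}}}}
      ]}]}]
    }]
  }]
}
""")

for row in summarize_sarif(SAMPLE):
    print(row)  # ('sql-injection', 'error', 'dao/user.py', 42, [10, 42])
```

In practice the input would come from `json.load()` on the exported file instead of an inline string.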

#### Inter-Procedural Analysis

Codepure tracks tainted data across file and module boundaries. When user input enters through a controller, flows through a service layer, and reaches a SQL query in a data access object, Codepure connects all three files into a single finding with a complete attack trace.
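
One simple way to picture the controller, service, and data access example is as a search over data-flow edges between functions in different files. The sketch below is a toy model, not Codepure's algorithm: the `FLOWS` graph, file paths, and function names are invented, and each edge stands for "a tainted argument is passed to this callee".

```python
from collections import deque

# Hypothetical data-flow edges keyed by (file, function).
FLOWS = {
    ("controllers/user.py", "get_user"): [("services/user.py", "lookup")],
    ("services/user.py", "lookup"): [("dao/user.py", "find_by_name")],
    ("dao/user.py", "find_by_name"): [],
}
SOURCE = ("controllers/user.py", "get_user")   # reads the HTTP parameter
SINKS = {("dao/user.py", "find_by_name")}      # executes the SQL query

def trace(source, flows, sinks):
    """Breadth-first search from the source; returns the first path that reaches a sink."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] in sinks:
            return path
        for callee in flows.get(path[-1], []):
            if callee not in seen:
                seen.add(callee)
                queue.append(path + [callee])
    return None

for file, func in trace(SOURCE, FLOWS, SINKS):
    print(f"{file}: {func}")
```

The returned path spans all three files, which corresponds to the single finding with one attack trace described above.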

#### AST Caching

Parsed abstract syntax trees are cached to avoid redundant re-parsing. When the same file is scanned multiple times — common in CI/CD pipelines — Codepure reuses the cached AST, reducing scan time on repeated and incremental runs.
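
The idea can be sketched with a cache keyed by a hash of the file contents, so an unchanged file is never parsed twice. This is an assumption-laden illustration: it uses Python's own `ast` module as a stand-in parser, keeps the cache in memory, and the `AstCache` class is a hypothetical name. How Codepure keys or stores its cache is not described here.

```python
import ast
import hashlib

class AstCache:
    """In-memory AST cache keyed by the SHA-256 of the source text."""

    def __init__(self):
        self._trees = {}
        self.hits = 0
        self.misses = 0

    def parse(self, source: str) -> ast.AST:
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        tree = self._trees.get(key)
        if tree is None:
            self.misses += 1
            tree = ast.parse(source)   # only parse content not seen before
            self._trees[key] = tree
        else:
            self.hits += 1
        return tree

cache = AstCache()
first = cache.parse("x = 1")
second = cache.parse("x = 1")    # unchanged content: served from the cache
third = cache.parse("x = 2")     # changed content: parsed again

print(first is second, cache.hits, cache.misses)  # True 1 2
```

Keying on content rather than file path or modification time means a file that is touched but not changed still hits the cache, which suits CI runners that check out fresh copies on every run.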
