Vulnerability Analysis with Large Language Models
This project addresses a critical bottleneck in software security: the “semantic gap” between the heuristic pattern matching of Large Language Models (LLMs) and the rigorous requirements of program analysis. We are building a hybrid framework that pairs LLMs with the precision of established program analysis techniques: Code Property Graphs (CPGs), LLVM Intermediate Representation (IR), symbolic execution, and binary analysis.
Our goal is to move beyond simple bug detection toward autonomous systems capable of root cause analysis, exploit generation, and proactive code hardening.
The Methodology: Hybrid Intelligence
We combine the intuitive reasoning of LLMs with structured program analysis tools to provide a comprehensive security audit of complex, repository-scale software.
- Context-Aware Slicing: Using CPG-guided techniques to reduce code noise by ~90% while preserving inter-procedural data flows.
- Formal Verification at Scale: Bridging the “realism gap” by expanding real-world code seeds into formally verified, compilable datasets via Bounded Model Checking.
- Multi-Layer Reasoning: Analyzing programs across the entire stack, from high-level source code through LLVM IR down to compiled binaries, with angr used for binary-level analysis.
- Agentic Exploration: Designing Model Context Protocol (MCP) servers to allow LLMs to autonomously navigate codebases, track taints, and perform semantic queries.
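To make the slicing idea concrete, here is a minimal sketch of a backward slice over a dependency graph. It is an illustration only, not the project's implementation: the `DEPS` graph, its statement labels, and the `backward_slice` helper are all hypothetical, and a real CPG carries far richer edges (control flow, call edges, types) than this toy mapping.

```python
from collections import deque

def backward_slice(deps, criterion):
    """Return every statement the criterion transitively depends on."""
    seen = {criterion}
    queue = deque([criterion])
    while queue:
        node = queue.popleft()
        for pred in deps.get(node, ()):
            if pred not in seen:
                seen.add(pred)
                queue.append(pred)
    return seen

# Hypothetical data-dependency edges: statement -> statements it depends on.
DEPS = {
    "memcpy(buf, src, n)": ["n = parse_len(pkt)", "buf = alloc(64)", "src = pkt.body"],
    "n = parse_len(pkt)": ["pkt = recv()"],
    "src = pkt.body": ["pkt = recv()"],
    "buf = alloc(64)": [],
    "pkt = recv()": [],
    "log('started')": [],
    "stats.count += 1": [],
    "ui.refresh()": [],
}

# Slice backward from the potentially dangerous sink.
slice_nodes = backward_slice(DEPS, "memcpy(buf, src, n)")
# Fraction of statements dropped from the context handed to the LLM.
reduction = 1 - len(slice_nodes) / len(DEPS)

if __name__ == "__main__":
    for stmt in sorted(slice_nodes):
        print(stmt)
    print(f"statements removed: {reduction:.0%}")
```

Only the statements that can influence the `memcpy` call survive, while logging, statistics, and UI code are dropped. The reduction in this toy graph is small because the graph is tiny; the point is the mechanism, not the ratio.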
The Horizon: Autonomous Defense
The project is expanding into proactive and generative security tasks:
- Autonomous Root Cause Analysis: Tracing crashes back to fundamental design flaws using symbolic execution and LLM-guided reasoning.
- Automated Exploit Generation: Generating proof-of-concept exploits to validate the severity of discovered vulnerabilities.
- Proactive Open-Source Rewriting: Systematically refactoring critical infrastructure to replace legacy, memory-unsafe patterns with hardened equivalents.
- Zero-Shot Patching: Generating “correct-by-construction” patches that address the underlying vulnerability logic without introducing regressions.
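One way to picture what "without introducing regressions" could mean in practice is a bounded exhaustive check of a patched guard against the original. The sketch below is a simplified stand-in for bounded model checking, not the project's tooling: it enumerates a small input domain in plain Python instead of using a solver, and `BUF_SIZE`, `original_guard`, `patched_guard`, and `check_guard` are hypothetical names chosen for illustration.

```python
from itertools import product

BUF_SIZE = 8  # hypothetical fixed buffer size

def write_is_in_bounds(offset, length):
    """Ground truth: every index touched by the write lies inside the buffer."""
    return all(0 <= i < BUF_SIZE for i in range(offset, offset + length))

def original_guard(offset, length):
    """Legacy check: ignores the offset, so it accepts out-of-bounds writes."""
    return length <= BUF_SIZE

def patched_guard(offset, length):
    """Candidate patch: zero-length writes touch no memory and stay accepted."""
    return length == 0 or offset + length <= BUF_SIZE

def check_guard(guard, bound):
    """Enumerate all (offset, length) pairs below `bound` and collect violations."""
    unsafe_accepted = []  # inputs the guard lets through that overflow the buffer
    regressions = []      # safe inputs the original accepted but this guard rejects
    for offset, length in product(range(bound), repeat=2):
        accepted = guard(offset, length)
        safe = write_is_in_bounds(offset, length)
        if accepted and not safe:
            unsafe_accepted.append((offset, length))
        if safe and original_guard(offset, length) and not accepted:
            regressions.append((offset, length))
    return unsafe_accepted, regressions

if __name__ == "__main__":
    bad, _ = check_guard(original_guard, 2 * BUF_SIZE)
    print("original accepts unsafe inputs:", len(bad))
    print("patched:", check_guard(patched_guard, 2 * BUF_SIZE))
```

Within the chosen bound, the check separates two properties: the patch must reject every unsafe input, and it must not reject any safe input that the original code accepted. A solver-backed model checker would establish the same two properties symbolically rather than by enumeration.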
Security Discoveries:
- CVE-2025-6491 (php-src)
- CVE-2025-6021 (libxml2)
- CVE-2025-6170 (libxml2)
Team