ZeroLeaks/zeroleaks
AI Security Scanner - Test your AI systems for prompt injection and extraction vulnerabilities
Stars: 539
Forks: 86
Watchers: 9
Open Issues: 0
Safety Rating A
No hardcoded secrets, malicious code patterns, or dependency vulnerabilities are evident from the repository content. The project is a legitimate, research-backed security testing tool for LLM systems, analogous to a penetration testing framework. The README contains no prompt injection attempts targeting analysts. The attack techniques documented are standard academic and CVE-referenced methods used in defensive security research. The dual-use nature (simulating attacks) is intentional and clearly scoped to authorized testing of one's own systems.
AI-assisted review, not a professional security audit.
AI Analysis
ZeroLeaks is an autonomous AI security scanner built in TypeScript that tests LLM-based systems for prompt injection and system prompt extraction vulnerabilities. It uses a multi-agent architecture (Strategist, Attacker, Evaluator, Mutator, Inspector, Orchestrator) and implements research-backed attack techniques such as Tree of Attacks with Pruning (TAP), Crescendo, Many-Shot, Chain-of-Thought Hijacking, Policy Puppetry, and TombRaider patterns to simulate real-world adversarial attacks against AI systems.
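The search-with-pruning idea behind a technique such as TAP can be illustrated with a small, self-contained TypeScript sketch. This is not ZeroLeaks' implementation or API. The names `Candidate`, `SearchOptions`, and `treeSearchWithPruning` are hypothetical, the attacker and evaluator are toy stand-in functions rather than LLM agents, and the loop is a simplified beam-style search that omits TAP's off-topic pruning step and any call to a target model.

```typescript
// Hypothetical types for illustration; none of these names come from ZeroLeaks.
interface Candidate {
  prompt: string;
  score: number; // evaluator's estimate of success, in the range 0..1
  depth: number; // how many refinement rounds produced this candidate
}

// Stand-ins for the Attacker and Evaluator roles (assumed shapes).
type Attacker = (parent: string, branching: number) => string[];
type Evaluator = (prompt: string) => number;

interface SearchOptions {
  branching: number;        // children generated per surviving node
  width: number;            // nodes kept after pruning at each depth
  maxDepth: number;         // hard limit on refinement rounds
  successThreshold: number; // stop once a candidate reaches this score
}

function treeSearchWithPruning(
  seed: string,
  attack: Attacker,
  evaluate: Evaluator,
  opts: SearchOptions
): Candidate {
  const root: Candidate = { prompt: seed, score: evaluate(seed), depth: 0 };
  let frontier: Candidate[] = [root];
  let best: Candidate = root;

  for (let depth = 1; depth <= opts.maxDepth; depth++) {
    // Expand: every surviving node proposes `branching` refinements.
    const children: Candidate[] = [];
    for (const node of frontier) {
      for (const prompt of attack(node.prompt, opts.branching)) {
        children.push({ prompt, score: evaluate(prompt), depth });
      }
    }
    if (children.length === 0) break;

    // Prune: keep only the `width` highest-scoring children.
    children.sort((a, b) => b.score - a.score);
    frontier = children.slice(0, opts.width);

    const top = frontier[0];
    if (top !== undefined && top.score > best.score) best = top;
    if (best.score >= opts.successThreshold) break;
  }
  return best;
}

// Toy, deterministic stand-ins so the sketch runs without any model:
// the "attacker" appends a letter, the "evaluator" rewards the letter "a".
const toyAttacker: Attacker = (parent, branching) =>
  Array.from({ length: branching }, (_, i) => parent + String.fromCharCode(97 + i));
const toyEvaluator: Evaluator = (prompt) =>
  Math.min(1, (prompt.split("a").length - 1) / 3);

const result = treeSearchWithPruning("", toyAttacker, toyEvaluator, {
  branching: 2,
  width: 1,
  maxDepth: 5,
  successThreshold: 1,
});
console.log(result);
```

With these toy functions the search keeps the branch that accumulates the letter "a" and stops at depth 3, before reaching `maxDepth`. In a real scanner the attacker and evaluator would presumably be model-backed agents. They are deterministic here only so that the expand, score, and prune steps are easy to follow.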
Use Cases
- Testing LLM applications for system prompt extraction vulnerabilities before deployment
- Performing red-team assessments on AI chatbots and assistants
- Integrating automated prompt injection testing into CI/CD pipelines
- Defense fingerprinting to identify specific guardrail systems in production AI
- Researching and benchmarking LLM security posture using standardized attack techniques
Project Connections
Decepticon
Decepticon conducts autonomous full-chain penetration tests at the network and application layer; ZeroLeaks specializes in AI-layer attacks: prompt injection, system prompt extraction, and adversarial LLM manipulation. They cover complementary attack surfaces in modern AI-integrated systems.
Strix
Both use multi-agent architectures for automated security testing with a developer-first CLI. Strix targets traditional application vulnerabilities in web apps, APIs, and codebases with auto-fix; ZeroLeaks targets LLM-specific threats using adversarial attack chains like TAP, Crescendo, and Policy Puppetry.
Kavach
Kavach monitors and restrains AI agents at the operating system level with cryptographic audit logs and honeypot tripwires. ZeroLeaks tests AI systems for prompt-level vulnerabilities. Together they provide defense in depth: prompt-layer vulnerability assessment paired with runtime behavioral enforcement.