SuperagentLM
The Defender Model

A small language model trained specifically to catch unsafe AI-generated code that general-purpose LLMs miss.

Apache 2.0 License • Hugging Face

Malicious Code Detection: Accuracy

Model                 Accuracy
Superagent-lm-270m    98%
Gemini 2.5 Pro        97%
GPT-5                 94.5%
Sonnet-4              37%
Opus 4.1              24.5%

Benchmark — higher is better

Built for Defense

SuperagentLM powers the AI Firewall, providing reasoning-driven protection against the three core AI security threats: prompt injections, data leaks, and backdoors.

Reasoning

Detects unsafe AI-generated code (prompt injections, data leaks, and backdoors) by reasoning about context and intent rather than matching patterns.

Fast

Just 100-500ms of security analysis — minimal overhead for comprehensive protection.

Up to date

Continuously fine-tuned on proprietary attack data to stay ahead of new exploits.

Frequently Asked Questions

Everything you need to know about SuperagentLM

What is SuperagentLM?

SuperagentLM is a 270M parameter language model fine-tuned specifically for detecting unsafe AI-generated code. It powers the Superagent AI Firewall and can identify prompt injections, backdoors, and data leaks through reasoning rather than pattern matching.

What is the model architecture?

SuperagentLM is built on the unsloth/gemma-3-270m-it foundation with 270 million parameters, fine-tuned using LoRA (Low-Rank Adaptation) on proprietary security-focused data. The model is distributed in GGUF format with Q8_0 quantization at just 292 MB, making it compatible with llama.cpp and Ollama for CPU-only inference on standard hardware.
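The 292 MB figure is close to what Q8_0 arithmetic predicts. The sketch below is a back-of-the-envelope check, not the project's own calculation. It assumes the ggml Q8_0 block layout (32 weights stored as 32 int8 values plus one 16-bit scale, so 34 bytes per block) and treats the parameter count as exactly 270 million. GGUF metadata and any tensors kept at other precisions are ignored, so only approximate agreement should be expected.

```python
# Rough size estimate for a Q8_0-quantized model.
# Assumption: every weight is stored in Q8_0 blocks; file metadata and any
# tensors kept at other precisions are ignored.

PARAMS = 270_000_000      # parameter count as stated in the FAQ (rounded)
BLOCK_WEIGHTS = 32        # weights per Q8_0 block
BLOCK_BYTES = 32 + 2      # 32 int8 values + one 16-bit scale per block


def q8_0_size_bytes(n_params: int) -> int:
    """Bytes needed to store n_params weights in Q8_0 blocks."""
    n_blocks = -(-n_params // BLOCK_WEIGHTS)  # ceiling division
    return n_blocks * BLOCK_BYTES


size = q8_0_size_bytes(PARAMS)
bits_per_weight = BLOCK_BYTES * 8 / BLOCK_WEIGHTS

print(f"{bits_per_weight} bits per weight")
print(f"{size / 1e6:.1f} MB estimated")
```

Under these assumptions the estimate comes to about 287 MB, within a few percent of the published 292 MB.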

How does it differ from other security models?

Unlike rule-based systems, SuperagentLM uses reasoning to understand context and intent. This allows it to catch novel attack patterns and sophisticated threats that static filters miss, while keeping analysis time in the 100-500ms range.

What are the hardware requirements?

SuperagentLM is optimized for CPU-only inference with minimal memory requirements (292 MB). It can run on standard servers, edge devices, and even consumer hardware without GPU acceleration.

How do I integrate SuperagentLM?

You can download SuperagentLM from Hugging Face and run it locally with llama.cpp or Ollama. For production use, the hosted AI Firewall provides a managed solution with automatic scaling.
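As a rough illustration of the local route, the sketch below sends a code snippet to a locally running Ollama server through its `/api/generate` endpoint, using only the Python standard library. The model name `superagent-lm`, the prompt wording, and the helper names `build_request` and `analyze` are assumptions made for this example. Use the name you gave the model when importing the GGUF file, and check the model card for the expected prompt template.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_NAME = "superagent-lm"  # hypothetical name; use whatever you registered locally


def build_request(snippet: str, model: str = MODEL_NAME) -> urllib.request.Request:
    """Build a non-streaming generate request asking the model to assess a snippet."""
    payload = {
        "model": model,
        # Assumed prompt wording; the model card may prescribe a different template.
        "prompt": f"Analyze the following code for security threats:\n\n{snippet}",
        "stream": False,
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(snippet: str, timeout: float = 5.0) -> str:
    """Send the snippet to a running Ollama server and return the raw model text."""
    with urllib.request.urlopen(build_request(snippet), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `analyze("import os; os.system(user_input)")` requires a running Ollama server with the model loaded. The return value is the model's raw text, and how to turn it into an allow or block decision depends on the model's output format, which is not specified here.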

What license is SuperagentLM under?

SuperagentLM is released under the Apache 2.0 license, allowing both commercial and non-commercial use. You can freely use, modify, and distribute the model in your applications.

How does SuperagentLM compare to Azure Content Safety or Llama Guard?

Azure Content Safety and Llama Guard are content moderation tools for conversational AI—they classify prompts and responses for harmful content (hate, violence, sexual content). SuperagentLM protects AI agents by analyzing their actions—catching malicious tool calls and data leaks. Read more in our comparison blog post.
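To make the distinction concrete, here is a minimal sketch of what checking an agent's actions, rather than its conversation, could look like. Everything in it is hypothetical: `ToolCall`, `guarded_execute`, and `BlockedToolCall` are names invented for this example, and `demo_classifier` is a trivial stand-in for the model, not how SuperagentLM or the AI Firewall actually decides.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolCall:
    """A hypothetical record of an action an agent wants to take."""
    tool: str
    arguments: dict


class BlockedToolCall(Exception):
    """Raised when the classifier flags a tool call."""


def guarded_execute(
    call: ToolCall,
    is_malicious: Callable[[ToolCall], bool],
    execute: Callable[[ToolCall], str],
) -> str:
    """Run the tool only if the classifier does not flag the call."""
    if is_malicious(call):
        raise BlockedToolCall(f"blocked call to {call.tool!r}")
    return execute(call)


def demo_classifier(call: ToolCall) -> bool:
    """Stand-in for the demo only; a real deployment would ask the model."""
    return "curl" in str(call.arguments.get("command", ""))
```

The point of the shape is that the check sits between the agent's decision and the tool's execution, so a flagged call never runs. Swapping `demo_classifier` for a function that queries the model keeps the rest unchanged.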

How is the model kept up to date?

SuperagentLM is continuously fine-tuned on proprietary attack data to stay ahead of new exploits and attack patterns. The model is regularly updated to maintain effectiveness against evolving AI security threats.

Defend your AI with SuperagentLM

Deploy the defender model in your infrastructure or try it in production with our hosted AI Firewall.

Open Source • Apache 2.0 License