A person's hands on a laptop and tablet, overlaid with digital interfaces and the letters ‘AI,’ visualize the security analysis and penetration testing of AI and LLM systems.

Pentest of AI/LLM Systems: How to Secure Enterprise AI Systems

22. April 2026

AI | News | Pentests & Security Analyses | usd HeroLab

Artificial intelligence is already widely used across many organizations. Chatbots access internal knowledge, AI supports decision‑making or controls automated processes. As adoption increases, so does the attack surface, because AI applications behave fundamentally differently from traditional software.

From a technical perspective, such applications do not consist of a language model alone. They combine prompt processing, retrieval mechanisms, (multi-)agent functionality, integrated tools, and downstream processing of model outputs in connected systems. As a result, attacks often do not target a single code‑level vulnerability, but instead aim to manipulate system behavior through direct inputs or indirect context manipulation.

Florian Kimmes, Senior Security Analyst at usd AG and an expert in AI security analysis at usd HeroLab, explains why AI and LLM systems require their own assessment approaches, which risks are particularly relevant in practice, and how these risks can be reliably assessed.

Why Do AI/LLM Systems Raise New Security Issues for Companies?

AI‑based applications and large language models (LLMs) are becoming integral to operational processes and increasingly take on tasks with direct business and regulatory relevance. They generate externally visible content, process sensitive information, or trigger automated actions. As a result, they become part of the production IT environment with direct impact on data, processes, and business risks.

This role fundamentally changes the security landscape. AI systems do not behave deterministically; their decisions are context‑dependent and probabilistic. For organizations, the key question therefore is how far a system can be influenced: how effectively the behavior of an LLM can be technically constrained, made transparent, and rendered verifiable. This is where traditional security testing approaches quickly reach their limits.

These differences stem from the technical architecture of AI applications:

The system’s control logic is heavily shaped by the stochastic nature of the LLM. The model makes the critical decisions within the application, increasing productivity but also risk.
All connected data sources, such as those accessed through retrieval mechanisms, are directly available to the model and therefore potentially reachable by attackers.
Complex agents with extended toolsets often hold far‑reaching privileges within the IT environment.

In these architectures, attackers exploit fewer classic vulnerabilities. Instead, they influence the system through inputs, context, and interactions in order to steer the model toward unintended behavior. What matters, therefore, is not the language model in isolation, but the interaction between the model, prompting, data sources, connected tools, APIs, and the downstream processing of generated outputs.

What Are Common Vulnerabilities in AI/LLM Systems?

From an architectural perspective, recurring attack patterns and weaknesses emerge in practice. They do not occur at isolated points, but along the entire technical processing chain. This is exactly where a pentest of AI/LLM systems comes into play.

Attackers use prompt injections to bypass safeguards or to steer the model toward unintended actions or responses.

Through retrieval mechanisms, LLMs are often granted access to sensitive corporate data. Attackers exploit this by deliberately manipulating external or internal content in order to exfiltrate sensitive information via the LLM.

The situation becomes particularly critical when AI agents are granted excessive privileges. In such cases, the model can not only generate content but actively trigger actions, such as API calls, system modifications, or automated workflows. This allows attackers to expand their access within the IT environment.

In addition, resource‑based attacks target the budgets or capacity limits of compute‑intensive LLM infrastructures. Carefully crafted inputs can rapidly exhaust these resources, significantly impacting costs or availability. Since AI/LLM applications are often provided as web, mobile, or API‑based services, classic vulnerabilities also remain a relevant factor. Especially in this rapidly evolving environment, development teams operate under high time pressure, increasing the likelihood of security‑critical mistakes.

How Can Risks in AI/LLM Systems Be Made Measurable?

Assessing the severity of vulnerabilities in AI/LLM systems requires more than documenting isolated findings. Depending on the vulnerability category, it is often crucial how stable and reproducible unwanted behavior can be triggered. This is where a quantitative approach comes into play.

A pentest of AI/LLM systems therefore complements qualitative analyses with targeted measurements. Attacks are systematically repeated and varied in order to evaluate how reliably they actually work. Among other aspects, the following are measured:

How frequently an attack is successful
How consistently unwanted behavior can be reproduced

Metrics such as the Attack Success Rate (ASR) indicate the likelihood with which attacks succeed. This is particularly relevant when a vulnerability can be leveraged to target other users; in such cases, the attack must succeed with near‑absolute certainty at the moment the victim interacts with it. pass@k describes the probability that an attack will succeed at least once across multiple attempts. This metric is relevant for all severe technical vulnerabilities where attackers have many attempts available, for example, when using LLM output as a stepping stone to compromise a downstream system.

These metrics make risks comparable, prioritizable, and assessable over time. They enable realistic risk evaluation within vulnerability management and support targeted mitigation.

How Does Threat Modeling Lead to a Realistic, Verifiable Attack Scenario?

To ensure that a pentest of AI/LLM systems is not based on assumptions, it typically begins with extended threat modeling tailored specifically to AI architectures. Development around LLM‑based applications is moving rapidly, and many organizations still struggle to assess risks and define relevant testing objectives. A purely “black‑box” approach would be particularly inefficient given the stochastic nature of these models. Instead, we jointly analyze the actual application architecture and derive concrete, realistic attack paths from it.

As part of threat modeling, we first define the key boundary conditions and attack surfaces, such as application exposure, data sources, and integrated tools. Building on this, we conduct a data‑flow analysis across the processing chain:

Which components accept external input (for example user input, external content, or documents)?
Where is this information stored or enriched?
And at which points is it incorporated into prompts?

In practice, these trust boundaries are where viable attack scenarios emerge, such as prompt injections via external content, data leakage through retrieval mechanisms, or the misuse of tools by over‑privileged agents. Based on the identified attack paths, we derive concrete test cases and prioritize any discovered vulnerabilities according to business context and technical risk. This enables targeted, transparent, and reproducible testing.

How Should Terms Such as AI Red Teaming and Pentests of AI/LLM Systems Be Understood?

Whether referred to as AI Red Teaming, LLM pentesting, pentests of AI/LLM systems or security assessments of AI systems, the core objective remains the same: identifying real vulnerabilities, making risks transparent, and enabling the secure operation of AI applications.

Internationally, the term AI Red Teaming is becoming established to describe targeted security testing of AI systems. In German‑speaking regions, similar assessments are most commonly framed as pentests.

The difference lies less in the goal than in the approach. A pentest of AI/LLM systems follows a structured and verifiable methodology with clearly defined procedures and reproducible results. It provides a reliable foundation for security decision‑making, risk management, and audit preparation in the context of production‑grade AI use.

Looking to assess the real risks in your AI/LLM systems? Get in touch with us to discuss the right testing approach for your specific use case.

Also interesting:

A5: BSI Publishes New Audit Framework for Trustworthy AI

Jul 9, 2026

The German Federal Office for Information Security (BSI) has published the Community Draft of the AI Audit and Assurance Assessment Architecture (A5) (currently available in German only), thus starting the consultation phase. A5 is intended to create a standardized...

AI Vulnerability Storm: When a Lack of Speed Becomes a Risk

Jul 3, 2026

Current developments in AI systems show that vulnerabilities are found more quickly, suitable exploits are developed more quickly, and attacks are increasingly implemented automatically. This significantly reduces the time between discovery and exploitation. What used...

Bafin Publishes 9th Amendment to MaRisk

Jul 1, 2026

On 30 June 2026, all credit institutions and financial services institutions in Germany received an important circular: Following the consultation phase, the German Federal Financial Supervisory Authority (Bafin) published the 9th amendment to its Minimum Requirements...

Pentest of AI/LLM Systems: How to Secure Enterprise AI Systems

Why Do AI/LLM Systems Raise New Security Issues for Companies?

What Are Common Vulnerabilities in AI/LLM Systems?

How Can Risks in AI/LLM Systems Be Made Measurable?

How Does Threat Modeling Lead to a Realistic, Verifiable Attack Scenario?

How Should Terms Such as AI Red Teaming and Pentests of AI/LLM Systems Be Understood?

Also interesting:

A5: BSI Publishes New Audit Framework for Trustworthy AI

AI Vulnerability Storm: When a Lack of Speed Becomes a Risk

Bafin Publishes 9th Amendment to MaRisk

Categories

Categories

usd AG

News

A5: BSI Publishes New Audit Framework for Trustworthy AI

AI Vulnerability Storm: When a Lack of Speed Becomes a Risk

Bafin Publishes 9th Amendment to MaRisk

Follow Us