TL;DR

AI-specific attacks are not theoretical. Model theft, prompt injection, and data poisoning are happening now. This post covers the four threat categories every business leader needs to understand, with specific tools, real attack examples, and the governance framework to protect your AI investments.

The New Threat Landscape

Your organisation just spent $2 million fine-tuning a model on proprietary data. That model is now a piece of intellectual property sitting on a server, accessible through an API, and probably not protected by anything more than a rate limiter. Attackers know this. They are not coming for your databases. They are coming for your models.

Four AI-specific threat categories now sit at the top of every security risk register. Here they are, with what actually happens and what to do about it.

1. AI-Powered Phishing and Deepfake Social Engineering

The attack: Criminals are using generative AI to clone voices, generate convincing phishing emails at scale, and impersonate executives on video calls. These are not grainy, laggy fakes from 2023. Current deepfake tools produce real-time video with lip sync accurate enough to fool finance teams.

Real example: In February 2024, a multinational firm in Hong Kong lost $25 million after an employee attended a video call with what appeared to be the CFO and several colleagues. Every participant on the call was a deepfake. The employee authorised the transfer.

The numbers: The FBI's Internet Crime Complaint Center reported that business email compromise losses exceeded $2.9 billion in 2023, with AI-generated content now accelerating the volume and sophistication of these attacks. Deepfake-related fraud incidents grew 3,000% between 2022 and 2024 according to identity verification provider Onfido.

What to do:

  • Implement out-of-band verification for any financial transfer over $10,000. A phone call to a known number, not the one in the email.
  • Deploy deepfake detection tools like Reality Defender or Intel's FakeCatcher for video call verification on sensitive meetings.
  • Train finance and HR teams specifically on AI-generated phishing. Traditional phishing training does not cover generative AI tactics.

2. Prompt Injection and AI Agent Security

The attack: Your company deploys an AI agent that reads emails, summarises documents, or accesses internal systems. An attacker sends a carefully crafted message that overrides the agent's instructions, making it exfiltrate data or execute unauthorised actions.

Real example: In 2024, a researcher demonstrated that simply embedding invisible text in a webpage, white text on a white background, could cause AI assistants reading that page to inject malicious instructions. Multiple production AI agents were shown to be vulnerable to this class of attack. The technique works because the model sees all text equally regardless of rendering.

The numbers: The OWASP Top 10 for LLM Applications lists prompt injection as the number one vulnerability. Indirect prompt injection, where poisoned data sits in documents the AI later retrieves, is listed as a separate entry because the attack surface is entirely different from direct chat injection.

What to do:

  • Never give an AI agent access to systems it does not strictly need. If an agent only needs to read a database, it gets read-only credentials.
  • Implement input and output guardrails using tools like NVIDIA NeMo Guardrails or Guardrails AI. These sit between the model and the world, validating both what comes in and what goes out.
  • Treat every piece of data the agent ingests, emails, web pages, documents, as potentially hostile. Sanitise before the model sees it.

3. Model Theft and Intellectual Property Extraction

The attack: Attackers query your model's API thousands of times and use the responses to train a clone. This is not theoretical. Model extraction attacks have been demonstrated against commercial APIs from OpenAI, Anthropic, and others. The cloned model performs similarly to the original but costs the attacker nothing to own.

Real example: In 2023, researchers extracted a functional clone of a production language model using fewer than $1,000 worth of API queries. The technique, called model stealing via query-based distillation, required no internal access. Just the public API.

The numbers: Training a frontier model costs between $10 million and $100 million. Fine-tuning a specialised model on proprietary data can cost $100,000 to $500,000. An attacker can extract a useful clone for under $5,000 in API costs using systematic querying techniques. The economics of theft heavily favour the attacker.

What to do:

  • Implement query-level monitoring with anomaly detection. A single API key making 10,000 queries in an hour with systematically varying prompts is not a user. It is an extraction attempt.
  • Use response watermarking or fingerprinting where feasible. Tools like model watermarking embed detectable patterns in model outputs that survive distillation.
  • Rate limit aggressively and log every query. If you cannot detect extraction, you cannot stop it.

4. Data Poisoning and Supply Chain Attacks

The attack: Your model is only as good as its training data. Attackers poison public datasets, compromise third-party fine-tuning services, or inject malicious examples that create backdoors in the model's behaviour. When a specific trigger phrase appears, the poisoned model behaves in attacker-controlled ways.

Real example: In 2024, researchers demonstrated that poisoning just 0.01% of a training dataset could create reliable backdoors in image classification models. For language models, poisoning instruction-tuning data with as few as 100 malicious examples created persistent unwanted behaviours that survived subsequent fine-tuning.

The numbers: The cost to poison a moderately popular open dataset, through submitting malicious contributions to public repositories, has been estimated at under $500. The cost to remediate a discovered poisoned model can exceed $100,000 in retraining and validation alone.

What to do:

  • Vet every data source. If you are fine-tuning on scraped web data, you are fine-tuning on attacker-controlled data. Use curated, verified datasets where possible.
  • Implement data provenance tracking. Know where every training example came from and maintain the ability to trace model behaviour back to its source data.
  • Run adversarial validation on training data. Tools like TextFooler and the Adversarial Robustness Toolbox from IBM can help detect poisoning attempts before they reach training.

5. The Governance Framework Businesses Actually Need

The problem: Most organisations have no AI-specific security governance. Their existing infosec policies were written before language models existed. The gap is not theoretical. It is already being exploited.

The framework: The NIST AI Risk Management Framework, released in January 2023 and updated through 2025, provides the most practical starting point. It organises AI risk into four functions: Govern, Map, Measure, and Manage. Pair it with the OWASP Top 10 for LLM Applications for the technical controls.

What a minimum viable AI security program looks like:

  • An inventory of every AI model in the organisation, including shadow AI where employees use unapproved tools. If you do not know it exists, you cannot secure it.
  • A risk assessment for each model covering the four threat categories above. Not a checkbox exercise. Actual assessment by someone who understands the attacks.
  • Technical controls: API rate limiting with anomaly detection, input/output filtering, data provenance tracking, and out-of-band verification for high-risk actions.
  • An incident response plan that covers AI-specific scenarios. If your model is stolen tomorrow, who gets called and what do they do?

The cost reality: A basic AI governance program for a mid-market company, including tooling, assessment, and process implementation, runs $30,000 to $80,000. The average cost of a data breach in Australia, according to IBM's 2024 Cost of a Data Breach report, is $4.2 million. The maths is straightforward.

FAQ

Q: Our company is not building AI models. Do we still need to worry about these threats?

Yes. If your employees use ChatGPT, Copilot, or any AI tool, you face prompt injection and data exfiltration risks. If your executives appear in public videos, they are vulnerable to deepfake cloning. AI security is not just for AI companies. It is for any company whose employees use AI, which is now every company.

Q: How do we know if someone is trying to steal our model through the API?

Monitor query patterns. Extraction attacks look different from normal usage. They involve systematic variation of prompts, high query volumes, and attempts to elicit maximal information from each response. If you see these patterns, investigate immediately.

Q: What is the single most impactful thing we can do this week?

Create an inventory of every AI tool and model in your organisation. Include shadow AI. You cannot protect what you do not know exists. This is a one-day exercise for most companies and it surfaces risks that are invisible to leadership.

Q: Are there insurance products for AI-specific risks?

Yes. Major cyber insurers, including AIG, AXA XL, and Beazley, now offer AI-specific endorsements covering model theft, AI-driven social engineering fraud, and algorithmic liability. Premiums are evolving as the risk is new, but coverage exists. Ask your broker specifically about AI endorsements.

Conclusion

AI-specific attacks are not a future problem. Model theft is happening now. Prompt injection is trivial to execute. Deepfake social engineering has already caused multi-million-dollar losses. The tools to defend against these threats exist. The governance frameworks exist. What is missing in most organisations is awareness and action.

Start with the inventory. Assess your exposure across the four threat categories. Implement the technical controls that match your risk level. Build the incident response plan before you need it.

Visit consult.lil.business for a free cybersecurity assessment. We will help you map your AI attack surface and build the controls to protect your models, your data, and your business.

References

  1. NIST AI Risk Management Framework
  2. OWASP Top 10 for LLM Applications
  3. IBM Cost of a Data Breach Report 2024
  4. FBI Internet Crime Report 2023 — Business Email Compromise
  5. ACSC Guidelines for Secure AI System Development

Ready to strengthen your security?

Talk to lilMONSTER. We assess your risks, build the tools, and stay with you after the engagement ends. No clipboard-and-leave consulting.

Get a Free Consultation