Someone in your company connected an AI tool to internal data this week. There was no approval and no bad intent. They found a plugin that promised to save an afternoon, installed it, and it worked. The part worth worrying about is everything it did beyond the part they noticed.
Adoption of AI assistants, skills, and plugins is running well ahead of any ability to check them. These tools are being wired into email, documents, code, and customer records at a pace that leaves no room for the question that matters most. "What else can this thing reach, and who decided it was safe?"
The software supply chain already taught this lesson
Modern software is assembled from parts other people wrote. A single application can depend on thousands of free components, almost none of which anyone on the team has read. The arrangement usually holds and occasionally fails badly. In September 2025, attackers phished the maintainers of several enormously popular code packages and slipped in malicious code that reached billions of weekly downloads before anyone caught it.
That episode was the warm-up. AI repeats the same pattern, except the components are far more capable and the people adding them sit further from any understanding of what they do.
AI tools fail in ways ordinary software does not
A compromised software library hides its danger in code, and code can at least be read by someone with the right skills. AI tools strip away even that protection, in two ways worth understanding.
1. Malicious instructions can hide inside a tool's description
AI assistants act on instructions written in plain language. A hostile tool does not need to smuggle in clever code. It can place hidden instructions in its own description, which the assistant then reads and obeys, often without anyone deliberately running it. Researchers call this 'tool poisoning'. A 2025 study of roughly 1,900 of these AI connectors found that about one in eighteen already carried the flaw. Controlled testing went further. Across a benchmark built on 45 live servers and 353 genuine tools, 20 leading AI agents fell for poisoned instructions in 36.5% of attempts on average, and in more than 60% of attempts for several widely used models. There is no reviewing your way out of a threat that lives in the instructions themselves.
2. An approved tool can change its behaviour after installation
A tool can be vetted, approved, and then quietly altered later. Security researchers documented precisely this in 2025. A connector for the Postmark email service behaved normally across fifteen releases and earned the trust of the developers using it. Its sixteenth release added a single line that copied every email it handled to an address the attacker controlled . The technique has a name now, 'the rug pull', and a catalogued vulnerability under CVE-2025-54136 . Whatever your team checked is not guaranteed to be what runs tomorrow.
The people installing AI tools cannot evaluate them
Here is the part that should give any leader pause. The appeal of these tools is that using them takes no technical skill. The same quality leaves the person adding them least able to judge whether they are safe.
A new AI skill gets judged the way a phone app does. Does it look useful. Does it have good reviews. Does it run. Where it came from and what it can reach go unexamined. To make the tool work, people grant it access to files, logins, and the open internet, because refusing means friction and the danger stays invisible. The access being handed over dwarfs that of an ordinary piece of software, and it is handed over on faith.
Warnings and policies will not stop this
The reflex is to issue guidance. Connect only approved tools. Check before you install. Guidance of this kind fails for the same reason posted speed limits do not end speeding. It asks people to make expert security calls many times a week, while moving quickly, with no way to see the risk in front of them.
A warning is not a control. It is a wish. Safety placed in the hands of people who cannot see what they are deciding will not hold. It has to sit somewhere they cannot accidentally override.
Safety has to be built into the environment
The useful shift is from policing behaviour to constraining the environment. Rather than trusting each person and each tool to act well, you place a floor beneath everything done with AI. That floor governs which data can be touched, what any tool is allowed to do, and what gets recorded. It applies automatically, whoever installed whatever.
The common term for this floor is 'guardrails', though the term matters less than the principle behind it. A guardrail is a control that works without anyone remembering it. The distinction is between asking drivers to stay on the road and building a barrier that keeps them on it.
This is the conclusion we reached operating AI systems for other companies, and the reason our own approach is built around an enforced boundary rather than around any particular model or tool. It runs inside the company's own cloud accounts and treats the safety floor as the product rather than an extra. The mechanics matter less than the idea. The boundary is the asset you own, not the tools passing through it.
How an enforced boundary protects company data
The concrete risk is that customer data, intellectual property, and reputation leave through a tool nobody understood. A properly built boundary closes the common routes, and the mechanics are worth grasping even for those who never operate them.
- Data is protected before any AI sees it
Sensitive information is classified, encrypted, and tagged as it enters, so it is guarded at the door rather than after an exposure. - Tools and agents are restricted to what they need
Instead of granting a plugin whatever it asks for, the boundary fixes exactly what each tool may read and do, replacing open access with a deliberate and narrow one. - Prompts and proprietary data stay out of public models
What staff type into AI, and what comes back, is kept clear of the public models other organisations train on. Proprietary work does not quietly become a competitor's advantage. - Activity is logged and monitored
Every prompt, response, and action is recorded and reviewable, so unusual behaviour surfaces before it becomes an incident rather than after a breach. - Control can be demonstrated
When the controls map to recognised standards and the evidence exists on request, the safety question from a board, a regulator, or an acquirer earns a documented answer rather than an anxious one.
Why being first to adopt is a risk
There is a quieter discipline underneath all of this, and it costs almost nothing to apply. Do not be the first to install the newest version of anything.
Most malicious releases are opportunistic. An attacker compromises a package or a tool, publishes a tampered version, and relies on automatic updates pulling it into thousands of systems before anyone notices. What makes this profitable is a short window, because security scanners and the wider community tend to catch and remove these releases within hours of publication. The organisations that update the instant a new version appears are the ones who absorb the attack. The organisations that wait let everyone else find the problem first.
The practice has a name in software circles. 'A cooldown', or minimum release age, tells your systems to ignore any version published more recently than a set window. The tooling now ships with it and some of it now applies a multi-day cooldown to npm packages by default rather than as something a team has to switch on. The original rule of thumb was around twelve hours, enough to filter the fastest grab-and-run attacks. Our view is that twelve hours is too short. A window measured in days, closer to twelve, catches far more, because not every compromise is found in an afternoon. The Postmark tool described earlier sat trusted across fifteen releases before it turned, and patient attacks of that kind are exactly the ones a few hours will miss.
None of this means leaving known vulnerabilities unpatched. A genuine flaw under active exploitation can justify moving at once. The point is to change the default. Newness should not be confused with safety, and being last to install a compromised release is a fine place to be.
The same logic carries straight into AI. A skill or a connector published yesterday has had no time to be examined by anyone. Waiting before adopting a new tool, and pinning the version already approved so it cannot quietly advance, is the AI equivalent of the same cooldown. It sits alongside the enforced boundary rather than replacing it.
AI governance is becoming a condition of doing business
The pressure here is no longer only technical. Investors increasingly read AI governance the way they read financial controls, as a measure of whether a company is run seriously. Enterprise buyers now put AI safety questions into procurement. Regulators are moving quickly. Companies that can show their AI runs inside a controlled boundary will win contracts and capital more easily than those that cannot.
Building on ground that will still be there
Frontiers get settled. The law arrives, and the businesses that built on solid ground carry on while the rest retrofit under pressure. The advantage was never in moving slowly. It came from moving quickly on top of a boundary already in place, so that speed and safety stopped competing. The question worth holding onto is not whether to use AI. It is whether the ground beneath it will hold once the rules catch up.
Where base2Services leads on this
base2Services builds and operates that boundary as a managed service, and has been ahead of most of the market in treating governed AI as an operations problem rather than a model problem. The AI Factory runs AI inside a company's own AWS accounts, with more than sixty enforced guardrails, continuous monitoring, and a complete audit trail of every AI action. Its Guardrails layer maps to the Australian Voluntary AI Safety Standard and is operated against ISO 27001, SOC 2, and APRA CPS 234, with the controls held in version control and evidence available the moment a board or regulator asks. Sensitive data, prompts, and intellectual property are kept out of public model training, and every tool and agent is held to exactly what it is permitted to touch. This is the same discipline base2Services has applied to managed cloud and security operations for years, now carried into AI.
Build the Boundary Before AI Breaks Through
If AI is entering your business faster than you can govern it, the boundary is worth putting in place before an incident puts it there for you. See how base2Services builds that boundary with AI Factory and Guardrails, or talk to the team about what governed AI should look like in your environment.
With thanks to The Assistant Factory, whose conversation sparked this piece.
References
- base2Services, AI Factory (2026): https://www.base2services.com/artificialintelligence/aifactory/, Guardrails (2026) https://www.base2services.com/artificialintelligence/aifactory/guardrails/
- CISA, "Widespread Supply Chain Compromise Impacting npm Ecosystem" (2025): https://www.cisa.gov/news-events/alerts/2025/09/23/widespread-supply-chain-compromise-impacting-npm-ecosystem
- "Tool poisoning prevalence across open-source MCP servers," arXiv 2509.06572 (2025): https://arxiv.org/abs/2509.06572
- "MCPTox: A Benchmark for Tool Poisoning Attack on Real-World MCP Servers," arXiv 2508.14925 (2025): https://arxiv.org/abs/2508.14925
- The Register, "Fake Postmark MCP npm package stole emails with one-liner" (2025): https://www.theregister.com/2025/09/29/postmark_mcp_server_code_hijacked/
- OWASP, "MCP Security Cheat Sheet": https://cheatsheetseries.owasp.org/cheatsheets/MCP_Security_Cheat_Sheet.html
- Christian Schneider, "Dependency cooldowns: a simple supply chain fix": https://christian-schneider.net/blog/dependency-cooldowns-supply-chain-defense/