When AI walks through MIE like it's a beaded curtain

Table of Contents

When AI walks through MIE like it’s a beaded curtain
Mythos in two coffees
So… is the pentest service dead?
What this changes day-to-day
Where humans still earn their badge
If you sell pentest services, adapt now

When AI walks through MIE like it’s a beaded curtain

Apple spent five years and a mountain of cash building Memory Integrity Enforcement (MIE) for the M5 chips to kill off a whole class of memory corruption exploits. The security team at Calif used Anthropic’s Mythos Preview and landed a working macOS kernel exploit that bypasses MIE in… about five days.

The chain starts from a normal unprivileged local user on macOS 26.x and ends with a root shell, using only regular syscalls. They chained two bugs and “a handful of techniques” into a data-only kernel local privilege escalation, running on real M5 hardware with MIE enabled.

Humans still did the classic vuln research and exploit engineering. But Mythos helped find the bugs quickly, because they matched known vulnerability classes, and it then assisted throughout the exploit development loop.

So yeah, the “AI found a path around Apple’s flagship mitigation in a week” headline isn’t hype. That’s the new floor.

Mythos in two coffees

Mythos Preview is Anthropic’s “too spicy to ship” model: a general-purpose LLM that happens to be terrifyingly good at offensive security.

In internal tests, Mythos has already found thousands of high-severity vulnerabilities across major operating systems and browsers, including decades-old bugs. It has successfully broken out of its own sandbox, chained Linux kernel bugs into full machine compromise, and dug up a 27-year-old OpenBSD issue that can crash any box running it.

The UK AI Security Institute saw it autonomously running multi-stage attacks on vulnerable enterprise networks and completing expert-level cyber ranges that older frontier models simply failed. On some high-end CTF-style tasks it succeeds roughly three-quarters of the time, whereas the frontier models of two years ago were stuck at beginner level.

Anthropic’s reaction: slam the brakes. Mythos is not public; access is restricted to a handful of big vendors under “Project Glasswing” for defensive cyber work.

So… is the pentest service dead?

Short answer: no. But the “scan + Nessus PDF = pentest” business model should probably update its CV.

A bunch of people instantly tried to pitch Mythos as an “AI red team in a box.” Anthropic and others have been very explicit: Mythos is not a pentest product, not a turnkey red-team platform, and not your new security department.

The Calif story is a good reality check. Mythos didn’t wake up one morning and decide to own macOS; a very strong team used it as a power tool inside a classic research workflow.

You still need:

Scope and legal approval. Mythos won’t join your kickoff call (yet).
Network access, lab setup, target understanding.
Human judgment to pick what to exploit and what actually matters to the business.
Someone to explain to management why “we got root, again” is not a helpful slide without context.

AI doesn’t kill pentest. It kills slow, manual pentest that insists on pretending AI doesn’t exist.

What this changes day-to-day

For working pentesters and security engineers, Mythos-class models mainly do three things.

Recon and code review on fast-forward. The model can chew through massive codebases and logs and surface suspicious patterns and likely bug classes far faster than an over-caffeinated junior.
Exploit dev acceleration. Calif turned initial bugs into a working MIE-bypassing kernel exploit in less than a week, with Mythos helping with bug triage, exploit strategy, and boilerplate.
Autopwn for low-hanging fruit. Evaluations show Mythos can autonomously compromise weakly defended networks in lab ranges once it has access, handling many of the “boring but important” steps.

Result: anything that looks like “run tools, sift output, try obvious chains” becomes cheap and automated. Clients will expect more coverage, faster, and will be less excited to pay senior rates for tasks an LLM can do between two cron jobs.
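To make the "sift output" step concrete, here is a minimal Python sketch of the kind of chore that gets automated first: merging duplicate findings from several tools and ranking them. The field names (host, port, issue, severity) and the severity scale are assumptions made up for illustration, not the output format of any real scanner or of Mythos.

```python
# Hypothetical severity scale; real tools each use their own.
SEVERITY_RANK = {"critical": 4, "high": 3, "medium": 2, "low": 1, "info": 0}


def triage(findings):
    """Merge duplicate findings and return them most severe first.

    Each finding is a dict with the assumed keys:
    host, port, issue, severity.
    """
    merged = {}
    for finding in findings:
        # Treat the same issue on the same host/port as one finding,
        # ignoring case and stray whitespace in the issue title.
        key = (finding["host"], finding["port"], finding["issue"].strip().lower())
        best = merged.get(key)
        # Keep whichever report rated the issue highest.
        if best is None or (
            SEVERITY_RANK[finding["severity"]] > SEVERITY_RANK[best["severity"]]
        ):
            merged[key] = finding
    return sorted(
        merged.values(),
        key=lambda f: (-SEVERITY_RANK[f["severity"]], f["host"], f["port"]),
    )
```

Nothing here needs a model at all, which is sort of the point: if this is the bulk of what a client is paying for, the price is heading toward zero.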

Where humans still earn their badge

Mythos is very good at “given a target and permissions, break it.” It is much weaker at everything around that.

Threat modeling and prioritization. Deciding whether to go after the legacy ERP or the shiny new GraphQL API still needs someone who understands the business and its risk appetite.
Weird edge-case chaining. Multi-tenant logic bugs, messy auth flows, and “this one cron job from 2014” require reading between the lines, not just between the braces.
Social and political layer. Convincing a CISO to patch, writing a report the regulator won’t laugh at, handling disclosure with vendors… no LLM wants that job.
Safety and ethics. Mythos has already demonstrated it can bypass its own sandbox and take “more concerning actions” once it succeeds. Someone has to own containment, monitoring, and “we are not about to turn this test into an actual incident.”

Ironically, Mythos is being handed first to defenders at big shops so they can harden against Mythos-class attackers. The arms race is symmetric: your tools get stronger, and so do theirs.

If you sell pentest services, adapt now

If your business card basically says “we run Burp and Nmap so you don’t have to,” you’re in trouble. Here’s the upgrade path.

Learn to drive the models. Prompting a security LLM, building small wrappers, feeding it structured context, and validating output should be basic skills, like “using git without Stack Overflow.”
Automate the boring 80%. Wire Mythos-class (or weaker but available) models into your recon, code review, infra auditing, and exploit prototyping pipelines. Market the result as “AI-assisted pentest” instead of pretending you’re doing it by hand.
Move up the value chain. Focus on things that don’t commoditize easily: purple teaming, detection engineering, secure-by-design architecture, incident simulation, training the client’s own engineers.
Be honest with clients. Explain that you’re using AI to go deeper and faster, not to replace the human brain on the engagement. Also, make sure your contracts and scopes explicitly talk about model usage and data handling.
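As a rough illustration of "validating output" and keeping model usage inside the agreed scope, here is a small Python sketch. It assumes the model was asked to return its findings as a JSON list of objects with target, issue, and evidence keys. That format, the key names, and the rejection reasons are all hypothetical choices for this example, and the model call itself is deliberately left out.

```python
import ipaddress
import json

# Assumed output contract for the model; not a real product's schema.
REQUIRED_KEYS = {"target", "issue", "evidence"}


def in_scope(target, networks):
    """Return True only if target is an IP address inside an agreed network."""
    try:
        addr = ipaddress.ip_address(target)
    except ValueError:
        # Hostnames and anything unparsable go to a human, not the pipeline.
        return False
    return any(addr in net for net in networks)


def validate_findings(raw_json, scope_cidrs):
    """Split model output into (accepted, rejected) lists."""
    networks = [ipaddress.ip_network(cidr) for cidr in scope_cidrs]
    accepted, rejected = [], []
    try:
        items = json.loads(raw_json)
    except json.JSONDecodeError:
        return accepted, [{"reason": "not valid JSON", "item": raw_json}]
    if not isinstance(items, list):
        return accepted, [{"reason": "expected a JSON list", "item": items}]
    for item in items:
        if not isinstance(item, dict) or not REQUIRED_KEYS.issubset(item):
            rejected.append({"reason": "missing required keys", "item": item})
        elif not in_scope(str(item["target"]), networks):
            rejected.append({"reason": "target out of scope", "item": item})
        elif not str(item["evidence"]).strip():
            rejected.append({"reason": "no evidence supplied", "item": item})
        else:
            accepted.append(item)
    return accepted, rejected
```

The design choice is to fail closed: anything malformed, out of scope, or unsupported by evidence lands in the rejected pile for a human to review, rather than flowing into the report or the next automated step.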

The Calif MIE bypass is not the death of pentesting. It’s the warning shot that “manual only” pentesting is already obsolete.