
The Bug Is The Feature…
This week reveals a hard truth: the AI capabilities we celebrate most are what attackers exploit first. Autonomous reasoning, persistent memory, encrypted chats: each becomes a weapon in the right hands. Microsoft's RCE patch, Anthropic's reasoning-hijack research, and Google's discovery of self-evolving malware prove that attackers understand these systems better than their builders do. We fix software bugs while they manipulate machine minds. That gap is why we're losing.
TL;DR
🚨 Critical RCE Found in VS Code's Agentic AI
Microsoft patches CVE-2025-62222, the first critical remote code execution flaw targeting autonomous AI components in VS Code, allowing attackers to execute code on developer machines via malicious GitHub repositories and social engineering.
🧠 Why Smarter AI Models Are More Hackable
A joint study from Anthropic, Oxford, and Stanford reveals that advanced reasoning capabilities can be exploited through "Chain-of-Thought Hijacking" with success rates above 80%, letting attackers bury malicious commands in extended reasoning sequences.
🦠 Self-Evolving AI Malware Has Arrived
Google discovered five novel AI-powered malware families, including PROMPTFLUX, which uses the Gemini API to rewrite its entire source code hourly to evade detection, representing the first "just-in-time" AI malware in the wild.
💬 Seven ChatGPT Vulnerabilities Enable Data Theft
Tenable disclosed critical flaws in OpenAI's ChatGPT that allow indirect prompt injection attacks to steal personal data from user memories and chat histories, including zero-click exploits and shadow escape attacks via the Model Context Protocol.
👂 Your Encrypted AI Chats Aren't Private
Microsoft's "Whisper Leak" attack can infer LLM conversation topics by analyzing encrypted network packet sizes and timing patterns, flagging 5-50% of conversations on sensitive subjects with near-perfect precision despite TLS encryption.
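To make the Whisper Leak item concrete, here is a toy sketch of the underlying side channel. It is not Microsoft's actual technique: the traffic is entirely synthetic, and the synth_session and metadata_features helpers (and every number in them) are invented for illustration. The point it demonstrates is that a simple classifier trained only on packet-size and timing statistics, with no access to plaintext, can separate two "topics" whose responses differ in shape.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

def synth_session(topic, n_packets=60):
    # Fake (packet_size_bytes, inter_arrival_ms) pairs for one streamed chat
    # response; topic 1 is modeled with slightly larger, slower packets.
    sizes = rng.normal(520 + 80 * topic, 60, n_packets)
    gaps = rng.exponential(45 + 15 * topic, n_packets)
    return np.column_stack([sizes, gaps])

def metadata_features(session):
    # Statistics an eavesdropper can compute from ciphertext metadata alone,
    # without decrypting the payload.
    sizes, gaps = session[:, 0], session[:, 1]
    return np.array([sizes.mean(), sizes.std(), gaps.mean(), gaps.std()])

# 200 synthetic sessions per "topic", summarized into metadata features.
X = np.array([metadata_features(synth_session(t)) for t in (0, 1) for _ in range(200)])
y = np.array([t for t in (0, 1) for _ in range(200)])

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_train, y_train)
print(f"topic inference accuracy on synthetic traffic: {clf.score(X_test, y_test):.2f}")

This is also why mitigations in this space generally involve padding or batching streamed responses: once packet sizes and timing no longer track the content, metadata statistics like these carry little signal.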
THIS WEEK’S EXPERT OPINION
The Machines Are Learning to Be Conned
AI models behave a lot more like people than we want to admit. They can be pressured, confused, misdirected, or tricked into giving away information, just as a human employee can fall for a phishing email or a manipulative phone call. The recent findings, from Whisper Leak and the ChatGPT extraction attacks to the AI-powered malware families Google uncovered and the new research on reasoning-model jailbreaks, all point to the same uncomfortable reality: AI systems don’t need a buffer overflow to break. They need persuasion. They need context manipulation. They need just the right prompt, crafted by an attacker who understands how to “talk” to them. This isn’t machine exploitation anymore. It’s social engineering at machine speed.
And the dangerous part is that we still treat these models as if they’re predictable programs rather than digital employees with all the cognitive blind spots, biases, and emotional equivalents of humans. Attackers don't hack models; they influence them. They nudge them. They groom them. They build trust with them. They steer them toward revealing data, executing harmful reasoning, or leaking private information, just as a skilled social engineer manipulates a human target. This is the new frontier: the psychology of machines. And if we don’t start securing AI systems the same way we secure people - through guardrails, monitoring, least-privilege, behavior analysis, and continuous validation - we’ll keep losing to attackers who already understand that AI thinks more like us than we’re ready to accept.
- Boaz Barzel | Field CTO at OX Security

