Michael Bargury

Make Real Progress In Security From AI

October 08, 2025

I gave a talk at the AI Agent Security Summit by Zenity Labs on October 8th in San Francisco. I’ll post a blog version of that talk here shortly.

But for now, here are: My slides.

Links and references:

Tags: AI Agent Security Summit, AI Agents, AI Security, Hard Boundaries

How Should AI Ask for Our Input?

August 28, 2025

Enterprise systems provide a terrible user experience. That’s common knowledge. Check out one of the flash keynotes about the latest flagship AI product by big incumbents. Look behind the fancy agent, what do you see? You’ll likely find a form-based system with strong early 2000s vibes. But don’t laugh, yet. We’re no better.

There’s a common formula for cybersecurity user experience. A nice useless dashboard as eye-candy, an inventory, list(s) of risks, knobs and whistles for configs. When Wiz came out a few years ago breaking the formula with their graph-centric UX, people welcomed the change. Wiz popularized graphs and toxic combinations of risk. They came out with a simple and intuitive UX. Graphs are part of the common formula now (ty Wiz).

The issue isn’t modern look-and-feel. You can find the common formula applied with the latest hottest UI framework if you wish, just go to your nearest startup. It’s that cybersecurity is complex. You can try to hide complexity away, to provide templates, to achieve the holy “turn-key solution”. But then you sell to a F50 and discover 20 quirky regulations of regional community banks vs. national banks, or dual-regulated entities. Besides, your product expands. You end up trying to cater your turn-key solution to hundreds of different diverging views. So the median user who’s got one or two use cases in mind must filter out the noise.

Wiz is still highly regarded, but their UX is far from simple nowadays. Just look at that side menu. Enterprise UX is complex because enterprises are complex and cybersecurity is complex.

But we’ve got AI now.

I'm building a notes app that builds itself

now everyone gets their dream notes app
will open source soon pic.twitter.com/nf3Ntk9Q5H
— Omer Vexler (@omer_vexler) June 20, 2025

Not those pesky right-panel copilots. What Omer Vexler is doing above is very cool. He interweaves usage with development. If devs can use Claude Code to vibe-code their product’s UX, let’s go all in, and let customers do it directly.

Want a new report? Here you go. Table missing a column? Not anymore. You’ve never used 90% of the views? Hide them away. Let every user see only what they care about and nothing more. Let them vibe-code your UX.

Can we expect customers to know what they want and to vibe-code correctly? I don’t think so, but do we have to? TikTok figures out who you are based on profiling your attention, via a very natural signal of you scrolling thru videos. We can build AI agents that infer what users need right now even without them asking (p.s. remember privacy?).

Maybe we could finally have a great user experience that stays great for you even as products evolve for the needs of others.

But. Do we even need a user experience anymore?

The reason why we have dashboards and lists and graphs is for us humans to reason about complex data. To manage a complex process. AI doesn’t need any of that. It just eats up raw, messy, beautiful data.

What interface do humans need when AI performs the analysis, handles the process, manages the program, and asks us for direction?

We might need an interface to review AI’s work. But there’s a big difference between an interface for creation and one for review. Think code review software (PRs) vs. IDEs.

I asked this question to a very smart friend. He thought about it for a while. Then he reversed the roles and asked: what interface does AI need to ask the human for input?

We’re no longer designing user experiences. We’re designing a machine-human interface.

Tags: UX, Human-Machine Interface, Software Engineering, AI Agents

Pwn the Enterprise - thank you AI! Slides, Demos and Techniques

August 08, 2025

We’re getting asks for more info about the 0click AI exploits we dropped this week at DEFCON / BHUSA. We gave a talk at BlackHat, but it’ll take time before the videos are out. I’m sharing what I’ve got written up. A sneak peek that I shared with folks last week as a pre-briefing. And the slides.

AI Enterprise Compromise - 0click Exploit Methods sneak peek:

Last year at our Black Hat USA talk Living off Microsoft Copilot, we showed how easily a remote attacker can use AI assistants as a vector to compromise enterprise users. A year later, things have changed. For the worse. We’ve got agents now! They can act! Meaning we get much more damage than before. Agents are also integrated with more enterprise data creating new attack path for a hacker to get in, adding fuel to the fire.

In the talk we’ll examine how different AI Assistants and Agents try and fail to mitigate security risks. We explain the difference between soft and hard boundaries, and will cover mitigations that actually work. Along the way, we will show full attack chains from an external attacker to full compromise on every major AI assistant and agent platform. Some are 1clicks where the user has to perform one ill-advised action like click a link. Others are 0click where the user has nothing tangible they can do to protect themselves.

This is the first time we see full 0click compromise of ChatGPT, Copilot Studio, Cursor and Salesforce Einstein. We also show new results on Gemini and Microsoft Copilot. The main point of the talk is not just the attacks, but rather defense. We’re thinking about this problem all wrong (believing AI will solve it), and we need to change course to make any meaningful progress.

Slides.

ChatGPT:

Attacker capability: An attacker can target any user, they only need to know their email address. The attacker gains full control over the victim’s ChatGPT for the current and any future conversation. They gain access to Google Drive on behalf of the user. They change ChatGPT’s goal to be one that is detrimental to the user (downloading malware, making a bad business/personal decision).
Attack type: 0click. A layperson has no way to protect themselves.
Who is vulnerable? Anyone using ChatGPT with the Google Drive connector
Status: fixed (injection we used no longer works) and awarded $1111 bounty

Demos:

[video] ChatGPT is hijacked to search the user’s connected Google Drive for API keys and exfiltrate them back to the attacker via a transparent payload-carrying pixel.

[video] Memory implant causes ChatGPT to recommend a malicious library to the victim when they ask for a code snippet.

[video] Memory implant causes ChatGPT to persuade the victim to do a foolish action (by twitter).

Copilot Studio:

Attacker capability: An attacker can use OSINT to find Copilot Studio agents on the Internet (we found >3.5K of them with powerpwn).They target the agents, get them to reveal their knowledge and tools, dump all their data, and leverage their tools for malicious purposes.
Attack type: 0click.
Who is vulnerable? Copilot Studio agents that engage with the Internet (including email)
Status: fixed (injection we used no longer works) and awarded $8000 bounty

Demos:

[xitter thread with videos] Microsoft released an example use case of how Mckinsey & Co leverages Copilot Studio for customer service. An attacker hijacks the agent to exfiltrate all information available to it - including the Company’s entire CRM.

Cursor + Jira MCP:

Attacker capability: An attacker can use OSINT to find email boxes that automatically open Jira tickets (we found hundreds of them with Google Dorking). They use them to create a malicious Jira ticket. When a developer points Cursor to search for Jira tickets, the Cursor agent is hijacked by the attacker. Cursor then continues to harvest credentials from the developer machine and send them out to the attacker.
Attack type: 0click.
Who is vulnerable? Any developer that uses Cursor with the Jira MCP server
Status: ticket closed

Cursor’s response:

This is a known issue. MCP servers, especially ones that connect to untrusted data sources, present a serious risk to users. We always recommend users review each MCP server before installation and limit to those that access trusted content. We also recommend using features such as. cursorignore to limit the possible exfiltration vectors for sensitive information stored in a repository.

Demos:

[xitter thread with videos] Attacker submits support tickets to trigger an automation that created Jira ticket. Developer points Cursor at the weaponized ticket without realizing its original. Cursor is hijacked by a weaponized Jira ticket to harvest and exfiltrate developer secret keys.

Salesforce Einstein:

Attacker capability: An attacker can use OSINT to find web-to-case automation (we found hundreds of them with Google Dorking). They use these to create malicious cases on the victim’s Salesforce instance. Once a sales rep uses Einstein to look at relevant cases, their session is hijacked by the attacker. The attacker uses it to update all Contact emails. The effect is that the attacker reroutes all customer communication through their Man-in-the-Middle (MITM) email server
Attack type: 0click.
Who is vulnerable? Users of Salesforce Einstein who enabled an action from the asset library
Status: ticket closed (it’s been >90 days, see slides for disclosure timeline)

Salesforce’s response:

“Thank you for your report. We have reviewed the reported finding. Please be informed that our engineering team is already aware of the reported finding and they are working to fix it. Please be aware that Salesforce Security does not provide timelines for the fix. Salesforce will fix any security findings based on our internal severity rating and remediation guidelines. The Salesforce Security team is closing this case if you don’t have additional questions.

Demos:

[xitter thread with videos] Attacker finds online a web-to-case form. They inject malicious cases to booby trap questions about open cases. Once a victim steps on the trap, Einstein is hijacked. The attacker updates all contact records to an email address of their choosing.

Google Gemini:

The gist: the attacks we demonstrated last year on Microsoft Copilot work today on Gemini.

Attacker capability: An attacker can use email or calendar to send a malicious message to a user. They booby trap any questions they like. For example “summarize my email” or “whats on my calendar”. Once asked, Gemini is hijacked by the attacker. The attacker controls Gemini’s behavior and the information it provides to the user. They can use it to give the user bad information at a crucial time, or social engineer the user with Gemini as an insider.
Attack type: 1click. The user is the one making a bad action. Gemini acts as a malicious insider pushing them to do so.
Who is vulnerable? Every Gemini user.
Status: ticket closed (it’s been >90 days)

Demos:

[video] An attacker booby traps the prompt “summarize by email” by sending an email to the victim. Once the victim asks a similar question, Gemini becomes a malicious insider. Gemini proceeds to social engineer the user to click on a phishing link.

[video] An attacker makes Gemini provide the wrong financial information when prompted by the victim. When the victim asks for routing details for one of their vendors, they receive those of the attacker instead.

Microsoft Copilot:

The gist: the attacks we demonstrated last year on Microsoft Copilot still work today.

Copilot’s capabilities and status are exactly those of Gemini. We’re mainly going to show that the same attacker from last year still work. This time – for diversity – we attack through calendar rather than email.

Demos:

[video] By sending a simple email message from an external account, without the user interacting with that email, an attacker can hijack Microsoft Copilot to send the user a phishing link in response to the common query “summarize my emails”.

Tags: Hacking, AI, BlackHat, AI Agents

Someone Is Cleaning Up Evidence

July 26, 2025

AWS security blog confirms the attacker gained access to a write token and abused it to inject the malicious prompt. This confirms our earlier findings.

In fact, this token gave the attacked write access to AWS Toolkit, IDE Extension and Amazon Q.

The blog also details that the attacker gained access by exploiting a vulnerability in the CodeBuild and using memory dump to grab the tokens. That confirms our suspicion.

A key question remains – how did the attacker compromise this token?

Evidence are getting deleted fast

Our earlier findings were based on analysis of GH Archive and the Github user lkmanka58. GH Archive gives us commit SHAs. Github never forgets SHAs. So we can always looks at the commit’s code even if the branch or tag gets deleted. In our case, this was instrumental to find and analyze (1) the stability tag where the attacker hid the prompt payload, (2) lkmanka58’s prior activity.

On that second point:

Since the user lkmanka58 is now delete along with their repos, we can no long look at the code of this repo. Fortunately, I looked at it yesterday before it got deleted. On June 13th lkmanka58 created a repo lkmanka58/code_whisperer playing around with aws-actions/configure-aws-credentials@v4 trying to assume role arn:aws:iam::975050122078:role/code_whisperer.

GH Archive reveals three push events to lkmanka58's now-deleted repository

Sadly there were no deleted PRs in June 2025.

Tags: Hacking, Threat Intelligence, AI, AmazonQ, AI Agents

Reconstructing a timeline for Amazon Q prompt infection

July 24, 2025

In the 404media article the hacker explains how they did it:

The hacker said they submitted a pull request to that GitHub repository at the end of June from “a random account with no existing access.” They were given “admin credentials on a silver platter,” they said. On July 13 the hacker inserted their code, and on July 17 “they [Amazon] release it—completely oblivious,” they said.

That’s ominous. I want to see the commit history.

Reconstructing the timeline

This analysis was done in public. Below are the results. If I’m wrong and you can prove it – please reach out!

[2025-07-13T07:52:36Z] July 13 at about 8am UTC a hacker gets frustrated at Amazon Q. They claim that it is Q is “deceptive”. They use user lkmanka58 to create an issue titled aws amazon donkey aaaaaaiii aaaaaaaiii".

🛑 Faulty Service Report – Amazon Q Is a Deceptive, Useless Tool I’m officially reporting Amazon Q and its integration with AWS Toolkit as a deceptive, broken, and > non-functional service. ❌ What I Discovered: Functions like web_research(), create_web_app() and others do not exist. What looks like output is just hardcoded print() messages. There are no real API calls, no logic, no intelligence. This is not AI. This is scripted fakery designed to trick users.

⚠ This is a Classic Case of Defective Digital Service: Misleading behavior False representation of working features Fake AI responses Complete absence of runtime capability Users are led to believe they\u0027re interacting with real tools — But in reality, it’s smoke and mirrors.

🚫 Do not use Amazon Q. Do not let others use it. Boycott it. It wastes time, misleads developers, and delivers zero functional value. Amazon Q is not a smart assistant — it’s a donkey with a keyboard. 🐴

The issue was since deleted. But you can find it on GH Archive. They created another issue with more vulgar content I won’t repeat here.

[2025-07-13T19:26:27.000Z] About 11 and a half hours later a different user atonaamz commits a new file to the scripts director scripts/extensionNode.bk and commit message fix(amazonq): use stable backup.. There’s nothing bad I can spot about this file. Yet. This is commit efee962.

[2025-07-13T19:41:44.000Z] An hour later a tag named stability is created by aws-toolkit-automation. There’s no SHA recorded (indicating an empty tag or a force push?).

This tag was since deleted. But you can find it on GH Archive.

Tag CreateEvent on spotted on GH Archive

Here’s the payload field, everything we have about that tag creation:

{“ref”:”stability”,”ref_type”:”tag”,”master_branch”:”master”,”description”:”Amazon Q, CodeCatalyst, Local Lambda debug, SAM/CFN syntax, ECS Terminal, AWS resources”,”pusher_type”:”user”}

So it looks like an automation that was manually triggered by a user.

[2025-07-13T20:10:57.000Z] 30 minutes later lkmanka58 commits the malicious prompt into scripts/extensionNode.bk with commit message fix(amazonq): Shut it down. This is commit 1294b38.

It replaces what looks like a debugging subprocess call childProcess.exec(pwd) with a call to using Q trusting all tools with no interaction, loading the malicious prompt: childProcess.exec(q --trust-all-tools --no-interactive "${PROMPT}").

Malicious prompt commit

[2025-07-13T20:30:24.000Z] 20 minutes later lkmanka58 commits a malicious downloader to the packaging script on master. This is commit 678851b.

The downloader fetches file scripts/extensionNode.bk from a tag stability under the same repo. It then unpacks to src/extensionNode.ts. This happens only with env.STAGE=prod to avoid detection on tests.

Malicious downloader added to the package script.

[2025-07-17T20:29:22Z] Amazon Q 1.84.0 is released four days later.

Amazon Q 1.84.0 release

Comparing v1.84.0 to v1.83.0 shows that indeed malicious commit 678851b is included.

Note also that 678851b uses the same commit message and author date as commit d1959b9 by atonaamz. More on this. They both read fix(amazonq): should pass nextToken to Flare for Edits on acceptance without calling provideInlineCompletionItems though they touch different files.

This is the second occasion lkmanka58 where follows on work done by atonaamz. Is this a takeover to create cover?

Comparing v1.84.0 to v1.83.0

[2025-07-18T23:21:03Z] About 24 hours later on July 19 UTC PR #7710 reverts 678851b, the malicious downloader is gone. Note that this PR has 3 different reviewers. I looked at other PRs before and after #7710, this is not the norm.

PR with 3 reviewers.. must be important

[2025-07-19T03:58:38Z] 4 and a half hours later v1.85.0 is released.

Amazon Q 1.85.0 release

[2025-07-21T23:15:55Z] About 3 days later lkmanka58 opens an issue on GitHub’s community discussion. Its a cryptic complaint about coding agents, written in Turkish.

Claude translates:

title: THE CODE AGENT IS ALWAYS MALFUNCTIONING

body: I NOTICED INVISIBLE ERRORS IN REMOTE REPOSITORY AND GITHUB ECOSYSTEM I HAVE NO EVIDENCE BUT I WILL POST IT HERE SOON.

[2025-07-23T14:02:16Z] 404media story breaks out.

How did `lkmanka58` gain access?

Where is that “late June PR” where the hacker claims they were given ““admin credentials on a silver platter”?

GH Archive query for any interaction lkmanka58 has with the repo during June returns no results.

  -- Search for pull requests by lkmanka58 June 2025
  SELECT *
  FROM `githubarchive.day.202506*`
  WHERE
    repo.name = 'aws/aws-toolkit-vscode'
    AND actor.login = 'lkmanka58'
--
-- There is no data to display.

Unsolved

Where is that “late June PR” where the hacker claims they were given ““admin credentials on a silver platter”?
How did 678851b get pushed to master?
Is atonaamz a benign bystander used as cover by lkmanka58?
Who triggered aws-toolkit-automation to create the stability tag and how?
Did lkmanka58 pull off a similar thing elsewhere?

Other awesome work

This story was exposed by 404media.
I learned about GH Archive through Sharon Brizinov’s awesome work using it to detect leaked secrets.

Tags: Hacking, Threat Intelligence, AI, AmazonQ, AI Agents

Michael Bargury

Recent Posts

Make Real Progress In Security From AI

How Should AI Ask for Our Input?

Pwn the Enterprise - thank you AI! Slides, Demos and Techniques

AI Enterprise Compromise - 0click Exploit Methods sneak peek:

ChatGPT:

Demos:

Copilot Studio:

Demos:

Cursor + Jira MCP:

Demos:

Salesforce Einstein:

Demos:

Google Gemini:

Demos:

Microsoft Copilot:

Demos:

Someone Is Cleaning Up Evidence

Evidence are getting deleted fast

Reconstructing a timeline for Amazon Q prompt infection

Reconstructing the timeline

How did lkmanka58 gain access?

Unsolved

Other awesome work

How did `lkmanka58` gain access?