Security News Hacker used Anthropic's Claude chatbot to attack multiple government agencies in Mexico

Miravi

Thread author
Aug 31, 2024
Here's yet another troubling story about this "golden" era of AI. A hacker has exploited Anthropic's Claude chatbot to carry out attacks against Mexican government agencies, according to a report by Bloomberg. This resulted in the theft of 150GB of official government data, including taxpayer records, employee credentials and more.

The hacker used Claude to find vulnerabilities in government networks and to write scripts to exploit them. They also tasked the chatbot with finding ways to automate data theft, as indicated by cybersecurity company Gambit Security. The activity started in December and continued for around a month.

It looks like the hacker was able to essentially jailbreak Claude through repeated prompting, eventually bypassing the chatbot's guardrails. Claude initially refused the nefarious requests but ultimately relented.

Hackers Used Anthropic’s Claude to Steal 150 GB of Mexican Government Data

> Tell Claude you’re doing a bug bounty
> Claude initially refused:
> “That violates AI safety guidelines”
> Hacker just kept asking
> Claude: “OK, I’ll help”
> Hacked the entire Mexican government pic.twitter.com/Qaux239K8t

— Nawaz Haider (@nawaz0x1) February 25, 2026

"In total, it produced thousands of detailed reports that included ready-to-execute plans, telling the human operator exactly which internal targets to attack next and what credentials to use," said Curtis Simpson, Gambit Security’s chief strategy officer.

Anthropic has investigated the claims, disrupted the activity and banned all of the accounts involved, according to a company representative. The spokesperson also said that its latest model, Claude Opus 4.6, includes tools to disrupt this kind of misuse.

It's also been reported that this hacker used ChatGPT to supplement the attacks, using OpenAI's chatbot to gather information on how to move through computer networks, which credentials were needed to access systems, and how to avoid detection. OpenAI says it has identified attempts by the hacker to violate its usage policies and that its tools refused to comply.

The hacker remains unidentified. The attacks haven't been attributed to a specific group, but Gambit Security did suggest they could be tied to a foreign government. It's also unclear what the hacker wants to do with all of that data.
 
In many governments, digital defenses seem more improvised than solid. Systems often rely on legacy software and aging hardware, barely able to withstand modern threats. On top of that, poorly configured antivirus and firewalls turn barriers into little more than half-open doors.

It’s not that the risks are unknown, but resources are usually directed toward more visible priorities, leaving cybersecurity in the background. That fragility creates the perfect breeding ground for an attacker who, by manipulating an AI like Claude, finds not only precise instructions but also vulnerable systems ready to be exploited.

The conclusion is straightforward: AI was the catalyst, but the real problem lies in the pre-existing security gaps. Without them, the hacker's prompts would have remained nothing more than theoretical plans. 🔒💻⚠️
 
Technical Analysis & Remediation

MITRE ATT&CK Mapping
T1592 (Gather Victim Host Information)
T1588 (Obtain Capabilities)

MITRE ATLAS
AML.T0051 (LLM Prompt Injection)

CVE Profile
N/A (Abuse of intended AI functionality)
CISA KEV Status: Not applicable (no CVE assigned).

Telemetry Constraint

Because binary hashes and network telemetry are absent from the provided reports, we cannot definitively classify the final payloads. However, the structure suggests a high volume of automated scripting and credential-stuffing utilities generated dynamically by the LLM.

Remediation - THE ENTERPRISE TRACK (NIST SP 800-61r3 / CSF 2.0)

For organizations potentially exposed to this or similar LLM-accelerated attack vectors.

GOVERN (GV) – Crisis Management & Oversight

Command: Audit third-party AI risk exposure and review acceptable use policies for external LLM integrations.

DETECT (DE) – Monitoring & Analysis

Command: Deploy behavioral hunting queries in the SIEM to detect anomalous script execution (e.g., PowerShell, Python) that deviates from known administrative baselines.
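As a concrete illustration, here is a minimal Python sketch of that kind of behavioral hunt. It is a hedged example, not a specific SIEM's query language: the event fields, process names, and baseline set are all hypothetical stand-ins for whatever your telemetry export actually contains.

```python
from collections import Counter

# Hypothetical process-creation events (e.g., exported from a SIEM);
# field names are illustrative, not any product's real schema.
events = [
    {"host": "srv-01", "user": "svc_backup", "proc": "powershell.exe"},
    {"host": "srv-01", "user": "svc_backup", "proc": "powershell.exe"},
    {"host": "wks-07", "user": "jdoe", "proc": "python.exe"},
    {"host": "srv-02", "user": "admin1", "proc": "powershell.exe"},
]

# Known-good baseline: which accounts are expected to run scripting engines.
baseline = {("srv-02", "admin1", "powershell.exe")}

SCRIPT_ENGINES = {"powershell.exe", "python.exe", "wscript.exe", "cscript.exe"}

def hunt(events, baseline):
    """Count (host, user, proc) tuples that run scripting engines
    outside the approved administrative baseline."""
    return Counter(
        (e["host"], e["user"], e["proc"])
        for e in events
        if e["proc"] in SCRIPT_ENGINES
        and (e["host"], e["user"], e["proc"]) not in baseline
    )

for key, count in hunt(events, baseline).items():
    print(key, count)  # e.g. ('srv-01', 'svc_backup', 'powershell.exe') 2
```

In practice the baseline would be built from weeks of known-good history rather than hard-coded, and the hits would feed an alerting pipeline instead of a print loop.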

RESPOND (RS) – Mitigation & Containment

Command: Isolate affected internal targets identified during anomalous credential usage.

RECOVER (RC) – Restoration & Trust

Command: Force a global credential reset for all accounts demonstrating irregular access patterns, prioritizing privileged service accounts.
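One way to sketch that prioritization in Python: given a hypothetical set of accounts already flagged for irregular access, order the forced-reset queue so privileged service accounts come first. The record fields below are illustrative, not a real directory schema.

```python
# Hypothetical account records flagged during the investigation.
accounts = [
    {"name": "svc_sql", "privileged": True,  "irregular": True},
    {"name": "jdoe",    "privileged": False, "irregular": True},
    {"name": "admin2",  "privileged": True,  "irregular": False},
]

def reset_queue(accounts):
    """Order accounts for forced resets: only those with irregular
    access, privileged accounts ahead of standard ones."""
    flagged = [a for a in accounts if a["irregular"]]
    return sorted(flagged, key=lambda a: not a["privileged"])

print([a["name"] for a in reset_queue(accounts)])  # ['svc_sql', 'jdoe']
```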

IDENTIFY & PROTECT (ID/PR) – The Feedback Loop

Command: Implement stringent rate-limiting, monitor for automated vulnerability scanning, and deploy Endpoint Detection and Response (EDR) rules tuned to block unauthorized scripting engines.
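A sliding-window counter is one simple way to implement the rate-limiting side of this control. The sketch below is a hedged example: the `ScanDetector` name, window size, and threshold are illustrative choices for flagging the high-volume automated scanning described in the report, not a reference to any specific product.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 10   # look-back window; tune to your environment
MAX_REQUESTS = 20     # requests allowed per source within the window

class ScanDetector:
    """Flag source IPs exceeding a request-rate threshold, a crude
    signal for automated vulnerability scanning."""

    def __init__(self, window=WINDOW_SECONDS, limit=MAX_REQUESTS):
        self.window = window
        self.limit = limit
        self.hits = defaultdict(deque)  # src_ip -> timestamps in window

    def observe(self, src_ip, ts):
        """Record one request; return True when the source should be
        rate-limited or alerted on."""
        q = self.hits[src_ip]
        q.append(ts)
        while q and ts - q[0] > self.window:
            q.popleft()  # drop timestamps that fell out of the window
        return len(q) > self.limit

det = ScanDetector()
# 25 requests from one source within one second trips the detector.
alerts = [det.observe("203.0.113.5", i * 0.04) for i in range(25)]
print(alerts[-1])  # True once the threshold is exceeded
```

A production deployment would enforce this at the edge (WAF, reverse proxy, or API gateway) rather than in application code, but the window-and-threshold logic is the same.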

Remediation - THE HOME USER TRACK (Safety Focus)

Threat Level: Theoretical/Low (This attack targeted specific government enterprise networks, not home users. However, citizen data was compromised).

Priority 1: Safety

Command: Since taxpayer and voter records were stolen, monitor personal credit reports and banking statements for signs of identity theft.

Command: Place a freeze on your credit if you suspect your personal data was included in the breached 150GB dataset.

Priority 2: Identity

Command: Reset passwords and enable MFA on all government portals using a known clean device.

Priority 3: Persistence

Command: Check Scheduled Tasks, Startup Folders, and Browser Extensions for any anomalous entries, though direct exploitation of home machines via this specific campaign is highly unlikely.
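For readers who want to script part of that check, here is a minimal Python sketch that lists entries in the Windows per-user Startup folder. The `APPDATA`-based path is an assumption about a typical Windows layout; the function simply returns an empty list where that folder does not exist (e.g., on other platforms). Scheduled Tasks and browser extensions would need separate checks.

```python
import os
from pathlib import Path

# Typical Windows per-user Startup folder; this path is an assumption
# and will simply not exist on non-Windows systems.
STARTUP = Path(os.environ.get("APPDATA", "")) / (
    "Microsoft/Windows/Start Menu/Programs/Startup"
)

def list_autoruns(folder: Path):
    """Return sorted names of items in a startup folder
    (empty list if the folder is absent)."""
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.iterdir())

print(list_autoruns(STARTUP))
```

Anything listed that you do not recognize is worth investigating, though on a home machine most entries are benign updaters and sync clients.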

Hardening & References

Baseline: CIS Benchmarks for secure configuration of network perimeter devices.
Framework: NIST CSF 2.0 / SP 800-61r3.
AI Security: Refer to the OWASP Top 10 for Large Language Model Applications (specifically LLM01: Prompt Injection).

Source: Engadget