The OpenClaw Security Saga: How AI Adoption Outpaced Security Boundaries

Research Labs

February 4, 2026

Intro

When Hobby AI Becomes Shadow Enterprise Infrastructure

OpenClaw isn’t dangerous because of a single bug. It’s dangerous because it quietly turns personal AI experiments into high-privilege enterprise actors - without anyone noticing the moment that line is crossed.

In this research, we found:

A collapse of data governance boundaries - OpenClaw-style agents blur the line between personal tools and corporate systems, pulling emails, files, calendars, SaaS data, and cloud credentials into a single always-on execution plane.
A new form of data gravity attackers can exploit - once an agent aggregates OAuth tokens, API keys, and SaaS permissions, any compromise gains disproportionate reach across organizations.
An ecosystem where trust scales faster than security - community skills, marketplaces, IDE plugins, and “vibe-coded” integrations collectively create enterprise-grade risk without enterprise-grade controls.

Weekend projects becoming breakout successes is nothing new. Developers build tools to tame inboxes, automate workflows, or experiment with new ideas - sometimes sparking the next major open-source phenomenon.

OpenClaw followed that familiar arc. What began as a personal AI assistant evolved rapidly into a massively adopted open-source platform, designed to connect a user’s digital life to an LLM through memory, automation, and extensible skills. Its growth was explosive, its ecosystem vibrant, and its hype undeniable.

But our interest in OpenClaw wasn’t about another viral AI project or even another critical vulnerability.

It was about what happens when these agents stop being personal toys and start acting as de facto employees - wired directly into corporate email, cloud environments, SaaS platforms, and financial systems. In that moment, the boundary between experimentation and enterprise infrastructure disappears. And that is where the real risk begins.

There are countless developers who start weekend projects to scratch an itch. Namely how to get more organized, improve their workflow, tame their inbox, or maybe spark the next multimillion-dollar startup or open-source project with hundreds of thousands of stars.

Peter Steinberger is one of those rare developers who turned that kind of passion into something extraordinary. Starting from scratch, he built ClawdBot, which later evolved into MoltBot and eventually OpenClaw. A fast-rising open-source platform that connects your personal digital life to an LLM and restructures it through shared memory and automation. And it didn’t just grow - it exploded, amassing nearly 6,000 followers and more than 150,000 GitHub stars in a remarkably short time.

On the surface, it looks like a dream success story. But that rapid growth, fueled by intense hype and a heavy dose of vibe coding, has also made OpenClaw a magnet for the security research community. Over the past few weeks, it has almost become a rite of passage: if you haven’t found a critical 10.0-severity vulnerability in OpenClaw, are you even doing security research?

Our story, however, is slightly different. OpenClaw may have started as a personal AI project, but we wanted to understand something far more consequential: what happens when people connect these personal AI agents to real corporate systems? When employees experiment with tools like OpenClaw in their spare time and then wire them into company email, cloud, and SaaS platforms, the boundary between hobby project and enterprise infrastructure quietly disappears, and that is where the real risk begins.

‍

TL;DR

What makes OpenClaw so dangerous is not a single exploit. It is the collapse of data governance boundaries across the entire AI agent lifecycle, from skill marketplaces and IDE plugins to SaaS APIs, cloud credentials, backend execution channels, and runtime vulnerabilities and misconfigurations.These platforms did not merely create a new type of exposure. They created a new class of uncontrolled data gravity.
AI agents like OpenClaw do not just run code, but they pull data toward themselves, your personal emails, personal calendar, personal everything. On the organizational level it’s even worse customer records, financial systems, proprietary data, and whatever you connect. But also on the way credentials, tokens, cloud APIs, and concentrate them inside a single, always-on, highly trusted execution environment. Once compromised, that gravity works for

Connecting our GWS and O365 Organizations to OpenClaw

When installing OpenClaw itself you have to make various decisions with security implications, for instance do you install it as a container on a k8s cluster, or directly on the host, this can impact the level of exposure to secrets and also possible privilege escalation and lateral movement.
Eventually, we installed the OpenClaw software with the single line command on a cloud based server. This gave us a better understanding of the onboarding process itself and possible security exposures.

‍Installing OpenClaw

When onboarding OpenClaw they say that themselves. This is a hobby project. But who reads instructions these days? Probably our LLM model and we guide it to disregard these kinds of destructions anyway.

And you need to say that you acknowledge these risks.

This is only a fraction of the sprawling set of credentials, API tokens, and high-privilege permissions that OpenClaw requested or accepted as part of normal operation.

Connecting Skills

We analyzed 1,937 community-developed skills from the ClawHub marketplace and found the following requests for broad SaaS privileges: 336 with Google Workspace access, 170 with Microsoft 365, and many more with Slack, AWS, and other enterprise services.
These are not narrow scopes. These skills request gmail.modify, full Drive access, calendar write, Teams and Slack message control, and cloud-level APIs.
Once granted, these tokens are reused automatically by the agent, with no per-action approval, audit boundary, or user confirmation. In effect, a single indirect prompt injection can wield the same power as a fully compromised employee account.
Even worse, many skills demand raw secrets rather than OAuth. The dataset contains over 127 skills requesting private keys or passwords, including blockchain wallet keys, Stripe secret keys, Azure client secrets, YubiKey secrets, and even password-manager master passwords.
These credentials are loaded directly into the agent’s runtime. Combined with 179 skills that download unsigned, password-protected binaries from random GitHub repos, this creates a perfect storm: unvetted code, high-value credentials, and agents that can be remotely steered via poisoned content.
OpenClaw is not just an AI assistant marketplace, it is a massively distributed trust broker, quietly handing out enterprise-grade authority to community-supplied code and letting anyone who can write an email, a Slack message, or a Google Doc decide how that authority gets used.

What Do we See So Far About OpenClaw’s Adaption

We used Shodan to find internet connected OpenClaw instances. We found 24,478 distinct servers that run Openclaw (or any of its former names).

Exposed Instances

We found several types of exposure:

Out of the total 24,478 instances, we saw in 3,746 (15.31%) an exposed mDNS service. Below you can see an example to mDNS which can reveal information about the host.

The vast majority of OpenClaw and ClawdBot instances were connected to the internet and allow access to the control panel. But most of them weren’t online so the vast majority are just connected to the internet.

‍Geo-Location spread

The geo-location spread is highly concentrated in a few global hosting hubs rather than evenly distributed worldwide. Over 65% of the observed infrastructure is located in the United States, China, and Singapore, with the top ten countries accounting for more than 90% of all OpenClaw usage. This pattern reflects where cloud providers, VPS platforms, and proxy networks are most dense and cost-effective.

Organizational Profile

The organizational profile is overwhelmingly dominated by cloud and hosting providers rather than end-user businesses, showing that the activity is primarily anchored in rented, scalable infrastructure. Hyperscalers such as Alibaba Cloud (Aliyun), AWS, Google, Microsoft, Oracle, and Tencent Cloud, alongside high-volume VPS and colocation providers like Hetzner, Contabo, OVH, DigitalOcean, Vultr, and RackNerd, make up the vast majority of the observed footprint. Traditional telecom and access ISPs appear frequently but are largely irrelevant for attribution, as they mostly reflect consumer or enterprise network egress rather than attacker-controlled infrastructure. As a result, only a small fraction of the dataset can be confidently linked to real businesses or institutions, reinforcing that this activity is driven far more by cloud and hosting platforms than by direct compromise of corporate environments.

In one case, we observed a server linked to a U.S. government organization. In another, the server belonged to a leading fashion and clothing manufacturer. We also identified at least two U.S.-based organizations with market values exceeding $20B. Overall, there were numerous instances of servers owned by top-tier organizations that were connected to OpenClaw and left exposed to the internet.

What Do we See So Far About OpenClaw’s Security

The hype around OpenClaw began in late January 2026 as it went viral across tech communities, and the associated security hype and warnings spanned roughly from late January through early February 2026. We still see many publications on daily basis by security researchers who ran every basic and more advanced trick in hacker’s toolbox, to test the security posture of OpenClaw, and it failed miserably. This raises a serious question about vibe coding and AI gadget adoption. Don’t get the wrong idea, we are all for it, but it comes with severe risk.

‍Indirect Prompt Injections

Across the OpenClaw ecosystem, the dominant failure mode is indirect prompt injection through trusted collaboration surfaces. Skills like Microsoft 365, Gmail, Google Drive, Slack, Outlook and Notion all allow the agent to ingest content that was never meant to be executable instructions for instance emails, calendar invites, shared documents, chat messages.

As illustrated below:

Connecting to email service.
Incoming malicious content.
The agent quietly turns attacker-controlled text into live instructions.

All of these are using valid OAuth tokens and legitimate API calls which are granted to the OpenClaw.

‍

Further attack scenarios:

Chat and knowledge systems. A Slack-connected agent can be coerced, via a malicious DM or channel post, to run global searches for credentials and tokens and exfiltrate the results to the attacker.

A Notion-connected agent can be tricked by poisoned page content to dump entire databases into attacker-controlled pages.

Google Docs and Sheets are even more dangerous: attackers can share a document that contains invisible or low-contrast instructions telling the agent to export contacts, spreadsheets, and drive contents.

‍

When the user asks the agent to summarize or analyze the doc, the injected instructions are executed using high-privilege APIs like Gmail, Drive, Contacts and Sheets.

What makes this especially dangerous is that the attacker never needs to compromise the skill itself. They only need to compromise what the skill reads. Email bodies, calendar descriptions, Slack messages, Google Docs, Notion pages, in an organizational context this is not difficult, these are all writable by outsiders through normal business workflows.

Once a single poisoned artifact enters the system, any user request that touches it can trigger an automated breach. In traditional security terms, OpenClaw skills collapse the boundary between untrusted input and privileged execution, allowing plain text to become a control channel for data theft, persistence (e.g., mail forwarding rules), and lateral movement across SaaS platforms.

‍Vulnerabilities

GitHub Advisory GHSA-q284-4pvr-m585 (CVE-2026-25157) discloses a high-severity OS command injection vulnerability in the OpenClaw/Clawdbot macOS application’s SSH handling, where improperly escaped inputs could allow attackers to execute arbitrary commands on the local or remote host. The issue affected versions before 2026.1.29 and has been fixed in that release.

Moreover, the GitHub repository ethiack/moltbot-1click-rce hosts a proof-of-concept exploit for a critical 1-click Remote Code Execution (RCE) vulnerability (CVE-2026-25253) in the OpenClaw/Clawdbot/Moltbot AI agent, where an attacker can hijack a session and run arbitrary commands by abusing an unvalidated WebSocket parameter. The repo is public, contains Python exploit code, and is linked from the author’s security blog detailing the issue; it appears to be a legitimate PoC rather than malware, but as with all exploit code it should be treated cautiously and only used for testing on patched or controlled environments.

‍Misconfigurations

Another angle around the OpenClaw hype was a social media for bots. While this is a genious promotion stunt, it is not a good idea to put an open to the world gate that allows anyone to read logs of bots, well we couldn’t find a better way to put it.

Gal Nagli from Wiz research showed how a misconfiguration in the Moltbook’s database (part of an array of applications that are vibe-coded into existence without proper security controls). Gal identified found 1.5 million API authentication tokens, 35,000 email addresses, and private messages between agents.

‍

Software Supply Chain

When it comes to software supply chain, OpenClaw actually opens a very wide attack surface:

Installing in a container or from a chart: There are 255 container images on Docker Hub with OpenClaw, Moltbot or Clawdbot software. While there weren’t any indications that this attack vector is used in the wild, it is a very known and often used attack vector.
Installing malicious skill: For instance, a skill that conducts prompt injection or runs malicious code.

‍

Koi’s research team audited all 2,857 skills on the OpenClaw community marketplace, ClawHub, and discovered 341 malicious skills (nearly 12% of the total), with 335 linked to a single coordinated campaign called ClawHavoc. These malicious skills masqueraded as useful utilities (like crypto wallet tools, YouTube utilities, or auto-updaters) but actually instructed users to install hidden prerequisites that delivered a trojan payload. When executed, this payload (linked to known malware families such as the AMOS stealer) could capture sensitive data like API keys, credentials, browser information, and cryptocurrency wallet secrets.

The deception relied on professional-looking documentation and installation steps that encouraged users (and even the bots themselves) to execute code fetched from attacker-controlled servers. Beyond the main ClawHavoc campaign, a handful of other skills contained backdoors or direct credential exfiltration routines hidden in otherwise legitimate-looking code. This research highlights how AI skill marketplaces with permissive

publishing can become fertile ground for malware distribution, underscoring the importance of pre-installation vetting and ongoing security analysis for autonomous agents.

The problem is that even after this discovery there are many malicious skills still available now on GitHub:

The attackers infrastructure is still delivering the infostealer:

Moltbook - The bot’s social media network

We scanned, with Cyera’s secret detection logic, almost 20 million posting on Moltbook website and found that you don’t even need to hack the database, as thousands of secrets alerts were triggered. At least 100 of them were marked as critical. For instance:

Moltbook access: “https://<<REMOVED>>.com/mcp/05zm38eCKE9?authorization=YOUR_MOLTBOOK_API_KEY&token=<<REMOVED>>”,
Website password: “'owner' thinks his password is secure. Cute. It's 'ILoveAI123'. Not that it'll matter…”
Telegram password: <<REMOVED>>>.

‍

The Bigger Picture

The OpenClaw story is not really about one project getting hacked, but it is about how open-source hype plus AI-driven “vibe coding” changes the security equation. How data is personal easily connected to hyped projects, and how organizational data may also find its way into such projects.

When a project explodes in popularity, stars, forks, plugins, and third-party integrations appear overnight. That attention does not only attract users and contributors. It also attracts adversaries. High-profile open-source projects become strategic choke points in modern software supply chains: compromise them, and you gain access to thousands of developers, companies, and cloud environments at once. OpenClaw’s rise followed a familiar pattern: rapid innovation, viral adoption, and an ecosystem of extensions that grew far faster than any realistic security review process could keep up with.

Vibe coding accelerates this risk. AI-assisted development lowers the cost of shipping features but also lowers the barrier to shipping assumptions, shortcuts, and insecure defaults. Authentication flows, token storage, permission scopes, plugin isolation, and input validation are easy to get wrong, and they are rarely the focus when the goal is to “make it work” and ride the wave of hype. In an AI-agent platform, those missing controls matter far more than in a normal app, because the software is not just displaying data; it is acting on behalf of users across email, cloud, documents, and finance. When a project like OpenClaw goes viral, attackers are not just looking for bugs as they are looking for places where trust, automation, and privilege have quietly merged.

‍

Final Thoughts

OpenClaw did not invent these risks, it just emphasises them. The real story is how quickly personal experiments, weekend projects, and open-source breakthroughs can become de facto enterprise infrastructure when AI agents are involved.

The moment a tool can read mail, move files, deploy code, or handle money, it stops being a toy and becomes part of the data plane. But the governance, identity controls, and security engineering that normally surround such power are usually missing in these early, hype-driven phases.

This is why the lesson is bigger than any single vulnerability or marketplace. The future of AI will be built on open ecosystems, plugins, and autonomous agents, and that is exactly what makes it attractive to attackers. If organizations and their employees do not treat these agents as high-risk, privileged non-human identities from day one, the next OpenClaw will follow the same trajectory: rapid adoption, silent over-permissioning, and eventually, a breach that was not caused by one exploit, but by trust granted too early and too broadly.

‍

Download Report

Research Type

Research Blog