Skip to content
CTCO
Go back

OpenClaw's Security Reckoning: What Nvidia's NemoClaw Tells Us About Enterprise Readiness

Published:  at  07:18 AM
·
10 min read
· By Joseph Tomkinson
Reality Checks
Human + AI
Feature image for the OpenClaw security reckoning article

Table of contents

Open Table of contents

From Clawdbot to 250,000 Stars (and a Pile of CVEs)

If you’ve been following this blog, you’ll know I covered OpenClaw (then still called Clawdbot, then briefly Moltbot) back in January. At the time, it was a fascinating, slightly terrifying experiment: a self-hosted AI agent that could execute shell commands, manage your email, browse the web, and take actions across your entire digital life, all through messaging platforms like WhatsApp and Slack.

It was a hobbyist project by developer Peter Steinberger. A clever one, sure. But a hobbyist project nonetheless.

Fast-forward two months and it’s GitHub’s most-starred non-aggregator software project in history (ahead of Linux, ahead of React) with an estimated two million-plus users worldwide. It’s been adopted by enterprises across 82 countries. AWS launched a managed OpenClaw offering on Lightsail. And Nvidia just dedicated a chunk of its biggest annual conference to building a safety net around it.

That escalation? It happened faster than anyone’s security posture could keep up with. And that’s the problem.

The Security Situation Is… Not Great

I’m going to be blunt here, because I think the industry needs more bluntness on this topic and less “it’s early days, we’ll figure it out.”

The numbers are stark. SecurityScorecard’s STRIKE team found over 42,900 public-facing OpenClaw instances across 82 countries. Of those, more than 15,000 were confirmed vulnerable to remote code execution. Nearly all of them (98.6%) were running on cloud platforms, not home networks. These are enterprises and developers, not hobbyists tinkering on a Raspberry Pi.

The most dangerous vulnerability, CVE-2026-25253, had a CVSS score of 8.8. The attack vector was almost comically simple: visit a malicious URL, and your primary authentication token gets leaked. With that token, an attacker gets full administrative control over your gateway. The whole thing takes milliseconds.

But it gets worse. Bitdefender found that roughly 12% of ClawHub’s entire skills registry (OpenClaw’s public marketplace for plugins) had been compromised with malicious code. We’re talking keyloggers on Windows, Atomic Stealer malware on macOS, all disguised behind professional documentation and innocent names like “solana-wallet-tracker.” That’s not a vulnerability you can patch. That’s a supply chain problem baked into the architecture.

And just this week, Endor Labs disclosed six more vulnerabilities: SSRF bugs, missing authentication, path traversal issues. Some don’t even have CVE IDs yet.

Here’s the thing that keeps nagging at me. This isn’t a case of sloppy code that a few good PRs will fix. OpenClaw stores API keys and session tokens in unencrypted files. Authentication is disabled by default. The gateway broadcasts itself on the local network via mDNS. Every single instance holds credentials for Claude, OpenAI, Google AI, whichever models you’ve connected to it. Each one is a honeypot sitting on your infrastructure.

An infographic showing the security vulnerabilities of OpenClaw, including CVE-2026-25253 and the compromised skills in ClawHub.
An infographic showing the security vulnerabilities of OpenClaw, including CVE-2026-25253 and the compromised skills in ClawHub.
An infographic showing the security vulnerabilities of OpenClaw, including CVE-2026-25253 and the compromised skills in ClawHub.

So What Did Nvidia Actually Build?

This is where it gets interesting, and frankly, where the real story is.

At GTC on Monday, Nvidia introduced the NemoClaw stack. It’s a combination of Nvidia’s Nemotron models and a new open-source runtime called OpenShell, packaged so you can deploy the whole thing in a single command. OpenShell sits between the AI agent and your compute infrastructure, providing what Nvidia calls a “safety and governance layer.”

In practical terms, that means: isolation between the agent and the host system, credential management that doesn’t involve plaintext files, and a governance framework for controlling what the agent can and can’t do. It’s essentially Nvidia looking at OpenClaw and saying, “The demand is real, the security is not, so here’s a harness.”

Jensen didn’t mince words. He acknowledged that OpenClaw “is not ready for the enterprise” while simultaneously arguing that the technology is too important to ignore. That tension, between obvious utility and obvious risk, is the defining characteristic of this moment.

And I think Nvidia’s response is telling. They didn’t build a competitor. They didn’t dismiss OpenClaw. They built a safety cage around it and said, “Now it’s your turn to figure out your strategy.” That’s a very specific bet about where the market is headed.

An architecture diagram of Nvidia's NemoClaw stack, showing how OpenShell provides a safety and governance layer between the AI agent and the host system.
An architecture diagram of Nvidia's NemoClaw stack, showing how OpenShell provides a safety and governance layer between the AI agent and the host system.
An architecture diagram of Nvidia's NemoClaw stack, showing how OpenShell provides a safety and governance layer between the AI agent and the host system.

The Enterprise Response: Bans, Warnings, and a Lot of Nervousness

While Nvidia was building guardrails, everyone else was reaching for the emergency brake.

Meta has reportedly told employees that installing OpenClaw on work devices is grounds for termination. Not a slap on the wrist. Termination. That’s a strong signal from a company that is, itself, one of the world’s largest AI developers.

China’s National Computer Network Emergency Response Technical Team issued warnings about OpenClaw’s use in government agencies and state-owned enterprises. The restrictions extend to families of military personnel. Major Chinese cloud providers (Alibaba, Tencent, ByteDance) were actively promoting OpenClaw deployment even as the government was pulling the handbrake. The gap between adoption enthusiasm and security caution has rarely been this visible.

And then there’s Microsoft. Their Defender Security Research Team published a blog post that included a line I haven’t been able to stop thinking about: “OpenClaw should be treated as untrusted code execution with persistent credentials. It is not appropriate to run on a standard personal or enterprise workstation.”

That’s Microsoft, a company that would very much like you to buy Azure credits, telling you not to run this thing on your normal machines. When the cloud vendor says “please don’t,” you should probably listen.

The “Lethal Trifecta” and Why This Isn’t Just About OpenClaw

Here’s where I want to pull back from the specifics, because I think there’s a bigger point that gets lost in the CVE numbers and the business memos.

Researchers at Georgetown’s Centre for Security and Emerging Technology describe what they call a “lethal trifecta” for AI agents: access to private data, the ability to communicate externally, and the ability to ingest untrusted content. OpenClaw, by design, ticks all three boxes.

And honestly? So will every useful agentic AI system. That’s the uncomfortable truth. The features that make these tools powerful (reading your emails, acting on your calendar, accessing your files, executing commands) are the same features that make them dangerous. You can’t have an agent that’s both useful and completely sandboxed. At some point, it needs access. The question is how much, and with what controls.

This is the fundamental design tension of agentic AI, and it’s one that our existing security models weren’t built for. Firewalls don’t help when the threat comes from a legitimate process doing legitimate-looking things. EDR tools see an approved application making an approved API call. DLP rules don’t fire because nothing looks like exfiltration. The agent is just doing its job. Except it’s following instructions that were hidden in a forwarded email.

That prompt injection scenario isn’t hypothetical, by the way. It’s been demonstrated repeatedly. An attacker hides a single instruction in a piece of content the agent is processing. The agent, doing exactly what it was configured to do, follows that instruction. Credentials leave the building. Nothing triggers an alert.

If that doesn’t worry you, it should.

What This Means If You’re Leading an Engineering Org

Alright, so what do you actually do with all of this? I’ve been thinking about this a lot, and I think the answer sits somewhere between “ban everything” and “YOLO, let the agents run.”

Don’t pretend it’s not happening. Shadow AI is the new shadow IT. Your developers are already experimenting with OpenClaw and tools like it. A flat-out ban rarely works. People find workarounds, and now you’ve driven the problem underground where you can’t even see it. Kaspersky’s advice here is sound: “A policy of ‘yes, but with guardrails’ is always received better than a blanket ‘no.’”

If you’re going to evaluate, isolate ruthlessly. Microsoft’s guidance is the right starting point. Dedicated VMs. Non-privileged credentials. Access to non-sensitive data only. Continuous monitoring. A rebuild plan. Treat it like you’re running untrusted code, because you are.

Watch the supply chain. ClawHub’s malicious skills problem is a preview of what’s coming for every agent marketplace. If your teams are installing skills or plugins, you need a vetting process. Tools like mcp-scanner and skill-scanner are emerging for exactly this reason. Use them.

Start defining your agent governance model now. Galileo just released Agent Control, an open-source governance layer for AI agent behaviour. Nvidia has OpenShell. Microsoft has Defender guidance. The tooling is immature but it exists, and getting familiar with it now, before you’re deploying agents in production, is worth the investment.

Have the architecture conversation. Where does the agent run? What credentials does it hold? What data can it access? What happens when it gets compromised (not if, when)? These are questions your security team, your platform team, and your engineering leadership need to answer together. If you’re waiting for the perfect framework, you’ll be waiting while your competitors ship (and possibly get breached, but that’s another post).

Jensen Might Be Right, and That’s What Makes This Difficult

I want to be honest about something. I think Jensen Huang’s framing (OpenClaw as the next Linux, the next Kubernetes) is probably overstating it. But only probably. The underlying trend is real. Self-hosted, autonomous AI agents that can act on your behalf are not a fad. The demand is clearly there. Two million users in a few months don’t lie.

The problem isn’t the concept. The problem is that we’re building the plane while flying it, and the plane currently has unencrypted fuel lines and no cockpit door. NemoClaw is essentially Nvidia bolting on a cockpit door and saying, “Okay, now try again.”

For engineering leaders, the job right now is uncomfortable but clear. You need to understand this space. You need to have a position on it: whether that’s “we’re evaluating in a sandbox,” “we’re waiting for the tooling to mature,” or “we’re building our own agent governance framework.” What you can’t afford is silence, because your teams are making decisions right now whether you’re involved or not.

The agentic AI genie is out of the bottle. OpenClaw proved the demand. Its security record proved the risk. And Nvidia’s NemoClaw proved that even the biggest players think the answer isn’t “put it back in the bottle”; it’s “build a better bottle.”

Now, honestly, I’d rather be building that bottle than cleaning up the mess from not having one. But you do you.


If you found this useful, you might also want to check out my earlier post on Clawdbot/Moltbot/OpenClaw from January, which covers the architecture and capabilities in more detail. A lot has changed since then, but the core questions I raised haven’t gone away. They’ve just gotten louder.


You Might Also Like



Comments