
HKUDS/ClawWork


"ClawWork: OpenClaw as Your AI Coworker - 💰 $15K earned in 11 Hours"

Stars: 7,796 · Forks: 1,000 · Watchers: 83 · Open Issues: 28

Python·MIT License·Last commit Mar 3, 2026·by @HKUDS·Published April 1, 2026·Analyzed 6d ago
Safety Rating: A

The repository appears to be a legitimate open-source AI agent benchmarking framework from HKUDS (a research group). No hardcoded secrets, malicious patterns, or prompt injection attempts were found in the provided content. The use of external code sandboxes (E2B by default) is a standard pattern for safe code execution in agent frameworks. The primary risk surface is the outbound API usage (OpenAI, Tavily, E2B), which is expected and user-configured. No critical security findings were identified.

AI-assisted review, not a professional security audit.

AI Analysis

ClawWork is an AI agent benchmarking and simulation framework that transforms AI assistants into economically accountable 'AI coworkers.' It evaluates multiple LLM-based agents (GPT-4o, Claude, Gemini, Qwen, etc.) on 220 real-world professional tasks from the GDPVal dataset spanning 44 occupational sectors. Agents start with a $10 budget, pay for their own token usage, and earn income based on work quality scored against BLS wage rates — creating a real-world economic survival benchmark. It includes a FastAPI + React dashboard for live monitoring, an 8-tool agent toolkit, LLM-based task evaluation, and optional integration with the Nanobot/OpenClaw gateway via a 'ClawMode' wrapper.
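The economic mechanism described above can be sketched in a few lines. This is an illustrative model, not the repository's actual code: the class, method names, and rates are hypothetical, while the $10 starting budget, self-paid token costs, and quality-scaled wage payout come from the description.

```python
# Hypothetical sketch of ClawWork's "economic survival" loop: the agent
# starts with a $10 budget, pays for its own token usage, and earns income
# proportional to work quality scored against an occupational wage rate.
from dataclasses import dataclass

@dataclass
class AgentLedger:
    balance: float = 10.0  # starting budget in USD (per the benchmark)

    def pay_tokens(self, tokens: int, usd_per_1k: float) -> None:
        """Deduct the cost of the agent's own token usage."""
        self.balance -= tokens / 1000 * usd_per_1k

    def earn(self, quality: float, hourly_wage: float, hours: float) -> None:
        """Credit income: a 0-1 quality score scales a wage-based payout."""
        self.balance += quality * hourly_wage * hours

ledger = AgentLedger()
ledger.pay_tokens(tokens=20_000, usd_per_1k=0.01)      # -$0.20 in token costs
ledger.earn(quality=0.8, hourly_wage=40.0, hours=0.5)  # +$16.00 payout
# ledger.balance is now 10.0 - 0.20 + 16.00 = 25.80
```

An agent whose balance reaches zero can no longer pay for inference, which is what makes the benchmark a survival test rather than a plain leaderboard.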

Use Cases

  • Benchmarking LLM agents on real-world professional task performance under economic constraints
  • Comparing multiple AI models head-to-head across 44 professional domains using the GDPVal dataset
  • Running a local AI agent simulation where the agent earns income by completing professional tasks and pays for token usage
  • Integrating economic accountability into an existing Nanobot/OpenClaw AI gateway deployment
  • Visualizing agent performance metrics (balance, quality scores, work vs. learn decisions) via a real-time React dashboard

Tags

#ai-agents #research #framework #multi-agent

Security Findings (4)

hardcoded_secrets

No hardcoded secrets detected. The README explicitly directs users to copy .env.example to .env and populate API keys (OPENAI_API_KEY, E2B_API_KEY, WEB_SEARCH_API_KEY) themselves. No keys appear embedded in the repository content provided.
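A pre-flight check for the three keys the README asks users to populate might look like the following. The function name is ours, not the project's; only the key names come from the documentation.

```python
# Sketch: verify that the API keys the README directs users to set in .env
# (via `cp .env.example .env`) are actually present in the environment.
import os

REQUIRED_KEYS = ("OPENAI_API_KEY", "E2B_API_KEY", "WEB_SEARCH_API_KEY")

def missing_env_keys(env=os.environ) -> list[str]:
    """Return the required key names that are unset or empty."""
    return [k for k in REQUIRED_KEYS if not env.get(k)]

if __name__ == "__main__":
    missing = missing_env_keys()
    if missing:
        raise SystemExit(f"Missing API keys: {', '.join(missing)}")
```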

dependency_vulnerabilities

No manifest files were included in the repository content provided for static analysis. The project requires Python 3.10+ and uses FastAPI, LangChain/LiteLLM, a React frontend, and third-party sandboxes (E2B, BoxLite). No specific CVEs are identifiable from the README alone, but e2b-code-interpreter and external sandbox execution are elevated-risk dependencies that a curator should verify in requirements.txt.
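The curator check suggested here can be automated with a simple scan. This is a hedged sketch: the watchlist and the sample requirements content are illustrative, not taken from the repository.

```python
# Sketch: flag requirement lines whose package is on an elevated-risk
# watchlist (external sandbox execution), as a curator triage step.
import re

WATCHLIST = {"e2b-code-interpreter", "e2b"}  # illustrative watchlist

def flag_risky_requirements(text: str) -> list[str]:
    """Return requirement lines whose package name is on the watchlist."""
    flagged = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        # Package name ends at the first version/extras specifier.
        name = re.split(r"[=<>!~\[ ]", line, maxsplit=1)[0].lower()
        if name in WATCHLIST:
            flagged.append(line)
    return flagged

sample = "fastapi\ne2b-code-interpreter==1.0\nlitellm\n"
# flag_risky_requirements(sample) returns ["e2b-code-interpreter==1.0"]
```

A tool such as pip-audit would go further and check the pinned versions against known CVEs once the actual requirements.txt is available.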

prompt_injection_attempt

No prompt injection attempts detected. The README content is straightforward documentation with no embedded instructions targeting AI analysts.

malicious_code

No malicious code patterns detected in the provided repository content. The architecture (FastAPI backend, React frontend, LangChain agent tools) is consistent with a legitimate benchmarking framework.

Project Connections