HoloDesktop CLI Lets AI Agents See and Operate a PC Screen

HoloDesktop CLI bridges the gap between code-writing AI agents and the GUI tasks that still require human eyes and hands, potentially automating full workflows end to end.

Reporting from 1 source: GIGAZINE.

H Company announced HoloDesktop CLI, a client that lets AI agents see the screen and perform mouse and keyboard operations. It integrates with existing agents such as Claude Code and Cursor, and can handle GUI tasks such as testing web app features. The tool supports the MCP, ACP, and A2A protocols, and offers both a cloud API and self-hosted local inference.

On Thursday, H Company released HoloDesktop CLI, a client that gives AI agents the ability to see a PC screen and perform mouse clicks and keyboard input. The tool runs the company's H Agent, which can operate any application by looking at the screen, even one without a dedicated API. HoloDesktop CLI supports the MCP, ACP, and A2A protocols, allowing it to work alongside agents such as Claude Code, Cursor, and Codex. In one example, Claude Code writes new code and hands testing to HoloDesktop CLI, which finds bugs on the screen; Claude Code then fixes them, and HoloDesktop CLI verifies the result again.

Users can choose between H Company's cloud API for convenience and a self-hosted local mode that keeps screenshots and inputs on the user's own hardware. Safety features include an emergency kill switch triggered by pressing Esc twice in quick succession. The CLI and protocol integration code are open source under Apache 2.0, while the agent runtime binary remains closed source under H Company's terms.

H Company plans to add a background mode and a native app for daily workflows, as well as a cloud-based agent that can run multiple instances across PCs.
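The write, test, fix, and verify handoff described in the Claude Code example can be pictured as a simple loop. The Python below is only an illustrative sketch: `run_fix_loop`, `GuiReport`, and the three callables are hypothetical names invented here, not part of HoloDesktop CLI, Claude Code, or any of the protocols mentioned, and the round limit is an assumption rather than documented behavior.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class GuiReport:
    """Hypothetical result of one GUI test pass: a list of bugs seen on screen."""
    bugs: List[str] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        return not self.bugs


def run_fix_loop(
    write_code: Callable[[], str],
    test_gui: Callable[[str], GuiReport],
    fix_bugs: Callable[[str, List[str]], str],
    max_rounds: int = 3,  # assumed cap so the loop always terminates
) -> Tuple[str, int]:
    """Alternate between a coding agent and a GUI-testing agent.

    write_code and fix_bugs stand in for the coding agent; test_gui stands in
    for the screen-operating agent. Returns the accepted code and the number
    of test rounds it took.
    """
    code = write_code()
    for round_no in range(1, max_rounds + 1):
        report = test_gui(code)
        if report.passed:
            return code, round_no
        code = fix_bugs(code, report.bugs)
    raise RuntimeError(f"GUI tests still failing after {max_rounds} rounds")


if __name__ == "__main__":
    # Toy stand-ins: the first version has one visible bug, the second is clean.
    result = run_fix_loop(
        write_code=lambda: "v1",
        test_gui=lambda code: GuiReport(["button unresponsive"] if code == "v1" else []),
        fix_bugs=lambda code, bugs: "v2",
    )
    print(result)
```

In a real setup the stand-in callables would be replaced by calls made through whichever protocol connects the two agents; the sketch makes no claim about how HoloDesktop CLI implements that exchange.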

Synthesized by Yomimono from the 1 cited source below, including Japanese-language reporting where cited, then editorially reviewed before publishing.

Sources