AI will click your cursor
Watching a computer use a computer helps you feel the future a bit more
There is an unseen hand at your mouse, guiding the cursor. It reads the screen, clicks, scrolls, and types out words on the keyboard, mimicking a skilled user. Or is it a skilled user? The AI takes screenshots with slow, deliberate blinks, each a fleeting glimpse that pulls the meaning of the human interface into the AI's mind. It navigates menus, opens documents, and draws data together from other programs, the internet, and the depths of your servers.
You return with a cup of tea to review the AI's output: a product and strategy overview. Like a trusted associate, it has revised a shaky section of the documents based on a client chat from two weeks ago that you'd forgotten about, and it has pulled materials from the firm's research software to give context for a tricky analysis. It has even anticipated your next question: "Can we get this out before 6pm?" You have dinner plans, and the AI knows your job and your personal life share context.
Claude Sonnet 3.5
Anthropic announced a pair of upgraded AI foundation models this week: Claude Sonnet 3.5 and Claude Haiku 3.5. Sonnet introduces another groundbreaking capability via Anthropic's API: "computer use". Sonnet can navigate standard computer interfaces like a human, performing actions like clicking, typing, and reading screens. This lets Sonnet flow through software environments designed for people, leveraging software tools as if it were both program and end user. It's a player piano without the sheet music.
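To make the mechanics concrete, here is a minimal sketch of how a developer might wire up the computer-use loop. It assumes the beta API shape from Anthropic's October 2024 announcement (a tool of type `computer_20241022` and the `computer-use-2024-10-22` beta flag); the `dispatch` helper is purely illustrative and simply echoes the actions a real implementation would execute with a screenshot/automation library. Check the current API reference before relying on these names.

```python
def computer_tool(width: int, height: int) -> dict:
    """Tool definition advertised to the model: a virtual display
    it can screenshot, click into, and type into."""
    return {
        "type": "computer_20241022",  # beta tool type at launch
        "name": "computer",
        "display_width_px": width,
        "display_height_px": height,
    }

def dispatch(action: dict) -> str:
    """Execute one model-requested action on the local machine.
    A real implementation would capture the screen and drive the
    mouse/keyboard; here we echo the action so the loop shape is
    visible."""
    kind = action.get("action")
    if kind == "screenshot":
        return "<base64 PNG of the current screen>"
    if kind == "left_click":
        return f"clicked at {action.get('coordinate')}"
    if kind == "type":
        return f"typed {action.get('text')!r}"
    return f"unsupported action: {kind}"
```

In use, the developer passes `tools=[computer_tool(1280, 800)]` to the Messages API with the beta flag set, then loops: each `tool_use` block the model emits goes through something like `dispatch`, and the result (often a fresh screenshot) is returned as a `tool_result` so the model can take its next "blink."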
Although computer use is publicly available, it remains experimental, with early limitations such as difficulty handling dynamic screen interactions and occasional errors. Nevertheless, early-adopting companies are exploring its potential for the central suggested use case: automating complex, multi-step tasks. Concerns remain about security risks, particularly prompt injection attacks, in which malicious content on the screen could manipulate the AI into taking unintended actions on your computer. Anthropic is aware and cautious, emphasizing the need for developers to handle this beta feature carefully, especially as its present performance remains below human proficiency.
The release again pushes the evolving capabilities of AI in performing human tasks while underscoring the ongoing challenges of security, reliability, and effective deployment: even more exciting possibilities for task and workflow automation, matched by equally persistent concerns about vulnerabilities and proper use.