AI updates

Anthropic Launches Claude 3.5 Sonnet: AI Can Control Computers

Claude 3.5 Sonnet AI operating a computer autonomously.

Introducing Claude 3.5 Sonnet AI: Revolutionizing Computer Interaction

Anthropic has recently launched its Claude 3.5 Sonnet AI model, which features a groundbreaking addition: the ability to control a computer by simply observing the screen. This capability, termed computer use, is currently in public beta and is available via API, allowing developers to direct Claude to perform tasks on a computer just like a human.

Comparison with Other AI Tools

This new feature puts Claude 3.5 on a similar playing field with AI tools from major competitors like Microsoft and OpenAI. Microsoft’s Copilot Vision and OpenAI’s ChatGPT desktop app utilize the ability to interpret screen information. Furthermore, Google’s Gemini app on Android phones presents analogous capabilities, but none of them have yet rolled out fully functional tools that can autonomously perform click actions and interact deeply with systems.

Experimental Phase and Limitations

While the computer use feature is innovative, Anthropic has cautioned users that it is still experimental. The company describes the current capabilities as "cumbersome and error-prone," urging developers to provide feedback to enhance its functionality. Some notable limitations include:

  • Claude cannot yet perform complex actions like dragging or zooming.
  • The "flipbook" approach Claude uses—taking screenshots instead of continuous video—means it may miss quick events or notifications.
  • There are proactive measures in place to limit Claude's interaction with social media and certain sensitive activities such as elections.

Performance Improvements in Coding and Tool Use

The Claude 3.5 Sonnet model also showcases significant advancements across various performance benchmarks. Specifically:

  • Agentic Coding: Performance on the SWE-bench Verified metric improved significantly from 33.4% to 49.0%, outperforming all publicly available coding models.
  • Tool Use Tasks: On the TAU-bench, Claude’s scores increased from 62.6% to 69.2% in retail applications and from 36.0% to 46.0% in the more complicated airline domain.

Competitive Pricing and Pricing Strategy

Despite these enhancements, Anthropic has maintained the same pricing structure and speed for the Claude 3.5 Sonnet model as its predecessor, ensuring accessibility for current and potential customers.

Future Outlook

With developers invited to pilot the computer use function, the prospect for rapid refinement and efficiency increases is promising. As feedback accumulates, it's likely that future iterations will enhance Claude’s ability to understand and execute a broader range of tasks effectively.

Conclusion

Anthropic’s Claude 3.5 Sonnet AI model represents a pivotal advancement in AI technology, particularly in its potential to enhance productivity and user interaction with computers. While still in its early stages, the feedback from developers and stakeholders will be crucial in optimizing its capabilities moving forward.

For more updates on AI technology trends, follow our article series where we explore various AI tools and their impact on modern computing.

Reading next

Dr. Ronnie Chatterji appointed as OpenAI's first Chief Economist, leading economic research on AI.
An eVTOL aircraft taking off vertically, symbolizing the future of urban air mobility.

Leave a comment

All comments are moderated before being published.

Trang web này được bảo vệ bằng hCaptcha. Ngoài ra, cũng áp dụng Chính sách quyền riêng tưĐiều khoản dịch vụ của hCaptcha.