Building truly useful AI assistants requires handling messy, interconnected real-world contexts rather than isolated tasks. Current models fall far short of this challenge, but synthetic data generation can help close the gap.
Claw-Anything is a benchmark for testing AI agents as always-on personal assistants with access to a user's full digital world—including activity history, multiple services, and both GUI and CLI interfaces.