AgentBeats: Agentifying Agent Assessment for Openness, Standardization, and Reproducibility — ThinkLLM