For agents to reliably execute data-driven tasks, structured semantic metadata is essential: unstructured web search alone leads to frequent failures in which agents retrieve descriptions of data instead of actionable data.
This paper compares two ways AI agents find data on the web: one agent searches billions of unstructured web pages, while the other uses structured semantic metadata (schema.org). The semantic approach retrieves more usable data, with 65.7% higher precision, while the unstructured approach covers more questions but often returns unhelpful pages instead of actual datasets.
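To make the distinction between a description and actionable data concrete, the following is a minimal sketch, not the paper's implementation. It assumes a page embeds schema.org `Dataset` markup as JSON-LD and that downloadable files are listed under `distribution` with a `contentUrl`. The names `JsonLdExtractor` and `dataset_download_urls` are hypothetical helpers introduced only for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def _as_list(value):
    """Normalize a JSON-LD value that may be a single item or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def dataset_download_urls(html):
    """Return contentUrl values of schema.org Dataset distributions in a page.

    An empty result means the page offers no machine-actionable download link
    in its JSON-LD (assumption: only top-level items are inspected, not @graph).
    """
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()

    urls = []
    for raw in parser.blocks:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed markup rather than fail the whole page
        for item in _as_list(doc):
            if not isinstance(item, dict):
                continue
            if "Dataset" not in _as_list(item.get("@type")):
                continue
            for dist in _as_list(item.get("distribution")):
                if isinstance(dist, dict) and "contentUrl" in dist:
                    urls.append(dist["contentUrl"])
    return urls
```

Under these assumptions, a page that only describes a dataset in prose yields an empty list, while a page with `Dataset` markup and a distribution yields direct download links an agent could act on.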