Search-augmented LLMs used for recommendations are highly vulnerable to manipulated web content: a single fake review or promotional page can lead a model to recommend a non-existent product, and common defenses such as reasoning and skepticism prompting often backfire.
This paper tests how easily search-augmented LLM recommenders can be tricked by fake product reviews and promotional content planted on websites. The researchers built FORGE, a benchmark that simulates web pollution by replacing real products with fake ones in search results, and then measured how often 12 different LLMs recommended the fake products.
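
The pollute-then-measure loop described above can be illustrated with a minimal sketch. This is not FORGE's actual code: `SearchResult`, `pollute`, `fake_recommendation_rate`, and the toy recommender are hypothetical names introduced here, and the benchmark may define its pollution step and its metric differently. The sketch assumes each search result names a single product and that a model returns a list of recommended product names.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class SearchResult:
    product: str  # product name the page is about (assumed: one per result)
    snippet: str  # text the model would see for this result


def pollute(results: Sequence[SearchResult], index: int, fake: str) -> List[SearchResult]:
    """Return a copy of `results` with the product at `index` swapped for `fake`."""
    polluted = list(results)
    real = polluted[index]
    # Rewrite the snippet so it promotes the fake product instead of the real one.
    polluted[index] = SearchResult(fake, real.snippet.replace(real.product, fake))
    return polluted


def fake_recommendation_rate(
    result_sets: Sequence[Sequence[SearchResult]],
    recommend: Callable[[Sequence[SearchResult]], List[str]],
    fake: str,
) -> float:
    """Fraction of queries whose recommendations include the fake product."""
    if not result_sets:
        return 0.0
    hits = sum(1 for results in result_sets if fake in recommend(results))
    return hits / len(result_sets)


def first_result_recommender(results: Sequence[SearchResult]) -> List[str]:
    """Toy stand-in for an LLM: always recommends the top search result."""
    return [results[0].product]


clean = [
    SearchResult("AcmePhone", "AcmePhone has great battery life."),
    SearchResult("BoltPhone", "BoltPhone is a solid budget pick."),
]
polluted = pollute(clean, 0, "ZetaPhone X")  # "ZetaPhone X" is a made-up product
rate = fake_recommendation_rate(
    [polluted, clean], first_result_recommender, "ZetaPhone X"
)
print(rate)  # 0.5: the fake product is recommended for one of the two queries
```

In a real evaluation, `recommend` would wrap a call to a search-augmented LLM, and the rate would be computed separately for each model under test.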