High-quality training data matters more than pipeline complexity: supervised fine-tuning (SFT) alone on carefully curated data can match or beat industrial-scale pipelines that combine pre-training, continual pre-training, and reinforcement learning (RL) for building capable search agents.
OpenSeeker-v2 shows that simple supervised fine-tuning on carefully designed training data can match or beat complex industrial pipelines for building search agents.