Evaluating a system's behavior solely by supplying inputs and observing the resulting outputs, without access to the model's internal structure or weights.
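
As a minimal sketch of the idea, the Python snippet below scores a system purely through its input/output interface. The names `black_box_accuracy` and `toy_system`, the exact-match scoring rule, and the test cases are all hypothetical and chosen only for illustration; a real evaluation would likely use a task-appropriate metric.

```python
from typing import Callable, Iterable, Tuple


def black_box_accuracy(
    system: Callable[[str], str],
    cases: Iterable[Tuple[str, str]],
) -> float:
    """Score a system from input/output pairs only.

    The system is treated as an opaque callable: it is invoked,
    but its structure, parameters, and weights are never inspected.
    """
    cases = list(cases)
    if not cases:
        raise ValueError("need at least one test case")
    # Exact-match scoring is an assumption made for this sketch.
    correct = sum(1 for given, expected in cases if system(given) == expected)
    return correct / len(cases)


# Hypothetical stand-in for the system under evaluation.
def toy_system(text: str) -> str:
    return text.upper()


# Illustrative (input, expected output) pairs; the last one is meant to fail.
cases = [("abc", "ABC"), ("Hi", "HI"), ("x", "y")]
score = black_box_accuracy(toy_system, cases)
print(f"accuracy: {score:.2f}")
```

Because the evaluator depends only on a callable, the same harness could in principle wrap a local function, a remote API, or a compiled binary without modification.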