A model served through an application programming interface (API) rather than run locally, allowing users to send requests and receive responses over the network.
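
As a rough illustration of the request/response exchange described above, the following Python sketch packages a prompt as an HTTP POST and decodes a JSON reply. The endpoint URL, the `prompt` field, the bearer-token header, and the function names `build_request` and `query_model` are all assumptions made for illustration; real providers define their own URLs, authentication schemes, and payload formats.

```python
import json
import urllib.request

# Hypothetical endpoint, for illustration only; not a real service.
API_URL = "https://api.example.com/v1/generate"


def build_request(prompt, api_key, url=API_URL):
    """Package a prompt as an HTTP POST request with a JSON body."""
    # The {"prompt": ...} payload shape is an assumption, not a standard.
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Bearer-token authentication is assumed here.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def query_model(prompt, api_key, send=urllib.request.urlopen):
    """Send the prompt over the network and return the decoded JSON reply.

    `send` defaults to urllib's opener; passing a stand-in makes it
    possible to exercise the function without a network connection.
    """
    request = build_request(prompt, api_key)
    with send(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

The model itself never runs on the caller's machine: the client only serializes a request, waits for the remote server to compute a result, and parses what comes back.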