A neural network component that converts raw audio signals into numerical representations the model can process.
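
As a rough illustration of what "converting raw audio into numerical representations" can look like, here is a minimal sketch. It assumes a toy design that is not taken from any particular model: the waveform is cut into overlapping frames, and each frame is passed through one fixed linear projection followed by `tanh`. The names `frame_signal` and `ToyAudioEncoder`, the frame sizes, and the random (untrained) weights are all hypothetical; real encoders learn their weights and usually stack many layers.

```python
import math
import random


def frame_signal(samples, frame_len, hop):
    """Split a 1-D waveform into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


class ToyAudioEncoder:
    """Hypothetical one-layer encoder: frame -> linear projection -> tanh."""

    def __init__(self, frame_len=400, hop=160, dim=8, seed=0):
        rng = random.Random(seed)
        self.frame_len = frame_len
        self.hop = hop
        self.dim = dim
        # Random weights stand in for parameters a real encoder would learn.
        scale = 1.0 / math.sqrt(frame_len)
        self.weights = [[rng.gauss(0.0, scale) for _ in range(frame_len)]
                        for _ in range(dim)]

    def encode(self, samples):
        """Return one `dim`-sized vector per frame of the input waveform."""
        frames = frame_signal(samples, self.frame_len, self.hop)
        return [[math.tanh(sum(w * x for w, x in zip(row, frame)))
                 for row in self.weights]
                for frame in frames]


# One second of a 440 Hz tone sampled at 16 kHz (synthetic input).
sample_rate = 16000
waveform = [math.sin(2 * math.pi * 440 * t / sample_rate)
            for t in range(sample_rate)]

encoder = ToyAudioEncoder()
embeddings = encoder.encode(waveform)
print(len(embeddings), len(embeddings[0]))  # 98 frames, 8 values each
```

The output is a sequence of fixed-size vectors, one per short slice of audio, which is the general shape of input a downstream model would consume in place of the raw samples.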