A machine learning model trained to identify and flag harmful, inappropriate, or policy-violating content in text.
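
As a minimal sketch of the idea, the example below trains a tiny multinomial Naive Bayes classifier to label text as "flag" or "ok". It is only an illustration under stated assumptions: the training sentences, the label names, and the `NaiveBayesFlagger` class are hypothetical and invented here, and production systems typically rely on much larger labeled datasets and more capable models.

```python
import math
from collections import Counter

# Hypothetical toy training data, invented purely for illustration.
TRAIN = [
    ("you are an idiot and i hate you", "flag"),
    ("i will hurt you", "flag"),
    ("shut up you stupid idiot", "flag"),
    ("have a great day", "ok"),
    ("thanks for your help", "ok"),
    ("i love this great recipe", "ok"),
]


def tokenize(text):
    """Lowercase the text and split on whitespace."""
    return text.lower().split()


class NaiveBayesFlagger:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, examples):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training texts
        self.vocab = set()
        for text, label in examples:
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def log_scores(self, text):
        """Return an unnormalized log-probability for each label."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # Start from the log prior of the label.
            score = math.log(self.doc_counts[label] / total_docs)
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen during training
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        """Return the label with the highest score."""
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


model = NaiveBayesFlagger().fit(TRAIN)
print(model.predict("you stupid idiot"))
print(model.predict("have a great day thanks"))
```

In this sketch the model makes a hard decision by picking the higher-scoring label. A deployed system would more likely compare a calibrated score against a threshold chosen to match its content policy, and route borderline cases to human review.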