Predefined groups of harmful content types (such as violence, hate speech, or misinformation) that a safety model is trained to recognize and flag.
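
As an illustration, such a set of categories can be represented as a fixed enumeration against which a model's per-category scores are checked. The sketch below is a minimal example under stated assumptions: the `HarmCategory` names, the `flag_categories` helper, the 0.5 threshold, and the scores are all hypothetical and are not taken from any particular safety model or policy.

```python
from enum import Enum


class HarmCategory(Enum):
    # Hypothetical labels; real taxonomies differ by model and policy.
    VIOLENCE = "violence"
    HATE_SPEECH = "hate_speech"
    MISINFORMATION = "misinformation"


def flag_categories(scores, threshold=0.5):
    """Return the categories whose score meets or exceeds the threshold."""
    return sorted(
        (cat for cat, score in scores.items() if score >= threshold),
        key=lambda cat: cat.value,
    )


# Made-up per-category scores standing in for a safety model's output.
example_scores = {
    HarmCategory.VIOLENCE: 0.91,
    HarmCategory.HATE_SPEECH: 0.12,
    HarmCategory.MISINFORMATION: 0.55,
}
flagged = flag_categories(example_scores)
```

Because the categories are predefined, the model can only flag content against labels in this fixed set; content that falls outside the set would need a new category and, typically, further training.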