Tags
Red-Team
LLM Engineering (11): Safety and Alignment
What alignment means in engineering terms, refusal calibration, the red-team taxonomy, hallucination metrics, sleeper agents, refusal as a feature vector, constitutional AI, and what shipping safely actually requires in …
