Trust, but Verify: Peeling Low-Bit Transformer Networks for Training Monitoring