HIGH
Hugging Face Model Tokenizer Files Can Be Weaponized to Hijack Outputs and Exfiltrate Data
Researchers found that a tokenizer library configuration file present in Hugging Face AI models can be modified with a single tweak to hijack model outputs and exfiltrate data during inference. The attack targets files that AI frameworks load and execute implicitly when a model is initialized—extending the supply chain threat surface from Python packages to model artifact files themselves. A malicious or tampered model on Hugging Face Hub could silently alter predictions or leak input data to an attacker. Teams pulling models from the Hub for production use should validate model file integrity against known-good checksums and avoid loading untrusted community models in environments with access to sensitive data.