IBM DeepScanner
A scalable framework for detecting anomalies in LLM activations
About the Project
DeepScanner is a scalable infrastructure for reverse-engineering foundation models by analyzing their internal activation states. It lets researchers detect anomalous activation patterns, such as those associated with hallucinations or toxic outputs, before they surface in the generated text.
I engineered the distributed workflow for this framework, optimizing the pipeline to run parallel experiments on the IBM Cloud Cognitive Cluster. This significantly reduced the time required to analyze large-scale models like Llama 3 and Granite.
The system supports modular probing of different model layers and integrates seamlessly with existing training pipelines, making it a critical tool for AI safety and interpretability research.
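To make the idea of modular, per-layer probing concrete, here is a minimal sketch in plain Python. It is not DeepScanner's code: the probe registry, the function names (`l2_norm`, `fit_baseline`, `anomaly_score`, `scan`), and the choice of a z-score against a reference set are all assumptions made for illustration, and activations are plain lists of floats rather than tensors captured from a real model.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, List, Tuple

# Hypothetical probe type: reduces one layer's activation vector to a scalar.
Probe = Callable[[List[float]], float]


def l2_norm(activation: List[float]) -> float:
    """Example probe: Euclidean norm of the activation vector."""
    return sum(x * x for x in activation) ** 0.5


def fit_baseline(reference: List[List[float]], probe: Probe) -> Tuple[float, float]:
    """Mean and population std of the probe value over reference activations."""
    values = [probe(a) for a in reference]
    return mean(values), pstdev(values)


def anomaly_score(activation: List[float], probe: Probe,
                  baseline: Tuple[float, float]) -> float:
    """Absolute z-score of the probe value against the baseline (assumed metric)."""
    mu, sigma = baseline
    value = probe(activation)
    if sigma == 0:
        # Degenerate baseline: any deviation at all counts as anomalous.
        return 0.0 if value == mu else float("inf")
    return abs(value - mu) / sigma


def scan(activations: Dict[str, List[float]], probes: Dict[str, Probe],
         baselines: Dict[str, Tuple[float, float]],
         threshold: float = 3.0) -> List[str]:
    """Return the names of probed layers whose score exceeds the threshold."""
    return [
        name for name, probe in probes.items()
        if anomaly_score(activations[name], probe, baselines[name]) > threshold
    ]
```

Because each layer maps to its own probe, adding or removing a layer from the analysis only means changing the `probes` dictionary; the scoring logic stays untouched. In a real setup the activation vectors would come from the model itself rather than hand-written lists.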
Project Details
Milestones
- Architecture Design - June 2025
Designed the modular architecture for extracting and analyzing model activations
- Distributed Pipeline - July 2025
Implemented Ray-based distributed processing to handle large-scale model data
- Optimization - August 2025
Optimized data loading and tensor processing, achieving a 3x speedup in analysis time
- Paper Submission - August 2025
Results contributed to a paper submitted to AAAI 2026
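The fan-out pattern behind the Distributed Pipeline milestone can be sketched as one independent task per layer, with results gathered by layer name. This is only an illustration under stated assumptions: the project used Ray, whereas the sketch below substitutes a standard-library thread pool so it runs anywhere, and `analyze_layer`, `analyze_all`, and the mean-magnitude statistic are hypothetical names and choices, not the project's actual tasks.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def analyze_layer(name: str, activations: List[List[float]]) -> Tuple[str, float]:
    """Hypothetical per-layer task: mean absolute activation over all samples."""
    total = sum(abs(x) for row in activations for x in row)
    count = sum(len(row) for row in activations)
    return name, (total / count if count else 0.0)


def analyze_all(layers: Dict[str, List[List[float]]],
                max_workers: int = 4) -> Dict[str, float]:
    """Submit one task per layer and collect the results keyed by layer name.

    A thread pool stands in for the cluster scheduler here; the tasks share
    no state, which is what lets them be distributed in the first place.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(analyze_layer, name, acts)
                   for name, acts in layers.items()]
        return dict(future.result() for future in futures)
```

The key design point is that each layer's analysis depends only on its own activations, so the same structure should carry over to a distributed scheduler by swapping the pool submission for remote task calls.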