IBM DeepScanner

A scalable framework for detecting anomalies in LLM activations

Python, PyTorch, Ray, IBM Cloud, LLMs

About the Project

DeepScanner is a scalable framework for reverse-engineering foundation models by analyzing their internal activation states. It lets researchers detect anomalous patterns, such as hallucinations or toxic outputs, before they surface in the generated text.

I engineered the distributed workflow for this framework, optimizing the pipeline to run parallel experiments on the IBM Cloud Cognitive Cluster. This significantly reduced the time required to analyze large-scale models like Llama 3 and Granite.

The system supports modular probing of different model layers and integrates seamlessly with existing training pipelines, making it a critical tool for AI safety and interpretability research.
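
One common way to probe individual layers in PyTorch is with forward hooks. The sketch below assumes that approach; the source does not say how DeepScanner extracts activations, and `capture_activations`, the toy model, and the layer names are hypothetical illustrations.

```python
import torch
import torch.nn as nn


def capture_activations(model, layer_names, inputs):
    """Run one forward pass and return {layer_name: output} for the named layers."""
    captured = {}
    handles = []
    modules = dict(model.named_modules())

    def make_hook(name):
        def hook(module, args, output):
            # Detach so stored activations do not keep the autograd graph alive.
            # Note: some transformer blocks return tuples; this sketch assumes a tensor.
            captured[name] = output.detach().cpu()
        return hook

    for name in layer_names:
        handles.append(modules[name].register_forward_hook(make_hook(name)))
    try:
        with torch.no_grad():
            model(inputs)
    finally:
        # Always remove the hooks so the model is left unchanged.
        for handle in handles:
            handle.remove()
    return captured


# Toy stand-in for an LLM: a small MLP (for illustration only).
toy_model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Linear(16, 4),
)
batch = torch.randn(2, 8)
activations = capture_activations(toy_model, ["0", "1"], batch)
```

Selecting layers by name keeps the probe independent of the model definition, which is one way a tool like this could attach to an existing pipeline without modifying the model code.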

Project Details

Status
Internal Tool
Role
Research Engineer
Stack
Python
PyTorch
Ray (Distributed Computing)
IBM Cloud Cognitive Cluster
Hugging Face Transformers
Weights & Biases

Milestones

  • Architecture Design - June 2025

    Designed the modular architecture for extracting and analyzing model activations

  • Distributed Pipeline - July 2025

    Implemented Ray-based distributed processing to handle large-scale model data

  • Optimization - August 2025

    Optimized data loading and tensor processing, achieving a 3x speedup in analysis time

  • Paper Submission - August 2025

    Results contributed to a paper submission at AAAI 2026