About
I hold an MS in Data Science with an AI concentration from Georgetown University, and work full-time as an Analytics Engineer in the Supply Chain Engineering department at L3Harris Technologies, where I build and deploy agentic systems on Palantir Foundry and design enterprise analytics pipelines on Snowflake.
My path into data science runs through economics — a BA from Georgetown gave me a rigorous grounding in causal inference and quantitative modeling before I moved into machine learning, deep learning, and large-scale data engineering.
I am available for consulting engagements in ML engineering, data pipeline architecture, and applied AI. Reach out at [email protected].
Education
Georgetown University
MS Data Science, AI Concentration
Neural Nets & Deep Learning, Reinforcement Learning, Advanced NLP, Big Data
Georgetown University
BA Economics
3.9 MS GPA
2+ Internships
Projects
MyTorch
A deep learning library built from scratch: the core of PyTorch reimplemented with no dependencies beyond NumPy.
Problem
Deep learning frameworks abstract away the mechanics of backpropagation and gradient flow, making it difficult to build genuine intuition for how neural networks learn.
Approach
- Implemented tensor operations, a computation graph, and forward/backward passes for linear and activation layers entirely in NumPy.
- Replicated PyTorch-style optimizer and neural net module abstractions to support gradient-based training end-to-end.
Result
A fully functional deep learning library capable of training multi-layer networks, demonstrating mastery of the mathematical foundations underlying modern ML frameworks.
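The computation-graph idea behind the library can be sketched in a few dozen lines of NumPy. This is an illustrative toy (not the MyTorch code itself): a minimal Tensor with reverse-mode autodiff supporting matmul, ReLU, and sum, assuming gradient accumulation into a `.grad` field as in PyTorch.

```python
import numpy as np

class Tensor:
    """Minimal tensor with reverse-mode autodiff (illustrative sketch)."""
    def __init__(self, data, parents=(), backward_fn=None):
        self.data = np.asarray(data, dtype=float)
        self.grad = np.zeros_like(self.data)
        self.parents = parents          # upstream nodes in the computation graph
        self.backward_fn = backward_fn  # propagates self.grad to parents

    def __matmul__(self, other):
        out = Tensor(self.data @ other.data, parents=(self, other))
        def backward():
            self.grad += out.grad @ other.data.T
            other.grad += self.data.T @ out.grad
        out.backward_fn = backward
        return out

    def relu(self):
        out = Tensor(np.maximum(self.data, 0.0), parents=(self,))
        def backward():
            self.grad += (self.data > 0) * out.grad
        out.backward_fn = backward
        return out

    def sum(self):
        out = Tensor(self.data.sum(), parents=(self,))
        def backward():
            self.grad += np.ones_like(self.data) * out.grad
        out.backward_fn = backward
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule in reverse.
        order, seen = [], set()
        def visit(t):
            if id(t) not in seen:
                seen.add(id(t))
                for p in t.parents:
                    visit(p)
                order.append(t)
        visit(self)
        self.grad = np.ones_like(self.data)
        for t in reversed(order):
            if t.backward_fn:
                t.backward_fn()

# One forward/backward pass through a tiny linear + ReLU layer.
x = Tensor([[1.0, -2.0]])
w = Tensor([[0.5, 1.0], [2.0, -1.0]])
loss = (x @ w).relu().sum()
loss.backward()
```

The topological sort is what lets arbitrary graphs (not just sequential stacks of layers) backpropagate correctly: each node's gradient is complete before it is pushed to its parents.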
Enterprise Agentic Automation — L3Harris
Full-time enterprise analytics engineering with agentic pipelines on Palantir Foundry.
Problem
Manual supplier category assignment was time-intensive and error-prone at scale, creating a bottleneck in procurement analytics workflows.
Approach
- Designed and deployed an enterprise agentic automation system on Palantir Foundry to classify supplier categories without human intervention.
Result
Reduced manual effort by 50%+ and integrated the system into live production workflows used across the organization.
Amazon Electronics NLP Study
Sentiment classification and clustering across 100,000 electronics reviews.
Problem
Large-scale consumer review data is unstructured and noisy — extracting signal for sentiment and topic structure requires careful feature engineering and model selection.
Approach
- Applied TF-IDF embeddings with PCA, t-SNE, KMeans, DBSCAN, and hierarchical clustering to map the review landscape.
- Trained Random Forest, SVM, and logistic regression classifiers on binary sentiment labels.
Result
85% accuracy on sentiment prediction across a 100,000-review subset drawn from a corpus of 233 million records.
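The TF-IDF-plus-classifier pipeline can be sketched on toy data. The reviews and labels below are invented placeholders standing in for the Amazon corpus; the pipeline shape (vectorizer feeding a logistic regression) mirrors the approach described above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder reviews; the real study used 100,000 Amazon electronics reviews.
reviews = [
    "great sound quality, battery lasts all day",
    "terrible build, broke within a week",
    "love this camera, crisp photos",
    "awful customer service and noisy fan",
]
labels = [1, 0, 1, 0]  # 1 = positive sentiment, 0 = negative

# TF-IDF features (unigrams + bigrams) feeding a linear classifier.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(reviews, labels)
pred = clf.predict(["battery is great"])
```

The same pipeline object also exposes `predict_proba`, which is useful for thresholding or ranking reviews by sentiment confidence.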
Mario Kart Reinforcement Learning Agent
DQN agent trained to drive Mario Kart DS inside a live Nintendo DS emulator.
Problem
Training an RL agent on a real game requires bridging a hardware emulator, visual observations, and a reward signal derived from raw RAM — none of which are available out of the box.
Approach
- Wrapped the py-desmume Nintendo DS emulator in a custom OpenAI Gymnasium environment with grayscale 84×84 vision, 4-frame stacking, and RAM-extracted rewards for speed, lap progress, and collision penalties.
- Trained a Stable-Baselines3 DQN agent with a CNN policy on CUDA; built a live training viewer displaying metrics and action probabilities.
Result
Agent successfully completes a full lap on Yoshi Falls, demonstrating end-to-end RL from emulator integration through policy training to verifiable in-game behavior.
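The observation preprocessing described above (grayscale 84×84 frames, stacked 4 deep) can be sketched in plain NumPy. The nearest-neighbour resize and naive channel-mean grayscale are simplifying assumptions for illustration; the real environment wraps py-desmume inside a Gymnasium interface.

```python
import numpy as np
from collections import deque

def preprocess(frame_rgb):
    """Downscale an RGB emulator frame to 84x84 grayscale.

    The DS screen is 256x192; nearest-neighbour sampling and channel-mean
    grayscale are assumptions made for this sketch.
    """
    gray = frame_rgb.mean(axis=2)              # naive grayscale
    h, w = gray.shape
    rows = np.arange(84) * h // 84             # nearest-neighbour row indices
    cols = np.arange(84) * w // 84
    return gray[np.ix_(rows, cols)].astype(np.uint8)

class FrameStack:
    """Keep the last k preprocessed frames as the agent's observation."""
    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, frame):
        obs = preprocess(frame)
        for _ in range(self.k):                # fill the stack with the first frame
            self.frames.append(obs)
        return self.observation()

    def step(self, frame):
        self.frames.append(preprocess(frame))
        return self.observation()

    def observation(self):
        return np.stack(self.frames, axis=0)   # shape (k, 84, 84)

stack = FrameStack(k=4)
first = stack.reset(np.zeros((192, 256, 3), dtype=np.uint8))
nxt = stack.step(np.full((192, 256, 3), 255, dtype=np.uint8))
```

Stacking consecutive frames is what gives a feedforward CNN policy access to velocity: a single frame cannot distinguish a kart moving forward from one standing still.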
CarbonTrack CLI
Installable Python CLI for CO₂ emissions forecasting, visualization, and self-monitoring.
Problem
Global emissions datasets (OWID) are large and heterogeneous — extracting country-level trends and forward projections requires a reproducible, composable pipeline.
Approach
- Built a proper Python package (pyproject.toml) with modules for ingestion, preprocessing, scikit-learn forecasting, and multi-country comparison plots.
- Integrated CodeCarbon to track the tool's own emissions footprint per run, making the environmental cost of analysis explicit.
Result
A fully installable CLI tool with batch forecasting, grid visualizations, and a self-auditing carbon log — packaged for reuse across datasets.
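The forecasting step can be sketched with scikit-learn on a synthetic series. The yearly emissions values below are invented for illustration; the real tool ingests the OWID dataset and fits per-country models.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic country-level series; the real pipeline reads OWID data.
years = np.arange(2000, 2021).reshape(-1, 1)
emissions = 5.0 + 0.12 * (years.ravel() - 2000)   # Mt CO2, linear trend

# Fit on the historical window, then project forward a decade.
model = LinearRegression().fit(years, emissions)
future = np.arange(2021, 2031).reshape(-1, 1)
forecast = model.predict(future)
```

Keeping the model behind a single `fit`/`predict` boundary is what makes the CLI composable: the same interface works whether the forecaster is a linear trend or something richer.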
F1 Lap Telemetry Visualizer
Streamlit app animating Formula 1 driver telemetry against the fastest lap.
Problem
Raw F1 telemetry data (speed, throttle, gear, braking) is rich but not easily comparable across drivers without an interactive visualization layer.
Approach
- Pulled 2024 race telemetry via the FastF1 API and stored it in a local DuckDB database for fast querying.
- Built a Streamlit app with animated matplotlib track maps, color-coded telemetry overlays, and play/pause/scrub controls.
Result
Live-deployed app at dsan5200group25.streamlit.app enabling lap-by-lap driver comparison across any 2024 race.
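The core comparison logic is resampling each driver's telemetry onto a common distance axis so traces line up point-for-point. The sample values below are invented; the app pulls real channels (speed, throttle, gear, braking) via FastF1 and queries them from DuckDB.

```python
import numpy as np

def align_on_distance(dist, speed, grid):
    """Resample a speed trace onto a shared distance grid so two drivers'
    laps can be compared point-for-point (np.interp needs increasing x)."""
    return np.interp(grid, dist, speed)

# Invented sample points standing in for real FastF1 telemetry.
grid = np.linspace(0, 1000, 101)  # metres along the lap
d1 = align_on_distance(np.array([0.0, 500.0, 1000.0]),
                       np.array([200.0, 250.0, 210.0]), grid)
d2 = align_on_distance(np.array([0.0, 400.0, 1000.0]),
                       np.array([195.0, 260.0, 215.0]), grid)
delta = d2 - d1                   # km/h gap at every grid point
```

Aligning on distance rather than time is the standard trick for lap comparison: two laps of different durations still pass through the same corners at the same track positions.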
YOLOv8 Video Object Detection
Gradio app running YOLOv8n on uploaded video, returning an annotated GIF.
Problem
Demonstrating real-time object detection on arbitrary video input requires packaging inference, annotation, and a usable UI into a single deployable unit.
Approach
- Wrapped Ultralytics YOLOv8n frame-by-frame inference in a Gradio interface; annotated bounding boxes with class labels and confidence scores and converted output to animated GIF.
Result
Deployed on Hugging Face Spaces — detects 80 COCO object classes across uploaded video with no local setup required.
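The annotate-then-assemble-GIF half of the pipeline can be sketched with Pillow. The bounding boxes here are hard-coded placeholders; in the real app they come from YOLOv8n inference on each decoded video frame.

```python
from PIL import Image, ImageDraw

def annotate(frame, boxes):
    """Draw (x0, y0, x1, y1, label) boxes on a frame. Placeholder boxes here;
    the deployed app gets them from YOLOv8n per-frame inference."""
    draw = ImageDraw.Draw(frame)
    for x0, y0, x1, y1, label in boxes:
        draw.rectangle([x0, y0, x1, y1], outline="red", width=2)
        draw.text((x0, max(y0 - 12, 0)), label, fill="red")
    return frame

# Annotate a sequence of frames with a box drifting right, then save as a GIF.
frames = [annotate(Image.new("RGB", (160, 120), "black"),
                   [(10 + 5 * i, 20, 60 + 5 * i, 80, "car 0.91")])
          for i in range(8)]
frames[0].save("detections.gif", save_all=True,
               append_images=frames[1:], duration=100, loop=0)
```

Emitting a GIF rather than a re-encoded video keeps the Gradio output lightweight and viewable in any browser without a codec.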
Skrmiish User Segmentation
Cohort discovery and churn prediction on 770 TB of live game telemetry.
Problem
A fast-growing game studio needed to understand player behavior at scale to inform retention strategy, but lacked structured user segments.
Approach
- Applied KMeans + PCA to the full 770 TB Azure telemetry database to surface distinct, actionable user cohorts.
- Built churn prediction models using QDA and logistic regression to flag at-risk players early.
Result
Delivered user segments directly adopted for strategy optimization and a churn model with a 0.81 F1 score.
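The segmentation step (dimensionality reduction followed by clustering) can be sketched on synthetic data. The two synthetic player groups below are invented stand-ins for telemetry features such as session counts, spend, and playtime; the real analysis ran over the 770 TB Azure database.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic player populations with distinct behavioral profiles.
casual = rng.normal([2.0, 0.5, 1.0, 0.2, 3.0], 0.3, size=(100, 5))
hardcore = rng.normal([9.0, 4.0, 8.0, 2.5, 20.0], 0.3, size=(100, 5))
X = np.vstack([casual, hardcore])

# Project to 2 components, then cluster in the reduced space.
reduced = PCA(n_components=2).fit_transform(X)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(reduced)
```

Reducing first makes the clusters both cheaper to fit and easier to plot, which matters when the cohorts need to be explained to a non-technical strategy audience.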
Skills
Languages
ML / AI
Data Platforms
Visualization
Get in touch
Available for data science and ML consulting engagements.