
Portfolio

Sean Patrick Morris

MS Data Science, Georgetown University. Enterprise Analytics & AI engineer at L3Harris.

About

I hold an MS in Data Science with an AI concentration from Georgetown University, and work full-time as an Analytics Engineer in the Supply Chain Engineering department at L3Harris Technologies, where I build and deploy agentic systems on Palantir Foundry and design enterprise analytics pipelines on Snowflake.

My path into data science runs through economics — a BA from Georgetown gave me a rigorous grounding in causal inference and quantitative modeling before I moved into machine learning, deep learning, and large-scale data engineering.

I am available for consulting engagements in ML engineering, data pipeline architecture, and applied AI. Reach out at [email protected].

Education

Georgetown University

MS Data Science, AI Concentration

Aug 2024 – Dec 2025 (Graduated) · GPA 3.9

Neural Nets & Deep Learning, Reinforcement Learning, Advanced NLP, Big Data


Georgetown University

BA Economics

Aug 2020 – May 2024 · GPA 3.4

3.9 MS GPA · 2+ Internships

Projects

Featured

MyTorch

A deep learning library built from scratch: a PyTorch-style core with no dependencies beyond NumPy.

Problem

Deep learning frameworks abstract away the mechanics of backpropagation and gradient flow, making it difficult to build genuine intuition for how neural networks learn.

Approach

  • Implemented tensor operations, a computation graph, and forward/backward passes for linear and activation layers entirely in NumPy.
  • Replicated PyTorch-style optimizer and neural net module abstractions to support gradient-based training end-to-end.
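The core idea is a tensor that records, for each operation, how to push gradients back to its inputs. A minimal sketch (not the actual MyTorch code; the class and method names here are illustrative):

```python
import numpy as np

class Tensor:
    """Minimal autograd tensor: wraps a NumPy array and records the
    backward function that propagates gradients to its parents."""
    def __init__(self, data, parents=()):
        self.data = np.asarray(data, dtype=float)
        self.grad = np.zeros_like(self.data)
        self._parents = parents
        self._backward_fn = lambda grad: None  # leaves have no parents

    def __matmul__(self, other):
        out = Tensor(self.data @ other.data, (self, other))
        def backward_fn(grad):
            self.grad += grad @ other.data.T
            other.grad += self.data.T @ grad
        out._backward_fn = backward_fn
        return out

    def relu(self):
        out = Tensor(np.maximum(self.data, 0.0), (self,))
        def backward_fn(grad):
            self.grad += grad * (self.data > 0)
        out._backward_fn = backward_fn
        return out

    def sum(self):
        out = Tensor(self.data.sum(), (self,))
        def backward_fn(grad):
            self.grad += grad * np.ones_like(self.data)
        out._backward_fn = backward_fn
        return out

    def backward(self):
        # Topologically order the computation graph, then walk it in
        # reverse, accumulating gradients into each parent.
        order, seen = [], set()
        def visit(t):
            if id(t) not in seen:
                seen.add(id(t))
                for p in t._parents:
                    visit(p)
                order.append(t)
        visit(self)
        self.grad = np.ones_like(self.data)
        for t in reversed(order):
            t._backward_fn(t.grad)
```

With `x = Tensor([[1., 2.]])` and `w = Tensor([[3.], [4.]])`, `(x @ w).relu().sum().backward()` fills `x.grad` with `[[3., 4.]]`, exactly the chain-rule result.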

Result

A fully functional deep learning library capable of training multi-layer networks, demonstrating mastery of the mathematical foundations underlying modern ML frameworks.

Python · NumPy · Deep Learning · Backpropagation

Enterprise Agentic Automation — L3Harris

Full-time enterprise analytics engineering with agentic pipelines on Palantir Foundry.

Problem

Manual supplier category assignment was time-intensive and error-prone at scale, creating a bottleneck in procurement analytics workflows.

Approach

  • Designed and deployed an enterprise agentic automation system on Palantir Foundry to classify supplier categories without human intervention.

Result

Reduced manual effort by 50%+ and integrated the system into live production workflows used across the organization.

Palantir Foundry · Python · SQL · Snowflake · Power BI

Amazon Electronics NLP Study

Sentiment classification and clustering across 100,000 electronics reviews.

Problem

Large-scale consumer review data is unstructured and noisy — extracting signal for sentiment and topic structure requires careful feature engineering and model selection.

Approach

  • Applied TF-IDF features with PCA and t-SNE for dimensionality reduction, then KMeans, DBSCAN, and hierarchical clustering to map the review landscape.
  • Trained Random Forest, SVM, and logistic regression classifiers on binary sentiment labels.
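The classification side reduces to a TF-IDF-plus-classifier pipeline. A toy sketch of the logistic regression branch, with a four-review stand-in for the real 100,000-review corpus:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for the corpus; the real study used 100,000 labeled
# electronics reviews.
reviews = [
    "great sound quality, battery lasts all day",
    "excellent screen and very fast shipping",
    "broke after two days, terrible build",
    "stopped working, waste of money",
]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(reviews, labels)
pred = clf.predict(["battery is great and sound is excellent"])
```

The same pipeline shape accepts Random Forest or SVM estimators in place of the logistic regression, which is what makes the model comparison cheap to run.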

Result

85% accuracy on sentiment prediction across a 100,000-review subset drawn from a corpus of 233 million records.

Python · Scikit-learn · NLP · TF-IDF · Pandas · Matplotlib

Mario Kart Reinforcement Learning Agent

DQN agent trained to drive Mario Kart DS inside a live Nintendo DS emulator.

Problem

Training an RL agent on a real game requires bridging a hardware emulator, visual observations, and a reward signal derived from raw RAM — none of which are available out of the box.

Approach

  • Wrapped the py-desmume Nintendo DS emulator in a custom OpenAI Gymnasium environment with grayscale 84×84 vision, 4-frame stacking, and RAM-extracted rewards for speed, lap progress, and collision penalties.
  • Trained a Stable-Baselines3 DQN agent with a CNN policy on CUDA; built a live training viewer displaying metrics and action probabilities.
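The environment's job is to turn emulator frames and RAM reads into the observation/reward contract the DQN expects. A runnable sketch of that contract, with a stubbed emulator in place of py-desmume (the RAM addresses, stub, and reward scaling here are hypothetical placeholders):

```python
import numpy as np
from collections import deque

class StubEmulator:
    """Hypothetical stand-in for py-desmume: the real code reads the DS
    framebuffer and RAM. Here both are fabricated so the sketch runs."""
    def __init__(self):
        self.speed = 0
    def step_frame(self, action):
        self.speed = min(self.speed + (10 if action == 0 else -5), 100)
    def read_frame(self):
        return np.random.randint(0, 256, (192, 256), dtype=np.uint8)
    def read_ram_speed(self):
        return self.speed

def preprocess(frame, size=84):
    # Nearest-neighbour downsample to 84x84 grayscale in [0, 1].
    ys = np.linspace(0, frame.shape[0] - 1, size).astype(int)
    xs = np.linspace(0, frame.shape[1] - 1, size).astype(int)
    return frame[np.ix_(ys, xs)].astype(np.float32) / 255.0

class MarioKartEnv:
    """Gymnasium-style reset()/step() contract with 4-frame stacking
    and a RAM-derived speed reward."""
    def __init__(self):
        self.emu = StubEmulator()
        self.frames = deque(maxlen=4)

    def reset(self):
        self.emu = StubEmulator()
        first = preprocess(self.emu.read_frame())
        for _ in range(4):
            self.frames.append(first)
        return np.stack(self.frames), {}

    def step(self, action):
        self.emu.step_frame(action)
        self.frames.append(preprocess(self.emu.read_frame()))
        reward = self.emu.read_ram_speed() / 100.0  # speed-based shaping
        return np.stack(self.frames), reward, False, False, {}
```

Stacking four frames gives the CNN policy enough temporal context to infer velocity from pixels, which single frames cannot provide.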

Result

Agent successfully completes a full lap on Yoshi Falls, demonstrating end-to-end RL from emulator integration through policy training to verifiable in-game behavior.

Python · PyTorch · Stable-Baselines3 · OpenAI Gymnasium · DQN · CUDA

CarbonTrack CLI

Installable Python CLI for CO₂ emissions forecasting, visualization, and self-monitoring.

Problem

Global emissions datasets such as Our World in Data (OWID) are large and heterogeneous — extracting country-level trends and forward projections requires a reproducible, composable pipeline.

Approach

  • Built a proper Python package (pyproject.toml) with modules for ingestion, preprocessing, scikit-learn forecasting, and multi-country comparison plots.
  • Integrated CodeCarbon to track the tool's own emissions footprint per run, making the environmental cost of analysis explicit.
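The forecasting module boils down to fitting a trend per country and projecting it forward. A minimal stand-in using a linear fit on a synthetic series (the real CLI uses scikit-learn models on OWID data; the function name here is illustrative):

```python
import numpy as np

def forecast_emissions(years, emissions, horizon=5):
    """Fit a linear trend to a country's annual CO2 series and
    project it `horizon` years forward."""
    slope, intercept = np.polyfit(years, emissions, deg=1)
    future = np.arange(years[-1] + 1, years[-1] + 1 + horizon)
    return future, slope * future + intercept

# Synthetic, roughly linear series standing in for one country's data.
years = np.arange(2015, 2023)
emissions = 400.0 + 2.5 * (years - 2015)
future, projected = forecast_emissions(years, emissions, horizon=3)
```

Keeping ingestion, preprocessing, and forecasting in separate modules is what lets the CLI batch this over many countries and render the comparison grids.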

Result

A fully installable CLI tool with batch forecasting, grid visualizations, and a self-auditing carbon log — packaged for reuse across datasets.

Python · scikit-learn · pandas · matplotlib · CodeCarbon · CLI

F1 Lap Telemetry Visualizer

Streamlit app animating Formula 1 driver telemetry against the fastest lap.

Problem

Raw F1 telemetry data (speed, throttle, gear, braking) is rich but not easily comparable across drivers without an interactive visualization layer.

Approach

  • Pulled 2024 race telemetry via the FastF1 API and stored it in a local DuckDB database for fast querying.
  • Built a Streamlit app with animated matplotlib track maps, color-coded telemetry overlays, and play/pause/scrub controls.
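Comparing drivers point-for-point requires resampling their telemetry onto a shared distance axis, since each car's data is logged at its own rate. A sketch of that alignment step on synthetic traces (the real app pulls these arrays from FastF1 via DuckDB):

```python
import numpy as np

def align_on_distance(dist_a, speed_a, dist_b, speed_b, n=200):
    """Resample two drivers' speed traces onto a shared distance grid
    so they can be compared point-for-point against the fastest lap."""
    grid = np.linspace(0, min(dist_a[-1], dist_b[-1]), n)
    return grid, np.interp(grid, dist_a, speed_a), np.interp(grid, dist_b, speed_b)

# Synthetic telemetry: distance in metres, speed in km/h.
dist_a = np.linspace(0, 5000, 120)
speed_a = 250 + 50 * np.sin(dist_a / 400)
dist_b = np.linspace(0, 5000, 90)   # logged at a different sample rate
speed_b = 245 + 50 * np.sin(dist_b / 400)
grid, a, b = align_on_distance(dist_a, speed_a, dist_b, speed_b)
delta = a - b  # positive where driver A is faster
```

The animated track map then just scrubs an index along `grid`, which is also what drives the play/pause controls.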

Result

Live-deployed app at dsan5200group25.streamlit.app enabling lap-by-lap driver comparison across any 2024 race.

Python · Streamlit · DuckDB · FastF1 · matplotlib · pandas

YOLOv8 Video Object Detection

Gradio app running YOLOv8n on uploaded video, returning an annotated GIF.

Problem

Demonstrating real-time object detection on arbitrary video input requires packaging inference, annotation, and a usable UI into a single deployable unit.

Approach

  • Wrapped Ultralytics YOLOv8n frame-by-frame inference in a Gradio interface; annotated bounding boxes with class labels and confidence scores and converted output to animated GIF.
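The core of the app is a per-frame loop: run detection, filter by confidence, draw boxes, collect the annotated frames for GIF assembly. A runnable sketch with a stubbed detector in place of the real `ultralytics` model call (the stub and its fixed box are hypothetical):

```python
import numpy as np

def stub_detect(frame):
    """Hypothetical stand-in for YOLOv8n inference; the real app calls
    the ultralytics model and reads boxes, labels, and confidences.
    Returns (x1, y1, x2, y2, label, confidence) tuples."""
    return [(10, 10, 40, 40, "person", 0.91)]

def draw_box(frame, x1, y1, x2, y2):
    # Paint a 1-pixel white rectangle outline onto an RGB frame.
    frame[y1:y2, x1], frame[y1:y2, x2] = 255, 255
    frame[y1, x1:x2], frame[y2, x1:x2] = 255, 255
    return frame

def annotate_video(frames, min_conf=0.5):
    """Frame-by-frame annotation pass run before GIF assembly."""
    out = []
    for frame in frames:
        for x1, y1, x2, y2, label, conf in stub_detect(frame):
            if conf >= min_conf:
                frame = draw_box(frame.copy(), x1, y1, x2, y2)
        out.append(frame)
    return out

frames = [np.zeros((64, 64, 3), dtype=np.uint8) for _ in range(3)]
annotated = annotate_video(frames)
```

Gradio then only needs a function from uploaded video to GIF path, so the whole pipeline deploys as a single Hugging Face Space.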

Result

Deployed on Hugging Face Spaces — detects 80 COCO object classes across uploaded video with no local setup required.

Python · YOLOv8 · Gradio · Hugging Face · Computer Vision

Skrmiish User Segmentation

Cohort discovery and churn prediction on 770 TB of live game telemetry.

Problem

A fast-growing game studio needed to understand player behavior at scale to inform retention strategy, but lacked structured user segments.

Approach

  • Applied KMeans + PCA to the full 770 TB Azure telemetry database to surface distinct, actionable user cohorts.
  • Built churn prediction models using QDA and logistic regression to flag at-risk players early.
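The cohorting step is PCA to compress the high-dimensional telemetry features, then KMeans on the projection. A small sketch on synthetic data (two fabricated behavioural cohorts standing in for the real player features):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic stand-in for player telemetry features: two behavioural
# cohorts (e.g. casual vs. competitive) in a 10-dimensional space.
casual = rng.normal(0.0, 1.0, size=(200, 10))
competitive = rng.normal(6.0, 1.0, size=(200, 10))
X = np.vstack([casual, competitive])

# Compress to a few principal components, then cluster the projection,
# mirroring the KMeans + PCA cohorting step.
Z = PCA(n_components=3).fit_transform(X)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(Z)
```

Running PCA first keeps KMeans tractable at telemetry scale and strips correlated, noisy dimensions before distances are computed.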

Result

Delivered user segments directly adopted for strategy optimization and a churn model with a 0.81 F1 score.

Python · Scikit-learn · Azure · KMeans · PCA · Logistic Regression

Skills

Languages

Python · SQL · R · Shell / zsh · HTML / CSS

ML / AI

Neural Networks · Random Forests · SVM · Gradient Boosting · Clustering · PCA / t-SNE · Reinforcement Learning · NLP

Data Platforms

Snowflake · AWS (Redshift, EC2, S3) · Azure · Google BigQuery · MongoDB · Palantir Foundry

Visualization

Power BI · Matplotlib · Seaborn · ggplot2 · Palantir Workshop

Get in touch

Available for data science and ML consulting engagements.