Projects

Work spanning machine learning, distributed systems, and software engineering.

AI Deep Dive

Jan 2025 – Present

An open-source AI/ML learning platform with interactive courses, coding challenges with server-side evaluation, and capstone projects. The flagship course covers building GPT from scratch, from tokenization to the full Transformer architecture.

Built an open-source learning platform (2,000+ users) with interactive courses, visualizations, and a server-side code judge
Designed the flagship course, Build GPT from Scratch, with 20+ coding challenges, interactive visualizations, and a capstone project covering transformers end-to-end
Wrote the code judge on FastAPI and Redis Streams with isolate-sandboxed workers, seccomp filters, and configurable memory limits
Set up the frontend with Next.js, MDX for content authoring, KaTeX for math rendering, and the Monaco editor with Vim mode
Deployed with Prometheus and Grafana monitoring, automated backups, and health/readiness probes

PyTorch, Transformers, FastAPI, Redis Streams, Deep Learning, Next.js, TypeScript, MDX, KaTeX, Docker, Prometheus

Transformer From Scratch: English-German Translator

May 2025 – Jul 2025

Code Demo

From-scratch PyTorch implementation of the original Transformer (Vaswani et al., 2017) with multi-GPU training and live W&B experiment tracking. Trained a 65M-parameter model on WMT14 English-German to 25.53 BLEU.

Implemented all core Transformer components from scratch: multi-head attention, feed-forward networks, positional encoding, encoder-decoder stacks, and layer normalization
Applied scaling laws (Kaplan et al., 2020) and compute-optimal training insights (Hoffmann et al., 2022) to guide model sizing and training schedule
Designed a multi-node, multi-GPU training pipeline with PyTorch DDP and gradient synchronization
Implemented beam search and greedy decoding for inference
Trained for 120K+ steps on 6x V100 GPUs, reaching 25.53 BLEU on WMT14 En-De
Integrated Weights & Biases for live experiment tracking of loss, perplexity, and BLEU scores
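
The decoding step can be illustrated with a minimal sketch. It is not the project's code: the `step_fn` interface and the toy probability table are hypothetical stand-ins for a trained decoder, and no length normalization is applied. Greedy decoding is treated here as beam search with a beam width of 1.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: maps a token prefix to next-token log-probabilities.
StepFn = Callable[[Tuple[str, ...]], Dict[str, float]]
Hypothesis = Tuple[Tuple[str, ...], float]


def beam_search(step_fn: StepFn, beam_width: int, max_len: int,
                eos: str = "</s>") -> Hypothesis:
    """Return the highest-scoring (sequence, log-probability) pair found."""
    beams: List[Hypothesis] = [((), 0.0)]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates: List[Hypothesis] = []
        for prefix, score in beams:
            for token, logp in step_fn(prefix).items():
                candidates.append((prefix + (token,), score + logp))
        # Keep only the best `beam_width` expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == eos:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len still compete
    return max(finished, key=lambda c: c[1])


def greedy_decode(step_fn: StepFn, max_len: int, eos: str = "</s>") -> Hypothesis:
    """Greedy decoding: always follow the single most likely next token."""
    return beam_search(step_fn, beam_width=1, max_len=max_len, eos=eos)


def toy_model(prefix: Tuple[str, ...]) -> Dict[str, float]:
    """Toy distribution in which the greedy first choice is a trap."""
    table = {
        (): {"a": math.log(0.6), "b": math.log(0.4)},
        ("a",): {"x": math.log(0.5), "y": math.log(0.5)},
        ("b",): {"z": math.log(0.9), "w": math.log(0.1)},
    }
    return table.get(prefix, {eos_token: 0.0 for eos_token in ["</s>"]})


greedy_result = greedy_decode(toy_model, max_len=5)
beam_result = beam_search(toy_model, beam_width=2, max_len=5)
```

On this toy table, greedy decoding commits to the locally best first token, while a beam of width 2 keeps the alternative alive long enough to find the sequence with higher total probability.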

PyTorch, Transformers, Distributed Training, NLP, Weights & Biases, Hugging Face Datasets, Python

MYA: My Assistant

Sep 2025 – Present

Code

Personal AI assistant that unifies search and management across email, calendar, tasks, and local files. Uses a hybrid RAG pipeline (BM25 + semantic embeddings + Reciprocal Rank Fusion) for retrieval, an LLM-powered email intelligence engine, and multi-interface access through a web dashboard, a CLI, a Raycast extension, and a Chrome extension.

Built a RAG pipeline with hybrid retrieval combining BM25 keyword search, dense semantic embeddings, and Reciprocal Rank Fusion for cross-source retrieval
Built an email intelligence engine with LLM-driven priority scoring, auto-classification, labeling, and context-aware draft generation
Designed a pluggable LLM provider layer supporting OpenAI, Groq, and Ollama for local inference
Developed a FastAPI async backend with ChromaDB for vector storage, SQLite for metadata, and APScheduler for background indexing
Built a Next.js web dashboard with email triage, calendar briefings, task management, and daily digest views
Built a CLI (Typer + Rich), a Raycast extension, and a Chrome extension for multi-interface access
Integrated Gmail, Google Calendar, and TickTick APIs with OAuth, background sync, and notification scheduling
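
Reciprocal Rank Fusion, the step that merges the BM25 and embedding result lists, can be sketched in a few lines. This is an illustration rather than the project's code: the document IDs are made up, and the constant `k = 60` is the value commonly used in the literature, assumed here rather than taken from the project.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken alphabetically by ID.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers for one query.
bm25_hits = ["email_12", "task_3", "file_7"]
dense_hits = ["task_3", "event_9", "email_12"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
```

Because the fusion uses only ranks, it does not need BM25 scores and cosine similarities to be on a comparable scale, which is the usual reason for choosing it in hybrid retrieval.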

RAG, LLMs, ChromaDB, Python, FastAPI, OpenAI, Next.js, TypeScript, BM25, Groq, OAuth

Scriptify: Handwriting Synthesis from Text

Apr 2025 – Jun 2025

Code Demo

End-to-end handwriting synthesis system that generates realistic handwriting from text. Uses a custom deep learning model (~5M parameters) for synthesis, a FastAPI backend for inference, and a React frontend for live rendering.

Trained an attention-based LSTM with a Gaussian Mixture Model output layer in PyTorch, following Graves (2013), "Generating Sequences With Recurrent Neural Networks"
Optimized inference for real-time generation with TorchScript compilation and quantization
Served the model through FastAPI with a React frontend and interactive canvas for live rendering and stroke animation
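
To make the mixture output layer concrete, here is a minimal sketch of drawing one pen offset from a mixture of bivariate Gaussians, the distribution such a layer parameterizes. It is plain Python rather than PyTorch, and the `Component` type and `sample_offset` helper are hypothetical names for illustration. The end-of-stroke probability and any sampling bias are left out.

```python
import math
import random
from typing import List, NamedTuple, Tuple


class Component(NamedTuple):
    """One bivariate Gaussian in the mixture (hypothetical container)."""
    weight: float   # mixture weight; weights are assumed to sum to 1
    mu_x: float
    mu_y: float
    sigma_x: float
    sigma_y: float
    rho: float      # correlation between x and y, in [-1, 1]


def sample_offset(components: List[Component], rng: random.Random) -> Tuple[float, float]:
    """Pick a component by weight, then sample a correlated (dx, dy) pair."""
    chosen = rng.choices(components, weights=[c.weight for c in components], k=1)[0]
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    # Build two unit normals with correlation rho from independent ones.
    correlated = chosen.rho * z1 + math.sqrt(1.0 - chosen.rho ** 2) * z2
    dx = chosen.mu_x + chosen.sigma_x * z1
    dy = chosen.mu_y + chosen.sigma_y * correlated
    return dx, dy


rng = random.Random(0)
mixture = [
    Component(0.7, 1.0, 0.0, 0.2, 0.1, 0.3),
    Component(0.3, -0.5, 0.8, 0.1, 0.3, -0.2),
]
stroke = [sample_offset(mixture, rng) for _ in range(5)]
```

At generation time a loop like this would run once per time step, with the network producing fresh mixture parameters each step.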

PyTorch, Deep Learning, LSTM, TorchScript, FastAPI, React, Python

RoPE Transformer: Rotary Positional Embeddings

Aug 2025

Code

Drop-in replacement for sinusoidal positional encodings in the base Transformer using Rotary Positional Embeddings (RoPE). Achieved 25.97 BLEU on WMT14 En-De (+0.44 over baseline) with no additional parameters and negligible extra compute.

Extended multi-head attention with rotary transformations, reusing shared feed-forward, normalization, and embedding modules from the base Transformer
Designed pluggable attention interfaces for swapping between standard and RoPE mechanisms
Implemented frequency-based rotation matrices for query/key embeddings (Su et al., 2021) with cached rotations for training efficiency
Achieved 25.97 BLEU on WMT14 En-De (baseline: 25.53) with no additional parameters on 6x V100 GPUs
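
The rotation itself can be sketched in plain Python, under a few assumptions: the function name `rope_rotate` is hypothetical, adjacent dimensions are paired as in Su et al. (some implementations pair the two halves of the vector instead), and the base of 10000 is the paper's default. The project's version works on PyTorch tensors and caches the rotations, which this sketch does not.

```python
import math
from typing import List, Sequence


def rope_rotate(vec: Sequence[float], position: int, base: float = 10000.0) -> List[float]:
    """Rotate each adjacent pair (vec[i], vec[i+1]) by position * theta.

    For pair index j = i / 2 the frequency is theta_j = base ** (-2j / d),
    which equals base ** (-i / d) for the even index i used below.
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("RoPE needs an even embedding dimension")
    out: List[float] = []
    for i in range(0, d, 2):
        angle = position * base ** (-i / d)
        cos_a, sin_a = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos_a - y * sin_a, x * sin_a + y * cos_a])
    return out


query = [0.3, -1.2, 0.7, 2.0]
key = [1.1, 0.4, -0.5, 0.9]
rotated_query = rope_rotate(query, position=5)
rotated_key = rope_rotate(key, position=3)
```

The property that matters for attention is that the dot product of a rotated query and a rotated key depends only on the distance between their positions, so shifting both positions by the same amount leaves the attention score unchanged.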

PyTorch, Transformer, Distributed Training, RoPE, ML Research, Python

Distributed Task Queue

Aug 2025 – Present

Code

Distributed task queue for offloading work to background workers. Features at-least-once delivery, priority queues, exponential backoff with Dead-Letter Exchange, and PostgreSQL-backed result storage.

Designed the end-to-end pipeline (API, broker, workers, result store) with at-least-once delivery and idempotency guarantees
Implemented durable queuing with manual ACK/NACK and persisted results before ACK to prevent lost work during failures
Exposed a REST API for job submission, status retrieval, and health checks
Implemented automatic retries with exponential backoff (5s to 1h) using RabbitMQ TTL and Dead-Letter Exchange
Designed priority queuing (default/high) with tuned worker prefetch/concurrency to balance throughput and latency
Stored task status, attempts, timestamps, and results in PostgreSQL with optional idempotency keys
Integrated Prometheus metrics and a lightweight dashboard for observability
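
The retry schedule can be illustrated with a short sketch. The project is written in Go, so this Python version is purely illustrative, and the doubling factor is an assumption: the description above only fixes the 5-second floor and the 1-hour ceiling. The queue arguments are the standard RabbitMQ ones for a wait queue that dead-letters expired messages back to a work exchange; the exchange name is hypothetical.

```python
from typing import Dict, Union


def backoff_delay_seconds(attempt: int, base: float = 5.0, cap: float = 3600.0) -> float:
    """Delay before retry number `attempt` (1-based): base * 2^(attempt-1), capped."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return min(base * 2 ** (attempt - 1), cap)


def retry_queue_arguments(delay_seconds: float, work_exchange: str) -> Dict[str, Union[int, str]]:
    """Arguments for a wait queue: messages expire after the delay and are
    dead-lettered to `work_exchange`, where workers pick them up again."""
    return {
        "x-message-ttl": int(delay_seconds * 1000),  # RabbitMQ TTLs are in milliseconds
        "x-dead-letter-exchange": work_exchange,
    }


schedule = [backoff_delay_seconds(n) for n in range(1, 12)]
first_retry_queue = retry_queue_arguments(schedule[0], work_exchange="tasks")
```

With these defaults the delay doubles from 5 seconds and reaches the 1-hour cap on the eleventh attempt.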

Go, RabbitMQ, PostgreSQL, Docker, Prometheus, Grafana

Neural Network From Scratch

Apr 2025 – Jul 2025

Code

Neural network library built from scratch in NumPy with modular layers supporting MLPs, RNNs, and CNNs. Implements forward and backward passes (full backpropagation) and multiple optimizers.

Implemented core architectures (MLP, RNN, CNN) with forward and backward passes in pure NumPy
Wrote activation functions (ReLU, Sigmoid, Softmax, Tanh) and loss functions for regression (MSE, MAE) and classification (Cross-Entropy)
Added He and Xavier weight initialization for stable gradient flow during training
Built data loaders and preprocessing utilities for training pipelines
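
As a small example of the forward/backward pairing, the sketch below implements softmax with cross-entropy loss and its gradient for a single example. It uses only the standard library instead of NumPy to stay dependency-free, and the function names are illustrative rather than the library's API.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Forward pass: negative log-probability of the target class."""
    return -math.log(softmax(logits)[target])


def cross_entropy_grad(logits: Sequence[float], target: int) -> List[float]:
    """Backward pass: d(loss)/d(logits) = softmax(logits) - one_hot(target)."""
    probs = softmax(logits)
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]


example_logits = [2.0, -1.0, 0.5]
loss = cross_entropy(example_logits, target=2)
grad = cross_entropy_grad(example_logits, target=2)
```

Combining softmax and cross-entropy in one backward step gives the simple "probabilities minus one-hot" gradient, and a finite-difference comparison against the forward pass is a quick way to check it.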

NumPy, Python, Machine Learning, Pytest

HDFS Cloud Starter

Jul 2025 – Aug 2025

Code

Deploys a multi-node Hadoop HDFS cluster on GCP with a single command. Uses Terraform for infrastructure provisioning and Ansible for configuration management, orchestrated by a shell script.

Provisioned VMs, networking, and firewall rules on GCP using Terraform
Wrote Ansible playbooks to automate Hadoop installation, Java setup, SSH config, and user creation across nodes
Generated Ansible inventory dynamically from Terraform outputs to bridge provisioning and configuration
Created a unified Bash script to orchestrate end-to-end infrastructure provisioning and cluster setup
Added a test MapReduce job (word count) to validate the deployed cluster
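
The inventory-generation step can be sketched as follows. The input shape matches what `terraform output -json` prints (each output wrapped in an object with a `value` field), but the output names `namenode_ip` and `datanode_ips`, the group names, and the IP addresses are all hypothetical; the real project may name and group things differently.

```python
import json
from typing import List


def build_inventory(terraform_output_json: str) -> str:
    """Turn `terraform output -json` text into an INI-style Ansible inventory."""
    outputs = json.loads(terraform_output_json)
    namenode_ip: str = outputs["namenode_ip"]["value"]      # assumed output name
    datanode_ips: List[str] = outputs["datanode_ips"]["value"]  # assumed output name
    lines = [
        "[namenode]",
        namenode_ip,
        "",
        "[datanodes]",
        *datanode_ips,
        "",
        "[hadoop:children]",  # parent group containing both roles
        "namenode",
        "datanodes",
    ]
    return "\n".join(lines) + "\n"


# Sample input using documentation-range IP addresses.
sample_output = json.dumps({
    "namenode_ip": {"sensitive": False, "type": "string", "value": "203.0.113.10"},
    "datanode_ips": {
        "sensitive": False,
        "type": ["list", "string"],
        "value": ["203.0.113.11", "203.0.113.12"],
    },
})
inventory_text = build_inventory(sample_output)
```

In a pipeline like the one described, the orchestration script would write this text to an inventory file between the Terraform apply and the Ansible playbook run.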

Terraform, Ansible, GCP, Hadoop, HDFS, MapReduce, Bash, Python

Kafka-Lite

Aug 2025 – Present

Code

Kafka implementation in Java built through the Codecrafters challenge. Supports TCP networking, binary wire protocol parsing, API version negotiation, concurrent clients, and message consumption via Fetch. Producer functionality is in progress.

Built a TCP server in Java handling multiple concurrent client connections
Implemented Kafka's binary wire protocol parsing with correlation IDs and API version negotiation
Added DescribeTopicPartitions and Fetch request handling for reading messages from disk
Currently implementing the Produce API for message publishing
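
The fixed start of a Kafka request is simple enough to sketch. The project itself is in Java, so this Python version is only an illustration, and the `RequestHeader` type and helper names are hypothetical. It reads the big-endian length prefix, API key, API version, and correlation ID, then frames a response that echoes the correlation ID back; the client ID and any tagged fields that follow in a real header are ignored.

```python
import struct
from typing import NamedTuple


class RequestHeader(NamedTuple):
    message_size: int    # int32: number of bytes after this field
    api_key: int         # int16: which API is being called (18 = ApiVersions)
    api_version: int     # int16
    correlation_id: int  # int32: echoed back so the client can match replies


_FIXED_FORMAT = ">ihhi"  # big-endian, no padding: 4 + 2 + 2 + 4 = 12 bytes


def parse_request_header(data: bytes) -> RequestHeader:
    """Parse the fixed-size leading fields of a Kafka request."""
    if len(data) < struct.calcsize(_FIXED_FORMAT):
        raise ValueError("need at least 12 bytes for the fixed request header")
    return RequestHeader(*struct.unpack_from(_FIXED_FORMAT, data, 0))


def frame_response(correlation_id: int, body: bytes = b"") -> bytes:
    """Prefix a response body with its size and the echoed correlation ID."""
    return struct.pack(">ii", 4 + len(body), correlation_id) + body


# A hand-built ApiVersions request with no bytes after the fixed fields.
raw_request = struct.pack(_FIXED_FORMAT, 8, 18, 4, 7)
header = parse_request_header(raw_request)
reply = frame_response(header.correlation_id)
```

A server handling concurrent clients would read the 4-byte size first, then read exactly that many bytes from the socket before parsing the rest.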

Java, Distributed Systems, TCP Networking, Kafka Protocol