Supercharge Your Model Training
Updated Nov 12, 2025 (Python)
A control plane for concurrent LLM RL on shared GPUs
LICITRA v1 evidence — superseded by licitra-mmr-evidence
LICITRA v1 — superseded by licitra-mmr-core
End-to-end personalized feed ranking system demonstrating retrieval → ranking pipelines, offline evaluation, realistic simulation, and business-aligned diagnostics inspired by large-scale social platforms.
Cloud-native machine learning platform for real-time macroeconomic inflation nowcasting.
Production-style real-time ML feature store with low-latency inference
Profile-first ML systems project optimizing a multi-camera end-to-end driving model for hardware efficiency using PyTorch, CUDA streams, NVTX instrumentation, and Nsight Systems.
Deterministic decision gate for AI/ML systems. Risk-Gate enforces strict, schema-driven admissibility boundaries between AI/LLM intent and real system actions. It provides a fixed, human-owned decision structure with deterministic allow/block outcomes, explicit audit logging, and environment-specific policy via configuration: no ML, no heuristics, ...
Benchmarking and optimizing transformer inference across PyTorch, ONNX Runtime, and TensorRT, with latency and throughput analysis on GPU and CPU.
End-to-end fraud anomaly detection system using FastAPI, Isolation Forest, Streamlit, Docker, and a CI/CD pipeline.
Production-style ML inference system for pneumonia detection from chest X-rays, featuring custom CNN architectures, versioned model serving, preprocessing parity, observability, drift detection, and rollback using FastAPI and Docker.
Autonomous training optimizer for nanoGPT using multi-agent patch search, empirical validation, and rollback-safe execution. TinyShakespeare val_loss improved from ~4.17 to ~1.8454.
Scalable Training Telemetry and Metrics Visualization
Failure-first analysis of retrieval-augmented and agentic systems, focused on isolating and attributing failures across retrieval, planning, execution, memory, and policy layers.
Containerized ML inference service exposing a churn prediction model via FastAPI, with Docker-based deployment and AWS-ready architecture.
Real-time traffic analytics platform demonstrating ML systems design: detection, tracking, event logging, observability, and reporting.