48 designs covering LLM serving, RAG, recommendation engines, MLOps, and GenAI infrastructure
Design an LLM serving system with continuous batching, KV cache management, and multi-GPU model parallelism.
Design a retrieval-augmented generation pipeline with document ingestion, vector search, and answer grounding.
Design an AI code assistant with repository-aware context, streaming completions, and IDE integration.
Design an enterprise chatbot with conversation memory, tool-calling, safety guardrails, and human handoff.
Design a prompt management platform with versioning, A/B testing, evaluation pipelines, and cost tracking.
Design an autonomous agent system with planning, tool execution, memory, and safety boundaries.
Design a text-to-image generation service with queued GPU inference, content safety, and style fine-tuning.
Design an AI search engine that crawls the web, retrieves sources, and generates cited answers in real time.
Design a fine-tuning platform with dataset curation, LoRA/QLoRA training, and automated evaluation benchmarks.
Design an AI API gateway with multi-provider routing, semantic caching, cost tracking, and automatic fallbacks.
Design a feature store with online/offline serving, point-in-time correctness, and streaming feature computation.
Design an ML training pipeline with distributed training, experiment tracking, and reproducible model versioning.
Design a model registry with versioning, A/B deployment, canary rollouts, and automatic rollback on drift.
Design a data labeling platform with workforce routing, quality consensus, and active learning prioritization.
Design an experiment tracking system with metric logging, artifact versioning, and run comparison dashboards.
Design a model monitoring system detecting data drift, prediction drift, and feature attribution shifts in production.
Design a vector database with HNSW indexing, metadata filtering, sharding, and hybrid keyword+semantic search.
Design a GPU cluster manager with fair-share scheduling, preemption, NVLink-aware placement, and shared storage.
Design a product recommendation engine using collaborative filtering, embeddings, and real-time re-ranking.
Design a short-video recommendation system with deep-learning ranking, exploration, and engagement optimization.
Design a music recommendation system combining audio features, listening history, and mood-based playlist generation.
Design a job-candidate matching system with skill extraction, two-sided optimization, and fairness constraints.
Design an ad targeting system with user profiling, lookalike audiences, real-time scoring, and privacy compliance.
Design a search ranking system with learning-to-rank models, query intent classification, and relevance feedback.
Design a news feed ranking system balancing engagement prediction, content quality, diversity, and recency signals.
Design a fraud detection system with real-time transaction scoring, graph-based analysis, and adaptive rule engines.
Design a visual search system where users upload images to find similar products using deep embeddings.
Design the perception pipeline for autonomous vehicles with multi-sensor fusion and real-time object detection.
Design a content moderation system with ML classifiers, human review queues, appeal workflows, and policy enforcement.
Design a face recognition system with detection, embedding generation, 1:N matching, and liveness verification.
Design a medical imaging AI platform with DICOM integration, segmentation models, and regulatory compliance.
Design a video analytics platform with real-time object tracking, event detection, and edge-cloud hybrid processing.
Design a machine translation service supporting 100+ languages with streaming output and domain-specific terminology.
Design a speech recognition pipeline with real-time streaming, speaker diarization, and noise-robust decoding.
Design a text-to-speech service with natural prosody, voice cloning, multi-language support, and streaming output.
Design a sentiment analysis service for social media with aspect-based analysis and multi-language support.
Design a document understanding pipeline with OCR, layout analysis, entity extraction, and structured output.
Design a knowledge graph system with automated entity extraction, relation linking, and graph-based reasoning.
Design a conversational AI platform with intent detection, slot filling, multi-turn context, and fallback handling.
Design an AI code review system that analyzes diffs, detects bugs, suggests improvements, and learns from feedback.
Design an anomaly detection system for metrics with seasonal decomposition, adaptive thresholds, and alert suppression.
Design a demand forecasting system using ensemble models with seasonal adjustments and inventory integration.
Design an A/B testing platform with statistical rigor, multi-armed bandits, and automated guardrail metrics.
Design a data quality platform with automated validation rules, freshness monitoring, and lineage tracking.
Design a real-time personalization engine with session-aware contextual bandits and sub-50ms feature serving.
Design an ETL orchestrator with ML-aware scheduling, automatic backfills, SLA monitoring, and data lineage.
Design a real-time ML scoring service with feature store integration, model ensembling, and graceful fallbacks.
Design a data marketplace with governed data sharing, preview sampling, access control, and usage-based billing.