Harshit Kumar

Large language models: how they work, how to use them effectively, and how to build applications on top of them, including RAG, prompt engineering, and fine-tuning.

8 posts

LLM
Frontier AI Models Evaluation Benchmarks

A guide to frontier AI model benchmarks in 2026, covering MMLU, GPQA Diamond, HLE, SWE-bench, ARC-AGI-2, MMMU, Arena Elo, etc. What each benchmark measures, which models lead, why...

Jun 26, 2026 · 27 min read
Agentic AI
Introduction to Model Context Protocol (MCP)

MCP is an open-source protocol that standardizes how LLMs connect to external tools and data sources, replacing fragile custom integrations with a common interface.

Feb 20, 2026 · 15 min read
LLM
Evaluation Metrics for Large Language Models

Walkthrough of evaluation metrics for large language models: perplexity, cross-entropy, BLEU, ROUGE, METEOR, CIDEr, BERTScore, RAG metrics, safety metrics, and LLM-as-a-judge, with equations and visualizations.

Dec 12, 2025 · 17 min read
LLM
Prompt Engineering Techniques: How to Write Effective Prompts

A deep dive into prompt engineering techniques, from few-shot prompting and chain-of-thought to ReAct and prompt injection, with examples.

Oct 10, 2025 · 14 min read
LLM
Distributed Training: How to train Large Language Models (LLM)

Comprehensive guide to distributed training for LLMs, covering data parallelism, model parallelism, tensor parallelism, the ZeRO optimizer, FSDP, 3D parallelism, and DeepSpeed, with interactive visualization and code examples.

Mar 21, 2025 · 29 min read
LLM
Vision Language Models (VLM)

Overview of Vision Language Models (VLMs) and their training paradigms: contrastive learning (CLIP), masking (FLAVA), generative approaches (CoCa, Chameleon), and pretrained backbone methods (Frozen, LLaVA, BLIP-2).

Jul 12, 2024 · 22 min read
CUDA
Matrix Multiplication in CUDA

Implementing matrix multiplication in CUDA for deep learning workloads, from a naive CPU baseline to GPU-accelerated versions using tiled shared memory.

Jun 07, 2024 · 21 min read
LLM
Retrieval Augmented Generation (RAG) Chatbot for 10Q Financial Reports

Building a RAG-based chatbot for 10Q financial reports to reduce LLM hallucinations by grounding answers in retrieved document context.

Apr 26, 2024 · 12 min read