Harshit Kumar

Personal notes on machine learning, deep learning, and software engineering.

Computer Vision
Color and Color Spaces in Computer Vision

Understanding color models (RGB, HSV, LAB, Luv) and color spaces in computer vision, from additive mixing and chromaticity to perceptually uniform CIE spaces and Delta E color difference....

Jan 17, 2020 16 min read
Deep Learning
Introduction to Panoptic Segmentation: A Tutorial

Panoptic segmentation unifies semantic and instance segmentation, assigning class labels and unique IDs to every pixel in an image.

Oct 18, 2019 6 min read
Deep Learning
Evaluation metrics for object detection and segmentation: mAP

How IoU, precision-recall curves, and mean Average Precision (mAP) are used to evaluate object detection and segmentation models.

Sep 20, 2019 5 min read
Deep Learning
Quick intro to Instance segmentation: Mask R-CNN

Instance segmentation with Mask R-CNN: combining object detection and semantic segmentation to identify and segment each object instance separately.

Aug 23, 2019 12 min read
Deep Learning
Quick intro to semantic segmentation: FCN, U-Net and DeepLab

An introduction to semantic segmentation: pixel-level classification using Fully Convolutional Networks, U-Net, and DeepLab architectures.

Aug 9, 2019 8 min read
Deep Learning
Converting FC layers to CONV layers

How and why to replace fully connected layers with equivalent convolutional layers, enabling CNNs to accept arbitrary input sizes.

Aug 2, 2019 1 min read
Personal
Two Years of Technical Fridays

Marking two years of Technical Fridays, with over 10,000 global readers and a focus on computer vision going forward.

Jul 19, 2019 1 min read
Speech Recognition
Introduction to Automatic Speech Recognition

The fundamentals of Automatic Speech Recognition (ASR): acoustic models, Hidden Markov Models, and how Bayes' rule drives decoding.

Apr 19, 2019 3 min read
Deep Learning
Data augmentation

How data augmentation techniques such as flipping, rotation, and color jittering artificially expand training data to build more generalizable deep learning models.

Apr 12, 2019 1 min read
Deep Learning
Generative Adversarial Networks variants: DCGAN, Pix2pix, CycleGAN

An overview of GAN variants: DCGAN for image generation, Pix2pix for paired image translation, and CycleGAN for unpaired style transfer.

Apr 5, 2019 4 min read
