Category
deep learning
13 articles across 3 sub-topics

Types of LLM Quantization: By Timing, Scope, and Mapping
TLDR: There is no single "best" LLM quantization. You classify and choose quantization along three axes: when you quantize (timing), what you quantize (scope), and how values are encoded (mapping). In practice, most teams start with weight quantizati...
Why Embeddings Matter: Solving Key Issues in Data Representation
TLDR: Embeddings convert words (and images, users, products) into dense numerical vectors in a geometric space where semantic similarity = geometric proximity. "King - Man + Woman ≈ Queen" is not magic — it is the arithmetic property of well-trained ...
What are Logits in Machine Learning and Why They Matter
TLDR: Logits are the raw, unnormalized scores produced by the final layer of a neural network — before any probability transformation. Softmax converts them to probabilities. Temperature scales them before Softmax to control output randomness. 📖 T...
Unlocking the Power of ML, DL, and LLM Through Real-World Use Cases
TLDR: ML, Deep Learning, and LLMs are not competing technologies — they are a nested hierarchy. LLMs are a type of Deep Learning. Deep Learning is a subset of ML. Choosing the right layer depends on your data type, problem complexity, and available t...

LoRA Explained: How to Fine-Tune LLMs on a Budget
TLDR: Fine-tuning a 7B-parameter LLM updates billions of weights and requires expensive GPUs. LoRA (Low-Rank Adaptation) freezes the original weights and trains only tiny adapter matrices that are added on top. 90%+ memory reduction; zero inference l...

Diffusion Models: How AI Creates Art from Noise
TLDR: Diffusion models work by first learning to add noise to an image, then learning to undo that noise. At inference time you start from pure static and iteratively denoise into a meaningful image. They power DALL-E, Midjourney, and Stable Diffusio...
A Guide to Pre-training Large Language Models
TLDR: Pre-training is the phase where an LLM learns "Language" and "World Knowledge" by reading petabytes of text. It uses Self-Supervised Learning to predict the next word in a sentence. This creates the "Base Model" which is later fine-tuned. 📖 ...

LLM Model Quantization: Why, When, and How to Deploy Smaller, Faster Models
TLDR: Quantization converts high-precision model weights and activations (FP16/FP32) into lower-precision formats (INT8 or INT4) so LLMs run with less memory, lower latency, and lower cost. The key is choosing the right quantization method for your a...

Variational Autoencoders (VAE): The Art of Compression and Creation
TLDR: A VAE learns to compress data into a smooth probabilistic latent space, then generate new samples by decoding random points from that space. The reparameterization trick is what makes it trainable end-to-end. Reconstruction + KL divergence loss...

Deep Learning Architectures: CNNs, RNNs, and Transformers
TLDR: CNNs, RNNs, and Transformers solve different kinds of pattern problems. CNNs are great for spatial data like images, RNNs handle ordered sequences, and Transformers shine when long-range context matters. Choosing the right architecture often ma...

Neural Networks Explained: From Neurons to Deep Learning
TLDR: A neural network is a stack of simple "neurons" that turn raw inputs into predictions by learning the right weights and biases. Training means repeatedly nudging those numbers via back-propagation until the error shrinks. Master the basics and ...
How Transformer Architecture Works: A Deep Dive
TLDR: The Transformer is the architecture behind every major LLM (GPT, BERT, Claude, Gemini). Its core innovation is Self-Attention — a mechanism that lets the model weigh relationships between all tokens in a sequence simultaneously, regardless of d...
