The Complete AI Glossary: 100+ Terms Every Developer Needs to Know (2026 Edition)


Your definitive reference for navigating the AI landscape—from foundational concepts to cutting-edge techniques

Introduction: Why This Glossary Matters

The field of artificial intelligence moves fast. Terms that were niche jargon in 2023 are now everyday vocabulary for developers. Keeping up isn’t just about staying current—it’s about understanding the tools and technologies that are reshaping how we build software.

This glossary serves as both a reference and a map. Each term connects to broader concepts, and many link to our in-depth guides where you can explore further. Whether you’re just starting with AI or looking to fill gaps in your knowledge, this is your starting point.


Core Concepts (A-Z)

A

AGI (Artificial General Intelligence)

AI systems with human-like general intelligence—capable of learning, reasoning, and problem-solving across any domain. Unlike narrow AI (which excels at specific tasks), AGI would transfer knowledge between domains. Current consensus: we’re not there yet, but the timeline is hotly debated.

AI Agent

An autonomous system that perceives its environment, makes decisions, and takes actions to achieve goals. Unlike simple chatbots, agents can use tools, maintain state across interactions, and execute multi-step workflows. Learn about building AI agents.

Alignment

The challenge of ensuring AI systems pursue intended goals without harmful side effects. A central concern as models become more capable—poorly aligned AI could optimize for the wrong objectives.

Attention Mechanism

The breakthrough technique (introduced in “Attention Is All You Need”) that allows models to focus on relevant parts of input when generating output. The foundation of modern LLMs.

B

Backpropagation

The algorithm that trains neural networks by calculating how much each weight contributed to the error, then adjusting weights to reduce that error. The engine behind deep learning.

Benchmark

Standardized tests that measure AI performance on specific tasks (MMLU for knowledge, HumanEval for coding, etc.). Essential for comparing models, though they don’t capture real-world utility.

Bias

Systematic errors in AI systems that produce unfair outcomes. Can emerge from training data, model architecture, or deployment context. A persistent challenge requiring ongoing attention.

Black Box

A system whose internal workings are opaque—we can observe inputs and outputs but not understand the reasoning. Most modern LLMs are black boxes, raising concerns for high-stakes applications.

C

Chain-of-Thought (CoT)

A prompting technique where the model explains its reasoning step-by-step before giving a final answer. Dramatically improves performance on complex reasoning tasks. Master CoT prompting.
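In practice, CoT is often just a prompt wrapper. A minimal sketch (the exact wording is illustrative; real prompts are tuned per model):

```python
def cot_prompt(question: str) -> str:
    """Wrap a question in a minimal chain-of-thought instruction.

    The phrasing here is a common pattern, not a fixed standard.
    """
    return (
        f"Question: {question}\n"
        "Let's think step by step, then state the final answer "
        "on a line starting with 'Answer:'."
    )

print(cot_prompt("If a train travels 60 km in 45 minutes, what is its speed in km/h?"))
```

The model fills in the intermediate reasoning before committing to an answer, which is where the accuracy gain comes from.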

Checkpoint

A saved state of a model during training. Checkpoints allow resuming training, fine-tuning from intermediate states, or comparing model versions.

Context Window

The amount of text (measured in tokens) a model can process at once. Larger windows enable processing longer documents, entire codebases, or extended conversations. Current leaders: 128K-2M tokens.

Copilot (the concept)

AI assistance integrated directly into workflows—originally from aviation, now applied to coding, writing, and creative work. The goal: augment human capabilities without replacing judgment.

D

Deep Learning

Machine learning using neural networks with multiple hidden layers. The “deep” refers to depth of layers, not profound understanding. Powers computer vision, NLP, and generative AI.

Diffusion Model

Generative AI that learns to reverse a gradual noise-adding process. The technique behind DALL-E, Midjourney, and Stable Diffusion for image generation.

Digital Twin

A virtual model of a physical system (factory, supply chain, engine) used for simulation and optimization. Bezos’s $100B fund targets this technology.

Distributed Training

Training large models across multiple GPUs/TPUs, often across data centers. Essential for models with hundreds of billions of parameters.

E

Embedding

A numerical vector representation of data (text, images, audio) that captures semantic meaning. Similar concepts have similar vectors—enabling semantic search and clustering. Learn about embeddings in our RAG guide.
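"Similar concepts have similar vectors" is usually measured with cosine similarity. A toy sketch with made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction, 0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Illustrative values only -- a real embedding model produces these vectors.
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.15, 0.05]
car = [0.0, 0.2, 0.95]

print(cosine_similarity(cat, kitten))  # high: related concepts
print(cosine_similarity(cat, car))     # low: unrelated concepts
```

Semantic search is, at its core, computing this score between a query embedding and every stored embedding, then returning the top matches.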

Epoch

One complete pass through the training dataset. Multiple epochs are typically needed for convergence—too few underfit, too many overfit.

Expert System

Early AI approach encoding human expertise as rules. Largely superseded by machine learning, but hybrid approaches are seeing renewed interest.

Explainable AI (XAI)

Techniques making AI decisions interpretable by humans. Critical for high-stakes applications (medicine, finance, law) where “the model said so” isn’t sufficient.

F

Few-Shot Learning

Learning from very few examples—humans excel at this; traditional ML struggled. Modern LLMs can perform tasks with just 2-3 examples in the prompt.

Fine-Tuning

Adapting a pre-trained model to specific tasks or domains by training on smaller, targeted datasets. More efficient than training from scratch. Learn fine-tuning techniques.

Foundation Model

Large models trained on broad data that can be adapted to many downstream tasks. GPT-4, Claude, and Llama are foundation models—the base layer of the AI stack.

FP8/FP16/BF16

Reduced-precision number formats for model weights. FP16 (16-bit) halves memory vs FP32; FP8 (8-bit) enables larger models on consumer hardware. Trade-off: slight accuracy loss.
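The memory math is simple: parameters times bytes per parameter. A back-of-envelope sketch (weights only; KV cache, activations, and framework overhead add more):

```python
def weight_memory_gb(num_params: float, bits: int) -> float:
    """Approximate memory for model weights alone at a given precision."""
    return num_params * (bits / 8) / 1e9

for bits in (32, 16, 8):
    print(f"7B model @ {bits}-bit: {weight_memory_gb(7e9, bits):.1f} GB")
# FP32: 28.0 GB, FP16: 14.0 GB, FP8: 7.0 GB
```

This is why a 7B model that won't fit on a 16 GB GPU at full precision runs comfortably at 16-bit or below.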

G

Generative AI

AI systems that create new content—text, images, audio, code, video. Distinguished from discriminative AI (which classifies or predicts). The dominant AI paradigm of 2023-2026.

GGUF

The file format used by llama.cpp for quantized models. Enables running large models on consumer hardware through aggressive quantization. Deep dive on GGUF and other formats.

GPU Cluster

Multiple GPUs connected for distributed training or inference. The infrastructure behind modern AI—NVIDIA’s H100s are the current gold standard.

Gradient Descent

The optimization algorithm that minimizes loss by adjusting model weights in the direction of steepest descent. Variants (Adam, SGD, etc.) differ in update rules.
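The core loop is a few lines. A minimal 1-D sketch (real training uses the same idea over billions of weights, with gradients computed by backpropagation):

```python
def gradient_descent(grad, x0: float, lr: float = 0.1, steps: int = 100) -> float:
    """Minimize a 1-D function by repeatedly stepping against its gradient."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)  # move opposite the slope, scaled by the learning rate
    return x

# Minimize f(x) = (x - 3)^2; its gradient is 2(x - 3), so the minimum is at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 4))  # → 3.0
```

Variants like Adam keep per-parameter running statistics to adapt the step size, but the update direction is the same.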

H

Hallucination

When AI generates plausible-sounding but false information. A persistent problem with LLMs—mitigated through RAG, fine-tuning, and careful prompting, but not eliminated.

Hidden Layer

Neural network layers between input and output that extract increasingly abstract features. “Deep” learning refers to having many hidden layers.

Human-in-the-Loop

Systems where humans oversee, guide, or correct AI decisions. Essential for high-stakes applications and continuous improvement.

Hybrid AI

Combining multiple AI approaches (symbolic + neural, cloud + edge) to leverage strengths of each. Increasingly common in production systems.

I

Inference

Using a trained model to make predictions on new data. Distinct from training—much less computationally intensive, enabling real-time applications.

Instruction Tuning

Fine-tuning models to follow natural language instructions. Transforms base models (which just predict text) into helpful assistants (which follow directions).

IoT AI

AI running on Internet of Things devices—sensors, cameras, appliances. Requires tiny models (often <1MB) running on minimal hardware.

Iterative Deployment

Releasing AI systems gradually—first to small user groups, monitoring for issues, then expanding. Essential for managing risk with powerful systems.

J

Jailbreak

Prompts designed to bypass safety constraints and make models produce harmful content. An ongoing arms race between safety researchers and adversarial users.

JSON Mode

Constraining model output to valid JSON—critical for structured data extraction and API integration. Most major models now support this.

Jupyter

The interactive notebook environment dominating data science and AI experimentation. Enables mixing code, visualization, and narrative in shareable documents.

Just-in-Time Compilation

Compiling code at runtime for optimization. Used by AI frameworks (PyTorch, JAX) to accelerate training and inference.

K

Knowledge Distillation

Training a smaller “student” model to replicate a larger “teacher” model’s behavior. Enables running powerful AI on resource-constrained devices.

KV Cache

Key-Value cache storing attention computations for previous tokens—essential for efficient autoregressive generation. Larger caches enable longer context but consume more memory.

Kubernetes (for AI)

Container orchestration platform managing distributed AI workloads. The infrastructure layer for production ML systems.

L

LLM (Large Language Model)

Neural networks trained on vast text corpora to predict and generate language. GPT-4, Claude, Llama—the engines behind the current AI revolution. Self-hosting guide.

LM Studio

Popular GUI for running local LLMs. Abstracts away complexity of model management, quantization, and inference. Great starting point for local AI.

Local AI

Running AI models on your own hardware rather than cloud APIs. Privacy-preserving, cost-effective at scale, but requires technical setup. Complete local AI guide.

LoRA (Low-Rank Adaptation)

Efficient fine-tuning technique that trains small adapter matrices rather than full model weights. Enables fine-tuning large models on consumer GPUs.

M

MCP (Model Context Protocol)

Anthropic’s open standard for AI tool integration. Think “USB-C for AI”—common protocol enabling any AI to use any tool. MCP explained.

MLOps

Machine Learning Operations—the practices, tools, and culture for deploying and maintaining ML systems in production. DevOps adapted for ML’s unique challenges.

MoE (Mixture of Experts)

Architecture where only a subset of model parameters activate per input. Enables massive models (trillions of parameters) with manageable inference costs.

Multimodal

AI systems processing multiple data types—text, images, audio, video. GPT-4V, Claude 3, Gemini are multimodal. Vision LLM guide.

N

Neural Network

Computing systems inspired by biological brains—layers of interconnected nodes (neurons) that learn patterns from data. The foundation of modern AI.

NLP (Natural Language Processing)

AI subfield focused on understanding and generating human language. Revolutionized by transformer architectures and large-scale pre-training.

Node (in neural nets)

A single computational unit in a neural network. Takes weighted inputs, applies an activation function, produces an output. Large networks contain millions of nodes connected by billions of weights.

Normalization

Techniques stabilizing neural network training—Batch Norm, Layer Norm, etc. Essential for training deep networks effectively.

O

Ollama

Popular tool for running local LLMs with simple CLI commands. “ollama run llama3” and you’re chatting with a local model. Used in our RAG tutorial.

One-Shot Learning

Learning from a single example. Humans excel; traditional ML struggled. Modern LLMs can often perform tasks with just one demonstration.

Open Source AI

AI models with publicly available weights and architectures. Llama, Mistral, Qwen enable self-hosting and customization. Compare open vs closed models.

Optimization

The process of adjusting model parameters to minimize loss. Gradient descent is the workhorse; variants (Adam, AdaGrad, etc.) adapt learning rates.

P

Parameter

A configurable value in a model learned during training. Modern LLMs have billions to trillions. More parameters generally mean more capacity—but also more compute requirements.

Perplexity

A measure of how well a model predicts text—lower is better. Like “surprise”: how unexpected is this text to the model? Used to evaluate and compare models.
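Concretely, perplexity is the exponential of the average negative log-probability per token. A minimal sketch, assuming you already have the log-probabilities the model assigned to each token:

```python
import math

def perplexity(token_logprobs: list[float]) -> float:
    """Perplexity = exp(average negative log-probability per token).

    token_logprobs: natural-log probabilities for each actual token (values <= 0).
    """
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)

# A model that assigns probability 0.5 to every token has perplexity 2 --
# as if it were choosing between 2 equally likely options at each step.
print(perplexity([math.log(0.5)] * 10))  # → 2.0
```

Intuitively, a perplexity of N means the model is as uncertain as if it were picking uniformly among N tokens.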

Pipeline

A sequence of data processing steps—from raw data to model predictions. ML pipelines include preprocessing, feature engineering, inference, and post-processing.

Post-Training

Techniques applied after initial training—RLHF, fine-tuning, quantization. Where much of the “magic” of helpful assistants happens.

Q

Quantization

Reducing model precision (FP32 → FP16 → INT8 → INT4) to shrink size and speed up inference. Essential for running large models on consumer hardware. Quantization deep dive.
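The idea fits in a few lines. A toy sketch of symmetric int8 quantization (production schemes like those in GGUF use per-block scales and smarter rounding, but the principle is the same):

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map floats onto the integer range [-127, 127] with a single scale factor."""
    scale = max(abs(w) for w in weights) / 127
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate floats; rounding error is at most half a step."""
    return [x * scale for x in q]

w = [0.12, -0.5, 0.03, 0.25]
q, s = quantize_int8(w)
print(q)                  # small integers: 1 byte each instead of 4
print(dequantize(q, s))   # close to the originals, with slight rounding error
```

The stored integers take a quarter of the space of FP32, and dequantization is a single multiply at inference time.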

QLoRA

Quantized LoRA—combining quantization with efficient fine-tuning. Enables fine-tuning 70B parameter models on single consumer GPUs.

Query (in attention)

The component in attention mechanisms representing “what we’re looking for.” Paired with Keys (“what’s available”) and Values (“the actual content”).

R

RAG (Retrieval-Augmented Generation)

Enhancing LLMs with external knowledge retrieval. The model searches a knowledge base, then generates informed responses. Complete RAG guide.

Reasoning Model

Models optimized for step-by-step problem-solving—OpenAI’s o1, DeepSeek-R1. Trade speed for accuracy on complex tasks.

Reinforcement Learning

Training through trial and error, receiving rewards for good outcomes. RLHF (from human feedback) aligns models with human preferences.

RNN (Recurrent Neural Network)

Earlier architecture for sequence processing—maintains hidden state across inputs. Largely superseded by transformers for most tasks.

Robotics

AI controlling physical systems. The frontier of embodied AI—combining perception, reasoning, and physical action.

S

Self-Attention

Attention mechanism where input elements attend to each other. Enables transformers to model relationships across entire sequences simultaneously.
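Stripped of learned projections, self-attention is: dot-product scores between positions, softmax into weights, weighted sum of values. A toy sketch using the inputs directly as queries, keys, and values (real transformers learn separate Q/K/V projection matrices):

```python
import math

def softmax(xs: list[float]) -> list[float]:
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x: list[list[float]]) -> list[list[float]]:
    """Each position's output is a softmax-weighted mix of every position's value,
    with weights from scaled dot products (the 'scaled dot-product attention' formula)."""
    d = len(x[0])
    out = []
    for q in x:  # every position attends to all positions, including itself
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out

seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in self_attention(seq):
    print([round(v, 3) for v in row])
```

Because every position attends to every other in one pass, transformers capture long-range relationships without the step-by-step state of an RNN.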

Semantic Search

Search based on meaning rather than keyword matching. Uses embeddings to find conceptually similar content even with different wording. Vector DB guide.

Sentence Transformer

Models creating embeddings for sentences/paragraphs. Essential for semantic search, clustering, and similarity tasks.

Singularity

Hypothetical point where AI surpasses human intelligence, triggering runaway technological growth. More philosophical concept than technical term—timeline highly uncertain.

T

Token

The basic unit of text for LLMs—could be a word, part of a word, or punctuation. GPT-4’s context window is measured in tokens (~0.75 words/token).
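That ~0.75 words/token rule of thumb gives a quick cost estimate. A rough sketch (English only; actual counts require the model's own tokenizer):

```python
def estimate_tokens(text: str, words_per_token: float = 0.75) -> int:
    """Rough English token estimate; real tokenizers vary by model and language."""
    words = len(text.split())
    return round(words / words_per_token)

print(estimate_tokens("The quick brown fox jumps over the lazy dog"))  # 9 words → ~12 tokens
```

Code, non-English text, and unusual strings tokenize less efficiently, so treat this strictly as a ballpark figure.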

Tokenizer

The component converting text to tokens (and back). Different models use different tokenizers—affects context window efficiency and multilingual performance.

Training Data

The examples used to teach models. Quality and diversity matter enormously—garbage in, garbage out. Increasingly a competitive moat.

Transformer

The architecture (introduced 2017) that revolutionized NLP and enabled modern LLMs. Based entirely on attention mechanisms—no recurrence or convolution.

U

Uncertainty Quantification

Techniques for measuring model confidence. Critical for knowing when to trust predictions—especially important in high-stakes applications.

Unsupervised Learning

Learning patterns from unlabeled data. Most LLM pre-training is unsupervised—predicting next token without explicit labels.

User Interface (AI)

How humans interact with AI systems—chat, voice, API, embedded in applications. Rapidly evolving as capabilities expand.

V

vLLM

High-performance inference engine for LLMs. Uses PagedAttention for efficient memory management—enables much higher throughput than naive implementations.

Vector Database

Database optimized for storing and searching embeddings. Powers semantic search, RAG, and recommendation systems. Vector DB comparison.

Vision LLM

Multimodal models processing both images and text—LLaVA, GPT-4V, Claude 3. Enable visual question answering, OCR, and image understanding. Vision LLM guide.

Voice Clone

AI generating speech matching a specific person’s voice. Powerful for accessibility and entertainment; concerning for deepfakes and fraud.

W

Weights

The learned parameters of a neural network. “Model weights” = the actual file containing the trained network. Open-source models release weights; closed models don’t.

Whisper

OpenAI’s open-source speech recognition model. Highly accurate, multilingual, and free—dominates the transcription space.

Word Embedding

Vector representations of words capturing semantic meaning. Word2Vec, GloVe were early examples; modern models use contextual embeddings.

Workflow Automation

Using AI to automate multi-step business processes. Beyond single tasks to end-to-end automation—invoice processing, customer support, content pipelines.

X

XAI (Explainable AI)

See Explainable AI above. Critical for regulatory compliance and building trust in AI systems.

xFormers

Meta’s library of efficient transformer components. Optimized attention implementations, memory-efficient training techniques.

XGBoost

Gradient boosting framework—while not deep learning, remains dominant for tabular data. Often beats neural nets on structured datasets.

Y

YAML (for AI configs)

Human-readable data format widely used for ML configuration files—training configs, model cards, deployment specs.

YOLO (You Only Look Once)

Real-time object detection architecture. Processes images in single forward pass—fast enough for video analysis.

You.com

AI-powered search engine emphasizing privacy and summarization. One of many attempts to reinvent search for the AI age.

Z

Zero-Shot Learning

Performing tasks without any examples in the prompt. Modern LLMs excel at this—generalization from broad pre-training.

Zipf’s Law

Statistical pattern where word frequency is inversely proportional to rank. Explains why language models need massive vocabularies.
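You can see the rank-frequency pattern in any word-count table. A toy sketch (a dozen words is far too little data for the law to hold precisely; it only illustrates the rank-frequency computation):

```python
from collections import Counter

def rank_frequency(words: list[str]) -> list[tuple[int, str, int]]:
    """Return (rank, word, count) sorted by descending frequency.

    Under Zipf's law, count * rank stays roughly constant in a large corpus.
    """
    counts = Counter(words).most_common()
    return [(rank, word, count) for rank, (word, count) in enumerate(counts, start=1)]

text = "the cat sat on the mat and the dog sat on the log".split()
for rank, word, count in rank_frequency(text):
    print(rank, word, count)  # "the" ranks first with 4 occurrences
```

The long tail this distribution implies is why tokenizers fall back to sub-word pieces: no vocabulary of whole words can cover the rare end of the curve.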

Z-Score Normalization

Standardizing data by subtracting mean and dividing by standard deviation. Common preprocessing step for numerical features.
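The formula in two lines, using the standard library:

```python
import statistics

def z_scores(values: list[float]) -> list[float]:
    """Standardize: subtract the mean, divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # mean 5, std 2
print([round(z, 2) for z in z_scores(data)])  # → [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

After standardization the data has mean 0 and standard deviation 1, which keeps features on different scales from dominating gradient-based training.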


Acronyms Quick Reference

AGI: Artificial General Intelligence
AI: Artificial Intelligence
API: Application Programming Interface
BF16: BFloat16 (16-bit floating point)
CoT: Chain-of-Thought
FP8/FP16/FP32: Floating Point (8/16/32-bit)
GGUF: GPT-Generated Unified Format
GPU: Graphics Processing Unit
LLM: Large Language Model
LoRA: Low-Rank Adaptation
MCP: Model Context Protocol
ML: Machine Learning
MoE: Mixture of Experts
NLP: Natural Language Processing
OCR: Optical Character Recognition
RAG: Retrieval-Augmented Generation
RLHF: Reinforcement Learning from Human Feedback
TPU: Tensor Processing Unit
XAI: Explainable AI

Related Reading

This glossary connects to our broader AI infrastructure series:

Self-Hosting LLMs — Run models locally for privacy and control
Small LLMs for Edge — Deploy on minimal hardware
Prompt Engineering — Get the most from any model
MCP Explained — The USB-C for AI tools
Vector Databases — Power semantic search and RAG
Vision LLMs — Multimodal AI for image understanding
Bezos’s Manufacturing AI — Industrial transformation
Quantization Deep Dive — Optimize models for any hardware
AI Coding Tools — Compare Cursor, Copilot, Windsurf, and more


Sources and Further Reading

1. Vaswani et al. (2017). “Attention Is All You Need.” NeurIPS.
2. Brown et al. (2020). “Language Models are Few-Shot Learners.” NeurIPS.
3. OpenAI. (2023). “GPT-4 Technical Report.” arXiv.
4. Anthropic. (2024). “Constitutional AI: Harmlessness from AI Feedback.”
5. Google DeepMind. (2024). “Gemini: A Family of Highly Capable Multimodal Models.”
6. Meta AI. (2023). “Llama 2: Open Foundation and Fine-Tuned Chat Models.”
7. Mistral AI. (2023). “Mistral 7B.”
8. Hu et al. (2021). “LoRA: Low-Rank Adaptation of Large Language Models.”
9. Dettmers et al. (2023). “QLoRA: Efficient Finetuning of Quantized LLMs.”
10. Anthropic. (2024). “Model Context Protocol Specification.”
11. Ouyang et al. (2022). “Training language models to follow instructions with human feedback.”
12. Radford et al. (2019). “Language Models are Unsupervised Multitask Learners.”
13. Kaplan et al. (2020). “Scaling Laws for Neural Language Models.”
14. Hoffmann et al. (2022). “Training Compute-Optimal Large Language Models.”
15. Wei et al. (2022). “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.”


Last updated: March 2026. The AI field evolves rapidly—check back for updates.