AI Glossary

Your comprehensive guide to artificial intelligence terminology. Master the language of AI with clear definitions and practical examples.

100+ AI terms · 15 categories · Updated 2025

A

Core Concept

Artificial Intelligence (AI)

The simulation of human intelligence processes by machines, especially computer systems. These processes include learning, reasoning, problem-solving, perception, and language understanding.

Example: ChatGPT uses AI to understand and generate human-like text responses.
Advanced

AGI (Artificial General Intelligence)

A theoretical form of AI that possesses the ability to understand, learn, and apply intelligence across a wide range of tasks at a human level or beyond. Unlike narrow AI, AGI can transfer knowledge between different domains.

Example: While current AI excels at specific tasks, true AGI would match human versatility across all cognitive tasks.
Model Architecture

Attention Mechanism

A technique in neural networks that allows models to focus on specific parts of input data when making predictions. It's fundamental to transformer architectures and modern language models.

Example: When translating "The cat sat on the mat", attention helps the model focus on relevant words for each translation step.
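A minimal NumPy sketch of scaled dot-product attention, the computation at the core of this mechanism. The matrices are random toy values, not real model weights:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V"""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # how strongly each query matches each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V                              # weighted average of the values

rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(3, 4))  # 3 token positions, 4-dimensional vectors
print(scaled_dot_product_attention(Q, K, V))
```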
Training

Adversarial Training

A training method where models are exposed to adversarial examples (intentionally perturbed inputs) to improve robustness and generalization.

Example: Training an image classifier with slightly modified images to make it more resistant to attacks.

B

Training

Backpropagation

An algorithm for training neural networks by calculating gradients of the loss function and propagating them backward through the network to update weights.

Example: After making a prediction, the network adjusts its weights based on how wrong it was.
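A hedged sketch using PyTorch's autograd, which performs backpropagation for you: the gradient of the loss flows backward to the weight, which is then nudged. The numbers are arbitrary toy values:

```python
import torch

w = torch.tensor(0.5, requires_grad=True)  # a single trainable weight
x, y_true = torch.tensor(2.0), torch.tensor(3.0)

y_pred = w * x                   # forward pass
loss = (y_pred - y_true) ** 2    # squared error: how wrong the prediction was
loss.backward()                  # backpropagation: computes d(loss)/d(w)

with torch.no_grad():
    w -= 0.1 * w.grad            # step the weight against the gradient
print(w.item())                  # 1.3: moved from 0.5 toward a better value
```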
Training

Batch Size

The number of training examples processed together in one iteration before the model's weights are updated.

Example: A batch size of 32 means the model processes 32 examples before updating.
Data

Bias (Dataset)

Systematic errors in AI systems that result from prejudiced assumptions in training data or model design, leading to unfair outcomes.

Example: A hiring AI trained on historical data might inherit gender biases from past decisions.

C

Application

Chatbot

An AI-powered conversational agent that can interact with users through text or voice, understanding queries and providing relevant responses.

Example: ChatGPT, Claude, and Gemini are advanced chatbots powered by large language models.
Model Architecture

CNN (Convolutional Neural Network)

A deep learning architecture specifically designed for processing grid-like data such as images. Uses convolutional layers to automatically learn spatial hierarchies of features.

Example: Image recognition systems use CNNs to identify objects, faces, and scenes.
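A minimal PyTorch sketch of the idea: one convolutional layer learns spatial filters, pooling shrinks the feature maps, and a linear layer classifies. The 28x28 grayscale input and 10 classes are arbitrary choices for illustration:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1),  # 8 learned spatial filters
    nn.ReLU(),
    nn.MaxPool2d(2),             # downsample 28x28 feature maps to 14x14
    nn.Flatten(),
    nn.Linear(8 * 14 * 14, 10),  # score the 10 candidate classes
)

fake_image = torch.randn(1, 1, 28, 28)  # a batch of one random grayscale image
print(model(fake_image).shape)          # torch.Size([1, 10])
```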
Application

Computer Vision

A field of AI that enables computers to interpret and understand visual information from the world, including images and videos.

Example: Self-driving cars use computer vision to detect pedestrians, road signs, and obstacles.
Model Interaction

Context Window

The maximum amount of text (measured in tokens) that an AI model can process at once. Larger context windows allow models to handle longer conversations and documents.

Example: Claude Sonnet 4.5 has a 200K-token context window, allowing it to process an entire book in one prompt.

D

Core Concept

Deep Learning

A subset of machine learning that uses neural networks with multiple layers (deep neural networks) to learn hierarchical representations of data.

Example: Image recognition systems use deep learning to identify features from pixels to objects.
Model Type

Diffusion Model

A generative model that learns to create data by reversing a gradual noising process. Widely used in image generation.

Example: Stable Diffusion and Midjourney use diffusion models to generate images from text.
Training

Dropout

A regularization technique that randomly drops units from a neural network during training to prevent overfitting.

Example: During training, 20% of neurons might be randomly deactivated each iteration.
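A short PyTorch demonstration: dropout zeroes random units during training and does nothing at inference time. The 20% rate matches the example above:

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.2)  # deactivate 20% of units at random
x = torch.ones(10)

drop.train()      # training mode: dropout is active
print(drop(x))    # roughly 2 of 10 values zeroed; survivors scaled by 1/(1 - 0.2)

drop.eval()       # inference mode: dropout is a no-op
print(drop(x))    # all ones again
```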

E

Representation

Embedding

A representation of data (text, images, etc.) as dense vectors in a continuous vector space, where similar items are positioned closer together.

Example: Words like "cat" and "kitten" have similar embeddings because they're semantically related.
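A small NumPy sketch of why embeddings are useful: cosine similarity scores vectors pointing in similar directions as close. The 3-dimensional vectors here are invented; real embeddings come from a trained model and have hundreds of dimensions:

```python
import numpy as np

def cosine_similarity(a, b):
    """1.0 = same direction, 0.0 = unrelated, -1.0 = opposite."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

cat    = np.array([0.9, 0.8, 0.1])   # hand-made stand-ins for real embeddings
kitten = np.array([0.85, 0.75, 0.2])
car    = np.array([0.1, 0.2, 0.9])

print(cosine_similarity(cat, kitten))  # high: semantically related words
print(cosine_similarity(cat, car))     # low: unrelated concepts
```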
Training

Epoch

One complete pass through the entire training dataset during the training process of a machine learning model.

Example: Training for 10 epochs means the model sees the complete dataset 10 times.
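A skeleton training loop showing how epochs and batch size fit together (the dataset is a stand-in, and the actual forward/backward pass is elided):

```python
dataset = list(range(320))  # pretend we have 320 training examples
batch_size = 32
epochs = 10

for epoch in range(epochs):                       # 10 complete passes over the data
    for i in range(0, len(dataset), batch_size):  # 320 / 32 = 10 weight updates per epoch
        batch = dataset[i:i + batch_size]
        # forward pass, loss, backpropagation, and weight update would go here

print("total updates:", epochs * (len(dataset) // batch_size))  # 100
```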

F

Training

Fine-tuning

The process of taking a pre-trained model and further training it on a specific task or dataset to adapt it to particular needs.

Example: Taking GPT-4 and fine-tuning it on medical literature to create a healthcare chatbot.
Training

Few-Shot Learning

The ability of a model to learn from just a few examples, typically by providing examples in the prompt rather than full retraining.

Example: Showing an AI 3 examples of a new classification task and having it perform well immediately.
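A sketch of few-shot prompting: the examples live inside the prompt itself, so no model weights change. The reviews and labels are invented for illustration:

```python
prompt = """Classify the sentiment of each review as Positive or Negative.

Review: "Absolutely loved it, would buy again."
Sentiment: Positive

Review: "Broke after two days, total waste of money."
Sentiment: Negative

Review: "The battery lasts forever and setup was painless."
Sentiment:"""

# Send this prompt to any chat or completions API;
# from just two examples, the model should answer "Positive".
print(prompt)
```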

G

Model Type

GAN (Generative Adversarial Network)

A neural network architecture with two competing networks: a generator that creates data and a discriminator that evaluates it, improving through competition.

Example: GANs can generate realistic fake faces or artwork.
Model Type

Generative AI

AI systems that can create new content, including text, images, music, code, and video, based on patterns learned from training data.

Example: ChatGPT generates text, Midjourney generates images, and GitHub Copilot generates code.
Model Architecture

GPT (Generative Pre-trained Transformer)

A family of large language models developed by OpenAI that use transformer architecture and are pre-trained on vast amounts of text data.

Example: ChatGPT is based on the GPT architecture.

H

Model Issue

Hallucination

When an AI model generates information that sounds plausible but is factually incorrect or completely fabricated.

Example: A chatbot confidently citing a scientific paper that doesn't exist.
Training

Hyperparameter

Configuration settings for a machine learning model that are set before training begins, such as learning rate, batch size, and number of layers.

Example: Choosing a learning rate of 0.001 before starting model training.

I

Core Concept

Inference

The process of using a trained AI model to make predictions or generate outputs on new, unseen data.

Example: After training an image classifier, using it to identify objects in new photos is inference.
Model Interaction

Instruction Tuning

Fine-tuning language models to better follow user instructions and prompts by training on instruction-response pairs.

Example: Training a model to respond appropriately when given commands like "Summarize this text" or "Translate to French".

L

Core Concept

Large Language Model (LLM)

Neural networks with billions of parameters trained on massive text datasets to understand and generate human-like text.

Example: GPT-4, Claude, and Gemini are large language models.
Training

Learning Rate

A hyperparameter that controls how much model weights are adjusted at each training step. Too high a rate makes training unstable; too low a rate makes it slow.

Example: A learning rate of 0.001 means weights change by small increments each step.
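A tiny demonstration of the trade-off, minimizing f(w) = (w - 3)^2 by gradient descent. The rates and step count are arbitrary toy choices:

```python
def descend(lr, steps=20, w=0.0):
    for _ in range(steps):
        grad = 2 * (w - 3)   # derivative of (w - 3)^2
        w -= lr * grad       # the update rule: step against the gradient
    return w

print(descend(lr=0.1))    # ~2.97: converges close to the minimum at 3
print(descend(lr=0.001))  # ~0.12: barely moved, far too slow
print(descend(lr=1.1))    # ~-112: overshoots and diverges
```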
Training

Loss Function

A mathematical function that measures how far a model's predictions are from the actual values, guiding the training process.

Example: Mean squared error is a common loss function for regression tasks.
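A minimal NumPy example of mean squared error on made-up predictions; lower loss means closer predictions:

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average squared difference; large errors are penalized heavily."""
    return np.mean((y_true - y_pred) ** 2)

y_true = np.array([3.0, 5.0, 2.0])
good   = np.array([2.9, 5.1, 2.2])
bad    = np.array([0.0, 9.0, 7.0])

print(mean_squared_error(y_true, good))  # 0.02: close predictions, small loss
print(mean_squared_error(y_true, bad))   # ~16.7: far-off predictions, large loss
```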

M

Core Concept

Machine Learning (ML)

A subset of AI where systems learn from data to improve performance without being explicitly programmed for specific tasks.

Example: Email spam filters learn to identify spam by analyzing thousands of labeled emails.
Model Type

Multimodal AI

AI systems that can process and understand multiple types of data simultaneously, such as text, images, audio, and video.

Example: GPT-4 Vision can analyze both text and images in the same conversation.

N

Application

Natural Language Processing (NLP)

A branch of AI focused on enabling computers to understand, interpret, and generate human language.

Example: Language translation, sentiment analysis, and chatbots all use NLP.
Core Concept

Neural Network

A computing system inspired by biological neural networks, consisting of interconnected nodes (neurons) organized in layers that process information.

Example: Image recognition systems use neural networks to identify patterns in pixels.

O

Training

Overfitting

When a model learns the training data too well, including noise and outliers, resulting in poor performance on new data.

Example: A model that memorizes training examples but can't generalize to new situations.
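A quick scikit-learn illustration: an unconstrained decision tree memorizes its training set, so the training score is perfect while the held-out score drops. The synthetic dataset is generated purely for demonstration:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)  # no depth limit: free to memorize
print(tree.score(X_tr, y_tr))  # 1.0 on the training data
print(tree.score(X_te, y_te))  # noticeably lower on unseen data
```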

P

Model Configuration

Parameters

The learned weights and biases in a neural network that are adjusted during training. Model size is often measured in parameters.

Example: GPT-4 reportedly has over 1 trillion parameters.
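A quick PyTorch check of how parameters add up for a single layer; the width of 768 is an arbitrary example size:

```python
import torch.nn as nn

layer = nn.Linear(in_features=768, out_features=768)
n_params = sum(p.numel() for p in layer.parameters())
print(n_params)  # 768*768 weights + 768 biases = 590,592
```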
Model Interaction

Prompt Engineering

The practice of crafting effective inputs (prompts) to get desired outputs from AI models, especially language models.

Example: Adding "Think step by step" to a prompt often improves reasoning quality.
Training

Pre-training

The initial phase of training where a model learns general patterns from large amounts of unlabeled data before being fine-tuned for specific tasks.

Example: GPT models are pre-trained on internet text before being fine-tuned for chat.

R

Application

RAG (Retrieval-Augmented Generation)

A technique that enhances language models by retrieving relevant information from external sources before generating responses.

Example: A chatbot that searches a company's documentation before answering questions.
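A toy end-to-end sketch of the RAG pattern. Everything here is a stand-in: the word-count "embedding" replaces a real embedding model, and the final prompt would be sent to an LLM rather than printed:

```python
from collections import Counter

docs = [
    "Refunds are processed within 5 business days.",
    "Support is available Monday through Friday.",
    "All plans include unlimited storage.",
]

def embed(text):
    return Counter(text.lower().split())  # crude stand-in for a real embedding

def similarity(a, b):
    return sum((a & b).values())          # shared-word count as "closeness"

def retrieve(question, k=1):
    q = embed(question)
    return sorted(docs, key=lambda d: similarity(q, embed(d)), reverse=True)[:k]

question = "How long do refunds take?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)  # in a real system, this augmented prompt goes to the language model
```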
Training

Reinforcement Learning

A type of machine learning where agents learn by interacting with an environment and receiving rewards or penalties for actions.

Example: AlphaGo learned to play Go through reinforcement learning by playing millions of games.
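A minimal Q-learning sketch on a five-state corridor with a reward at the end: the agent learns, from trial and error alone, that moving right pays off. The learning rate, discount, and episode count are arbitrary toy choices:

```python
import random

Q = {(s, a): 0.0 for s in range(5) for a in (-1, +1)}  # value of each state-action pair

for _ in range(500):                    # episodes of trial and error
    s = 0
    while s != 4:
        a = random.choice((-1, +1))     # explore randomly
        s2 = min(max(s + a, 0), 4)
        r = 1.0 if s2 == 4 else 0.0     # reward only at the goal state
        best_next = max(Q[(s2, b)] for b in (-1, +1))
        Q[(s, a)] += 0.1 * (r + 0.9 * best_next - Q[(s, a)])  # Q-learning update
        s = s2

# the learned policy: the best action in each non-goal state
print([max((-1, +1), key=lambda a: Q[(s, a)]) for s in range(4)])  # [1, 1, 1, 1]
```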
Training

RLHF (Reinforcement Learning from Human Feedback)

A training method where humans rate model outputs, and these ratings guide the model to align better with human preferences.

Example: ChatGPT was trained using RLHF to be more helpful and harmless.

S

Model Interaction

Self-Attention

A mechanism that allows models to weigh the importance of different parts of the input relative to each other.

Example: Understanding that "it" refers to "book" in "The book was interesting. It was well-written."
Training

Supervised Learning

A type of machine learning where models learn from labeled training data with known input-output pairs.

Example: Training an image classifier with photos labeled as "cat" or "dog".
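A compact scikit-learn example: the model learns from labeled pairs and then predicts labels for new inputs. The rows ([hours_studied, hours_slept], with 1 = passed an exam) are invented:

```python
from sklearn.linear_model import LogisticRegression

X = [[8, 7], [7, 8], [9, 6], [1, 4], [2, 3], [0, 5]]  # known inputs
y = [1, 1, 1, 0, 0, 0]                                # known labels

model = LogisticRegression()
model.fit(X, y)                         # learn from the labeled examples

print(model.predict([[6, 7], [1, 2]]))  # [1 0]: predictions for unseen students
```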

T

Model Configuration

Temperature

A parameter controlling randomness in AI text generation. Higher values increase creativity but reduce consistency.

Example: Temperature 0.1 gives consistent, focused outputs; 0.9 gives creative, varied responses.
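A short NumPy demonstration of how temperature reshapes the model's next-token probabilities; the three logits are toy scores:

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    scaled = np.array(logits) / temperature
    exps = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    return exps / exps.sum()

logits = [2.0, 1.0, 0.5]                      # raw scores for three candidate tokens
print(softmax_with_temperature(logits, 0.1))  # ~[1, 0, 0]: nearly deterministic
print(softmax_with_temperature(logits, 1.0))  # moderate spread
print(softmax_with_temperature(logits, 2.0))  # flatter: more random sampling
```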
Tokenization

Token

The basic unit of text that language models process. A token can be a word, part of a word, or punctuation.

Example: "ChatGPT" might be split into tokens: "Chat" and "GPT".
Model Architecture

Transformer

A neural network architecture that uses self-attention mechanisms, forming the basis of modern language models like GPT and BERT.

Example: Nearly all modern large language models, including GPT, Claude, and Gemini, are built on the transformer architecture.
Training

Transfer Learning

Using knowledge from one task to improve learning on a related but different task, often by starting with a pre-trained model.

Example: Using an image model trained on general photos to recognize medical X-rays.
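A hedged torchvision sketch of the standard recipe: freeze a backbone pretrained on general photos and train only a new head for the target task (a 3-class problem is assumed here). Loading the pretrained weights requires a one-time download:

```python
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")  # backbone pretrained on general photos
for p in model.parameters():
    p.requires_grad = False                       # freeze the transferred knowledge

model.fc = nn.Linear(model.fc.in_features, 3)     # new trainable head for 3 classes
# ...train as usual: only model.fc receives gradient updates
```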

U

Training

Unsupervised Learning

Machine learning where models find patterns in unlabeled data without explicit guidance on what to learn.

Example: Clustering customer data to discover market segments without predefined categories.
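A scikit-learn sketch of the clustering example above; the [annual_spend, visits_per_month] rows are invented, and no labels are provided:

```python
from sklearn.cluster import KMeans

X = [[100, 1], [120, 2], [110, 1], [900, 12], [950, 10], [880, 11]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)  # discovers the groups without any labels

print(labels)  # e.g. [0 0 0 1 1 1]: two market segments emerge
```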

V

Model Type

Vision Transformer (ViT)

An adaptation of transformer architecture for computer vision tasks, processing images as sequences of patches.

Example: Modern image recognition systems increasingly use ViTs instead of CNNs.

Z

Training

Zero-Shot Learning

The ability of a model to perform tasks it wasn't explicitly trained on, using only instructions without examples.

Example: Asking GPT-4 to translate to a rare language without providing translation examples.