A
Artificial Intelligence (AI)
The simulation of human intelligence processes by machines, especially computer systems. These processes include learning, reasoning, problem-solving, perception, and language understanding.
AGI (Artificial General Intelligence)
A theoretical form of AI that possesses the ability to understand, learn, and apply intelligence across a wide range of tasks at a human level or beyond. Unlike narrow AI, AGI can transfer knowledge between different domains.
Attention Mechanism
A technique in neural networks that allows models to focus on specific parts of input data when making predictions. It's fundamental to transformer architectures and modern language models.
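A minimal NumPy sketch of scaled dot-product attention, the core operation behind transformer attention. The shapes and the random input are illustrative, not a production implementation:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal scaled dot-product attention.

    Q, K, V: arrays of shape (seq_len, d_k). Each output row is a weighted
    average of the rows of V, with weights derived from how strongly the
    corresponding query matches each key.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                            # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ V                                         # mix values by attention weight

# Toy example: 3 tokens, 4-dimensional representations
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(x, x, x)   # self-attention: Q = K = V
print(out.shape)                              # (3, 4)
```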
Adversarial Training
A training method where models are exposed to adversarial examples (intentionally perturbed inputs) to improve robustness and generalization.
B
Backpropagation
An algorithm for training neural networks by calculating gradients of the loss function and propagating them backward through the network to update weights.
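A small sketch of backpropagation for a one-hidden-layer network on a made-up regression task, written out by hand in NumPy so the forward pass, the chain-rule gradients, and the weight update are all visible. The dataset and layer sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 3))                 # toy inputs
y = X.sum(axis=1, keepdims=True)             # toy target: sum of the features

# One hidden layer with tanh activation
W1, b1 = rng.normal(size=(3, 8)) * 0.1, np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)) * 0.1, np.zeros(1)
lr = 0.1

for step in range(500):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    loss = ((pred - y) ** 2).mean()

    # Backward pass: apply the chain rule layer by layer
    d_pred = 2 * (pred - y) / len(X)         # dLoss/dPred
    dW2 = h.T @ d_pred
    db2 = d_pred.sum(axis=0)
    d_h = d_pred @ W2.T
    d_pre = d_h * (1 - h ** 2)               # derivative of tanh
    dW1 = X.T @ d_pre
    db1 = d_pre.sum(axis=0)

    # Gradient descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"final loss: {loss:.4f}")
```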
Batch Size
The number of training examples processed together in one iteration before the model's weights are updated.
Bias (Dataset)
Systematic errors in AI systems that result from prejudiced assumptions in training data or model design, leading to unfair outcomes.
C
Chatbot
An AI-powered conversational agent that can interact with users through text or voice, understanding queries and providing relevant responses.
CNN (Convolutional Neural Network)
A deep learning architecture specifically designed for processing grid-like data such as images. Uses convolutional layers to automatically learn spatial hierarchies of features.
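A toy illustration of what a single convolutional filter computes: each output value depends only on a small local patch of the image. The image values and the filter are invented for demonstration:

```python
import numpy as np

def conv2d_single(image, kernel):
    """Valid 2D convolution of one channel with one filter (no padding, stride 1)."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output value looks only at a local patch of the image
            out[i, j] = (image[i:i + kH, j:j + kW] * kernel).sum()
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
edge_filter = np.array([[1.0, 0.0, -1.0]] * 3)   # responds to left-right intensity changes
print(conv2d_single(image, edge_filter))
```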
Computer Vision
A field of AI that enables computers to interpret and understand visual information from the world, including images and videos.
Context Window
The maximum amount of text (measured in tokens) that an AI model can process at once. Larger context windows allow models to handle longer conversations and documents.
D
Deep Learning
A subset of machine learning that uses neural networks with multiple layers (deep neural networks) to learn hierarchical representations of data.
Diffusion Model
A generative model that learns to create data by reversing a gradual noising process. Widely used in image generation.
Dropout
A regularization technique that randomly drops units from a neural network during training to prevent overfitting.
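A minimal sketch of "inverted" dropout, the common variant that rescales the surviving units during training so nothing needs to change at inference time:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p=0.5, training=True):
    """Inverted dropout: zero each unit with probability p during training,
    scaling the survivors by 1/(1-p) so the expected activation is unchanged."""
    if not training or p == 0.0:
        return activations                      # dropout is disabled at inference time
    mask = rng.random(activations.shape) >= p   # keep each unit with probability 1 - p
    return activations * mask / (1.0 - p)

h = np.ones((2, 6))
print(dropout(h, p=0.5))           # roughly half the entries zeroed, the rest scaled to 2.0
print(dropout(h, training=False))  # unchanged when not training
```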
E
Embedding
A representation of data (text, images, etc.) as dense vectors in a continuous vector space, where similar items are positioned closer together.
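A short sketch of how embeddings are compared with cosine similarity. The 4-dimensional vectors below are made up purely for illustration; real embedding models produce vectors with hundreds or thousands of dimensions:

```python
import numpy as np

def cosine_similarity(a, b):
    """Similarity of two embedding vectors: near 1.0 means same direction, near 0.0 unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings, invented for this example
emb = {
    "cat":    np.array([0.90, 0.80, 0.10, 0.00]),
    "kitten": np.array([0.85, 0.75, 0.20, 0.05]),
    "car":    np.array([0.10, 0.00, 0.90, 0.80]),
}
print(cosine_similarity(emb["cat"], emb["kitten"]))  # high: semantically close
print(cosine_similarity(emb["cat"], emb["car"]))     # low: unrelated
```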
Epoch
One complete pass through the entire training dataset during the training process of a machine learning model.
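A skeleton training loop showing how epochs and batch size relate: each epoch shuffles the dataset and walks through it one batch at a time, with one weight update per batch. The dataset and sizes are arbitrary, and the actual model update is elided:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))      # toy dataset: 1000 examples
y = rng.integers(0, 2, size=1000)

batch_size = 64
num_epochs = 3

for epoch in range(num_epochs):                   # one epoch = one full pass over the data
    order = rng.permutation(len(X))               # reshuffle each epoch
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        x_batch, y_batch = X[idx], y[idx]         # one batch of up to `batch_size` examples
        # ... forward pass, loss, backpropagation, and weight update would go here ...
    print(f"epoch {epoch + 1}: {int(np.ceil(len(X) / batch_size))} weight updates")
```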
F
Fine-tuning
The process of taking a pre-trained model and further training it on a specific task or dataset to adapt it to particular needs.
Few-Shot Learning
The ability of a model to learn a task from just a few examples, typically by including those examples in the prompt rather than retraining the model.
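An illustrative few-shot prompt: the reviews and labels below are invented, and the point is only that the examples live in the prompt itself, so no model weights change.

```python
# A few-shot prompt: the model sees a handful of labeled examples in-context
# and is expected to continue the pattern.
few_shot_prompt = """Classify the sentiment of each review as positive or negative.

Review: "The battery lasts all day and the screen is gorgeous."
Sentiment: positive

Review: "Stopped working after a week and support never replied."
Sentiment: negative

Review: "Setup took five minutes and everything just worked."
Sentiment:"""

# This string would be sent to a language model, which should answer
# "positive" by following the pattern established above.
print(few_shot_prompt)
```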
G
GAN (Generative Adversarial Network)
A neural network architecture with two competing networks: a generator that creates data and a discriminator that evaluates it, improving through competition.
Generative AI
AI systems that can create new content, including text, images, music, code, and video, based on patterns learned from training data.
GPT (Generative Pre-trained Transformer)
A family of large language models developed by OpenAI that use transformer architecture and are pre-trained on vast amounts of text data.
H
Hallucination
The phenomenon of an AI model generating information that sounds plausible but is factually incorrect or entirely fabricated.
Hyperparameter
Configuration settings for a machine learning model that are set before training begins, such as learning rate, batch size, and number of layers.
I
Inference
The process of using a trained AI model to make predictions or generate outputs on new, unseen data.
Instruction Tuning
Fine-tuning language models to better follow user instructions and prompts by training on instruction-response pairs.
L
Large Language Model (LLM)
A neural network with billions of parameters, trained on massive text datasets to understand and generate human-like text.
Learning Rate
A hyperparameter that controls how much model weights are adjusted at each training step. A rate that is too high makes training unstable or divergent; one that is too low makes training slow.
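A tiny worked example of the effect of the learning rate, using gradient descent on the one-variable function f(w) = (w - 3)^2, whose minimum is at w = 3:

```python
# Gradient descent on f(w) = (w - 3)^2; the learning rate scales each step
# along the negative gradient.
def minimize(learning_rate, steps=20, w=0.0):
    for _ in range(steps):
        grad = 2 * (w - 3)              # derivative of (w - 3)^2
        w = w - learning_rate * grad    # the update rule: w <- w - lr * gradient
    return w

print(minimize(0.1))    # converges close to 3
print(minimize(0.01))   # moves toward 3, but slowly
print(minimize(1.1))    # too large: overshoots and diverges
```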
Loss Function
A mathematical function that measures how far a model's predictions are from the actual values, guiding the training process.
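Two common loss functions as short sketches: mean squared error for regression and cross-entropy for classification. The example values are arbitrary:

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Typical regression loss: average squared distance from the targets."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

def cross_entropy(y_true, probs, eps=1e-12):
    """Typical classification loss: penalizes assigning low probability to the true class.
    y_true holds class indices; probs holds predicted class probabilities per example."""
    probs = np.clip(probs, eps, 1.0)
    return float(-np.mean(np.log(probs[np.arange(len(y_true)), y_true])))

print(mean_squared_error([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))        # small: predictions are close
print(cross_entropy([0, 1], np.array([[0.9, 0.1], [0.2, 0.8]])))   # small: mostly confident and correct
```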
M
Machine Learning (ML)
A subset of AI where systems learn from data to improve performance without being explicitly programmed for specific tasks.
Multimodal AI
AI systems that can process and understand multiple types of data simultaneously, such as text, images, audio, and video.
N
Natural Language Processing (NLP)
A branch of AI focused on enabling computers to understand, interpret, and generate human language.
Neural Network
A computing system inspired by biological neural networks, consisting of interconnected nodes (neurons) organized in layers that process information.
O
Overfitting
When a model learns the training data too well, including noise and outliers, resulting in poor performance on new data.
P
Parameters
The learned weights and biases in a neural network that are adjusted during training. Model size is often measured in parameters.
Prompt Engineering
The practice of crafting effective inputs (prompts) to get desired outputs from AI models, especially language models.
Pre-training
The initial phase of training where a model learns general patterns from large amounts of unlabeled data before being fine-tuned for specific tasks.
R
RAG (Retrieval-Augmented Generation)
A technique that enhances language models by retrieving relevant information from external sources before generating responses.
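A deliberately simplified retrieve-then-generate loop. Real RAG systems use vector search over embeddings and a real language model; the keyword-overlap retriever and the `call_language_model` stub below are stand-ins for illustration only:

```python
documents = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Photosynthesis converts sunlight, water, and CO2 into glucose and oxygen.",
    "The Great Wall of China is over 13,000 miles long.",
]

def retrieve(query, docs, k=1):
    """Rank documents by naive keyword overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return scored[:k]

def call_language_model(prompt):
    # Placeholder: a real system would send the prompt to an LLM here.
    return f"[model response grounded in the provided context]\n---\n{prompt}"

query = "When was the Eiffel Tower completed?"
context = "\n".join(retrieve(query, documents))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(call_language_model(prompt))
```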
Reinforcement Learning
A type of machine learning where agents learn by interacting with an environment and receiving rewards or penalties for actions.
RLHF (Reinforcement Learning from Human Feedback)
A training method in which humans rate or rank model outputs; these judgments train a reward model that then guides reinforcement learning, aligning the model more closely with human preferences.
S
Self-Attention
A mechanism that allows models to weigh the importance of different parts of the input relative to each other.
Supervised Learning
A type of machine learning where models learn from labeled training data with known input-output pairs.
T
Temperature
A parameter that controls randomness in AI text generation by rescaling the model's probability distribution over tokens before sampling. Higher values produce more varied, creative output; lower values make output more focused and deterministic.
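A minimal sketch of how temperature rescales token probabilities before sampling; the logits are made-up scores for three candidate tokens:

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw model scores (logits) into sampling probabilities.
    Lower temperature sharpens the distribution; higher temperature flattens it."""
    scaled = np.asarray(logits, dtype=float) / temperature
    exp = np.exp(scaled - scaled.max())      # subtract the max for numerical stability
    return exp / exp.sum()

logits = [2.0, 1.0, 0.1]                     # scores for three candidate tokens
print(softmax_with_temperature(logits, 0.5)) # low T: heavily favors the top token
print(softmax_with_temperature(logits, 1.0)) # default
print(softmax_with_temperature(logits, 2.0)) # high T: probabilities even out, output more random
```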
Token
The basic unit of text that language models process. A token can be a word, part of a word, or punctuation.
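A toy splitter to show that models operate on pieces of text rather than whole sentences. Real tokenizers use subword schemes such as byte-pair encoding, so actual token boundaries will differ from this illustration:

```python
import re

# Toy tokenizer for illustration only: production language models use subword
# tokenization, often splitting words into pieces like "token" + "ization".
def toy_tokenize(text):
    return re.findall(r"\w+|[^\w\s]", text)

print(toy_tokenize("Tokenization isn't magic!"))
# ['Tokenization', 'isn', "'", 't', 'magic', '!']
```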
Transformer
A neural network architecture that uses self-attention mechanisms, forming the basis of modern language models like GPT and BERT.
Transfer Learning
Using knowledge from one task to improve learning on a related but different task, often by starting with a pre-trained model.
U
Unsupervised Learning
Machine learning where models find patterns in unlabeled data without explicit guidance on what to learn.
V
Vision Transformer (ViT)
An adaptation of transformer architecture for computer vision tasks, processing images as sequences of patches.
Z
Zero-Shot Learning
The ability of a model to perform tasks it wasn't explicitly trained on, using only instructions without examples.