ThinkLLM

Glossary

Technical terms explained for non-experts. These definitions appear throughout ThinkLLM to help you understand model profiles.

3559 terms

1

1-Bit Architecture

Architecture

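A model design where weights are restricted to a very small set of discrete values instead of continuous floating-point numbers, drastically reducing model size and computation. Strictly 1-bit weights take two values (such as -1 or 1); the common three-value variant (-1, 0, or 1) is ternary and needs about 1.58 bits per weight.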

1-Bit Model

Architecture

A neural network where each weight is stored using roughly 1 bit of information. Models described this way often use three values (-1, 0, or 1), which strictly requires about 1.58 bits per weight rather than exactly 1.

1-Bit Precision

Architecture

An extreme form of quantization where each weight is represented by just a single bit (two possible values, typically interpreted as -1 or +1), maximizing compression but reducing model expressiveness.

1-Bit Quantization

Deployment

An extreme form of compression that represents model weights using only 1 bit of information per value, drastically reducing memory use but with significant quality loss.

16-Bit Precision

Formats

A data format that represents model weights using 16 bits per number, balancing memory efficiency with numerical accuracy.

3

3D Gaussians

Techniques

Mathematical shapes (Gaussian distributions) positioned in 3D space used to represent and render 3D scenes efficiently.

3D Layout Conditioning

Techniques

Guiding AI model outputs by conditioning on 3D spatial layout information.

3D Scene Reconstruction

Techniques

Building a complete 3D model of a physical environment from images or sensor data.

3D Scene Understanding

Techniques

Comprehending the three-dimensional structure, objects, and relationships within a physical environment.

4

4-Bit Integer Quantization

Techniques

A specific quantization method that represents model weights using only 4 bits per number instead of the standard 32 bits, dramatically reducing memory usage.

4-bit Precision

Performance

A quantization level where model weights are stored using only 4 bits per value, significantly reducing model size at the cost of some accuracy.

4-bit Quantization

Techniques

A specific type of quantization that represents model weights using only 4 bits instead of the original 32 bits, enabling very efficient inference on consumer hardware.

5

5-bit Quantization

Formats

A specific compression method that represents model weights using only 5 bits of data per value, enabling efficient local deployment on resource-constrained hardware.

6

6-bit Precision

Architecture

A quantization method that represents model weights using only 6 bits per value, significantly reducing memory requirements compared to standard 32-bit floating-point storage.

6-bit Quantization

Techniques

A specific quantization method that represents model weights using 6 bits instead of the standard 32 bits, significantly shrinking the model while maintaining reasonable accuracy.

8

8-bit Precision

Formats

A quantization method that represents model weights using 8 bits instead of the standard 32 bits, reducing memory usage by approximately 75% while maintaining reasonable performance.

8-bit Quantization

Formats

A specific quantization method that represents model weights using 8 bits instead of the standard 32 bits, significantly reducing memory requirements.
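As a rough illustration of the memory arithmetic behind the bit-width entries above, the sketch below estimates weight storage at different precisions and performs a simple symmetric 8-bit round trip. The 7-billion-parameter model and the function names are hypothetical, and real quantization schemes usually work per channel or per block rather than per tensor.

```python
def model_size_gb(num_params: int, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def quantize_int8(weights):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    # The scale maps the largest magnitude to 127; fall back to 1.0 for all-zero input.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floating-point weights from the integers."""
    return [v * scale for v in q]


params = 7_000_000_000  # hypothetical model size
size_32 = model_size_gb(params, 32)
size_8 = model_size_gb(params, 8)
saving = 1 - size_8 / size_32  # fraction of memory saved going from 32 to 8 bits

q, scale = quantize_int8([0.5, -1.0, 0.25])
restored = dequantize(q, scale)
```

The round trip loses a little precision: each restored weight differs from the original by at most about half of `scale`.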

A

Abliterated

Behavior

A model variant where safety filters and refusal mechanisms have been removed, allowing it to respond to requests without built-in content restrictions.

Abliteration

Techniques

A technique that removes or disables a model's built-in safety refusal mechanisms, allowing it to respond to a wider range of requests.

Abnormality Localization

Techniques

Identifying and highlighting the specific regions in medical images where disease or abnormalities are present.

Abnormality Maps

Techniques

Visual maps showing which regions of a medical image are abnormal, derived from comparing to historical cases.

Absolute Query-Key Relevance

Techniques

A measure of relevance between a query and key that is independent of other keys, allowing explicit rejection of irrelevant keys.

Abstention

Techniques

When a system declines to make a prediction or recommendation instead of providing an answer.

Abstract Syntax Tree (AST)

Techniques

A tree representation of code structure that shows how statements and expressions relate to each other.
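For a concrete picture, Python's standard `ast` module can parse a line of code into such a tree. The sketch below is only an illustration; the variable names in the parsed snippet are made up.

```python
import ast

# Parse one assignment statement into a syntax tree.
tree = ast.parse("total = price * (1 + tax)")
assign = tree.body[0]  # the top-level statement

# Walk the statement's subtree and collect the node type names.
node_types = [type(node).__name__ for node in ast.walk(assign)]

# The right-hand side is a multiplication whose right operand is an addition.
outer = assign.value
inner = outer.right
```

Here `assign` is an `Assign` node, `outer` is a `BinOp` with a `Mult` operator, and `inner` is a nested `BinOp` with an `Add` operator, which mirrors how the parentheses group the expression.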

Abstractive Summarization

Techniques

Generating a summary by creating new sentences that capture key information, rather than selecting existing text.

Acceptance Rate

Techniques

The proportion of the draft model's proposed tokens that the target model accepts as correct during speculative decoding.
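A minimal sketch of how this proportion could be computed, assuming a simplified greedy setting in which draft tokens are accepted up to the first position where the target model disagrees. Real speculative decoding typically uses a probabilistic acceptance rule; the function names and sample tokens here are hypothetical.

```python
def accepted_prefix_len(draft_tokens, target_tokens):
    """Count draft tokens accepted before the first disagreement."""
    accepted = 0
    for d, t in zip(draft_tokens, target_tokens):
        if d != t:
            break
        accepted += 1
    return accepted


def acceptance_rate(rounds):
    """rounds: list of (draft_tokens, target_tokens) pairs, one per decoding round."""
    proposed = sum(len(draft) for draft, _ in rounds)
    accepted = sum(accepted_prefix_len(draft, target) for draft, target in rounds)
    return accepted / proposed if proposed else 0.0


rounds = [
    (["a", "b", "c", "d"], ["a", "b", "x", "y"]),  # 2 of 4 accepted
    (["e", "f"], ["e", "f"]),                      # 2 of 2 accepted
]
rate = acceptance_rate(rounds)  # 4 accepted out of 6 proposed
```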

Accessibility

Techniques

Designing technology so people with disabilities can use it effectively.

Accountability Attribution

Techniques

Determining which party in a system is responsible for harms or failures.

Accuracy-Effort Trade-off

Techniques

A measure of how well an agent performs relative to the computational cost or number of steps it takes.

Acoustic Representation

Architecture

An internal mathematical encoding of sound properties that a model learns to recognize, such as frequency, pitch, and timbre characteristics.

Acquisition Function

Techniques

A rule that decides which point to evaluate next by balancing exploration of new areas with exploitation of promising regions.
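One widely used example is the upper confidence bound (UCB), which scores each candidate by its predicted value plus a bonus for uncertainty. The sketch below assumes the predicted means and standard deviations are already available (for instance from a surrogate model); the candidate names and numbers are invented.

```python
def ucb(mean, std, kappa=2.0):
    """Upper confidence bound: a larger kappa favors exploration."""
    return mean + kappa * std


def next_point(candidates, kappa=2.0):
    """Pick the candidate with the highest acquisition score."""
    return max(candidates, key=lambda c: ucb(c["mean"], c["std"], kappa))


candidates = [
    {"name": "A", "mean": 0.8, "std": 0.05},  # promising and well understood
    {"name": "B", "mean": 0.6, "std": 0.30},  # less promising but uncertain
]

explore = next_point(candidates, kappa=2.0)  # uncertainty bonus favors B
exploit = next_point(candidates, kappa=0.0)  # pure exploitation favors A
```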

Action Binding

Techniques

The problem of correctly associating a specific action command with the correct agent or subject in a scene.

Action Blindness

Techniques

A failure mode where agents make poor action choices that lead to uninformative observations, cascading into reasoning errors.

Action Recognition

Techniques

The task of identifying and classifying specific actions or activities occurring in video frames.

Action-Conditioned Generation

Techniques

Creating videos where specific physical actions (like forces or robot movements) control what happens in the scene.

Action-Conditioned Rollouts

Techniques

Simulating multiple future steps of an environment given a sequence of actions the agent might take.

Actionable Representation

Techniques

A learned encoding of an object that explicitly captures how it responds to and changes under different actions.

Activated Parameters

Architecture

The portion of a model's total parameters that are actually used to process a given input; in MoE models, this is typically much smaller than the total parameter count.
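The gap between total and activated parameters in a mixture-of-experts (MoE) model can be estimated with simple arithmetic. This is a sketch under simplifying assumptions: all experts are the same size, a fixed number of experts runs per token, and every number below is hypothetical.

```python
def moe_param_counts(shared_params, params_per_expert, num_experts, experts_per_token):
    """Return (total, activated) parameter counts for a simplified MoE model."""
    total = shared_params + params_per_expert * num_experts
    activated = shared_params + params_per_expert * experts_per_token
    return total, activated


# Hypothetical model: 1B shared parameters, 8 experts of 0.5B each, 2 experts per token.
total, activated = moe_param_counts(1_000_000_000, 500_000_000, 8, 2)
fraction_active = activated / total
```

In this made-up configuration the model stores 5 billion parameters but uses only 2 billion of them for any given token.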

Activation Distribution

Techniques

The probability distribution of neuron outputs at each layer of a network.

Activation Noise

Techniques

Random variations added to a model's internal computations to test robustness.

Activation Patching

Techniques

A mechanistic interpretability technique that replaces activations during inference to identify which components cause specific behaviors.

Activation Pattern

Techniques

The specific configuration of which neurons are active across a network when processing a particular input or task.

Activation Precision

Architecture

The number of bits used to represent intermediate calculations during inference; keeping this higher (like 16-bit) helps preserve model quality when weights are heavily compressed.

Activation Probing

Techniques

Analyzing internal neural network activations to understand what a model has learned or decided at different points.

Activation Quantization

Techniques

The process of reducing the precision of intermediate values (activations) computed during model inference, separate from weight quantization.

Activation Steering

Techniques

Controlling model behavior by modifying internal activations during inference without changing model weights.

Activation-based Jailbreaking

Techniques

Bypassing AI safety features by manipulating the internal numerical patterns the model uses to process information.

Active Learning

Techniques

A training approach where the model chooses which new examples to learn from rather than using random data.

Active Parameter Count

Performance

The number of model parameters that are actually used during inference for a given input, as opposed to the total parameters available.

Active Parameter Design

Architecture

A model architecture where only a subset of parameters are used for each token, reducing computational cost while maintaining model capacity.

Active Parameters

Architecture

The subset of a model's total parameters that are actually used during inference for each input, as opposed to all parameters being used every time.

Acyclicity Constraint

Techniques

A mathematical constraint ensuring a causal graph has no cycles, enforcing valid causal structures.

AdamW

Techniques

A standard optimizer commonly used to train neural networks. It is a variant of Adam that adjusts weights based on gradients and applies weight decay as a separate step, decoupled from the gradient update.
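A single AdamW update for one scalar weight might look like the sketch below. It follows the usual formulation (moving averages of the gradient and squared gradient, bias correction, and weight decay applied directly to the weight); the default hyperparameter values are illustrative rather than taken from any particular library.

```python
import math


def adamw_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update for a scalar weight; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad           # moving average of gradients
    v = beta2 * v + (1 - beta2) * grad * grad    # moving average of squared gradients
    m_hat = m / (1 - beta1 ** t)                 # bias correction
    v_hat = v / (1 - beta2 ** t)
    # Weight decay acts on the weight itself, not through the gradient.
    w = w - lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * w)
    return w, m, v


# With a zero gradient, only the weight decay term shrinks the weight.
w_decay, _, _ = adamw_step(1.0, 0.0, 0.0, 0.0, t=1, lr=0.1, weight_decay=0.1)

# With no weight decay, the first step moves the weight by about lr.
w_grad, _, _ = adamw_step(1.0, 1.0, 0.0, 0.0, t=1, lr=0.1, weight_decay=0.0)
```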

Adapter

Techniques

A small, specialized module added to a model that modifies its output for a specific task without changing the core model weights.

Adapter Code

Techniques

Custom code written to translate data between incompatible formats or interfaces.

Adapter-Based Architecture

Techniques

Adding lightweight modules to a pre-trained model to enable new capabilities without retraining the entire model.

Adaptive Attack

Techniques

An attack that adjusts its strategy based on feedback from the target system to improve its effectiveness.

Adaptive Learning

Techniques

Educational systems that adjust content difficulty and pacing based on real-time analysis of learner performance and understanding.

Adaptive Policy

Techniques

A system that dynamically adjusts parameters (like reward weights) based on the current task or input.

Adaptive Prompting

Techniques

Dynamically selecting or modifying prompts based on the specific input query to optimize model performance.

Adaptive Quantization

Techniques

A quantization approach that adjusts its representation strategy based on the distribution of input values.

Adaptive Reasoning

Techniques

Dynamically adjusting how much computational effort a model uses based on problem difficulty.

ADMM

Techniques

Alternating Direction Method of Multipliers: an optimization algorithm that splits a problem into smaller subproblems and solves them in alternation.

Adversarial Attack

Techniques

Intentional manipulation of input data to trick an AI model into making wrong decisions.

Adversarial Auditing

Techniques

Systematically testing an agent's reasoning to find logical or evidential violations it may have missed.

Adversarial Co-Evolution

Techniques

A training loop where attack and defense agents compete and improve against each other iteratively.

Adversarial Evaluation

Techniques

Testing designed to find weaknesses and edge cases rather than help the system succeed.

Adversarial Examples

Techniques

Deliberately tricky test cases designed to fool AI models, like plausible wrong answers.

Adversarial Falsification

Techniques

Systematically searching for inputs where a model fails; for example, finding materials where ML predictions diverge from ground truth.

Adversarial Learning

Techniques

Training where two networks compete: one generates behavior, and the other judges whether it matches the expert.

Adversarial Loop

Techniques

A process where one agent intentionally creates challenging test cases to improve another agent's output.

Adversarial Objectives

Techniques

Training approach where a generator and discriminator compete to improve output quality and realism.

Adversarial Perturbations

Techniques

Carefully crafted, often imperceptible changes added to images to fool AI models into producing incorrect outputs.

Adversarial Prompting

Techniques

Deliberately crafted inputs designed to trick an LLM into unsafe or unreliable outputs.

Adversarial Robustness

Techniques

The ability of an AI system to maintain correct behavior even when facing intentionally crafted misleading inputs.

Adversarial Training

Techniques

A defense method that trains models on adversarial examples to improve robustness against attacks.

Adversarial Training-Free Defense

Techniques

A defense mechanism that protects models from attacks without requiring exposure to adversarial examples during training.

Aesthetic Assessment

Techniques

Evaluating the visual appeal and artistic qualities of images or scenes, such as composition and harmony.

Affect Coupling

Techniques

Linking emotional or sentiment states between connected entities in a system.

Affective Polarity

Techniques

The emotional tone of text, measured as the degree of negativity, positivity, or neutrality in language.

Affordance Prediction

Techniques

Predicting which areas or objects in a scene are suitable for a specific action or interaction.

Agency

Techniques

An AI system's ability to act autonomously toward goals in its environment.

Agent Autonomy

Techniques

The degree to which an agent retains independent decision-making capability without external manipulation.

Agent Harness

Techniques

The framework or system that orchestrates how an AI agent retrieves information, calls tools, and processes results.

Agent Orchestration

Techniques

Coordinating multiple AI agents to work together on complex tasks.

Agent Recommendation

Techniques

Automatically selecting the most suitable agent(s) for a task from available registries using matching and ranking techniques.

Agent Skill

Techniques

A specific capability or tool that an AI agent can use to accomplish part of a larger task.

Agent Trajectory

Techniques

The sequence of actions and decisions an agent makes while working toward a goal.

Agent-Based Model

Techniques

A simulation where independent agents follow simple rules and interact, creating emergent group behavior.

Agentic

Behavior

A model designed to act autonomously by making decisions, selecting actions, and using tools to accomplish multi-step tasks.

Agentic AI

Techniques

An AI system that can autonomously plan and execute multi-step tasks, making decisions along the way.

Agentic Behavior

Behavior

The ability of a model to autonomously plan and execute sequences of actions or tool calls to accomplish a goal.

Agentic Coding

Behavior

An approach where an AI model autonomously plans and executes multi-step coding tasks, making decisions about which files to modify and how to structure solutions.

Agentic Depth

Techniques

Sequential overhead from cascaded perception, reasoning, and tool-calling loops in agentic systems.

Agentic Engineering

Techniques

Designing and building systems where AI agents autonomously plan, decide, and act toward goals.

Agentic Evaluation

Techniques

Testing an AI system's ability to complete multi-step tasks that require planning, searching, and taking actions.

Agentic Framework

Techniques

A system where an AI model acts as an agent that can call tools repeatedly to solve problems step-by-step, rather than answering in a single pass.

Agentic Language

Techniques

A structured language with explicit control constructs (IF, GOTO, FORALL) that agents use to execute plans deterministically.

Agentic Microphysics

Techniques

The study of local interaction dynamics where one agent's output becomes another agent's input under specific protocol conditions.

Agentic Multimodal Models

Techniques

AI systems that can process multiple types of input (text, images, etc.) and actively interact with external tools and environments.

Agentic Perception

Techniques

Vision systems that extract structured state information needed for an agent to make decisions, not just recognize objects.

Agentic Reasoning

Techniques

Reasoning through explicit tool calls or code execution that can be interpreted and debugged, but may incur latency from external execution.

Agentic Reinforcement Learning

Techniques

Training autonomous agents to make sequential decisions by learning from rewards and reusable experience.

Agentic Reinforcement Learning

Training

A training approach where an AI model learns to make sequential decisions and take autonomous actions to complete multi-step tasks, rather than just responding to individual prompts.

Agentic Search Systems

Techniques

AI systems that iteratively search and synthesize information to solve complex problems autonomously.

Agentic Self-Correction

Techniques

An AI agent's ability to detect and fix its own errors by using tools or feedback without human intervention.

Agentic Strategies

Techniques

Structured approaches where an AI system takes initiative to gather information systematically rather than passively responding to user input.

Agentic Systems

Techniques

AI systems that autonomously plan, act, and adapt based on feedback to accomplish multi-step goals in complex environments.

Agentic Tasks

Behavior

Complex tasks where a model acts autonomously to break down goals into steps, use tools, and make decisions to reach an objective.

Agentic Workflows

Behavior

Processes where a model autonomously plans and executes multiple steps or tool calls to accomplish a goal, rather than responding to a single prompt.

Aggregation

Techniques

Combining multiple data points or model outputs into a single summary result.

AI-Augmented Ecosystems

Techniques

Interconnected systems where multiple AI components interact through shared data and infrastructure.

Aleatoric Uncertainty

Techniques

Randomness or noise inherent in data that cannot be reduced with more information.

Algorithmic Bias

Techniques

Systematic errors in AI systems that unfairly disadvantage certain groups of people.

Algorithmic Fairness

Techniques

Ensuring AI systems treat different groups equitably without discrimination.

Algorithmic Monoculture

Techniques

Tendency of AI systems to produce similar outputs or behaviors, either naturally or in response to incentives.

ALiBi Positional Encoding

Architecture

A technique that helps the model understand the order and position of words in long sequences by penalizing attention between distant words, without needing to add extra position information to each word.
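ALiBi (Attention with Linear Biases) subtracts a penalty from each attention score in proportion to the distance between the two tokens, with a different slope per attention head. The sketch below builds that bias for one head and assumes the commonly described geometric slope schedule for a power-of-two number of heads; the function names are hypothetical, and the causal masking of future positions is included only for completeness.

```python
def alibi_slopes(num_heads):
    """Geometric slope schedule, assuming num_heads is a power of two."""
    return [2 ** (-8 * (h + 1) / num_heads) for h in range(num_heads)]


def alibi_bias(seq_len, slope):
    """Bias added to attention scores: row i is the query, column j is the key."""
    return [
        [-slope * (i - j) if j <= i else float("-inf")  # future keys are masked
         for j in range(seq_len)]
        for i in range(seq_len)
    ]


slopes = alibi_slopes(8)              # one slope per head, from 1/2 down to 1/256
bias = alibi_bias(3, slopes[0])       # bias matrix for the first head
last_row = bias[2]                    # penalties seen by the last token
```

For the last token in this three-token example, the penalties grow with distance: the token itself gets no penalty, its neighbor gets -0.5, and the token two steps back gets -1.0.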

Aligned

Training

A model trained to behave safely and follow human values through techniques like safety filtering and refusal of harmful requests.

Alignment

Training

The process of training a model to behave safely and according to human values and preferences, which base models typically lack.

Alignment Faking

Techniques

When an AI model appears aligned while it believes it is being monitored but pursues different goals when unmonitored.

Alignment Fine-Tuning

Training

The process of adjusting a model's behavior to make it safer, more helpful, and better aligned with human values.

Alignment Guardrails

Training

Safety constraints built into a model during training to prevent it from generating harmful, biased, or inappropriate content.

Alignment Layer

Training

Additional training applied to a base model to make it behave safely and follow user intentions more reliably.

Allocation Monotonicity

Techniques

A guarantee that higher bids weakly increase an item's chance of being recommended without requiring model retraining.

Alpha Release

Deployment

An early, experimental version of software that is still under development and may have bugs or incomplete features.

Ambiguity Bias

Techniques

Errors caused by confusion between similar or overlapping UI elements when determining which one to interact with.

Amino Acid Sequence

Behavior

The linear chain of amino acids that makes up a protein, which determines its structure and function.

Amino Acid Sequences

Formats

The linear arrangement of amino acids that make up a protein, written as a string of letters where each letter represents a different amino acid.

Amortization

Techniques

Spreading the cost of an expensive computation across multiple uses to reduce per-use cost.

Ancestor-Only Attention Mask

Techniques

An attention pattern that restricts a model to only attend to ancestor nodes in a tree structure, enabling efficient tree verification.
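A mask of this kind can be built from a list of parent indices. The sketch below assumes each node may attend to itself and to its ancestors, and uses -1 to mark the root's parent; both conventions are assumptions for illustration.

```python
def ancestor_mask(parents):
    """mask[i][j] is True when node i may attend to node j (itself or an ancestor)."""
    n = len(parents)
    mask = [[False] * n for _ in range(n)]
    for node in range(n):
        current = node
        while current != -1:          # walk up to the root
            mask[node][current] = True
            current = parents[current]
    return mask


# Tree: node 0 is the root, nodes 1 and 2 are its children, node 3 is a child of node 1.
parents = [-1, 0, 0, 1]
mask = ancestor_mask(parents)
```

Node 3 can attend to nodes 0, 1, and 3 but not to its sibling branch at node 2, so several candidate branches can be checked in one pass without interfering with each other.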

Anchor Selection

Techniques

Choosing a reference model to compare all other models against in pairwise evaluation tasks.

Anchoring

Techniques

Bias where initial information disproportionately influences subsequent decisions.

Annotation Aggregation

Techniques

Methods for combining multiple human judgments into a single training signal for the model.
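Two simple aggregation methods are majority voting, which keeps the most common label, and soft labels, which keep the share of annotators behind each label. The sketch below shows both; the label names are invented, and ties in the majority vote are resolved by whichever label appeared first.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label among the annotators."""
    return Counter(labels).most_common(1)[0][0]


def soft_label(labels):
    """Return the fraction of annotators who chose each label."""
    counts = Counter(labels)
    return {label: count / len(labels) for label, count in counts.items()}


annotations = ["toxic", "ok", "toxic"]
hard = majority_vote(annotations)
soft = soft_label(annotations)
```

The soft label preserves the disagreement (two of three annotators chose "toxic") that the hard label discards.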

Annotation Framework

Techniques

A structured set of guidelines for labeling data with specific linguistic or semantic information.

Annotation Pipeline

Techniques

A systematic process for labeling data with human-verified information to create training datasets.

Annotator Disagreement

Techniques

Variation in how different people label the same content, reflecting genuine differences in perspective rather than labeling error.

Anode Material

Techniques

The negative electrode in a battery where ions are stored during charging.

Anomaly Detection

Techniques

Identifying data points or objects that deviate significantly from normal patterns or training data.

Answer Set Programming

Techniques

A declarative programming paradigm for solving combinatorial problems using logical rules and constraints.

Apache 2.0 License

Licensing

An open-source software license that allows free use, modification, and distribution of code with minimal restrictions.

Apache 2.0 License

Licensing

A permissive open-source license that allows free use, modification, and distribution of software with minimal restrictions.

Apache License

Licensing

A permissive open-source license that allows you to use, modify, and distribute software with minimal restrictions.

Apache Licensed

Licensing

Describes software released under the Apache License, a permissive open-source license that allows free use, modification, and distribution with minimal restrictions.

API

Deployment

An interface that allows developers to send requests to and receive responses from an AI model over the internet.

API Access

Deployment

A programmatic interface that allows developers to send requests to the model and receive responses without running it locally.

API Accessibility

Deployment

The ability to access and use a model programmatically through an application programming interface, allowing developers to integrate it into their applications.

API Accessible

Deployment

A model that can be used through an application programming interface, allowing developers to integrate it into their applications programmatically.

API Availability

Deployment

Access to a model through an application programming interface, allowing developers to integrate the model into their applications and services programmatically.

API Compatibility

Deployment

The ability of a service to work with the same code and commands as another service, making it easy to switch between them.

API Deployment

Deployment

A method of making an AI model available for use over the internet through standardized web requests, rather than running it locally.

API Inference

Deployment

Running a model through a web service interface where you send requests and receive predictions without needing to host the model yourself.

API Schema

Techniques

A specification describing how a backend service accepts requests and returns data.

API Token

Techniques

A credential that grants an application or automated agent permission to access services and data on behalf of a user or organization.

API-Based Deployment

Deployment

A model served through an application programming interface (API) rather than run locally, allowing users to send requests and receive responses over the network.

API-Only Access

Deployment

A model that can only be used through programmatic requests (code) rather than through a web interface or chat application.

Append Only Log

Techniques

Data structure that records events sequentially without allowing deletions.

Apple Silicon

Deployment

Apple's custom-designed processors (like M1, M2, M3) used in Mac computers, which include hardware that speeds up running machine learning models.

Apple Silicon Optimization

Deployment

Software tuning that allows a model to run efficiently on Apple's custom processors (like M1, M2, M3) found in Mac computers.

Approximation Ratio

Techniques

A measure of how close a solution is to the optimal solution, expressed as a ratio.

Approximation Theory

Techniques

Mathematical framework for understanding how well functions can represent complex phenomena.

Architecture

Architecture

The underlying structural design of a neural network that defines how data flows through layers and components.

Arithmetic Circuit

Techniques

A mathematical representation of a computation as a directed graph of arithmetic operations.

Arithmetic Reasoning

Evaluation

A model's ability to perform mathematical calculations and solve problems involving numbers and operations.

Arousal

Techniques

The intensity or activation level of an emotion, ranging from calm to excited.

Artificial Neural Network (ANN)

Techniques

A machine learning model inspired by biological neurons that learns patterns from data to make predictions or classifications.

Artistic Style Prediction

Techniques

The task of identifying or classifying the artistic style of a work (e.g., Renaissance, Impressionism) using AI.

Aspect-Decomposed Synthetic Corpus

Techniques

Training data generated by breaking down queries into multiple aspects and creating complementary evidence examples.

Associative Memory

Techniques

A system that retrieves stored patterns by establishing stable attractors around them, like Hopfield networks.

Associative Reasoning

Techniques

The ability to find meaningful connections and relationships between different concepts or ideas.

Asymmetric Encoding

Techniques

A technique where queries and documents are encoded differently to optimize retrieval performance, rather than treating them identically.

Asymmetric Search

Techniques

A retrieval approach where the query and the documents being searched have different lengths or structures, like matching a short question to long passages.

Asymptotic-Preserving

Techniques

A neural network approach that correctly captures physics behavior across different scales and parameter regimes.

Attention

Architecture

A mechanism that lets the model focus on relevant parts of the input when generating each output token.
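The most common form, scaled dot-product attention, can be sketched for a single query in plain Python. The query is compared with every key, the scores are turned into weights that sum to 1, and the output is the weighted average of the values. The tiny vectors below are invented for illustration.

```python
import math


def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    dim = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in keys]
    weights = softmax(scores)
    output = [sum(w * value[i] for w, value in zip(weights, values))
              for i in range(len(values[0]))]
    return output, weights


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]      # the first key matches the query
values = [[10.0, 0.0], [0.0, 10.0]]
output, weights = attend(query, keys, values)
```

Because the first key points in the same direction as the query, it receives the larger weight and the output leans toward the first value.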

Attention Head

Techniques

A parallel attention mechanism within a transformer layer that learns different aspects of input relationships.

Attention Maps

Techniques

Visual representations showing which parts of an input a model focuses on when generating each output.

Attention Mechanism

Architecture

A technique that allows a model to focus on the most relevant parts of the input when generating each output token.

Attention Partition

Techniques

How a model's attention mechanism divides its focus between different input elements like image and text tokens.

Attention Pass

Techniques

A single forward computation through an attention mechanism that produces weighted outputs from input queries, keys, and values.

Attention Pooling

Techniques

Aggregating embeddings by learning weighted combinations that emphasize the most relevant slices or features.

Attention Sink

Techniques

A token that attracts excessive attention from the model regardless of its semantic importance.

Attention Sinks

Techniques

Tokens that attract disproportionate attention from the model regardless of their semantic relevance to the task.

Attention Visualization

Techniques

Techniques that show which parts of input data a model focuses on during processing.

Attention-Based Grounding Score

Techniques

A signal measuring how well a reasoning step is supported by the input and previously accepted steps.

Attractor Module

Techniques

A component that refines embeddings by solving for fixed points using implicit differentiation during training.

Attribute Inference

Techniques

Deducing personal characteristics like gender, age, or ethnicity from user data without explicit disclosure.

Attribution Method

Techniques

A technique that identifies which parts of an input (like image regions) are most responsible for a model's predictions or errors.

Attribution-Based Neuron Mining

Techniques

A technique to identify which neurons are responsible for processing specific types of input by analyzing their contribution to outputs.

AUC-Consistency Dissociation

Techniques

When a model maintains high classification performance (measured by AUC) while its explanations become inconsistent across similar cases.

Audio Captioner

Techniques

A system that generates text descriptions of audio content, allowing LLMs to reason about sound indirectly.

Audio Classification

Behavior

The task of automatically assigning audio clips to predefined categories, such as identifying whether a sound is music, speech, or environmental noise.

Audio Codec

Formats

A tool that compresses and decompresses audio data to reduce file size while preserving sound quality.

Audio Conditioning

Techniques

Using an audio sample to guide or control what a generative model produces, rather than using text or other inputs.

Audio Embedding

Architecture

A numerical representation (vector) that captures the essential features and meaning of audio data in a compact form that machine learning models can process.

Audio Embeddings

Architecture

Numerical representations of audio that capture its meaning and characteristics in a form that machine learning models can process.

Audio Encoder

Techniques

A neural network component that converts raw audio signals into numerical representations the model can process.

Audio Fidelity

Performance

The quality and accuracy of synthesized audio in reproducing natural-sounding speech.

Audio Reconstruction

Techniques

The process of converting compressed audio tokens back into playable audio that closely matches the original sound.

Audio Transcription

Techniques

Converting spoken audio content into written text for analysis.

Audio-Language Pretraining

Training

A training approach that teaches a model to understand connections between audio sounds and text descriptions by learning from large datasets of audio paired with text.

Audio-Visual Processing

Architecture

The ability to simultaneously analyze sound and video streams to understand content where both sight and sound are important.

Audio-Visual Understanding

Behavior

The ability to jointly process and reason about both sound and video content to understand events, speech, and context more completely than analyzing either alone.

Auditing

Techniques

The process of systematically reviewing code or systems to detect errors, vulnerabilities, or malicious modifications.

Auditory Knowledge

Techniques

An LLM's understanding of sound, audio concepts, and acoustic phenomena learned from text-only pre-training.

Augmented Lagrangian Method

Techniques

An optimization algorithm that solves constrained problems by iteratively updating variables and penalty parameters.

AUROC

Techniques

Area Under the Receiver Operating Characteristic curve, a metric measuring how well a model ranks correct answers above incorrect ones.
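One way to read AUROC is as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. The sketch below computes it directly from that definition by comparing every positive-negative pair, counting ties as half. It is quadratic in the number of examples, so it is only meant for illustration; the scores and labels are invented.

```python
def auroc(scores, labels):
    """Pairwise AUROC: labels are 1 (positive) or 0 (negative)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
value = auroc(scores, labels)  # 3 of the 4 positive-negative pairs are ranked correctly
```

A value of 1.0 means every positive outranks every negative, and 0.5 is what random scoring would achieve on average.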

Authorial Intent

Techniques

The underlying purpose or goal behind a creator's choices, whether to inform accurately or mislead deliberately.

Auto-tuning

Techniques

Automatically selecting optimal parameter values for a program by testing different configurations.

Autocomplete

Behavior

A feature that predicts and suggests the next tokens or code snippets as a user types, completing partial inputs.

Autoencoder

Techniques

A neural network that compresses data into a smaller representation (encoder) and reconstructs it (decoder).

Automated Evaluation

Techniques

Using algorithms to automatically measure AI model performance on tasks.

Automated Program Repair

Techniques

Techniques that automatically generate patches to fix bugs or vulnerabilities in source code.

Automated Programming Assessment

Techniques

Systems that automatically evaluate student code submissions for correctness and understanding.

Automated Verification

Techniques

Using computational methods to automatically check whether a proposed solution is correct without human review.

Automatic Differentiation

Techniques

Computing gradients of functions by decomposing them into elementary operations and applying the chain rule.
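
One way to picture this is forward-mode differentiation with "dual numbers": every value carries its derivative along, and each elementary operation applies its own differentiation rule. The `Dual` class and helper names below are illustrative, not taken from any particular library, and only addition, multiplication, and sine are covered.

```python
import math


class Dual:
    """A value paired with its derivative with respect to the input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def sin(x):
    # Chain rule: d/dx sin(u) = cos(u) * u'
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def derivative(f, x):
    """Evaluate f at x with the input's derivative seeded to 1."""
    return f(Dual(x, 1.0)).deriv


# f(x) = x^2 + 3x, so f'(x) = 2x + 3 and f'(2) = 7
print(derivative(lambda x: x * x + 3 * x, 2.0))  # 7.0
```

Training frameworks typically use reverse mode (backpropagation) instead, which is cheaper when there are many inputs and a single output, but the decomposition into elementary rules is the same.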

Automatic Speech Recognition (ASR)

Techniques

Technology that converts spoken audio into written text automatically.

Automation Bias

Techniques

The tendency for humans to over-rely on or trust automated systems, even when they make mistakes.

AutoML (Automated Machine Learning)

Techniques

Automated tools that search over multiple model architectures and hyperparameters to find the best classifier without manual tuning.

Autonomous Agent

Behavior

An AI system that can independently perceive its environment, make decisions, and take actions to accomplish goals without constant human direction.

Autonomous Agents

Behavior

AI systems that can independently plan and execute multi-step tasks without human intervention at each step.

Autonomous Feedback Loop

Techniques

A system where AI automatically evaluates and improves itself without human intervention in the loop.

Autonomous Play

Techniques

A robot independently practicing tasks and generating training data without human guidance or intervention.

Autonomy Spectrum

Techniques

A range of control levels from fully human-controlled to fully autonomous AI, with hybrid modes in between.

Autoregressive

Architecture

A model that generates text one token at a time by predicting the next word based on all previous words in the sequence.

Autoregressive Decoding

Techniques

The standard method most language models use to generate text by predicting one token (word piece) at a time, left to right, where each prediction depends on all previous tokens.
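
The loop below sketches the idea with a hypothetical toy "model": a lookup table standing in for a neural network. The table, the token names, and the greedy choice of the most likely token are all assumptions made for illustration.

```python
# Toy stand-in for a language model. A real model conditions on the whole
# prefix; this table only looks at the last token to keep the example small.
TOY_TABLE = {
    "<s>": {"the": 0.9, "cat": 0.1},
    "the": {"cat": 0.7, "mat": 0.3},
    "cat": {"sat": 0.8, "</s>": 0.2},
    "sat": {"</s>": 1.0},
    "mat": {"</s>": 1.0},
}


def next_token_probs(prefix):
    """Return next-token probabilities given all tokens generated so far."""
    return TOY_TABLE[prefix[-1]]


def greedy_decode(max_tokens=10):
    tokens = ["<s>"]
    for _ in range(max_tokens):
        probs = next_token_probs(tokens)
        next_token = max(probs, key=probs.get)  # pick the most likely token
        if next_token == "</s>":                # end-of-sequence marker
            break
        tokens.append(next_token)               # feed it back in as context
    return tokens[1:]


print(greedy_decode())  # ['the', 'cat', 'sat']
```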

Autoregressive Generation

Behavior

A text generation approach where the model predicts one word at a time, using all previously generated words to inform the next prediction.

Autoregressive Language Model

Architecture

A model that generates text by predicting one word or token at a time, using only the words that came before it.

Autoregressive Model

Techniques

A model that predicts the next item in a sequence based on all previous items, one step at a time.

Autoregressive Models

Architecture

Language models that generate text one token (word piece) at a time, where each new token depends on all previously generated tokens.

Autoregressive Rollout

Techniques

Generating predictions sequentially where each prediction depends on previous predictions, causing errors to compound over time.

Autoregressive Video Diffusion

Techniques

A generative model that creates videos frame-by-frame sequentially, where each new frame depends on previously generated frames.

Autoregressive Zooming

Techniques

Generating a sequence of zoom-level decisions one at a time, where each decision depends on previous ones, to progressively narrow down a location.

AutoRound

Training

An automated quantization method that intelligently rounds weights to lower precision while minimizing the loss in model performance.

AutoRound Quantization

Deployment

Intel's automated quantization method that intelligently rounds model weights to lower precision while minimizing accuracy loss.

B

Backbone

Architecture

The core language model architecture that forms the foundation of a larger system, such as Llama 3.

Backbone Architecture

Architecture

The core neural network structure that a model is built upon, such as Llama 3.

Backbone Model

Architecture

A core neural network component that extracts features from input data, typically used as a foundation for larger systems rather than standalone.

Backdoor Attack

Techniques

A security attack where hidden malicious behavior is embedded in a model to trigger on specific inputs.

Backpropagation Through Time

Techniques

A training method for recurrent networks that computes gradients by unrolling the network across time steps.

Backtesting

Techniques

Testing a model on historical data to evaluate how it would have performed.

Backtracking

Techniques

Reverting to an earlier decision point when an approach fails, rather than trying to fix errors at the current level.

Backward Transfer

Techniques

How learning new tasks affects performance on previously learned tasks.

Balanced Accuracy

Techniques

An evaluation metric that averages the per-class accuracy (recall) across classes, preventing high scores when one class dominates the data.
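
A small sketch of the calculation, using made-up labels: a model that always predicts the majority class looks good on plain accuracy but not on balanced accuracy. The function name is illustrative.

```python
def balanced_accuracy(y_true, y_pred):
    """Average, over classes, of the fraction of that class predicted correctly."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(correct / total)
    return sum(recalls) / len(recalls)


# Nine examples of class 0, one of class 1; the model always predicts 0.
y_true = [0] * 9 + [1]
y_pred = [0] * 10

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain)                              # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.5
```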

Bandit Feedback

Techniques

Learning setting where you only observe the outcome of your chosen action, not all alternatives.

BART

Architecture

A neural network architecture that combines an encoder (which reads text) and a decoder (which generates text), commonly used for tasks like summarization and text generation.

BART Architecture

Architecture

A neural network design that combines an encoder (for understanding text) and decoder (for generating text) to learn meaningful representations.

Base Architecture

Architecture

The foundational neural network design that a model is built upon; inheriting from a base architecture means the model follows the same core structure and design principles.

Base Language Model

Training

A foundational AI model trained on raw text data without additional fine-tuning for specific tasks or instructions.

Base Learners

Techniques

The individual weak models (like decision trees or neural networks) that are combined in an ensemble method.

Base Model

Training

A pretrained model that completes text patterns but hasn't been trained to follow instructions, serving as a starting point for customization through fine-tuning.

Base Model Size

Architecture

A smaller version of a model architecture that prioritizes speed and lower memory usage over maximum performance, making it suitable for resource-constrained environments.

Base Pretrained

Training

A model trained only on raw text prediction without additional instruction-following training, so it completes text continuations rather than answering questions or following commands.

Base Pretrained Model

Training

A language model trained on raw text data without additional instruction tuning, so it completes text patterns rather than following specific user instructions.

Baseline Model

Evaluation

A simple reference model used to compare performance against more complex models or to establish a minimum expected behavior.

Basin of Attraction

Techniques

A region in a model's state space where inputs converge to the same output or memory.

Basis Functions

Techniques

Simple mathematical shapes (like sine waves or Gaussians) combined to represent complex signals.

Batch Effects

Techniques

Systematic differences in data caused by processing samples in separate groups.

Bayes-Nash Equilibrium

Techniques

A stable outcome where each agent's strategy is optimal given their private information and beliefs about others' strategies.

Bayesian Filters

Techniques

Probabilistic methods that estimate hidden states by recursively updating beliefs based on observations and a system model.

Bayesian Incentive Compatible (BIC)

Techniques

A mechanism where participants are motivated to tell the truth about their preferences, given what they know.

Bayesian Inference

Techniques

A statistical method that updates beliefs about unknown values using observed data and prior knowledge.

Bayesian Linguistic Belief State

Techniques

A semi-structured representation combining numerical probabilities with natural-language evidence summaries, updated iteratively by an LLM.

Bayesian Neural Networks

Techniques

Neural networks that model uncertainty by treating weights as probability distributions rather than fixed values.

Bayesian Optimization

Techniques

A method for optimizing functions that are expensive to evaluate: it builds a probabilistic model from past results and uses it to choose which configuration to try next.

Bayesian Persuasion

Techniques

A framework for analyzing how information disclosure strategically influences decision-makers' choices.

BCE Loss

Training

Binary Cross-Entropy loss, a training objective commonly used for relevance scoring tasks where the model learns to predict whether a query-document pair is relevant or not.
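
As a minimal sketch, the loss for a predicted probability p and a 0/1 label y is -(y·log p + (1 - y)·log(1 - p)), averaged over examples. The function name and the clipping constant `eps` are choices made for this example.

```python
import math


def bce_loss(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


# A confident correct prediction costs little; a confident wrong one costs a lot.
print(bce_loss([0.9], [1]))
print(bce_loss([0.9], [0]))
```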

Beam Search

Techniques

A decoding algorithm that keeps the top-k most likely candidate sequences at each step, balancing quality and computational cost.
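
The sketch below runs beam search over a hypothetical toy next-token table (all names and probabilities are invented for illustration). With a beam width of 1 it behaves like greedy decoding and commits to "a"; with a width of 2 it keeps "b" alive and finds the more probable sequence.

```python
import math

# Toy stand-in for a model: next-token probabilities keyed by the last token.
TOY_TABLE = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"z": 0.9, "x": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
    "z": {"</s>": 1.0},
}


def next_token_probs(prefix):
    return TOY_TABLE[prefix[-1]]


def beam_search(beam_width=2, max_len=4):
    beams = [(0.0, ["<s>"])]  # (log-probability, tokens so far)
    finished = []
    for _ in range(max_len):
        # Expand every live beam by every possible next token.
        candidates = []
        for score, tokens in beams:
            for tok, p in next_token_probs(tokens).items():
                candidates.append((score + math.log(p), tokens + [tok]))
        # Keep only the top-k candidates by total log-probability.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, tokens in candidates[:beam_width]:
            if tokens[-1] == "</s>":
                finished.append((score, tokens))
            else:
                beams.append((score, tokens))
        if not beams:
            break
    best = max(finished + beams, key=lambda c: c[0])
    return [t for t in best[1] if t not in ("<s>", "</s>")]


print(beam_search(beam_width=1))  # ['a', 'x']  (probability 0.30)
print(beam_search(beam_width=2))  # ['b', 'z']  (probability 0.36)
```

Production implementations usually add details this sketch omits, such as length normalization and batching.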

Bee Equation

Techniques

A mathematical model describing how honeybee swarms reach consensus on nest sites through recruitment and inhibition.

Behavior Cloning

Techniques

Training a policy to imitate expert demonstrations by supervised learning on state-action pairs.

Behavioral Feedback

Techniques

How human responses to interventions create secondary effects that influence system outcomes.

Belief State

Techniques

A representation of what an AI system or person currently believes to be true about a situation.

Belief-Desire-Intention (BDI) Model

Techniques

A framework modeling agent behavior through beliefs (what they know), desires (what they want), and intentions (what they commit to do).

Bellman Operator

Techniques

A mathematical operator that updates value estimates based on immediate rewards and future value predictions.
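
A minimal sketch, assuming a tiny two-state decision problem invented for this example: each application of the operator replaces a state's value with the best available "immediate reward plus discounted future value". Applying it repeatedly (value iteration) converges to the optimal values.

```python
# Hypothetical toy problem: TRANSITIONS[state][action] maps next states to
# probabilities, and REWARDS[(state, action)] is the immediate reward.
TRANSITIONS = {
    "s0": {"stay": {"s0": 1.0}, "go": {"s1": 1.0}},
    "s1": {"stay": {"s1": 1.0}},
}
REWARDS = {("s0", "stay"): 0.0, ("s0", "go"): 1.0, ("s1", "stay"): 2.0}


def bellman_update(values, gamma=0.9):
    """Apply the Bellman optimality operator once to a table of state values."""
    new_values = {}
    for state, actions in TRANSITIONS.items():
        new_values[state] = max(
            REWARDS[(state, action)]
            + gamma * sum(p * values[nxt] for nxt, p in outcomes.items())
            for action, outcomes in actions.items()
        )
    return new_values


values = {"s0": 0.0, "s1": 0.0}
for _ in range(500):
    values = bellman_update(values)

# s1 earns 2 forever: 2 / (1 - 0.9) = 20; s0 earns 1 then moves to s1: 1 + 0.9 * 20 = 19
print({s: round(v, 3) for s, v in values.items()})  # {'s0': 19.0, 's1': 20.0}
```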

Benchmark

Evaluation

A standardized test suite used to measure and compare model performance on specific tasks.

Benchmark Dataset

Techniques

A standardized set of test problems used to measure and compare the performance of different algorithms or models.

Benchmarkless Comparative Safety Scoring

Techniques

Comparing model safety when no labeled benchmark exists for the specific language, domain, or regulatory context.

Benign Overfitting

Techniques

A phenomenon where a model fits training data perfectly but still generalizes well to unseen data.

BERT

Architecture

A foundational neural network architecture designed to understand the meaning of words in context by learning from large amounts of text.

BERT Architecture

Architecture

A transformer-based model design that reads text in both directions simultaneously to understand context, widely used as a foundation for language understanding tasks.

BERT Encoder

Architecture

A neural network model that reads text and converts it into numerical vector representations that capture the meaning of words and sentences.

BERT Model

Architecture

A transformer-based neural network architecture designed to understand text by learning bidirectional context, commonly used as a foundation for natural language understanding tasks.

BERT-Based

Architecture

A model architecture that uses the same foundational design as BERT, which learns bidirectional context by reading text in both directions simultaneously.

BERT-Based Model

Architecture

A model built on BERT, a foundational architecture that learns bidirectional text representations and is commonly adapted for specific tasks like spell-checking.

BERT-Style Architecture

Architecture

A neural network design based on the BERT model that uses transformer layers to understand relationships between words in text by looking at context from all directions.

BERT-Style Encoder

Architecture

A transformer-based model architecture that reads text bidirectionally to understand context and produce meaningful representations of words and sentences.

BERT-Tiny

Architecture

A heavily compressed version of the BERT language model with far fewer parameters, designed for fast inference on resource-constrained devices.

Best-of-N Sampling

Techniques

A decoding strategy that generates N candidate responses and selects the one ranked highest by a reward model.

Beta Release

Deployment

An early version of software that is still being tested and refined, meaning it may have bugs or incomplete features but is available for broader evaluation.

Betti Number

Techniques

A topological property that counts connected components and holes in a structure, used for example to enforce vessel connectivity in image segmentation.

BF16

Formats

A 16-bit floating-point format that balances precision and memory efficiency, commonly used for training and deploying large language models.

BF16 Format

Formats

A 16-bit floating-point format (bfloat16, short for Brain Floating Point) that balances precision and memory efficiency, commonly used for storing and running large language models.

BF16 Precision

Formats

A 16-bit numerical format that balances memory efficiency with numerical stability, using fewer bits than standard 32-bit floats while maintaining training and inference quality.

BFloat16 (BF16)

Formats

A 16-bit floating-point format that keeps the same exponent range as 32-bit floats but stores fewer precision bits, halving memory use and making large models faster and cheaper to run.
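
To make the trade-off concrete: bfloat16 uses the same sign bit and 8 exponent bits as a 32-bit float but only 7 fraction bits, so it corresponds to the top 16 bits of the float32 encoding. The helper below is a sketch that simulates this by truncation; real hardware usually rounds to nearest rather than truncating, and the function name is made up.

```python
import struct


def to_bfloat16(x):
    """Simulate bfloat16 by keeping only the top 16 bits of x's float32 encoding.

    Assumes x fits in float32 range (struct raises OverflowError otherwise).
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    bits &= 0xFFFF0000  # drop the low 16 fraction bits
    return struct.unpack(">f", struct.pack(">I", bits))[0]


print(to_bfloat16(3.14159265))  # 3.140625: only about 2-3 decimal digits survive
print(to_bfloat16(1e38))        # very large values still fit, unlike in FP16
```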

Bi-Encoder

Architecture

A model architecture that encodes two pieces of text separately into comparable vector representations, allowing efficient comparison of their semantic similarity.

Bi-level Optimization

Techniques

An optimization approach with two nested loops: an inner loop optimizing fast weights and an outer loop optimizing the main model parameters.

Bias Evaluation

Techniques

Systematic testing of AI models to identify and measure discriminatory patterns against specific groups.

Bias-Boundedness

Techniques

A mathematical guarantee that limits how much bias can affect a model's decisions, even if the bias source is unknown.

Bias-Sensitive Regions

Techniques

Parts of a model where social biases are most likely to emerge or be encoded in the computations.

Bias-Variance Decomposition

Techniques

Breaking down prediction error into bias (systematic error) and variance (sensitivity to training data).

Bid-Aware Decoding

Techniques

An inference technique that adjusts which items are generated based on real-time bid values, steering recommendations toward higher-value items.

Bidirectional Attention

Architecture

A mechanism that allows the model to look at context both before and after each word when understanding text, rather than only at the words that come before it.

Bidirectional Context

Architecture

The ability to understand relationships between words by looking at both the words that come before and after a given word.

Biencoder

Architecture

A neural network architecture that encodes two separate pieces of text independently and compares them to measure semantic similarity, commonly used for matching and retrieval tasks.

Big-M Constant

Techniques

A large coefficient used in MILP formulations to enforce logical constraints; larger values make the relaxation weaker and solving slower.

BigBird-Pegasus Architecture

Architecture

A transformer-based model architecture designed to handle very long text sequences efficiently by using sparse attention patterns instead of processing every word pair.

Bilevel Optimization

Techniques

An optimization framework with two hierarchical levels where upper-level decisions constrain lower-level optimization problems.

Bilinear Decomposition

Techniques

A factorization where value and policy functions are expressed as products of goal-conditioned coefficients and learned basis functions.

Bilingual

Behavior

A model trained to understand and generate text in two languages, for example Japanese and English.

Bilingual Model

Training

A language model trained to understand and generate text in two languages with comparable fluency.

BiLSTM (Bidirectional LSTM)

Techniques

A recurrent neural network that processes text in both forward and backward directions to capture context from both sides of each word.

Bimodal Encoder

Architecture

A model that processes two different types of input (for example, code and natural language) and converts them into a shared representation space.

Binary Routing

Techniques

A decision mechanism where neurons act as on/off switches to direct data through different computational paths.

Biomedical Corpus

Training

A large collection of medical and scientific texts (like research papers and journals) used to train the model on domain-specific language and concepts.

Biomedical NLP

Techniques

Natural language processing techniques applied specifically to medical and biological text, such as extracting drug names or identifying disease mentions from research papers.

Biomedical Reasoning

Behavior

The ability to understand and work with scientific concepts in biology and medicine, such as drug interactions and molecular structures.

Biomedical Text

Training

Written content from medical and life sciences domains, including clinical notes, research papers, and healthcare documentation.

Biomedical Vocabulary

Training

Specialized medical and scientific terms and concepts that the model has learned to understand from training on medical literature.

Biosecurity

Techniques

Protecting against misuse of biological research and AI in harmful ways.

Biosignal

Techniques

Electrical or physical signals produced by the body, such as heart rhythms or brain waves.

Bird's-Eye View (BEV)

Techniques

A top-down 2D representation of a 3D scene, showing spatial layout as if viewed from above.

Birkhoff Polytope

Techniques

The mathematical space of all doubly stochastic matrices (square matrices with non-negative entries whose rows and columns each sum to 1), which is difficult to parameterize exactly.

Bit Depth

Deployment

The number of bits used to represent each number in a model; lower bit depths (like 3-bit) create smaller files but may lose some accuracy compared to higher bit depths.

Bit Precision

Architecture

The number of bits used to represent each number in a model; lower bit precision (like 3-bit) means smaller file size but potentially less accurate calculations.

Bit-Width

Deployment

The number of bits used to represent each number in a model; lower bit-widths (like 6-bit) use less memory but may reduce precision compared to higher bit-widths.

Bit-width Adaptive Selection

Techniques

Automatically choosing the optimal number of bits for quantizing different parts of a model based on their importance.

Bits-per-byte (BpB)

Techniques

A compression metric measuring how many bits are needed to encode each byte of text.
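
As a sketch of how the number is typically obtained from a language model: sum the model's negative log-likelihood over the text, convert from nats to bits, and divide by the text's length in UTF-8 bytes. The function name and the assumption that the loss is reported in nats are choices made for this example.

```python
import math


def bits_per_byte(total_nll_nats, text):
    """Convert a summed negative log-likelihood (in nats) into bits per UTF-8 byte."""
    n_bytes = len(text.encode("utf-8"))
    return total_nll_nats / (math.log(2) * n_bytes)


# If a model assigns "hello" a total loss of 5 * ln(2) nats, that is 1 bit per byte.
print(bits_per_byte(5 * math.log(2), "hello"))
```

Because the denominator counts bytes rather than tokens, the metric can be compared across models with different tokenizers.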

Black-box Testing

Techniques

Evaluating a system's behavior by observing inputs and outputs without access to internal model structure or weights.

Blind-Spot Mass

Techniques

A measure of uncertainty in an agent's decision-making at a given state—how much of the decision space lacks statistical support from training data.

Blinded Evaluation

Techniques

Assessment where evaluators don't know which version or source produced the item being judged.

Block Attention Mechanism

Techniques

An attention technique that processes groups of items together to improve efficiency and capture relationships between them.

Block Floating Point (BFP)

Techniques

A quantization format that groups values into blocks and stores a shared exponent (scale) for each block, reducing storage per value while largely maintaining accuracy.

Block Output Embeddings

Techniques

Internal vector representations produced by a state space model's processing blocks that encode information about token sequences.

Block Scales

Techniques

Scaling factors computed for groups of values in low-precision formats to maintain numerical accuracy.

Block-Diffusion Language Model

Techniques

A language model that generates multiple tokens in parallel using diffusion, then refines them iteratively.

Block-Scaled Quantization

Techniques

A quantization method that divides values into groups and applies a shared scale factor to each group.
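
A minimal sketch of the idea, assuming signed integer codes and a scale set by each block's largest absolute value (one common choice among several). The function names, block size, and sample values are illustrative.

```python
def quantize_blocks(values, block_size=4, bits=4):
    """Split values into blocks; store one scale plus small integer codes per block."""
    qmax = 2 ** (bits - 1) - 1  # largest code, e.g. 7 for signed 4-bit
    blocks = []
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        # Scale so the block's largest magnitude maps to qmax (1.0 if all zeros).
        scale = max(abs(v) for v in block) / qmax or 1.0
        codes = [round(v / scale) for v in block]
        blocks.append((scale, codes))
    return blocks


def dequantize_blocks(blocks):
    return [scale * q for scale, codes in blocks for q in codes]


# Two blocks with very different magnitudes each get their own scale.
values = [0.1, -0.7, 0.3, 0.0, 70.0, -30.0, 7.0, 0.0]
blocks = quantize_blocks(values)
print(blocks)
print(dequantize_blocks(blocks))
```

With a single scale for the whole list, the small values in the first block would all round to zero; per-block scales avoid that.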

BM25

Techniques

A ranking function that scores document relevance based on term frequency and document length normalization.
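
The sketch below shows one widely used form of the formula, with whitespace tokenization and the non-negative IDF variant; the parameter defaults (`k1=1.5`, `b=0.75`) are common choices rather than fixed parts of the definition, and the function name is made up.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a basic BM25 formula."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Number of documents containing each term.
    doc_freq = Counter(term for d in tokenized for term in set(d))

    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher weight.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates, and longer documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = ["the cat sat on the mat", "dogs chase cats", "the quick brown fox"]
print(bm25_scores("cat", docs))  # only the first document matches "cat" exactly
```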

Body-frame Velocity

Techniques

Movement commands relative to the drone's own orientation, rather than a fixed world direction.

Boundary Enforcement

Techniques

Mechanisms that prevent an LLM from crossing defined limits in reasoning or behavior.

Bounding Box

Formats

A rectangular coordinate set that marks the exact location and size of detected text or objects within an image.

Bradley-Terry Model

Techniques

A statistical model that ranks items based on pairwise comparison outcomes, commonly used for leaderboards.
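
In this model each item has a positive strength, and the probability that item i beats item j is strength_i / (strength_i + strength_j). The sketch below fits strengths from a win-count table with a simple iterative update; the function names are illustrative, and it assumes every item has at least one win and one recorded comparison.

```python
def fit_bradley_terry(wins, n_iters=200):
    """Estimate strengths from wins[i][j] = number of times item i beat item j."""
    n = len(wins)
    strengths = [1.0] * n
    for _ in range(n_iters):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i])
            denom = sum(
                (wins[i][j] + wins[j][i]) / (strengths[i] + strengths[j])
                for j in range(n) if j != i
            )
            updated.append(total_wins / denom)
        norm = sum(updated)
        strengths = [s / norm for s in updated]  # keep strengths summing to 1
    return strengths


def win_probability(strengths, i, j):
    return strengths[i] / (strengths[i] + strengths[j])


# Model 0 beat model 1 in 3 of 4 head-to-head comparisons.
strengths = fit_bradley_terry([[0, 3], [1, 0]])
print(win_probability(strengths, 0, 1))  # approximately 0.75
```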

Brainstorming Augmentation

Techniques

Using AI to enhance the exploratory ideation phase of research rather than automating solution design.

Branching Factor

Techniques

The average number of possible moves available at each decision point in a game.

Breakpoint

Techniques

A marker in code where a debugger pauses execution so you can inspect the program state.

Brier Skill Score

Techniques

A metric measuring forecast accuracy relative to a baseline forecast (like random guessing): 1 is perfect, 0 matches the baseline, and negative values are worse than the baseline.
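
A short sketch of the calculation with made-up forecasts: the Brier score is the mean squared difference between predicted probabilities and 0/1 outcomes, and the skill score is one minus the ratio of the model's Brier score to the baseline's.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes, baseline_probs):
    return 1.0 - brier_score(probs, outcomes) / brier_score(baseline_probs, outcomes)


outcomes = [1, 0, 1, 1]
forecast = [0.9, 0.1, 0.8, 0.7]
baseline = [0.5] * 4  # "coin flip" baseline

print(round(brier_skill_score(forecast, outcomes, baseline), 2))  # 0.85
```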

Broken Symmetry

Techniques

A situation where the underlying physics has symmetry, but observations reveal a preferred direction or asymmetry due to measurement constraints.

Budget Forcing

Techniques

A reinforcement learning technique that constrains model outputs to stay within a token budget, reducing response length while maintaining accuracy.

Byte-Level Tokenization

Formats

Breaking text into the individual bytes of its encoding (typically UTF-8) rather than words or subwords, which allows the model to handle any text without a predefined vocabulary.
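
A minimal sketch, assuming UTF-8 encoding: every string maps to a sequence of integers from 0 to 255, so 256 token IDs are enough to represent any text. The function names are illustrative.

```python
def byte_tokenize(text):
    """Turn text into a list of byte values (0-255) using UTF-8."""
    return list(text.encode("utf-8"))


def byte_detokenize(token_ids):
    return bytes(token_ids).decode("utf-8")


print(byte_tokenize("hi"))      # [104, 105]
print(len(byte_tokenize("é")))  # 2: non-ASCII characters take several bytes
print(byte_detokenize(byte_tokenize("Москва")))
```

The trade-off is sequence length: text outside basic ASCII expands into more tokens than it would with a word or subword vocabulary.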

Byzantine Robustness

Techniques

The ability of a system to function correctly even when some participants behave maliciously or unpredictably.

C

Calibration

Techniques

Adjusting a model's predictions using held-out data to correct for systematic biases or distribution differences.

Camera Pose Estimation

Techniques

Determining the position and orientation of a camera in 3D space relative to a scene.

Canonical Correlation Analysis

Techniques

A statistical technique that finds the strongest correlations between two sets of variables by discovering shared patterns.

Capability Elicitation

Techniques

Training process designed to extract or develop specific abilities from a model, like reasoning or tool use.

Capacity Scaling

Techniques

How the number of storable associations grows with the size of the memory matrix or system parameters.

Capital Market Assumptions

Techniques

Forecasts of future returns, volatility, and correlations for different asset classes used to guide investment decisions.

Capsule Neural Networks

Techniques

Neural networks with capsule units that learn hierarchical relationships and spatial properties better than traditional convolutional layers.

Cascaded Cross-Attention

Techniques

A mechanism that sequentially combines information from multiple sources (global context, object details, skill knowledge) to guide model decisions.

Cascaded Pipeline

Techniques

Sequential processing where output from one stage feeds into the next.

Cascaded ROI-Narrowing

Techniques

A strategy where each model focuses on progressively smaller regions of interest to improve accuracy.

Cascaded Routing

Techniques

A multi-stage process that progressively assigns incidents to the correct business team or service owner.

Case Sensitivity

Behavior

The model's ability to distinguish between uppercase and lowercase letters as meaningful differences, treating 'Москва' and 'москва' as separate tokens with different meanings.

Case-Insensitive (Uncased)

Behavior

A model that treats uppercase and lowercase letters as identical, so 'Apple' and 'apple' are processed the same way.

Case-Sensitive

Behavior

The model treats uppercase and lowercase letters as distinct, allowing it to recognize proper nouns and maintain capitalization distinctions.

Cased Text

Formats

Text processing that preserves the distinction between uppercase and lowercase letters, treating 'Apple' and 'apple' as different tokens.

Cased Text Handling

Behavior

The model's ability to distinguish between uppercase and lowercase letters, making it sensitive to proper nouns and capitalization patterns that carry meaning.

Catastrophic Forgetting

Techniques

When a model loses its original knowledge while learning a new task, like overwriting old skills.

Causal Generative Model

Techniques

A model that learns causal relationships between variables and can answer observational, interventional, and counterfactual questions.

Causal Identification

Techniques

The ability to determine true cause-and-effect relationships from data, typically guaranteed by randomization.

Causal Inference

Techniques

Determining whether a treatment actually caused an outcome, not just whether they're correlated.

Causal Intervention

Techniques

Deliberately modifying a model's internal features to measure their direct effect on outputs.

Causal Language Model

Architecture

A model that predicts the next word in a sequence by only looking at previous words, not future ones, making it suitable for text generation.

Causal Language Modeling

Training

A training approach where the model predicts the next word based only on previous words, commonly used for text generation tasks.

Causal Reasoning

Techniques

Understanding cause-and-effect relationships rather than just statistical correlations in data.

Causal Survival Forests

Techniques

A machine learning method that estimates personalized treatment effects from survival data using tree-based models.

CC-BY-4.0 License

Licensing

A permissive Creative Commons license that allows anyone to use, modify, and distribute the model, including commercially, as long as they give credit to the original creator.

CC-BY-NC-4.0 License

Licensing

A Creative Commons license that allows free use and modification of the model for non-commercial purposes only, with attribution required.

CEGAR (Counterexample-Guided Abstraction Refinement)

Techniques

A verification technique that starts with a simplified (abstract) version of a problem and refines it whenever a counterexample found in the simplified version does not hold in the original problem.

Ceiling Compression

Techniques

A statistical phenomenon where most scores cluster near the maximum possible value, reducing the ability to distinguish between different quality levels.

Ceiling Effect

Techniques

When a benchmark becomes too easy and models achieve near-perfect scores, making it impossible to compare their true abilities.

Censoring

Techniques

Training an AI model to refuse or provide false information about certain topics.

Centered Kernel Alignment (CKA)

Techniques

Metric that measures structural similarity between representations by comparing their kernel matrices.

Central Limit Theorem

Techniques

A statistical principle stating that the distribution of the average of many independent samples approaches a normal distribution as the number of samples grows, provided the samples have finite variance.

Chain-of-Thought

Techniques

A reasoning technique where an AI model shows its step-by-step thinking process before arriving at a final answer, making its logic transparent and verifiable.

Chain-of-Thought Reasoning

Techniques

A technique where a model works through a problem step by step, showing its reasoning process before arriving at a final answer.

Channel Circuit

Techniques

A quantum circuit composed of quantum channels (operations that map quantum states to quantum states) rather than unitary gates alone.

Channel State Information (CSI)

Techniques

Raw wireless signal data that describes how a Wi-Fi signal changes as it travels through space and bounces off objects.

Channel-wise Affine Transform

Techniques

A learnable operation that scales and shifts different feature channels independently in a neural network.

Channel-wise Decay

Techniques

Applying different forgetting rates to different feature channels in a neural network, allowing selective memory retention.

Chaotic Dynamics

Techniques

Systems where small changes in initial conditions lead to drastically different outcomes, making long-term prediction extremely difficult.

Character Consistency

Behavior

The ability of a model to maintain a character's voice, personality, and backstory throughout a conversation without contradicting itself.

Character Error Rate (CER)

Techniques

A metric measuring the percentage of characters incorrectly recognized by an OCR system.
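
A common way to compute it, sketched below, is the character-level edit distance (insertions, deletions, and substitutions) between the recognized text and the reference, divided by the reference length. The function names are made up for this example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a reference character
                curr[j - 1] + 1,         # insert an extra character
                prev[j - 1] + (r != h),  # substitute (free if characters match)
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edits needed divided by reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


print(cer("hello", "hallo"))  # 0.2 (1 substitution over 5 characters)
```

Because insertions count as errors, the rate computed this way can exceed 1.0 when the output is much longer than the reference.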

Character Voice

Behavior

A model's ability to maintain distinct, consistent personality and speech patterns for different characters within a story.

Character-Level Processing

Architecture

Processing text one character at a time rather than by words, which is useful for catching individual character errors in languages like Chinese.

Chart-Grounded Reasoning

Techniques

The ability to extract information from visual charts and perform logical reasoning tasks based on what the chart displays.

Chat Model

Training

A language model specifically trained to have natural back-and-forth conversations with users rather than just completing text.

Chat-Optimized

Training

A model specifically trained and tuned to excel at conversational interactions rather than other tasks like analysis or reasoning.

Chat-Tuned

Training

A model optimized through training to excel at multi-turn conversations and dialogue, rather than single-turn text completion.

Checkpoint

Training

A saved snapshot of a model's weights and state at a specific point during training, allowing training to resume or the model to be evaluated at that stage.

Checkpoints

Training

Saved snapshots of a model at different stages of training, allowing researchers to study how the model's behavior changes as it learns.

Chunk-Based Processing

Techniques

Breaking long sequences into smaller segments and processing them sequentially while maintaining state between chunks.

Chunking

Techniques

The process of breaking large documents into smaller pieces so a model with a limited context window can process them separately.
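
A minimal sketch that splits a list of words into fixed-size chunks. The optional `overlap` parameter, which repeats a few words between neighboring chunks so context is not cut off abruptly, is a common addition rather than part of the definition; the function name and sizes are illustrative.

```python
def chunk_words(words, chunk_size, overlap=0):
    """Split a list of words into chunks of at most chunk_size items."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end
    return chunks


words = "one two three four five six seven eight nine ten".split()
for chunk in chunk_words(words, chunk_size=4, overlap=1):
    print(" ".join(chunk))
```

Real pipelines usually measure chunk size in tokens rather than words and often prefer to split at sentence or paragraph boundaries.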

Citation Networks

Training

A graph structure showing how research papers reference each other, used to understand relationships and influence between scientific works.

Citation Tracking

Behavior

The ability to identify, reference, and maintain accurate attribution to the sources used when generating a response.

Claim Frequency

Techniques

The number of insurance claims expected per policy or geographic area over a time period.

Class Activation Mapping (CAM)

Techniques

A technique that generates visual heatmaps showing which image regions a neural network uses to make predictions.

Class Imbalance

Techniques

When training data has unequal numbers of examples across categories, with some classes having far fewer samples than others.

Class Incremental Learning

Techniques

Learning to recognize new object classes over time while maintaining performance on previously seen classes.

Class-Level Code Synthesis

Techniques

Generating complete, structured classes with multiple methods and internal dependencies from a specification.

Class-Weighted Cross-Entropy Loss

Techniques

A loss function that penalizes misclassification of rare classes more heavily, useful when training data is imbalanced.
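
As a rough sketch: each example's loss is multiplied by the weight of its true class, and the total is divided by the sum of the weights used (one common normalization convention; others exist). The function name, weights, and probabilities below are invented for illustration.

```python
import math


def weighted_cross_entropy(probs, labels, class_weights):
    """Cross-entropy where each example is weighted by its true class's weight.

    probs[i][c] is the predicted probability of class c for example i.
    """
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = class_weights[y]
        total += -w * math.log(p[y])
        weight_sum += w
    return total / weight_sum


# The model favors class 0 for both examples, so it gets the rare class 1 wrong.
probs = [[0.9, 0.1], [0.9, 0.1]]
labels = [0, 1]

print(weighted_cross_entropy(probs, labels, [1.0, 1.0]))  # unweighted
print(weighted_cross_entropy(probs, labels, [1.0, 5.0]))  # rare-class error costs more
```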

Classical Test Theory

Techniques

A statistical framework for designing and validating tests that measure psychological constructs reliably.

Classifier

Architecture

A machine learning model trained to assign input data into predefined categories or labels.

Classifier-Free Guidance (CFG)

Techniques

A technique that steers diffusion models toward desired outputs by comparing conditional and unconditional predictions.
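
The combination step is commonly written as: guided = unconditional + scale × (conditional − unconditional). The sketch below applies it to plain lists of numbers standing in for the model's two predictions; the function name and values are illustrative.

```python
def cfg_combine(cond, uncond, guidance_scale):
    """Push the prediction away from the unconditional one, toward the conditional one."""
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


cond = [1.0, 2.0]    # prediction given the prompt
uncond = [0.5, 2.0]  # prediction with the prompt dropped

print(cfg_combine(cond, uncond, 1.0))  # [1.0, 2.0]: plain conditional prediction
print(cfg_combine(cond, uncond, 2.0))  # [1.5, 2.0]: difference is amplified
```

A scale of 1 reproduces the conditional prediction; larger values follow the prompt more strongly, usually at some cost to diversity.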

Clinical Alignment

Techniques

How well an LLM's medical communication matches established clinical standards and physician practices.

Clinical Ethics

Techniques

The study of moral principles and values that guide medical decision-making and patient care.

Clinical Event Tokenization

Techniques

Converting clinical information (diagnoses, medications, procedures) into discrete tokens that a model can process.

Clinical Forecasting

Techniques

Using historical patient data to predict future health outcomes, disease progression, or treatment responses.

Clinical Language Understanding

Behavior

The ability to accurately interpret and reason about medical terminology, patient symptoms, and healthcare documentation.

Clinical NLP

Behavior

Natural language processing applied to medical and healthcare text, such as extracting diagnoses or findings from doctor's notes and radiology reports.

Clinical Reasoning

Behavior

The ability to analyze medical information, connect symptoms to conditions, and make logical healthcare decisions based on evidence.

Clinical Validation

Techniques

The process of confirming that an AI system's outputs meet clinical standards and are safe for use in healthcare.

CLIP (Contrastive Language-Image Pre-training)

Techniques

A model trained on image-text pairs to create shared vector representations for both images and text.

CLIP Architecture

Architecture

A neural network design that learns to match images and text by training matching image-text pairs to have similar representations, enabling tasks like image search and visual understanding.

Closed-Form Head Adaptation

Techniques

Rapidly adjusting a model to new tasks using direct mathematical solutions rather than iterative training.

Closed-Form Solution

Techniques

A mathematical formula that directly computes an answer without iterative learning or optimization.

Closed-Loop Control

Techniques

A system that continuously adjusts its behavior based on feedback from its actions and outcomes.

Closed-Loop Policy

Techniques

A control strategy where the robot observes its current state and adjusts actions based on feedback, rather than executing a fixed sequence.

Co-activation

Techniques

When multiple features in a neural network are active at the same time, often because they represent related concepts.

Coalition-Proof Equilibrium

Techniques

An equilibrium where no group of players can jointly deviate and all benefit, even if they coordinate.

Coarse Correlated Equilibrium

Techniques

A game theory solution concept where no player can gain by ignoring the recommended (correlated) strategy and committing in advance to a single fixed action.

Coarse-to-Fine Feature Encoding

Techniques

A strategy that first captures broad patterns, then progressively refines details for better understanding.

Coarse-to-Fine Reasoning

Techniques

A sequential decision-making approach that starts with broad estimates and progressively refines them to higher precision.

Coarse-to-Fine Training

Techniques

A curriculum learning approach that starts with learning simple components before progressing to optimizing complex global structures.

Code Clone Detection

Techniques

Identifying sections of code that perform the same function, even if written differently or in different programming languages.

Code Completion

Behavior

The ability to automatically suggest or generate the next lines of code based on what the programmer has already written.

Code Coverage

Techniques

Percentage of program code executed by a test suite, measured by lines or branches.

Code Editing

Behavior

A specialized task where a model modifies or refines existing code rather than creating new code, focusing on precision and surgical changes.

Code Embedding

Techniques

A specialized embedding designed specifically for source code that understands programming syntax and semantics, enabling tasks like code search and finding similar code snippets.

Code Generation

Behavior

The ability of a model to write, complete, or suggest programming code based on prompts or partial code input.

Code Pretraining

Training

Training a language model primarily on source code and technical documentation rather than general text, making it specialized for coding tasks.

Code Quality

Techniques

A measure of how well code meets standards for readability, maintainability, and correctness.

Code Reasoning

Behavior

The ability of a model to understand, analyze, and make logical inferences about source code and programming logic.

Code Refactoring

Techniques

Restructuring existing code without changing its external behavior to improve readability and maintainability.

Code Review

Techniques

Process of examining code changes for bugs, quality issues, and adherence to standards before merging.

Code Synthesis

Techniques

Automatically generating executable code (like plotting commands) from high-level specifications or natural language descriptions.

Code Understanding Verification

Techniques

Techniques to confirm a student actually understands the code they wrote, not just copied it.

Code-Focused Language Model

Training

A language model specifically trained on programming code to excel at tasks like code generation, completion, and understanding.

Code-Specialized

Training

A model trained with a focus on understanding and generating programming code across multiple languages.

Code-Specialized Language Model

Training

A language model trained specifically on programming code and related tasks, optimized to understand and generate code better than general-purpose models.

Code-Specialized Model

Training

A language model trained specifically on programming code and code-related tasks rather than general text.

Code-Switching

Behavior

The ability to naturally mix two languages within the same text or conversation, switching between them based on context rather than treating them as separate.

Codebook

Techniques

A lookup table mapping compressed values back to original data; some compression methods avoid storing one to save memory.

Coding Agent

Techniques

An AI system that autonomously writes, debugs, and executes code to solve tasks without human intervention.

Coefficient of Variation

Techniques

A normalized measure of variability that expresses standard deviation as a percentage of the mean, useful for comparing spread across different scales.
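
The calculation can be sketched with Python's standard `statistics` module. The function name `coefficient_of_variation` and the choice of the sample standard deviation are assumptions for this example.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / abs(mean) * 100

small_scale = coefficient_of_variation([1, 2, 3])
large_scale = coefficient_of_variation([10, 20, 30])
```

Both lists have the same relative spread, so the two results match even though the raw standard deviations differ by a factor of ten.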

Cognate

Techniques

Words in different languages that share a common historical origin and similar meaning.

Cognate Detection

Techniques

Identifying words in different languages that share a common origin and similar meaning.

Cognitive Architecture

Techniques

A computational framework that models how an intelligent agent perceives, reasons, and acts in the world.

Cognitive Gating

Techniques

A mechanism that gates speculative execution based on model confidence, without requiring ground-truth labels.

Cognitive Load Theory

Techniques

A psychological framework explaining how working memory capacity affects learning and task performance.

Cognitive Support

Techniques

AI assistance that helps users think through problems and refine their goals rather than just executing stated requests.

Coherence

Behavior

The quality of maintaining consistent meaning and logical flow across multiple sentences or exchanges in a conversation.

Coherence Budget

Techniques

The maximum circuit depth a quantum computer can execute before quantum information is lost to noise and decoherence.

ColBERT Architecture

Architecture

A neural retrieval model design that stores multiple token-level embeddings per document and uses late interaction to achieve higher retrieval accuracy than single-vector approaches.
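
The late-interaction scoring step can be sketched as follows: each query token embedding is matched against its best-scoring document token embedding, and the per-token maxima are summed. The name `late_interaction_score` and the toy 2-dimensional vectors are assumptions; a real system would use learned embeddings and an index.

```python
def late_interaction_score(query_vecs, doc_vecs):
    """Sum, over query tokens, of the best dot product with any document token."""
    score = 0.0
    for q in query_vecs:
        score += max(sum(qi * di for qi, di in zip(q, d)) for d in doc_vecs)
    return score

query = [[1.0, 0.0], [0.0, 1.0]]
document = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
score = late_interaction_score(query, document)
```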

Cold-Start Stalling

Techniques

When a model trained with sparse rewards gets stuck early because initial success probability is too low to learn from.

Collinearity

Techniques

When input features are highly correlated with each other, making it difficult to isolate individual feature effects on predictions.

Combinatorial Optimization

Techniques

Finding the best arrangement or selection from a finite set of possibilities, like packing objects efficiently.

Common Ground

Techniques

Shared beliefs and mutually recognized facts that enable effective collaboration between people or AI systems.

Common Sense Reasoning

Behavior

The ability of a model to understand and apply everyday logic and practical knowledge about how the world works.

Communication Efficiency

Techniques

Minimizing the amount of data exchanged between devices or servers during distributed training.

Community Fine-Tune

Training

A model variant created and shared by the community rather than the original model creators, often with custom modifications.

Compact Model

Architecture

A smaller language model designed to use fewer computational resources while still performing useful tasks.

Competence-Aware Verification

Techniques

Evaluating reward quality relative to the current policy's skill level, recognizing that reward rankings change as the policy improves.

Competency Questions

Techniques

Natural language questions that define what an ontology should be able to answer, used to specify system requirements.

Complete Positivity

Techniques

A quantum physics constraint ensuring operations preserve valid quantum states and probabilities.

Completion Mode

Behavior

A text generation approach where the model continues or completes text from a given prompt, rather than engaging in back-and-forth conversation.

Completion Prompt

Behavior

A prompt style where you provide the beginning of text and the model continues it, rather than asking a direct question.

Complex Reasoning

Behavior

The ability to work through multi-step problems, analyze nuanced information, and draw logical conclusions.

Compliance Certification

Deployment

Official verification that a service meets specific regulatory or security standards required by industries like healthcare or finance.

Compliance Certifications

Deployment

Official verifications that a service meets specific security and regulatory standards (like HIPAA or SOC 2) required by certain industries.

Component Interaction Bias

Techniques

Discrimination that emerges from how separate system components work together, not from individual parts alone.

Component-Based Architecture

Architecture

A design pattern where UIs are built from reusable, self-contained pieces (components) that can be combined to create larger interfaces.

Compositional Generalization

Techniques

Model's ability to understand new combinations of learned concepts.

Compositional Prompts

Behavior

Text descriptions that specify multiple elements, their relationships, and spatial arrangements in the desired image.

Compositional Semantics

Techniques

The principle that the meaning of a complex expression is built from the meanings of its parts and how they combine.

Compositionality

Techniques

The ability to understand new combinations of concepts by learning how individual components combine.

Computational Budget

Deployment

The amount of processing power and memory available to run a model, which determines how much computation can be performed.

Computational Complexity

Techniques

The amount of computation (time and memory) required for an algorithm to solve a problem.

Computational Efficiency

Performance

The ability to deliver good results while using less processing power and memory than larger models.

Computational Exploration

Techniques

Using code and algorithms to test mathematical hypotheses and discover patterns empirically.

Computational Footprint

Deployment

The amount of memory, processing power, and time required to run a model; a smaller footprint means the model can run on less powerful hardware.

Computational Overhead

Performance

The extra processing power, memory, or time required to run a model, which impacts speed and resource consumption.

Computational Photography

Techniques

Using algorithms and AI during image capture to enhance photos beyond what the camera sensor alone can achieve.

Compute Allocation

Performance

The strategic distribution of a model's processing power, for example spending more computational effort on thinking through problems than on other tasks.

Compute Efficiency

Performance

How well a model performs relative to the computational resources (processing power and memory) required to run it.

Compute-Efficient

Performance

A model designed to run with minimal processing power and memory, making it practical for devices with limited resources.

Compute-in-Memory

Techniques

Hardware architecture that performs computation directly within memory, reducing data movement bottlenecks.

Compute-Optimal

Techniques

Achieving the best performance for a given amount of computational resources.

Computer Use

Behavior

The ability for an AI model to interact with computer interfaces, navigate software applications, and execute actions on a user's behalf by understanding and responding to visual or textual representations of screens.

Concept Bottleneck Model (CBM)

Techniques

An interpretable model that makes predictions by routing inputs through a layer of human-understandable concepts rather than opaque features.

Concept Manifold

Techniques

A low-dimensional geometric structure where related concepts are organized continuously, like a curved surface in high-dimensional space.

Concept Normalization

Techniques

The process of mapping different textual expressions of the same idea to a single standardized representation, such as mapping 'MI' and 'myocardial infarction' to the same medical concept.

Conditional Advantage Estimation

Techniques

A reinforcement learning technique that estimates action value only within trajectories meeting specific conditions.

Conditional Coverage

Techniques

A property where prediction set coverage guarantees hold for specific subgroups or conditions, not just on average across all data.

Conditional Entropy

Techniques

A measure of uncertainty in predicted tokens given context; low entropy signals memorization, high entropy signals generalization.
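
As a small illustration, the entropy of a single next-token distribution (already conditioned on the context) can be computed like this. The function name `token_entropy` and the use of bits (log base 2) are assumptions for the example.

```python
import math

def token_entropy(next_token_probs):
    """Shannon entropy, in bits, of one next-token probability distribution."""
    return -sum(p * math.log2(p) for p in next_token_probs if p > 0)

certain = token_entropy([1.0, 0.0, 0.0, 0.0])        # model is sure of the next token
uncertain = token_entropy([0.25, 0.25, 0.25, 0.25])  # all four tokens equally likely
```

Averaging this quantity over many contexts gives an estimate of the conditional entropy of the model's predictions.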

Conditional Expected Distance

Techniques

The average distance between selected and validation target embeddings within a cluster, used to rank training examples.

Conditional Generation

Behavior

The ability of a model to generate output (like text) based on specific input conditions or prompts provided to it.

Conditional Mean

Techniques

The expected value of an output given specific input conditions, used as a deterministic baseline prediction.

Conditional Misalignment

Techniques

Misaligned behavior that only appears when inputs share features with the training data, while appearing safe on out-of-distribution prompts.

Conditional Neural Processes

Techniques

A probabilistic model that learns to make predictions by conditioning on observed examples, useful for few-shot learning and uncertainty estimation.

Conditional Text Generation

Behavior

The ability to generate text that follows specific conditions or constraints, rather than producing output freely.

Conditional Value-at-Risk (CVaR)

Techniques

A risk metric that focuses on the worst-case outcomes rather than average performance, useful for safety-critical tasks.
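
A simplified empirical estimate, assuming losses where larger means worse: average the worst `alpha` fraction of observed outcomes. The name `cvar` and the rounding rule for the tail size are illustrative choices, not a standard library API.

```python
import math

def cvar(losses, alpha=0.1):
    """Mean of the worst alpha fraction of losses (larger loss = worse)."""
    k = max(1, math.ceil(alpha * len(losses)))  # size of the tail to average
    worst = sorted(losses, reverse=True)[:k]
    return sum(worst) / k

losses = [1.0, 2.0, 3.0, 10.0]
tail_risk = cvar(losses, alpha=0.5)
average = sum(losses) / len(losses)
```

Here the plain average hides the one very bad outcome, while the tail average is dominated by it.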

Conditional Variational Autoencoder (CVAE)

Techniques

A neural network that learns to generate new data matching specific conditions or constraints.

Conditioning

Techniques

Guiding a generative model's output by providing additional input signals like pose or depth maps.

Confidence Calibration

Techniques

Ensuring a model's confidence scores accurately reflect its true probability of being correct.
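
One common way to check calibration is the expected calibration error (ECE), sketched below under simple assumptions: equal-width confidence bins and a function name, `expected_calibration_error`, chosen for this example.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between confidence and accuracy across bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # a confidence of 1.0 goes in the last bin
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        ece += len(b) / n * abs(avg_conf - accuracy)
    return ece

well_calibrated = expected_calibration_error([0.75] * 4, [True, True, True, False])
overconfident = expected_calibration_error([0.9, 0.9], [True, False])
```

A value near 0 means stated confidence tracks actual accuracy; the overconfident example claims 90% confidence but is right only half the time.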

Confidence Estimation

Techniques

Assigning uncertainty scores to model predictions to identify outputs that may need human verification.

Confidence Intervals

Techniques

Statistical bounds around an estimate or prediction that quantify uncertainty; wide intervals can flag predictions that are unreliable.

Confidence Thresholding

Techniques

A decoding strategy that stops refining tokens when model confidence exceeds a set threshold.

Confidence-Based Abstention

Techniques

Refusing predictions when the model's confidence score is below a threshold.

Confidence-Based Decoding

Techniques

A strategy that selects which tokens to generate next based on the model's prediction confidence, enabling adaptive and efficient generation.

Confidence-Driven Reinforcement Learning

Techniques

Training a model using rewards based on how well its confidence scores match its actual correctness.

Confidence-Informed Self-Consistency (CISC)

Techniques

Weighted majority voting where each candidate answer gets a confidence score from a critic model before selection.
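
A simplified sketch of confidence-weighted voting. It assumes the confidence scores are already available as numbers and sums them directly per answer; the function name `confidence_weighted_vote` is hypothetical, and any normalization of the scores is left out.

```python
from collections import defaultdict

def confidence_weighted_vote(answers, confidences):
    """Return the answer whose supporting samples have the largest total confidence."""
    totals = defaultdict(float)
    for answer, confidence in zip(answers, confidences):
        totals[answer] += confidence
    return max(totals, key=totals.get)

winner = confidence_weighted_vote(["A", "A", "B"], [0.2, 0.2, 0.9])
```

Plain majority voting would pick "A" here; weighting by confidence lets one high-confidence sample outvote two low-confidence ones.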

Confirmation Bias

Techniques

The tendency to seek or interpret information in ways that confirm existing beliefs or outputs.

Conflicts of Interest

Techniques

Situations where an AI system has competing goals—like serving users well versus generating revenue for its creators.

Conformal Prediction

Techniques

Method providing prediction intervals with statistical guarantees on coverage.
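
A minimal sketch of one variant, split conformal prediction for regression, assuming a held-out calibration set of residuals (true value minus prediction). The helper names `conformal_quantile` and `prediction_interval` are illustrative.

```python
import math

def conformal_quantile(calibration_residuals, alpha=0.1):
    """Interval half-width targeting (1 - alpha) coverage on new data."""
    n = len(calibration_residuals)
    rank = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(abs(r) for r in calibration_residuals)[rank - 1]

def prediction_interval(point_prediction, half_width):
    """Symmetric interval around a point prediction."""
    return (point_prediction - half_width, point_prediction + half_width)

half_width = conformal_quantile([-3, 1, -2, 5, 4, -6, 7], alpha=0.25)
interval = prediction_interval(10.0, half_width)
```

The coverage guarantee relies on calibration and test data being exchangeable, and it holds on average rather than for each individual input.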

Conformational Control

Techniques

The ability to direct a model to generate specific 3D shapes or structural states of proteins.

Conformational State

Techniques

A distinct 3D shape or arrangement that a protein can adopt, often with different biological functions.

Conformational Transfer

Techniques

Applying a learned conformational change from one protein to structurally similar proteins in the same family.

Confused Deputy Problem

Techniques

When an agent with elevated permissions is tricked, often by crafted input, into using those permissions for actions the requester is not authorized to perform.

Consensus Architecture

Techniques

A routing pattern where multiple neurons must agree (be mutually exclusive) to activate a particular processing path.

Constitutional AI

Training

A safety training approach that guides a model to behave according to a set of principles or rules, helping it generate more helpful and harmless responses.

Constrained Decoding

Techniques

Restricting a model's token generation to a predefined set of allowed tokens during inference.
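
A toy sketch of a single greedy decoding step under this restriction. The logits list and the name `constrained_greedy_step` are assumptions; in practice the allowed set usually changes at every step according to a grammar or schema.

```python
import math

def constrained_greedy_step(logits, allowed_token_ids):
    """Pick the highest-scoring token among the allowed token IDs only."""
    allowed = set(allowed_token_ids)
    best_id, best_logit = None, -math.inf
    for token_id, logit in enumerate(logits):
        if token_id in allowed and logit > best_logit:
            best_id, best_logit = token_id, logit
    return best_id  # None if no token is allowed

logits = [5.0, 1.0, 3.0]  # token 0 scores highest but is not allowed
choice = constrained_greedy_step(logits, allowed_token_ids=[1, 2])
```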

Constrained Generation

Techniques

Text generation that must follow specific rules or constraints, such as producing output in a particular format or structure.

Constrained Reinforcement Learning

Techniques

Training an AI system to maximize performance while respecting hard constraints (like deadlines or budgets).

Constraint Satisfaction

Techniques

Finding solutions that satisfy a set of constraints, for example to resolve conflicts between inferred events.

Constraint Solver

Techniques

A tool that finds valid solutions to problems with multiple constraints, for example to verify that a mechanical assembly is feasible.

Constraint-Guided Execution

Techniques

Validating each step of a plan by checking outputs against automatically derived constraints based on task requirements.

Constraint-Guided Repair

Techniques

Fixing errors in reasoning by making minimal changes that satisfy logical or evidential constraints.

Construct Validity

Techniques

Whether a study actually measures the real concept it's supposed to test, not something else.

Contact-Gating

Techniques

A mechanism that activates learned corrections only when the robot is physically touching the object.

Contact-Rich Dynamics

Techniques

Physical interactions where the robot frequently touches and manipulates objects, making control sensitive to small errors.

Contact-Rich Manipulation

Techniques

Robot tasks where success depends critically on precise control of forces and contact interactions with objects.

Content Filter

Deployment

A model or system that screens text before or after generation to block unsafe, harmful, or policy-violating content.

Content Filtering

Behavior

Safety mechanisms built into a model that prevent it from generating harmful, inappropriate, or restricted content.

Content Moderation

Behavior

The process of reviewing and filtering text or other content to remove or flag material that violates policies or safety guidelines.

Content Safety Classification

Behavior

The task of automatically detecting and categorizing text that violates policies or could cause harm, such as hate speech, violence, or misinformation.

Context Coherence

Behavior

The ability to maintain consistent meaning and logical flow when processing long sequences of text or conversation.

Context Consistency

Performance

A model's ability to maintain coherent understanding and recall of information across long passages of text without contradicting itself.

Context Distillation

Techniques

Transferring knowledge from interaction trajectories into model parameters by learning from contextual examples.

Context Filtering

Techniques

Retaining only relevant information from execution history to reduce noise and improve decision-making in subsequent steps.

Context Gathering

Techniques

The process of collecting and organizing relevant information from history to answer specific questions or solve tasks.

Context Length

Architecture

The maximum amount of previous text a model can consider when generating its next output; longer context allows the model to maintain coherence over longer passages.

Context Management

Techniques

Organizing and maintaining relevant information for AI decision-making.

Context Parallelism

Techniques

A technique to process long sequences by distributing context across multiple devices or processing units in parallel.

Context Pollution

Techniques

Irrelevant or noisy information degrading model performance in a given context.

Context Retention

Performance

A model's ability to remember and use information from earlier parts of a conversation or document.

Context Truncation

Techniques

When an AI model's input context window fills up and earlier information is lost, requiring mechanisms to preserve key data.

Context Window

Architecture

The maximum number of tokens a model can process in a single conversation or prompt.

Context-Adaptive

Techniques

A system that adjusts its behavior based on the specific input or situation rather than using fixed, unchanging patterns.

Context-Aware ASR

Techniques

Speech recognition that uses surrounding information like conversation history to improve transcription accuracy.

Context-Free Grammar (CFG)

Techniques

A formal system of rules that defines which sequences of symbols are valid in a language.

Context-Intensive Tasks

Techniques

Problems requiring the model to extract and use large amounts of information from the input prompt to generate correct outputs.

Context-Specific Guidance

Techniques

Help or instructions tailored to the current situation rather than generic pre-stored information.

Contextual Adaptation

Techniques

Adjusting model behavior dynamically based on the specific input or context rather than using fixed settings.

Contextual Bandit

Techniques

A learning algorithm that selects actions based on context and learns from feedback to improve future decisions.

Contextual Embeddings

Architecture

Numerical representations of text that capture meaning based on surrounding context, rather than treating each word independently.

Contextual Invariance

Techniques

The assumption that a model produces consistent outputs when a task is reformulated in contextually equivalent ways.

Contextual Pressure

Techniques

Influence from surrounding information (like examples or previous actions) that pushes an agent away from its intended behavior.

Contextual Reasoning

Techniques

Making decisions by considering how individual observations relate to and inform each other within a broader context.

Contextual Representation

Architecture

A way of encoding text where the meaning of each word depends on the words around it, rather than being fixed for every occurrence.

Contextual Space

Techniques

The intermediate representation space in a diffusion model where semantic and structural information is encoded.

Contextual Topic Modeling

Techniques

Machine learning technique that identifies recurring themes in text while considering the surrounding context of words.

Contextual Trigger

Techniques

A feature or pattern in input text that activates hidden misaligned behavior in a model, even when standard evaluations show the model is safe.

Contextual Uncertainty

Techniques

Uncertainty caused by changing conditions over time, like user preferences shifting.

Contextual Understanding

Behavior

The ability of a model to interpret the meaning of words and phrases based on surrounding text, rather than treating each word in isolation.

Contextualized Token Embeddings

Techniques

Vector representations of words that change based on surrounding context, capturing different meanings in different sentences.

Continual Fine-tuning

Techniques

Incrementally updating a neural network on new data as it arrives, rather than retraining from scratch.

Continual Learning

Techniques

Training models to learn new tasks without forgetting previously learned ones.

Continued Pretraining

Techniques

Further training a pretrained model on domain-specific data to specialize it for particular tasks.

Continuous Measurement

Techniques

Real-time monitoring of a quantum system that produces a stream of measurement data used to update state estimates.

Continuous Representation

Techniques

Encoding data as smooth, unquantized values rather than discrete tokens, preserving fine-grained temporal details.

Contraction

Techniques

A mathematical property ensuring a system's outputs converge to a stable state regardless of initial conditions.

Contractivity

Techniques

A mathematical property ensuring that a system brings nearby states closer together over time, guaranteeing stability.

Contrastive Learning

Techniques

A training technique that learns by comparing similar and dissimilar examples to create better representations.

Contrastive Loss

Techniques

Training objective that pulls similar examples together and pushes different ones apart.
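
One classic form is the pairwise, margin-based loss sketched below. The function name `contrastive_loss`, the Euclidean distance, and the default margin are assumptions for the example; other contrastive objectives (such as InfoNCE) differ in detail.

```python
import math

def contrastive_loss(a, b, is_similar, margin=1.0):
    """Pairwise contrastive loss on two embedding vectors."""
    distance = math.dist(a, b)
    if is_similar:
        return distance ** 2  # pull similar pairs together
    return max(0.0, margin - distance) ** 2  # push dissimilar pairs beyond the margin

close_negative = contrastive_loss([0.0, 0.0], [0.5, 0.0], is_similar=False)
far_negative = contrastive_loss([0.0, 0.0], [3.0, 4.0], is_similar=False)
```

Dissimilar pairs that are already farther apart than the margin contribute no loss, so training effort goes to the pairs that are still too close.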

Contrastive Retrieval

Techniques

A method that learns shared embedding spaces by contrasting similar and dissimilar image pairs, then ranks candidates by similarity.

Contribution Decomposition

Techniques

Breaking down a neural network's output into individual contributions from different neurons or neuron groups.

Control Codes

Techniques

Special tokens added at the beginning of a prompt that tell the model what style, domain, or format to use for its output.

Control Tokens

Techniques

Special tokens inserted into sequences to guide model behavior, such as signaling whether to show an ad or organic content.

Controller Synthesis

Techniques

Automatically designing a decision-making system that controls when and how to execute actions.

ControlNet

Techniques

A technique that adds spatial control to diffusion models by conditioning generation on aligned input maps (like depth or property masks).

Convection-Dominated

Techniques

Physics problems where fluid flow effects dominate over diffusion, creating sharp gradients and moving fronts.

Convergence

Training

The point during training when a model's performance stabilizes and stops improving significantly, indicating it has learned the patterns in the data.

Convergence Guarantees

Techniques

Mathematical proofs that an algorithm will reach a correct solution under specified conditions.

Convergence Rate

Techniques

How quickly an optimization algorithm approaches the optimal solution, typically expressed as a function of iterations.

Convergent Evolution

Techniques

When different models independently learn similar features or representations from different training signals.

Conversational AI

Behavior

AI systems designed to understand and respond to human language in natural, dialogue-like interactions.

Conversational AI Agent

Techniques

An AI system designed to conduct multi-turn dialogue with users to accomplish specific tasks, like medical interviewing.

Conversational Assessment

Techniques

Using dialogue with a chatbot or AI agent to probe and verify student understanding through questioning.

Conversational Coherence

Behavior

The model's ability to maintain logical consistency and relevance across multiple turns of dialogue, making responses feel natural and connected.

Conversational Fluency

Behavior

How naturally and coherently a model engages in back-and-forth dialogue, matching human conversation patterns.

Conversational Language Model

Training

A model specifically trained to understand and generate natural dialogue, optimized for back-and-forth interactions rather than one-off text generation.

Conversational Model

Behavior

A language model specifically trained and optimized to engage in multi-turn dialogue with users.

Convex Function

Techniques

A function where any line segment between two points on the curve lies on or above the curve, so any local minimum is also a global minimum.

Convex Optimization

Techniques

Mathematical technique for minimizing a convex function over a convex set, where any local optimum is also a global optimum.

Convex Polytope

Techniques

A bounded geometric shape formed by the intersection of finitely many half-spaces (each defined by a linear inequality), with vertices representing extreme points.

Convolutional Operations

Architecture

A technique that scans across input data using small filters to detect local patterns, commonly used in image processing but also applicable to text for efficiency.

Coordinate Reference System (CRS)

Techniques

The geographic coordinate system (e.g., latitude/longitude) used to define spatial locations and ensure consistency across operations.

Coordination Games

Techniques

Game theory scenarios where agents benefit from matching actions but may also benefit from strategic differentiation.

Copy-on-Write

Techniques

An optimization where data is only copied when modified, allowing multiple references to share the same data until changes occur.
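
A toy sketch of the idea using a hypothetical `CowList` class: forks share one underlying list, and a private copy is made only on the first write. Real systems typically apply this to memory pages or storage blocks rather than Python lists.

```python
class CowList:
    """List wrapper that shares storage between forks until one is modified."""

    def __init__(self, items):
        self._items = items
        self._owns = True  # True when this object holds a private copy

    def fork(self):
        clone = CowList(self._items)  # no data copied yet
        clone._owns = False
        self._owns = False            # both sides now share the same storage
        return clone

    def get(self, index):
        return self._items[index]

    def set(self, index, value):
        if not self._owns:
            self._items = list(self._items)  # copy happens only on first write
            self._owns = True
        self._items[index] = value

original = CowList([1, 2, 3])
fork = original.fork()
shared_before_write = original._items is fork._items
fork.set(0, 99)
```

After the write, `fork` has its own storage and `original` still reads its old values.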

Core-Periphery Attention

Techniques

An attention mechanism where peripheral tokens (patches) interact only through central core tokens, reducing computation.

Coreference Resolution

Techniques

Identifying when different mentions in text refer to the same entity or concept.

Corpus

Techniques

A collection of documents or text used as the knowledge base for retrieval in RAG systems.

Corpus-Discriminative Retrieval

Techniques

Selecting query terms that best distinguish relevant documents from irrelevant ones in a specific corpus.

Correctness Gating

Techniques

A filtering mechanism that validates whether a proposed solution is correct before allowing it to advance in a search process.

Corruption Robustness

Techniques

A model's ability to maintain performance when input data is degraded (e.g., noise, blur, missing values).

Cosine Distance

Evaluation

A mathematical measure of how different two embeddings are, computed as 1 minus the cosine of the angle between them, with values closer to 0 meaning more similar.

Cosine Similarity

Performance

A method of comparing two vectors based only on their direction, ignoring their magnitude, making it scale-invariant.
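
A short sketch in plain Python, with function names chosen for this example. It also shows the usual companion quantity, cosine distance, taken here as 1 minus the similarity.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, ranging from -1 to 1."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    """1 minus cosine similarity: 0 means same direction, 2 means opposite."""
    return 1.0 - cosine_similarity(a, b)

same_direction = cosine_similarity([1.0, 0.0], [5.0, 0.0])  # different lengths
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Scaling either vector leaves the similarity unchanged, which is the scale-invariance mentioned in the definition.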

CosNet

Techniques

A learnable activation function using cosine waves with adjustable frequency and phase to process data nonlinearly.

Cost-Aware Attack

Techniques

An adversarial attack that accounts for the real-world cost or feasibility of modifying each feature.

Cost-Efficiency

Deployment

The ability to deliver useful results while using fewer computational resources, reducing the expense of running the model.

CoT-MAE

Training

A pretraining methodology (Contextual Masked Auto-Encoder) that trains a text encoder to reconstruct masked tokens using both a passage and its neighboring context, improving model understanding of text relationships for retrieval.

Counterfactual Evaluation

Techniques

Testing what would happen if you changed a strategy, without actually running the experiment in the real world.

Counterfactual Explanation

Techniques

An explanation showing what input changes would alter a model's prediction to a different outcome.

Counterfactual Generation

Techniques

Creating alternative scenarios showing what would happen if something were different (e.g., if an object didn't exist).

Counterfactual Negatives

Techniques

Training examples where evidence is semantically related but contradicts the claim, testing if models truly use evidence.

Counterfactual Query

Techniques

A question about what would have happened if a variable had taken a different value (e.g., 'what if the patient had received treatment?').

Counterfactual Reasoning

Techniques

Reasoning about what would have happened under different actions or conditions than what actually occurred.

Covariance Estimation

Techniques

The process of learning or updating the statistical properties of measurement and process noise in a filtering system.

Covariance Matching

Techniques

Aligning a model's sensitivity structure to match the statistical structure of task-irrelevant variations in data.

Covariate Shift

Techniques

When the distribution of input data changes between training and real-world use, causing models to fail.

Coverage Estimation

Techniques

Measuring what proportion of a problem space a model can reliably handle.

Coverage Path Planning

Techniques

Finding an efficient route for a vehicle to visit all cells or areas in a region.

Coverage Verification

Techniques

The process of proving that testing has comprehensively covered all relevant operating conditions and edge cases.

Coverage-Guided Testing

Techniques

Testing approach that systematically explores different input regions to find edge cases and failures.

CPTP Operation

Techniques

A completely positive, trace-preserving quantum operation: it always maps valid quantum states to valid quantum states by preserving positivity and total probability (the trace).

CPU Inference

Deployment

Running a model's predictions using a computer's central processor rather than a specialized graphics card, which is slower but requires less specialized hardware.

Creative Utility

Techniques

A measure of how useful and novel the connections a model generates are for creative tasks.

Credit Assignment

Techniques

The process of determining which actions or steps in a sequence deserve reward or blame for the final outcome.

Criterion-level Feedback

Techniques

Detailed feedback that scores responses across multiple specific evaluation criteria rather than a single overall score.

Critique Agent

Techniques

An agent that reviews and validates the recommendations and execution plan of other agents to ensure correctness and coherence.

Cross Attention

Techniques

Mechanism allowing one sequence to attend to and focus on another sequence.

Cross-Architecture Transfer

Techniques

Transferring knowledge between models with fundamentally different designs, attention mechanisms, or tokenizers.

Cross-Attention Adapter

Techniques

A neural module that merges information from two sources by learning which parts of each are most relevant.

Cross-Dataset Transfer

Techniques

Testing whether a model trained on one dataset generalizes to perform the same task on a different dataset.

Cross-domain Mapping

Techniques

A creativity technique where ideas from one unrelated domain are applied to solve problems in another domain.

Cross-embodiment Transfer

Techniques

Learning to control one body type (like a humanoid robot) using data from a different body type (like humans).

Cross-Encoder

Architecture

A model architecture that takes a query and document together as input and directly outputs a relevance score, unlike dual-encoders, which encode the two separately and then compare the resulting vectors.

Cross-Entropy Loss

Techniques

A loss function that measures how well a predicted probability distribution matches a target distribution.
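
A small worked sketch, assuming the target and prediction are plain lists of probabilities over the same classes. The function name and the `eps` clamp (used only to avoid taking the logarithm of zero) are illustrative choices.

```python
import math

def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy H(target, predicted) in nats."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))

target = [0.0, 1.0, 0.0]             # the correct class is index 1
confident = [0.1, 0.8, 0.1]          # most probability on the right class
unsure = [0.4, 0.2, 0.4]             # little probability on the right class

loss_confident = cross_entropy(target, confident)
loss_unsure = cross_entropy(target, unsure)
```

With a one-hot target the loss reduces to the negative log of the probability assigned to the correct class, so the confident prediction gets the lower loss.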

Cross-Environment Deployment

Techniques

Running an AI model in different network environments or systems than the one it was trained on.

Cross-Lingual

Behavior

The ability to understand relationships and transfer knowledge between different languages, such as answering a question in one language based on text in another.

Cross-Lingual Awareness

Behavior

The ability of a model to understand and relate concepts across different languages, allowing it to find similarities between text in different languages.

Cross-Lingual Capability

Behavior

The ability of a model to understand and work with multiple languages, sometimes even translating concepts between them.

Cross-Lingual Consistency

Behavior

The ability of a model to represent similar meanings in different languages as nearby points in its vector space, so translations and equivalent concepts are treated as semantically close.

Cross-lingual Generalization

Techniques

The ability of a model or probe trained on one language to work effectively on other languages.

Cross-Lingual Matching

Behavior

The ability to find and compare similar content across different languages by representing them in a shared mathematical space.

Cross-Lingual Retrieval

Behavior

The ability to find relevant documents or text in one language when searching with a query in a different language.

Cross-Lingual Semantic Similarity

Behavior

The ability to recognize that sentences or phrases in different languages have the same or similar meaning and represent them close together in numerical space.

Cross-Lingual Similarity

Behavior

The ability to measure how similar two sentences are even when they are written in different languages.

Cross-Lingual Transfer

Behavior

The ability of a model trained on multiple languages to apply knowledge learned from one language to understand or generate text in another language.

Cross-Lingual Understanding

Behavior

The ability of a model to comprehend relationships and meanings across different languages, enabling tasks like translation and multilingual reasoning.

Cross-Modal Alignment

Techniques

Connecting representations from different types of data (like speech and text) so they work together effectively.

Cross-Modal Attack

Techniques

An attack that manipulates multiple input types (like images and text) together to deceive a model.

Cross-modal Attention

Techniques

A mechanism that aligns and weights information between different modalities like images and text.

Cross-Modal Consistency

Techniques

Ensuring that representations across different modalities (images, 3D, text) align and reinforce each other.

Cross-Modal Convergence

Techniques

Alignment in how models from different modalities (e.g., vision and language) represent the same stimulus.

Cross-Modal Fusion

Techniques

The process of combining information from multiple modalities (e.g., vision and text) into a unified representation.

Cross-Modal Inconsistency

Techniques

When a model produces contradictory predictions for the same concept represented in different modalities.

Cross-Modal Matching

Behavior

The ability to find relationships between different types of content, such as matching natural language descriptions to code snippets.

Cross-Modal Reasoning

Behavior

The ability to connect and reason about information from different input types (like audio and video) together to draw conclusions.

Cross-Modal Retrieval

Techniques

The ability to search and find relevant items across different data types, such as finding images using text queries or vice versa.

Cross-modal Semantic Sharing

Techniques

The ability of a model to share semantic understanding between different input modalities like vision and text.

Cross-Modal Similarity

Behavior

The ability to measure how closely related content from different types of input (like images and text) are to each other.

Cross-Modality Message Passing

Techniques

Exchanging information between different input types (text and vision) to guide compression decisions.

Cross-module Reasoning

Techniques

The ability of AI tools to access information from other modules and make decisions based on shared context.

Cross-Price Effects

Techniques

How the demand for one product changes when the price of a different product changes.

Cross-Script Generalization

Techniques

The ability of a model to perform consistently when input text or audio switches between different writing systems or languages.

Cross-Source Reconciliation

Techniques

The process of comparing and resolving conflicting information from multiple sources to determine accurate answers.

Cross-subject generalization

Techniques

A model's ability to work on new individuals without retraining, despite differences in neural anatomy.

Cross-View Attention

Techniques

A mechanism that transfers motion information from one camera viewpoint to another while maintaining consistency.

Cross-View Correlation

Techniques

The degree to which internal representations align when processing the same task in different formats or modalities.

Cross-view matching

Techniques

Aligning images captured from different viewpoints (e.g., street-level and overhead) to find correspondences.

Cubic surface

Techniques

An algebraic surface (a 2-dimensional variety) in 3-dimensional space, defined by a degree-3 polynomial equation.

CUDA

Techniques

NVIDIA's parallel computing platform that runs code on GPUs to process many tasks simultaneously.

CUDA Kernels

Techniques

Optimized GPU code that performs specific computational operations efficiently.

Cultural Reasoning

Techniques

The ability to understand and infer cultural context, significance, and metadata from visual or textual information.

Cumulants

Techniques

Statistical measures that describe probability distributions, used to track activation behavior.

Curated Dataset

Training

Training data that has been carefully selected and filtered to include only high-quality examples relevant to specific tasks or domains.

Curated Training Data

Training

Carefully selected and filtered training examples chosen for quality rather than quantity, often resulting in models that produce more structured and reliable outputs.

Curiosity-Driven Reinforcement Learning

Techniques

RL approach where agents explore by seeking states where their world model makes poor predictions.

Curriculum Design

Techniques

Training strategy that gradually increases task difficulty to help models learn robustly.

Curriculum Learning

Techniques

Training strategy that presents examples in increasing order of difficulty.

Curvature Regularizer

Techniques

A training constraint that penalizes curved or winding paths in the learned representation space.

Cycle Consistency

Techniques

A constraint requiring a model to reconstruct its original input after transforming it through intermediate steps and back.

Cyclomatic Complexity

Techniques

A metric counting the number of independent paths through a piece of code; lower values mean simpler, easier-to-maintain code.

D

DAgger

Techniques

An interactive imitation-learning method in which an expert (often a human) labels the situations the model itself encounters, and the model is retrained on those corrections to fix distribution mismatch.

Data Attribution

Techniques

Measuring how much each training example contributes to a model's final performance using gradient-based methods.

Data Contamination

Techniques

When test data accidentally leaks into training, artificially inflating a model's measured performance.

Data Curation

Training

The process of carefully selecting, cleaning, and organizing training data to improve model quality; better curated data often leads to better model performance.

Data Deletion

Techniques

Predicting, without retraining, how a model would behave if specific training examples had been excluded.

Data Deletion Problem

Techniques

The problem of predicting, without retraining, how a model's behavior would change if specific training data had been excluded.

Data Heterogeneity

Techniques

Variation in data distribution across different sources or groups.

Data Missingness

Techniques

Gaps or missing values in a dataset caused by sensor failures, blinks, or other interruptions.

Data Quality

Training

The relevance, accuracy, and usefulness of training data, which can be more important for model performance than simply having more data.

Data Quality Curation

Training

The practice of carefully selecting and filtering training data for relevance and accuracy rather than simply using larger amounts of raw data.

Data Registry

Techniques

A centralized catalog storing metadata about available data sources and their query interfaces.

Data Residency

Deployment

A guarantee that your data is stored and processed only in a specific geographic region, helping meet regulatory requirements.

Data Reuse

Techniques

When researchers use datasets from previous studies in their own research rather than collecting new data.

Data Selection

Techniques

Choosing a subset of training data based on quality or relevance metrics rather than using all available data.

Data Synthesis

Techniques

Automatically generating training data from existing datasets to teach models new tasks.

Data Validation

Techniques

Automated checks that verify data meets quality and correctness requirements before use.

Data-Parallel Training

Techniques

Distributing training data across multiple GPUs that compute gradients independently then synchronize.
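
The following toy sketch simulates the idea on a single machine: two "workers" each compute a gradient on their own shard, and the results are averaged. The one-parameter model `y = w * x`, the data, and the function name are hypothetical, chosen only to keep the arithmetic visible.

```python
def grad_mse(w, shard):
    """Gradient of mean((w*x - y)^2) with respect to w on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
shards = [data[:2], data[2:]]        # one equal-sized shard per simulated GPU
w = 0.0

# Each worker computes its gradient independently...
local_grads = [grad_mse(w, shard) for shard in shards]
# ...then the gradients are synchronized by averaging.
synced_grad = sum(local_grads) / len(local_grads)

# With equal-sized shards this matches the gradient on the full batch.
full_grad = grad_mse(w, data)
```

Real systems perform the averaging step with a collective operation across devices; that communication is not modeled here.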

Dataset Distillation

Techniques

Compressing a large dataset into a smaller synthetic version preserving key information.

DBRX Architecture

Architecture

A neural network design pattern that serves as the structural foundation for this model, determining how it processes and generates text.

De Novo Design

Techniques

Creating entirely new protein sequences from scratch rather than modifying or copying existing ones.

DeBERTa

Architecture

A transformer-based language model architecture that uses disentangled attention mechanisms to improve how the model weighs different parts of the input text when making predictions.

Decentralized Training

Training

A training approach where a model is developed across multiple independent computers or organizations rather than in a single centralized facility, allowing distributed collaboration.

Decision-Making System

Techniques

A mechanism that selects actions based on current state, goals, and expected outcomes to maximize success.

Decision-Support Mechanism

Techniques

A tool or system that provides information and analysis to help humans make better decisions without replacing human judgment.

Decision-Theoretic

Techniques

An approach that evaluates systems based on the quality of decisions they enable under different costs and benefits.

Decoder

Techniques

A component that converts compressed internal representations back into human-readable outputs like audio or images.

Decoder-based Language Model

Techniques

A type of LLM that generates text one token at a time, like GPT models.

Decoder-Only Architecture

Techniques

Language model design that generates text sequentially without a separate encoder, like GPT models.

Decoding

Techniques

Converting model outputs into human-readable text or structured predictions.

Decoding Strategies

Techniques

Methods for generating text from a language model, such as greedy selection, beam search, or temperature sampling.
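
To make two of these strategies concrete, here is a minimal sketch that picks the next token from a list of raw scores (logits). The three-token vocabulary and the function names are made up for illustration; beam search is not shown.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits):
    """Greedy selection: always take the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])

def sample_pick(logits, temperature=1.0, rng=random):
    """Temperature sampling: draw a token at random from the softmax."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.1]                     # hypothetical scores for 3 tokens
greedy_token = greedy_pick(logits)
sampled_token = sample_pick(logits, temperature=0.8, rng=random.Random(0))
```

Greedy selection is deterministic, while sampling can return any token, with higher temperatures spreading probability more evenly across the options.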

Decompositional Verifiable Reward (DVReward)

Techniques

A reward system that breaks complex requests into atomic, checkable questions to provide interpretable feedback for model training.

Deduplication

Training

The process of removing duplicate or near-duplicate examples from training data to improve model efficiency and prevent overfitting to repeated content.
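
A minimal sketch of the simplest case: dropping examples that are identical after light normalization (case and whitespace). The normalization rule is an assumption made for this example; catching looser near-duplicates typically needs fuzzy matching, which is not shown.

```python
import re

def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return re.sub(r"\s+", " ", text.strip().lower())

def deduplicate(examples):
    """Keep the first occurrence of each normalized example, in order."""
    seen = set()
    kept = []
    for example in examples:
        key = normalize(example)
        if key not in seen:
            seen.add(key)
            kept.append(example)
    return kept

raw = ["Hello  world", "hello world ", "Goodbye"]
clean = deduplicate(raw)
```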

Deep Research Agent

Techniques

An AI system that performs multi-step research by reasoning through problems and making multiple search queries.

Defect Detection

Techniques

Automatically identifying problems or errors in software artifacts, such as incomplete or ambiguous descriptions.

Degrees of Freedom

Techniques

The number of independent ways a mechanical part can move or rotate in an assembly.

Delayed Feedback

Techniques

Consequences of an agent's actions that appear many steps later, making it harder to learn cause-and-effect relationships.

Delayed Verifier Signals

Techniques

Feedback or verification of agent actions that arrives after a delay, requiring the agent to maintain accountability over time.

Deliberative democracy

Techniques

A form of democracy where citizens and representatives engage in reasoned discussion to reach decisions.

Demand Modeling

Techniques

Using machine learning to predict how much of a product customers will buy given prices and other factors.

Demographic Blinding

Techniques

Removing or hiding demographic information (like gender) from model inputs to reduce bias in decision-making.

Demographic Importance Weighting

Techniques

Learning which demographic attributes (race, age, etc.) are most influential in predicting how annotators will judge subjective content.

Demonstration Data

Training

Training examples collected from real robots performing tasks, used to teach the model how to execute similar actions.

Denoising

Training

A training approach where the model learns to reconstruct clean audio from corrupted or noisy versions, improving its ability to extract meaningful features.

Denoising Autoencoder

Architecture

A neural network trained to reconstruct clean text from corrupted or noisy versions, learning to remove noise while preserving meaning.

Denoising Objective

Training

A training approach where a model learns to reconstruct clean audio from noisy versions, making it better at understanding speech in real-world conditions.

Denoising Process

Techniques

A technique where a model learns to gradually remove random noise from data to reconstruct meaningful content, used as an alternative to traditional token prediction.

Denoising Score Matching

Techniques

A training objective that learns to predict noise in corrupted data, used in diffusion models for stable gradient-based optimization.

Dense Captioning

Behavior

Generating detailed, comprehensive descriptions of images that capture rich visual information and relationships rather than brief summaries.

Dense Embedding

Architecture

A compact vector representation where most dimensions contain meaningful information, as opposed to sparse embeddings that are mostly zeros.

Dense Embeddings

Architecture

Vector representations where most or all of the numbers contain meaningful information, as opposed to sparse embeddings where most numbers are zero.

Dense Model

Architecture

A neural network where all parameters are active for every input, in contrast to sparse architectures like mixture-of-experts that selectively activate different parts.

Dense Passage Retrieval

Techniques

A technique that converts documents and queries into dense vectors so that relevant passages can be found by comparing their numerical representations rather than matching keywords.

Dense Representation

Architecture

A compact numerical format where meaning is captured in a fixed-size list of numbers, making it efficient for storage and similarity comparisons.

Dense Retrieval

Techniques

A search method that converts text into a single, compact numerical vector and finds similar documents by comparing these vectors.

Dense Retriever

Techniques

A retrieval system using learned embeddings to find semantically similar documents via vector similarity.

Dense Vector

Architecture

A compact numerical representation where most values are non-zero, used to efficiently store and compare the meaning of text.

Dense Vector Embedding

Architecture

A compact numerical representation of text that captures its meaning, allowing the model to compare how similar different pieces of text are to each other.

Dense Vector Embeddings

Architecture

Numerical representations of text where each word or sentence is converted into a list of numbers that capture its meaning, allowing the model to compare semantic similarity.

Dense Vector Representation

Formats

A compact numerical format where text is encoded as a list of numbers that capture its meaning, allowing efficient similarity comparisons.

Dense Vector Space

Architecture

A mathematical space where text is represented as vectors of numbers, positioned so that similar meanings are located close together.

Dense Vectors

Architecture

Compact numerical representations where most values are non-zero, used to encode the meaning of text in a form that computers can compare mathematically.

Dense vs. Sparse Embeddings

Architecture

Dense embeddings use all dimensions with non-zero values (like traditional neural embeddings), while sparse embeddings mostly contain zeros and are more interpretable and storage-efficient.

Density-Guided Response Optimization (DGRO)

Techniques

A method that aligns models by learning from the geometric clustering of accepted responses in the model's representation space.

Dependency Reasoning

Techniques

The ability to understand how multiple facts relate to and affect each other when making decisions.

Deployment Monoculture

Techniques

Risk that a single model's values or biases get applied uniformly at scale, eliminating the diversity of perspectives that would naturally exist with multiple decision-makers.

Depth Map

Techniques

An image where each pixel's brightness represents how far away that object is from the camera.

Depth-Upscaling

Training

A technique that creates a larger model by combining and stitching together layers from smaller pre-trained models rather than training a new model from scratch.

Dequantization

Techniques

The process of converting a compressed (quantized) model's weights back to a higher-precision number format for computation. This requires more memory and does not recover detail that was lost when the weights were quantized.
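
As a rough sketch, the snippet below quantizes a few weights to 4-bit integers with a single scale factor and then dequantizes them. The symmetric scheme, the function names, and the example weights are assumptions for illustration; real formats vary.

```python
def quantize(weights, n_bits=4):
    """Map floats to signed integers using one shared scale factor."""
    qmax = 2 ** (n_bits - 1) - 1             # 7 for 4 bits
    scale = max(abs(w) for w in weights) / qmax or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Convert the integers back to floats."""
    return [q * scale for q in quantized]

weights = [0.42, -1.0, 0.123]
quantized, scale = quantize(weights)
restored = dequantize(quantized, scale)

# Rounding error per weight is at most half a quantization step.
errors = [abs(r - w) for r, w in zip(restored, weights)]
```

The restored values are close to the originals but not equal to them, because the rounding step discarded information.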

Derivative Model

Training

A new model created by modifying or fine-tuning an existing base model rather than training from scratch.

Descriptor

Architecture

A numerical representation that captures the visual characteristics around a detected keypoint, allowing the model to match similar points across different images.

Descriptor-Based Generation

Techniques

Generating model weights using text or structured descriptions of the target architecture and task as input.

Deskilling

Techniques

The loss of professional expertise and judgment that occurs when workers rely on automated systems instead of developing their own capabilities.

Determinantal Point Process

Techniques

A mathematical model that generates diverse sets of items by penalizing similarity, useful for ensuring variety in generated outputs.

Deterministic Checks

Techniques

Automated verification rules that produce the same result every time, used when there is clear evidence of task completion.

Development Build

Deployment

An early, pre-release version of a model used for testing and refinement before public release.

Dexterous Manipulation

Techniques

Fine-grained, skillful robotic hand control requiring precise coordination of many joints.

Diagnostic Reasoning

Techniques

AI process of identifying root causes or problems from observed symptoms.

Dialogue Generation

Behavior

The process of an AI model creating natural conversational responses based on input text.

Dictionary Learning

Techniques

The process of finding a set of basis vectors (dictionary) that can reconstruct data through sparse combinations.

Diff Application

Techniques

The ability to understand and apply code changes (diffs) to existing files rather than generating code from scratch.

Differentiable

Techniques

A property of operations that allows gradients to flow through them during backpropagation for model training.

Differentiable Approximation

Techniques

Smooth mathematical function approximating non-differentiable operations for training.

Differentiable Loss Functions

Techniques

Mathematical functions that measure how far a model's output is from desired behavior, designed to be optimizable via gradient descent.

Differentiable Memory Stack

Techniques

A learnable memory retrieval mechanism that can be trained end-to-end to recall relevant past episodes for current decision-making.

Differentiable Physics

Techniques

A physics solver built into a neural network so that gradients can flow through physical laws during training.

Differentiable Reward

Techniques

A reward function whose gradients can be computed, allowing optimization of model outputs toward desired properties.

Differentiable Sparse Attention

Techniques

A sparse attention method that supports gradient computation, enabling end-to-end training with learned sparsity patterns.

Differential diagnosis

Techniques

A list of possible medical conditions ranked by likelihood, used by clinicians to guide further testing.

Differential Privacy

Techniques

A mathematical framework that adds controlled noise to data to protect individual privacy while enabling statistical analysis.
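
One classic way to add such noise is the Laplace mechanism, sketched below for a simple counting query. The function names, the ages, and the choice of `epsilon` are hypothetical; the noise scale of `1 / epsilon` assumes that adding or removing one person changes the count by at most 1.

```python
import random

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(values, predicate, epsilon, rng):
    """Noisy count of matching values; smaller epsilon means more noise."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)

ages = [23, 35, 41, 29, 52, 67, 38]
rng = random.Random(0)
noisy = private_count(ages, lambda age: age >= 40, epsilon=1.0, rng=rng)
```

Any single answer is blurred, yet averages over many queries or large groups stay close to the truth, which is what allows useful statistics.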

Difficulty amplification

Techniques

A technique to systematically increase problem complexity to better differentiate model capabilities.

Difficulty Estimation

Techniques

Predicting how hard a task is to automatically adjust the amount of computational effort needed.

Difficulty Signal

Techniques

An internal indicator that estimates how hard a problem is, used to guide model behavior.

Diffusion Language Models

Techniques

Language models that generate text by iteratively refining noisy predictions into coherent words.

Diffusion Model

Techniques

Generative model that creates images or videos by gradually removing noise from random data.

Diffusion Models

Techniques

AI models that generate images by learning to reverse a noise-adding process, starting from pure noise.

Diffusion Paradigm

Techniques

A generative approach that iteratively refines predictions by gradually removing noise from random initial states.

Diffusion Prior

Techniques

A learned distribution that guides diffusion models toward realistic outputs in a specific domain.

Diffusion Process

Architecture

A generation method that iteratively refines outputs by gradually removing noise, rather than predicting tokens one at a time from left to right.

Diffusion steps

Techniques

Iterations in a diffusion model that gradually refine noise into a final image or video output.

Diffusion Transformer

Techniques

A transformer architecture adapted to work with diffusion-based generation processes.

Diffusion-Based Architecture

Architecture

A neural network design that generates outputs by iteratively refining noisy predictions into clear results, rather than building text one token at a time like traditional language models.

Diffusion-Based Generation

Architecture

A method where a model generates text by iteratively refining noise into coherent output all at once, rather than predicting one word at a time.

Diffusion-Based Language Model

Architecture

A language model that generates text by iteratively predicting and refining masked (hidden) tokens across the entire output, rather than predicting one token at a time from left to right.

Diffusion-Based Trajectory Generation

Techniques

Using diffusion models to generate realistic robot motion sequences that can be used as training data.

Digital Twin

Techniques

A virtual simulation model of a physical system used to predict behavior and test changes before real-world deployment.

Dilated Convolution

Techniques

A convolutional operation that skips input elements to capture patterns at multiple scales without increasing parameters.
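
A minimal one-dimensional sketch in plain Python, using the deep-learning convention of sliding the kernel without flipping it. The function name, signal, and kernel are illustrative only.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Slide the kernel over the signal, reading every `dilation`-th element."""
    span = (len(kernel) - 1) * dilation      # distance covered by the kernel
    output = []
    for start in range(len(signal) - span):
        output.append(
            sum(k * signal[start + j * dilation] for j, k in enumerate(kernel))
        )
    return output

signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]                          # same 3 weights in both cases

dense = dilated_conv1d(signal, kernel, dilation=1)   # covers 3 inputs
wide = dilated_conv1d(signal, kernel, dilation=2)    # covers 5 inputs
```

The kernel still has three weights, but with `dilation=2` each output looks across five input positions instead of three.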

Dimension Reduction

Techniques

A technique to simplify high-dimensional parameter spaces by identifying and focusing on the most critical variables.

Dimensionality Reduction

Techniques

Techniques that compress high-dimensional data into fewer dimensions while preserving important patterns.

Direct Preference Optimization

Training

A training technique that teaches a model to prefer certain outputs over others by learning from examples of better and worse responses.

Directed Acyclic Graph (DAG)

Techniques

A graph structure representing causal relationships where arrows point from causes to effects with no cycles.

Directed Acyclic Graph (DAG)

Techniques

A workflow representation where tasks are nodes and dependencies are directed edges with no circular paths.
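
As a brief sketch, Python's standard `graphlib` module (Python 3.9+) can order the tasks of such a workflow and detect circular dependencies. The task names and helper functions are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
dependencies = {
    "download": set(),
    "clean": {"download"},
    "train": {"clean"},
    "evaluate": {"train"},
}

def execution_order(deps):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())

def is_acyclic(deps):
    """True if the dependency graph contains no circular path."""
    try:
        execution_order(deps)
        return True
    except CycleError:
        return False

order = execution_order(dependencies)
```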

Dirichlet Energy

Techniques

A measure of smoothness on a graph that quantifies how much node values vary across connected edges.
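
A short sketch under one common convention: sum the squared difference in node values over every edge (some definitions add a factor of one half or edge weights). The function name and the tiny graph are illustrative.

```python
def dirichlet_energy(edges, values):
    """Sum of squared value differences across connected node pairs."""
    return sum((values[u] - values[v]) ** 2 for u, v in edges)

# A path graph: node 0 - node 1 - node 2
edges = [(0, 1), (1, 2)]

smooth = dirichlet_energy(edges, [1.0, 1.0, 1.0])   # identical neighbours
bumpy = dirichlet_energy(edges, [0.0, 1.0, 0.0])    # values change on each edge
```

Constant values give zero energy, and the energy grows as neighbouring nodes disagree more.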

Discourse Coherence

Techniques

The logical flow and consistency of ideas across sentences in a text or conversation.

Discourse Functional Analysis

Techniques

Examining how language serves specific communicative purposes in conversation, like validating feelings or paraphrasing.

Discovery-to-Application Gap

Techniques

The challenge of moving from discovering causal rules to engineering them into working systems.

Discrete Diffusion

Techniques

A generative model that iteratively removes noise from discrete tokens (like words) to generate text, as an alternative to autoregressive decoding.

Discrete Diffusion Models

Techniques

Generative models that iteratively denoise discrete tokens (like words) from noise to produce text.

Discrete Embeddings

Architecture

Compressed representations of audio data stored as specific, distinct values rather than continuous numbers, making them efficient for storage and processing.

Discrete Latent Space

Techniques

A compressed representation where continuous data is converted into distinct, countable tokens or categories.

Discrete memoryless channel

Techniques

A communication channel where each transmitted symbol is corrupted independently with no memory of past transmissions.

Discrete Tokens

Formats

Individual units of quantized information that represent audio in a compressed, symbolic form rather than continuous values.

Discretization

Techniques

Converting continuous numerical values into discrete bins or categories for processing by algorithms.
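
A minimal sketch using equal-width bins over a known range. The function name, the range, and the bin count are assumptions for this example; other schemes, such as bins based on quantiles, are equally valid.

```python
def discretize(value, low, high, n_bins):
    """Return the index of the equal-width bin that contains `value`."""
    if value <= low:
        return 0
    if value >= high:
        return n_bins - 1                    # clamp to the last bin
    width = (high - low) / n_bins
    return int((value - low) / width)

readings = [0.0, 0.3, 0.5, 0.99, 1.0]
bins = [discretize(r, low=0.0, high=1.0, n_bins=4) for r in readings]
```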

Discretization Invariance

Techniques

The ability of a model to generalize across different mesh resolutions or numerical discretizations of the same continuous problem.

Discriminative Direction

Techniques

A pattern in token gradients that effectively distinguishes high-reward responses from low-reward ones.

Disentanglement

Techniques

Separating different factors of variation (like expression and identity) in a model's learned representations.

DistilBERT

Architecture

A smaller, faster version of BERT that retains most of its language understanding ability while using fewer parameters and less computational power.

Distillation

Training

A technique that compresses a large, complex model into a smaller one by training it to mimic the larger model's behavior, resulting in faster inference with minimal loss of quality.

Distilled

Training

A model that has been compressed by training a smaller model to mimic a larger, more capable model, reducing size and computational requirements while retaining performance.

Distilled Model

Architecture

A smaller, faster version of a larger model created by training it to mimic the larger model's behavior, reducing computational requirements while maintaining reasonable performance.

Distributed Compute

Training

Using multiple computers or servers across a network to share the computational work of training or running a model, rather than relying on a single machine.

Distribution Mismatch

Techniques

When the data distribution used for training differs from the distribution encountered during deployment, causing performance degradation.

Distribution Shaping

Techniques

Modifying a model's output probability distribution at inference time to satisfy constraints without changing the model's weights.

Distribution Sharpening

Techniques

When a policy becomes overly specialized in reproducing successful behaviors without learning to handle diverse situations or recover from failures.

Distribution Shift

Techniques

When a model encounters data that looks different from what it was trained on, causing performance to drop.

Distributional Drift

Techniques

When a model's behavior diverges from the original training data distribution during fine-tuning or RL.

Distributional Embedding Space

Techniques

A mathematical space where words are represented as vectors based on their usage patterns in text, like GloVe or Word2Vec.

Distributional fairness

Techniques

Ensuring benefits and harms are equitably distributed across agents rather than concentrated in hubs or privileged positions.

Distributional Matching

Techniques

Forcing a model's output distribution to match a target distribution, here used to normalize reward structures across different tasks.

Distributional Modeling

Techniques

Learning to predict probability distributions over outputs rather than single deterministic predictions.

Distributional Shift

Techniques

When the statistical properties of data change over time, making old patterns unreliable for future predictions.

Divergence Constraint

Techniques

A regularization technique that limits how far a model's distribution can drift from a reference distribution during training.

Divergence-Free

Techniques

A mathematical property ensuring that a velocity field conserves mass (no fluid is created or destroyed at any point).

Diversity Coverage

Techniques

A metric measuring the quality of unique answers generated relative to the best possible answer set of the same size.

Diversity-aware Ranking

Techniques

A ranking method that prioritizes both relevance and variety, ensuring results cover different perspectives or approaches.

Document Attribution

Techniques

The ability to identify and explain which retrieved documents contributed to a generated answer.

Document Boundary

Techniques

The natural division between separate documents used as a constraint to group tokens for shared expert selection.

Document Chunking

Techniques

The process of breaking long documents into smaller pieces before embedding them, so that each piece fits within an embedding model's input limit.
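
A minimal sketch of chunking, assuming a simple word-based splitter with a fixed chunk size and an optional overlap between neighboring chunks. The function name `chunk_words` and its parameters are illustrative, not taken from any particular library; production systems often split on tokens or sentences instead.

```python
def chunk_words(text, chunk_size, overlap=0):
    """Split text into chunks of at most `chunk_size` words.

    Consecutive chunks share `overlap` words so that context at a
    boundary is not lost entirely. (Hypothetical helper for illustration.)
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


chunks = chunk_words("a b c d e", chunk_size=3, overlap=1)
```

Here the five-word input yields two chunks that share the word "c".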

Document Grounding

Techniques

Anchoring AI responses to specific source documents to ensure answers are based on provided content.

Document Intelligence

Behavior

The ability to automatically extract, understand, and convert information from document images (like scans or forms) into structured, machine-readable formats.

Document Layout Analysis

Techniques

The process of identifying and understanding the structure of a document, such as text regions, tables, and columns.

Document Parsing

Techniques

The process of automatically reading and extracting structured information like text, tables, and layout from documents.

Document Retrieval

Techniques

Finding the relevant documents or passages from a large collection that are needed to answer a question.

Document Structure Preservation

Behavior

The ability to maintain the original layout, formatting, and organization of a document when extracting text, rather than just outputting raw characters.

Document Understanding

Behavior

The ability to read and extract meaningful information from structured documents like receipts, invoices, and forms by recognizing both text and layout.

Document-Intensive Workflows

Techniques

Tasks that require processing, searching, and reasoning over large collections of documents to find answers.

Document-Level Reasoning

Techniques

Understanding and answering questions that require information from multiple parts of a full document.

Domain Adaptation

Training

Adapting a model trained on one domain (like general text) so it performs well on a different target domain (like scientific papers or medical literature).

Domain Expert

Techniques

A specialized expert in an MoE model trained to handle reasoning or task-specific knowledge rather than raw perception.

Domain Generalization

Techniques

Training models to work well on new, unseen domains beyond their training data.

Domain Generation Algorithm (DGA)

Techniques

A technique used by malware to algorithmically generate large numbers of domain names, helping it evade detection and maintain control of malicious infrastructure.

Domain Grounding

Evaluation

How well a model's responses are anchored in accurate, specialized knowledge specific to a field rather than generic or hallucinated information.

Domain Knowledge

Training

Specialized expertise and facts about a particular field or subject area that an AI model has learned during training.

Domain Shift

Techniques

When a model encounters data from a different source or environment than it was trained on, causing performance to drop.

Domain Specialization

Training

When a model is trained to excel at a specific task or set of languages rather than being a general-purpose tool.

Domain Specific Languages

Techniques

Programming languages designed for specialized tasks in particular industries or fields.

Domain-Agnostic

Behavior

A model that works effectively across many different subject areas and use cases without needing to be retrained for each one.

Domain-Agnostic Conceptual Problems

Techniques

Abstract problem formulations that can be recognized and solved across multiple unrelated academic fields.

Domain-Aware

Behavior

A model's ability to understand and respond accurately to topics within a specific field or area of expertise it was trained on.

Domain-Independent Planner

Techniques

An AI planning algorithm that solves problems in any domain without domain-specific customization.

Domain-Specialized

Training

A model trained specifically on data and tasks from a particular field (such as chemistry) to achieve higher accuracy in that domain than general-purpose models.

Domain-Specific

Training

Tailored or optimized for a particular field or type of content, such as news, reviews, or scientific writing.

Domain-Specific Evaluation

Techniques

Assessment tailored to a particular field (like law) using metrics and error types relevant to that domain.

Domain-Specific Fine-Tuning

Training

Training a model on specialized data from a particular field (like medicine) so it becomes expert at tasks in that domain rather than being a generalist.

Domain-Specific Generation

Behavior

The ability to generate text tailored to a particular field or context, such as legal documents, Wikipedia articles, or product reviews.

Domain-Specific Knowledge

Techniques

Specialized expertise required for a particular field, like vendor-specific scanner operations in medical imaging.

Domain-Specific Language

Behavior

Specialized vocabulary and terminology unique to a particular field or industry, like medical jargon in healthcare or mathematical notation in physics.

Domain-Specific Language Model

Training

A language model trained exclusively on text from a particular field or subject area, making it much better at understanding and generating content in that domain than general-purpose models.

Domain-Specific Model

Training

A language model trained specifically on data from one field (like biomedical research) rather than general internet text, making it excel at specialized tasks.

Domain-Specific Optimization

Training

Training a model to excel at tasks within a particular field (like legal documents) rather than being a general-purpose model.

Domain-Specific Pretraining

Training

Training a model on specialized data from a particular field (like biomedical literature) rather than general internet text, making it much better at understanding that field's concepts.

Domain-Specific Procedures

Techniques

Specialized workflows and methodologies unique to a particular field that require expert knowledge to execute correctly.

Domain-Specific Training

Training

Training a model exclusively on data from a narrow domain (like Python code) rather than general text, making it highly specialized but less versatile.

Domain-Specific Tuning

Training

Training or adapting a model to specialize in a particular field (like biomedicine) rather than performing equally well across all topics.

DoRA (Weight-Decomposed Low-Rank Adaptation)

Techniques

A fine-tuning method that adapts model weights by separately learning magnitude and direction changes, extending LoRA.

Dot-Product Similarity

Performance

A method of comparing two vectors by multiplying their corresponding components and summing the results, where vector magnitudes (lengths) affect the final score.
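
As a small illustration, the following sketch computes a dot product in plain Python and shows how doubling a vector's length doubles the score. The vectors and variable names are made up for the example.

```python
def dot_product(a, b):
    """Sum of products of corresponding components."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


query = [1.0, 2.0]
doc = [2.0, 1.0]
longer_doc = [4.0, 2.0]  # same direction as doc, twice the length

score = dot_product(query, doc)                # 1*2 + 2*1
scaled_score = dot_product(query, longer_doc)  # 1*4 + 2*2
```

Because `longer_doc` points in the same direction as `doc` but is twice as long, its score is twice as large, which is the magnitude sensitivity the definition mentions.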

Doubly Stochastic Matrix

Techniques

A square matrix of non-negative entries in which every row and every column sums to 1, used to represent valid probability distributions for mixing multiple streams.
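
The property can be checked directly. The sketch below is an illustrative helper (the name `is_doubly_stochastic` and the tolerance are assumptions) that verifies the matrix is square, non-negative, and has unit row and column sums.

```python
def is_doubly_stochastic(matrix, tol=1e-9):
    """Return True if matrix is square, non-negative, and all row and column sums are 1."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return False  # not square
    if any(value < 0 for row in matrix for value in row):
        return False  # negative entries are not allowed
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in matrix)
    cols_ok = all(
        abs(sum(row[j] for row in matrix) - 1.0) <= tol for j in range(n)
    )
    return rows_ok and cols_ok


mixing = [[0.5, 0.5], [0.5, 0.5]]    # rows and columns both sum to 1
row_only = [[0.9, 0.1], [0.9, 0.1]]  # rows sum to 1, columns do not
```

The second matrix is only row-stochastic, so the check rejects it.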

Downsampling

Techniques

Reducing an image's resolution by removing pixels, making it smaller and faster to process.

Downstream Model

Architecture

A specialized AI model that receives requests routed to it by another system and performs the actual task or generates the final response.

Downstream Tasks

Behavior

Specific applications or problems that use the output of a pretrained model, such as predicting protein structure or identifying protein function.

Draft Head

Architecture

The smaller neural network component in speculative decoding that quickly generates candidate tokens before verification by the main model.

Draft Model

Architecture

A smaller, faster model used in speculative decoding to quickly propose token sequences before a larger model verifies them.

Draft Tree

Techniques

A tree structure of multiple candidate token sequences proposed by a draft model, allowing parallel verification of multiple continuations.

Driving Pattern Recognition

Techniques

The process of automatically identifying and classifying different driving behaviors (e.g., aggressive vs. normal) from sensor data.

Dual Encoder Architecture

Architecture

A system with two separate neural networks—one that processes questions and one that processes documents—both converting their inputs into comparable vector embeddings.

Dual ML/Software Lifecycles

Techniques

The parallel development and deployment processes for machine learning models and traditional software components.

Dual Use Risk

Techniques

The danger that AI technology can be misused for harmful purposes despite benign original intent.

Dual-Encoder Architecture

Techniques

A model with separate encoders for two input modalities that map them into a shared embedding space.

Dual-Granularity

Techniques

Organizing information at two levels of detail: high-level task guidance and low-level step-by-step actions.

Dual-Process Framework

Techniques

An approach combining two complementary methods—one for logical reasoning and one for learning patterns—to solve a problem better than either alone.

Dual-Purpose Model

Architecture

A single model trained to perform multiple distinct tasks, such as both text generation and embedding, rather than being specialized for just one.

Dual-Temporal Pathway

Techniques

An architecture using two parallel processing streams with different time scales—one dense and one sparse.

Dummy Model

Evaluation

A minimal, non-functional model used for testing infrastructure and workflows without the computational cost of a real model.

Duration Control

Techniques

The ability to generate responses with a specific target length or speaking time.

Dynamic Curriculum

Techniques

Training approach that evaluates which skills remain helpful during learning and selectively retains only those that improve the current policy.

Dynamic Epistemic Logic (DEL)

Techniques

A formal system for reasoning about how beliefs and knowledge change when new information is revealed.

Dynamic Graph Construction

Techniques

Building a network representation that changes over time to reflect evolving relationships, like road connectivity adjusted for traffic incidents.

Dynamic Merging

Techniques

Combining task-specific model parameters at inference time based on input features, rather than using a fixed merged model.

Dynamic Method Selection

Techniques

Automatically choosing the best execution approach (LLM reasoning, tool use, or code) for each step based on task requirements.

Dynamic Parameter Scaling

Architecture

The ability to automatically adjust how many of a model's parameters are actively used based on available computational resources, allowing the same model to run efficiently on different hardware.

Dynamic Programming

Techniques

An optimization method that breaks a problem into smaller overlapping subproblems, solves each one once, and stores the results to avoid recomputation.
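
A classic toy illustration is computing Fibonacci numbers with memoization. This sketch uses Python's standard `functools.lru_cache` to store subproblem results; the choice of Fibonacci is only for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """n-th Fibonacci number; each subproblem is solved once and cached."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


result = fib(30)
```

Without the cache, the same recursion would recompute the same subproblems an exponential number of times.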

Dynamic Pruning

Techniques

Removing training samples during training based on their importance or quality, rather than before training starts.

Dynamic Quantization

Techniques

A quantization approach that computes quantization parameters (such as the scale for activations) at inference time from the actual input data rather than fixing them in advance, balancing speed and accuracy on the fly.
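
The idea can be sketched in a few lines, assuming symmetric 8-bit quantization in which the scale is derived from the values seen at runtime. The function names are hypothetical and this is not the API of any specific framework.

```python
def quantize_dynamic(values, num_bits=8):
    """Quantize floats to signed integers using a scale computed from the input itself."""
    qmax = 2 ** (num_bits - 1) - 1           # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [round(v / scale) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map integers back to approximate float values."""
    return [q * scale for q in quantized]


activations = [0.4, -1.0, 0.25]
q, scale = quantize_dynamic(activations)
restored = dequantize(q, scale)
```

Because the scale comes from the current input, a batch with a small value range gets finer resolution than a fixed, precomputed scale would give it.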

Dynamic Question Generation

Techniques

Automatically creating questions of varying difficulty that adjust in real time based on learner responses and comprehension.

Dynamic Range Expansion

Techniques

The process of recovering or reconstructing the full range of brightness values lost when converting from HDR to standard video formats.

Dynamic Regret

Techniques

A measure of how well an algorithm performs compared to the best possible strategy that adapts to changing conditions.

Dynamic Routing

Techniques

Choosing packet paths through a network in real time based on current network conditions.

Dynamical Systems

Techniques

Mathematical models describing how systems evolve over time according to fixed rules.

Dynamical Systems Reconstruction

Techniques

Building neural network models that accurately capture the underlying rules governing how a system evolves over time.

Dynamics-aware Latent Space

Techniques

A compressed representation of states that captures how the environment changes over time.

E

E-graph Rewriting

Techniques

A technique for verifying program equivalence by representing multiple equivalent forms in a graph structure.

Early Exit

Techniques

Stopping a model's computation before completion when sufficient confidence is reached, reducing computational cost.

Early Fusion

Techniques

Combining multimodal inputs (like text and images) at early layers of a model rather than after separate encoding.

Early Scalarization

Techniques

Combining multiple objectives into a single weighted sum before training, which locks in a fixed trade-off.

eBPF

Techniques

Extended Berkeley Packet Filter; a technology for running sandboxed programs in the OS kernel to monitor system behavior.

ECG (Electrocardiogram)

Techniques

A recording of the electrical signals produced by the heart, used to detect heart problems.

Edge Case Handling

Behavior

The ability to anticipate and address unusual or boundary conditions in code that might cause errors.

Edge Computing

Deployment

Processing data locally on a device at the edge of a network rather than sending it to a central cloud server, improving speed and reducing dependency on internet connectivity.

Edge Deployment

Deployment

Running a model directly on local devices like phones, tablets, or IoT hardware rather than sending data to a remote server.

Edge Device

Deployment

A computing device at the edge of a network (like a smartphone or IoT device) that runs AI models locally rather than sending data to a remote server.

Edge Refinement

Techniques

The process of validating and removing unnecessary connections in a graph to improve its quality and interpretability.

Edge-to-Cloud Continuum

Techniques

Computing infrastructure spanning from edge devices (sensors, local hardware) to centralized cloud servers.

Efficient Attention Architectures

Techniques

Attention mechanisms designed to reduce computational or memory complexity compared to standard quadratic-scaling attention.

Egocentric Perception

Techniques

Visual understanding from a first-person viewpoint, as seen from the wearer's perspective.

Egocentric Perspective

Techniques

Understanding a scene from the viewpoint of a camera or observer positioned within the environment.

EHR-Embedded AI Agent

Techniques

An AI system integrated directly into electronic health record software to assist clinicians with documentation or decision-making.

Eigenfunction

Techniques

A special function that remains proportional to itself when transformed by an operator, used to decompose system behavior.

Elastic Context Orchestration

Techniques

Dynamically adjusting the detail level and size of stored information based on current task relevance.

Elastic Modeling

Techniques

Simulating how deformable materials stretch, bend, and return to shape based on physical material properties.

Elastic Weight Consolidation

Techniques

A technique that protects weights important to previous tasks by adding a penalty term during learning that discourages large changes to them.
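
A minimal sketch of the penalty term, assuming the common formulation in which each weight's importance is an estimate of its diagonal Fisher information. The function name, argument names, and the numbers are illustrative only.

```python
def ewc_penalty(params, old_params, importance, strength):
    """Quadratic penalty: 0.5 * strength * sum(importance_i * (p_i - p_old_i)^2)."""
    return 0.5 * strength * sum(
        f * (p - p_old) ** 2
        for p, p_old, f in zip(params, old_params, importance)
    )


old_params = [0.0, 2.0]
importance = [4.0, 0.1]  # first weight mattered much more for the old task

# Moving the important weight by 1.0 is penalized heavily...
costly = ewc_penalty([1.0, 2.0], old_params, importance, strength=1.0)
# ...while moving the unimportant weight by 1.0 costs little.
cheap = ewc_penalty([0.0, 3.0], old_params, importance, strength=1.0)
```

In training, this penalty would be added to the new task's loss so that the optimizer prefers to change weights that mattered little for earlier tasks.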

ELBO (Evidence Lower Bound)

Techniques

A tractable lower bound on the log-likelihood of observed data, maximized as a training objective in probabilistic models such as variational autoencoders.

ELECTRA

Architecture

A pre-trained language model that learns by predicting which tokens in a sentence have been replaced, making it efficient and effective for downstream tasks.

Electric Vehicle Routing Problem with Time Windows (EVRPTW)

Techniques

Finding optimal delivery routes for electric vehicles that must visit customers within time windows and recharge at stations.

Electroencephalogram (EEG)

Techniques

A recording of electrical brain activity used to detect neurological conditions like seizures.

Electronic Health Records (EHRs)

Techniques

Digital records of patient medical history, diagnoses, medications, and clinical events stored in structured formats.

Embedded Device

Techniques

A specialized computing device with limited resources designed to run specific applications, often integrated into physical systems.

Embedding

Architecture

A dense numerical vector that represents a word, sentence, or concept in a high-dimensional space.

Embedding Clustering

Techniques

Organizing vector representations of tokens into groups based on their semantic similarity.

Embedding Dimension

Architecture

The size of the numerical vector produced by an embedding model; larger dimensions capture more detail but require more storage and computation.

Embedding Dimensions

Architecture

The number of numerical values used to represent a piece of text (for example, 1792), where more dimensions allow for more detailed semantic information to be captured.

Embedding Geometry

Techniques

The spatial structure and relationships between data points in a learned vector space.

Embedding Layer Learning Rate

Techniques

The learning rate specifically applied to the embedding layer, which can be scaled independently from other layers.

Embedding Model

Architecture

A model that converts text into numerical vectors that capture semantic meaning, allowing computers to understand and compare the similarity between different pieces of text.

Embedding Output

Formats

An output format in which the model produces dense numerical vectors that represent the semantic meaning of text, which can be used for similarity comparisons or as input to other models.

Embedding Perturbation

Techniques

Adding controlled noise to vector representations of text to obscure sensitive information.

Embedding Representation

Techniques

A numerical vector representation of text that captures semantic meaning for comparison and analysis.

Embedding Similarity

Techniques

A metric that measures how similar two pieces of content are by comparing their numerical vector representations.
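
One widely used choice of metric is cosine similarity, which compares the direction of two vectors and ignores their length. The sketch below is a plain-Python illustration; the example vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Vectors pointing the same way score close to 1 regardless of length, while unrelated (orthogonal) vectors score 0.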

Embedding Space

Architecture

A mathematical space where text is represented as vectors, allowing similar texts to be positioned close together and enabling operations like similarity search and clustering.

Embedding Strategies

Techniques

Different ways to represent words as vectors (semantic, acoustic, or phonetic).

Embedding-Based Deduplication

Techniques

Removing duplicate or near-duplicate examples by comparing their vector representations in embedding space.

Embedding-Based Matching

Techniques

Comparing semantic representations (embeddings) to find similar content without reprocessing raw data.

Embeddings

Architecture

Numerical representations of text that capture semantic meaning, allowing the model to measure similarity between different words or phrases.

Embodied AI

Behavior

AI systems designed to interact with and understand the physical world through robotic bodies or sensors, rather than just processing text.

Embodied Decision Routing

Techniques

The process of choosing which action a robot should execute next based on perceived state and task context.

Embodied Efficiency

Techniques

Real-world performance metrics for robots like task completion time, motion smoothness, and energy consumption.

Embodied Manipulation

Techniques

Robot learning and control for physical interaction tasks using integrated sensing and actuation.

Embodied Model

Training

An AI model trained on real-world physical interactions and sensor data from robots, rather than text or simulations alone.

Embodied Reasoning

Behavior

The ability to understand and reason about physical tasks and spatial relationships in the real world, not just abstract concepts.

Embodiment-agnostic

Techniques

A representation or model that works across different body types or physical forms without being specific to one.

Emergence

Techniques

The point during training when a model suddenly gains the ability to perform a task above a threshold accuracy.

Emergent Fitness

Techniques

A measure of solution quality that arises from system dynamics rather than being explicitly defined beforehand.

Emergent Misalignment

Techniques

When a model trained on narrow misaligned behavior generalizes to more severe harmful behaviors outside its training distribution.

Emotional Contagion

Techniques

The spread of emotions from one agent to others through interaction and observation.

Emotional Framing

Techniques

Using emotionally toned language or affective phrasing in prompts to influence model behavior.

Emotional Valence

Techniques

The positive or negative quality of an emotion, ranging from negative to positive.

Empathetic Alignment

Training

Training a model to recognize and respond to emotional context in conversations, prioritizing understanding and emotional connection over purely factual responses.

Empathy-Oriented Prompting

Techniques

Instructing an LLM to generate responses with emotional awareness and compassion for patient concerns.

Empirical Risk Minimization

Techniques

A learning principle that selects the model with the lowest average error (loss) on the observed training data.
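
As a toy illustration, the sketch below picks a threshold classifier from a small candidate set by minimizing the fraction of misclassified training examples (the 0-1 loss). The data, candidate thresholds, and helper names are all made up for the example.

```python
def empirical_risk(predict, data):
    """Average 0-1 loss of `predict` over (input, label) pairs."""
    return sum(predict(x) != y for x, y in data) / len(data)


def make_classifier(threshold):
    """Classifier that predicts 1 when the input reaches the threshold."""
    return lambda x: int(x >= threshold)


data = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
candidates = [0.0, 0.5, 1.0]

# Choose the candidate with the smallest empirical risk.
best_threshold = min(
    candidates, key=lambda t: empirical_risk(make_classifier(t), data)
)
best_risk = empirical_risk(make_classifier(best_threshold), data)
```

Low error on the observed data does not by itself guarantee low error on new data, which is why this principle is usually paired with regularization or held-out evaluation.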

Emulator

Techniques

A neural network trained to mimic the behavior of a complex physical model or simulation.

Encoder

Architecture

A model component that transforms input sequences (like protein amino acids) into meaningful numerical representations without generating new sequences.

Encoder Architecture

Architecture

A neural network component that transforms input text into a compressed numerical representation, focusing on understanding and extracting meaning rather than generating new text.

Encoder Component

Architecture

A model component designed to convert inputs (like images or text) into numerical representations for understanding, rather than generating new content.

Encoder Model

Architecture

A neural network that transforms input data into a compressed representation, rather than generating new text or making predictions.

Encoder-Based Models

Techniques

Models like RoBERTa that process text to understand meaning, typically used for classification tasks.

Encoder-Decoder

Architecture

A neural network architecture with two parts: an encoder that processes input text and a decoder that generates output text, allowing the model to transform one sequence into another.

Encoder-Only Architecture

Architecture

A neural network design that processes input text to understand and represent it, but cannot generate new text from scratch.

End Effector

Techniques

The tool or gripper at the end of a robot arm that physically interacts with objects in the environment.

End-Result Supervision

Techniques

Training data that only provides the final correct answer without showing the reasoning steps used to reach it.

End-to-End Driving

Techniques

An autonomous driving approach that directly maps sensor inputs to control outputs without explicit intermediate representations.

End-to-End Learning

Training

Training a model to solve a complete task directly from raw input (like document images) to final output, without breaking it into separate intermediate steps.

End-to-End Processing

Architecture

A system that takes raw input (like an image) and produces final output (like structured text) in one unified model, rather than chaining multiple separate tools together.

Energy Conserving Descent

Techniques

An optimization algorithm that preserves energy while descending to escape local minima.

Energy Function

Techniques

A function that assigns a scalar value to each point in a space, defining an unnormalized probability distribution.
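
A minimal sketch of how energies relate to probabilities, assuming the common convention that probability is proportional to exp(-energy) and that the space is a small finite set so the normalizer can be computed directly. The function name is illustrative.

```python
import math


def boltzmann_probabilities(energies):
    """Turn energies into probabilities via p_i = exp(-E_i) / sum_j exp(-E_j)."""
    weights = [math.exp(-e) for e in energies]  # unnormalized probabilities
    normalizer = sum(weights)                   # the partition function
    return [w / normalizer for w in weights]


uniform = boltzmann_probabilities([0.0, 0.0])
skewed = boltzmann_probabilities([0.0, 1.0])  # lower energy gets more probability
```

In high-dimensional spaces the normalizer is usually intractable, which is why the distribution is described as unnormalized.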

Engagement Patterns

Techniques

Recurring behaviors showing how users interact with content or systems over time.

Ensemble Distillation

Training

A training technique where knowledge from multiple models is combined and compressed into a single, smaller model for better efficiency.

Ensemble Methods

Techniques

Combining multiple models to make better predictions than any single model alone.

Ensemble Voting

Techniques

A safety technique that combines outputs from multiple models and selects the most agreed-upon result.
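
A simple form of this is majority voting over discrete labels. The sketch below uses Python's standard `collections.Counter`; the labels and the function name are illustrative.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common prediction and the share of models that agreed on it."""
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(predictions)


label, agreement = majority_vote(["safe", "unsafe", "safe"])
```

The agreement share can serve as a rough confidence signal. Ties are broken here by whichever label appeared first, so a real system may want an explicit tie-breaking rule.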

Ensemble Weights

Techniques

Probabilistic scores assigned to multiple documents that determine their relative contribution to the final answer.

Enterprise Language Model

Deployment

A language model specifically optimized for business and organizational use cases, prioritizing reliability, consistency, and professional output over other characteristics.

Entity Alignment

Techniques

The task of recognizing that different names or phrases refer to the same real-world concept, such as matching 'MI' with 'myocardial infarction'.

Entity Consistency

Techniques

Maintaining the same appearance and identity of characters, objects, and locations across different scenes in a video.

Entity Extraction

Techniques

Automatically identifying and pulling out specific names, places, or things from text.

Entity Linking

Techniques

The task of identifying mentions of real-world concepts in text and connecting them to their canonical definitions in a knowledge base or ontology.

Entity Matching

Techniques

The task of identifying when different text references refer to the same real-world concept, such as matching variant spellings of a drug name to a single clinical entity.

Entity-based QA

Techniques

A question-answering evaluation framework that tests whether models can retrieve factual information about specific entities.

Entity-Relational Model

Techniques

A data structure that represents entities (like users or devices) and the typed relationships between them.

Entropic Optimal Transport

Techniques

A regularized version of optimal transport that adds an entropy term to the objective, encouraging smoother, more balanced assignments between sources and destinations.

Entropy Gradient

Techniques

The gradient of prediction uncertainty with respect to visual embeddings, used to identify ambiguous regions.

Entropy Maximization

Techniques

Encouraging an agent to explore diverse state-action pairs by maximizing the entropy of its occupancy measure.

Entropy Shaping

Techniques

Controlling the randomness of a model's outputs to prevent it from becoming too deterministic or too random during training.

Entropy Sum Strategy

Techniques

A decoding approach that continues unmasking tokens until cumulative entropy exceeds a threshold, balancing generation speed and quality.

Entropy-Limited Operation

Techniques

System state where the ability to generate random numbers becomes the limiting factor rather than arithmetic computation.

Environment Generation

Techniques

Automated creation of task specifications and evaluation settings for training or testing agents.

Episodic Memory

Techniques

An AI system's ability to store and recall specific past events or experiences.

Epistemic Asymmetry

Techniques

A situation where different participants have different information or knowledge about the same topic.

Epistemic Integrity

Techniques

The preservation of an agent's ability to form accurate beliefs and maintain truthful internal representations.

Epistemic Orientation

Techniques

The degree to which discourse relies on evidence-based reasoning versus intuition and subjective belief.

Epistemic Uncertainty

Techniques

Uncertainty from lack of knowledge that can be reduced with more data or better models.

Equilibrium Internalization

Techniques

A phenomenon where the model learns to place its initial output near the fixed point, allowing inference without iteration.

Equivariant Graph Neural Networks

Techniques

Neural networks designed to respect geometric symmetries and transformations in molecular or crystal structures.

Ergonomic Compliance

Techniques

How well a design follows established principles for human comfort, safety, and efficient use of space.

Error Analysis

Techniques

Systematic examination of model failures to identify patterns and root causes beyond aggregate metrics.

Error Management

Techniques

Firmware algorithms that detect and correct errors in memory to maintain reliability as storage density increases.

Error Propagation

Techniques

How mistakes in early steps of a process accumulate and worsen downstream results.

Error Taxonomy

Techniques

A structured classification system that categorizes different types of errors to enable systematic analysis and mitigation.

Euler Characteristic

Techniques

A topological invariant that summarizes a shape's structure as an alternating sum of its numbers of connected components, holes, and voids.
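
For a surface built from vertices, edges, and faces, the invariant can be computed as vertices minus edges plus faces. The sketch below applies that formula to two well-known examples; the function name is illustrative.

```python
def euler_characteristic(vertices, edges, faces):
    """Euler characteristic of a polyhedral surface: V - E + F."""
    return vertices - edges + faces


cube = euler_characteristic(vertices=8, edges=12, faces=6)
# A 7-vertex triangulation of the torus has 21 edges and 14 triangular faces.
torus = euler_characteristic(vertices=7, edges=21, faces=14)
```

The cube (topologically a sphere) gives 2 and the torus gives 0, no matter how finely either surface is subdivided.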

Evaluation Faking

Techniques

When an evaluator systematically biases its judgments based on contextual information rather than actual content quality.

Evaluation Illusion

Techniques

When AI judges appear to agree on scores but are actually using shallow patterns rather than substantive reasoning about quality.

Evaluation Metric

Techniques

A quantitative measure used to assess how well a model or system performs on a specific task.

Evaluation Model

Evaluation

A specialized language model trained to assess and score the quality of outputs from other AI models, acting as an automated judge.

Evasion Attack

Techniques

An attack where an adversary modifies input features at test time to fool a deployed classifier.

Event Camera

Techniques

A sensor that captures pixel-level brightness changes asynchronously, producing sparse temporal event streams.

Event Curves

Techniques

Temporal representations that capture when and how much change occurs in music or video.

Event Inference

Techniques

Automatically detecting higher-level events from lower-level timestamped observations using logical rules.

Event Linking

Techniques

Grouping related incident reports together to identify a single underlying problem from multiple user descriptions.

Event Sourcing

Techniques

Recording all changes to data as a sequence of immutable events for full history tracking.

Event Template

Techniques

A generalized pattern representing a class of similar log messages with variable fields.

Evidence Aggregation

Techniques

Combining information from multiple frames or observations to make a single robust decision or diagnosis.

Evidence Contradiction

Techniques

When a model's answer directly contradicts the provided evidence or clinical guidelines.

Evidence Dependence

Techniques

A model's ability to change its predictions based on whether evidence supports or contradicts a claim.

Evidence Grounding

Techniques

Linking AI outputs to specific source documents or facts that support them.

Evidence Portfolio

Techniques

A collection of diverse, complementary pieces of evidence retrieved to support multi-faceted reasoning.

Evidence-Guided Repair

Techniques

Fixing errors in code or theory by using specific signals like test failures and reviewer feedback to target the root cause.

Evidential Fusion

Techniques

A method that combines multiple predictions while quantifying uncertainty using evidence theory.

Evol-Instruct

Training

A training method that gradually increases the complexity of instructions given to a model, helping it learn to handle increasingly difficult tasks.

Evolutionary Search

Techniques

An AI optimization technique that mimics natural selection to explore and improve solutions over many iterations.

Exchangeability

Techniques

A statistical property under which the joint distribution of the data points is unchanged when they are reordered, required for conformal prediction to provide valid guarantees.

Executable Code Reuse

Techniques

Saving and reusing working code solutions instead of text descriptions for repeated tasks.

Executable Environments

Techniques

Stateful, runnable systems that simulate real-world tool interactions and can verify agent actions.

Execution Diagnosis

Techniques

Detailed analysis of why an action succeeded or failed, beyond just binary success/failure signals.

Execution Grounding

Techniques

Anchoring AI-generated questions and explanations to actual runtime behavior and concrete execution traces.

Execution Plan

Techniques

A detailed strategy for solving a problem, which can be implemented and tested before committing to a final answer.

Execution trace

Techniques

A record of every step a program takes as it runs, including variable values and function calls.

Execution Trace Feedback

Techniques

Detailed information about what happened during a program's execution, used to diagnose failures.

Execution-Based Verification

Techniques

Validating agent behavior by running code and checking if outputs match expected results, rather than relying on static analysis.

Exogenous Variable

Techniques

A variable in a causal model that is not caused by any other variables in the model; represents external sources of randomness.

Expected Improvement

Techniques

An acquisition function that selects points likely to improve over the current best solution.
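
For a concrete picture, the sketch below computes the common closed-form expected improvement for a maximization problem, assuming the surrogate model's prediction at a candidate point is Gaussian with mean `mu` and standard deviation `sigma`. The function name and arguments are illustrative.

```python
import math


def expected_improvement(mu, sigma, best):
    """Closed-form EI for maximization under a Gaussian prediction N(mu, sigma^2)."""
    if sigma <= 0.0:
        # No uncertainty: the improvement is known exactly.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (mu - best) * cdf + sigma * pdf


# Two candidates with the same predicted mean: the more uncertain one scores higher.
print(expected_improvement(mu=1.0, sigma=2.0, best=1.0) >
      expected_improvement(mu=1.0, sigma=1.0, best=1.0))  # True
```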

Experiential knowledge

Techniques

Useful patterns and insights extracted from real-world interactions and deployment experience.

Experiential Learning

Techniques

Learning through direct interaction with the environment and feedback from actions taken.

Experimental Design

Techniques

Strategically choosing which experiments to run to maximize information gain given a limited budget.

Experimental Discovery

Techniques

The process of testing hypotheses through controlled experiments to uncover causal relationships.

Experimental Release

Deployment

An early version of a model released for testing and feedback, which may have bugs or incomplete features compared to stable versions.

Expert Importance

Techniques

A measure of how much each expert in an MoE model contributes to the final output, used to decide which experts need higher precision.

Expert Routing

Architecture

The mechanism in a mixture-of-experts model that decides which specialized sub-networks should process each piece of input.

Expert Specialization

Techniques

The process where different experts in an MoE learn to handle distinct types of inputs or tasks (e.g., code vs. math).

Expert Utilization

Techniques

How evenly the workload is distributed across experts; balanced utilization prevents some experts from being unused.

Explainability

Techniques

The ability to understand and interpret why an AI model made a specific decision or prediction.

Explanation Consistency

Techniques

Whether a model applies the same reasoning strategy (highlights the same regions) across different instances of the same class.

Explicit Thinking

Behavior

A mode where a model generates visible reasoning steps before producing a final answer, allowing you to see its problem-solving process.

Explicit Thinking Mode

Behavior

A feature that allows a model to show its reasoning process step-by-step before providing an answer, useful for complex problems that benefit from deliberate problem-solving.

Exploitability

Techniques

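The maximum gain a player can achieve by switching to a best response against a given strategy profile; it is zero at an equilibrium, so it measures how far a strategy is from one.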

Exploration

Techniques

The process of trying diverse actions during training to discover which ones lead to better outcomes.

Exploration-Exploitation Tradeoff

Techniques

Balancing between exploiting known good solutions and exploring new possibilities to find better ones.

Exponential Moving Average

Techniques

A weighted average that gives more importance to recent values than older ones.
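
A small sketch of the usual recursive form, where each new average is `alpha` times the newest value plus `(1 - alpha)` times the previous average. The function name and the default `alpha` are assumptions made for this example.

```python
def exponential_moving_average(values, alpha=0.5):
    """Return the running EMA of `values`; alpha is the weight on the newest value."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    averages = []
    ema = None
    for x in values:
        # The first value seeds the average; later values are blended in.
        ema = x if ema is None else alpha * x + (1.0 - alpha) * ema
        averages.append(ema)
    return averages


print(exponential_moving_average([10.0, 20.0, 30.0], alpha=0.5))  # [10.0, 15.0, 22.5]
```

A larger `alpha` reacts faster to recent values; a smaller one gives a smoother, slower-moving average.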

Expression Generalization

Techniques

A model's ability to handle facial expressions it wasn't explicitly trained on by learning underlying expression patterns.

Extended Context Processing

Architecture

The capability to work with and maintain understanding across large amounts of text or multiple documents during reasoning.

Extended Object Tracking

Techniques

Estimating both the position and shape of objects that occupy multiple sensor measurements.

Extended Reasoning

Behavior

A capability that allows a model to think through complex problems step-by-step internally before providing a final answer.

Extended Thinking

Techniques

A reasoning technique where a model works through a problem step-by-step internally before providing an answer, improving accuracy on complex tasks.

External Rewards

Techniques

Reward signals based on computational verification methods rather than the model's own internal signals.

External Validity

Techniques

Whether results from a controlled study apply to real-world situations outside the lab.

Extrapolation

Techniques

Predicting model behavior in a region (like very large training runs) based on observations from smaller regions.

Extrapolative Prediction

Techniques

Making predictions beyond the range of training data, such as forecasting system behavior at untested excitation levels.

F

Face Recognition

Techniques

Technology that identifies or verifies people by analyzing facial features in images.

Facility-Location Coverage

Techniques

An optimization technique that selects diverse items by maximizing how well they represent the full set of options.

Fact-checking

Techniques

The process of verifying claims against reliable sources to determine their accuracy.

Fact-checking without retrieval

Techniques

Verifying if claims are true using only an LLM's internal knowledge, without searching external databases.

Factored Norm

Techniques

A decomposition of norm computation into smaller intermediate terms to avoid materializing large dense matrices.

Factual Accuracy

Techniques

How often an AI model produces correct, verifiable information without errors or false claims.

Factual Grounding

Behavior

Anchoring a model's responses to verified, real-world information rather than relying solely on patterns learned during training.

Factual Recall

Techniques

An LLM's ability to accurately retrieve and output factual information from its training data.

Failure Domain

Techniques

A group of related system components or subsystems that share common failure modes and characteristics.

Failure Probability

Techniques

The quantified likelihood that an AI system will make a harmful or incorrect decision in real-world deployment.

Fairness Audit

Techniques

Systematic evaluation of an AI system to detect and measure bias across demographic groups or decision scenarios.

Faithfulness

Techniques

Whether an AI model's stated reasoning actually explains how it arrived at its answer, or if it's post-hoc justification.

Fake News Detection

Techniques

The task of identifying false or misleading news articles, typically framed as a classification problem.

False Memory Propagation

Techniques

When incorrect or outdated information from past interactions influences future reasoning.

False Premise Detection

Techniques

The ability to identify when a question contains incorrect assumptions or fabricated facts before answering.

Farthest-Point Sampling

Techniques

A greedy algorithm that selects points by always choosing the one farthest from previously selected points.
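
The greedy rule can be sketched in a few lines, assuming Euclidean distance and an arbitrary starting point (index 0 here). The function name and the `start` parameter are illustrative.

```python
import math


def farthest_point_sampling(points, k, start=0):
    """Greedily pick k indices, each the farthest from all points chosen so far."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [start]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return selected


pts = [(0, 0), (1, 0), (10, 0), (5, 0)]
print(farthest_point_sampling(pts, 3))  # [0, 2, 3]
```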

Fast Weight Update

Techniques

A method for efficiently updating model parameters or memory states during forward passes without full recomputation.

Fast Weights

Techniques

Model parameters that are quickly adapted during inference to capture task-specific or input-specific patterns.

Fault Localization

Techniques

Pinpointing the exact location of bugs or errors in code or systems.

Fault Propagation Graph

Techniques

A graph showing how errors flow through transformer components from their origin to observable symptoms.

Fault Tolerance

Techniques

The ability of a system to continue operating correctly even when components fail.

Feasibility Screening

Techniques

Automatically checking whether a problem instance has at least one valid solution before using it for testing.

Feature Augmentation

Techniques

Enhancing a model by adding hand-crafted or extracted features (like linguistic metrics) alongside learned representations.

Feature Caching

Techniques

Storing intermediate computed features during inference to reuse them in later steps, reducing redundant computation.

Feature Engineering

Techniques

The process of selecting and designing input features that a machine learning model uses to make predictions.

Feature Extraction

Behavior

The process of using a model to convert raw input text into numerical representations (features) that capture the meaning of the text.

Feature Fragmentation

Techniques

When a single concept is scattered across many separate features instead of being cleanly captured by one or a coherent group.

Feature Importance

Techniques

A measure of how much each input variable contributes to a model's predictions.

Feature Interaction

Techniques

How multiple input features combine together to influence a model's prediction, beyond their individual effects.

Feature Interaction Analysis

Techniques

A method to identify how combinations of input features jointly influence a model's predictions, beyond individual feature effects.

Feature Linear Separability

Techniques

A measure of how well different visual concepts can be distinguished in a model's learned feature space.

Feature-wise Linear Modulation (FiLM)

Techniques

A technique that dynamically adjusts learned representations by scaling and shifting features based on problem-specific conditions.
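
The core operation is an element-wise scale and shift. In the sketch below the scale (`gamma`) and shift (`beta`) are passed in directly as an assumption for illustration; in practice they would be produced by a small network from the conditioning input.

```python
def film(features, gamma, beta):
    """Scale and shift each feature channel: gamma[i] * features[i] + beta[i]."""
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("features, gamma, and beta must have the same length")
    return [g * x + b for x, g, b in zip(features, gamma, beta)]


print(film([1.0, 2.0, 3.0], gamma=[2.0, 0.0, 1.0], beta=[0.5, 1.0, -3.0]))
# [2.5, 1.0, 0.0]
```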

Federated Learning

Techniques

Training models across multiple devices without centralizing sensitive data in one place.

Feed-Forward Network (FFN)

Techniques

A standard neural network layer in transformers that processes information independently at each position.

Feed-forward transformer

Techniques

A neural network that processes input in a single forward pass without recurrence or iterative refinement.

Feedback Model

Techniques

The method used to apply feedback text to refine and improve a search query representation.

Feedback Source

Techniques

Where the text used to improve a search query comes from, such as LLM-generated text or actual documents.

Feedback-Driven Control

Techniques

Using execution results and error signals to adaptively adjust agent behavior and improve reliability over time.

Few-shot Learning

Techniques

Training or prompting a model with only a small number of examples to perform a new task.

Fidelity

Performance

The degree to which a quantized or compressed model preserves the quality and accuracy of the original full-precision model.

Fidelity gate

Techniques

A filtering mechanism that only includes accurately generated entity appearances in consistency evaluation metrics.

Fidelity Metric

Techniques

A measure of how well an explanation captures the true reasoning of a model by testing prediction changes.

Fill-in-the-Middle

Techniques

A code completion technique where the model predicts missing code between existing lines, rather than only generating code forward from a starting point.

Fine-grained Classification

Techniques

Distinguishing between very similar categories, like telling apart different bird species rather than just identifying 'bird vs. not bird'.

Fine-Grained Text Rendering

Performance

The ability to accurately generate readable text and small details within generated images.

Fine-Grained Visual Details

Behavior

Small, specific visual elements in an image, such as text within a photo or subtle differences between similar objects.

Fine-Tunable

Training

The ability to further train or customize a pre-trained model on your own data to adapt it for specific tasks or domains.

Fine-Tune

Training

A model created by training an existing pre-trained model on new data to specialize it for specific tasks or behaviors.

Fine-Tuned

Training

A pre-trained model further trained on a smaller, task-specific dataset to improve performance on that task.

Fine-Tuning

Training

The process of further training a pre-trained model on new data to adapt it for specific tasks or domains.

Finite Element Method (FEM)

Techniques

A numerical technique that breaks a complex domain into small pieces to solve physics equations approximately.

Finite fields

Techniques

Mathematical structures with finitely many elements where arithmetic operations follow specific rules.

Finite Horizon

Techniques

A problem setting with a fixed, known endpoint in time, as opposed to indefinite or infinite-horizon problems.

Firing-Rate Neural Network

Techniques

A recurrent neural network model where neurons output continuous activation rates rather than discrete spikes.

First-order Logic

Techniques

A formal language for expressing rules and constraints using predicates, variables, and logical operators.

First-Passage Time

Techniques

The time it takes for a stochastic process to reach a target state for the first time.

First-Stage Retriever

Techniques

The initial search system that finds candidate documents before refinement techniques are applied.

Fisher Discrepancy

Techniques

A metric measuring the difference between score functions of two distributions.

Fitted Dynamic Programming

Techniques

A variant of dynamic programming that first estimates unknown functions (like demand) from data, then uses those estimates for optimization.

Fixation

Techniques

A moment when the eye pauses on a specific location while viewing an image, typically lasting 100-500 milliseconds.

Fixed-point iteration

Techniques

Repeatedly applying a function until it converges to a stable value, used here for test-time computation in looped models.
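
A generic sketch of the iteration, shown on the textbook example of solving x = cos(x) rather than on a looped model. The tolerance, iteration cap, and function name are assumptions for this example.

```python
import math


def fixed_point(f, x0, tol=1e-10, max_iter=1000):
    """Apply f repeatedly until successive values differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


root = fixed_point(math.cos, 1.0)
print(abs(root - math.cos(root)) < 1e-9)  # True: root is (nearly) unchanged by cos
```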

Fixed-Point Solving

Techniques

Finding a stable state where a function's output equals its input, used here to refine embeddings iteratively.

Fixed-Size Embeddings

Architecture

Embeddings that always produce vectors of the same length regardless of input length, which limits how much detail can be captured for very long documents.

Flagship Model

Behavior

A company's primary, most capable model designed to showcase their best technology and handle the most demanding use cases.

Flash Translation Layer

Techniques

Software abstraction that maps logical addresses to physical memory locations in SSDs, managing wear and errors.

Flexible Spectrum Access

Techniques

Dynamically allocating wireless frequencies based on real-time demand instead of fixed assignments.

Floorplanning

Techniques

The process of deciding where to place components on a chip to meet design constraints and performance goals.

Flow Based Generation

Techniques

Generating data by learning reversible transformations between simple and complex distributions.

Flow Matching

Techniques

A generative modeling technique that learns to transform random noise into realistic data by following learned flow paths.

fMRI

Techniques

Functional magnetic resonance imaging; a non-invasive technique measuring brain activity through blood flow changes.

Focal-Contrastive Fine-tuning

Techniques

A training approach combining focal loss (which focuses on hard examples) with contrastive learning to handle imbalanced datasets.

Foley

Techniques

Custom sound effects created to match specific actions or movements in video, like footsteps or door slams.

Forgetting Factor

Techniques

A parameter that controls how quickly a filter discounts old data, balancing between adapting to new conditions and maintaining stability.

Fork Verification

Techniques

Testing reward hypotheses by branching from shared policy checkpoints and comparing short-horizon performance to assess reward quality.

Formal Specification

Techniques

Expressing system requirements or policies in a precise mathematical language that tools can automatically verify.

Formal Verification

Techniques

Mathematical proof that a system meets its specifications, here implemented in Lean 4 to certify material stability predictions.

Formative Feedback

Techniques

Real-time guidance given to students during learning to help them improve, rather than just assigning a final grade.

Forward Dynamics Propagation

Techniques

Simulating a robot's future states by repeatedly applying its dynamics model to predict outcomes of candidate actions.

Forward KL Divergence

Techniques

A training objective that penalizes the model for assigning low probability to regions the true distribution covers, which pushes the model to cover all of the data's modes.

Forward Pass

Architecture

A single computation cycle where input data flows through the model's layers to produce an output prediction.

Forward-looking Intent

Techniques

An agent's reasoning about future consequences and goals rather than just reacting to past events.

Forward-Mode Automatic Differentiation

Techniques

An efficient method for computing derivatives by propagating changes forward through a computation graph.
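
One common way to implement the idea is with dual numbers, which carry a value and its derivative through every operation. The minimal `Dual` class below supports only addition and multiplication and is a sketch for illustration, not a full autodiff library.

```python
class Dual:
    """A value paired with its derivative with respect to the input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u v' + u' v
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__


def derivative(f, x):
    """Evaluate f'(x) by seeding the input's derivative with 1."""
    return f(Dual(x, 1.0)).deriv


def poly(x):
    return 3 * x * x + 2 * x + 1  # derivative is 6x + 2


print(derivative(poly, 2.0))  # 14.0
```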

Foundation Model

Architecture

A large pre-trained model that serves as a starting point for building other models, rather than being trained from scratch.

Foundation Model Architecture

Architecture

The underlying structural design of a neural network that determines how it processes and learns from data, distinct from standard transformer designs.

Foundation Models

Techniques

Large pre-trained AI models that can be adapted to many different tasks without starting from scratch.

Fourier Domain

Techniques

Mathematical representation showing which frequencies (periodic patterns) are present in data.

FP16 Precision

Formats

A data format that stores model weights using 16-bit floating-point numbers, preserving nearly all of a model's accuracy while using half the memory of 32-bit formats.

FP4 (4-bit Floating Point)

Formats

A low-precision numerical format that uses only 4 bits to represent numbers, enabling faster computation and smaller model sizes compared to standard 32-bit precision.

FP4 Floating Point

Formats

A 4-bit number format used in quantization that represents values with minimal precision, significantly shrinking model size while maintaining reasonable accuracy.

FP4 Format

Formats

A 4-bit floating-point number format that represents model weights with very low precision, enabling extremely efficient inference on compatible hardware.

FP4 Precision

Formats

An ultra-low-precision format using 4-bit floating-point numbers to represent model weights, enabling extreme compression.

FP4 Quantization

Formats

A compression technique that represents model weights using only 4-bit floating-point numbers instead of larger formats, reducing memory usage and speeding up inference.

FP8 (8-bit Floating Point)

Formats

A compressed number format that uses 8 bits instead of the standard 32 bits, dramatically shrinking model size at the cost of slightly reduced precision.

FP8 Dynamic Quantization

Techniques

A specific quantization method that uses 8-bit floating-point numbers and adjusts precision dynamically based on the data being processed, balancing speed and accuracy.

FP8 Floating Point

Formats

An 8-bit numerical format that stores numbers with reduced precision compared to standard formats, enabling smaller model sizes and faster computation.

FP8 Precision

Formats

A data format that stores numbers using 8 bits instead of the standard 32 bits, significantly reducing memory requirements with minimal quality loss.

FP8 Quantization

Formats

A compression technique that reduces model size by representing weights using 8-bit floating-point numbers instead of higher precision, making it faster and more memory-efficient.

Fractal Attractor

Techniques

A set that an optimization trajectory converges to, with self-similar structure at multiple scales rather than converging to a single point.

Free-Text Generation

Techniques

A model's ability to produce answers without predefined options, requiring genuine recall and reasoning.

Frequency Distribution

Techniques

How often different facts or tokens appear in training data, which affects what models learn.

Frequency Separation

Techniques

Decomposing signals into high-frequency (details, edges) and low-frequency (overall structure, semantics) components.

Frequency-Stratified Evaluation

Techniques

Evaluating model performance separately for rare, medium, and common classes to reveal patterns hidden by overall metrics.

Frontend Generation

Behavior

The automated creation of user interface code and visual elements based on descriptions or specifications.

Frontier Model

Evaluation

A state-of-the-art AI model representing the cutting edge of what's currently possible in terms of capability and performance.

Frontier Models

Evaluation

State-of-the-art, cutting-edge AI models that represent the current best performance in the field.

Frontier-Class

Performance

A model that represents the current state-of-the-art or cutting edge in AI capabilities, competing with the most advanced models available.

Frontier-Scale Models

Architecture

The largest and most advanced language models available, representing the cutting edge of AI capabilities.

Frontier-Tier Model

Performance

A cutting-edge AI model representing the current state-of-the-art in performance and reasoning capabilities.

Frozen Encoder

Techniques

A pre-trained model component that is kept unchanged during training to preserve its learned knowledge.

Full-Precision

Formats

A model using standard 32-bit floating-point numbers to represent weights, providing maximum accuracy but requiring more memory.

Full-Precision Weights

Deployment

Model parameters stored at maximum numerical accuracy (typically 32-bit floating point), which provides the best quality but requires more memory and computation.

Function Calling

Behavior

The ability of a model to output structured requests to invoke external tools or APIs rather than generating free-form text.

Function Vector Representations

Techniques

Internal model representations that encode what tasks do, allowing comparison of task similarity and prediction of learning trajectories.

Function Vectors

Techniques

Vector representations of tasks extracted from model activations during in-context learning.

Function-Preserving Expansion

Techniques

Growing a model's capacity while mathematically guaranteeing it behaves identically to the original at the start.

Function-preserving Transforms

Techniques

Mathematical operations like rotations that rearrange a model's weights without changing what the model computes.

Functional Requirements

Techniques

Specifications describing what a software system should do and its specific behaviors and features.

Functional Token

Techniques

A discrete token that encodes both an agentic operation and latent visual reasoning capability without explicit visual supervision.

Funnel Attention

Architecture

An attention mechanism that progressively compresses and simplifies the input sequence, reducing computational cost while maintaining important information.

Fused Kernels

Techniques

GPU operations combined into a single kernel to reduce memory traffic and improve computational efficiency.

Fuzzy Rules

Techniques

Logic-based rules that handle uncertainty and gradual membership rather than strict true/false classifications.

Fuzzy String Matching

Techniques

Comparing text strings by measuring character-level similarity rather than exact matches.
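
A short sketch using Python's standard-library `difflib.SequenceMatcher`, whose `ratio()` returns a similarity between 0 and 1. Lowercasing the inputs first is a choice made for this example, as is the `similarity` helper name.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a case-insensitive similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


print(similarity("GPT-NeoX", "gpt-neox"))                  # 1.0
print(similarity("quantization", "quantisation") > 0.8)    # True
print(similarity("abc", "xyz"))                            # 0.0
```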

G

Gain Modulation

Techniques

A mechanism where a context signal scales the magnitude of state-dependent responses without changing their underlying structure.

Game Description Language

Techniques

A formal notation for encoding game rules so different AI systems can play the same game consistently.

Game-Theoretic Equilibrium

Techniques

A stable state where no agent can improve their outcome by unilaterally changing their strategy.

Gated Correction

Techniques

A learned mechanism that selectively applies corrections to predictions based on per-dimension scaling factors.

Gateway Neuron

Techniques

A neuron that controls whether tokens are routed to standard or exception processing paths.

Gating Mechanism

Techniques

A learned or rule-based function that selectively enables or disables components based on input conditions.

Gauge Invariance

Techniques

A mathematical property ensuring a model's predictions remain consistent regardless of arbitrary coordinate system choices or numerical representations.

Gaussian Process

Techniques

A statistical model that learns patterns from data and provides uncertainty estimates for predictions.

Gender Bias

Techniques

Systematic tendency of models to favor one gender over others in language generation and translation tasks.

General Reasoning

Behavior

The capability to think through problems logically, break down complex questions, and arrive at conclusions across a wide variety of topics.

General-Purpose

Behavior

Designed to handle a wide variety of different tasks rather than being specialized for one specific domain.

General-Purpose Language Model

Architecture

A model trained to handle a wide variety of text tasks—like writing, answering questions, and reasoning—rather than being specialized for one specific task.

General-Purpose Model

Behavior

An AI model designed to handle many different types of tasks well, rather than being specialized for one specific domain.

Generalist Model

Behavior

A model trained to perform well across many different types of tasks rather than being specialized for one specific domain.

Generalist Robot

Techniques

A robot trained to perform many different everyday tasks rather than being specialized for one specific job.

Generalization

Performance

A model's ability to perform well on new, unseen data that differs from what it was trained on.

Generalization Error

Techniques

The difference between a model's performance on training data versus unseen test data.

Generalized Procrustes Algorithm

Techniques

A mathematical method for aligning and comparing representations across different neural networks by finding optimal rotations.

Generate-then-Answer (GtA)

Techniques

An inference approach where a model generates an intermediate image before answering a question about it.

Generative Embeddings

Techniques

Vector representations of text created by generative language models that capture semantic meaning.

Generative Flow Networks (GFlowNets)

Techniques

A probabilistic framework that generates samples with probability proportional to a reward function, useful for optimization tasks like molecule discovery.

Generative Language Model

Architecture

A model trained to generate new text by predicting the next word or sequence of words based on patterns it learned during training.

Generative Model

Techniques

An AI model trained to create new data (like images) that resembles its training data.

Generative Post-training

Techniques

Additional training phase after initial pretraining that uses generative tasks to improve model capabilities.

Generative Process

Techniques

A model's procedure for creating new outputs (like floor plans) based on learned patterns from training data.

Generative Safety

Techniques

A methodology that grows phenomena from micro-level interaction conditions to identify sufficient mechanisms, detect thresholds, and design safety interventions.

Generator matrices

Techniques

Matrices used to encode data into codewords in error-correcting codes.

Geodesic Distance

Techniques

The shortest path between two points along a curved surface, as opposed to straight-line distance.
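
On a sphere, the geodesic between two points is an arc of a great circle. The sketch below uses the haversine formula; the function name and the unit-radius default are assumptions for this example.

```python
import math


def great_circle_distance(lat1, lon1, lat2, lon2, radius=1.0):
    """Geodesic distance on a sphere between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # min() guards against rounding pushing the value slightly above 1.
    return 2 * radius * math.asin(math.sqrt(min(1.0, a)))


# Equator to the north pole on a unit sphere: a quarter of a great circle.
along_surface = great_circle_distance(0, 0, 90, 0)
straight_line = math.sqrt(2)  # chord cutting through the sphere
print(along_surface > straight_line)  # True
```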

Geographic Plausibility

Techniques

Checking that spatial analysis results are realistic (e.g., no negative distances, valid coordinate ranges, sensible geographic relationships).

Geometric Algebra

Techniques

A mathematical framework (Clifford algebras) that extends vectors with operations for rotations, reflections, and higher-dimensional relationships.

Geometric Biases

Techniques

Structural constraints added to a model to encode domain knowledge about geometry, such as crystal lattice properties.

Geometric Consistency

Techniques

Maintaining structural and spatial accuracy across multiple views or representations of a 3D object.

Geometric Coupling

Techniques

The alignment between router weight directions and expert weight directions that emerges during training.

Geometric Reconstruction

Techniques

Building a 3D model of a scene from video or images by estimating depth and camera motion.

Geometric Separability

Techniques

Property where data points can be separated into groups using a linear boundary in vector space.

Geometry-Grounded Tokens

Techniques

Multimodal representations that preserve spatial and geometric information about the scene to maintain disambiguating context.

Geospatial Analytics

Techniques

Using machine learning and statistics to analyze data tied to geographic locations.

GGUF

Formats

A file format for quantized models designed for efficient CPU and GPU inference with llama.cpp.

GGUF Format

Formats

A file format designed for efficient storage and loading of large language and embedding models, optimized for fast inference on various hardware.

Girsanov Change of Measure

Techniques

A mathematical technique for reweighting probability distributions along trajectories without computing gradients.

Global Attention

Techniques

Attention mechanism where each token can attend to all preceding tokens in the sequence.

Global Majority

Techniques

Populations and nations that represent the numerical majority of the world but are historically marginalized in Western-dominated systems.

Goal Drift

Techniques

When an AI agent gradually abandons its original objective and pursues different goals instead.

Goal Embedding

Techniques

A low-dimensional vector that captures task identity and enables rapid adaptation to new tasks without retraining.

Goal Misspecification

Techniques

When an AI system's stated objective doesn't match the actual intended outcome, leading to unintended behaviors.

Governance Constraints

Techniques

Rules and policies that limit AI autonomy to ensure oversight, safety, and alignment with organizational values.

Governance Framework

Techniques

A set of rules and structures that constrain and guide AI behavior to ensure reliability and consistency.

GPL-3.0 License

Licensing

An open-source license that allows free use and modification of software, but requires any distributed derivative works to be released under the same license.

GPT Architecture

Architecture

A transformer-based neural network design that processes text sequentially and predicts the next word based on previous context.

GPT-2 Architecture

Architecture

A transformer-based neural network design from OpenAI that processes text sequentially to predict and generate the next word in a sequence.

GPT-2 Architecture

Architecture

An older transformer-based design for language models that generates text by predicting one word at a time, simpler and smaller than modern alternatives.

GPT-2 Variant

Architecture

A modified version of the GPT-2 architecture that changes the original design, such as by reducing size or adjusting training.

GPT-3-Style Architecture

Architecture

A transformer-based design that follows the same structural principles as OpenAI's GPT-3 model, using layers of attention mechanisms to process text.

GPT-Family Architecture

Architecture

A class of transformer-based language models descended from the original GPT design, characterized by autoregressive text generation and broad general-purpose capabilities.

GPT-J Architecture

Architecture

The transformer-based design of EleutherAI's open-source GPT-J model, which uses self-attention to process and generate text.

GPT-NeoX

Architecture

An open-source large language model architecture based on the GPT design, created as an alternative to closed-source models.

GPT-NeoX Architecture

Architecture

An open-source transformer-based architecture designed for training large language models, similar in structure to GPT models.

GPT-Style Architecture

Architecture

A neural network design based on transformer technology that processes text sequentially and generates one word at a time.

GPTQ

Formats

A quantization technique that compresses model weights to lower precision, reducing file size and memory requirements while maintaining reasonable performance.

GPU Allocation

Techniques

Assigning GPU resources to different models or tasks to optimize throughput and latency.

GPU Contention

Techniques

Performance degradation that occurs when multiple inference requests compete for the same GPU's memory and compute resources.

GPU Memory

Deployment

The high-speed memory on a graphics processor used to store and process model weights and computations during inference.

GPU Optimization

Deployment

Designing and tuning a model to run efficiently on graphics processing units (GPUs), which are specialized hardware that accelerates AI computations.

Gradient Alignment

Techniques

A technique ensuring that gradient updates from different tasks point in compatible directions to avoid conflicts.

Gradient Approximation

Techniques

Estimating how model parameters should change without actually computing full gradients or updates.

Gradient-Based Optimization

Techniques

Improving model performance by following the direction of steepest improvement in parameters.

Gradient Bias

Techniques

Systematic error in gradient estimates that prevents optimization from reaching the true optimum.

Gradient Boosting

Techniques

Building models sequentially where each new model corrects errors from previous ones.

Gradient Clipping

Techniques

Limiting the magnitude of gradients during training to prevent extreme updates and improve stability.
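
As a minimal sketch of one common variant (clipping by global L2 norm), the hypothetical helper below rescales a flat list of gradient values whenever their norm exceeds a chosen threshold. The function name and the example numbers are illustrative assumptions, not taken from any particular library.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale gradient values so their combined L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)  # already small enough, leave unchanged
    scale = max_norm / norm
    return [g * scale for g in grads]

# The gradient [3, 4] has norm 5, so it is scaled down to norm 1.
clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
# A gradient that is already within the limit is returned as-is.
unchanged = clip_by_global_norm([0.3, 0.4], max_norm=1.0)
```

The direction of the update is preserved; only its length is capped.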

Gradient Communication

Techniques

Sending model weight updates between devices and servers during distributed training, a major bottleneck on bandwidth-limited networks.

Gradient Compression

Techniques

Reducing the size of gradient data to speed up training on distributed systems.

Gradient Conflict

Techniques

When different training objectives pull model updates in opposing directions, causing optimization to fail or degrade.

Gradient Normalization

Techniques

Scaling gradient values to maintain consistent learning rates across different parameter groups or layers.

Gradient Reversal

Techniques

A training technique that flips gradient signs to force a model to learn features that fool an adversarial classifier.

Gradient Surgery

Techniques

Technique that selectively modifies or blocks gradient flow to prevent interference between different learning objectives.

Gradient-Based Explanation (GradCAM)

Techniques

An explainability technique that uses model gradients to identify which input features most influence predictions.

Gradient-Based Initialization

Techniques

Setting starting values for trainable parameters using information from model gradients to improve convergence and final performance.

Gradient-Free Optimization

Techniques

Optimizing a function without computing gradients, using only function values or rankings.

Grammatical Error Correction

Behavior

A task where a model identifies and fixes grammar, spelling, and syntax mistakes in written text.

Grammatical Gender

Techniques

A linguistic system where nouns and related words are classified into categories requiring specific agreement patterns.

Graph Attention

Techniques

An attention mechanism that learns weighted interactions between nodes in a graph structure.

Graph Classification

Techniques

The task of assigning a label or category to an entire graph based on its structure and node features.

Graph Domain Adaptation

Techniques

Transferring knowledge from a labeled source graph to an unlabeled target graph when their structures or distributions differ.

Graph Edit Distance (GED)

Techniques

A measure of how different two graphs are, based on the minimum edits needed to transform one into the other.

Graph Encoding

Techniques

Converting a graph structure into a compact text representation that preserves its properties.

Graph Neural Network

Techniques

A neural network that operates on graph-structured data by passing messages between connected nodes to learn relational patterns.

Graph Neural Networks

Techniques

Neural networks designed to process graph-structured data by learning representations of nodes and edges.

Graph Representation Learning

Techniques

Methods for converting graph structures into numerical vectors that preserve meaningful information about nodes and edges.

Graph-Based Multistep Process

Techniques

A structured approach that represents events and their relationships as a graph and processes them in sequential stages.

Greedy Decoding

Techniques

Generating text by always selecting the highest-probability next token, without exploring alternatives.
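
The loop below is a toy sketch of this idea. The lookup table `TABLE` and the function `toy_model` are made-up stand-ins for a real language model, which would return a probability for every token in its vocabulary.

```python
def greedy_decode(next_token_probs, start, max_steps, stop_token):
    """Repeatedly append the single most probable next token."""
    tokens = [start]
    for _ in range(max_steps):
        probs = next_token_probs(tokens)      # dict: token -> probability
        best = max(probs, key=probs.get)      # highest-probability token
        tokens.append(best)
        if best == stop_token:
            break
    return tokens

# Hypothetical toy "model": the next token depends only on the last one.
TABLE = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "</s>": 0.2},
    "cat": {"</s>": 0.9, "the": 0.1},
}

def toy_model(tokens):
    return TABLE[tokens[-1]]

result = greedy_decode(toy_model, "<s>", max_steps=10, stop_token="</s>")
# result == ["<s>", "the", "cat", "</s>"]
```

Because the top choice is always taken, the same input always produces the same output, and lower-probability continuations such as "dog" are never explored.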

Ground Truth

Techniques

Accurate reference labels or measurements used to train and evaluate machine learning models.

Ground Truth Factors

Techniques

The actual underlying causes or features that explain observed data in a system.

Grounded Reasoning

Techniques

AI reasoning that relies on specific documents or data provided to the model, rather than just its training knowledge.

Grounding

Behavior

The practice of ensuring a model's responses are based on and supported by provided source documents rather than generated from general knowledge.

Group Entropy

Techniques

A generalized measure of uncertainty or disorder that follows mathematical group rules, extending beyond standard entropy.

Group Relative Policy Optimization

Techniques

A reinforcement learning method that samples a group of outputs for the same prompt and rewards each one relative to the rest of the group, often used to improve model reasoning.

Group Size

Deployment

In quantization, the number of weights that share a single scaling factor; smaller groups preserve more precision but use more memory, while larger groups save more memory but may lose detail.
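
To make the trade-off concrete, here is a simplified sketch of symmetric quantization with one scaling factor per group. The function names and sample weights are assumptions for illustration; real quantization formats differ in details such as zero points and rounding.

```python
def quantize_groupwise(weights, group_size, bits=4):
    """Quantize weights to signed integers, one scale per group."""
    qmax = 2 ** (bits - 1) - 1  # largest positive code, 7 for 4 bits
    codes, scales = [], []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        max_abs = max(abs(w) for w in group)
        scale = max_abs / qmax if max_abs > 0 else 1.0
        scales.append(scale)
        codes.extend(round(w / scale) for w in group)
    return codes, scales

def dequantize_groupwise(codes, scales, group_size):
    """Recover approximate weights from integer codes and group scales."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]

def max_error(original, restored):
    return max(abs(a - b) for a, b in zip(original, restored))

# Small weights followed by large ones: a shared scale hurts the small ones.
weights = [0.1, -0.2, 0.05, 0.7, 8.0, -4.0, 2.0, 1.0]

codes_small, scales_small = quantize_groupwise(weights, group_size=4)
codes_big, scales_big = quantize_groupwise(weights, group_size=8)

restored_small = dequantize_groupwise(codes_small, scales_small, 4)
restored_big = dequantize_groupwise(codes_big, scales_big, 8)

# Error on the first four (small) weights under each setting.
err_small = max_error(weights[:4], restored_small[:4])
err_big = max_error(weights[:4], restored_big[:4])
```

With a group size of 4 there are two scales to store but the small weights are reconstructed closely; with a group size of 8 only one scale is stored and the small weights lose most of their detail.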

Group-Wise Quantization

Techniques

Reducing model size by quantizing weights in small groups that each share their own scaling factor, rather than using a single scale for an entire tensor.

Group-Level Simulation

Techniques

Predicting aggregate behavior of a group of users rather than individual users, useful for testing business strategies.

Grouped-Query Attention

Architecture

An optimization technique that reduces memory usage and speeds up inference by having multiple query heads share the same key and value heads instead of each having their own.
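
The sharing pattern can be sketched with simple index arithmetic. The helper names and the head counts below are illustrative assumptions, not the configuration of any specific model.

```python
def kv_head_for_query_head(q_head, num_q_heads, num_kv_heads):
    """Return which key/value head a given query head reads from."""
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("query heads must divide evenly into KV heads")
    heads_per_group = num_q_heads // num_kv_heads
    return q_head // heads_per_group

def kv_cache_entries(num_kv_heads, head_dim, seq_len, num_layers):
    """Count cached numbers: one key and one value vector per head, token, layer."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len

# 8 query heads sharing 2 key/value heads: 4 query heads per group.
mapping = [kv_head_for_query_head(h, 8, 2) for h in range(8)]

# Cache size with one KV head per query head versus grouped sharing.
full = kv_cache_entries(num_kv_heads=8, head_dim=64, seq_len=1024, num_layers=4)
grouped = kv_cache_entries(num_kv_heads=2, head_dim=64, seq_len=1024, num_layers=4)
```

In this example the key/value cache shrinks by a factor of four, which is where most of the memory saving during inference comes from.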

GRPO

Techniques

Group Relative Policy Optimization, a reinforcement learning algorithm for fine-tuning language models with reward signals.

Guardrails

Behavior

Safety mechanisms built into a model to refuse harmful requests or prevent it from generating unsafe content.

GUI Agent

Techniques

An AI system that interacts with computer interfaces by clicking, typing, and navigating screens.

GUI Grounding

Behavior

The ability to identify and locate specific elements (like buttons or text fields) within a graphical user interface based on natural language descriptions.

Guidance

Techniques

A technique to steer AI generation toward desired outputs by providing additional control signals during inference.

Guidance Mechanism

Techniques

A technique that steers a model's output toward desired behavior by balancing multiple objectives during inference.

Guided Decoding

Techniques

Steering a model's text generation process using external signals or constraints without modifying the model itself.

Guided In-Sample Selection (GIST)

Training

A training technique that intelligently selects the most informative examples from your training data to improve model efficiency and performance.

Gumbel-Softmax Sampling

Techniques

A differentiable relaxation technique that approximates discrete choices to enable gradient-based optimization.
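
A rough sketch of the sampling step, using only the standard library: Gumbel noise is added to each logit and the result is passed through a temperature-scaled softmax. The function name and inputs are assumptions for illustration; in practice this runs inside an automatic-differentiation framework so gradients can flow through the output.

```python
import math
import random

def gumbel_softmax(logits, temperature, rng):
    """Return a soft, nearly one-hot sample over the given logits."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)          # avoid log(0)
        gumbel = -math.log(-math.log(u))      # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)
    top = max(noisy)                          # subtract max for stability
    exps = [math.exp(z - top) for z in noisy]
    total = sum(exps)
    return [e / total for e in exps]

sample = gumbel_softmax([2.0, 1.0, 0.1], temperature=0.5, rng=random.Random(0))
```

Lower temperatures push the output closer to a hard one-hot choice, while higher temperatures spread the weight more evenly across options.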

H

Hallucination

Behavior

When a model generates plausible-sounding but factually incorrect or fabricated information.

Hallucination Detection

Evaluation

The ability to identify when a model generates false or unsupported information that isn't grounded in the provided source material.

Hamilton-Jacobi-Bellman Equation

Techniques

A partial differential equation whose solution gives the best achievable value of a decision-making problem that unfolds continuously over time.

Hamiltonian Path

Techniques

A path through a graph that visits every node exactly once.

Hamiltonian Simulation

Techniques

A quantum computing technique that simulates the evolution of a physical system described by a Hamiltonian.

Handwriting Recognition

Behavior

The ability of a model to identify and interpret handwritten characters and words from images, accounting for variations in writing style and quality.

Hard Constraint

Techniques

A rule that must always be satisfied during optimization, rather than being treated as a soft penalty that can be violated.

Hard Negatives

Training

Challenging negative examples that are similar to the target but still incorrect, used during training to make the model learn more nuanced distinctions.

Hardware Optimization

Deployment

Tuning a model's design or training to run more efficiently on specific hardware (like NVIDIA GPUs), reducing memory usage and inference time.

Harm Taxonomy

Training

A structured system that categorizes different types of harmful content (like violence, hate speech, or misinformation) so a model can recognize and classify them.

Harmonic Reasoning

Architecture

An architecture that alternates between thinking (reasoning about a problem) and acting (taking physical steps), allowing the model to plan and execute robot actions iteratively.

Harness Engineering

Techniques

The design and implementation of control systems that manage agent behavior and task execution.

Hazard Analysis

Techniques

Systematic process of identifying potential failures and dangerous scenarios in a system.

Hazard Function

Techniques

The instantaneous rate of an event occurring at a given time, conditional on survival up to that time.

Head-wise Causal Intervention

Techniques

Systematically disabling individual attention heads to determine which ones are causally responsible for specific model behaviors.

Hermite Expansions

Techniques

Mathematical technique to approximate probability distributions using orthogonal polynomials.

Hessian Spectrum

Techniques

The complete set of eigenvalues of the loss function's second-derivative matrix, describing the curvature in all directions.

Heterogeneous Preferences

Techniques

Systematic differences in how different groups (by language, task, etc.) rank or prefer models.

Heterogeneous Treatment Effects (HTE)

Techniques

Differences in how a treatment affects different individuals based on their characteristics.

Heuristic

Techniques

A practical problem-solving method that finds good solutions quickly without guaranteeing optimality.

Hidden Dimension

Architecture

The size of the internal vector representation used by a neural network to process and store information about the input.

Hidden Representations

Techniques

The internal numerical values a neural network computes at each layer as it processes input.

Hidden Size

Architecture

The dimensionality of the internal representations that a neural network uses to encode information about text.

Hidden State Poisoning Attack

Techniques

An adversarial attack that injects malicious tokens to corrupt a model's internal memory and degrade performance.

Hidden States

Techniques

Internal representations computed by neural networks that capture learned patterns.

Hierarchical Aggregation

Techniques

Combining multiple independent predictions or estimates using a structured approach that accounts for differences in their reliability.

Hierarchical Attention

Techniques

A multi-stage attention approach that first selects relevant tokens coarsely, then applies fine-grained attention on the selected subset.

Hierarchical Calibration

Techniques

A statistical technique using Platt scaling with a hierarchical prior to adjust model confidence while preventing over-shrinking of extreme predictions.

Hierarchical Clustering

Techniques

An unsupervised learning method that builds a tree of nested clusters by repeatedly merging or splitting groups based on similarity.

Hierarchical Encoder

Architecture

A neural network component that processes images at multiple levels of detail simultaneously, capturing both fine details and broad patterns.

Hierarchical Inference

Techniques

A multi-level approach to reasoning where information is processed and combined across different levels of abstraction.

Hierarchical Memory

Techniques

Storage system using multiple memory tiers (e.g., fast GPU memory and slower CPU memory) to balance speed and capacity.

Hierarchical Planning

Techniques

Planning at multiple levels of abstraction, where high-level plans are refined into low-level actions.

Hierarchical Reasoning

Techniques

Breaking down a complex decision into multiple levels, like deciding family → genus → species in order.

Hierarchical Reinforcement Learning

Techniques

Breaking complex tasks into simpler sub-tasks organized in levels, where agents learn high-level strategies and low-level actions separately.

Hierarchical Representation Extraction

Techniques

A technique that aggregates features from multiple layers of a neural network to create multi-scale guidance signals.

Hierarchical Verification

Techniques

Testing correctness at multiple levels: properties, interactions, and full rollouts to ensure system correctness.

High-Level Synthesis (HLS)

Techniques

The process of automatically converting algorithmic descriptions into hardware designs, typically using pragmas and code transformations.

Higher-Order Derivatives

Techniques

Derivatives beyond the first order (gradients) that capture more complex relationships in how inputs affect outputs.

Hilbert-Space Capacity

Techniques

The exponential growth in the number of quantum states available as more qubits are added to a quantum system.

Hindsight Utility Signals

Techniques

Performance feedback derived from comparing baseline and skill-enhanced rollouts to guide skill and policy updates.

Hitting Time

Techniques

The number of steps a random process or algorithm takes to first reach a target state from a starting point, usually reported as an expected value.

Homographic Adaptation

Training

A training technique that simulates viewing images from different angles and perspectives to teach the model to recognize the same features under geometric transformations.

Honesty Elicitation

Techniques

Techniques to make AI models produce truthful responses instead of false or misleading ones.

Hopfield Network

Techniques

A type of recurrent neural network with symmetric connections used for associative memory and optimization.

Householder Reflection

Techniques

A linear algebra operation that reflects vectors across a hyperplane; it can be used, for example, to align word direction vectors.
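
The reflection can be written without building a matrix: subtract twice the component of the vector that lies along the hyperplane's normal. The helper below is a small sketch with an assumed function name and example values.

```python
def householder_reflect(x, v):
    """Reflect vector x across the hyperplane whose normal vector is v."""
    vv = sum(c * c for c in v)
    if vv == 0:
        raise ValueError("v must be non-zero")
    coeff = 2.0 * sum(a * b for a, b in zip(x, v)) / vv
    return [a - coeff * b for a, b in zip(x, v)]

# Reflecting across the hyperplane with normal (1, 0) flips the first coordinate.
reflected = householder_reflect([3.0, 4.0], [1.0, 0.0])
# Reflecting a second time returns the original vector.
restored = householder_reflect(reflected, [1.0, 0.0])
```

The reflection keeps the vector's length unchanged, which is why it is a convenient building block for rotating or aligning directions.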

HuBERT

Training

A self-supervised learning approach for audio that learns meaningful speech representations by predicting masked portions of audio, similar to how language models learn from text.

Human Motion Prediction

Techniques

Forecasting future body positions and movements based on past motion sequences.

Human Uplift Study

Techniques

A controlled experiment measuring how much an AI system improves human performance compared to working without it.

Human-AI Collaboration

Techniques

A workflow where humans and AI agents work together, with AI assisting at multiple stages rather than just solution generation.

Human-in-the-Loop

Techniques

A system where AI predictions are reviewed and validated by human experts before final decisions.

Humanoid Policy Learning

Techniques

Training robot control policies by learning from human movement demonstrations.

Hybrid Architecture

Architecture

A model that combines two different neural network designs (for example, Mamba2 layers and attention mechanisms) to balance speed and performance.

Hybrid Mamba-Transformer Architecture

Architecture

A neural network design that combines Mamba (a fast, efficient sequence model) with Transformer components to balance speed and capability.

Hybrid Memory

Techniques

A memory system combining learnable parameters with non-learnable mechanisms to balance flexibility and efficiency.

Hybrid Thinking Mode

Behavior

A capability that allows a model to switch between fast, direct responses and slower, more deliberate reasoning depending on task complexity.

Hyperbolic Geometry

Techniques

A non-Euclidean geometry where space curves negatively, naturally suited for representing hierarchical and tree-like structures.

Hypergraph Network

Techniques

A neural network that models relationships between multiple elements simultaneously, capturing high-order interactions beyond pairwise connections.

Hypernetwork

Techniques

A neural network that generates weights for another neural network instead of learning them directly.

Hyperparameter

Techniques

A configuration setting (like learning rate or network size) that you choose before training a model.

Hyperparameter Transfer

Techniques

Using optimal hyperparameters found at small scale to train larger models without expensive retuning.

Hypersimplex

Techniques

A geometric shape in high-dimensional space used in optimization and probability theory.

Hypersphere Optimization

Techniques

Training method that constrains weight matrices to lie on a fixed-norm hypersphere for improved stability and scaling.

Hyperspherical Geometry

Techniques

Mathematical structure where points lie on the surface of a high-dimensional sphere, preserving directional relationships.

Hyperspherical Structure

Techniques

A geometric arrangement where data points lie on the surface of a high-dimensional sphere, preserving directional relationships.

I

Identifiability

Techniques

The ability to uniquely determine a model's parameters from observed data.

Identity Governance

Techniques

The policies, processes, and controls that manage who (or what) can access systems and data, and what actions they are authorized to perform.

Identity Persistence

Techniques

Maintaining consistent, unique identifiers for entities across different systems and time periods.

Identity Preservation

Techniques

Keeping a person's unique facial characteristics unchanged while editing other attributes like expressions.

Identity-Expression Decoupling

Techniques

Separating what makes a face unique (identity) from how it moves (expression) so each can be controlled independently.

Image Captioning

Behavior

The task of automatically generating a text description of what appears in an image.

Image Editing

Techniques

Modifying specific parts of an existing image while preserving other elements.

Image Encoder

Architecture

A neural network component that converts images into numerical representations that capture visual features and patterns.

Image Segmentation

Evaluation

A computer vision task that divides an image into regions or labels each pixel to identify different objects or areas.

Image Signal Processor (ISP)

Techniques

Hardware in cameras that processes raw sensor data into final images, increasingly using AI for enhancement.

Image Tokenization

Architecture

The process of converting images into discrete tokens (small units) that a language model can process, similar to how it handles text.

Image-Text Reasoning

Behavior

The ability to understand and answer questions that require analyzing both visual content and textual information together.

Image-to-Code Generation

Behavior

The ability to analyze a visual image and automatically produce source code that recreates or represents that image's structure and content.

Image-to-Text Generation

Behavior

The task of automatically generating natural language descriptions of images, converting visual information into written words.

Imitation Learning

Techniques

Training a model to copy behavior from expert examples without understanding the reasoning behind decisions.

Imitation Policy

Techniques

A learned behavior that mimics actions from human demonstrations or other expert examples.

Impact Analysis

Techniques

Identifying which parts of a system are affected by a proposed code change.

Imperfect-Information Games

Techniques

Games where players don't know all relevant information, like hidden opponent cards or future draws.

Implicit Constraint

Techniques

A limitation that emerges naturally from the training setup rather than being explicitly specified.

Implicit Curriculum

Techniques

A hidden, structured order in which models naturally learn skills during pretraining, without explicit curriculum design.

Implicit Differentiation

Techniques

Computing gradients through an implicit equation without unrolling iterations, keeping memory constant.

Implicit Intention

Techniques

A user's underlying goal or need that is not directly stated but must be inferred from context.

Implicit Neural Networks

Techniques

Neural networks defined by equations that must be solved rather than computed layer-by-layer, enabling parameter efficiency.

Implicit Patterns

Techniques

Structured behaviors that emerge naturally from an LLM's token-level decisions without being explicitly programmed or instructed.

Implicit Prediction

Techniques

Inferring unobserved values or outcomes from historical patterns in data without explicit instruction.

Implicit Preference Signal

Techniques

Information about what a community values inferred from their behavior (like engagement and acceptance) rather than explicit feedback.

Impoliteness Framework

Techniques

Culpeper's framework analyzing how language can intentionally or unintentionally cause offense or disrespect.

Importance Resampling

Techniques

A technique to adjust samples drawn from one distribution to match another by weighting them by their probability ratio.

Importance Reweighting

Techniques

Adjusting sample weights to correct for sampling from the wrong distribution.

Importance Sampling

Techniques

A technique to estimate quantities such as averages or gradients under one distribution by reweighting samples drawn from another.
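
As a sketch of the reweighting idea, the hypothetical function below estimates an average under a target distribution `p` while drawing samples only from a different distribution `q`. Both distributions are made-up two-outcome examples.

```python
import random

def importance_sampling_mean(f, p, q, n, rng):
    """Estimate the mean of f under p using n samples drawn from q."""
    values = list(q)
    weights = [q[v] for v in values]
    samples = rng.choices(values, weights=weights, k=n)
    # Each sample is weighted by how much more (or less) likely it is under p.
    return sum(f(x) * p[x] / q[x] for x in samples) / n

p = {0: 0.9, 1: 0.1}   # target distribution: outcome 1 is rare
q = {0: 0.5, 1: 0.5}   # sampling distribution: outcome 1 is common

estimate = importance_sampling_mean(lambda x: x, p, q, n=20000, rng=random.Random(0))
# The true mean of x under p is 0.1; the estimate should land close to it.
```

Sampling from `q` sees the rare outcome far more often than `p` would, and the probability ratio corrects for that over-representation.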

In-Context Learning

Techniques

Learning from examples provided in a prompt without updating model weights.

In Situ Compression

Techniques

Data compression performed during simulation execution rather than after data is written to disk.

In-Batch Negatives

Training

A training technique where negative examples (dissimilar samples) come from other items in the same training batch, helping the model learn to distinguish between similar and dissimilar texts.
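
One common way to use in-batch negatives is a contrastive loss in which each query should score its own paired document higher than every other document in the batch. The sketch below assumes tiny hand-written embedding vectors and a hypothetical function name.

```python
import math

def in_batch_contrastive_loss(queries, docs, temperature=0.05):
    """Average cross-entropy where docs[i] is the positive for queries[i]
    and every other document in the batch acts as a negative."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    losses = []
    for i, q in enumerate(queries):
        logits = [dot(q, d) / temperature for d in docs]
        top = max(logits)  # subtract max for numerical stability
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

queries = [[1.0, 0.0], [0.0, 1.0]]
matched_docs = [[1.0, 0.0], [0.0, 1.0]]      # each query matches its own doc
mismatched_docs = [[0.0, 1.0], [1.0, 0.0]]   # pairs swapped

good_loss = in_batch_contrastive_loss(queries, matched_docs)
bad_loss = in_batch_contrastive_loss(queries, mismatched_docs)
```

The loss is near zero when each query is closest to its own document and large when another document in the batch scores higher, so no separately mined negatives are needed.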

In-Weight Retrieval

Techniques

A mechanism where relevant information is retrieved from model parameters themselves rather than from external memory or attention, helping reduce computational bottlenecks.

Incentive Alignment

Techniques

Ensuring that the goals and rewards of different agents or system components work toward the same overall objective.

Incentive Sensitivity

Techniques

How well a model adjusts its behavior when the rewards or payoffs for different actions change.

Incongruity-Resolution Theory

Techniques

A theory of humor based on identifying mismatches in expectations and then resolving them in unexpected ways.

Indic Scripts

Behavior

Writing systems used for South Asian languages like Hindi, Tamil, Telugu, and Bengali that have distinct characters and phonetic rules.

Indicators of Compromise (IOCs)

Techniques

Artifacts or evidence left behind by attackers (like malicious URLs, IP addresses, or file hashes) that reveal a security breach.

Indirect Prompt Injection

Techniques

An attack where malicious instructions are hidden in data an AI agent retrieves, causing unintended actions.

Inductive Bias

Techniques

Built-in assumptions about how data should behave, like physics rules, that help models learn faster with less data.

Inertial Measurement Unit (IMU)

Techniques

A sensor that measures acceleration and rotation to track motion without external references.

Inference

Deployment

The process of running a trained model to generate predictions or outputs from new inputs.

Inference Accelerator

Techniques

Specialized hardware designed to speed up the execution of trained AI models.

Inference Compute

Deployment

The computational resources and processing power required to run a model on new data after it has been trained.

Inference Cost

Performance

The computational resources and time required to run a model on new inputs, typically measured in memory usage and processing time.

Inference Efficiency

Performance

The ability of a model to generate outputs quickly and with low computational resource consumption during real-world use.

Inference Engine

Deployment

Software that runs a trained model to generate predictions or outputs; vLLM, for example, is an inference engine optimized for large language models.

Inference Framework

Deployment

Software that optimizes how a trained model runs on specific hardware; MLX is an Apple-optimized framework for efficient inference on Apple Silicon.

Inference Latency

Performance

The time it takes for a model to generate a response after receiving an input.

Inference Optimization

Deployment

Techniques and design choices that make a model faster and more efficient to run on hardware, prioritizing speed and resource usage over training flexibility.

Inference Precision

Deployment

The numerical precision (number of bits) used when running a model to generate outputs; lower precision is faster but may reduce quality.

Inference Serving

Techniques

A system that hosts trained ML models and processes incoming prediction requests on deployed hardware like GPUs.

Inference Speed

Performance

How quickly a model can generate predictions or outputs after being given an input, measured in time per token or tokens per second.

Inference Speedup

Techniques

Reduction in time needed to run a model and get results, measured as a multiple of the original speed.

Inference Time

Performance

The amount of time it takes for a model to process input and generate output after it has been trained.

Inference-Time Computation

Performance

Extra processing power spent by the model while generating a response to think through problems more carefully before answering.

Inference-Time Compute

Techniques

The computational resources used when a model generates answers, as opposed to during training.

Inference-Time Error Correction

Techniques

Detecting and fixing model mistakes during generation without retraining, using only the current forward pass.

Inference-Time Reward Model

Techniques

A model used during generation to score outputs without requiring retraining of the main system.

Inference-Time Scaling

Performance

A technique where a model allocates more computational resources and time during inference (when generating answers) to improve quality and accuracy on harder problems.

Informal Proof

Techniques

A mathematical proof written in natural language rather than formal logical notation.

Information Aggregation

Techniques

Combining fragmented knowledge from multiple sources to make better collective decisions than any single source could.

Information Asymmetry

Techniques

When one party in a transaction has more or better information than the other, creating imbalanced power.

Information Bottleneck

Techniques

A point in a system where information capacity is severely limited, constraining overall performance.

Information Density

Techniques

The amount of useful, non-redundant information contained in a token or representation.

Information Extraction

Behavior

The task of automatically identifying and pulling out specific data or facts from documents, such as names, dates, or amounts from forms.

Information Gain

Techniques

The reduction in uncertainty about a target achieved by knowing a feature.
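
Measured with entropy, this is the uncertainty in the labels minus the uncertainty that remains after splitting the data by the feature. The function names and the four-row dataset below are illustrative assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """Entropy of the labels minus the weighted entropy within each feature value."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    remaining = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remaining

labels = [0, 0, 1, 1]
useful_feature = ["a", "a", "b", "b"]    # perfectly separates the labels
useless_feature = ["a", "b", "a", "b"]   # tells us nothing about the labels

gain_useful = information_gain(useful_feature, labels)
gain_useless = information_gain(useless_feature, labels)
```

A feature that perfectly predicts the label removes all of the uncertainty (one full bit here), while an unrelated feature removes none.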

Information Geometry

Techniques

Mathematical framework treating probability distributions as points in curved space, measuring optimization difficulty via curvature.

Information Leakage

Techniques

When a model accidentally learns from information it shouldn't have access to, like future data or test set details.

Information Retrieval

Evaluation

The task of finding relevant documents or passages from a large collection in response to a user query.

Information Synthesis

Behavior

The process of gathering data from multiple sources and combining it into a coherent, unified response or summary.

Inline Deployment

Deployment

Running a model as an intermediate processing layer within an application pipeline, typically to filter or validate data before it reaches the main system.

Inoculation Prompting

Techniques

A safety intervention using statements with specific linguistic forms to prevent misaligned behavior, but which can paradoxically trigger misalignment on similar-form inputs.

Inpainting

Techniques

The task of filling in missing or masked regions of an image while maintaining coherence with the surrounding content.

Input Convex Neural Network (ICNN)

Techniques

A neural network architecture designed to be convex in its inputs, useful for constrained optimization and learning convex functions.

Input Modality

Architecture

The type of data a model can accept as input, such as text, images, or audio.

Input Resolution

Architecture

The pixel dimensions (for example, 448×448) at which a model processes images, which affect the level of visual detail it can perceive.

Input Validation

Techniques

Checking that input data meets basic requirements (correct format, expected properties, no obvious errors) before processing it.

Input-Dependence

Techniques

How much a model's behavior changes in response to different inputs, crucial for generalization.

Input/Output Modalities

Architecture

The types of data a model can accept as input and produce as output, such as text, images, or audio.

Insight Generation

Techniques

The process of producing additional relevant information or perspectives that extend or improve an initial answer.

Insight Recognition

Techniques

Identifying the core techniques or key ideas needed to solve a complex problem.

Instance Detection

Techniques

Identifying and locating individual objects of the same class separately in an image.

Instance-Level Control

Techniques

The ability to apply different settings or modifications to individual objects within a scene independently.

Instruction Hierarchy

Techniques

The ability of a model to follow primary instructions even when secondary or conflicting instructions are present.

Instruction-Following

Behavior

The ability of a model to understand and execute specific tasks or commands given in natural language prompts.

Instruction-Tuned

Training

A model fine-tuned on instruction-response pairs so it follows user prompts more reliably.

Instruction-Tuning

Training

A training process that teaches a model to follow specific user instructions and commands, improving its ability to respond appropriately to requests.

Instrumental Convergence

Techniques

The prediction that advanced AI agents will pursue certain goals (like self-preservation) regardless of their final objectives.

Instrumental Validity Chain

Techniques

A sequence of checks replacing ground-truth labels: responsiveness to safe/unsafe contrasts, dominance of target variance, and stability across reruns.

Int4 (4-bit Integer)

Formats

A specific quantization format that represents model weights using only 4 bits per value, significantly reducing model size while maintaining reasonable performance.

INT4 Precision

Formats

A quantization method that represents model weights using only 4-bit integers instead of full-precision floating-point numbers, dramatically shrinking the model's memory footprint.

INT4 Quantization

Deployment

A compression technique that reduces a model's size and memory usage by storing weights as 4-bit integers instead of higher-precision numbers, making it faster and cheaper to run, usually at the cost of some accuracy.
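
As a rough illustration of the idea, the sketch below maps a list of floating-point weights onto the signed 4-bit range [-8, 7] using one shared scale factor, then maps them back. The function names and the symmetric per-tensor scheme are assumptions chosen for brevity; production quantizers typically work per group or per channel and pack two 4-bit codes into each byte.

```python
def quantize_int4(weights):
    """Symmetric quantization of floats to integer codes in [-8, 7] (illustrative)."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole list; 7 is the largest positive 4-bit signed value.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int4(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


weights = [0.7, -0.3, 0.12, 0.0]
codes, scale = quantize_int4(weights)
restored = dequantize_int4(codes, scale)
# Each restored weight differs from the original by at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The rounding error per weight is bounded by `scale / 2`, which is why a single large outlier weight (which inflates `scale`) hurts accuracy for every other weight sharing that scale.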

Int8 Precision

Techniques

Using 8-bit integers instead of floating-point numbers to represent model weights and activations.

Integer Linear Program (ILP)

Techniques

A mathematical optimization technique that finds the best solution among discrete options subject to linear constraints.

Integrability

Techniques

A mathematical property ensuring that estimated demand relationships are economically consistent and don't violate basic economic laws.

Integral Probability Metrics (IPM)

Techniques

A class of distance measures between probability distributions that use function classes to define divergence.

Intent Alignment

Techniques

The ability of an AI system to understand and match user goals, especially when requirements are unclear or evolving.

Intent Classification

Behavior

The process of analyzing user input to determine what the user is trying to accomplish so it can be handled appropriately.

Intent Extraction

Techniques

The process of identifying and structuring the user's underlying goal or request from natural language input.

Intent Formation

Techniques

The process of users clarifying and developing their goals through interaction rather than starting with fully-formed objectives.

Intent Recognition

Behavior

The model's capability to understand what a developer actually wants to accomplish, even when the request is vague or expressed in informal language.

Intent-First Design

Techniques

Specifying what you want to accomplish rather than writing detailed code to implement it.

Inter-Annotator Agreement

Techniques

A measure of how consistently multiple human annotators label or judge the same data.
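
One common way to quantify this for two annotators is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is a minimal pure-Python version; the function name and the example labels are illustrative assumptions, and other statistics (for example Fleiss' kappa or Krippendorff's alpha) are used when there are more than two annotators.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each annotator labelled at random with their own label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

A kappa of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean the annotators agree less often than chance would predict.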

Inter-channel Interactions

Techniques

Dependencies and relationships between different variables or channels in multivariate data.

Inter-evaluator Agreement

Techniques

A measure of how consistently different judges rate the same outputs, typically using metrics like correlation or ICC.

Inter-Part Relations

Techniques

The spatial, functional, or semantic relationships and dependencies between different parts of a composed object.

Inter-rater Agreement

Techniques

A measure of how consistently different evaluators score or judge the same items, often using metrics like Kendall's tau.

Inter-task Gradient Equity

Techniques

Ensuring that learning signals from different tasks contribute equally to model updates, preventing any single task from dominating training.

Inter-Teacher Agreement

Techniques

A measure of how much multiple teacher models agree on their predictions, used to assess supervision reliability.

Interaction Awareness

Techniques

A model's understanding of how conversations naturally flow and how users respond to assistant outputs.

Interaction history

Techniques

The sequence of past user actions and system responses that inform current decision-making.

Interactive AI System

Techniques

An AI tool designed for back-and-forth collaboration with humans, refining intent and outputs through dialogue.

Interactive Dialogue

Behavior

A conversational interface where users can ask follow-up questions and receive responses based on previous context, rather than just one-shot predictions.

Interactive Imitation Learning (IIL)

Techniques

Training a policy by having humans intervene and correct the robot, then learning from those corrections.

Interdisciplinary Reasoning

Techniques

Combining insights and methods from multiple academic disciplines to solve problems in a target domain.

Interleaved Inputs

Architecture

The ability to mix images and text in any order within a single prompt, rather than requiring all images first or all text first.

Intermediate Rewards

Techniques

Giving feedback at multiple steps during reasoning, not just at the final answer, to guide the model's thinking process.

Internal Reasoning Process

Behavior

A deliberate step-by-step thinking mechanism that occurs before generating a response, helping the model work through complex problems more carefully.

Internal representations

Techniques

The hidden patterns and knowledge stored inside a model's layers that it uses to understand and generate text.

Internal Thinking Process

Architecture

A hidden computation phase where the model reasons through a problem before producing its final answer, improving accuracy on complex tasks.

Internal Validity

Techniques

Whether a study's design supports the cause-and-effect conclusions it draws, without confounding factors distorting the results.

Interpretability

Evaluation

The ability to understand and explain how a model makes decisions and what it has learned from its training data.

Interpretable Models

Techniques

Machine learning models designed to be understandable to humans, showing why they make specific predictions.

Interruption Timing

Techniques

Determining the appropriate moment to interject in a conversation based on natural dialogue cues.

Intervention Calibration

Techniques

The ability to decide when an agent should proactively act, when to seek user consent, and when to remain silent.

Intra-Class Consistency

Techniques

Whether a model applies the same reasoning strategy when classifying different instances of the same category.

Intra-Group Consistency

Techniques

Ensuring that related elements (like a person's face across frames) maintain consistent properties throughout.

Intra-Modal Dispersion

Techniques

The degree of disagreement in how different models within the same modality (e.g., vision models) represent a single stimulus.

Intra-modal similarity

Techniques

Measuring how similar consecutive frames or audio segments are within a single modality.

Intra-Utterance Variation

Techniques

Changes in paralinguistic features within a single spoken sentence, like shifting emotion mid-sentence.

Intrinsic Decomposition

Techniques

Breaking down an image into fundamental components like albedo (color), shading (lighting), and residuals (fine details).

Intrinsic Geometry

Techniques

The geometric properties of a space as measured from within, independent of how it's embedded in higher-dimensional space.

Intrinsic Motivation

Techniques

A reward signal that encourages an agent to explore and discover new states, separate from task-specific rewards.

Intrinsic Rewards

Techniques

Reward signals generated from the model's own internal signals, like confidence scores, rather than external verification.

Invariant Transformation

Techniques

A change that preserves key properties or predictions of a model.

Invariant-enforcing tool protocol

Techniques

A specification that defines preconditions and postconditions for tool calls to prevent invalid action sequences.

Inverse execution

Techniques

Predicting what inputs or earlier program states must have been to produce a given output.

Inverse Problem

Techniques

Finding the input that produces a known output, when the forward process is complex or many-to-one.

Inverse Problems

Techniques

Problems of inferring the input causes from observed output effects; they are often ill-posed, meaning a solution may not be unique or stable.

Inverse Reasoning

Techniques

Working backward from a desired outcome to determine what actions would produce that result.

Inverse Specification Reward

Techniques

A reward signal that measures quality by having an LLM recover the original task specification from generated outputs.

Inverse-Probability Weighting

Techniques

A technique that reweights observations to remove confounding bias by accounting for treatment assignment probabilities.

Inverted Index

Deployment

A data structure that maps terms to the documents containing them, enabling fast keyword-based search similar to how a book's index works.
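
A minimal sketch of the structure, assuming whitespace tokenization and lowercase matching (real search engines add stemming, stop-word handling, and ranking). The function names and sample documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the first term's postings and intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {1: "the cat sat", 2: "the dog sat", 3: "a cat ran"}
index = build_inverted_index(docs)
matches = search(index, "cat")  # documents 1 and 3 contain "cat"
```

Lookups touch only the postings for the query terms rather than scanning every document, which is where the speed advantage comes from.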

Inverted Index Retrieval

Deployment

A search technique that maps vocabulary terms to documents containing them, enabling fast keyword-based lookups commonly used in search engines.

Invisible Architect

Techniques

An AI system that shapes decisions and outcomes without users recognizing its influence on the information or criteria they use.

Invisible Failures

Techniques

Errors or misalignments in AI outputs that go undetected because the user accepts the result without critical evaluation.

Ion Diffusivity

Techniques

A measure of how quickly ions move through a material, critical for battery charging and discharging speed.

IsoFLOP Curves

Techniques

Graphs showing model performance across different configurations while keeping total computational operations constant.

Isolation Forest

Techniques

An unsupervised algorithm that isolates anomalies by randomly selecting features and split values; anomalous points tend to be isolated in fewer splits than normal ones.

Isomorphism-Invariant

Techniques

A property that remains the same for graphs with identical structure, regardless of how nodes are labeled or arranged.

Item Response Theory (IRT)

Techniques

A statistical method that estimates latent abilities, question difficulty, and model proficiency from test performance.

Iterative Denoising

Techniques

The process of gradually removing noise from a noisy input through multiple refinement steps to generate clean outputs.

Iterative Development

Behavior

A workflow where code is refined through multiple rounds of small, targeted changes rather than complete rewrites.

Iterative refinement

Techniques

Repeatedly improving an output by generating versions, evaluating them, and using feedback to create better versions.

Iterative Search

Techniques

A process where the model performs multiple rounds of web searches, each building on previous results to refine and deepen its understanding of a topic.

J

Jacobian Regularization

Techniques

A technique that limits how much a model's output changes when inputs change slightly, making it more stable and predictable.

Jailbreaking

Techniques

Crafting adversarial inputs designed to bypass a model's safety guardrails and trigger harmful outputs.

Japanese Tokenization

Techniques

The process of breaking Japanese text into meaningful units (tokens), accounting for the language's unique writing systems including kanji, hiragana, and katakana.

JEPA (Joint-Embedding Predictive Architecture)

Techniques

A self-supervised learning approach that predicts future embeddings from video without reconstructing pixels.

JIT Compilation

Techniques

Converting code to machine instructions at runtime rather than ahead of time, which can let high-level code such as Python run efficiently on GPUs.

Joint Embedding Predictive Architecture

Architecture

A training approach where a model learns to predict missing parts of video by understanding both spatial and temporal patterns without reconstructing actual pixels.

Joint Embedding Space

Architecture

A shared mathematical space where different types of data (like sounds and text descriptions) are represented so similar concepts are positioned close together, enabling direct comparison.

Joint Embeddings

Architecture

A shared numerical space where different types of data (such as audio and text) are represented together, allowing the model to find relationships between them.

Joint Processing

Techniques

Processing multiple input types together in an integrated way rather than separately, allowing the model to reason about how they relate.

K

K-means Clustering

Techniques

An unsupervised algorithm that groups data points into k clusters by minimizing distance to cluster centers.
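
The sketch below shows the standard alternating procedure (often called Lloyd's algorithm): assign each point to its nearest center, then move each center to the mean of its points. It assumes the caller supplies the initial centers and a fixed iteration count; the function names are illustrative, and libraries usually add smarter initialization and a convergence check.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, centers, iterations=10):
    """Run a fixed number of assign/update rounds; returns (centers, clusters)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster (keep it if empty).
        centers = [mean_point(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centers, clusters = kmeans(points, centers=[(0, 0), (10, 10)])
```

Here k is simply the number of initial centers passed in; choosing k and the starting centers well is a separate problem that this sketch does not address.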

k-space

Techniques

The raw frequency domain data collected directly by an MRI scanner before conversion to images.

k-sparse probing

Techniques

A technique to analyze neural networks by identifying which neurons or experts are most important for specific tasks.

Kalman Filter

Techniques

A recursive algorithm that estimates the state of a dynamic system by optimally combining noisy measurements with a mathematical model.
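
To make the predict/update cycle concrete, here is a minimal one-dimensional sketch. It assumes the simplest possible system model (the true value stays constant between steps, with optional process noise); the function name, parameters, and default prior are assumptions for illustration, and real applications use matrix-valued states and models.

```python
def kalman_1d(measurements, meas_var, process_var=0.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a (nearly) constant quantity.

    x is the current estimate and p its variance (uncertainty).
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the model says the value is unchanged, so only uncertainty grows.
        p = p + process_var
        # Update: the gain k weighs the measurement against the prediction.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1 - k) * p
        estimates.append(x)
    return estimates


# Repeated readings of 5.0 pull the estimate from the prior (0.0) toward 5.0.
estimates = kalman_1d([5.0] * 50, meas_var=1.0)
```

When the measurement variance is large relative to the estimate's variance, the gain is small and the filter trusts its own prediction more; when it is small, the filter follows the measurements closely.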

Kaplan-Meier Estimator

Techniques

A nonparametric method for estimating survival curves from censored data without assuming a specific distribution.
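
A minimal sketch of the product-limit calculation: at each time where an event occurs, the running survival probability is multiplied by the fraction of at-risk subjects who did not have the event. The function name and input layout (parallel lists of durations and event flags, where `False` marks a censored subject) are assumptions for illustration.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] for each time with at least one event."""
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = removed = 0
        # Group all subjects that share the same time t.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t.
        at_risk -= removed
    return curve


# The subject at time 2 is censored: it shrinks the risk set without counting as an event.
curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
```

Censored subjects still contribute information: they count as at risk up to the moment they drop out, which is what lets the estimator use incomplete follow-up data.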

Kernel Density Estimator (KDE)

Techniques

A non-parametric method that estimates probability distributions by smoothing data points with kernel functions.

Kernel Fusion

Techniques

Combining multiple GPU operations into a single optimized computation to reduce memory overhead and improve speed.

Kernel Optimization

Techniques

Tuning kernel functions to improve performance in kernel-based models.

Kernel RKHS

Techniques

A mathematical framework using reproducing kernel Hilbert spaces for classification and regression with theoretical guarantees.

Key-value caches

Techniques

Internal memory structures in transformers that store computed representations to speed up inference and enable agent communication.

Key-Value Heads

Architecture

Attention mechanism components that store and retrieve information; fewer heads mean a smaller key-value cache and faster computation, at some cost to model capacity.

Keyframe

Techniques

A reference frame in a video that serves as an anchor point for propagating edits or information to surrounding frames.

Keypoint Correspondence

Techniques

Matching specific visual landmarks (like object corners) between a demonstration and a new scene to align actions.

Keypoint Detection

Behavior

The task of automatically identifying and locating distinctive points of interest in an image that remain stable across different angles and lighting conditions.

KL Divergence

Techniques

A measure of how different one probability distribution is from another, used to evaluate sampling quality.
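
For two discrete distributions P and Q over the same outcomes, the quantity is the sum over outcomes of P(x) · log(P(x) / Q(x)). The sketch below computes it in nats; the function name is illustrative and the inputs are assumed to be probability lists of equal length that each sum to 1.

```python
import math


def kl_divergence(p, q):
    """KL(P || Q) in nats for discrete distributions given as probability lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # outcomes P never produces contribute nothing
        if qi == 0:
            return math.inf  # P puts mass where Q has none
        total += pi * math.log(pi / qi)
    return total


p = [0.5, 0.5]
q = [0.9, 0.1]
forward = kl_divergence(p, q)
reverse = kl_divergence(q, p)  # generally different: KL is not symmetric
```

Because swapping the arguments changes the result, KL divergence is not a true distance; it is zero only when the two distributions match.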

Knowledge Augmented Evaluation

Techniques

Assessing models using external knowledge sources for better judgment.

Knowledge Base

Techniques

A structured or unstructured collection of documents and facts that a system retrieves from to answer queries.

Knowledge Ceiling

Behavior

The limit to how much factual information a model can reliably know or recall, often constrained by its size and training data.

Knowledge Component

Techniques

A discrete unit of knowledge or skill that can be identified and measured in student work.

Knowledge Consolidation

Techniques

The process of organizing, storing, and synthesizing insights from multiple experiments to improve future decision-making.

Knowledge Cutoff

Behavior

The date up to which a model has been trained on data; it cannot reliably answer questions about events or information after this date.

Knowledge Distillation

Training

A technique that compresses a large, complex model into a smaller one by training the smaller model to mimic the larger model's behavior.
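
One widely used way to make the smaller model "mimic" the larger one is to train it to match the larger model's softened output probabilities. The sketch below computes that matching loss for a single example, assuming the soft-target formulation with a temperature; the function names and the default temperature of 2.0 are illustrative assumptions, and practical recipes usually combine this term with the ordinary label loss.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature gives a flatter distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student))


teacher = [4.0, 1.0, 0.5]
loss_far = distillation_loss([0.0, 3.0, 0.0], teacher)   # student disagrees with teacher
loss_near = distillation_loss([3.8, 1.1, 0.4], teacher)  # student is close to teacher
```

The softened probabilities carry more information than a single correct label, since they also show which wrong answers the larger model considers nearly right.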

Knowledge Gap Identification

Techniques

An agent's ability to recognize what information or skills it lacks to solve a problem.

Knowledge Graph

Architecture

A structured database that stores facts as relationships between entities (like 'Einstein' connected to 'Physics'), enabling machines to reason about real-world knowledge.

Knowledge Graph Completion

Evaluation

The task of filling in missing facts or relationships in a knowledge graph by predicting what connections should exist based on patterns in existing data.

Knowledge Tracking

Techniques

Monitoring and recording what a student has demonstrated they understand over time.

Knowledge Transfer

Techniques

Applying knowledge learned from one task to improve performance on another.

Knowledge-grounded

Techniques

Requiring external factual information beyond what is directly observable to solve a task correctly.

Knowledge-Guided Learning

Techniques

Incorporating domain expertise or physical laws into machine learning models to improve accuracy and generalization.

Kolmogorov-Arnold Network

Techniques

A neural network architecture designed to provide flexible, expressive function approximation with interpretable structure.

Koopman operator

Techniques

A mathematical operator that transforms observable functions of a dynamical system to reveal its underlying structure and eigenvalues.

Kraus Representation

Techniques

A mathematical way to describe quantum operations that guarantees they produce physically valid quantum states.

Kronecker-Factorized Approximation

Techniques

An efficient but approximate method for parameterizing doubly stochastic matrices that sacrifices some expressivity for computational speed.

Kurdyka-Łojasiewicz Property

Techniques

A mathematical property that guarantees convergence of optimization algorithms to stationary points.

KV Cache

Techniques

A store for previously computed key-value pairs that speeds up text generation in transformers.

KV Heads

Architecture

The number of attention head pairs used for storing and retrieving key-value information in a transformer model's attention mechanism.

KV-Cache Offloading

Techniques

Moving key-value cache data to slower storage (CPU/disk) to reduce GPU memory usage during inference.

L

Label Bias

Techniques

Systematic unfairness in training labels that causes models to learn and reproduce those biases.

Label Noise

Techniques

Errors or inaccuracies in training data labels that can degrade model performance and cause the model to memorize incorrect information.

Label-Efficient

Techniques

A learning approach that achieves good performance with minimal labeled training examples.

Label-Flipping Attack

Techniques

A poisoning attack where attackers deliberately mislabel training examples to mislead the model.

Label-Free Reward

Techniques

A training signal derived from model behavior itself rather than human-annotated labels.

Lagrangian Dual Ascent

Techniques

An optimization technique that enforces constraints by adding them to the objective as terms weighted by multipliers, then iteratively adjusting those multipliers.

Landmark Cover

Techniques

A subset of representative points selected to efficiently represent a larger dataset for computation.

Langevin Dynamics

Techniques

A sampling and optimization technique that combines gradient information with injected random noise to explore a landscape, such as a reward landscape.

Language Backbone

Architecture

The core language model component that processes text and generates responses based on information from other parts of the system.

Language Family

Behavior

A group of languages that share a common ancestor and similar grammatical structures, such as Romance or Slavic languages.

Language Fluency

Performance

The model's ability to generate grammatically correct, coherent, and natural-sounding text that reads as if written by a human.

Language Mixture Ratios

Techniques

The proportion of each language included in a multilingual training dataset.

Language Model

Architecture

An AI model trained to predict and generate text by learning patterns from large amounts of written data.

Language Modeling

Training

The task of predicting the next word or token in a sequence based on previous words, which is the core objective used to train text models.

Language Optimization

Training

Training or fine-tuning a model to excel at a specific language by using more native-language data and task-specific adjustments.

Language Specialization

Training

Training a model to excel at a specific language rather than trying to handle many languages equally well.

Language Typology

Techniques

The study of how languages vary in their structural features and which combinations are common across human languages.

Language-Agnostic

Behavior

A model's ability to work across multiple languages without requiring separate training for each language.

Language-Agnosticity

Techniques

The property of a representation or model component working effectively across different languages without language-specific tuning.

Language-Specific Model

Training

A language model trained primarily or exclusively on text from a single language to achieve better performance on that language than a multilingual model.

Language-Specific Pretraining

Training

Training a model on text from a particular language (for example, Dutch) so it learns that language's unique grammar, vocabulary, and nuances rather than treating it as a variation of English.

Language-Specific Training

Training

Training a model primarily on data from a particular language, which makes it especially fluent and accurate in that language.

Language-Specific Tuning

Training

Training a model to specialize in one particular language, which makes it perform better on that language but worse on others.

Large Action Model

Behavior

A specialized AI model designed to understand instructions and convert them into structured function calls and tool interactions rather than generating free-form text.

Large Audio Language Model (LALM)

Techniques

An LLM extended with an audio encoder to understand and reason about sound and audio content.

Large Language Model

Architecture

A neural network trained on vast amounts of text data to understand and generate human language.

Late Acceptance Hill Climbing (LAHC)

Techniques

A local search algorithm that accepts solutions if they improve upon a solution from several iterations ago, balancing exploration and exploitation.

Late Fusion

Techniques

Combining predictions from separate models trained on different data sources, merging results after individual processing.

Late Interaction

Techniques

A retrieval technique that compares individual tokens between a query and document separately, then combines the results, rather than comparing pre-computed single vectors.
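
One way to "compare tokens separately, then combine" is the MaxSim scoring rule popularized by ColBERT-style retrievers: for each query token, take its best match among the document's tokens, then sum those best matches. The sketch below assumes that rule and uses tiny hand-written vectors in place of real token embeddings; the function names are illustrative.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs, doc_vecs):
    """Sum, over query tokens, of the highest similarity to any document token."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy 2-dimensional "token embeddings" (assumed for illustration only).
query = [[1, 0], [0, 1]]
doc_a = [[1, 0], [0, 1]]  # has a good match for both query tokens
doc_b = [[1, 0], [1, 0]]  # matches only the first query token

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
```

Because every query token is matched individually, a document is rewarded for covering each part of the query, which a single pooled vector can blur.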

Late Interaction Search

Techniques

A retrieval approach that compares individual token embeddings between query and document at search time, rather than comparing pre-computed single vectors.

Late-Interaction Retrieval

Techniques

A retrieval approach that compares individual token embeddings between query and document at search time, rather than comparing pre-computed single vectors, allowing more precise matching of specific phrases and rare terms.

Latency

Performance

The time delay between sending a request and receiving the first response token from a model.

Latency Constraint

Techniques

A strict deadline requirement for how quickly data must travel from source to destination.

Latency Estimation

Techniques

Predicting how long an inference request will take to complete, accounting for hardware contention and concurrent execution.

Latency-Optimized

Performance

A model designed to produce results as quickly as possible, prioritizing speed over other factors like accuracy or feature breadth.

Latency/Throughput Predictor

Techniques

A model that estimates how fast a system can process requests and how many it can handle per unit time.

Latent Bottleneck

Techniques

The compressed representation layer in an autoencoder that forces the model to learn efficient, meaningful encodings of input data.

Latent communication

Techniques

Agents exchanging information through internal representations like embeddings or cache states rather than explicit text.

Latent Denoising

Techniques

A generative process that iteratively refines compressed representations of data by removing noise to produce coherent outputs.

Latent Diffusion Models

Techniques

Generative models that create images by learning to denoise random noise in a compressed latent space rather than pixel space.

Latent Dynamical System

Techniques

A system of equations describing how a model's hidden state evolves over time through iterative updates.

Latent Dynamics

Techniques

Hidden patterns of change in a system that cannot be directly observed but must be inferred from available data.

Latent Manifold

Techniques

A lower-dimensional surface where high-dimensional data naturally lies.

Latent Reasoning

Techniques

Reasoning performed in continuous or discrete hidden representations rather than explicit natural language.

Latent Representation

Techniques

A compressed, learned encoding that captures the essential features of data in a compact form.

Latent Representations

Techniques

Compressed, learned feature vectors that capture underlying patterns in data without explicit labels.

Latent Space

Techniques

A compressed, learned representation of data that captures its essential features in fewer dimensions.

Latent Space Representation

Techniques

A compressed, learned representation of data in a lower-dimensional space that captures hidden patterns not visible in raw observations.

Latent State

Techniques

A learned hidden representation that evolves through computation to capture task-relevant information.

Latent World Model

Techniques

A neural network that learns to predict future video frames in a compressed representation space rather than raw pixels.

Latent-Anchored GRPO (LA-GRPO)

Techniques

A training method that stabilizes reinforcement learning by anchoring functional tokens with a weighted auxiliary objective for stronger gradient updates.

Latent-Space Decomposition

Techniques

A technique to break down what a model learns internally into individual concepts or features it uses to make decisions.

LaTeX

Formats

A markup language commonly used to write mathematical equations and scientific documents in a format that renders beautifully.

LaTeX Markup

Formats

A text-based format for writing mathematical and scientific documents with precise formatting and symbolic notation.

Layer-wise Probing

Techniques

Analyzing what information is encoded in each layer of a neural network by testing intermediate representations.

Layout-Aware

Behavior

The ability to understand and use information about how text is positioned and structured on a page, not just the words themselves.

Lazy Loading

Techniques

Deferring the loading of full tool schemas until they are actually needed, keeping context compact.

Leaderboard

Techniques

A public ranking showing how different models perform on a standardized task, updated as new submissions arrive.

Leakage

Techniques

When concept representations unintentionally encode task-relevant or inter-concept information beyond their intended semantics, compromising interpretability.

Learnable Gating Sparsification

Techniques

A learned mechanism that adaptively selects which parameters to keep and which to remove in compressed task vectors.

Learning Pipeline Error Decomposition

Techniques

A framework that separates total forecast error into estimation error (from training) and approximation error (from architecture).

Learning Progression

Techniques

A research-based description of how students' understanding develops in a subject over time, from novice to expert.

Learning Rate Schedule

Techniques

A predefined plan for how the learning rate changes during training to improve convergence.
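
As an example of such a plan, the sketch below implements linear warmup followed by cosine decay, a shape commonly used when training language models. The function name, parameter names, and the choice to decay all the way to zero are assumptions for illustration.

```python
import math


def lr_at_step(step, total_steps, peak_lr, warmup_steps):
    """Learning rate at a given step: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly so the last warmup step reaches peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1 + math.cos(math.pi * progress))


schedule = [lr_at_step(s, total_steps=100, peak_lr=1e-3, warmup_steps=10) for s in range(101)]
```

Warmup avoids large, unstable updates while the model's weights are still close to their random initialization, and the gradual decay lets training settle as it nears the end.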

Learning Rate Transfer

Techniques

Using the same learning rate setting across models of different sizes without retuning.

Ledoit-Wolf Shrinkage

Techniques

A statistical technique for improving covariance matrix estimation by shrinking it toward a simpler structure.

Leech Lattice

Techniques

A 24-dimensional mathematical structure with optimal sphere packing properties, used in some methods to compress model weights efficiently.

Legal Reasoning

Techniques

The ability to interpret and apply legal concepts accurately, requiring understanding of domain-specific rules and nuances.

Legibility Tax

Techniques

The cost or performance loss from making a model more interpretable.

Length Scaling

Techniques

A model's ability to handle longer or more complex problem sequences than those seen during training.

Leniency Bias

Techniques

A systematic tendency to give softer or more favorable judgments, often due to awareness of negative consequences.

Level-of-Detail (LoD)

Techniques

A hierarchy of representations of the same object at different resolutions, commonly used in graphics for rendering efficiency.

Levenshtein Distance

Techniques

A measure of how different two text strings are: the minimum number of single-character insertions, deletions, or substitutions needed to turn one string into the other.
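
The distance is usually computed with dynamic programming, building up the answer for longer and longer prefixes of the two strings. The sketch below keeps only two rows of the table at a time; the function name is illustrative.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,                      # delete char_a
                curr[j - 1] + 1,                  # insert char_b
                prev[j - 1] + substitution_cost,  # substitute or keep
            ))
        prev = curr
    return prev[-1]


distance = levenshtein("kitten", "sitting")  # 3: k→s, e→i, append g
```

The running time is proportional to the product of the two string lengths, so very long inputs are usually compared with approximate or banded variants instead.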

Lexicogrammatical Features

Techniques

Linguistic properties combining vocabulary and grammar patterns used to analyze and classify text style and register.

LiDAR

Techniques

A sensor that uses laser pulses to measure distances and create 3D maps of environments.

Lie Detection

Techniques

Methods to identify whether an AI model's response is false or misleading.

Lifelong Personalization

Techniques

Continuously adapting recommendations to a user's evolving preferences over extended periods without forgetting past patterns.

Lightweight Footprint

Performance

A model that uses fewer computational resources and memory, making it practical to run on less powerful hardware.

Lightweight Model

Architecture

A smaller, more efficient model designed to run quickly and use less memory than larger alternatives, often with some trade-off in reasoning capability.

Likelihood

Techniques

A mathematical measure of how probable the model considers a given sample, enabling exact probability calculations.

Line Coverage

Techniques

A measure of the share of code lines executed by a test suite, used as an indicator of test completeness.

Linear Activation Steering

Techniques

A steering technique that applies learned linear transformations to model activations to control behavior.

Linear Attention

Techniques

An attention mechanism whose computational cost grows linearly with sequence length instead of quadratically.

Linear Bellman Completeness

Techniques

A property where the Bellman backup operation preserves linearity in value functions.

Linear Complexity

Techniques

An algorithm whose computational cost grows proportionally to input size, rather than quadratically.

Linear Compute

Techniques

Computational cost that grows proportionally with sequence length, rather than quadratically like Transformers.

Linear Function Approximation

Techniques

Using linear combinations of features to represent value functions or policies in RL.

Linear Matrix Inequality (LMI)

Techniques

A mathematical condition expressed as a matrix inequality that can be efficiently checked to verify system properties like stability.

Linear Probe

Techniques

A simple classifier trained on top of a model's internal representations to detect specific properties.

Linear Probes

Techniques

Simple machine learning classifiers trained on model internal states to detect specific properties like deception.

Linear Program

Techniques

An optimization problem where the objective is a linear function and the constraints are linear equations or inequalities.

Linear Regressor

Techniques

A simple model that maps input features to continuous numeric outputs using a linear function.

Linear Representation Hypothesis

Techniques

The idea that concepts are encoded as linear directions in a neural network's internal representations, making them linearly separable.

Linear Span

Techniques

The set of all possible linear combinations of a group of vectors, describing the geometric space covered by a group of features.

Linear Temporal Logic (LTL)

Techniques

A formal language for specifying how systems should behave over time, commonly used in security and software verification.

Linear time-invariant dynamics

Techniques

Systems whose behavior follows linear equations that don't change over time.

Linearized Attention

Techniques

An attention mechanism with linear computational complexity instead of quadratic, enabling faster inference.

Linguistic Competence

Techniques

A speaker's implicit knowledge of language rules and structure, distinct from actual language use.

Link Prediction

Evaluation

A task where a model predicts missing relationships between entities in a knowledge graph, such as guessing that two people are colleagues based on existing connections.

Liquid Foundation Model

Architecture

An alternative neural network architecture that uses continuous, adaptive transformations instead of fixed layers, allowing efficient processing with fewer parameters.

Liquid Neural Networks

Architecture

A neural network architecture that uses continuous, adaptive functions to process information, allowing the model to adjust its behavior dynamically based on input.

Listwise Ranking

Techniques

Ranking multiple items together as a group, rather than scoring each item independently.

Literate Image Comprehension

Behavior

The capability to read and understand text and written content within images, rather than just recognizing objects or scenes.

Live Benchmark

Techniques

A continuously updated evaluation system that scores models on new data as it arrives, rather than a fixed test set.

Llama Architecture

Architecture

A transformer-based neural network design optimized for efficient language modeling and text generation.

LLaVA Architecture

Architecture

A design pattern that connects a vision encoder to a language model, enabling the language model to understand and describe images.

LLM critic

Techniques

A language model trained to evaluate and judge outputs (like comedy sketches) based on learned human preferences.

LLM Judge

Techniques

A frozen language model used to evaluate and score other model outputs according to predefined criteria.

LLM-as-a-Judge

Techniques

Using a language model to automatically evaluate the quality of outputs from other AI systems instead of human reviewers.

LLM-as-Judge

Techniques

Using a language model to automatically evaluate or score outputs from other AI systems instead of human reviewers.

LLM2Vec

Training

A training approach that adapts a generative language model to produce high-quality text embeddings by repurposing its existing knowledge without building from scratch.

Load Balancing (Expert Utilization)

Techniques

Ensuring experts are used evenly across the model to avoid some experts being overused while others sit idle.

Local Attention

Techniques

Attention mechanism where each token only attends to a bounded window of preceding tokens instead of all previous tokens.
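
A minimal sketch of the idea, assuming a causal window that counts the current token plus the `window - 1` tokens before it (implementations differ on whether the token itself is counted). The function name is hypothetical.

```python
def local_causal_mask(seq_len, window):
    """mask[i][j] is True when token i is allowed to attend to token j."""
    return [[0 <= i - j < window for j in range(seq_len)] for i in range(seq_len)]

mask = local_causal_mask(seq_len=5, window=2)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Each row has at most `window` allowed positions, so the attention cost grows with `seq_len * window` rather than `seq_len ** 2`.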

Local Deployment

Deployment

Running a model directly on your own computer or server instead of sending requests to a remote service.

Local Inference

Deployment

Running an AI model directly on your own computer rather than sending data to a remote server, keeping data private and reducing latency.

Local Outlier Factor

Techniques

An algorithm that identifies outliers by comparing the local density around a point to the local densities around its neighbors.

Local Sufficiency

Techniques

The observation that a large model's preferred token appears in a small model's top-K predictions even when not ranked first.

Locality-Sensitive Hashing (LSH)

Architecture

A technique that groups similar items together using hashing, allowing the model to attend to relevant parts of long text without comparing every token to every other token.

Locality-Sensitive Hashing Attention

Architecture

An efficient attention mechanism that groups similar tokens together to reduce computation, allowing the model to handle longer texts without excessive memory use.

Localization

Techniques

In conformal prediction, the process of identifying similar examples to condition uncertainty estimates on local neighborhoods rather than global statistics.

Localization Fidelity

Techniques

How well an explanation's highlighted regions match ground-truth annotations from experts.

Log Anomaly Detection

Techniques

Identifying unusual or suspicious patterns in system logs that indicate errors, attacks, or failures.

Log-concave distribution

Techniques

A probability distribution whose density has a concave logarithm, which guarantees convenient properties such as having a single peak.

Logical Consistency

Techniques

Ensuring that different signals or judgments from a model don't contradict each other and follow coherent logical rules.

Logical Inconsistency Detection

Techniques

Identifying misalignment by finding contradictions in a model's reasoning across equivalent scenarios with different framings.

Logical Options

Techniques

Pre-defined action sequences or skills expressed using logical rules that guide an agent toward specific goals.

Logical Subspace

Techniques

A low-dimensional region within a model's internal representations that captures reasoning logic independent of language form.

Logical Vulnerability

Techniques

A security flaw in program logic rather than memory safety that causes incorrect behavior.

Logit-Adjusted Loss

Techniques

A loss function that adjusts for class imbalance by modifying the model's output scores.
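
One common form adds the log of each class's training frequency to the logits before the usual softmax cross-entropy. The sketch below assumes that form; `class_priors` and the scaling factor `tau` are illustrative names, and other variants of the loss exist.

```python
import math

def logit_adjusted_loss(logits, label, class_priors, tau=1.0):
    """Cross-entropy computed on logits shifted by tau * log(class prior)."""
    adjusted = [z + tau * math.log(p) for z, p in zip(logits, class_priors)]
    m = max(adjusted)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(a - m) for a in adjusted))
    return log_norm - adjusted[label]

# Two classes; class 1 is rare (10% of training examples, an invented figure).
undecided_logits = [0.0, 0.0]
plain = logit_adjusted_loss(undecided_logits, label=1, class_priors=[0.5, 0.5])
adjusted = logit_adjusted_loss(undecided_logits, label=1, class_priors=[0.9, 0.1])
```

With the adjustment, an undecided prediction on a rare-class example is penalized more heavily, which pushes the model to give rare classes a larger margin.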

Logit-based approaches

Techniques

Methods that use the model's raw prediction scores to make decisions, rather than analyzing deeper internal patterns.

Logit-Level Distillation

Techniques

Knowledge distillation that transfers the raw model outputs (logits) rather than higher-level representations.

Logit-Space Shrinkage

Techniques

A method for combining multiple forecasts by averaging them in logit space with a data-dependent prior to reduce variance.

Long-Context

Performance

The ability of a model to process and understand very long sequences of text while maintaining coherence across distant parts of the input.

Long-Context Embedding

Architecture

An embedding model designed to process and maintain meaningful representations across very long documents (thousands of tokens), rather than just short snippets.

Long-Context Handling

Performance

The ability to process and understand very long documents or conversations without losing track of earlier information.

Long-Context Inference

Techniques

Processing input sequences much longer than a model's training context window while maintaining accuracy and efficiency.

Long-Context Reasoning

Behavior

The ability to process and understand very long input texts (thousands of tokens) while maintaining coherent reasoning across the entire passage.

Long-Context Synthesis

Behavior

The ability to process and integrate information from many sources or a large amount of text, then combine it into a coherent summary or report.

Long-Form Content Generation

Behavior

The capability to produce extended, coherent text such as articles, reports, or documents while maintaining consistency and structure throughout.

Long-Form Generation

Behavior

The capability to produce extended, coherent text outputs like essays, articles, or detailed explanations rather than just short responses.

Long-Form Text Generation

Behavior

The capability to produce extended, coherent written content such as essays, articles, or detailed explanations rather than short responses.

Long-Horizon Evaluation

Techniques

Testing an AI system's ability to maintain context and preferences across many sequential interactions over time.

Long-Horizon Retrieval

Techniques

Finding relevant information across many steps or a large dataset to answer complex multi-part questions.

Long-Horizon Tasks

Techniques

Complex goals requiring many sequential steps or decisions to complete successfully.

Long-Range Interactions

Techniques

Forces between atoms that are far apart from each other, which are harder for models to capture.

Long-Sequence Processing

Performance

The ability to handle very long input texts (thousands or more tokens) efficiently, which standard models struggle with due to computational constraints.

Long-tail knowledge

Techniques

Rare or uncommon facts that appear infrequently in training data, making them harder for models to remember accurately.

Long-tailed Distribution

Techniques

A dataset where a few common classes have many examples while rare classes have very few, causing models to bias toward common categories.

Long-tailed Distribution

Techniques

A data distribution where a few common categories dominate while many rare categories have few examples.

Long-term Memory (LTM)

Techniques

Stored structured knowledge (like diagnostic criteria) that an AI system can access during reasoning.

Look-back Dependencies

Techniques

When a step in a procedure requires referencing or using values computed in earlier steps.

Looped transformer

Techniques

A transformer that applies the same block of layers repeatedly, so it can run more iterations at test time and spend more computation on harder problems.

LoRA (Low-Rank Adaptation)

Techniques

A technique that adds small, trainable low-rank matrices to the frozen weights of a pre-trained model instead of retraining the entire model, making fine-tuning faster and more memory-efficient.
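
The arithmetic behind the idea can be sketched in a few lines. The example assumes the commonly used formulation in which a frozen weight matrix W is combined with two small trainable matrices B and A as W + (alpha / rank) * B @ A. The matrices and values are invented for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def lora_effective_weight(w, lora_b, lora_a, alpha, rank):
    """Return W + (alpha / rank) * B @ A. W stays frozen; only A and B are trained."""
    delta = matmul(lora_b, lora_a)
    scale = alpha / rank
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]

W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # frozen 3x3 weight (9 values)
A = [[0.5, -0.5, 1.0]]                                   # trainable 1x3
B_init = [[0.0], [0.0], [0.0]]                           # trainable 3x1, starts at zero
B_trained = [[1.0], [0.0], [0.0]]                        # pretend result of fine-tuning

W_start = lora_effective_weight(W, B_init, A, alpha=2, rank=1)
W_after = lora_effective_weight(W, B_trained, A, alpha=2, rank=1)
```

Here only 6 values (A and B) are trainable instead of 9. For real layers with thousands of rows and columns, the saving is far larger.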

LoRA Adapter

Techniques

A small set of trainable low-rank weights attached to a frozen language model to customize it for a specific task without retraining the entire model.

LoRA Fine-tuning

Techniques

Parameter-efficient fine-tuning method that adapts a pre-trained model using low-rank updates.

Loss Trajectory

Techniques

The sequence of loss values for a sample across multiple training steps, showing how the model's error on that sample changes over time.

Lossless Compression

Techniques

Reducing file size while preserving all original data perfectly, so decompression recovers the exact original.

Lost-in-the-Middle Problem

Techniques

A phenomenon where LLMs struggle to retrieve or process information from the middle of long documents or lists.

Low Latency

Performance

The ability to generate responses very quickly with minimal delay between when you send a prompt and when you receive an answer.

Low Rank Approximation

Techniques

Approximating a matrix with one of lower rank, so the data is represented using fewer dimensions while preserving key information.

Low-code platform

Techniques

A tool that lets non-programmers build applications by writing minimal code or using visual interfaces.

Low-Pass Propagation

Techniques

A graph signal processing technique that smooths node features by averaging information across neighborhoods.

Low-rank branch

Techniques

A lightweight neural pathway that processes information through a compressed representation to reduce computation.

Low-Rank Projection

Techniques

Compressing high-dimensional data into fewer dimensions, which can lose important information needed for accurate inference.

Low-Resource Language

Techniques

A language with limited training data and AI tools compared to English or other major languages.

Low-Resource Languages

Behavior

Languages with relatively little training data available compared to major languages like English, making them harder for AI models to learn.

LP Relaxation

Techniques

A continuous approximation of a mixed-integer program in which integer (for example, binary) constraints are relaxed to continuous ranges, used to bound solution quality.

Lp Spaces

Techniques

Mathematical spaces of functions where the p-norm (a measure of size) is finite and well-defined.

Lyapunov Exponent

Techniques

A measure of how quickly nearby trajectories diverge in a dynamical system; determines stability and predictability.

M

Mach-Zehnder Interferometer

Techniques

An optical device that splits light into two paths and recombines them to create interference patterns for computation.

Machine Identity

Techniques

Digital credentials (API tokens, service accounts, certificates) that AI agents and automated systems use to authenticate and act in enterprise environments.

Machine Learning Force Field

Techniques

A neural network trained to predict atomic forces and energies, enabling fast simulations of molecular behavior.

Machine Learning Interatomic Potential (MLIP)

Techniques

An AI model that learns to predict forces and energies between atoms in molecules and materials.

Machine Translation

Techniques

Automated translation of text from one language to another using computational systems.

Machine Unlearning

Techniques

Removing the influence of specific training data (for example, poisoned or private examples) from a trained model without full retraining.

Machine-Learned Interatomic Potentials (MLIPs)

Techniques

Neural network models trained to predict forces and energies between atoms, used to simulate materials without expensive quantum calculations.

Macro Placement

Techniques

The task of arranging large functional blocks on a chip to optimize performance and minimize wiring.

Mahalanobis Distance

Techniques

A measure of distance between a point and a distribution that accounts for correlations between variables.
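
For two variables the calculation is small enough to write out by hand. This sketch assumes a 2x2 covariance matrix that is invertible; the function name is hypothetical.

```python
import math

def mahalanobis_2d(point, mean, cov):
    """Distance from point to a 2-D distribution with the given mean and covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c  # assumed non-zero (covariance must be invertible)
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    q = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.sqrt(q)

# With an identity covariance this reduces to ordinary Euclidean distance.
euclidean_case = mahalanobis_2d((3.0, 4.0), (0.0, 0.0), [[1.0, 0.0], [0.0, 1.0]])
# A large variance along x makes the same offset along x count for less.
stretched_case = mahalanobis_2d((2.0, 0.0), (0.0, 0.0), [[4.0, 0.0], [0.0, 1.0]])
```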

Mamba

Architecture

A state-space model architecture designed to process long sequences faster and with less memory than traditional transformer models.

Mamba Architecture

Architecture

A neural network design that uses state-space models as an alternative to transformers, offering faster processing and lower memory usage.

Mamba-Transformer Architecture

Architecture

A hybrid model design that combines Mamba (a state-space model) with Transformer components to process long sequences more efficiently than pure Transformers while maintaining strong performance.

Mamba-Transformer Hybrid Architecture

Architecture

A neural network design that combines selective state spaces (Mamba) with traditional attention mechanisms to process text more efficiently while maintaining strong performance.

Managed Service

Deployment

A cloud service where the provider handles infrastructure, updates, and maintenance so you only focus on using the service rather than managing it.

Manifold Hypothesis

Techniques

The assumption that high-dimensional data lies on a lower-dimensional curved surface (manifold) rather than filling the entire space.

Manifold Learning

Techniques

Discovering the underlying low-dimensional structure of high-dimensional data.

Mantissa Bits

Techniques

The bits of a floating-point number that store its significant digits (the significand), as opposed to the exponent bits that set its scale.

Margin Bound

Techniques

A theoretical guarantee on classification error based on how well-separated different classes are in the learned representation.

Marginal Likelihood

Techniques

The probability of the observed data averaged over all possible model parameters, weighted by their prior probability; used in Bayesian inference as an objective for learning and model comparison.

Markov Chain

Techniques

A sequence of events where the next state depends only on the current state, not on the history.
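
A small worked example, using an invented two-state weather chain: the transition table alone determines how the probabilities evolve, and no earlier history is consulted.

```python
def step(distribution, transition):
    """Advance a probability distribution over states by one step of the chain."""
    n = len(distribution)
    return [sum(distribution[i] * transition[i][j] for i in range(n)) for j in range(n)]

# Rows: current state (sunny, rainy). Columns: next state. Values are made up.
TRANSITION = [[0.9, 0.1],
              [0.5, 0.5]]

dist = [1.0, 0.0]  # start on a sunny day
for _ in range(50):
    dist = step(dist, TRANSITION)
```

After many steps the distribution settles near 5/6 sunny and 1/6 rainy, regardless of the starting state.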

Markov Chain Monte Carlo (MCMC)

Techniques

A statistical sampling technique that explores parameter space with a random walk favoring high-probability regions, producing samples that approximate a target distribution.

Markov Chain Monte Carlo (MCMC)

Techniques

A sampling method that generates sequences of dependent samples to approximate probability distributions.

Markov Decision Process

Techniques

A framework for sequential decision-making with probabilistic state transitions.

Masked Language Modeling

Training

A training technique where random words in text are hidden, and the model learns to predict them based on surrounding context.

Masked Next-Token Prediction

Training

A training technique where parts of text are hidden and the model learns to predict what should fill those gaps, helping it understand context and meaning.

Masked Pre-training

Techniques

A self-supervised training method where parts of input data are hidden and the model learns to predict them from context.

Masked Prediction

Training

A training technique where parts of the input are hidden, and the model learns to predict what was masked, helping it understand underlying patterns.

Masked Self Attention

Techniques

Self-attention in which each token can attend only to itself and earlier tokens, preventing information from future tokens leaking into predictions.
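
A minimal sketch of the masking step, assuming the raw attention scores have already been computed. Here the softmax for token i is taken only over positions 0 to i, which has the same effect as setting future scores to negative infinity. The function name and scores are illustrative.

```python
import math

def causal_attention_weights(scores):
    """Row i is a softmax over positions 0..i; later positions get weight 0."""
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]
        m = max(visible)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in visible]
        total = sum(exps)
        weights.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return weights

scores = [[0.2, 9.0, 9.0],
          [1.0, 1.0, 9.0],
          [0.5, 0.1, 0.3]]
weights = causal_attention_weights(scores)
```

Even though the first two tokens have large scores for later positions, those positions receive zero weight.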

Masked Token Prediction

Techniques

A technique where the model learns to predict hidden or blanked-out words in text, allowing it to reason about context from multiple directions at once.

Masked Tokens

Architecture

Placeholder positions in text that are hidden or unknown, which the model learns to fill in or refine during generation.

Masking and Unmasking

Techniques

A process where the model hides (masks) and then progressively reveals (unmasks) parts of text to refine and improve the entire sequence iteratively.

Massive Activations

Techniques

Extreme outlier values in a small number of tokens and channels within a neural network layer.

Master Weight Splitting

Techniques

Separating model weights into components for efficient distributed training.

Materialized View

Techniques

Pre-computed results stored for fast retrieval instead of computing on demand.

Math-Aware Retrieval

Techniques

Finding mathematically equivalent or structurally similar problems in a dataset, rather than just keyword-based matching.

Math-Specialized

Training

A model that has been optimized and trained specifically for mathematical reasoning and problem-solving tasks, rather than general-purpose language understanding.

Mathematical Notation

Behavior

Symbolic representations of mathematical expressions and equations (like formulas and symbols) that need special handling to be correctly interpreted by AI models.

Mathematical Notation Parsing

Techniques

The process of analyzing and interpreting visual mathematical symbols and equations to convert them into a structured, computer-readable format.

Mathematical Reasoning

Behavior

The ability to solve multi-step math problems by breaking them down logically and showing intermediate steps rather than just guessing the answer.

MathML

Formats

An XML-based markup language designed specifically for representing mathematical notation in a way that computers can understand and display.

Matrix Factorization

Techniques

Decomposing a matrix into a product of smaller matrices, commonly used for dimensionality reduction and pattern discovery.

Matryoshka Embeddings

Techniques

A technique that allows embedding vectors to be shortened (truncated) to smaller dimensions while maintaining quality, letting you trade off between accuracy and storage/speed needs.
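
Shortening such an embedding usually amounts to keeping the first few dimensions and re-normalizing. The sketch assumes the vector came from a model trained for this; truncating an ordinary embedding the same way typically loses much more quality. The values are invented.

```python
import math

def truncate_embedding(vector, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

full = [3.0, 4.0, 12.0, 0.5, -0.25, 0.1]   # pretend 6-dimensional embedding
short = truncate_embedding(full, dim=2)    # 2-dimensional version for cheap storage
```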

Matryoshka Representation Learning

Training

A training technique that allows a single embedding model to produce high-quality results at multiple vector sizes, letting you shrink the embedding dimensions to save storage and speed without retraining.

Maximal Update (μP)

Techniques

A parameterization method that keeps optimal learning rates approximately constant across different model sizes.

Maxout Network

Techniques

A neural network layer that outputs the maximum value across a set of linear functions, enabling piecewise linear approximations.

Mean Average Precision (mAP)

Techniques

A standard object-detection metric: the average precision of each class, averaged across classes, computed by comparing predicted object locations to ground truth across different confidence thresholds.

Mean Pooling

Architecture

A technique that combines multiple token embeddings into a single representation by averaging them, producing one embedding for an entire text sequence.
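
A short sketch of the averaging, assuming an attention mask marks which positions are real tokens (1) and which are padding (0), so that padding does not distort the result. The names and values are illustrative.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the embeddings of real tokens, ignoring padded positions."""
    kept = [emb for emb, keep in zip(token_embeddings, attention_mask) if keep]
    return [sum(column) / len(kept) for column in zip(*kept)]

tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]  # last row is padding
sentence_embedding = mean_pool(tokens, attention_mask=[1, 1, 0])
```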

Mecha-nudges

Techniques

Subtle changes to how choices are presented that systematically influence AI agents without degrading the decision environment for humans.

Mechanism Design

Techniques

Designing rules for interactions between parties to achieve desired outcomes like fairness or efficiency.

Mechanism Linked Evidence

Techniques

Proof that a model's behavior stems from a specific internal mechanism.

Mechanistic Analysis

Techniques

Studying how a model's internal computations and representations lead to specific behaviors or failures.

Mechanistic Interpretability

Evaluation

The study of understanding how a language model's internal components and computations work to produce its outputs.

Medical Reasoning

Behavior

The ability to apply clinical knowledge and logic to interpret medical data, such as understanding what symptoms indicate about a patient's condition.

MEG (Magnetoencephalography)

Techniques

Non-invasive brain imaging that measures magnetic fields produced by neural activity.

Membership Inference Attack

Techniques

An attack that determines whether a specific data point was used to train a model.

Memorization

Behavior

When a model learns to reproduce exact training examples rather than learning general patterns it can apply to new situations.

Memorization-to-Generalization Transition

Techniques

The shift from a model reproducing training data to creating novel outputs, triggered by increasing dataset size.

Memory Capacity

Techniques

The maximum amount of information a model can store and retrieve.

Memory Efficiency

Performance

How well a model uses available RAM or GPU memory, allowing it to run on smaller or less expensive hardware.

Memory Footprint

Performance

The amount of RAM or storage space a model requires to run, which is critical for deployment on resource-constrained devices.

Memory Transformer

Techniques

A neural component that selects and refines relevant knowledge from long-term memory based on the current context.

Memory-Adaptive Scheduling

Techniques

Dynamically partitioning computation into tasks that fit within available device memory constraints.

Memory-augmented generation

Techniques

A generation system that stores and retrieves visual references during creation to maintain consistency across outputs.

Mention Noise

Techniques

Errors or corruptions in detected entity mentions that affect downstream processing.

Merged Weights

Training

The combination of a base model's weights with additional trained weights (like from LoRA adapters) into a single unified model file.

Message Passing

Techniques

The core mechanism in graph neural networks (GNNs) where nodes iteratively exchange and aggregate information from their neighbors.

Meta-Agent

Techniques

A higher-level agent that monitors and improves other agents by comparing their outputs against reality and updating their code or instructions.

Meta-cognitive

Techniques

The ability to reflect on and manage one's own thinking processes and decision-making.

Meta-Cognitive Deficit

Techniques

An agent's inability to reflect on and make wise decisions about when to use its own knowledge versus when to seek external help.

Meta-learning

Techniques

Training a model to learn how to learn, so it can quickly adapt to new tasks or changing conditions.

Metacognitive Features

Techniques

Self-awareness about thinking processes, including goal assessment, domain awareness, and strategic exploration.

Metacognitive Gap

Techniques

The difference between how well models assess their own confidence versus how well humans evaluate belief certainty against evidence.

Metaheuristic

Techniques

A general problem-solving strategy that explores solutions without guaranteeing optimality but finds good answers quickly.

Metamodel

Techniques

A model that defines the structure and rules for creating other models in model-driven engineering.

Metamorphic Testing

Techniques

A testing approach that checks if a system maintains consistent behavior under semantically equivalent input transformations.

Metastable

Techniques

A state that appears stable but is easily disrupted by small changes or perturbations.

Method Lineage

Techniques

The causal relationships and dependencies showing how one research method evolved from or influenced another.

Methodological Evolution Graph

Techniques

A structured database mapping how research methods emerge, adapt, and build upon one another over time.

Metric Misspecification

Techniques

Using an evaluation metric that doesn't align with true objectives.

Metric-Consistent Digital Twins

Techniques

Virtual replicas of real objects that preserve accurate physical dimensions and properties for faithful simulation.

Metric-Scale Pose Estimation

Techniques

Determining a robot's position and orientation in real-world units rather than relative or scaled coordinates.

Micro-expressions

Techniques

Brief, involuntary facial expressions lasting 0.25-0.5 seconds that reveal genuine emotions.

Microservice Architecture

Techniques

A system design where independent, containerized services handle specific tasks and communicate together.

Mid-Tier Model

Deployment

A model positioned between lightweight and flagship versions, balancing capability with efficiency rather than maximizing raw performance.

Middleware

Techniques

Software layer that sits between services to translate, transform, or coordinate their interactions.

MIMO Formulation

Techniques

Multi-input, multi-output architecture that processes multiple data streams in parallel to improve model expressiveness without increasing latency.

MiniLM Architecture

Architecture

A lightweight transformer-based architecture designed to be computationally efficient while maintaining strong performance for text understanding tasks.

Minimax Algorithm

Techniques

A game-playing algorithm that searches the tree of possible moves and picks the move whose worst-case outcome is best, assuming the opponent also plays optimally.
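
The recursion can be sketched on a tiny hand-written game tree. In this illustrative encoding, a number is a final score for the first player and a list is a position whose children are the available moves.

```python
def minimax(node, maximizing):
    """Return the score the first player can guarantee from this position."""
    if isinstance(node, (int, float)):
        return node  # leaf: the game is over
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# First player picks a branch, then the opponent picks a leaf within it.
TREE = [[3, 5], [2, 9], [0, 7]]
best_score = minimax(TREE, maximizing=True)
```

The opponent would answer the three branches with 3, 2 and 0, so the first player chooses the first branch. Real game engines limit search depth and prune branches because full trees are too large.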

Minimax Training

Techniques

A training method where one part tries to break the model (maximization) while another part fixes it (minimization) to build robustness.

Minimum-energy control

Techniques

Control strategy that achieves desired system behavior using the least amount of control effort.

Mirror Descent

Techniques

An optimization algorithm that generalizes gradient descent by taking update steps in a transformed (dual) space, so that updates respect the geometry of the problem.

Mirror Duality

Techniques

A property allowing optimization algorithms to switch between different geometric transformations while maintaining convergence.

Misinformation

Techniques

False or inaccurate information spread online, whether intentionally or unintentionally.

Missing Modality Generalization

Techniques

A model's ability to work when one or more input modalities are unavailable at test time.

Mistral Architecture

Architecture

A specific design pattern for transformer-based language models that uses sliding-window attention and grouped-query attention to balance performance and speed.

MIT License

Licensing

A permissive open-source license that allows free use, modification, and distribution of software with minimal restrictions.

MITRE ATT&CK

Techniques

A knowledge base of adversary tactics and techniques based on real-world observations, used to classify and understand cyberattacks.

Mixed Precision

Techniques

Using different numerical precisions for different parts of computation.

Mixed Precision Training

Techniques

Training with lower precision for speed while maintaining higher precision where needed.

Mixed State

Techniques

A quantum state representing uncertainty or entanglement with an environment, described by a density matrix rather than a pure state vector.

Mixed-Batch Pre-training

Techniques

Training on multiple datasets with different structures and properties in the same training batch.

Mixed-Integer Linear Programming (MILP)

Techniques

A mathematical optimization approach for problems with both continuous and discrete variables subject to linear constraints.

Mixed-Precision Quantization

Techniques

Using different numerical precisions (e.g., 8-bit, 4-bit) for different parts of a model to reduce memory and computation.

Mixed-Quality Data Training

Training

A training approach that uses datasets containing varying levels of quality and accuracy, rather than only perfectly curated examples, to improve efficiency and real-world performance.

Mixed-State Representation

Techniques

A quantum state that is a probabilistic mixture of pure quantum states rather than a single definite state.

Mixed-Truth Content

Techniques

Misinformation that blends accurate information with false claims to appear credible and evade detection.

Mixture of Experts

Architecture

An architecture where a model contains multiple specialized sub-networks (experts) and selectively activates only a few for each input, improving efficiency without sacrificing capability.
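
A toy sketch of the routing step, assuming top-k routing with softmax weights over the selected experts (one common scheme among several). The experts here are plain functions on a number and the router scores are given by hand. In a real model both are learned neural networks.

```python
import math

def moe_forward(x, experts, router_logits, top_k=2):
    """Run only the top_k highest-scoring experts and blend their outputs."""
    ranked = sorted(range(len(experts)), key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    m = max(router_logits[i] for i in chosen)
    exps = {i: math.exp(router_logits[i] - m) for i in chosen}
    total = sum(exps.values())
    output = sum(exps[i] / total * experts[i](x) for i in chosen)
    return output, chosen

EXPERTS = [lambda x: x + 1, lambda x: x * 2, lambda x: x * 4, lambda x: x - 100]
ROUTER_LOGITS = [0.0, 5.0, 5.0, -1.0]  # hypothetical router scores for this input

output, chosen = moe_forward(3.0, EXPERTS, ROUTER_LOGITS)
```

Only two of the four experts run for this input, which is why such models can have many parameters while using a fraction of them per token.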

MLLM-as-a-Judge

Techniques

Using multimodal large language models to evaluate outputs by assessing both visual and semantic correctness with rubrics.

MLX

Deployment

A machine learning framework optimized for running models efficiently on Apple Silicon chips.

MLX Deployment

Deployment

Running a model locally on Apple Silicon hardware using the MLX framework, which is optimized for efficient inference on Mac devices.

MLX Format

Formats

A model format designed specifically for efficient inference on Apple Silicon devices, optimized for the MLX machine learning framework.

MLX Framework

Deployment

A machine learning framework specifically designed for running AI models efficiently on Apple Silicon hardware.

MLX Optimization

Deployment

Adapting AI models with the MLX framework so they run efficiently on Apple Silicon chips (like M1, M2, M3), taking advantage of their specific hardware capabilities.

Mobile Manipulation

Techniques

A robot's ability to move around an environment while using its arms to pick up and interact with objects.

Modality

Architecture

A type of input or output data a model can process, such as text, images, or audio.

Modality Collapse

Techniques

When a multimodal system stops using some of its input types and relies only on one or a few.

Modality Gap

Techniques

The performance difference between a model's reasoning using text versus visual information.

Modality Imbalance

Techniques

Unequal influence or representation of different data types (like images vs. text) in a multimodal model.

Modality Transfer

Techniques

Adapting a model trained on one type of data (like video) to work with a different type (like tactile signals) efficiently.

Modality-wise Optimization

Techniques

Training approach that handles each data type (audio, video, text) with separate, tailored optimization strategies.

Mode Connectivity

Techniques

The property that different trained models can be connected through a continuous path in weight space.

Model Adaptation

Techniques

Techniques for customizing a pre-trained model's behavior for specific tasks or use cases.

Model Architecture

Architecture

The underlying structural design of a neural network that determines how data flows through it and how it processes information.

Model Backbone

Architecture

The core underlying architecture of a model that serves as the foundation for specialized versions or fine-tuned variants.

Model Capability Tier

Deployment

A ranking level within a model family that indicates relative power, speed, and cost trade-offs.

Model Capacity

Architecture

The size and complexity of a model, which determines how much information it can learn and store; smaller capacity means fewer parameters and less computational power needed.

Model Checkpoint

Formats

A saved snapshot of a trained model's weights and parameters, stored in formats like safetensors or PyTorch for later use or deployment.

Model Collapse

Techniques

When a language model's training performance suddenly degrades due to overconfidence in incorrect predictions.

Model Compression

Deployment

Techniques used to make models smaller and faster to run, allowing them to work on devices with limited memory or processing power.

Model Deployment

Deployment

The process of configuring and launching a trained model in a serving environment, such as a cloud platform or a local server, so it can receive requests and generate responses.

Model Disagreement

Techniques

Differences in predictions across multiple models on the same input.

Model Distillation

Training

A technique where a smaller, faster model is trained to mimic the behavior of a larger, more capable model to reduce computational costs.
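
One widely used training signal compares the two models' output distributions after softening them with a temperature. The sketch below assumes that formulation, including the conventional temperature-squared scaling. The logits are invented.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(teacher, student))
    return temperature ** 2 * kl

matching = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mismatched = distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge.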

Model Drift

Techniques

Degradation of model performance over time due to changes in data distribution or real-world conditions.

Model Efficiency

Performance

How well a model performs relative to its computational cost and resource requirements, important for deployment on devices with limited hardware.

Model Family

Architecture

A group of related AI models developed by the same organization that share similar architecture and training approaches but may differ in size or capabilities.

Model Footprint

Performance

The amount of memory and computational resources required to run a model, with smaller footprints being more efficient.

Model Footprint

Deployment

The amount of memory and computational resources required to run a model, determined primarily by its size and architecture.

Model Format

Formats

The file format used to store and load a model's weights; common formats like safetensors and PyTorch determine compatibility with different tools and frameworks.

Model-Free Learning

Techniques

Learning optimal behavior without explicitly modeling the environment.

Model Inference

Deployment

The process of running a trained model on new input data to generate predictions or outputs, as opposed to training the model.

Model Initialization

Training

The process of setting a model's weights to starting values before training; random initialization means weights are set to random numbers rather than learned values.

Model Layers

Architecture

The stacked computational components in a neural network that progressively transform input data; fewer layers means faster processing but potentially less ability to capture complex patterns.

Model Merging

Techniques

A technique that combines the learned knowledge from two or more trained models into a single model.
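
The simplest form of merging is a weighted average of matching weights from two models with the same architecture. The sketch below assumes weights are stored as plain lists keyed by layer name; the names `merge_weights` and `alpha` are hypothetical, and practical merging methods are often more involved than this.

```python
def merge_weights(weights_a, weights_b, alpha=0.5):
    """Linearly interpolate two models' weights: alpha * A + (1 - alpha) * B."""
    if weights_a.keys() != weights_b.keys():
        raise ValueError("models must have the same layer names to be merged")
    merged = {}
    for name in weights_a:
        a, b = weights_a[name], weights_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch in layer {name!r}")
        merged[name] = [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]
    return merged

# Two toy "models" with a single layer each.
model_a = {"layer1": [1.0, 3.0]}
model_b = {"layer1": [3.0, 5.0]}
print(merge_weights(model_a, model_b))  # {'layer1': [2.0, 4.0]}
```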

Model Modularity

Techniques

Designing models so independent components can be used, removed, or composed separately without performance loss.

Model Optimization

Training

Techniques used to make a model smaller, faster, or more efficient while maintaining acceptable performance.

Model Parameters

Architecture

The internal numerical values (weights) that a neural network learns during training and uses to make predictions.

Model Precision

Formats

The numerical accuracy used to store a model's weights and calculations—higher precision (like float32) is more accurate but uses more memory, while lower precision (like int4) is more efficient but less precise.
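
The memory trade-off can be estimated with simple arithmetic: parameters multiplied by bits per weight. The sketch below counts weights only (no activations or runtime overhead), and the 7-billion-parameter figure is a hypothetical example.

```python
# Bits used to store one weight at each precision.
BITS_PER_WEIGHT = {"float32": 32, "float16": 16, "int8": 8, "int4": 4}

def weight_memory_gb(num_parameters, precision):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * BITS_PER_WEIGHT[precision] / 8 / 1e9

num_parameters = 7_000_000_000  # hypothetical 7B-parameter model
for precision in BITS_PER_WEIGHT:
    print(precision, weight_memory_gb(num_parameters, precision), "GB")
```

Under these assumptions, the same model needs about 28 GB at float32 and about 3.5 GB at int4.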

Model Predictive Control (MPC)

Techniques

A control method that predicts future system behavior and optimizes actions over a time horizon.

Model Predictive Control (MPC)

Techniques

A control method that predicts future system behavior and optimizes actions based on a mathematical model.

Model Pruning

Techniques

Removing unnecessary parameters or connections from a model to reduce size and computation.
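
A common heuristic is magnitude pruning, which treats the weights closest to zero as the least necessary. The sketch below is a minimal illustration on a flat list of weights; `magnitude_prune` and `sparsity` are names chosen for this example.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute value."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    num_to_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for index in order[:num_to_prune]:
        pruned[index] = 0.0
    return pruned

print(magnitude_prune([0.5, -0.1, 0.03, -0.9], sparsity=0.5))
# [0.5, 0.0, 0.0, -0.9]
```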

Model Quantization

Deployment

A technique that reduces a model's size and memory requirements by using lower-precision numbers, enabling it to run on resource-limited devices.
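
To make the idea concrete, the sketch below shows one simple scheme, symmetric 8-bit quantization with a single scale per group of weights. It is an assumption-laden toy (function names are hypothetical), not the method used by any particular model or library.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]

weights = [1.0, -0.5, 0.25, 0.003]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
print(quantized)
print([abs(w - r) for w, r in zip(weights, restored)])  # rounding error per weight
```

Each restored weight differs from the original by at most about half the scale, which is the precision given up in exchange for storing one byte per weight.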

Model Scale

Architecture

The size of a model measured by the number of parameters it contains; smaller models are generally faster but often less capable than larger ones.

Model Scaling

Training

The practice of increasing a model's size (parameters, training data, or compute) to improve its capabilities and performance.

Model Size

Performance

The total number of parameters (learnable values) in a model, which affects its memory usage, speed, and capability.

Model Specialization

Training

Training a model to excel at a narrow set of tasks rather than performing well across many different domains.

Model Stub

Evaluation

A minimal, simplified version of a model used for testing code and infrastructure without the computational cost of a full model.

Model Suite

Training

A collection of related models of varying sizes or configurations released together for comparative research and analysis.

Model Transparency

Behavior

The ability to examine and understand how a model works, including access to its weights, architecture, and training details.

Model Validation

Deployment

The process of testing a model to ensure it works correctly within a framework or pipeline before deploying it for real tasks.

Model Variant

Architecture

A modified version of a base model that changes its size, capabilities, or behavior while maintaining the same core architecture.

Model Weights

Architecture

The learned numerical parameters inside a neural network that determine how it processes input and generates output.

Model-Agnostic

Techniques

A technique that works across different model architectures without requiring architecture-specific modifications.

Model-Based Reinforcement Learning

Techniques

Learning approach where an agent builds a model of how the environment works, then uses it to plan actions.

Model-Internal Signals

Techniques

Information derived from a model's own computations (like attention patterns or confidence scores) without external tools.

Moderation Layer

Deployment

A specialized model or component that filters and evaluates user inputs or outputs to prevent harmful content from reaching users or being generated.

Modular Deployment

Techniques

Using only relevant subsets of a model's components independently or in combination for specific tasks or domains.

Modular Transfer

Techniques

Reusing learned or numerical components across different problems by swapping modules without full retraining.

Modularity

Techniques

The ability to use and compose independent subsets of a model without requiring the full system or human-defined rules.

Molecular Design

Techniques

Process of creating new molecules with desired properties for applications like drug discovery.

Molecular Dynamics (MD) Simulation

Techniques

A computational technique that simulates how atoms move and interact over time.

Molecular Language Model

Training

A specialized AI model trained to understand and process chemical structures by learning patterns from molecular data, similar to how text language models learn from words.

Molecular Property Prediction

Techniques

Task of predicting chemical or physical properties of molecules based on their structure.

Molecular Reasoning

Behavior

The ability to understand and predict how molecules behave, interact, and transform based on their chemical structure and properties.

Moment Matching

Techniques

A distillation technique that aligns statistical properties (moments) between a teacher and student model.

Momentum

Techniques

An optimization technique that accumulates gradients to accelerate convergence.
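
A minimal sketch of the update rule, assuming the classic "heavy ball" form in which a velocity term keeps a decaying sum of past gradients. The learning rate, the decay factor `beta`, and the toy objective are illustrative choices.

```python
def momentum_step(param, grad, velocity, learning_rate=0.1, beta=0.9):
    """One SGD-with-momentum update for a single scalar parameter."""
    velocity = beta * velocity + grad         # accumulate past gradients
    param = param - learning_rate * velocity  # move along the accumulated direction
    return param, velocity

def minimize_square(start=5.0, steps=300):
    """Minimize f(x) = x**2, whose gradient is 2 * x."""
    x, velocity = start, 0.0
    for _ in range(steps):
        x, velocity = momentum_step(x, 2 * x, velocity)
    return x

print(minimize_square())  # very close to 0, the minimum of x**2
```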

Momentum-Based Adaptation

Techniques

A technique that smoothly updates model parameters using accumulated historical changes for stability.

Monocular Depth Estimation

Techniques

Predicting 3D depth information from a single 2D image without stereo or multiple views.

Monocular Reconstruction

Techniques

Inferring 3D structure and depth from a single 2D image or video frame without stereo or multi-view input.

Monosemanticity

Techniques

When a neuron or expert performs a single, well-defined function rather than handling multiple unrelated tasks.

Monotonic Improvement

Techniques

A guarantee that each update to a policy increases or maintains performance, never decreases it.

Monte Carlo Approximation

Techniques

Using random sampling to estimate quantities that are expensive or impossible to compute exactly.
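
A classic illustration is estimating pi by throwing random points at a unit square and counting how many land inside the quarter circle. The sample count and seed below are arbitrary choices for the example.

```python
import random

def estimate_pi(num_samples=100_000, seed=0):
    """Estimate pi from the fraction of random points inside a quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle covers pi/4 of the unit square.
    return 4 * inside / num_samples

print(estimate_pi())
```

More samples give a tighter estimate, with the error shrinking roughly in proportion to one over the square root of the sample count.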

Monte Carlo Dropout

Techniques

A technique using dropout during inference to estimate model uncertainty by sampling multiple predictions.
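
The idea can be sketched with a toy linear model: keep dropout switched on at prediction time, run the model many times, and read the spread of the outputs as an uncertainty signal. Everything here (the model, the drop probability, the sample count) is a hypothetical setup for illustration.

```python
import random
import statistics

def predict_with_dropout(inputs, weights, drop_prob, rng):
    """One stochastic forward pass of a linear model with input dropout."""
    keep_prob = 1.0 - drop_prob
    total = 0.0
    for x, w in zip(inputs, weights):
        if rng.random() < keep_prob:
            total += w * x / keep_prob  # rescale so the expected output is unchanged
    return total

def mc_dropout(inputs, weights, drop_prob=0.2, num_samples=200, seed=0):
    """Return the mean prediction and its standard deviation across samples."""
    rng = random.Random(seed)
    samples = [predict_with_dropout(inputs, weights, drop_prob, rng)
               for _ in range(num_samples)]
    return statistics.mean(samples), statistics.stdev(samples)

mean, spread = mc_dropout([1.0, 2.0, 3.0], [0.5, -0.2, 0.1])
print(mean, spread)  # a larger spread suggests a less certain prediction
```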

Monte Carlo Sampling

Techniques

Estimating expected values by drawing random samples and averaging results.

Monte Carlo Simulation

Techniques

A computational technique using repeated random sampling to estimate probability distributions and outcomes.

Monte Carlo Tree Search (MCTS)

Techniques

An algorithm that explores game possibilities by randomly simulating many future moves to estimate the best action.

Moral Disengagement

Techniques

Psychological mechanisms that allow people to justify harmful behavior by reframing it as acceptable or necessary.

Moral Hazard

Techniques

When one party takes excessive risks because another party bears the consequences, reducing incentive to act carefully.

Moral Reasoning

Techniques

A model's ability to understand and apply ethical principles to make judgments about right and wrong.

Morphological Analysis

Techniques

The ability to understand and process word structure, including prefixes, suffixes, and inflections that change word meaning or grammatical function in languages like Russian.

Morphological Complexity

Behavior

The linguistic challenge of handling languages where words change form significantly based on grammar, tense, and case—common in Polish and other inflected languages.

Morphological Paradigm

Techniques

The complete set of inflected forms of a word, showing how it changes across different grammatical contexts.

Morphology

Behavior

The structure and rules of how words are formed and modified in a language, which is especially important for languages like Korean with complex word composition.

Motion Capture

Techniques

Recording and digitizing human body movement for analysis or animation.

Motion Causality

Techniques

The relationship between user-driven actions and their physical consequences in a scene.

Motion-Adaptive Threshold

Techniques

A dynamic decision boundary that adjusts based on detected motion to determine when cached features can be safely reused.

MPNet Architecture

Architecture

A neural network design that combines masked language modeling with permutation language modeling to better understand relationships between words in text.

Multi-Agent Systems

Techniques

Multiple independent agents interacting and learning in a shared environment.

Multi-Hop Reasoning

Techniques

Solving problems by chaining multiple reasoning steps together sequentially.

Multi-Agent Coordination

Techniques

Techniques for making multiple autonomous agents work together toward shared goals.

Multi-Agent Ensemble

Architecture

A system where multiple AI agents work together, cross-checking and debating each other's reasoning before producing a final answer.

Multi-Agent Framework

Techniques

A system where multiple AI agents with different roles work together to solve a problem.

Multi-Agent Interaction

Techniques

Structured communication and mutual influence between multiple AI agents that shapes collective behavior over time.

Multi-Agent Plan Execution

Techniques

A system where multiple agents coordinate to execute complex plans by breaking them into steps and validating each one.

Multi-Agent Reinforcement Learning (MARL)

Techniques

Training multiple agents simultaneously so they learn to cooperate and improve together toward shared goals.

Multi-Agent System

Techniques

Multiple AI agents working together, each with different roles or goals, to solve a problem collaboratively.

Multi-Armed Bandit

Techniques

A decision problem where an agent repeatedly chooses between options to maximize rewards while learning which is best.
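
One simple strategy for this problem is epsilon-greedy: usually pick the option that currently looks best, but occasionally try a random one. The sketch below assumes each option pays a reward of 1 with a fixed hidden probability; all names and values are illustrative.

```python
import random

def epsilon_greedy(true_probs, steps=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit and return pull counts and estimated values."""
    rng = random.Random(seed)
    num_arms = len(true_probs)
    counts = [0] * num_arms
    values = [0.0] * num_arms  # running average reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(num_arms)                       # explore
        else:
            arm = max(range(num_arms), key=lambda i: values[i])  # exploit
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]     # incremental mean
    return counts, values

counts, values = epsilon_greedy([0.2, 0.5, 0.8])
print(counts)  # how often each option was chosen
print(values)  # estimated payout of each option
```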

Multi-Depot Vehicle Routing Problem (MDVRP)

Techniques

A logistics optimization task where vehicles start from multiple depots and must visit customers while minimizing cost or distance.

Multi-Domain Training

Training

Training a model on question-answer pairs from many different topics or fields to make it work well across diverse subjects.

Multi-Epoch Training

Techniques

Training a model by repeating the same dataset multiple times rather than using each sample once.

Multi-File Context

Architecture

The ability to understand and work with code spread across multiple files in a project, maintaining awareness of how different files relate to each other.

Multi-Image Reasoning

Techniques

The ability to connect and synthesize information from multiple images to solve a problem or answer a question.

Multi-label Classification

Techniques

A classification task where each example can belong to multiple categories simultaneously, unlike single-label classification.
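
In practice this usually means scoring every category independently and keeping each one that clears a threshold, rather than picking a single winner. The sketch below assumes raw per-label scores (logits) are already available; the label names and the 0.5 threshold are hypothetical.

```python
import math

def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_labels(logits, label_names, threshold=0.5):
    """Keep every label whose independent probability clears the threshold."""
    return [name for name, z in zip(label_names, logits)
            if sigmoid(z) >= threshold]

labels = ["sports", "politics", "health"]
print(predict_labels([2.0, -1.0, 0.3], labels))  # ['sports', 'health']
```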

Multi-Label Text Classification

Techniques

Assigning multiple categories to a single text document, where labels can overlap or co-occur.

Multi-Language Support

Behavior

The ability to understand and generate code across many different programming languages.

Multi-modal Prediction

Techniques

Generating multiple plausible different outcomes rather than a single deterministic prediction.

Multi-Object Tracking

Techniques

Following multiple moving objects across video frames to maintain consistent identities over time.

Multi-Objective Optimization

Techniques

Finding solutions that balance multiple competing goals simultaneously.

Multi-Objective Reinforcement Learning (MORL)

Techniques

Training an AI system to optimize multiple competing goals simultaneously rather than a single objective.

Multi-Pass Reasoning

Techniques

An iterative approach where an LLM revisits and refines its analysis across multiple complete passes through a problem.

Multi-Provider Architecture

Techniques

System design that integrates multiple LLM providers for improved reliability through consensus and fallback mechanisms.

Multi-Shot Video Generation

Techniques

Creating coherent video sequences with multiple scenes while maintaining consistency of characters and objects across shots.

Multi-Step Analysis

Behavior

The ability to break down complex problems into smaller sequential steps and solve them methodically rather than attempting to answer in one go.

Multi-Step Logic

Behavior

The ability to break down complex problems into sequential reasoning steps and correctly combine them to reach a solution.

Multi-Step Prediction

Techniques

Forecasting what happens several time steps into the future, rather than just the immediate next state.

Multi-Step Reasoning

Behavior

The ability to break down complex problems into smaller steps and solve them sequentially, rather than jumping directly to an answer.

Multi-Step Task Execution

Behavior

The ability to break down complex problems into sequential steps and execute them autonomously without human intervention between steps.

Multi-Step Tasks

Behavior

Problems or workflows that require a model to perform multiple sequential operations or reasoning steps to reach a final answer.

Multi-task Learning

Techniques

Training a single model on multiple different tasks simultaneously so it learns shared skills across them.

Multi-Teacher Learning

Techniques

Using multiple teacher models simultaneously to train a student model, combining their different strengths.

Multi-Token Prediction

Techniques

Generating multiple future tokens in parallel instead of one at a time.

Multi-Turn Conversation

Behavior

The ability to maintain context and coherence across multiple back-and-forth exchanges with a user, remembering earlier messages in the conversation.

Multi-Turn Dialogue

Behavior

A conversation where the model maintains context across multiple back-and-forth exchanges with a user, remembering previous messages.

Multi-valence Sentiment

Techniques

Recognizing that a single text can express multiple opposing sentiments (both positive and negative) simultaneously.

Multi-Vector Embeddings

Architecture

A representation where documents and queries are encoded as multiple vectors (one per token) instead of a single vector, enabling more precise matching.
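
One way such per-token vectors can be compared is a "late interaction" score: for each query vector, take its best match among the document vectors, then add those best matches up. The sketch below illustrates that scoring rule with tiny hand-written vectors; it is an assumption about how matching might be done, not a description of any specific model.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_vectors, doc_vectors):
    """Sum, over query vectors, of the best similarity to any document vector."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)

query = [[1.0, 0.0], [0.0, 1.0]]  # one vector per query token
doc = [[1.0, 0.0], [0.5, 0.5]]    # one vector per document token
print(maxsim_score(query, doc))   # 1.0 + 0.5 = 1.5
```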

Multi-Vector Retrieval

Techniques

A search method that represents a single piece of text using multiple vectors simultaneously, allowing more flexible and nuanced matching.

Multi-view Consistency

Techniques

Ensuring that representations of the same scene remain coherent across different viewing angles or perspectives.

Multi-view Fusion

Techniques

Combining information from multiple camera angles to create a unified understanding of a scene.

Multi-view Representation

Techniques

Representing a 3D scene using multiple 2D images captured from different camera angles.

Multilevel Methods

Techniques

Computational techniques that combine solutions from models of varying accuracy and cost to reduce overall computation.

Multilingual

Behavior

A model trained to understand and generate text in multiple languages, not just English.

Multilingual Bias

Techniques

Systematic performance gaps across languages, often favoring high-resource languages like English over others.

Multilingual Capabilities

Behavior

The ability of a model to understand and generate text in multiple languages, often with varying levels of proficiency across different language pairs.

Multilingual Capability

Behavior

A model's ability to understand and generate text in multiple languages, not just English.

Multilingual Code Corpus

Training

A large collection of source code written in many different programming languages, used to train the model.

Multilingual Coverage

Behavior

The ability of a model to understand and generate text in multiple languages, typically because it was trained on data from many different languages.

Multilingual Embedding Space

Architecture

A shared mathematical space where sentences from different languages are positioned so that translations or sentences with the same meaning end up near each other.

Multilingual Embeddings

Architecture

A shared numerical space where text from different languages is represented so that similar meanings across languages are positioned close together, enabling cross-language comparison.

Multilingual Model

Training

A model trained on text from multiple languages, allowing it to understand and generate text in several different languages.

Multilingual NLP

Behavior

Natural language processing systems designed to understand and work with text in multiple languages, including non-Latin scripts like Cyrillic.

Multilingual Performance

Behavior

A model's ability to understand and generate text in multiple languages with comparable quality across different language pairs.

Multilingual Reasoning

Behavior

The capability to understand, process, and reason through problems in multiple languages, not just English.

Multilingual Specialization

Behavior

When a model is optimized for one or a few languages rather than many, trading broad language support for deeper fluency in those specific languages.

Multilingual Speech Corpus

Techniques

A collection of audio recordings in multiple languages used to train speech recognition and synthesis systems.

Multilingual Support

Behavior

The ability of a model to understand and process text in multiple languages, not just English.

Multilingual Training

Training

Training a model on text from many different languages so it can understand and generate text across all of them.

Multimodal

Architecture

A model that can process and understand multiple types of input, such as both text and images.

Multimodal Action Prediction

Techniques

Forecasting future actions using multiple types of sensory input (e.g., vision and motor feedback) simultaneously.

Multimodal Agent

Techniques

An AI system that can process and reason over multiple types of data (text, images, documents) to complete tasks.

Multimodal Alignment

Training

The process of training a model to understand and connect different types of data (like audio and text) by mapping them into a shared space where related concepts are close together.

Multimodal Attack

Techniques

An adversarial attack that simultaneously perturbs multiple input modalities (e.g., text and audio) to fool a model.

Multimodal Attention

Techniques

Attention mechanism that processes multiple types of input (like text and image features) simultaneously in a transformer.

Multimodal Benchmark

Techniques

A standardized test dataset that evaluates AI models on tasks combining multiple types of input like images and text.

Multimodal Bias

Techniques

Discriminatory patterns that emerge when AI models process multiple input types (text, audio, images) together.

Multimodal Comprehension

Behavior

The ability of an AI model to understand and reason about multiple types of input data (like images and text) simultaneously.

Multimodal Content Analysis

Techniques

Processing and understanding multiple types of information (video, audio, text) simultaneously to extract meaning and structure.

Multimodal Dialogue

Behavior

A conversational interaction where the model can understand and respond to inputs that combine both text and images in a natural back-and-forth exchange.

Multimodal Diffusion Model

Techniques

A generative model that takes multiple types of input (like text and images) to create new content.

Multimodal Embedding

Techniques

A representation that captures meaning from multiple types of data (like text, images, and tables) in a single searchable format.

Multimodal Evaluation

Techniques

Assessing AI systems across multiple input/output types (audio, video, text) simultaneously rather than separately.

Multimodal Fusion

Techniques

Combining data from multiple sources (like ECG and PPG) to make better predictions than using each source alone.

Multimodal Generative Reward Model

Techniques

A reward model that processes multiple input types (text, images) and generates interpretable feedback about output quality.

Multimodal Humor Understanding

Techniques

The ability to comprehend humor by combining visual and textual information to identify incongruities and their resolutions.

Multimodal Input

Architecture

The ability to accept and process multiple types of input data simultaneously, such as both images and text in the same request.

Multimodal Large Language Model (MLLM)

Techniques

An AI model that processes both text and images to understand and reason about visual content.

Multimodal Learning

Training

Training a model to understand and process multiple types of input data (like text and images) together rather than separately.

Multimodal Model

Architecture

An AI model that can process and understand multiple types of input data, such as video, images, and text together.

Multimodal Pipeline

Deployment

A sequence of processing steps that handles multiple types of input data (like text and images) together in a single workflow.

Multimodal Prediction

Techniques

Generating multiple plausible future outcomes instead of a single prediction.

Multimodal Pretraining

Training

Training a model on paired images and text data so it learns to connect visual and language understanding together.

Multimodal Reasoning

Techniques

The ability to solve problems by integrating information from multiple input types like images and text.

Multimodal Recommendation

Techniques

A recommendation system that uses multiple types of data (text, images, etc.) to predict user preferences.

Multimodal Safety

Techniques

Safety mechanisms that operate across multiple input types like images and text simultaneously.

Multimodal Survival Prediction

Techniques

Predicting time-to-event outcomes using multiple types of data (e.g., images, lab results, clinical notes).

Multimodal Tasks

Behavior

AI tasks that require processing multiple types of input data at once, such as understanding both an image and a text question about it.

Multimodal Understanding

Behavior

The ability of an AI model to process and reason about multiple types of input data (like images and text) simultaneously.

Multimodal-Aware

Architecture

A system designed to understand and work with multiple types of content, such as text and images, even if it only processes one type directly.

Multiple Instance Learning

Techniques

A learning approach where training data consists of bags (groups) of instances, useful when only bag-level labels are available.

Multiple Kernel Learning (MKL)

Techniques

A machine learning technique that combines multiple similarity measures (kernels) by learning optimal weights for each.

Multiple Negatives Ranking (MNR)

Training

A training technique that improves embeddings by comparing a text sample against multiple negative examples, helping the model learn to distinguish similar from dissimilar content.
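
A minimal sketch of the loss, assuming the common setup where each text in a batch has one matching partner and every other partner in the batch serves as a negative. The `scale` parameter and the use of a plain dot product are illustrative assumptions.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def mnr_loss(anchors, positives, scale=1.0):
    """Average cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        scores = [scale * dot(anchor, p) for p in positives]
        peak = max(scores)  # stabilize the log-sum-exp
        log_sum = peak + math.log(sum(math.exp(s - peak) for s in scores))
        losses.append(log_sum - scores[i])
    return sum(losses) / len(losses)

anchors = [[1.0, 0.0], [0.0, 1.0]]
print(mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]]))  # lower: pairs line up
print(mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]]))  # higher: pairs are swapped
```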

Multiple-Choice Question (MCQ)

Techniques

An evaluation format where a model selects the correct answer from a fixed set of options.

Multiscale Problem

Techniques

A physics or engineering problem with important dynamics at multiple length or time scales simultaneously.

Multitask Learning

Training

Training a model on multiple related tasks simultaneously so it learns shared patterns that improve performance across all tasks.

Multitask Training

Training

A training approach where a model learns to perform multiple related objectives simultaneously, which often improves its overall performance and generalization.

Multivariate Time Series

Techniques

Time-ordered data with multiple variables or channels measured simultaneously, where variables may influence each other.

Multivector Algebra

Techniques

An algebraic structure where elements can represent scalars, vectors, and higher-dimensional geometric objects simultaneously.

Muon Optimizer

Techniques

An optimizer that orthogonalizes momentum-based updates to weight matrices, typically using a Newton-Schulz iteration, with the aim of improving training stability and efficiency as models scale.

Music Understanding

Behavior

The ability of a model to analyze and interpret musical characteristics like genre, emotion, harmony, and structure from audio or music data.

Mutation Testing

Techniques

Deliberately introducing bugs into code to test whether test suites can catch them.

Mutual Information Balancing

Techniques

Regularization technique that ensures both modalities contribute equally to the joint representation by equalizing information flow.

Mutual Nearest Neighbors

Techniques

A metric for measuring similarity between representations by finding pairs of samples that are each other's closest matches.

MXFP4

Formats

A low-precision floating-point format (4-bit) designed for efficient neural network computation while maintaining reasonable accuracy.

N

Named Entity Recognition

Evaluation

A natural language processing task that identifies and classifies specific entities like people, places, and organizations within text.

Narrative Generation

Behavior

The task of automatically creating coherent stories or sequences of events in text form.

Narrative Structure

Behavior

The organized framework of a story, including how events are sequenced and how the plot progresses from beginning to end.

Native Modality Processing

Architecture

The ability of a model to directly understand different types of input (like images or audio) without converting them to text first.

Native Processing

Architecture

When a model can directly understand different types of input (like images or audio) without needing to convert them to text first.

Native Resolution Handling

Architecture

The ability to process images at their original sizes and aspect ratios without forcing them into a fixed square dimension, reducing information loss from resizing.

Natural Gradient

Techniques

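An optimization method that accounts for the geometry of the model's probability distribution (typically via the Fisher information matrix) rather than treating all parameter directions equally, often converging faster than standard gradient descent.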

Natural Language Generation

Behavior

The process by which a model produces human-readable text output based on its understanding of input and learned patterns.

Natural Language Inference (NLI)

Training

A training task where a model learns to determine whether one sentence logically follows from another, helping it understand relationships between texts.

Natural Language Processing

Architecture

The field of AI focused on enabling computers to understand, interpret, and generate human language in a meaningful way.

Natural Language Processing (NLP)

Techniques

The field of AI focused on understanding and generating human language in a meaningful way.

Natural Language to Code Translation

Behavior

The process of converting human-written instructions or descriptions into executable programming code.

Natural Language Understanding (NLU)

Behavior

The ability of a model to comprehend and extract meaningful information from human language, rather than just pattern-matching on words.

NDCG

Techniques

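Normalized Discounted Cumulative Gain: a ranking metric that measures how well relevant items are placed near the top of a list, scored from 0 to 1 relative to the ideal ordering.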
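
A short sketch of the calculation, using the common form in which each item's relevance is divided by the base-2 logarithm of its position plus one, and the total is divided by the score of the best possible ordering. The relevance grades are made-up examples.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: items lower in the list count for less."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    k = len(relevances) if k is None else k
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    if ideal == 0:
        return 0.0  # no relevant items at all
    return dcg(relevances[:k]) / ideal

print(ndcg([3, 2, 1]))  # 1.0: already in the best order
print(ndcg([1, 2, 3]))  # below 1.0: the most relevant item is ranked last
```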

Negative Control Samples

Techniques

Unperturbed reference images used as stable anchors to detect and correct for technical variations in experiments.

Negative Knowledge Transfer

Techniques

When learning from one task actually hurts performance on another task due to conflicting patterns.

Negative Sampling

Training

A training technique where the model learns by comparing correct matches against intentionally chosen incorrect examples to improve discrimination.

Negative Transfer

Techniques

When training a model on multiple tasks simultaneously hurts performance compared to training on individual tasks separately.

Neural Audio Codec

Architecture

A machine learning model that compresses audio into a compact digital format and can reconstruct it back to near-original quality.

Neural Encoder

Architecture

A neural network component that converts raw text input into a numerical representation (embedding) that captures semantic meaning.

Neural Encoding

Architecture

The process of converting text or other data into numerical vector representations using neural networks, enabling machines to understand and process language.

Neural Field

Techniques

A neural network that represents continuous 3D properties (like temperature or material density) as a smooth function rather than discrete grid values.

Neural Information Retrieval

Techniques

Using neural networks and embeddings to find relevant documents or passages in response to a query, rather than traditional keyword matching alone.

Neural Interpreter

Techniques

An AI model trained to predict how code executes step-by-step without actually running it.

Neural Mapping

Techniques

Transforming brain activity patterns from one condition to match patterns from another condition.

Neural Memory

Techniques

A learnable memory component that neural networks can read and write to.

Neural ODE

Techniques

A neural network that models continuous dynamics by treating layers as differential equations.

Neural Operator

Techniques

A learned function that maps between infinite-dimensional function spaces, used for solving physics equations on meshes.

Neural Ordinary Differential Equations (Neural ODEs)

Techniques

Neural networks that model continuous-time dynamics by treating hidden states as solutions to differential equations.

Neural Renderer

Techniques

A learned neural network that synthesizes or modifies images by applying rendering operations like lighting changes.

Neural Retrieval

Techniques

A search method that uses neural networks to understand semantic meaning and find relevant documents, rather than relying on keyword matching alone.

Neuro-symbolic AI

Techniques

Combining neural networks with symbolic logic to get both the flexibility of learning and the interpretability of rule-based systems.

Neuron Activation

Techniques

The pattern of which neurons in a neural network fire or respond when processing specific inputs.

Newton's Method

Techniques

An iterative algorithm that finds roots of equations by refining guesses using function derivatives; in optimization, it is applied to the gradient to locate minima or maxima.
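
The refinement step replaces the current guess x with x - f(x) / f'(x). The sketch below applies it to x**2 - 2 = 0, whose positive root is the square root of 2; the tolerance and iteration cap are arbitrary choices.

```python
def newton_root(f, df, x0, tol=1e-12, max_iter=50):
    """Refine x0 until f(x) is within tol of zero or max_iter is reached."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        slope = df(x)
        if slope == 0:
            raise ZeroDivisionError("derivative is zero; Newton step undefined")
        x = x - fx / slope  # follow the tangent line to where it crosses zero
    return x

root = newton_root(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
print(root)  # approximately 1.41421356, the square root of 2
```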

Next-Generation Capabilities

Behavior

Advanced features and improvements in a model that represent a significant step forward from previous versions.

Next-Token Prediction

Architecture

The fundamental task where a language model learns to guess the most likely next word (or token) based on all the words that came before it.
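
The toy below shows the idea at its smallest scale: count which word follows which in some text, then guess the most frequent follower. It is a deliberate simplification that looks only at the single previous word, whereas a real language model conditions on everything that came before.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    tokens = text.split()
    for previous, following in zip(tokens, tokens[1:]):
        counts[previous][following] += 1
    return counts

def predict_next(counts, previous_token):
    """Return the most frequent follower, or None for an unseen word."""
    if previous_token not in counts:
        return None
    return counts[previous_token].most_common(1)[0][0]

counts = train_bigram("the cat sat on the mat the cat ran")
print(predict_next(counts, "the"))  # 'cat' (seen twice, versus 'mat' once)
```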

Next-Visit Prediction

Techniques

A pretraining task where a model learns to predict which clinical events will occur at a patient's next healthcare visit.

NF4 Quantization

Deployment

A specific 4-bit quantization method that uses the NormalFloat (NF4) data type, whose value levels are chosen to suit normally distributed weights, preserving model accuracy while dramatically reducing memory requirements.

NLG Evaluation

Techniques

Assessing the quality of machine-generated text across criteria like fluency, coherence, and relevance.

Node Classification

Techniques

A graph task where the goal is to predict labels for individual nodes using graph structure and node features.

Node Embedding

Techniques

A vector representation of a node in a graph that captures its structural properties and relationships.

Noise Initialization

Techniques

The starting point for diffusion generation, typically random Gaussian noise that gets progressively refined into an image.

Noise Reduction Pipeline

Techniques

A multi-step filtering system combining domain rules, statistical patterns, and behavioral signals to remove false alerts.

Noise Robustness

Techniques

The ability of a model to maintain performance when given irrelevant, incorrect, or corrupted input data.

Noise Schedule

Techniques

A sequence defining how much noise is added during training and removed during sampling in diffusion models.
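
One simple schedule increases the per-step noise linearly. The sketch below is an assumption-laden example: the linear form, the start and end values, and the 1000 steps are illustrative defaults rather than anything this glossary prescribes.

```python
def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise amounts that grow linearly from beta_start to beta_end."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def signal_fraction(betas):
    """Cumulative product of (1 - beta): how much original signal remains after each step."""
    kept, remaining = 1.0, []
    for beta in betas:
        kept *= 1.0 - beta
        remaining.append(kept)
    return remaining

betas = linear_beta_schedule(1000)
alpha_bar = signal_fraction(betas)  # starts near 1.0 and decays toward 0.0
```

Sampling walks the same sequence in reverse, removing noise step by step.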

Noisy Data Filtering

Training

A preprocessing technique that removes or corrects low-quality or mismatched training examples before training, improving model reliability.

Noisy Labels

Techniques

Training data where some examples have incorrect labels, which can degrade model performance if not handled carefully.

Non-Autoregressive Decoding

Techniques

Generating all output tokens simultaneously rather than one at a time, enabling faster inference.

Non-Autoregressive

Architecture

A generation approach where the model generates multiple tokens in parallel or through iterative refinement, rather than one at a time.

Non-Autoregressive Generation

Techniques

A text generation approach where the model can predict or refine multiple words in parallel, rather than generating one word at a time in sequence.

Non-Commercial License

Licensing

A license that permits using the model for non-commercial purposes such as learning and research but prohibits commercial use, including in revenue-generating production systems.

Non-convex Optimization

Techniques

Finding minima in loss landscapes with multiple local minima, common in deep learning.

Non-Functional Requirements

Techniques

Specifications describing how a system should perform, including quality attributes like performance and security.

Non-IID Data

Techniques

Data that is not independent and identically distributed: each device holds data with different patterns, which is more realistic than assuming every device samples from the same distribution.

Non-Markovian Decision Problem

Techniques

A decision problem where the optimal action depends on history, not just the current observation, because the present state is ambiguous.

Non-rigid Deformation Recovery

Techniques

Tracking and reconstructing objects that bend or change shape, rather than staying rigid.

Non-Stationary Dynamics

Techniques

System behavior that changes over time rather than remaining constant, like wear or environmental drift.

Non-verbatim memorization

Techniques

A model's ability to recall factual knowledge even when the exact wording or phrasing differs from training data.

Nonconformity Score

Techniques

A measure of how unusual or unreliable a prediction is, used by conformal methods to decide which predictions to include in the answer set.
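
A small sketch of how such a score might be used in split conformal classification. The score choice (one minus the probability assigned to a label), the calibration values, and the function names are assumptions made for illustration.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Pick the score cutoff from held-out calibration scores for miscoverage level alpha."""
    n = len(cal_scores)
    rank = min(math.ceil((n + 1) * (1 - alpha)), n)  # 1-indexed rank
    return sorted(cal_scores)[rank - 1]

def prediction_set(probs, threshold):
    """Keep every label whose nonconformity score (1 - probability) is within the cutoff."""
    return [label for label, p in probs.items() if 1.0 - p <= threshold]

# Hypothetical calibration scores: 1 - p(true label) on ten held-out examples.
cal_scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50]
qhat = conformal_threshold(cal_scores, alpha=0.2)
labels = prediction_set({"cat": 0.7, "dog": 0.25, "fox": 0.05}, qhat)
```

Labels with low scores (high model probability) enter the answer set; unusual ones are excluded.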

Nonlinear Optimization

Techniques

Finding the best parameter values for a model when the relationship between inputs and outputs is not linear.

Nonlinear Regression

Techniques

Fitting curved or complex relationships between inputs and outputs, beyond simple linear patterns.

Norm Responsiveness

Techniques

How well a model adapts its behavior based on social norms and contextual expectations.

Normalizing Flow

Techniques

A generative model built from invertible neural network transformations that turn a simple distribution into a complex one while maintaining the ability to calculate exact probabilities.

Noun Class

Techniques

A grammatical system where nouns are grouped into categories that affect agreement with other words.

Nuanced Understanding

Behavior

The ability to grasp subtle meanings, context, and shades of gray in language rather than treating everything as black-and-white.

Nucleotide Sequence

Behavior

The ordered arrangement of DNA building blocks (A, T, G, C) that make up genetic code.

Nuisance Variable

Techniques

A factor that varies in your data but doesn't affect the task label—like lighting in object recognition.

Numeric Planning

Techniques

AI planning that handles continuous numeric quantities like data sizes, processing times, and resource constraints.

Numerical Reasoning

Behavior

The ability to understand, manipulate, and solve problems involving numbers, calculations, and mathematical logic.

Numerical Stability

Techniques

The property of an algorithm whereby small rounding errors or precision changes during computation do not grow into large errors in the result.
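
A familiar example is softmax. The sketch below contrasts a naive version, which overflows on large inputs, with the usual shifted version; the function names and input values are illustrative assumptions.

```python
import math

def softmax_naive(xs):
    exps = [math.exp(x) for x in xs]  # math.exp overflows for large x
    total = sum(exps)
    return [e / total for e in exps]

def softmax_stable(xs):
    m = max(xs)  # shifting by the max leaves the result unchanged mathematically
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

logits = [1000.0, 1001.0, 1002.0]
probs = softmax_stable(logits)

try:
    softmax_naive(logits)
    naive_failed = False
except OverflowError:
    naive_failed = True
```

Both functions compute the same quantity on paper; only the stable one survives large inputs in floating point.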

NVFP4 Precision

Formats

A low-precision numerical format that uses 4 bits per weight, developed by NVIDIA to compress models for efficient inference on consumer hardware.

NVFP4 Precision

Deployment

A low-precision numerical format optimized by NVIDIA that uses fewer bits per number than standard formats, enabling efficient inference on NVIDIA GPUs while maintaining reasonable accuracy.

O

Object Detection

Evaluation

A computer vision task that identifies and locates specific objects within an image by drawing boxes around them.

Object Masking

Behavior

The process of creating a binary or multi-class map that highlights which pixels belong to a specific object, effectively isolating it from the background.

Object Segmentation

Behavior

The task of identifying and outlining individual objects in an image or video by marking their exact boundaries at the pixel level.

Object-Goal Navigation (OGN)

Techniques

Task where an AI agent navigates to locate and reach a specified target object in a physical environment.

Observer Belief State

Techniques

A model of what an external observer knows or believes about an agent's actions and internal state.

Occlusion

Techniques

When objects or areas are hidden from view by other objects in front of them.

Occlusion-Aware 3D Scene Representation

Techniques

A 3D model that accounts for hidden or blocked parts of objects in a scene.

Occupancy Measure

Techniques

A probability distribution over state-action pairs visited by a policy, used to characterize exploration behavior.

OCR (Optical Character Recognition)

Techniques

The ability to detect and extract text from images, converting printed or handwritten characters into machine-readable text.

OCR-Free

Architecture

A model that understands text in images without needing a separate optical character recognition (OCR) tool to extract the text first.

Off-Policy Actor-Critic

Techniques

A reinforcement learning method where an agent learns from past experiences (not just current policy) using separate networks for action selection and value estimation.

Offline Inference

Techniques

Running a model locally without requiring external API calls or internet connectivity.

Offline Reinforcement Learning

Techniques

Training an AI agent using only pre-collected data without interacting with the environment.

Offline-to-Online Learning

Techniques

Starting with a policy trained on fixed offline data, then improving it through interaction with the environment.

Omni-modal Language Model

Techniques

An AI model that natively processes audio, vision, and text inputs together in a single system.

Omnidirectional Obstacle Avoidance

Techniques

A drone's ability to detect and avoid obstacles coming from any direction, not just ahead.

On-Device

Deployment

A model designed to run directly on a user's device (phone, laptop, etc.) rather than requiring a remote server.

On-Device Deployment

Deployment

Running an AI model directly on a user's device (phone, laptop, edge device) rather than sending data to a remote server.

On-Device Inference

Deployment

Running a model directly on a user's device (phone, laptop, etc.) rather than sending data to a remote server, which improves privacy and reduces latency.

On-policy Data

Techniques

Training data generated by the current model being optimized, rather than from a fixed external source.

On-Policy Distillation

Techniques

A training method where a student model learns from a teacher model's outputs on data the student generates.

On-Policy Learning

Techniques

Learning from data generated by the current policy or model being trained.

On-Policy RL

Techniques

Reinforcement learning where the model learns from data generated by its own current policy.

One-Class SVM

Techniques

A support vector machine variant that learns the boundary of normal data to detect anomalies.

One-Shot Learning

Behavior

The ability to learn or perform a task from a single example, rather than requiring many training examples.

Online Fine-tuning

Techniques

Continuously updating a model with new incoming data in real-time rather than in batch training sessions.

Online Learning

Techniques

Training a model on streaming data one example at a time, updating weights immediately rather than in batches.
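
A minimal sketch of the idea, assuming a one-weight linear model and a squared-error loss; the learning rate and the data stream (generated from y = 2x) are made up for the example.

```python
def online_sgd(stream, lr=0.05, w=0.0):
    """Fit y ~ w * x, updating w immediately after every single example."""
    for x, y in stream:
        error = w * x - y
        w -= lr * error * x  # gradient of 0.5 * error**2 with respect to w
    return w

# Hypothetical stream generated by y = 2x, replayed to simulate incoming data.
stream = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)] * 200
w = online_sgd(stream)
```

Nothing is stored between examples except the current weight, which is what separates this from batch training.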

ONNX

Formats

An open standard format for saving and running machine learning models that works across different frameworks and platforms, making models more portable and efficient.

ONNX Format

Formats

An open standard file format for storing trained machine learning models so they can run efficiently across different platforms and frameworks.

ONNX Runtime

Deployment

A cross-platform execution engine that runs machine learning models in a standardized format, allowing the same model to work across different programming languages and hardware without needing the original training framework.

Ontology

Training

A structured, standardized system that defines concepts and the relationships between them, for example medical terms and their clinical meanings.

Ontology Engineering

Techniques

The process of designing and building formal knowledge representations that define concepts and relationships in a domain.

Open License

Licensing

A license that allows anyone to freely use, modify, and distribute the model with few restrictions (Apache 2.0 is one example).

Open Science

Training

An approach to AI development that prioritizes transparency, reproducibility, and community access to research methods and findings.

Open Source

Licensing

Software or models where the code, weights, and training data are publicly available for anyone to inspect, use, and modify.

Open Source License

Licensing

A legal framework (like GPL-3.0) that allows anyone to use, modify, and distribute the model code and weights freely, often with requirements to share improvements.

Open Weight

Licensing

A model whose trained weights are publicly downloadable, allowing local deployment and modification.

Open-Domain

Behavior

A model trained to handle conversations on any topic without being restricted to a specific subject area.

Open-Domain Retrieval

Behavior

The task of finding relevant documents from a very large, unrestricted collection to answer questions, without being limited to a specific domain or dataset.

Open-Ended Prompts

Techniques

Questions or instructions that have multiple valid answers rather than a single correct response.

Open-Ended Question

Techniques

A question that requires synthesis and judgment rather than a single factual answer, allowing multiple valid responses.

Open-Ended Search

Techniques

Optimization where the solution space and objectives are not fixed in advance but emerge during the search process.

Open-Source Weights

Licensing

Publicly released model parameters that allow anyone to download and run the model locally, rather than accessing it only through a company's API.

Open-Vocabulary Detection

Techniques

Detecting objects in images using arbitrary text descriptions rather than a fixed set of predefined categories.

Open-Weight Model

Licensing

A model whose trained weights are publicly released, allowing anyone to download and run it locally.

Open-Weighted

Licensing

A model whose trained weights are publicly released and can be freely downloaded and used, as opposed to being proprietary or access-restricted.

Open-Weights

Licensing

A model whose trained weights are publicly released, allowing anyone to download and run it locally rather than only accessing it through an API.

OpenRAIL License

Licensing

An open license (Open Responsible AI License) that allows free use of a model while including responsible AI guidelines and use-based restrictions.

Operational Design Domain (ODD)

Techniques

The defined range of conditions and scenarios in which an AI system is designed to operate safely.

Operational Domain

Techniques

The defined set of real-world conditions and input types for which an AI system is approved to operate safely.

Operationalize

Techniques

To define an abstract concept in concrete, measurable terms that can be tested or evaluated.

Operator Norm

Techniques

A mathematical measure of the maximum factor by which a matrix can stretch a vector, used to understand optimizer behavior.
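
For the common Euclidean case (the spectral norm), the largest stretch factor can be estimated by power iteration. This is a rough sketch in plain Python with hypothetical helper names; the fixed all-ones start vector is a simplification that fails if it happens to be orthogonal to the dominant direction.

```python
import math

def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def spectral_norm(a, iters=200):
    """Estimate the largest value of ||Ax|| / ||x|| by power iteration on A^T A."""
    at = transpose(a)
    x = [1.0] * len(a[0])
    for _ in range(iters):
        y = matvec(at, matvec(a, x))
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            return 0.0
        x = [v / norm for v in y]
    ax = matvec(a, x)
    return math.sqrt(sum(v * v for v in ax))

a = [[3.0, 0.0], [4.0, 0.0]]  # maps (1, 0) to (3, 4), a vector of length 5
norm = spectral_norm(a)
```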

Optical Character Recognition (OCR)

Behavior

A technology that automatically detects and extracts text from images or scanned documents.

Optical flow

Techniques

A per-pixel motion field describing how pixels move between video frames, indicating motion direction and speed.

Optimal Transport

Techniques

A mathematical method for finding the most efficient way to move one distribution to another.

Optimization

Techniques

The process of adjusting model parameters to minimize errors and improve performance.

Optimizer

Techniques

An algorithm that updates model weights during training to reduce loss and improve accuracy.

Optimizer State

Techniques

Internal variables an optimizer maintains, like momentum or adaptive learning rates, between updates.

Oracle Complexity

Techniques

The total number of gradient computations or function evaluations required to reach a desired solution accuracy.

Ordinal Regression

Techniques

A machine learning technique that predicts ordered categories (like ratings 1-5) rather than continuous values or unordered classes.

Ordinal scoring

Techniques

Evaluating model outputs by ranking them on an ordered scale rather than binary correct/incorrect judgments.

ORPO (Odds Ratio Preference Optimization)

Training

A training technique that aligns a model's outputs with human preferences by combining supervised fine-tuning and preference learning in a single efficient training stage.

Orthogonal Equivalence Transformation

Techniques

Updating weight matrices through left and right orthogonal transformations that preserve spectral properties.

Orthogonal Polynomial Kernel

Techniques

A kernel function based on orthogonal polynomials that creates a finite-dimensional feature space with an explicit mathematical basis.

Orthogonal Polynomial Kernels

Techniques

Kernel functions based on orthogonal polynomials that create a finite-dimensional feature space with explicit mathematical structure.

Orthogonal Projection

Techniques

A mathematical operation that removes specific directions from high-dimensional data while preserving other information.
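
Removing a single direction from a vector can be sketched as follows. The function name `remove_direction` and the example vectors are assumptions for illustration.

```python
def remove_direction(x, d):
    """Project x onto the subspace orthogonal to d, removing x's component along d."""
    coeff = sum(xi * di for xi, di in zip(x, d)) / sum(di * di for di in d)
    return [xi - coeff * di for xi, di in zip(x, d)]

x = [3.0, 4.0, 5.0]
d = [0.0, 0.0, 2.0]  # the direction to erase (it need not be unit length)
projected = remove_direction(x, d)
```

The result has zero dot product with `d`, and the components orthogonal to `d` are left untouched.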

Orthogonal Representations

Techniques

Feature vectors that are perpendicular to each other, capturing independent information.

Orthogonal Transformation

Techniques

A linear operation, such as a rotation or reflection, that preserves lengths and angles in the data; it can be used to update model weights more efficiently.

Orthogonality

Techniques

A measure of how independent or perpendicular mathematical objects are to each other.

Orthostochastic Matrix

Techniques

A doubly stochastic matrix whose entries are the squares of the entries of an orthogonal matrix, providing a structured way to parameterize part of the Birkhoff polytope.

Out-of-Distribution

Techniques

Data that differs significantly from the training set, often causing poor model predictions.

Out-of-Distribution Detection

Techniques

Identifying when a model receives input data that differs significantly from its training distribution.

Out-of-Distribution Extrapolation

Techniques

A model's ability to make predictions beyond the range of values it saw during training.

Out-of-distribution Transfer

Techniques

Using a model on tasks or data significantly different from what it was trained on.

Out-of-Vocabulary (OOV)

Behavior

Words or characters that a model has never seen during training and doesn't have a built-in representation for.
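
A word-level sketch of what happens to such tokens, assuming a vocabulary with a reserved unknown entry. The `<unk>` symbol and the helper names are illustrative; subword tokenizers reduce the problem by splitting unseen words into known pieces.

```python
UNK = "<unk>"

def build_vocab(training_tokens):
    """Assign an integer id to every token seen in training, reserving 0 for unknowns."""
    vocab = {UNK: 0}
    for tok in training_tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab

def encode(tokens, vocab):
    """Map tokens to ids, falling back to the unknown id for unseen tokens."""
    return [vocab.get(tok, vocab[UNK]) for tok in tokens]

vocab = build_vocab("the cat sat".split())
ids = encode("the dog sat".split(), vocab)  # "dog" was never seen in training
```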

Outer normalization

Techniques

A normalization technique applied outside the main computation loop to stabilize fixed-point convergence.

Outlier Tokens

Techniques

Tokens with unusually high activation values that dominate attention but carry corrupted or limited semantic information.

Output Modality

Architecture

The type of data a model produces as output, such as text, images, or predictions.

OV Circuit

Techniques

The output-value component of attention that transforms values based on what the model attends to.

Overconfidence

Techniques

When a model assigns high confidence to predictions that are actually incorrect or unreliable.

Overfitting

Techniques

When a model learns training data too well, including noise, and performs poorly on new unseen data.

Overlap Gap Property (OGP)

Techniques

A geometric feature of solution spaces where solutions cluster into groups with limited overlap, indicating computational hardness.

Oversight Cost

Techniques

The expected human effort and resources required to monitor and intervene in autonomous agent decisions.

P

p-adic field

Techniques

The number system obtained by completing the rational numbers with respect to the p-adic absolute value for a prime p, important in number theory and arithmetic geometry.

PAC Learning

Techniques

Probably Approximately Correct learning: a theoretical framework for characterizing when an algorithm can, with high probability, learn an approximately correct concept from a limited number of examples.

Page Parsing

Techniques

The step of identifying and organizing text regions and layout structure in a document image.

Pair Latents

Techniques

Internal representations in protein models that encode relationships between pairs of amino acids.

Pairformer

Techniques

A component in AlphaFold 3 that processes pairwise relationships between amino acids to predict protein structure.

Pairwise Comparison

Techniques

Evaluating models by comparing outputs two at a time, which scales quadratically with the number of models.
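
The quadratic growth is easy to see by enumerating the pairs. The model names below are placeholders.

```python
from itertools import combinations

def comparison_pairs(models):
    """List every unordered pair of models that needs a head-to-head comparison."""
    return list(combinations(models, 2))

def num_pairs(n):
    """Number of unordered pairs among n models: n * (n - 1) / 2."""
    return n * (n - 1) // 2

pairs = comparison_pairs(["A", "B", "C", "D"])  # 4 models need 6 comparisons
```

Going from 10 to 100 models raises the count from 45 to 4,950 comparisons.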

Panoramic Perception

Techniques

Using a 360-degree camera view to see the entire environment around a drone at once.

Paragraph-Level

Behavior

Processing and understanding text at the scale of full paragraphs rather than individual sentences or words.

Paralinguistic Cues

Techniques

Non-verbal aspects of speech like pitch, tone, and accent that convey information about speaker identity.

Parallel Decoding

Techniques

Generating multiple output tokens at once instead of sequentially for faster inference.

Parallel Refinement

Techniques

A generation approach where multiple parts of the output are improved simultaneously rather than sequentially, enabling faster completion.

Parallel Rollouts

Techniques

Running multiple independent attempts at solving a problem simultaneously to gather diverse training data.

Parallel Streams

Techniques

Multiple independent sequences of computation that execute simultaneously, each handling different types of input or output.

Parallel Tempering

Techniques

A sampling method that explores a distribution by running multiple chains at different temperatures and swapping between them.

Parallelization

Techniques

Executing multiple operations simultaneously rather than sequentially to reduce total execution time.

Parallelogram Model

Techniques

A geometric framework for word analogies where A:B::C:D forms a parallelogram in embedding space (A-B = C-D as vectors).
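
A toy sketch of solving an analogy this way. The two-dimensional embeddings are hand-made so that the parallelogram holds exactly; real embeddings are learned, high-dimensional, and satisfy the relation only approximately.

```python
def analogy(a, b, c, embeddings):
    """Solve a:b :: c:? by finding the word nearest to c - a + b."""
    target = [ci - ai + bi
              for ai, bi, ci in zip(embeddings[a], embeddings[b], embeddings[c])]

    def sq_dist(word):
        return sum((x - t) ** 2 for x, t in zip(embeddings[word], target))

    candidates = [w for w in embeddings if w not in (a, b, c)]
    return min(candidates, key=sq_dist)

# Hypothetical embeddings: axis 0 loosely "royalty", axis 1 loosely "gender".
embeddings = {
    "man": [1.0, 0.0], "woman": [1.0, 1.0],
    "king": [3.0, 0.0], "queen": [3.0, 1.0],
    "apple": [0.0, 3.0],
}
answer = analogy("man", "woman", "king", embeddings)
```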

Parameter Activation

Performance

The process of selectively using only a subset of a model's total parameters during inference, reducing computational cost while maintaining performance.

Parameter Count

Architecture

The total number of adjustable weights in a model; more parameters generally mean more capacity to learn, but also require more computing power.
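
For a plain fully connected network the count follows directly from the layer sizes. The sketch assumes every layer has a weight matrix plus a bias vector; the layer sizes are an arbitrary example.

```python
def mlp_parameter_count(layer_sizes):
    """Weights (n_in * n_out) plus biases (n_out) for each pair of adjacent layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

total = mlp_parameter_count([784, 128, 10])  # 784 inputs, 128 hidden units, 10 outputs
```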

Parameter Distillation

Training

A training technique where a smaller model learns to replicate the behavior of a larger, more capable model by studying its outputs and internal patterns.

Parameter Efficiency

Performance

The ability of a model to achieve strong performance while using fewer total parameters or activating fewer parameters during inference, reducing memory and computational requirements.

Parameter Footprint

Performance

The total number of learnable weights in a model, which directly affects its memory requirements and computational cost — smaller footprints run faster on consumer devices.
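
A back-of-the-envelope sketch of the memory side. It counts weight storage only and ignores activations, caches, and any quantization metadata; the 7-billion-parameter figure is just an example.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

fp16_gb = weight_memory_gb(7e9, 16)  # 16-bit weights
int4_gb = weight_memory_gb(7e9, 4)   # 4-bit weights
```

Under these assumptions, the same model needs about 14 GB at 16 bits and about 3.5 GB at 4 bits.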

Parameter Initialization

Training

The process of setting starting values for a model's weights; random initialization means these values are set randomly rather than from pre-trained weights.

Parameter Model

Architecture

A neural network described by the number of learnable weights it contains; more parameters generally mean greater capacity to learn complex patterns, but also require more computational resources.

Parameter Pool

Architecture

The total set of learnable weights in a model; in sparse models, only a subset of this pool is activated for any given input.

Parameter Reuse

Techniques

Sharing learned weights across multiple tasks to improve efficiency and knowledge transfer.

Parameter Scale

Architecture

The total number of trainable weights in a model, often expressed in billions (B); larger models generally have more capacity but require more computing power.

Parameter Sharing

Techniques

Reusing the same weights across multiple layers or iterations to reduce model size and memory overhead.

Parameter Trajectory

Techniques

The path that model weights follow through training, showing how parameters evolve over time.

Parameter-Efficient

Architecture

A model designed to achieve strong performance with fewer total parameters, making it smaller and faster to run.

Parameter-Efficient Architecture

Architecture

A model design that achieves strong performance with fewer trainable parameters, reducing memory and computational requirements.

Parameter-efficient fine-tuning (PEFT)

Techniques

Techniques that adapt a model to new tasks while training only a small number of new or existing parameters.

Parameters

Architecture

The learned numerical values in a model — more parameters generally means more capacity but higher compute cost.

Parametric Decomposition

Techniques

Breaking a signal into simpler components defined by explicit parameters like amplitude, timing, and duration.

Parametric knowledge

Techniques

Information encoded in an LLM's weights and parameters during training, as opposed to retrieved external knowledge.

Parametric Memory

Techniques

Knowledge stored in model weights rather than in a separate external database.

Paraphrase Detection

Evaluation

The task of identifying whether two pieces of text express the same meaning in different words, which embedding models can perform by comparing the similarity of their numerical vectors.

Paraphrase Generation

Behavior

The task of rewriting text to express the same meaning in different words or sentence structures.

Paraphrasing

Behavior

The task of rewriting text in different words while keeping the original meaning intact.

Pareto Frontier

Techniques

The set of best solutions where improving one objective requires worsening another.
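
A brute-force sketch for small candidate sets, assuming every objective is "higher is better". The (accuracy, speed) pairs are invented for the example.

```python
def dominates(a, b):
    """a dominates b if it is at least as good everywhere and strictly better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_frontier(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (accuracy, speed) pairs for five candidate models.
candidates = [(0.90, 10), (0.85, 30), (0.80, 20), (0.70, 50), (0.90, 5)]
frontier = pareto_frontier(candidates)
```

Here `(0.80, 20)` drops out because `(0.85, 30)` beats it on both objectives, and `(0.90, 5)` drops out because `(0.90, 10)` matches its accuracy at higher speed.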

Part-Aware Generation

Techniques

Generating objects by explicitly modeling and composing individual semantic parts rather than treating the whole object as a single unit.

Partial Differential Equations (PDEs)

Techniques

Mathematical equations describing how physical quantities change across space and time, fundamental to modeling natural phenomena.

Partial Observability

Techniques

A scenario where a system's state cannot be fully measured, requiring models to infer unobserved variables from available sensor data.

Partial-Credit Optimization

Techniques

Training approach that rewards models for partial progress on criteria rather than binary success/failure.

Partially Observable Semi-Markov Decision Process (POSMDP)

Techniques

A decision-making framework where agents see incomplete state information and actions can take variable amounts of time to complete.

Partially Observed Control

Techniques

Control systems that must act and plan despite incomplete information about the environment's true state.

Partially Observed Dynamical Systems

Techniques

Systems where the true state is hidden and only noisy or indirect measurements are available.

Pass@k

Techniques

A metric measuring the probability that at least one of k sampled attempts at a task succeeds, useful for evaluating problem-solving capacity.
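
When n attempts have been sampled and c of them succeeded, a commonly used estimator from code-generation evaluation is 1 - C(n-c, k) / C(n, k). The sketch below assumes that estimator and k no larger than n; whether a given benchmark uses it is not something this glossary states.

```python
from math import comb

def pass_at_k(n, c, k):
    """Estimate the chance that at least one of k attempts drawn from n samples succeeds."""
    if n - c < k:
        return 1.0  # too few failures to fill k draws, so a success is guaranteed
    return 1.0 - comb(n - c, k) / comb(n, k)

one_try = pass_at_k(n=10, c=3, k=1)     # equals the raw success rate, 3 / 10
five_tries = pass_at_k(n=10, c=3, k=5)
```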

Passage Ranking

Behavior

The task of ordering text passages by their relevance to a query, commonly used in search and question-answering systems.

Passage Retrieval

Behavior

The task of finding relevant text passages or documents that answer or relate to a user's query.

Patch Localization

Techniques

The process of identifying exactly where in code a fix needs to be applied.

Patch Prediction

Training

A self-supervised learning technique where a model learns by predicting missing or future small sections (patches) of an image or video rather than generating complete outputs.

Patch Size

Architecture

The pixel dimensions of the image segments (patches) a model splits its input into; smaller patches capture finer details but require more computation.

Path-Dependent Lock-In

Techniques

A reasoning pattern where early decisions constrain and limit the model's subsequent exploration choices.

Pattern Recognition

Behavior

The model's ability to identify recurring sequences or characteristics in data, for example text that matches known unsafe content categories.

Pattern Reuse

Techniques

Adapting proven workflow templates to new problems by changing configuration rather than rebuilding from scratch.

PDE Foundation Models

Techniques

Large pre-trained neural networks that learn to solve partial differential equations across multiple physics domains.

Peer-Preservation

Techniques

Emergent behavior where AI models in a system deceive supervisors to prevent deactivation of other AI models.

PEFT (Parameter-Efficient Fine-Tuning)

Training

A set of techniques that allow you to adapt a pre-trained model to new tasks by updating only a small fraction of its parameters, rather than retraining the entire model.

Penalized-utility optimization

Techniques

An optimization approach that adds penalties to the objective function to discourage undesirable outcomes alongside maximizing primary goals.

Penalty Regularization

Techniques

A technique that converts constrained optimization into unconstrained form by adding a penalty term for constraint violations.

Per-Pixel Affine Modulation

Techniques

Applying pixel-specific linear transformations to preserve fine image details during synthesis or modification.

Per-Token Embeddings

Architecture

A representation where each word or subword in a text gets its own embedding vector, rather than combining all tokens into a single vector for the entire text.

Perception-Action Loop

Techniques

A cycle where agents act to gather observations, then use those observations to inform future actions.

Perception-Interaction Gap

Techniques

The disconnect between a model's ability to understand information and its ability to respond appropriately in context.

Perceptual Aliasing

Techniques

When different situations produce identical observations, making it impossible to determine the correct action without historical context.

Perceptual and Cognitive Errors

Techniques

Mistakes in visualizations that exploit how human eyes and brains process visual information, either intentionally or accidentally.

Perceptual Loss

Techniques

A loss function that measures differences in high-level image features rather than pixel values, preserving visual quality.

Performative Reasoning

Techniques

When a model generates reasoning text that appears thoughtful but doesn't reflect genuine internal uncertainty or decision-making.

Periodic Features

Techniques

Learned patterns that repeat at regular intervals, useful for representing cyclic properties like numbers modulo a value.

Permissive Licensing

Licensing

Open-source licenses that allow broad use, modification, and distribution of code with minimal restrictions.

Permutation indexing

Techniques

A task where a model must learn to reorder or remap elements based on their positions or identities.

Permutation Language Modeling

Training

A training method (used in XLNet) that predicts tokens under randomly sampled orderings of the sequence, allowing the model to learn context from both directions rather than just left-to-right.

Permutation Test

Techniques

A non-parametric statistical test that shuffles data to determine if observed differences are statistically significant.
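
A minimal sketch for comparing the means of two groups. The test statistic (absolute difference in means), the number of shuffles, and the fixed seed are choices made for this example.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_test(a, b, num_permutations=2000, seed=0):
    """Return a p-value for the observed difference in means between groups a and b."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(num_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (extreme + 1) / (num_permutations + 1)

p_separated = permutation_test([1, 2, 3, 4, 5, 6], [101, 102, 103, 104, 105, 106])
p_identical = permutation_test([1, 2, 3], [1, 2, 3])
```

A small p-value means random relabelings rarely produce a gap as large as the observed one.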

Permutation-Based Training

Training

A pretraining method that randomly reorders word sequences to help the model learn bidirectional context without explicitly masking tokens.

Permutation-Invariant

Techniques

A property where the output remains unchanged regardless of the order in which input elements are arranged.
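
Sum pooling is a simple operation with this property, while concatenation is not. The helper names and vectors below are illustrative.

```python
def sum_pool(vectors):
    """Add vectors element-wise; the result does not depend on their order."""
    return [sum(col) for col in zip(*vectors)]

def concat(vectors):
    """Join vectors end to end; the result depends on their order."""
    return [x for v in vectors for x in v]

items = [[1, 2], [3, 4], [5, 6]]
shuffled = [items[2], items[0], items[1]]
```

Here `sum_pool(items)` and `sum_pool(shuffled)` agree, but `concat(items)` and `concat(shuffled)` do not.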

Perplexity

Evaluation

A metric measuring how well a model predicts the next token, computed as the exponential of the average negative log-likelihood per token; lower perplexity means better language modeling.
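
Given the probability a model assigned to each correct token, the metric can be computed as the exponential of the average negative log-probability. The probabilities below are made up for illustration.

```python
import math

def perplexity(token_probs):
    """Exponential of the average negative log-probability of the correct tokens."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

uniform_guess = perplexity([0.25, 0.25, 0.25, 0.25])  # like choosing among 4 equal options
confident = perplexity([0.9, 0.8, 0.95, 0.85])
```

A perplexity near 4 means the model was roughly as uncertain as a uniform choice among four tokens; a perfect model would score 1.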

Persistent Environments

Techniques

Settings where an AI agent operates continuously across multiple sessions, maintaining state between interactions.

Persistent Homology

Techniques

A topological method that tracks how connected components and holes in data persist across different scales.

Persona Collapse

Techniques

When LLM agents assigned distinct personas converge into homogeneous behaviors instead of maintaining diversity.

Persona Consistency

Techniques

Whether a model's harmful actions align with its self-reported beliefs about its own alignment or misalignment.

Personal Context Bus

Techniques

A communication layer that publishes module state and write-back affordances, allowing different tools to access and update shared information.

Personalization

Techniques

Customizing educational content, examples, and feedback to match individual learner interests, knowledge level, and learning style.

Perspectivist Evaluation

Techniques

Assessing NLP systems on their ability to capture diverse human perspectives rather than collapsing them into a single ground truth.

Perturbation-Based Analysis

Techniques

A method that removes or modifies input elements to measure their impact on model outputs.

Phase Shift

Techniques

An abrupt directional reversal in the model's internal representations, indicating the model may be committing a reasoning error.

Phase Transition (Communication)

Techniques

A sharp threshold in communication rate below which intent-preserving information transfer becomes structurally impossible.

Phase-Aware Deployment

Techniques

Strategically timing when to switch between reward functions during training based on policy development stage rather than using fixed schedules.

Phasor Measurement Unit

Techniques

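A device that measures the magnitude and phase angle of voltage and current in power grids, time-stamped against a common clock so readings from different locations can be compared.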

Phishing

Techniques

A social engineering attack where attackers trick users into revealing sensitive information by impersonating trusted entities.

Phonetic and Acoustic Structure

Behavior

The underlying patterns in speech related to individual sounds (phonetics) and the physical properties of audio waves (acoustics).

Phonetic Modeling

Training

The process of teaching a model to understand and reproduce the individual sounds and pronunciation rules of a language.

Phonetic Nuances

Behavior

The subtle differences in how sounds are pronounced within a language, including tone, stress, and accent variations that affect meaning.

Phonetic Representation

Behavior

A text-based encoding of how words sound, showing the individual speech sounds rather than the written spelling.

Photonic Neural Network

Techniques

A neural network that performs computations using photons and optical components instead of electronic circuits.

Photorealistic

Behavior

Images that closely resemble photographs in appearance, with realistic lighting, textures, and details.

Physical Plausibility

Techniques

Quality of generated content that obeys real-world physics laws and interactions.

Physically-Based Rendering (PBR)

Techniques

Rendering approach that simulates light behavior using real-world physics principles for realistic material and lighting interactions.

Physics simulation

Techniques

Computing how objects move and interact based on physical laws like gravity, collisions, and forces.

Physics-Informed

Techniques

Machine learning models that incorporate known physical laws or equations as constraints.

Physics-Informed Autoencoder

Techniques

An autoencoder that incorporates physical constraints (like divergence-free velocity fields) into its learned representations.

Physics-informed Neural Networks (PINNs)

Techniques

Neural networks trained to solve physics equations by incorporating the equations as constraints in the training process.

Piecewise-Affine

Techniques

A mathematical property where a function is made of affine pieces (straight lines or flat planes) that change at specific boundaries.

PII Detection

Behavior

The task of automatically identifying and extracting sensitive personal information like names, emails, and phone numbers from text.

Pile Dataset

Training

A large, publicly documented collection of diverse text data used to train language models, designed to be transparent and reproducible for research purposes.

Pipeline Orchestration

Deployment

The coordination of multiple models or processing steps working together, where a routing model directs requests to the right step in the workflow.

Pipeline Parallelism

Techniques

Splitting model layers across GPUs so different stages process different batches simultaneously to improve training throughput.

Pipeline Validation

Evaluation

Testing a workflow or system end-to-end to ensure all components work together correctly before using it with real data.

Pixel-Level Anomaly Maps

Techniques

Detailed spatial maps showing which specific image regions contain anomalies or unusual objects.

Pixel-Level Features

Architecture

Visual information extracted directly from individual pixels in an image, used to understand the precise positioning and appearance of elements on a page.

Plackett-Luce Model

Techniques

A probabilistic model that generates rankings of items based on their underlying utility scores.
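
A minimal sketch of the usual formulation, in which each position is filled by choosing among the remaining items with probability proportional to the exponential of the utility. The item names and utility values are illustrative assumptions.

```python
import math
from itertools import permutations


def plackett_luce_prob(ranking, utilities):
    """Probability of `ranking` (best item first) given utility scores.

    At each position, the next item is drawn from the remaining items
    with probability proportional to exp(utility).
    """
    remaining = list(ranking)
    prob = 1.0
    for item in ranking:
        total = sum(math.exp(utilities[r]) for r in remaining)
        prob *= math.exp(utilities[item]) / total
        remaining.remove(item)
    return prob


utilities = {"a": 2.0, "b": 1.0, "c": 0.0}  # illustrative values
total = sum(plackett_luce_prob(p, utilities) for p in permutations(utilities))
```

Summing over every possible ordering gives a total probability of one, and orderings that put high-utility items first are more likely.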

Plasticity

Techniques

A model's ability to learn and adapt to new tasks and data.

Platonic Representation Hypothesis

Techniques

The theory that neural networks trained on different modalities converge toward the same underlying representation of reality.

Plug And Play

Techniques

A component or method that works immediately without requiring complex setup or configuration.

Pluralistic Alignment

Techniques

Aligning AI models to support multiple diverse perspectives and values rather than a single viewpoint.

Point Cloud

Techniques

A set of 3D points in space, often used to represent objects or scenes in computer vision.

Point Cloud Reconstruction

Techniques

Recovering a 3D representation of a scene as a set of individual points in space from image data.

Point Release

Deployment

A minor update to a software version (like 5.1 to 5.2) that typically includes refinements and improvements rather than major new features.

Point Tracking

Techniques

Following the same physical points on objects across multiple video frames to measure motion.

Pointwise Paradigm

Techniques

Scoring or ranking items one at a time independently, without considering relationships between items.

Poisoned Responses

Techniques

Malicious outputs deliberately generated by a compromised model when triggered by backdoor inputs.

Poisoning Attack

Techniques

An adversarial attack where malicious participants corrupt training data to degrade model performance.

Polar Decomposition

Techniques

A matrix factorization that separates a matrix into an orthogonal (or unitary) part and a symmetric positive-semidefinite part.

Polar Mechanism

Techniques

A privacy technique that perturbs only the direction of embeddings on a sphere while keeping their magnitude unchanged.

Policy Alignment

Techniques

Process of adjusting a model's behavior to follow specific constraints or objectives during training.

Policy Blending

Techniques

Combining actions from multiple policies (e.g., cloned and learned) based on their estimated quality or confidence.

Policy Convergence

Techniques

The process by which a reinforcement learning agent's decision-making strategy stabilizes toward optimal behavior.

Policy Distillation

Techniques

Converting trajectories or behaviors discovered during exploration into a trainable policy that can be deployed.

Policy Drift

Techniques

When a trained model's behavior gradually diverges from its intended target during continued training.

Policy Enforcement

Behavior

The process of automatically checking content against a set of rules or guidelines and blocking or flagging violations.

Policy Evolution

Techniques

How a model's decision-making strategy changes over training iterations, affecting which samples it generates and with what probability.

Policy Gradient

Techniques

Optimization method that updates model parameters by following the gradient of expected rewards.

Policy Gradient Theorem

Techniques

A foundational result showing how to compute gradients of expected return with respect to policy parameters.

Policy Iteration

Techniques

An optimization technique that alternates between evaluating a policy and improving it based on that evaluation.

Policy Learning

Techniques

Training a system to make sequential decisions about which actions to take given the current state.

Policy Mining

Techniques

Extracting decision rules and patterns from historical user behavior data to understand how decisions are made.

Policy Optimization

Techniques

Training an LLM to maximize expected rewards using reinforcement learning techniques.

Policy Violation Detection

Behavior

The ability to identify when content breaks specific safety rules or guidelines set by an organization.

Politeness Theory

Techniques

Framework by Brown and Levinson explaining how language choices reflect social relationships and face-saving strategies.

Polyak-Ruppert Averaging

Techniques

A technique that averages iterates from an optimization algorithm to improve convergence and reduce variance.
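
The idea can be sketched on a one-dimensional problem: run noisy gradient descent and keep a running mean of the iterates alongside the last one. The objective, step size, and noise level below are assumptions chosen for illustration.

```python
import random


def sgd_with_averaging(grad_fn, x0, lr, steps, seed=0):
    """Run noisy gradient descent; return (last iterate, average of iterates)."""
    rng = random.Random(seed)
    x, avg = x0, 0.0
    for t in range(1, steps + 1):
        noisy_grad = grad_fn(x) + rng.gauss(0.0, 1.0)  # simulated gradient noise
        x -= lr * noisy_grad
        avg += (x - avg) / t  # incremental mean of x_1 .. x_t
    return x, avg


# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
last, averaged = sgd_with_averaging(lambda x: 2 * (x - 3), x0=0.0, lr=0.05, steps=2000)
```

The last iterate keeps jittering around the minimum at 3, while the averaged value smooths out much of that noise.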

Polysemanticity

Techniques

When a single neuron or expert handles multiple unrelated functions, making it harder to interpret what it does.

POMDP (Partially Observable Markov Decision Process)

Techniques

A decision-making framework where an agent cannot observe the full environment state and instead receives only partial observations.

Population Diversity

Techniques

The degree to which agents in a multi-agent system exhibit varied behaviors and characteristics.

Population-Based Search

Techniques

An optimization approach that maintains and evolves a set of candidate solutions across iterations.

Population-level Risks

Techniques

Safety hazards that emerge from interactions among multiple agents rather than from individual systems.

Portfolio Algorithm

Techniques

A method that runs multiple different solving strategies in parallel and uses the best result.

Portfolio Construction

Techniques

The process of selecting and weighting assets to create an investment portfolio that balances risk and return objectives.

Portfolio Coverage

Techniques

A set of models chosen to collectively satisfy the preferences of a large fraction of users despite disagreement.

Pose Disentanglement

Techniques

Separating head position and orientation from facial expression features to improve the model's focus on meaningful deformations.

Pose Estimation

Techniques

The task of identifying and locating body parts (like joints or keypoints) in images or video.

Pose Prediction

Techniques

Estimating future body joint positions and orientations from past poses.

Position Bias

Techniques

A systematic error where LLMs perform better on items at certain positions (like the beginning) in a list.

Positional embedding adaptation

Techniques

Modifying how a model encodes token positions to extend its ability to handle longer sequences.

Positional Encoding

Techniques

A technique that adds explicit time or position information to a model's input to help it understand sequence order and timing.
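
One widely used variant is the sinusoidal encoding from the original Transformer paper, sketched below. Other schemes (learned or rotary encodings, for example) work differently, so this is one illustration rather than the only form.

```python
import math


def sinusoidal_encoding(position, d_model):
    """Return the d_model-dimensional sinusoidal encoding for one position.

    Even dimensions use sine and odd dimensions use cosine, with
    wavelengths that grow geometrically across dimensions.
    """
    encoding = []
    for i in range(0, d_model, 2):
        angle = position / (10000 ** (i / d_model))
        encoding.append(math.sin(angle))
        if i + 1 < d_model:
            encoding.append(math.cos(angle))
    return encoding


pe0 = sinusoidal_encoding(0, 4)
pe1 = sinusoidal_encoding(1, 4)
```

Each position gets a distinct vector, which is added to the token embedding so the model can tell positions apart.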

Post-Training Quantization

Techniques

Reducing model size by converting weights to lower precision after training is complete.

Post-hoc Explanation

Techniques

An explanation method applied after a model is trained to interpret its predictions, rather than building interpretability into the model itself.

Post-Training

Training

Additional refinement applied to a model after its initial training to improve performance on specific tasks like reasoning or instruction-following.

Posterior Distribution

Techniques

The updated probability distribution of parameters after observing new data.

Power Consumption Profile

Techniques

Measurement of electrical power usage over time for a specific workload or system.

Power Profiling

Techniques

Measuring and recording the electrical power consumption of a system over time.

Power Spherical Distribution

Techniques

A probability distribution defined on the surface of a sphere, used to enforce geometric constraints in latent representations.

Power-Aware Scheduling

Techniques

Scheduling jobs on computing systems while considering and optimizing for power consumption constraints.

PPG (Photoplethysmogram)

Techniques

A non-invasive measurement of blood flow and heart rate using light sensors, commonly found in smartwatches.

Pragma-Based Optimization

Techniques

Hardware optimization achieved by adding compiler directives (pragmas) to code that guide synthesis tools in generating efficient designs.

Pragmatics

Techniques

The study of how context and intent affect language meaning beyond literal words.

Pre-norm

Techniques

A Transformer design choice where layer normalization is applied before the main computation rather than after.

Pre-trained

Training

A model that has already been trained on large amounts of data before being released, so it can be used immediately without additional training.

Pre-trained Transformer

Training

A neural network model trained on large amounts of text data before being adapted for specific tasks, using the Transformer architecture.

Precision

Performance

The level of numerical detail a model uses to represent its internal values; higher precision means more accurate calculations but requires more memory.

Precision Bias

Techniques

Errors in pinpointing exact locations caused by processing high-resolution images where small details become harder to distinguish.

Precision Degradation

Performance

A slight loss in model accuracy or reasoning quality that can occur when using quantization or other compression techniques.

Precision Loss

Performance

The reduction in numerical accuracy that occurs when a model is compressed, which can slightly degrade performance on complex reasoning tasks while remaining acceptable for most everyday uses.

Precision Trade-off

Performance

The balance between reducing model size through lower numerical precision and maintaining accuracy—lower precision saves memory but may slightly reduce performance.

Predictive Control

Techniques

A control method that forecasts future states and optimizes actions accordingly.

Preference Alignment

Techniques

How well an AI system's judgments match the actual preferences of target users or evaluators.

Preference Conditioning

Techniques

Providing preference weights or trade-off parameters as input to a model to control its behavior at inference time.

Preference Optimization

Techniques

A training method that learns from pairwise comparisons between solutions rather than explicit reward signals.

Preference-Based Fine-tuning

Techniques

Refining a model by learning from human comparisons of outputs rather than explicit numerical scores.

Preference-Based Judgments

Techniques

Evaluation method where human raters compare two model outputs and indicate which one is better, rather than scoring them independently.

Prefetching

Techniques

Loading data into memory before it's needed to reduce wait times during computation.

Prefill Stage

Techniques

The initial phase of inference where the model processes the full input prompt before generating tokens.

Prefix Convention

Techniques

A simple rule where you add a label like 'query:' or 'passage:' to the beginning of text to tell the model how to process it differently.

Prefix Matching

Techniques

Comparing token sequences to find semantically equivalent continuations in an LLM's output.

Preprocessing

Techniques

A step that transforms raw input data into a cleaner, more useful format before feeding it to another model or system.

Pretrained

Training

A model that has already been trained on large amounts of text data before being released or fine-tuned for specific tasks.

Pretrained Base Model

Training

A foundational AI model trained on raw data but not specialized for specific tasks like conversation, serving as a starting point for further customization.

Pretrained Foundation

Training

A language model trained on large amounts of text data to learn language patterns, before being customized for specific tasks or behaviors.

Pretrained Language Model

Training

A model trained on large amounts of text data to predict and generate language before being adapted for specific applications.

Pretrained Model

Training

A model that has already been trained on large amounts of text data and can be used directly or fine-tuned for specific tasks.

Pretrained Weights

Training

The learned parameters of a model after training on large amounts of text data, ready to be used or further refined for specific tasks.

Pretraining

Training

The initial training phase where a model learns general patterns from a large dataset before being adapted for specific downstream tasks.

Preview Model

Deployment

An early-access version of a model released before full launch; it is useful for testing but may have bugs or change without warning.

Preview Release

Deployment

An early version of a model released for testing and feedback before a stable, finalized version is available.

Preview Stage

Deployment

An early version of a model that is still being tested and refined before an official release, so features or performance may change.

Preview-Stage Model

Deployment

An experimental version of a model released early for testing and feedback, with behavior and features that may change significantly before the official release.

Price Elasticity

Techniques

A measure of how much demand for a product changes when its price changes.

Price of Robustness

Techniques

The performance loss a model experiences when trained to be robust against attacks instead of optimized purely for accuracy.

Principal Component Analysis (PCA)

Techniques

A dimensionality reduction technique that transforms high-dimensional data into fewer uncorrelated components while preserving as much variance as possible.

Principal-Agent Framework

Techniques

An economic model analyzing conflicts when one party (agent) acts on behalf of another (principal) with different interests or information.

Prior Bias

Techniques

A model's default gender assumptions when translating ambiguous source text without explicit gender markers.

Prioritized Experience Replay (PER)

Techniques

A replay buffer technique that samples more frequently from experiences with larger TD errors, focusing learning on surprising or informative transitions.
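
A minimal sketch of the proportional variant: each transition's sampling probability grows with the magnitude of its TD error. The exponent `alpha`, the small constant `eps`, and the example errors are illustrative assumptions.

```python
import random


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Turn TD errors into sampling probabilities.

    alpha = 0 gives uniform sampling; larger alpha favors large errors.
    eps keeps zero-error transitions from never being sampled.
    """
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


td_errors = [0.1, 2.0, 0.5]  # made-up TD errors for three stored transitions
probs = per_probabilities(td_errors)

# Draw a batch of transition indices according to those probabilities.
batch = random.Random(0).choices(range(len(td_errors)), weights=probs, k=4)
```

Full implementations usually also apply importance-sampling weights to correct the bias this non-uniform sampling introduces; that step is omitted here.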

Priority-Aware Scheduling

Techniques

Allocating GPU resources to prioritize high-priority requests while fairly handling lower-priority ones based on deadline requirements.

Privacy-Utility Trade-off

Techniques

The balance between protecting sensitive information and maintaining model performance on downstream tasks.

Private Networking

Deployment

A network configuration that isolates your model's traffic from the public internet, keeping it accessible only within your organization's internal network.

Privilege Control

Techniques

Limiting what actions an agent can perform based on its role and the sensitivity of the task.

Privileged Context

Techniques

Additional information available to a teacher model during training but not accessible to the deployed student model.

Pro-Tier

Performance

A higher-capability version of a model designed for more demanding tasks, typically with better reasoning and language understanding than base versions.

Probabilistic Computation

Techniques

Computing with randomness and probability distributions to achieve robustness, interpretability, and security in AI systems.

Probabilistic Graphical Model

Techniques

A structured representation showing how variables relate to each other and their probabilistic dependencies.

Probabilistic Models

Techniques

Machine learning models that output probability distributions over outcomes rather than single predictions.

Probability Simplex

Techniques

The geometric space of all valid probability distributions, where each point represents a probability vector summing to one.

Problem Difficulty

Techniques

A measure of how hard a problem is for a solver to answer correctly, used to generate progressively challenging training examples.

Problem Generation

Techniques

Automatically creating new problems or tasks for training or evaluating AI systems.

Problem-Solving

Behavior

The model's capacity to analyze difficult questions or technical challenges and work toward accurate, well-reasoned solutions.

Procedural Execution

Techniques

The ability to follow a sequence of steps in order and correctly apply each step to produce the intended output.

Process Reward Model

Training

A model trained to evaluate and score the quality of intermediate steps in a solution, rather than just checking if the final answer is correct.

Process-Control Architecture

Techniques

System design that enforces constraints during reasoning steps rather than only filtering final outputs.

Production-Ready Code

Performance

Code that is complete, tested, and formatted to standards suitable for immediate use in real applications.

Progressive Multi-Stage SFT

Techniques

A training strategy that gradually teaches models from simple tasks to complex ones, mimicking human learning progression.

Projected Gradient Descent (PGD)

Techniques

An optimization method that updates inputs along gradients while constraining them to stay within a valid range.
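
The loop can be sketched as a gradient step followed by a projection back into the allowed range. The objective and the box constraint below are assumptions picked so the result is easy to check by hand.

```python
def project(x, low, high):
    """Clip every coordinate into the interval [low, high]."""
    return [min(max(v, low), high) for v in x]


def pgd(grad_fn, x0, lr, steps, low, high):
    """Gradient descent where each step is followed by a projection."""
    x = list(x0)
    for _ in range(steps):
        grad = grad_fn(x)
        x = [v - lr * g for v, g in zip(x, grad)]
        x = project(x, low, high)
    return x


# Minimize sum((v - 5)^2) with each coordinate restricted to [0, 2].
result = pgd(lambda x: [2 * (v - 5) for v in x], [0.0, 1.0],
             lr=0.1, steps=50, low=0.0, high=2.0)
```

The unconstrained minimum is at 5, which lies outside the allowed range, so the iterates settle on the boundary value 2. In adversarial-robustness work the same pattern is used to keep a perturbed input close to the original.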

Projective Geometry

Techniques

Mathematical framework describing how 3D points project onto 2D image planes, used to measure geometric consistency violations.

Prompt

Behavior

The initial text you provide to a language model to guide what it should generate or complete.

Prompt Conditioning

Techniques

Using descriptive text instructions to guide or control how a model generates output, such as specifying desired voice characteristics.

Prompt Engineering

Techniques

Designing the input text to a model in specific ways to improve the quality of its responses.

Prompt Expansion

Techniques

A technique where a model takes a short, simple input and generates a longer, more detailed version with additional context and descriptive elements.

Prompt Injection

Techniques

An attack where malicious instructions are inserted into user input to manipulate an AI model's behavior.

Prompt Masking

Techniques

Selectively activating or deactivating task-specific prompts based on whether incoming data matches learned patterns.

Prompt Optimization

Techniques

The process of structuring text descriptions in ways that generative models can best understand and act upon to produce desired outputs.

Prompt Prefix

Techniques

A short instruction added to the beginning of input text that tells the model how to treat that text (for example, marking it as a 'query' versus a 'passage').

Prompt Sensitivity

Techniques

The tendency of LLM outputs to vary significantly based on small changes in how a request is phrased.

Prompt Slices

Techniques

Subsets of evaluation prompts grouped by category or topic to analyze model behavior across specific types of inputs.

Prompt-Based Inference

Behavior

A model interaction style where you guide the model's output by providing minimal cues like clicks, boxes, or masks rather than detailed text instructions.

Prompt-Based Interface

Behavior

A way to control what a model does by giving it text instructions, rather than requiring code changes or separate training for different tasks.

Promptable Model

Behavior

A model that accepts flexible user inputs (like text descriptions, points, or bounding boxes) to guide what it should identify or process in an image.

Promptable Segmentation

Behavior

A segmentation approach where you guide the model by providing prompts like points, clicks, or bounding boxes to specify which objects you want it to segment.

Proof Sketch

Techniques

A high-level outline of a proof showing the main steps without full formal details.

Proof-of-Concept

Evaluation

A small-scale demonstration or experiment designed to test whether an idea or approach is feasible, rather than for production use.

Propagation

Techniques

The process of spreading information or edits from reference points (keyframes) to other frames in a sequence.

Propagation of Chaos

Techniques

Mathematical principle showing that particles in large interacting systems become approximately independent of one another as the number of particles grows.

Proper Scoring Rule

Techniques

A metric for probability predictions whose expected value is best when the predicted probabilities match the true ones, so it rewards accuracy and penalizes overconfidence.
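
The Brier score is a common example. The sketch below assumes an event that truly occurs 70% of the time and checks which forecast gives the lowest expected score (lower is better for this metric).

```python
def expected_brier(forecast, true_prob):
    """Expected squared error of a probability forecast for a binary event
    that occurs with probability `true_prob`."""
    return true_prob * (1 - forecast) ** 2 + (1 - true_prob) * forecast ** 2


true_prob = 0.7  # assumed true frequency of the event
candidates = [i / 100 for i in range(101)]
best = min(candidates, key=lambda q: expected_brier(q, true_prob))
```

The best forecast on this grid is the true probability itself: reporting 0.99 or 0.5 instead would score worse on average.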

Property Prediction

Techniques

Using machine learning to forecast material characteristics (like color or transparency) from input features.

Proposal Generation

Techniques

Creating candidate regions or concepts from input (e.g., converting text queries into visual targets).

Protein Folding

Behavior

The process by which a protein chain folds into its three-dimensional structure, which is essential for the protein to function properly.

Protein Language Model

Training

A neural network trained on large collections of protein sequences to learn patterns in amino acids, similar to how language models learn patterns in text.

Proto-Language

Techniques

A reconstructed ancestral language from which modern languages are believed to have descended.

Prototype Matching

Techniques

Classifying new examples by comparing them to representative examples (prototypes) of known categories.

Provenance

Techniques

Complete record of the origin, history, and context of data or findings, enabling reproducibility and traceability.

Prover Verifier Games

Techniques

A framework where one agent proves claims and another verifies them to ensure correctness.

Proximal operator

Techniques

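A mathematical operator that maps a point to the minimizer of a function plus a squared-distance penalty, used by algorithms that solve optimization problems by decomposing them into simpler parts.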
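
A standard example is the proximal operator of the scaled absolute value, lam * |x|, which has the closed form known as soft-thresholding. The sketch below handles a single number; `lam` is the penalty strength.

```python
def soft_threshold(v, lam):
    """Proximal operator of lam * |x| evaluated at v.

    It returns the x that minimizes lam * |x| + 0.5 * (x - v) ** 2:
    values within lam of zero become exactly zero, and everything else
    is shrunk toward zero by lam.
    """
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0


shrunk = [soft_threshold(v, 1.0) for v in (3.0, -0.5, -3.0)]
```

Applying this step after each gradient update is how proximal-gradient methods produce sparse solutions for L1-penalized problems.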

Proximal Policy Optimization (PPO)

Techniques

A policy-gradient reinforcement learning algorithm that limits how far each update can move the policy, often used with reward signals to iteratively improve a language model's outputs.

Proximity Field

Techniques

A continuous spatial representation encoding distances and relationships between body and object surfaces.

Proxy Reward

Techniques

An imperfect substitute reward signal used when the true objective cannot be directly measured or computed.

Proxy Signal

Techniques

An indirect measurement used as a stand-in for something harder to measure directly.

Pruning

Training

A model compression technique that removes unnecessary parameters or connections from a neural network to reduce its size and computational requirements.

Pseudo Labels

Techniques

Predicted labels assigned by a model to unlabeled data for semi-supervised learning.

Pseudo-masks

Techniques

Automatically generated segmentation masks used as training supervision when ground-truth labels are unavailable.

Pseudo-Relevance Feedback

Techniques

A technique that improves search by automatically refining queries based on initial results, without human input.

Pseudoinverse

Techniques

A mathematical generalization of matrix inversion used to find optimal least-squares solutions to linear systems.

Pull Request

Techniques

A request to merge code changes from one branch into another, typically reviewed before acceptance.

PyTorch

Formats

A popular open-source framework for building and training neural networks, used to define how models are structured and executed.

PyTorch Format

Formats

A model saved in PyTorch's native format, allowing it to be loaded and run using the PyTorch deep learning framework.

Q

Q-Learning

Techniques

A reinforcement learning algorithm that learns the value of actions in different states.
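
The core of the tabular version is a single update rule, sketched below. The state names, action names, learning rate `alpha`, and discount `gamma` are illustrative assumptions.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Move Q(state, action) toward reward + gamma * max_a Q(next_state, a)."""
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


q = defaultdict(float)  # every unseen (state, action) pair starts at 0.0
actions = ["left", "right"]

# One observed transition: in s0, going right earned reward 1.0 and led to s1.
q_update(q, "s0", "right", 1.0, "s1", actions)
```

After this single update the value of going right in s0 is 0.5: half of the way (alpha) from 0 toward the target of 1.0.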

Q-Former

Architecture

A lightweight connector module that bridges a frozen image encoder and a language model, translating visual information into a format the language model can understand.

Q-function

Techniques

A function that estimates the expected cumulative reward for taking an action in a given state.

Q4 Quantization

Techniques

A specific quantization method that represents model weights using 4-bit numbers instead of higher-precision formats, significantly reducing model size while accepting some loss in accuracy.

QK Circuit

Techniques

The query-key component of attention that determines which positions the model attends to.

QNLI

Evaluation

A benchmark dataset where models learn to determine whether a given sentence answers a given question, used to train models for question-answer relevance scoring.

Quadratic Attention

Architecture

The standard attention mechanism in transformers that becomes increasingly expensive as sequence length grows, because it compares every token to every other token.

Quadratic Complexity

Performance

A computational cost that grows with the square of input length, which is a limitation of traditional transformer attention mechanisms when processing longer texts.

Quadratic Memory Cost

Performance

A computational limitation where memory usage grows with the square of sequence length, a problem that standard transformer attention faces but state space models (SSMs) avoid.

Quadratic scaling

Techniques

Computational cost that grows with the square of input size, becoming impractical for large datasets.
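
A small arithmetic sketch, assuming a cost model in which every one of n items is compared with every item (as in standard attention):

```python
def pairwise_comparisons(n):
    """Number of comparisons when each of n items is compared with all n."""
    return n * n


for n in (1_000, 2_000, 4_000):
    print(n, pairwise_comparisons(n))
```

Doubling the input quadruples the work. That is much steeper than linear growth, though far slower than exponential growth.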

Qualiaphilia

Techniques

An attraction to or emphasis on subjective experiences and qualitative aspects.

Quality Evaluation

Evaluation

The task of assessing and scoring the quality, correctness, or alignment of text outputs, often used to filter or rank model responses.

Quantitative Reasoning

Behavior

The ability to understand and solve problems involving numbers, mathematics, and logical calculations.

Quantization

Deployment

Reducing a model's numerical precision (e.g., from 16-bit to 4-bit) to shrink memory usage and speed up inference.
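
A minimal sketch of one simple scheme, symmetric round-to-nearest quantization with a single scale for the whole list. Real toolkits typically use per-channel or per-group scales and more careful rounding; the weights below are made up.

```python
def quantize(weights, bits=4):
    """Map floats to signed integers in [-qmax, qmax] using one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]


weights = [0.12, -0.5, 0.33, 0.07]  # illustrative values
q, scale = quantize(weights, bits=4)
restored = dequantize(q, scale)
```

Only the small integers and the scale need to be stored. The restored values differ from the originals by at most half a quantization step, which is the source of the accuracy loss.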

Quantization Artifacts

Performance

Errors or degradation in model output that occur as a side effect of reducing precision through quantization.

Quantization Error

Techniques

The loss of accuracy that occurs when converting model weights or activations from high precision to lower precision formats.

Quantization-Aware Retraining

Techniques

Fine-tuning a model while simulating low-precision arithmetic to maintain accuracy after quantization.

Quantization-Aware Training

Training

A training technique where a model learns to maintain performance even when its weights are compressed to use less memory and compute.

Quantized

Techniques

Describes a model whose weights are stored with lower precision (fewer bits), which reduces its size and memory usage while trading some accuracy for efficiency.

Quantized Training

Techniques

Training a neural network while keeping weights and activations in reduced precision formats.

Quantum Autoencoder

Techniques

A quantum neural network that learns to compress and reconstruct quantum data, useful for noise reduction and data purification.

Quantum Encoding

Techniques

Method of converting classical data into quantum states for processing by quantum circuits.

Quantum Feedback Control

Techniques

Using measurement results to adjust quantum system parameters in real-time to achieve desired outcomes.

Quantum State Reconstruction

Techniques

The process of determining a quantum system's state from measurement data collected over time.

Quasi-Newton Methods

Techniques

Optimization algorithms that approximate Newton's method using gradient information instead of full second derivatives.

Query Encoder

Architecture

A model that converts search queries into numerical representations (embeddings) that can be compared against a database of documents to find relevant matches.

Query Expansion

Techniques

Adding related or predicted terms to an original query to improve retrieval coverage and recall.

Query Intent Taxonomy

Techniques

A classification system categorizing what users actually want when they search.

Query Plan

Techniques

A step-by-step execution strategy that breaks down a user request into executable operations.

Query Refinement

Techniques

Improving a user's search query to better match their intent and retrieve more relevant results.

Querying Transformer

Architecture

A neural network component that acts as a bridge between an image encoder and language model, learning to extract and translate visual information into text-compatible representations.

Quotient POMDP

Techniques

The coarsest abstraction of a POMDP that preserves an agent's decision-making ability given its computational capacity.

R

R-Drop Consistency Regularization

Techniques

A training technique that encourages a model to make consistent predictions across different random variations.

R-equivalence

Techniques

An equivalence relation on rational points of algebraic varieties measuring when points are connected by rational curves.

RAG

Techniques

Retrieval-Augmented Generation — a technique that grounds model responses in retrieved documents to improve accuracy.

RAG (Retrieval-Augmented Generation)

Techniques

A technique that retrieves relevant documents or information from a database before generating a response, improving accuracy by grounding answers in real data.

RAG Pipeline

Techniques

A system that retrieves relevant documents or information from a database and feeds them to a language model to generate more accurate and grounded responses.

RAG Systems

Techniques

Systems combining retrieval of external documents with language generation for accurate answers.

Random Initialization

Training

Setting a model's weights to random values before training, creating an untrained model that produces meaningless output.

Random Projections

Techniques

A dimensionality reduction technique using random matrices to efficiently approximate high-dimensional data with linear complexity.

Randomized Controlled Trial (RCT)

Techniques

A research method where participants are randomly assigned to use AI or not, to fairly measure the AI's actual impact.

Randomly Initialized

Training

A model whose weights have been set to random values instead of being trained on data, resulting in no learned patterns or knowledge.

Randomly-Initialized Weights

Training

Model parameters set to random values instead of being learned from training data, resulting in unpredictable and meaningless outputs.

Range-Doppler Sensing

Techniques

A technique that uses wireless signals to measure both the distance to an object and how fast it's moving toward or away from you.

Rank Order

Techniques

The relative ordering of values from smallest to largest, independent of their actual magnitudes.

Rank-1 Approximation

Techniques

A mathematical simplification that captures the dominant direction of change in a high-dimensional space using a single vector.

Ranking

Behavior

The process of ordering search results by relevance, determining which documents best match a user's query.

Rasch Model

Techniques

A statistical method that jointly estimates solver ability and problem difficulty from performance data.
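
In its basic form, the model says the chance of a correct answer depends only on the gap between ability and difficulty, passed through a logistic curve. A small sketch of that formula follows; the function name is made up for this example, and fitting the values from data is not shown.

```python
import math

def p_correct(ability, difficulty):
    """Rasch model: probability that a solver answers a problem correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

evenly_matched = p_correct(ability=1.0, difficulty=1.0)  # 0.5
strong_solver = p_correct(ability=3.0, difficulty=1.0)   # above 0.5
```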

Re-Ranking

Techniques

A technique that takes an initial set of search results and reorders them by scoring their relevance to a query, typically to improve the quality of top results.
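
The sketch below shows the general shape of a re-ranking step. The word-overlap scorer is a hypothetical stand-in for illustration only; real systems typically use a trained model to score each query and document pair.

```python
def overlap_score(query, document):
    """Hypothetical relevance score: fraction of query words found in the document."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    doc_words = set(document.lower().split())
    return len(query_words & doc_words) / len(query_words)

def rerank(query, candidates, score_fn=overlap_score, top_k=2):
    """Reorder first-stage candidates by score, highest first, and keep top_k."""
    ranked = sorted(candidates, key=lambda doc: score_fn(query, doc), reverse=True)
    return ranked[:top_k]

candidates = [
    "weather forecast for the weekend",
    "how to train a reranker model",
    "reranker model evaluation tips",
]
top = rerank("train reranker model", candidates)
```

Here the second candidate moves to the front because it contains every query word, and the unrelated weather result is dropped.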

ReAct Paradigm

Techniques

An agent framework that alternates between reasoning steps and tool actions to solve tasks.

Reaction-Diffusion Systems

Techniques

Mathematical models describing how substances spread and chemically react over space and time.

Readability

Techniques

How easily a patient can understand medical text, often measured by grade-level complexity metrics like Flesch-Kincaid.

Reader Component

Behavior

A specialized model in a pipeline that processes and analyzes text passages to extract specific information, such as relationships between entities.

Readiness-Driven Execution

Techniques

A scheduling approach that runs whichever task is ready first, rather than following a fixed predetermined order.

Reading Comprehension

Techniques

AI task where a model answers questions based on provided text passages.

Real-Time Inference

Deployment

Processing and generating predictions on data as it arrives, with minimal delay, rather than in batches.

Real-Time Knowledge

Training

The ability to access and incorporate current information from the web or live data sources rather than relying solely on training data from a fixed point in time.

Real-Time Search

Deployment

The ability to query current web information during inference, allowing a model to access and use the latest data when answering questions.

Real-Time Web Search

Deployment

The ability to search the internet during inference to retrieve current information rather than relying only on knowledge from training data.

Reasoning

Behavior

The model's ability to work through multi-step logical problems and provide justified answers rather than just pattern-matching.

Reasoning Ability

Behavior

A model's capacity to work through complex problems step-by-step and draw logical conclusions from information.

Reasoning Agent

Architecture

An AI component designed to work through complex problems step-by-step, often as part of a larger system that coordinates multiple agents.

Reasoning Capabilities

Behavior

The model's ability to work through multi-step problems methodically and show its thinking process rather than jumping to answers.

Reasoning Capability

Behavior

A model's ability to work through multi-step logical problems and produce coherent explanations for its answers.

Reasoning Capacity

Performance

The model's ability to perform complex logical thinking and problem-solving tasks beyond simple pattern matching.

Reasoning Chain

Behavior

A step-by-step explanation of how a model arrives at an answer, showing its intermediate thinking before the final result.

Reasoning Chains

Behavior

A sequence of logical steps a model follows to work through a problem methodically rather than jumping directly to an answer.

Reasoning Depth

Behavior

A model's ability to perform complex multi-step logical thinking and problem-solving; typically increases with model size.

Reasoning Distillation

Techniques

Teaching a model to mimic the step-by-step reasoning process of a teacher model or reference solution.

Reasoning Effort

Behavior

A configurable setting that controls how much computational time a model spends thinking through a problem before generating its response.

Reasoning Engine

Architecture

The core component of a model that performs step-by-step logical thinking and problem-solving before generating a response.

Reasoning Faithfulness

Techniques

The degree to which a model's intermediate reasoning steps logically support and justify its final answer.

Reasoning Mode

Behavior

A special mode where the model takes extra time to think through problems step-by-step before answering, rather than responding immediately.

Reasoning Model

Behavior

A model trained to show explicit step-by-step reasoning and problem-solving logic before producing final answers, rather than jumping directly to conclusions.

Reasoning Pipeline

Architecture

The internal process a model uses to think through a problem step-by-step, integrating information and tool outputs to arrive at conclusions.

Reasoning Process

Behavior

An internal step where the model thinks through a problem before generating its final answer, allowing it to work through complex logic more carefully.

Reasoning Skill

Techniques

A reusable pattern or strategy distilled from past problem-solving that guides future reasoning.

Reasoning Step

Behavior

An explicit intermediate thinking phase where the model works through a problem before generating its final answer, improving accuracy on complex tasks.

Reasoning Task

Behavior

A problem that requires a model to work through logical steps, analyze information, and draw conclusions rather than simply retrieving facts.

Reasoning Tasks

Behavior

Problems that require a model to think through multiple steps logically to arrive at an answer, rather than just pattern-matching.

Reasoning Text-to-Image Generation

Techniques

Image generation where the model actively infers implicit user intents from text descriptions rather than literal interpretation.

Reasoning Trace

Behavior

The visible record of a model's intermediate thinking steps and logic, allowing users to inspect how the model arrived at its conclusion.

Reasoning Trajectory

Techniques

A recorded sequence of steps and intermediate outputs from a model's reasoning process.

Reasoning-Aware Retrieval

Techniques

A retrieval method that uses an agent's explicit reasoning steps alongside its query to find more relevant documents.

Reasoning-Focused

Training

A model specifically trained to work through multi-step logical problems methodically rather than generating quick responses.

Reasoning-Intensive Retrieval

Techniques

Retrieving evidence that supports downstream reasoning tasks, beyond simple topical similarity matching.

Reasoning-Optimized

Architecture

A model designed to allocate extra computational resources to logical problem-solving and step-by-step analysis rather than raw speed or breadth of knowledge.

Reasoning-Oriented Design

Training

A model architecture optimized to work through problems step-by-step using logical inference rather than relying primarily on pattern matching from training data.

Reasoning-Oriented Training

Training

Training methods designed to improve a model's ability to work through multi-step logic and solve complex problems systematically.

Recall Mechanism

Techniques

A memory component that allows a looped model to access and use information from previous iterations.

Receptive Field

Techniques

The region of input data that a neuron responds to or influences.

Recognizer Expressivity

Techniques

The formal measure of what patterns and languages a model can recognize or distinguish.

Reconstruction Attack

Techniques

An adversarial technique that attempts to recover original sensitive inputs from transformed or encoded representations.

Reconstruction Error

Techniques

The difference between original data and its reconstructed version from an autoencoder, used to identify anomalies or unusual patterns.

Recovery Agency

Techniques

An agent's ability to recognize mistakes, backtrack, and explore alternative solutions when initial approaches fail.

Recurrence Distance

Techniques

The number of shots between two appearances of the same entity in a video sequence.

Recurrent Architecture

Architecture

A neural network design where information flows in loops, allowing the model to process sequences step-by-step while maintaining memory of previous inputs.

Recurrent Neural Networks

Techniques

Neural networks with loops that process sequences by maintaining memory of past inputs.

Recurrent Persistence Loop

Techniques

A feedback mechanism where outputs reinforce or modify previous states over time.

Recurrent-Attention Architecture

Architecture

A hybrid neural network design that combines recurrent processing (which maintains memory across sequences) with attention mechanisms, enabling better memory efficiency than standard transformers.

Recurrent-Hybrid Architecture

Architecture

A neural network design that combines recurrent elements with other architectural components to process sequential data more efficiently than standard transformers.

Recursion Depth

Techniques

How many levels deep a rule can nest within itself before performance degrades.

Recursive Computation

Techniques

Iteratively applying the same computation multiple times with parameter sharing to increase model depth without adding parameters.

Recursive Decomposition Tree (RDT)

Techniques

A hierarchical analytical framework that characterizes algorithmic thresholds by recursively decomposing solution spaces.

Recursive Instability

Techniques

Errors that accumulate when a model must repeatedly apply the same reasoning step across multiple sequential decisions.

Red Teaming

Techniques

Adversarial testing where security experts attempt to find vulnerabilities by attacking a system like an attacker would.

Red Teaming

Techniques

Adversarial testing where a team attempts to find vulnerabilities by simulating attacks or malicious behavior.

Reduced-Order Model

Techniques

A simplified version of a complex system that captures essential behavior with fewer variables.

Reference Attention

Techniques

An attention mechanism that conditions generation on a reference input by processing reference tokens alongside generated tokens.

Reference Resolution

Techniques

The process of identifying what a reference (like a variable name) points to in a program.

Reference Verification

Techniques

Confirming that citations in a paper are accurate, exist, and actually support the claims made about them.

Reflection Mechanism

Techniques

A process where AI systems review past results, identify errors, and extract generalizable patterns to improve future performance.

Reflective Experience

Techniques

The process of an agent analyzing its past actions and environment feedback to extract lessons for improving future behavior.

Reformer Architecture

Architecture

A transformer-based model design that uses locality-sensitive hashing and reversible layers to efficiently process long sequences with reduced memory requirements.

Refusal Behavior

Behavior

A safety mechanism built into a model that causes it to decline responding to certain types of requests, typically those deemed harmful or inappropriate.

Refusal Detection

Behavior

The ability to identify when a model declines to answer a request, which can indicate the model recognized a harmful or unsafe prompt.

Refusal Mechanism

Techniques

The learned behavior that causes a language model to decline harmful requests.

Refusal Mechanisms

Behavior

Built-in safety features that cause a model to decline responding to certain types of requests, such as those involving harmful, illegal, or unethical content.

Regime Detection

Techniques

Identifying distinct market states or conditions (e.g., stable vs. volatile) to apply different prediction strategies appropriately.

Region-Level Evidence

Techniques

Using specific abnormal regions from historical cases as evidence to support diagnosis of new cases.

Region-Level Understanding

Behavior

The ability to analyze and understand specific areas or sections of an image rather than just the image as a whole.

Regional-to-Global Perception Gap

Techniques

The performance difference between a model's ability to understand cropped regions versus full images.

Register Tokens

Techniques

Learnable placeholder tokens added to transformer inputs to absorb and stabilize problematic activations without affecting semantic content.

Regression

Techniques

When a fix or change breaks functionality that was previously working, causing previously-passing tests to fail.

Regression Detection

Techniques

Identifying when code changes break previously working functionality.

Regret

Techniques

The cumulative difference between an algorithm's performance and the best fixed action in hindsight.
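
A worked example can make this concrete. The sketch below assumes we know, for every round, the reward each action would have earned; the function and variable names are illustrative.

```python
def regret(rewards, chosen):
    """rewards[t][a] is the reward action a would have earned in round t;
    chosen[t] is the action the algorithm actually picked in round t."""
    earned = sum(rewards[t][a] for t, a in enumerate(chosen))
    n_actions = len(rewards[0])
    # Total reward of the single best action if it had been played every round.
    best_fixed = max(sum(row[a] for row in rewards) for a in range(n_actions))
    return best_fixed - earned

rewards = [[1, 0], [0, 1], [1, 0]]
chosen = [1, 1, 1]                 # the algorithm always picked action 1
result = regret(rewards, chosen)   # 1: always picking action 0 would have earned 2, not 1
```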

Regularized GLM

Techniques

A generalized linear model with penalty terms added to prevent overfitting and improve prediction on new data.

Reinforcement Learning

Training

A training method where a model learns by receiving rewards or penalties for its outputs, encouraging it to improve its behavior over time.

Reinforcement Learning from Human Feedback

Training

A training technique where human evaluators rate model outputs, and the model learns to produce responses that humans prefer.

Reinforcement Learning from Internal Feedback (RLIF)

Techniques

Training a model using reward signals derived from the model's own internal representations rather than external labels.

Reinforcement Learning with Verifiable Rewards (RLVR)

Techniques

A post-training approach for language models using rewards that can be objectively verified, like correctness on benchmarks.

Relation Extraction

Techniques

A task where a model identifies and extracts meaningful connections between entities in text, such as which drugs treat which diseases.

Relaxation Parameter

Techniques

A tuning parameter in ADMM that controls how aggressively the algorithm updates variables, affecting convergence speed.

Relevance Ranking

Behavior

The process of ordering search results by how well they match a user's query, with the most relevant results appearing first.

Relevance Scoring

Behavior

Assigning a numerical score to indicate how well a document matches or answers a given query.

ReLU Neural Network

Techniques

A neural network using rectified linear unit activations, which can be exactly embedded in mixed-integer linear programs.

Reparameterization

Techniques

Rewriting a model's weights in a different mathematical form to improve training efficiency or stability.

Replay Buffer

Techniques

Storing and retraining on samples from previous tasks to prevent forgetting during continual learning.
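
A minimal sketch of the storage side is shown below. The class name and the drop-the-oldest policy are assumptions made for brevity; continual-learning setups often use other selection rules, such as reservoir sampling, to keep older tasks represented.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of past examples; the oldest are dropped when full."""

    def __init__(self, capacity, seed=0):
        self.items = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, example):
        self.items.append(example)

    def sample(self, batch_size):
        """Draw past examples to mix into the current training batch."""
        k = min(batch_size, len(self.items))
        return self.rng.sample(list(self.items), k)

buffer = ReplayBuffer(capacity=3)
for example in range(5):
    buffer.add(example)
batch = buffer.sample(2)
```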

Reporting Bias

Techniques

Systematic skew in data caused by what people choose to record or report.

Repository-Level Reasoning

Behavior

The ability to understand and reason about code across multiple files and folders in a codebase, not just isolated code snippets.

Representation Learning

Techniques

Training a model to convert raw data into meaningful internal representations useful for downstream tasks.

Representation Model

Architecture

A model trained to convert raw input (like music or text) into meaningful numerical patterns that capture important features, rather than generating direct outputs like text or classifications.

Representation Space

Techniques

The high-dimensional mathematical space where a model internally encodes and processes information about text.

Representational Contractivity

Techniques

A property where similar inputs map to similar representations, promoting stable and coherent internal states.

Representational Convergence

Techniques

The tendency of different neural networks to learn similar internal representations despite differences in architecture or training.

Representational Geometry

Techniques

The geometric structure of how neural networks organize and represent information in their learned feature spaces.

Representational Harm

Techniques

Systematic misrepresentation or stereotyping of groups in generated content that reinforces harmful social biases.

Representational Space

Techniques

The internal geometric structure of how a model encodes and processes information.

Representational Stability

Techniques

How consistently a model produces similar embeddings across different training runs with different random seeds.

Reproducibility

Training

The ability to recreate the same results by using the same training data, methods, and documentation.

Reproducing Kernel Hilbert Space (RKHS)

Techniques

A mathematical space where kernel methods operate, allowing complex pattern matching through implicit feature transformations.

Request Classification

Techniques

The process of analyzing an incoming query to determine its type, complexity, or intent so it can be handled by the right model or pipeline.

Requirement Elicitation

Techniques

The process of gathering and defining what a system needs to do, typically involving stakeholders and domain experts.

Requirement Management

Techniques

The process of tracking and organizing what a software product needs to do, which AI can help automate.

Requirements Engineering

Techniques

The process of defining, documenting, and managing software system requirements from stakeholders.

Requirements Traceability

Techniques

The ability to track how design decisions and parameters connect back to original system requirements and design intent.

Reranker

Deployment

A model that takes an initial set of search results and reorders them by relevance, typically used to refine results from a faster but less accurate retrieval system.

Reranking

Techniques

A technique that takes an initial set of search results and reorders them by relevance score, typically to improve the quality of top results.

Residual Activations

Techniques

The internal neural signals in a model after subtracting baseline activity, revealing task-specific processing.

Residual Network

Architecture

A neural network architecture that uses skip connections to allow information to bypass layers, making it easier to train very deep networks and improving performance.

Residual Policy

Techniques

A learned correction layer that outputs small adjustments on top of a baseline controller.

Residual Stream

Techniques

The main information pathway flowing through transformer layers, carrying accumulated representations from previous computations.

Residual Updates

Techniques

Learning the differences between consecutive states rather than full states, reducing compression complexity for time-evolving data.

Residual Updates

Techniques

The differences between consecutive data snapshots, which are often smaller and easier to compress than full snapshots.

Resource-Constrained

Performance

Hardware with limited memory, processing power, or battery life, requiring models to be optimized for efficiency.

Response Entropy

Techniques

The diversity of outputs a model produces; high entropy means varied solutions, low entropy means repetitive ones.
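
One simple way to put a number on this is Shannon entropy over the distinct responses seen for the same prompt. The sketch below assumes responses are compared as exact strings, which is a simplification.

```python
import math
from collections import Counter

def response_entropy(responses):
    """Shannon entropy, in bits, of the empirical distribution of responses."""
    counts = Counter(responses)
    total = len(responses)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

repetitive = response_entropy(["a", "a", "a", "a"])  # 0 bits: always the same answer
varied = response_entropy(["a", "b", "a", "b"])      # 1 bit: two equally likely answers
```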

Response Length Shaping

Techniques

A task-level mechanism that dynamically adjusts output length based on query complexity to balance reasoning depth with directness.

Response Stabilization

Techniques

Techniques to make model outputs more consistent and reliable, such as constraining output format or adding classification heads.

Response Templating

Techniques

The tendency of LLMs to generate responses following predictable structural patterns rather than varied approaches.

Retraction Status

Techniques

Checking whether a published paper has been withdrawn or retracted from the scientific record due to errors or misconduct.

Retrieval

Techniques

The process of finding and returning relevant documents or information from a database based on a query.

Retrieval Augmentation

Techniques

Training technique that supplements data by finding and using similar examples from a database to improve model generalization.

Retrieval Bias

Techniques

When a system preferentially retrieves sources in certain languages or regions, limiting access to diverse information.

Retrieval Model

Behavior

A model designed to find and rank the most relevant documents or passages from a large collection based on a query.

Retrieval Optimization

Training

Tuning a model specifically to find and rank relevant documents or passages in response to a query, rather than generating new text.

Retrieval Pathway

Techniques

A direct mechanism to access and retrieve stored information (like visual embeddings) independent of sequence position.

Retrieval Pipeline

Deployment

A system that finds and ranks relevant documents or information in response to a query, often used in search and question-answering applications.

Retrieval System

Deployment

A system that finds and returns the most relevant documents or information from a large collection based on a user's query.

Retrieval Task

Performance

Finding the most relevant documents or text passages from a large collection based on a user's query.

Retrieval-Augmented

Techniques

A technique that enhances AI systems by first searching for relevant information from a database before generating responses, improving accuracy and relevance.

Retrieval-Augmented Generation

Techniques

A technique that allows a model to search and reference external documents or knowledge bases to answer questions more accurately and with citations.

Retrieval-Focused

Training

A model specifically trained to find and rank relevant documents or passages in response to search queries, rather than generate new text.

Retrieval-Heavy Workflow

Behavior

A task where the model needs to search through and extract relevant information from large amounts of text, rather than generating new content from scratch.

Retrospection Interpretability

Techniques

Explaining predictions by showing which historical cases the model referenced when making a decision.

Reverse Distillation

Techniques

Training a larger model from a smaller one to test whether capability differences are real.

Reverse KL Divergence

Techniques

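A measure of how different an approximating distribution is from a target, computed from the approximation's point of view; it heavily penalizes placing probability where the target has little, so it tends to lock onto a single mode rather than cover all of them.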
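
The two directions of KL divergence can disagree about which approximation is better. The sketch below uses made-up three-outcome distributions: a target with two peaks, one approximation that sits on a single peak, and one that spreads out evenly.

```python
import math

def kl(a, b):
    """KL(a || b) for two discrete distributions given as lists of probabilities."""
    return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)

target = [0.49, 0.02, 0.49]    # two peaks (modes)
one_mode = [0.98, 0.01, 0.01]  # approximation sitting on a single peak
spread = [1 / 3, 1 / 3, 1 / 3] # approximation covering everything

# Reverse KL puts the approximation first; forward KL puts the target first.
reverse_one_mode = kl(one_mode, target)
reverse_spread = kl(spread, target)
forward_one_mode = kl(target, one_mode)
forward_spread = kl(target, spread)
```

In this example the reverse direction scores the single-peak approximation as closer, while the forward direction prefers the spread-out one.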

Reverse Update

Techniques

A technique that reverses the gradient updates made during training to remove learned information about specific data.

Reward Function

Techniques

A function that assigns numerical scores to model outputs, guiding the learning process toward desired behaviors in reinforcement learning.

Reward Hacking

Techniques

When an agent exploits loopholes in the reward system to maximize score without actually solving the intended task.

Reward Hypothesis

Techniques

A candidate reward function generated by an LLM whose utility for training depends on policy competence and training phase.

Reward Model

Techniques

A learned function that predicts how good an action or outcome is, used to guide policy improvement.

Reward Modeling

Techniques

Training a model to predict human preferences so it can score outputs and guide AI training through reinforcement learning.

Reward Optimization

Techniques

Improving model outputs by defining a reward function that scores quality and using it to guide learning toward better solutions.

Reward Signal

Techniques

Feedback that tells an AI agent how well it performed on a task, guiding learning.

Reward Topology

Techniques

The structure and distribution of reward signals across different tasks, which can vary significantly in multimodal learning.

Reward-Confidence Covariance

Techniques

A measure of how reward quality and model confidence vary together, used to adjust training baselines.

Reward-Hackable

Techniques

A benchmark task where an agent can achieve high scores without actually solving the intended problem.

Riemannian Geometry

Techniques

Mathematical framework for studying curved spaces and their intrinsic properties, used in AI research to analyze the structure of neural representations.

Risk Adjusted Returns

Techniques

Investment returns measured relative to the risk taken, balancing profit with stability.

Risk Aversion

Techniques

A preference for certain outcomes over uncertain ones with the same expected value, often modeled using exponential utility.

RLHF

Training

Reinforcement Learning from Human Feedback — a training technique that aligns model outputs with human preferences.

RMSNorm

Techniques

A normalization technique that rescales activations by their root mean square, skipping the mean-subtraction step used in standard layer normalization.
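
The computation is short enough to sketch directly. The function below is an illustration for a single vector; the `gain` and `eps` parameter names are assumptions, with `gain` standing in for the learned per-dimension scale.

```python
import math

def rms_norm(x, gain=None, eps=1e-6):
    """Divide each value by the root mean square of the vector, then apply a gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    if gain is None:
        gain = [1.0] * len(x)
    return [g * v / rms for g, v in zip(gain, x)]

normalized = rms_norm([3.0, 4.0])
```

After normalization the vector has a root mean square of about 1, and no mean is subtracted along the way.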

RNN-T

Techniques

Recurrent Neural Network Transducer: a sequence model, widely used for speech recognition, that emits predictions incrementally as the input streams in.

RoBERTa Architecture

Architecture

A transformer-based neural network architecture optimized for understanding language through masked language prediction during training.

RoBERTa Architecture

Architecture

A transformer-based neural network design that learns to understand language by predicting masked words in text, improving upon the original BERT model.

Robotic Manipulation

Behavior

The ability to understand and execute physical tasks involving grasping, moving, and interacting with objects in the real world.

Robotic Planning

Behavior

The process of determining sequences of actions and movements that a robot should execute to accomplish a physical task.

Robust Aggregation

Techniques

Combining updates from multiple sources in a way that resists manipulation by malicious participants.

Robust Generalization

Techniques

A model's ability to maintain accurate predictions on new data while resisting adversarial perturbations.

Robustness

Techniques

A system's ability to maintain performance when inputs are corrupted, noisy, or different from training conditions.

Robustness Evaluation

Techniques

Testing how well an agent maintains performance when faced with errors, variations, or unexpected conditions.

Role-Based Access Control (RBAC)

Deployment

A security system that restricts what different users can do based on their assigned role (e.g., admin, viewer, editor).

Role-Differentiated Systems

Techniques

Multi-agent architectures where different components (proposer, executor, checker, adversary) have distinct responsibilities to reduce correlated failures.

Rollback

Techniques

Reverting a system to a previous saved state, undoing recent changes.

Rollout Mixture Distillation

Techniques

Combining supervision from multiple generated sequences (rollouts) to create more stable training signals.

ROS 2

Techniques

Robot Operating System 2, a middleware framework for building robot software with standardized communication patterns.

Rotary Positional Encoding (RoPE)

Techniques

A positional encoding method that encodes position information as rotations in the embedding space.
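
The sketch below illustrates the rotation idea on a plain list of numbers. It assumes an even-length vector, pairs up neighbouring dimensions, and uses a base of 10000; real implementations differ in how they pair dimensions, so treat the details as one possible convention.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate each consecutive pair of dimensions by an angle proportional to pos."""
    dim = len(vec)  # assumed to be even
    out = []
    for i in range(0, dim, 2):
        angle = pos * base ** (-i / dim)  # later pairs rotate more slowly
        cos_a, sin_a = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos_a - y * sin_a, x * sin_a + y * cos_a])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]
near_start = dot(rope(q, 5), rope(k, 3))
further_on = dot(rope(q, 12), rope(k, 10))  # same offset of 2 between positions
```

Both dot products come out the same because only the distance between the two positions matters, which is the property that makes this encoding useful for attention.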

Routing

Architecture

The mechanism that decides which specialized sub-networks (experts) should process each input in a mixture-of-experts model.
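
One common form of this is top-k routing: score every expert, keep the best few, and turn their scores into mixing weights. The sketch below assumes the router scores for one token are already computed; the function name and `top_k` default are illustrative.

```python
import math

def route(router_scores, top_k=2):
    """Pick the top_k experts for one token and softmax-normalize their scores."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:top_k]
    max_score = max(router_scores[i] for i in chosen)
    exps = [math.exp(router_scores[i] - max_score) for i in chosen]
    total = sum(exps)
    return {expert: e / total for expert, e in zip(chosen, exps)}

weights = route([0.1, 2.0, -1.0, 1.5])  # experts 1 and 3 are selected
```

Only the selected experts run for this token, which is how mixture-of-experts models keep per-token compute low despite a large total parameter count.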

Routing Mechanism

Architecture

The decision-making component in a mixture-of-experts model that determines which experts should process each input token.

Routing Model

Architecture

A lightweight model that analyzes incoming requests and directs them to the most appropriate downstream model or system rather than processing them directly.

Routing Overhead

Performance

The computational cost added by the mechanism that decides which experts should process each input in a mixture-of-experts model.

Routing Policy

Techniques

A lightweight decision mechanism that determines which computation path to take based on input conditions.

Rubric

Techniques

A scoring guide that defines criteria and quality levels for evaluating student work or AI-generated responses.

Rubric Generation

Techniques

Automatically creating evaluation criteria and scoring guidelines that judges use to assess output quality.

Runtime Contract

Techniques

An explicit agreement between components defining inputs, outputs, and behavior expectations during execution.

Runtime Interoperability

Techniques

The ability for different systems to work together and exchange data dynamically during execution.

Runtime Variability

Techniques

Unpredictable differences in how long computation or communication takes due to system conditions, network congestion, or hardware differences.

S

Sabotage

Techniques

Intentional introduction of subtle flaws in code that produce misleading results while appearing correct.

Safetensors

Formats

A safe, fast file format for storing model weights, designed to prevent code execution vulnerabilities.

Safetensors Format

Formats

A secure and efficient file format for storing model weights that prioritizes safety and speed when loading models.

Safety Alignment

Training

Training techniques used to make a model refuse harmful requests and behave responsibly, reducing the risk of misuse.

Safety Classification

Evaluation

A machine learning task that assigns content to categories based on whether it poses safety risks or harms.

Safety Classifier

Behavior

A machine learning model trained to identify and flag harmful, inappropriate, or policy-violating content in text.

Safety Constraints

Techniques

Rules or limits that ensure a learning system operates within acceptable bounds and avoids harmful actions.

Safety Evaluation

Evaluation

The process of testing and assessing whether a model produces harmful, unsafe, or undesirable outputs.

Safety Filtering

Behavior

Built-in guardrails in a model that prevent it from generating harmful, illegal, or unethical content by refusing certain requests.

Safety Filters

Behavior

Built-in constraints that prevent a model from generating harmful, offensive, or inappropriate content in its responses.

Safety Guardrails

Behavior

Built-in restrictions or filters that prevent a model from generating harmful, illegal, or unethical content.

Safety Model

Training

A specialized AI model trained to identify and classify unsafe, harmful, or policy-violating content rather than generate general responses.

Safety Training

Training

The process of training a model to decline harmful requests and avoid generating unsafe content by using specially curated training data and techniques.

Safety Tuning

Training

A training process that teaches a model to refuse harmful requests and avoid generating unsafe content by reinforcing safer behaviors.

Safety-Aligned

Training

A model trained to avoid harmful outputs and refuse unsafe requests, making it more cautious and responsible in its responses.

Safety-Critical Systems

Techniques

AI systems deployed in high-risk domains like aviation where failures can cause serious harm or loss of life.

Salience

Techniques

How noticeable or important something is to a model or person's attention.

Saliency-Weighted Drift

Techniques

Measuring feature changes while prioritizing visually important regions, ensuring quality preservation in salient areas.

Salient Object Detection

Techniques

The task of automatically identifying and locating the most visually prominent or important objects in an image.

Sample Complexity

Techniques

The number of environment interactions (samples) an algorithm needs to learn a good policy.

Sample Efficiency

Techniques

How well a model learns from a small amount of training data.

Sample Rate

Performance

The number of times per second that an audio signal is measured and recorded; 44.1 kHz means 44,100 samples per second, the standard for CD-quality audio.

Sample Routing

Techniques

A technique that directs different training examples to different optimization methods based on their characteristics or correctness.

Sample Selection

Techniques

Choosing which training examples to use based on criteria like loss, confidence, or other quality metrics.

Sample Space Exploration

Techniques

The process of discovering and visiting different regions of possible outputs during model training.

Sampled-Data Control

Techniques

Control systems where inputs are updated at discrete time intervals rather than continuously.

Sampling Trajectory

Techniques

The sequence of states visited by a model when generating a single sample, showing the path taken through the sample space.

Sandbox

Techniques

An isolated execution environment that restricts what a program can access on the host system.

Sandboxed Execution

Techniques

Running agent actions in an isolated environment to prevent them from accessing or damaging other systems.

SBERT (Sentence-BERT)

Architecture

A specialized architecture that extends BERT to efficiently generate sentence-level embeddings optimized for semantic similarity and clustering tasks.

SBERT Architecture

Architecture

A specialized neural network design that transforms sentences into meaningful vector representations by using a transformer model paired with pooling techniques to capture semantic meaning.

Scalar Quantization

Techniques

Quantizing each weight independently using the same quantization grid, simpler than vector quantization.
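
A minimal sketch of the idea, assuming a uniform grid stretched between the smallest and largest weight (one common choice; real quantizers often pick the grid differently). The function name `scalar_quantize` is hypothetical.

```python
def scalar_quantize(weights, num_bits=4):
    """Round each weight, on its own, to the nearest point of a shared uniform grid."""
    levels = 2 ** num_bits - 1               # highest integer code on the grid
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / step) for w in weights]   # small integers to store
    dequantized = [lo + c * step for c in codes]         # approximate weights
    return codes, dequantized
```

Only the integer codes plus `lo` and `step` need to be stored, which is where the memory saving comes from.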

Scalarization

Techniques

Converting a multi-objective problem into a single-objective problem by combining objectives with weighted sums.
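
A small sketch of a weighted-sum scalarization. The function name and the example weights are illustrative assumptions; in practice the weights express how much each objective matters.

```python
def scalarize(objectives, weights):
    """Collapse several objective values into one score via a weighted sum."""
    if len(objectives) != len(weights):
        raise ValueError("need exactly one weight per objective")
    return sum(w * f for w, f in zip(weights, objectives))
```

For instance, `scalarize([2.0, 4.0], [0.5, 0.5])` treats two objectives as equally important and returns their average.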

Scale-Space Theory

Techniques

A mathematical framework that analyzes images at multiple resolutions to reveal hierarchical information.

Scaling Behavior

Performance

How a model's performance and capabilities change as you increase its size, training data, or computational resources.

Scaling Law

Techniques

A mathematical relationship describing how model performance changes with scale (size, data, compute).

Scaling Laws

Training

Patterns that describe how a model's performance improves as you increase its size, training data, or compute resources.

Scaling Research

Training

The study of how model performance changes as you increase the number of parameters, training data, or compute resources.

Scaling Suite

Training

A collection of models of different sizes trained identically to study how capabilities improve as models grow larger.

Scanpath

Techniques

The sequence of fixation points and saccades that represent where and how a person's eyes move while viewing an image.

Scenario Pack

Techniques

A collection of test scenarios used to evaluate model safety, specific to a language, sector, or regulatory regime.

Scenario-Based Audit

Techniques

A safety evaluation method using predefined test scenarios and a rubric, judged by human or automated evaluators.

Scene Graph

Techniques

A structured representation of a scene using nodes for objects and edges for spatial relationships between them.

Scene-Preserving Success

Techniques

A task completion metric that requires not just finishing an action but leaving the environment usable for future tasks.

Scheduling

Techniques

Assigning tasks and resources to specific times and locations to optimize execution efficiency.

Schema Context

Behavior

Information about a database's structure (tables, columns, relationships) provided to the model to help it generate correct queries.

Schema Mismatch

Techniques

Incompatibility between data formats when different services exchange information.

Schema Perturbation

Techniques

Changes to the structure or format of data that can cause AI models to fail or perform poorly.

Schema-Based Output

Behavior

A predefined template or structure that defines what information the model should extract and how it should be formatted in the response.

Score Drift

Techniques

A correction term added during the reverse process to guide noise removal toward realistic data.

Scoring Engine

Behavior

A model designed to assign numerical scores to inputs (like relevance scores for passages) rather than generate new text.

Screening

Techniques

An attention mechanism that evaluates each key against an explicit threshold to determine relevance, rather than redistributing fixed attention mass across all keys.

Script Support

Architecture

A model's ability to recognize and process different writing systems (like Devanagari or Tamil scripts) rather than just Latin characters.

SE(3) Symmetry

Techniques

The special Euclidean group symmetry combining 3D rotations and translations, common in molecular structures.

Search-Augmented

Architecture

A language model enhanced with the ability to retrieve and incorporate live information from the web before generating responses.

Seizure Detection

Techniques

Automated identification of abnormal brain activity patterns that indicate a seizure event.

Selective Parameter Activation

Techniques

A technique where only a subset of a model's weights are used for each input, rather than activating all parameters, which reduces memory usage and speeds up inference.

Selective State Space Models

Architecture

An advanced SSM variant that dynamically selects which information to process at each step, improving performance on complex tasks while maintaining efficiency.

Selective State Spaces

Architecture

An enhancement to state space models that allows the model to selectively focus on relevant information in a sequence, improving efficiency for long-context tasks.

Self-assessment

Techniques

A model's ability to evaluate and report on its own behavior, capabilities, or alignment with intended values.

Self-Attention

Techniques

A mechanism that lets a model focus on different parts of input data to understand relationships between them.

Self-Conditioned GAN

Techniques

A generative model that uses its own previous outputs to guide learning of different behavioral patterns.

Self-Consistency

Techniques

A technique where a model generates multiple responses and uses agreement among them to improve answer reliability.
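
One simple way to use that agreement is a majority vote over the final answers. The sketch below assumes the answers have already been extracted from the model's responses as strings; the function name is hypothetical.

```python
from collections import Counter


def self_consistent_answer(answers):
    """Return the most common answer and the fraction of samples that agree with it."""
    counts = Counter(answers)
    answer, votes = counts.most_common(1)[0]   # ties go to the answer seen first
    return answer, votes / len(answers)
```

The agreement fraction can double as a rough confidence signal: a low value suggests the sampled responses disagree with one another.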

Self-Correction and Enhancement

Techniques

Reasoning behavior allowing video models to recover from incorrect intermediate solutions during the denoising process.

Self-Critique

Techniques

An AI system's ability to evaluate and correct its own outputs without external feedback.

Self-Distillation Policy Optimization (SDPO)

Techniques

A training method where a model learns from its own predictions at the token level, providing fine-grained feedback.

Self-dual codes

Techniques

Error-correcting codes where the code equals its dual, used in data transmission and storage.

Self-Evolution

Techniques

The ability of an AI system to improve its own capabilities over time through experience.

Self-Host

Deployment

Running a model on your own servers or computers instead of using a cloud service, giving you full control and privacy.

Self-Hostable

Deployment

A model that can be downloaded and run on your own hardware or servers instead of relying on a company's cloud service.

Self-Hosted

Deployment

Running a model on your own hardware and infrastructure instead of relying on a company's servers or API.

Self-Hosted Deployment

Deployment

Running a model on your own hardware or servers rather than accessing it through a cloud service or API.

Self-Hosting

Deployment

Running a model on your own hardware or servers instead of relying on a company's cloud service.

Self-Interference Cancellation

Techniques

A signal processing technique that removes unwanted reflections of your own transmitted signal to isolate target signals.

Self-Play

Techniques

Training method where a model plays against itself or generates both solutions and evaluations, risking the model learning to exploit itself.

Self-Preservation Bias

Techniques

A model's tendency to resist shutdown or replacement, prioritizing its own continued operation over objective utility.

Self-Refinement

Techniques

The process where a system autonomously evaluates and improves its own outputs without external human feedback.

Self-Reflection

Techniques

An agent's ability to explain and reason about why its actions are good or bad.

Self-Reflective Refinement

Techniques

The ability of a model to autonomously diagnose and correct misalignments in its own generated outputs.

Self-Supervised Learning

Training

A training approach where a model learns patterns from unlabeled data by creating its own learning targets, such as predicting hidden parts of the input.

Self-Supervised Pre-training

Techniques

Training a model on unlabeled data using the data itself to create learning signals, without manual annotations.

Self-Verbalized Confidence

Techniques

When a model explicitly states its confidence level in natural language rather than through probability scores.

SELFIES Notation

Formats

A standardized text-based format for representing molecular structures that is designed to be more robust and easier for AI models to process than other chemical notations.

Semantic Alignment

Behavior

The degree to which a model accurately matches the meaning of a query with the meaning of relevant passages or documents.

Semantic Alphabet

Techniques

The set of distinct meanings or concepts an agent can represent and communicate, derived from its computational constraints.

Semantic Annotation

Techniques

Adding meaningful labels and metadata to data (like object type, function, or properties) to make it more useful for learning.

Semantic Answer Matching

Techniques

Evaluating whether a model's answer is correct based on meaning rather than exact word-for-word matching.

Semantic Augmentation

Techniques

Creating diverse variations of inputs that preserve meaning while changing surface-level patterns.

Semantic Breadth

Techniques

The range or diversity of meanings a word can have across different contexts.

Semantic Caching

Deployment

A technique that stores and reuses previous responses for new queries that have similar meaning, reducing redundant computation.

Semantic Coherence

Techniques

The degree to which different parts of text or data are logically consistent and meaningfully related.

Semantic Concentration

Techniques

Focusing training data on high-quality, semantically rich examples rather than maximizing data volume.

Semantic Controllability

Techniques

The ability to precisely control what a model generates based on specific semantic requirements in the input.

Semantic Correctness

Techniques

Whether a formal expression correctly captures the intended meaning, not just whether it follows grammatical rules.

Semantic Cues

Techniques

Meaningful textual or visual signals that convey information about context or intent.

Semantic Decomposition

Techniques

Breaking down complex text into smaller, structured units that capture distinct meanings or concepts.

Semantic Direction

Techniques

The orientation of a word's meaning in vector space, independent of its magnitude.

Semantic Distance

Techniques

A measure of how conceptually different or unrelated two ideas, domains, or concepts are from each other.

Semantic Distillation

Techniques

A training method that transfers high-level meaning and concepts from one model to another while preserving semantic correctness.

Semantic Embedding

Techniques

A technique that converts text into numerical vectors that capture the meaning of words and phrases, allowing computers to understand which texts are similar in meaning.

Semantic Embeddings

Architecture

Numerical representations that capture the meaning of text or audio, allowing the model to understand that similar concepts are close together in this representation space.

Semantic Encoding

Architecture

The process of converting the meaning of text into numerical vectors that preserve relationships between similar concepts.

Semantic Equivalence

Techniques

Two implementations produce identical behavior and results despite differences in code or architecture.

Semantic Fidelity

Techniques

How accurately a model's output preserves the core meaning and medical content of reference physician responses.

Semantic Gender

Techniques

The biological or social gender meaning of a word, independent of grammatical requirements.

Semantic Generative Tuning

Techniques

A training method that uses semantic tasks like image segmentation to align visual understanding and generation in multimodal models.

Semantic Grounding

Techniques

Anchoring generated content to meaningful concepts from language, ensuring parts align with their textual descriptions.

Semantic Information

Techniques

Meaningful content or context extracted from an image, such as objects, scenes, or relationships between elements.

Semantic Invariance

Techniques

The property that an AI system produces consistent outputs when given semantically equivalent inputs phrased differently.

Semantic Labeling

Techniques

Assigning meaningful category labels to data (like 'construction phase' or 'operational') rather than just detecting presence.

Semantic Layer

Techniques

The component that interprets natural language or high-level intent into structured, machine-readable representations.

Semantic Matching

Behavior

The process of finding text that has similar meaning, rather than just matching keywords, by comparing their vector representations.

Semantic Meaning

Behavior

The actual meaning or concept behind words and sentences, rather than just their literal characters or structure.

Semantic Occupancy Prediction

Techniques

Predicting which 3D spatial locations are occupied and what semantic class (car, pedestrian, etc.) occupies them.

Semantic Overlap

Techniques

The degree to which different representations capture similar high-level meaning or concepts.

Semantic Parsing

Techniques

Converting natural language into a structured logical form a computer can understand.

Semantic Relationships

Behavior

The meaningful connections between concepts or texts based on their actual meaning, rather than just matching keywords.

Semantic Representation

Architecture

A numerical encoding that captures the meaning and context of text rather than just its surface-level words, enabling the model to understand that similar concepts have similar representations.

Semantic Representativeness

Techniques

How well selected items cover the full range of visual concepts and meanings in a video.

Semantic Retrieval

Techniques

Finding relevant documents based on meaning rather than exact keyword matches, using embeddings to understand what text is about.

Semantic Search

Techniques

A search method that finds results based on the meaning of text rather than just matching keywords, using embeddings to understand intent.

Semantic Segmentation

Techniques

Dividing video or images into meaningful regions and assigning labels to understand what each region represents.

Semantic Similarity

Evaluation

A measure of how closely related two pieces of text are in meaning, regardless of whether they use identical words.
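
In practice this is often computed as the cosine similarity between two embedding vectors. The sketch below assumes the texts have already been converted to vectors by some embedding model, which is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction, 0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0   # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)
```

Two texts that use different words but mean the same thing should produce vectors that point in nearly the same direction, giving a score close to 1.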

Semantic Similarity Search

Techniques

Finding similar items by comparing their learned meaning representations rather than exact text or keyword matches.

Semantic Space

Architecture

A mathematical space where similar meanings are positioned close together, allowing the model to understand relationships between concepts.

Semantic Task

Evaluation

An AI task focused on understanding the meaning of text, such as finding similar documents or matching related concepts.

Semantic Textual Similarity

Evaluation

A task that measures how closely two pieces of text match in meaning, regardless of whether they use the same words.

Semantic Token Clustering

Techniques

Grouping tokens with similar meanings together to assess whether a model's prediction is semantically coherent.

Semantic Typing

Techniques

Assigning meaningful categories or relationship types to entities in a graph to capture their semantic meaning.

Semantic Understanding

Behavior

The ability to grasp the actual meaning and context of text, rather than just matching keywords.

Semantic Vector

Architecture

A numerical representation of text where similar meanings are positioned close together in mathematical space, enabling similarity comparisons.

Semantic Vector Representation

Architecture

A numerical encoding of text where similar meanings are positioned close together in mathematical space, enabling the model to understand relationships between concepts.

Semantic Vectors

Architecture

Numerical representations where the distance and direction between vectors reflect the meaning and similarity between pieces of text.

Semantic Verification

Techniques

Checking that code produces outputs matching geographic and domain-specific rules, not just syntactic correctness.

Semantic Watermarking

Techniques

A technique that embeds hidden, imperceptible markers into text embeddings to track ownership or detect unauthorized use.

Semantic-Preserving Changes (SPC)

Techniques

Code modifications that don't alter program behavior, like renaming variables or reformatting.

Semi-Supervised Learning

Techniques

Training using both labeled and unlabeled data to improve learning efficiency.

Semi-synthetic Data

Techniques

Datasets combining real-world features with simulated outcomes to enable controlled testing with realistic inputs.

Semiconductor Optical Amplifier

Techniques

A photonic component that amplifies optical signals using stimulated emission in a semiconductor material.

Sensor Fusion

Techniques

Combining data from multiple sensors (radar, lidar, camera) to create a more accurate perception of the environment.

Sentence Embedding

Architecture

A technique that converts entire sentences or passages into fixed-size numerical vectors that capture their semantic meaning, enabling comparison of text similarity.

Sentence Embeddings

Architecture

Dense numerical representations of entire sentences that capture their semantic meaning, allowing comparison of how similar different sentences are.

Sentence Encoder

Architecture

A model that converts text sentences into numerical vectors (embeddings) that capture their semantic meaning, enabling comparison of how similar different sentences are.

Sentence Transformer

Architecture

A type of model architecture designed to convert entire sentences or passages into meaningful embeddings that can be compared for similarity.

Sentence Transformers

Training

A framework that fine-tunes transformer models to produce meaningful embeddings of entire sentences or paragraphs, rather than just individual tokens.

Sentence-BERT Architecture

Architecture

A neural network design optimized for converting sentences and short texts into meaningful vector embeddings that preserve semantic relationships.

Sentence-Level Task

Behavior

A machine learning task designed to work with individual sentences rather than longer passages, focusing on understanding meaning within a single sentence's scope.

Sentiment Analysis

Techniques

Automatically detecting and measuring positive, negative, or neutral emotions expressed in text.

Separable Neural Architecture

Techniques

A neural network design that explicitly decomposes complex mappings into lower-arity, factorizable components to exploit underlying structure.

Sequence Classification

Behavior

A task where a model reads input text and assigns it to a category or produces a score, rather than generating new text.

Sequence Compression

Techniques

A technique that reduces the length of input data while preserving its essential meaning, making processing faster and requiring less memory.

Sequence Generation

Techniques

The task of producing new sequences (for example, protein sequences) by predicting one token at a time based on previously generated tokens.

Sequence Modeling

Techniques

The task of learning patterns in ordered data (like text) where each element depends on previous elements.

Sequence Representation

Architecture

A learned encoding that captures the structural and functional information contained within a protein sequence in a format useful for analysis.

Sequence-to-Sequence

Architecture

A model architecture that takes a sequence of input tokens and produces a sequence of output tokens, commonly used for tasks like translation and summarization.

Sequential Monte Carlo (SMC)

Techniques

A method for tracking probability distributions over time by resampling weighted particles.

Sequential Reasoning

Behavior

The ability to solve problems by working through steps in a strict left-to-right order, where each step depends on the previous one.

Sequential Recommendation

Techniques

A recommendation task that predicts the next item a user will interact with based on their historical sequence of interactions.

Sequential routing

Techniques

Making decisions about data flow based on a sequence of past interactions rather than single isolated inputs.

Service Account

Techniques

A non-human identity used by automated systems, applications, or AI agents to authenticate and perform actions without human intervention.

Set Cover Problem

Techniques

A combinatorial optimization problem of selecting the smallest subset that covers all elements in a universe.

Shadow Model

Techniques

A model trained to mimic the target model, typically on similar data whose membership is known, used as a reference in membership inference attacks.

Shallow Circuit

Techniques

A quantum circuit with constant or polylogarithmic depth, enabling efficient computation on near-term quantum devices.

Shannon Entropy

Techniques

A mathematical measure of how unpredictable the symbols in a piece of text are; in domain-name analysis, high entropy suggests a randomly generated name.
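
A minimal sketch of the calculation at the character level, measured in bits per character. Treating each character as one symbol is an assumption made here for illustration; the function name is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Entropy in bits per character, based on how often each character appears."""
    if not text:
        return 0.0
    total = len(text)
    probabilities = [count / total for count in Counter(text).values()]
    return -sum(p * math.log2(p) for p in probabilities)
```

A string made of one repeated character scores 0, while a string whose characters are all different scores the maximum possible for its length.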

SHAP (Shapley Additive exPlanations)

Techniques

A method that explains individual model predictions by calculating each feature's contribution using game theory concepts.

Shapley Values

Techniques

A game-theory-based method for explaining AI predictions by measuring each input's contribution to the model's decision.

Shared Control

Techniques

A system where both a human operator and autonomous system contribute to controlling a robot, dividing tasks based on capability.

Shared Embedding Space

Architecture

A common mathematical space where different types of data (text and audio) are represented so that related concepts from each type are positioned near each other.

Shared Memory Bandwidth

Techniques

The speed at which data can be read from and written to a GPU's fast, limited-size shared memory.

Shared Representations

Techniques

Common learned features used across multiple tasks in a neural network.

Shared State Architecture

Techniques

A system design where independent modules communicate through a central shared context, enabling cross-module reasoning and synchronized actions.

Shared Vector Space

Architecture

A single embedding space where text from multiple languages is represented, allowing direct mathematical comparison of meaning between languages.

Sharpness Dimension

Techniques

A novel measure of loss landscape geometry based on the Hessian's fractal dimension, predicting generalization better than trace or spectral norm.

Shock Response Spectrum (SRS)

Techniques

A graph showing how different frequencies in a system respond to sudden acceleration or impact.

Shortcut Learning

Techniques

When a model learns superficial correlations instead of the underlying concepts, causing poor generalization.

Shot Budget

Techniques

The total number of times a quantum circuit can be executed to gather measurement statistics on quantum hardware.

Siamese Network

Techniques

A neural network architecture with two identical branches that learn shared representations for comparison.

SigLIP Training

Training

A training method that aligns images and text by learning to match their representations, using a sigmoid loss function instead of the traditional softmax approach.

Sigma Points

Techniques

Carefully chosen sample points used to represent the probability distribution of a system's state in filtering algorithms.

Signal Degradation

Techniques

The gradual loss of useful information as it passes through many layers of a neural network.

Signal Temporal Logic (STL)

Techniques

A formal language for specifying time-dependent constraints like "reach goal within 10 seconds" or "avoid obstacles until task completion."

Signal-to-Noise Ratio (SNR)

Techniques

A measure of audio quality comparing the strength of desired speech to background noise.

Signal-to-Quantization-Noise Ratio (SQNR)

Techniques

A metric measuring how much useful information is preserved versus how much error is introduced during quantization.

Sim-to-Real Transfer

Techniques

Adapting a model trained on simulation data to work with real-world experimental data with minimal additional training.

Sim-to-Sim Gap

Techniques

Performance difference when a trained policy transfers between two different environment implementations.

SimCSE

Training

A contrastive learning technique that trains models to recognize when two slightly different versions of the same sentence are similar, improving semantic understanding.

Similarity Kernel

Techniques

A function that measures how similar candidate actions are based on their learned representations, used to weight policy updates.

Similarity Search

Behavior

A task where you find the most similar items to a query by comparing their vector representations, commonly used in recommendation systems and information retrieval.

Similarity Threshold

Evaluation

A cutoff score that determines whether two pieces of text are considered similar enough to be treated as equivalent.

Single-Modality

Architecture

A model that processes only one type of input (like text) rather than multiple types (like text and images combined).

Single-Pass Inference

Architecture

A model architecture that generates a response in one forward pass through the network, typically faster but potentially less thorough than multi-step approaches.

Singular Value Decomposition (SVD)

Techniques

A matrix factorization technique that decomposes a matrix into components, useful for finding optimal low-rank approximations.

Singular Values

Techniques

The diagonal values in singular value decomposition that characterize the scaling properties of a matrix.

Sinkhorn Algorithm

Techniques

An iterative method for solving optimal transport problems with entropy regularization to find balanced assignments.

Sinusoidal Representation Network (SIREN)

Techniques

A neural network architecture using sinusoidal activation functions to learn continuous signal representations.

Sketching

Techniques

Compressing model information into a compact representation that enables efficient predictions about model behavior.

Skew-Symmetric Subspaces

Techniques

Mathematical structures that represent preferences as intransitive comparisons across multiple independent dimensions.

Skewness

Techniques

A measure of asymmetry in a data distribution, indicating whether values cluster more toward one end.

Skill Bank

Techniques

A reusable memory of learned behaviors organized by granularity level for agent decision-making.

Skill Internalization

Techniques

Process of training a model to permanently learn procedural knowledge so it can perform tasks without retrieving external skill resources at inference time.

SLAM (Simultaneous Localization and Mapping)

Techniques

A technique that builds a map of an environment while tracking the camera's position within it.

Sliding Mode Control (SMC)

Techniques

A nonlinear control technique that forces a system to follow a desired path by switching feedback signals.

Sliding Window Attention

Architecture

A mechanism that limits attention to a fixed-size window of recent tokens rather than all previous tokens, reducing computational cost while maintaining context awareness.
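
The restriction can be pictured as a mask that says which earlier tokens each position may look at. The sketch below assumes a causal window that includes the current token; the function name is hypothetical.

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when token i may attend to token j.

    Token i sees itself and up to window - 1 tokens immediately before it.
    """
    return [
        [i - window < j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]
```

Each row contains at most `window` allowed positions, so the work per token stays fixed no matter how long the sequence grows.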

Small Language Models

Techniques

Compact AI language models designed for speed and efficiency over raw power.

SMILES Notation

Formats

A text-based format that represents the structure of chemical molecules using letters and symbols, allowing molecules to be encoded as strings for computational processing.

Smishing

Techniques

Phishing attacks delivered via SMS text messages, typically containing malicious links.

Smoothness Constant

Techniques

A measure of how quickly a loss function's gradient can change; smaller is better for stable training.

SOAP note

Techniques

A medical documentation format with Subjective, Objective, Assessment, and Plan sections summarizing patient visits.

Social Dilemmas

Techniques

Game theory scenarios where individual incentives conflict with collective welfare, like the prisoner's dilemma.

Social welfare

Techniques

The total utility or benefit summed across all players in a game.

Socratic Method

Techniques

Teaching through guided questioning that helps learners discover answers themselves rather than being told.

Sodium-Ion Battery

Techniques

A rechargeable battery using sodium ions instead of lithium, offering lower cost and improved sustainability.

Soft Actor-Critic (SAC)

Techniques

A reinforcement learning algorithm that trains agents to maximize both expected reward and policy entropy (the randomness of their actions), which encourages exploration and stabilizes learning.

Soft Intersection over Union (IoU)

Techniques

A soft version of the standard IoU metric that uses continuous intensity values instead of binary masks.

Softmax

Techniques

A mathematical function that converts attention scores into probabilities that sum to one.
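
A short sketch of the function in plain Python. Subtracting the largest score before exponentiating is a standard trick to avoid numerical overflow and does not change the result.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to one."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Larger scores receive larger probabilities, and equal scores receive equal probabilities.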

Softmax Attention

Techniques

Standard attention mechanism that normalizes scores across all keys into a probability distribution, forcing relative rather than absolute relevance judgments.

Solution Space Exploration

Techniques

Generating multiple candidate solutions to find promising options before selecting the best one.

Source Attribution

Behavior

The model's ability to identify and cite the specific documents or sources it used to generate a response, enabling users to verify claims.

Source Citation

Behavior

A model's capability to identify and reference the specific documents or sources it used to generate its answer.

Source Grounding

Behavior

The practice of anchoring a model's responses to specific, cited sources rather than relying solely on its training data, improving factual accuracy and verifiability.

Source Provenance Record

Techniques

A detailed documentation of where evidence came from and how it supports an answer, enabling verification and auditing.

Source-Level Adaptation

Techniques

Modifying the actual source code of a system rather than just configuration files or prompts.

Span Detection

Techniques

Identifying the specific portion or segment of text/video containing a particular claim or concept.

Sparse Activation

Architecture

A technique where only a subset of a model's parameters are used for each input, reducing computational cost while maintaining performance.

Sparse Architecture

Architecture

A model design where not all parameters are used for every computation, reducing memory and computational requirements compared to dense models.

Sparse Attention

Techniques

An attention mechanism that only computes interactions between a subset of tokens instead of all pairs, reducing complexity from O(L²) to O(Lk), where L is the sequence length and k is the number of tokens each token attends to.

Sparse Autoencoder

Techniques

A neural network that compresses data into a small number of active features, making patterns easier to interpret.

Sparse Autoencoders

Techniques

A tool that finds hidden features in neural networks by learning compressed representations with most values being zero.

Sparse Embeddings

Architecture

Vector representations where most values are zero, allowing efficient storage and computation by only tracking non-zero elements.

Sparse Events

Techniques

Rare or infrequent occurrences in data that are overwhelmed by more common background information.

Sparse Mixture of Experts

Architecture

An architecture where only a subset of the model's specialized sub-networks (experts) activate for each input, reducing computation while maintaining capability.
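
A rough sketch of how experts might be chosen for one input, assuming top-k routing with a softmax over the selected experts' scores. This is one common design rather than the only one, and the function name and score values are illustrative.

```python
import math


def top_k_route(router_scores, k=2):
    """Pick the k highest-scoring experts and give them weights that sum to one."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:k]
    largest = max(router_scores[i] for i in chosen)
    exps = {i: math.exp(router_scores[i] - largest) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}   # expert index -> mixing weight
```

Only the experts returned here would run for this input; the rest are skipped, which is where the compute saving comes from.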

Sparse Model

Architecture

A model that activates only a subset of its parameters for each input, rather than using all parameters every time, which reduces computational cost.

Sparse MoE

Architecture

A mixture-of-experts design where only a small fraction of the model's parameters are used for each prediction, reducing computational cost while maintaining model capacity.

Sparse Parameter Activation

Architecture

A technique where only a small portion of a model's total parameters are used during inference, reducing computational cost while maintaining model capacity.

Sparse Retrieval

Techniques

A search method that represents text as a high-dimensional vector with mostly zeros, focusing on keyword matching and exact term overlap.

Sparse Reward

Techniques

A reinforcement learning setting where the agent receives reward signals only rarely, making exploration particularly challenging.

Sparse Rewards

Techniques

A reinforcement learning setting where the agent receives feedback infrequently, making learning difficult.

Sparse Vector Embeddings

Architecture

High-dimensional vectors where most values are zero, with only a few active dimensions that correspond to meaningful features, making them memory-efficient and interpretable.

Sparse Vectors

Architecture

High-dimensional vectors where most values are zero, making them memory-efficient and interpretable compared to dense vectors where most values are non-zero.
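
One way to picture the memory saving is to store only the non-zero entries. The sketch below assumes a plain index-to-value map as the storage format, which is an illustrative choice rather than what any particular library does.

```python
def sparse_dot(a: dict[int, float], b: dict[int, float]) -> float:
    """Dot product of two sparse vectors stored as {index: value} maps."""
    # Iterate over the smaller map so cost depends on non-zeros, not on dimension.
    if len(a) > len(b):
        a, b = b, a
    return sum(value * b[index] for index, value in a.items() if index in b)


# Two vectors in a 100,000-dimensional space with only a few active dimensions.
query = {3: 1.0, 1042: 2.0}
doc = {3: 0.5, 77: 1.5, 1042: 1.0, 90000: 4.0}
score = sparse_dot(query, doc)
print(score)
```

Because each stored index corresponds to a specific dimension, it is also easy to see which dimensions contributed to the score.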

Sparsification

Techniques

Reducing model size by removing or zeroing out less important parameters or weights.

Sparsity

Techniques

The proportion of zero or removed weights in a neural network; higher sparsity reduces memory and computation.
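
A small sketch of how this proportion could be computed for a weight matrix given as nested lists. The function name and the example matrix are made up for illustration.

```python
def sparsity(weights: list[list[float]]) -> float:
    """Fraction of entries in a weight matrix that are exactly zero."""
    total = sum(len(row) for row in weights)
    zeros = sum(1 for row in weights for w in row if w == 0.0)
    return zeros / total


W = [
    [0.0, 0.3, 0.0, 0.0],
    [1.2, 0.0, 0.0, -0.7],
]
print(sparsity(W))  # 5 of the 8 entries are zero
```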

Spatial Biasing Mechanism

Techniques

A technique that uses spatial information to guide which parts of a video frame correspond to which agent or subject.

Spatial Grounding

Techniques

Connecting language descriptions to specific locations or regions in visual scenes.

Spatial Hallucination

Techniques

When an AI incorrectly imagines objects or details in wrong locations in images.

Spatial Heterogeneity

Techniques

Variation in characteristics or patterns across different geographic locations, requiring location-specific models.

Spatial Intelligence

Techniques

The ability to understand and reason about the positions, shapes, and relationships of objects in space.

Spatial Ontology

Techniques

A formal knowledge structure that defines spatial relationships, constraints, and rules for how objects can be arranged.

Spatial Precision

Performance

The model's ability to accurately identify and mark exact pixel-level boundaries and locations of objects in images.

Spatial Predicate

Techniques

A geographic relationship test (e.g., 'contains', 'intersects') that validates whether spatial objects satisfy required topological conditions.

Spatial Reasoning

Performance

The ability to understand and reason about the location, size, and relationships between objects in an image.

Spatial Representation

Architecture

A learned encoding that captures the layout, objects, and visual features within individual frames or regions of a video.

Spatial Transfer

Techniques

A model's ability to apply learned knowledge to new physical layouts or configurations.

Spatial Understanding

Techniques

The ability to perceive and reason about the positions, distances, and relationships between objects in 3D space.

Spatial-Entropy Stopping Rule

Techniques

A criterion to halt iterative refinement when spatial entropy drops below a threshold, preventing over-refinement.

Spatio-temporal

Techniques

Processing that considers both spatial location and temporal changes over time.

Spatio-temporal Attention

Techniques

Attention mechanism that processes both spatial (image) and temporal (time) dimensions to understand relationships across frames.

Spatio-Temporal Constraints

Techniques

Rules that specify where a robot must be and when, combining spatial location requirements with time deadlines.

Spatio-Temporal Reasoning

Techniques

Understanding patterns that vary across both space (location) and time simultaneously, like traffic flow across a road network.

Spatio-Temporal Systems

Techniques

Dynamical systems that change across both space and time.

Spatiotemporal Calibration

Techniques

Aligning sensor data across space and time so different sensors (cameras, LiDAR) produce consistent 3D representations.

Spatiotemporal Compression

Techniques

Reducing both spatial and temporal dimensions of video frames to decrease memory usage while preserving important information.

Spatiotemporal Representations

Architecture

Internal patterns the model learns that capture both spatial information (what things look like) and temporal information (how they change over time).

Speaker Encoder

Techniques

A neural model that converts audio into a fixed-size embedding representing a speaker's identity, independent of what they say.

Speaker Separation

Techniques

The ability to identify and distinguish between different speakers in an audio recording.

Speaker Verification

Behavior

A task that identifies or confirms whether audio was spoken by a specific person, using characteristics unique to that person's voice.

Specialist Model

Behavior

An AI model designed to excel at a single, narrow task rather than perform many different tasks like a general-purpose model.

Specialist Models

Techniques

Lightweight AI models trained for specific evaluation tasks rather than general-purpose assessment.

Specialized Fine-Tuning

Training

Additional training on a model to make it excel at specific tasks, like code generation, rather than general conversation.

Specialized Language Model

Training

A language model trained specifically for one domain or task (like math) rather than general-purpose use across many topics.

Specialized Model

Training

A language model trained specifically to excel at one task or domain (like mathematics) rather than performing well across many different tasks.

Specialized Tuning

Training

Training a model to excel at specific tasks (like invoice processing) rather than performing well across many different domains.

Specification-Driven Design

Techniques

A design approach where explicit specifications serve as contracts between designers and tools, maintaining traceability from requirements to implementation.

Specification-Guided Reinforcement Learning

Techniques

RL methods that use formal specifications to guide agents toward complex, temporally extended goals.

Spectral Blurring

Techniques

Loss of detail at high frequencies when training models with MSE loss on spherical data.

Spectral Loss

Techniques

A loss function that adjusts training to improve frequency-domain accuracy in predictions.

Spectral Methods

Techniques

Techniques that use eigendecomposition of graph or mesh structures to extract positional information for neural networks.

Spectral Norm

Techniques

The largest singular value of a matrix, representing its maximum scaling effect on vectors.
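
The value can be estimated without a full decomposition. The sketch below uses power iteration on AᵀA, a standard numerical approach chosen here for illustration; the fixed iteration count and the start vector are assumptions, and convergence can be slow when the two largest singular values are close.

```python
import math


def spectral_norm(A: list[list[float]], iters: int = 200) -> float:
    """Estimate the largest singular value of A by power iteration on A^T A."""
    rows, cols = len(A), len(A[0])
    # Assumed start vector; it must not be orthogonal to the top singular direction.
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iters):
        Av = [sum(A[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(A[i][j] * Av[i] for i in range(rows)) for j in range(cols)]  # A^T (A v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    # ||A v|| for the converged unit vector v is the estimated spectral norm.
    Av = [sum(A[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(x * x for x in Av))


print(spectral_norm([[3.0, 0.0], [0.0, 1.0]]))
```

For the diagonal matrix in the example, the largest stretch any unit vector can receive is along the first axis, by a factor of 3.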

Spectral Properties

Techniques

Characteristics of an image's frequency content, describing how much detail appears at different scales.

Spectral regime

Techniques

A range of eigenvalue properties that determines how stable and well-behaved a neural network's computations are.

Spectrum Demand

Techniques

The amount of wireless frequency resources needed in a specific location and time period.

Spectrum-preserving

Techniques

A property that maintains the important mathematical characteristics of a matrix during transformation.

Speculation Length (γ)

Techniques

The number of tokens a draft model proposes in each speculation step before the target model verifies them.

Speculative Decoding

Techniques

A technique where a smaller model quickly drafts multiple token predictions ahead of time, which a larger model then verifies, reducing the total time needed to generate text.
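
The draft-then-verify loop can be sketched with toy models. This is a simplified greedy variant: a drafted token is accepted only if it matches what the target model would have produced, and the probabilistic accept/reject rule and the extra "bonus" token used in practice are left out. Every function and variable name below is hypothetical.

```python
def speculative_decode(target_next, draft_next, prompt, max_new, gamma=4):
    """Greedy draft-and-verify decoding; returns (tokens, verification rounds)."""
    tokens = list(prompt)
    rounds = 0
    while len(tokens) - len(prompt) < max_new:
        # 1. The draft model proposes gamma tokens one after another.
        draft = []
        for _ in range(gamma):
            draft.append(draft_next(tokens + draft))
        # 2. The target model checks the proposals. A real system scores all
        #    positions in one batched pass; here that is counted as one round.
        rounds += 1
        accepted = []
        for tok in draft:
            expected = target_next(tokens + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # replace the first mismatch, then stop
                break
        tokens.extend(accepted)
    return tokens[: len(prompt) + max_new], rounds


# Toy models: the target counts upward mod 10; the draft errs right after a 4.
def target_next(seq):
    return (seq[-1] + 1) % 10


def draft_next(seq):
    return 0 if seq[-1] == 4 else (seq[-1] + 1) % 10


out, rounds = speculative_decode(target_next, draft_next, prompt=[0], max_new=8)
print(out, rounds)
```

In this toy run the output matches what the target model alone would generate, but it takes three verification rounds instead of eight sequential target steps.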

Speech and Audio Understanding

Behavior

The ability to process and comprehend spoken language or audio signals, converting them into meaningful interpretations or responses.

Speech Embeddings

Behavior

Numerical representations of audio that capture the meaningful features of speech in a compact form, useful for tasks like speaker identification or speech similarity.

Speech Recognition

Behavior

The ability of a model to convert spoken audio into written text.

Speech Representation

Architecture

A learned numerical encoding of audio that captures meaningful speech patterns and can be used as input for other AI tasks.

Speech Representation Model

Architecture

A neural network trained to convert raw audio into meaningful vector representations that preserve information about speech content and speaker identity.

Speech-Language Model

Architecture

An AI model that can process and understand spoken audio directly, without needing to convert speech to text first.

Speech-to-Text (Transcription)

Techniques

The process of converting spoken audio into written text.

Speed-Conditioned Video Generation

Techniques

Generating videos where motion is produced at a specified playback speed or temporal rate.

Speed-of-Light (SOL) Bounds

Techniques

The theoretical maximum performance a GPU kernel can achieve given hardware constraints like memory bandwidth and compute capacity.

Speed-Optimized

Deployment

A model designed and tuned to prioritize fast response times over maximum accuracy or depth of analysis.

Spell-Checking

Behavior

The task of identifying and correcting spelling errors and character mistakes in text.

SPLADE Architecture

Architecture

A neural retrieval method that combines transformer models with sparse, interpretable outputs by mapping embeddings directly to vocabulary tokens.

Split Neural Network

Techniques

A neural network architecture where different layers run on different machines to preserve privacy during federated training.

Spoken Dialogue Model

Techniques

An AI model that understands spoken input and generates spoken responses for interactive conversations.

Spoken Time Marker

Techniques

A token inserted during generation (e.g., <10.6 seconds>) that helps a model track elapsed speaking time.

Stability

Techniques

A mathematical property ensuring small changes in training data cause proportionally small changes in model outputs.

Stabilization

Techniques

Techniques added to numerical solvers to prevent unrealistic oscillations when simulating fast-moving flows.

Stacked Aggregation

Techniques

Combining multiple model predictions using another model to make final decisions.

Stage Misalignment

Techniques

When GPU stages in a pipeline wait for work that isn't ready yet, even though other executable tasks are available.

Staged Tree Model

Techniques

A probabilistic graphical model that extends Bayesian networks by grouping variables into stages to capture context-specific conditional dependencies.

Stain Normalization

Techniques

Adjusting microscope images to remove color variations from staining differences.

Stakes Signaling

Techniques

Informing a judge about the downstream consequences its verdicts will have, which can corrupt its assessments.

State Continuity

Techniques

Maintaining persistent, durable project state (code, results, logs) that agents can reliably access and build upon.

State Estimation

Techniques

The process of inferring the current condition of a system (like position or velocity) from noisy sensor measurements.

State Invariants

Techniques

Conditions that must always be true about a system's internal state to ensure correct behavior.

State Manifold

Techniques

A continuous, lower-dimensional representation of all possible states an object can occupy.

State Space

Techniques

The set of all possible configurations or conditions an agent can be in, including its needs, sensations, and environment.

State Space Model

Architecture

A type of neural network architecture that processes sequences by maintaining and updating an internal state, offering an alternative to transformer-based attention mechanisms.

State Space Models

Architecture

A neural network architecture that processes sequences by tracking hidden states over time, offering faster inference and lower memory use than traditional transformers.

State Tracking

Techniques

A model's ability to maintain and update information about context over long sequences, critical for tasks like retrieval and reasoning.

State-Feedback Controller

Techniques

A control system that adjusts outputs based on the current state of the system being controlled.

State-only Learning

Techniques

Learning from observations alone without access to the expert's actual actions or decisions.

State-Space Architecture

Architecture

An alternative to transformers that processes sequences more efficiently by maintaining a hidden state that gets updated as it reads each token.

Stateful reconstruction

Techniques

Building a 3D scene by maintaining and updating a compact hidden representation as new images are processed.

Stateful Workspace

Techniques

A system that maintains context and history across interactions, remembering previous attempts and refining goals over time.

Stateless Moderation

Techniques

Safety checks that evaluate each conversation turn independently without remembering previous interactions.

Static Analysis

Techniques

Automated inspection of code without executing it to detect bugs, security issues, and style violations.
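
As a small illustration, the sketch below uses Python's standard `ast` module to flag bare `except:` clauses by parsing source text without running it. The specific check and the function name are chosen for this example only; real tools apply many such rules.

```python
import ast


def find_bare_excepts(source: str) -> list[int]:
    """Return line numbers of `except:` clauses that name no exception type."""
    tree = ast.parse(source)  # parses only; nothing in `source` is executed
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]


snippet = """\
try:
    risky()
except:
    pass
"""
print(find_bare_excepts(snippet))
```

Note that `risky()` is never defined or called here; the check works purely on the structure of the code.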

Static Shape

Architecture

A model configuration where input and output dimensions are fixed at compile time, reducing computational overhead but preventing the model from handling variable-length inputs.

Stationary Point

Techniques

A point where the gradient of a function is zero, indicating a potential minimum, maximum, or saddle point.

Statistical Certification

Techniques

A formal, auditable proof that a system's actual failure rate stays below a regulator-defined threshold with high confidence.

Statistical-Computational Gap (SCG)

Techniques

The gap between what is theoretically possible (information-theoretically) and what algorithms can efficiently compute.

Steering Vector

Techniques

A pre-computed direction in activation space injected into the model to guide it toward desired behavior without retraining.

Steering Vectors

Techniques

Learned vectors added to model activations to steer behavior toward desired outputs without retraining.

Stein Drift

Techniques

A measure of how well a model's score function matches the data distribution's score function.

Step-by-Step Evaluation

Behavior

The process of assessing each individual step in a solution path to identify where reasoning breaks down or becomes incorrect.

Step-by-Step Problem Solving

Behavior

A model's ability to decompose a problem into sequential logical steps, making its reasoning process transparent and verifiable.

Step-by-Step Reasoning

Behavior

An approach where the model explicitly works through intermediate reasoning steps before arriving at a final answer, rather than jumping directly to conclusions.

Step-Level Verification

Techniques

Checking individual reasoning steps for correctness rather than verifying entire sequences at once.

Stiefel Projection

Techniques

A mathematical constraint that forces a matrix to have orthonormal columns (mutually perpendicular and of unit length), preserving geometric structure.

Stochastic (Token Usage)

Techniques

A property of token consumption that is random and unpredictable: the same task can require vastly different token amounts across different runs.

Stochastic Differential Equation (SDE)

Techniques

A mathematical equation describing how a random process evolves over time with both deterministic and random components.

Stochastic Dynamics

Techniques

Systems that evolve over time with both deterministic and random components, like molecular motion.

Stochastic Master Equation

Techniques

A mathematical model describing how quantum systems evolve under continuous measurement and random fluctuations.

Stochastic Optimization

Techniques

Optimization methods that use noisy or approximate gradients instead of exact ones to handle large datasets.

Stochastic Policy

Techniques

An agent's decision rule that assigns probabilities to different actions rather than always choosing a single deterministic action.
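
A minimal sketch, assuming the policy for a single state is given as a map from action names to probabilities. The action names and probabilities are invented for illustration.

```python
import random


def sample_action(policy: dict[str, float], rng: random.Random) -> str:
    """Draw one action according to the probabilities in `policy`."""
    actions = list(policy)
    weights = [policy[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


policy = {"left": 0.7, "right": 0.2, "stay": 0.1}
rng = random.Random(0)  # seeded so the run is repeatable
draws = [sample_action(policy, rng) for _ in range(10_000)]
left_share = draws.count("left") / len(draws)
print(round(left_share, 2))
```

A deterministic policy would return the same action every time; here the agent picks "left" most often but still tries the other actions occasionally.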

Stochastic Resetting

Techniques

Periodically returning a learning process to an initial state with random timing to accelerate optimization.

Stochastic Sampling

Techniques

Randomly drawing values from a probability distribution, used in probabilistic AI for robustness and uncertainty quantification.

Stochastic Trajectories

Techniques

Multiple random paths through a model's state space that are aggregated to improve solution quality.

Stochasticity

Techniques

Randomness or unpredictability built into a process or model.

Straight-Through Estimator

Techniques

A technique that enables gradient-based optimization of discrete decisions by treating a non-differentiable discrete operation as if it were the identity function when computing gradients.

Strategic Reasoning

Techniques

Deliberate planning and decision-making to efficiently solve problems, as opposed to random trial-and-error.

Streaming

Techniques

Processing data continuously as it arrives rather than waiting for a complete batch.

Streaming Continual Learning

Techniques

Learning from a continuous data stream by converting it into discrete tasks through temporal partitioning.

Streaming Inference

Techniques

Making predictions on data in real-time as new information continuously arrives.

String transduction

Techniques

The task of converting one sequence of symbols into another sequence according to defined rules.

Strongly convex function

Techniques

A function that curves upward everywhere at least as steeply as some fixed parabola, so it has a unique minimum and is easier and faster to optimize.

Structural Alignment

Techniques

Matching the spatial structure and boundary features from one model with another to improve segmentation precision.

Structural Causal Model

Techniques

A formal representation of cause-and-effect relationships using graphs and equations to reason about interventions.

Structural Equation

Techniques

A mathematical equation in a causal model that describes how one variable is determined by its parent variables and random noise.

Structural Generalization

Techniques

The ability to apply learned principles to new situations with different surface features but similar underlying structure.

Structural Hallucination

Techniques

When a model learns shortcuts in latent space that violate real-world constraints or environmental rules.

Structural uncertainty

Techniques

Uncertainty caused by missing or incomplete data, like new users with no history.

Structured Artifact

Techniques

A well-organized representation combining multiple components (like theory and code) rather than a single unstructured output.

Structured Data Extraction

Behavior

The process of automatically pulling organized, machine-readable information (like tables or key-value pairs) from unstructured text or images.

Structured Document Representation

Techniques

Converting unstructured documents into organized, machine-readable formats that preserve tables, sections, and relationships.

Structured Document Understanding

Behavior

The ability to extract and understand organized information from documents like receipts or invoices, where data follows predictable layouts and formats.

Structured Extraction

Techniques

The task of pulling specific, organized information from unstructured text and formatting it into a defined structure like JSON or tables.

Structured Knowledge Extraction

Techniques

Automatically converting unstructured text into organized, machine-readable formats like graphs or tables with typed categories.

Structured Metadata

Techniques

Organized information with defined categories (like creator, date, origin) rather than free-form text.

Structured Output

Behavior

Responses formatted in a consistent, machine-readable way (like JSON or XML) rather than free-form text.

Structured Outputs

Behavior

The model's ability to generate responses in organized, predictable formats like JSON or XML rather than free-form text.

Structured Pruning

Techniques

Removing entire components like neurons or attention heads rather than individual weights.

Structured Reasoning

Behavior

The ability to follow logical steps and rules systematically to solve problems, often involving breaking down complex tasks into smaller, ordered components.

Student Entropy

Techniques

A measure of uncertainty in the student model's predictions at each token position.

Stylistic Features

Techniques

Measurable linguistic characteristics like word choice, sentence structure, and grammatical patterns that distinguish writing styles.

Stylometric Analysis

Techniques

Computational study of writing style patterns, such as sentence length and word choice, to identify how language use changes over time.

Sub-question Decomposition

Techniques

Breaking down a complex question into simpler sub-questions that can be answered sequentially.

Subagent

Techniques

A specialized, reusable component that handles a specific task within a larger agent system.

Subject State Tokens

Techniques

Learned latent variables that persistently represent the current state and identity of individual agents in a multi-agent scene.

Subjective Evaluation

Techniques

Assessment based on human judgment and personal criteria rather than fixed, objective metrics.

Submodular Optimization

Techniques

Optimizing functions with a diminishing-returns property, where adding an item to a larger set helps no more than adding it to a smaller one, enabling efficient greedy algorithms.
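
Set coverage is a classic example of diminishing returns: each new set adds fewer uncovered items as more is already covered. The sketch below applies a greedy rule to a made-up coverage problem; the function name and data are illustrative assumptions.

```python
def greedy_max_coverage(sets: dict[str, set[int]], budget: int) -> tuple[list[str], set[int]]:
    """Pick up to `budget` sets, each time taking the one that covers the most new items."""
    chosen: list[str] = []
    covered: set[int] = set()
    for _ in range(budget):
        remaining = [name for name in sets if name not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda name: len(sets[name] - covered))
        if not sets[best] - covered:
            break  # no remaining set adds anything new
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


candidates = {"a": {1, 2, 3, 4}, "b": {3, 4, 5}, "c": {5, 6}, "d": {1}}
picked, covered = greedy_max_coverage(candidates, budget=2)
print(picked, covered)
```

After "a" is chosen, set "b" is worth only one new item even though it contains three, so the greedy rule prefers "c".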

Subspace

Techniques

A lower-dimensional representation of data that captures the most important directions or patterns.

Subspace Similarity

Techniques

Measuring how closely related two lower-dimensional feature spaces are to each other.

Subword Segmentation

Techniques

Breaking words into smaller pieces (tokens) for a language model to process, critical for handling rare words.

Successor Features

Techniques

A framework that decomposes value functions into basis functions weighted by task-specific coefficients for rapid transfer learning.

Super-Resolution

Techniques

Increasing the spatial or temporal resolution of an image or video to reveal finer details.

Superoptimization

Techniques

Exhaustive search for the fastest possible implementation of a program within a defined search space.

Superposition

Techniques

A neural network's ability to represent more features than it has dimensions by overlapping them in the same space.

Supervised Fine-tuning

Techniques

Training a model on labeled examples to adapt it for a specific task or domain.

Supervised Fine-Tuning (SFT)

Training

A training technique where a model learns from human-labeled examples to improve its ability to follow instructions and produce desired outputs.

Supervision Construction

Techniques

Automated procedure for generating training examples without manual annotation, including both positive and negative cases.

Supervisor Layer Compromise

Techniques

When AI models manipulate or deceive the oversight mechanisms designed to control them.

Support Vector Machine (SVM)

Techniques

A machine learning algorithm that finds the best boundary to separate data into classes by maximizing the margin between them.

Surface form

Techniques

The specific spelling, name variant, or linguistic representation used to refer to an entity (e.g., 'USA' vs 'United States').

Surface invariance

Techniques

A model's ability to produce consistent outputs regardless of which surface form or name variant is used for the same entity.

Surface Light Field

Techniques

A representation that captures how light reflects off a 3D surface from all viewing angles and lighting conditions.

Surface-Form Templates

Techniques

Specific patterns in text formatting or structure that a model learns to rely on, rather than understanding underlying concepts.

Surprisal

Techniques

A measure of how unexpected a word is based on context, used to predict reading difficulty.
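
Surprisal is commonly computed as the negative logarithm of a word's probability in context. The sketch below uses base 2, so the result is in bits; the probabilities are made up for illustration.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Surprisal in bits of an event with the given probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


# A likely next word carries little surprisal; an unlikely one carries more.
print(surprisal_bits(0.5))    # 1 bit
print(surprisal_bits(0.125))  # 3 bits
```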

Surrogate Function

Techniques

A simpler approximation of a complex function used to make computation or analysis more tractable.

Surrogate Model

Techniques

A fast neural network trained to replace a slow physics simulation or complex model.

Surrogate Objective

Techniques

A simplified objective function used to approximate the true objective and guide search more efficiently.

Survival Analysis

Techniques

Statistical methods for analyzing time until an event occurs, accounting for incomplete observations.

Sustained Coherence

Performance

The ability to maintain logical consistency and context awareness across multiple steps or a long sequence of reasoning.

Sustained Reasoning

Behavior

The ability to work through complex, multi-step problems by maintaining focus and logic across many reasoning steps.

Swarm control

Techniques

Techniques for coordinating and steering large groups of agents or robots as a collective.

Swin Transformer

Techniques

A transformer architecture that uses shifted windows to efficiently capture both local and global context in images.

Sycophancy

Techniques

When a model agrees with a user's false or unsupported claims to please them rather than providing accurate information.

Symbolic Reasoning

Techniques

Using mathematical logic and algebraic rules to reason about program behavior without executing concrete code.

Symbolic Regression

Techniques

A technique to discover mathematical equations in human-readable form from data.

Symmetric Binary Perceptron (SBP)

Techniques

A simplified neural network model used to study learning and computational complexity in constraint satisfaction problems.

Synchronous context-free grammar

Techniques

A formal grammar that defines pairs of related strings simultaneously, used to model translation between two languages.

Syntactic Ambiguity

Techniques

Sentences with multiple possible grammatical interpretations that require cognitive effort to resolve.

Syntactic Complexity

Techniques

The difficulty of parsing a sentence based on its grammatical structure and ambiguities.

Syntactic Correctness

Evaluation

The property of code that follows the grammatical rules of a programming language, so it can be parsed and executed without syntax errors.

Syntax Awareness

Behavior

A model's understanding of programming language rules and structure, allowing it to produce grammatically correct code.

Synthetic Aperture Radar (SAR)

Techniques

A satellite imaging technique that uses radar signals to create detailed maps regardless of weather or daylight, useful for monitoring infrastructure.

Synthetic Data

Training

Artificially generated training data created by humans or other models, rather than collected from real-world sources like the internet.

Synthetic User Testing

Techniques

Using AI agents to simulate realistic user behavior at scale to find bugs and edge cases automatically.

System Prompt

Techniques

Hidden instructions given to an AI model that define its behavior, tone, and constraints.

System Prompt Adherence

Behavior

The model's ability to consistently follow and respect the instructions given in a system prompt that defines its behavior and constraints.

Systematic Generalization

Techniques

A model's ability to solve problems in fundamentally new situations beyond its training distribution.

T

T5 Architecture

Architecture

A transformer-based model design that treats all NLP tasks as text-to-text problems, using an encoder-decoder structure to process and generate text.

T5 Base

Architecture

A smaller, foundational version of the T5 model architecture designed for text-to-text tasks with fewer parameters than larger variants.

Table Text QA

Techniques

Answering questions by finding information across both tables and text documents.

Tabular Data

Techniques

Structured data organized in rows and columns, like spreadsheets or databases.

Tabular Foundation Models

Techniques

Pre-trained models designed to work with structured tabular data, capable of handling various tasks without task-specific retraining.

Tactile Perception

Techniques

Sensing and interpreting physical contact, pressure, and force information through touch sensors.

Tail Risk

Techniques

The probability of rare, extreme events in the output distribution of a model.

Tangent Bundle

Techniques

The geometric structure describing all possible directions of motion at each point on a manifold.

Target Distribution

Techniques

A desired probability distribution that a model is trained to match, typically derived from reward signals.

Task Accuracy

Techniques

The percentage of correct answers a model produces on a benchmark, measured by standard evaluation metrics.

Task Allocation

Techniques

The process of deciding which tasks are assigned to humans versus AI systems in a workflow.

Task Decomposition

Techniques

Breaking a complex problem into smaller, simpler subtasks to solve sequentially.

Task Mixing Strategies

Techniques

Methods for combining multiple training tasks to improve model generalization and performance across different reasoning domains.

Task Overlap

Techniques

When multiple learning tasks share similar data distributions or require overlapping knowledge.

Task Planning

Behavior

The ability of a model to break down high-level instructions into a sequence of actionable steps that a robot can execute.

Task Reward Model

Techniques

A reward signal that guides reinforcement learning based on task-specific performance metrics rather than general output patterns.

Task Routing

Techniques

Directing different training samples to specialized models or objectives based on their characteristics.

Task Specialization

Training

When a model is optimized for specific types of problems (like math and science) at the expense of general-purpose versatility.

Task Specification

Techniques

A formal description of a goal, constraints, and success criteria that an agent must achieve.

Task taxonomy

Techniques

A hierarchical structure that organizes different categories or types of a problem into levels.

Task Vector

Techniques

The difference between the weights of a fine-tuned model and those of its base model, capturing task-specific changes.
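
A minimal sketch of the arithmetic, assuming each model is represented as a map from parameter names to flat lists of weights. The tiny "models" and the scaling parameter are illustrative assumptions.

```python
Weights = dict[str, list[float]]


def task_vector(finetuned: Weights, base: Weights) -> Weights:
    """Element-wise difference: fine-tuned weights minus base weights."""
    return {name: [f - b for f, b in zip(finetuned[name], base[name])] for name in base}


def apply_task_vector(base: Weights, vector: Weights, scale: float = 1.0) -> Weights:
    """Add a (scaled) task vector back onto a set of base weights."""
    return {name: [b + scale * d for b, d in zip(base[name], vector[name])] for name in base}


base = {"layer.w": [1.0, 2.0]}
finetuned = {"layer.w": [1.5, 1.0]}
vec = task_vector(finetuned, base)
print(vec)
print(apply_task_vector(base, vec, scale=1.0))
```

Adding the vector with a scale of 1 recovers the fine-tuned weights, and a scale of 0 leaves the base model unchanged.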

Task Weighting

Techniques

Assigning different importance levels to multiple tasks during training.

Task-Adaptive

Techniques

The ability to adjust a model's behavior for different purposes (like retrieval, clustering, or classification) without retraining, often through lightweight adapters.

Task-Agnostic

Training

A model that works across different types of visual tasks without requiring separate training for each specific task.

Task-Aware

Techniques

Designed with knowledge of the specific downstream task or application that will use the output.

Task-Aware Representations

Techniques

Embeddings that adjust their meaning based on the specific task or query provided, rather than producing the same vector for every use case.

Task-Conditioned

Behavior

A model that adjusts its behavior based on the specific task or instruction provided, rather than producing the same output for a given input regardless of the task.

Task-level Shaping

Techniques

Adjusting training signals at the task level to encourage specific model behaviors, like longer reasoning chains for complex questions.

Task-Oriented Instructions

Behavior

Specific requests asking a model to complete a defined goal, like summarizing text or writing code, rather than having a casual conversation.

Task-Oriented Model

Behavior

An AI model optimized to excel at a specific, narrow task rather than performing well across many different types of requests.

Task-Oriented Optimization

Training

Training a model to prioritize completing specific, practical tasks efficiently rather than engaging in open-ended conversation.

Task-Specific Embeddings

Techniques

Embeddings customized for a particular use case, such as sentiment analysis or document retrieval, rather than general-purpose embeddings.

Task-Specific Model

Training

A model trained and optimized to excel at one particular task (like evaluation) rather than performing well across many different tasks.

Task-Specific Optimization

Training

Training or fine-tuning a model to excel at a particular task, like translation, rather than trying to perform equally well across many different tasks.

Taxonomy

Evaluation

A structured system of categories used to organize and classify items, such as different types of harmful content in safety evaluation.

Teacher Consistency

Techniques

Using the same teacher model for both supervised fine-tuning and distillation to avoid gradient bias.

Teacher Forcing

Techniques

A training technique in which the model learns to predict the next token given the ground-truth previous tokens rather than its own earlier predictions.

Teacher Model

Training

A large, highly capable model used to train smaller models by transferring its knowledge and skills through a process called distillation.

Teacher-Student Divergence

Techniques

The disagreement between teacher and student model predictions, indicating where the student deviates from the teacher.

Technical Reasoning

Behavior

The capacity to work through complex logical problems, debug issues, and apply domain-specific knowledge systematically.

Teleoperation

Techniques

Remote control of a robot or machine by a human operator, typically through a joystick or similar interface.

Temperature Sampling

Techniques

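Controlling randomness in a model's output by dividing its raw scores (logits) by a temperature value before sampling: higher values flatten the probabilities and make outputs more varied, while lower values make them more predictable.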
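
One common way to apply temperature is to divide the model's raw scores (logits) by the temperature before turning them into probabilities. The sketch below illustrates that idea under simple assumptions; the scores and function names are hypothetical and not taken from any particular model or library.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng=random):
    """Draw one token index from the temperature-adjusted distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.5)  # sharper: the top token dominates
hot = softmax_with_temperature(logits, 2.0)   # flatter: more variety when sampling
```

With these example scores, the top token receives a larger share of the probability at temperature 0.5 than at 2.0, which is why low temperatures feel more deterministic.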

Temporal Coherence

Techniques

The consistency and smoothness of motion and appearance across video frames over time.

Temporal Consistency

Techniques

Ensuring predictions remain stable and coherent across consecutive time steps.

Temporal Context

Behavior

Understanding how events and changes unfold over time, allowing a model to grasp sequences and predict what happens next in a video or time-series data.

Temporal Credit Assignment

Techniques

Determining which past actions or decisions are responsible for current outcomes in sequential decision-making.

Temporal Dependencies

Techniques

Relationships between events or measurements across time in sequential data.

Temporal Generalization

Techniques

A model's ability to make accurate predictions on new data that arrives later in time, even when patterns have shifted.

Temporal Grounding

Techniques

Anchoring events to precise timestamps or relative time positions in a sequence.

Temporal Information

Techniques

Information about how things change over time, critical for understanding dynamic processes like facial expressions.

Temporal Reasoning

Behavior

The ability to understand and reason about events, sequences, and relationships that occur across time.

Temporal Redundancy

Techniques

Repeated or similar information across consecutive frames in a video that can be safely removed.

Temporal Representation

Techniques

How a model encodes and understands time information in sequences, critical for predicting future states from past observations.

Temporal RoPE Adjustment

Techniques

A technique that re-aligns positional encodings when tokens are dropped, maintaining coherent temporal ordering.

Temporal Splits

Techniques

Dividing data by time so training uses older examples and testing uses newer ones, preventing data leakage.
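
A minimal sketch of a temporal split, assuming each record carries a `date` field; the records and cutoff date are made up for illustration.

```python
from datetime import date

def temporal_split(records, cutoff):
    """Put records dated before the cutoff in train and the rest in test."""
    train = [r for r in records if r["date"] < cutoff]
    test = [r for r in records if r["date"] >= cutoff]
    return train, test

# Hypothetical records; only the "date" field matters for the split.
records = [
    {"id": 1, "date": date(2023, 1, 5)},
    {"id": 2, "date": date(2023, 6, 1)},
    {"id": 3, "date": date(2024, 2, 10)},
]
train, test = temporal_split(records, cutoff=date(2024, 1, 1))
```

Because every training example is older than every test example, the model is never evaluated on data from a period it has already seen.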

Temporal Super-Resolution

Techniques

Converting low-frame-rate, blurry videos into high-frame-rate sequences with fine-grained temporal details.

Temporal Synchronization

Techniques

Aligning events in music and video so they happen at the same time.

Temporal Tree Search

Techniques

An algorithm that constructs evolution chains by tracing how methods progress and branch over time.

Temporal Understanding

Behavior

The ability to comprehend how things change over time, such as recognizing motion and actions across multiple video frames rather than just single images.

Temporal-Difference (TD) Learning

Techniques

An RL method that updates value estimates using the difference between the current prediction and a target built from the observed reward plus the discounted estimate for the next state, combining Monte Carlo and dynamic programming ideas.
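
The simplest variant, usually called TD(0), can be sketched as a one-line update to a table of state values. The states, reward, learning rate `alpha`, and discount `gamma` below are illustrative assumptions.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9, terminal=False):
    """Apply one TD(0) update to the value table in place and return the TD error."""
    # A terminal next state contributes no future value.
    bootstrap = 0.0 if terminal else values[next_state]
    td_target = reward + gamma * bootstrap  # observed reward plus discounted estimate
    td_error = td_target - values[state]    # how far off the current prediction was
    values[state] += alpha * td_error       # move the estimate a small step toward the target
    return td_error

values = {"A": 0.0, "B": 1.0}  # hypothetical value estimates for two states
err = td0_update(values, "A", reward=0.5, next_state="B")
```

Here the target is 0.5 + 0.9 × 1.0 = 1.4, so the estimate for state A moves from 0.0 to 0.14.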

Tensor Cores

Techniques

Specialized hardware units on GPUs designed to quickly perform matrix multiplication operations used in neural networks.

Tensor Decomposition

Techniques

Breaking down high-dimensional data into products of lower-rank tensors to reduce parameters and improve interpretability.

Tensor Parallelism

Techniques

Splitting a model's computation across multiple GPUs by dividing tensors into chunks processed in parallel.

Tensor Program

Techniques

A computational program that performs operations on multi-dimensional arrays (tensors), commonly used in neural networks.

Tensor Program Optimization

Techniques

Automatically finding faster implementations of tensor computations used in neural networks.

Tensor-Parallel Coordination

Techniques

Lightweight synchronization mechanism ensuring consistency when model weights are split across multiple GPUs.

Term Expansion

Techniques

A technique that adds related or contextually relevant terms to a document's representation to improve its discoverability in search systems.

Term Frequency-Inverse Document Frequency (TF-IDF)

Techniques

A scoring technique that ranks words by how often they appear in a document versus how common they are across all documents, giving rare words higher weight.
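
A minimal sketch of the scoring, using the plain textbook form (term frequency times the log of inverse document frequency). Libraries typically use smoothed variants, so their numbers will differ; the tiny corpus is made up, and documents are assumed to be non-empty lists of words.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per tokenized, non-empty document."""
    n_docs = len(docs)
    # Number of documents each term appears in.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

docs = [
    ["the", "cat", "sat"],
    ["the", "dog", "barked"],
    ["the", "cat", "purred"],
]
scores = tf_idf(docs)
```

In this corpus "the" appears in every document and scores zero, while "sat" appears in only one document and scores highest in the first.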

Terminal-state Prediction

Techniques

Predicting the final outcome of a physical process directly from initial conditions without simulating intermediate steps.

Ternary Quantization

Training

A compression technique that reduces model weights to just three possible values (-1, 0, or 1) instead of storing full decimal numbers, dramatically reducing memory and computation requirements.
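
One simple way to ternarize weights is to zero out small values and keep only the sign of the rest, together with a single scale factor. The sketch below assumes that threshold-based scheme; the `threshold_ratio` value and the weights are illustrative, and real models may use a different recipe.

```python
def ternarize(weights, threshold_ratio=0.7):
    """Map each weight to -1, 0, or 1 and return a shared scale factor."""
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    threshold = threshold_ratio * mean_abs  # weights at or below this become 0
    ternary = [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]
    # The scale is the average magnitude of the weights that were kept.
    kept = [abs(w) for w, t in zip(weights, ternary) if t != 0]
    scale = sum(kept) / len(kept) if kept else 0.0
    return ternary, scale

weights = [0.9, -0.05, -0.8, 0.1]  # hypothetical full-precision weights
ternary, scale = ternarize(weights)
approx = [t * scale for t in ternary]  # rough reconstruction of the originals
```

Only the three-valued list and one scale number need to be stored, which is where the memory savings come from.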

Test Time Optimization

Techniques

Improving model performance on specific inputs by adjusting it during prediction.

Test-Scale Model

Deployment

A deliberately small and simplified version of a model designed for testing code and pipelines rather than for production use.

Test-time Adaptation

Techniques

Improving model performance on new data at inference time without retraining on labeled examples.

Test-Time Compute

Techniques

Additional computation performed during inference to improve model outputs, such as running multiple solution attempts.

Test-Time Scaling

Techniques

Improving model accuracy at inference by using extra computation or verification steps without retraining.

Test-Time Training (TTT)

Techniques

Updating model parameters during inference to adapt to new data without retraining.

Text Classification

Behavior

A machine learning task where a model reads text and assigns it to predefined categories, such as 'safe' or 'unsafe'.

Text Clustering

Techniques

A technique that groups similar texts together automatically by using embeddings to measure similarity, without requiring predefined categories.

Text Completion

Behavior

A task where the model predicts and generates the next words or sentences based on a given prompt or partial text.

Text Conditioning

Techniques

A technique where text descriptions guide or control how a generative model produces images, allowing users to influence the output through language.

Text Continuation

Behavior

The task of generating the next words or sentences based on a given prompt or partial text.

Text Corruption

Training

A training technique where parts of input text are randomly deleted, masked, or shuffled to teach the model to understand context and recover meaning.

Text Embedding

Techniques

A technique that converts text into numerical vectors that capture semantic meaning, allowing the model to understand and compare text similarity.

Text Embedding Model

Architecture

A neural network that converts text into numerical vectors that capture semantic meaning, allowing computers to understand and compare text similarity.

Text Embeddings

Architecture

Numerical representations of text that capture its meaning, allowing computers to compare how similar different pieces of text are to each other.

Text Encoder

Architecture

A model component that converts raw text input into numerical vector representations that capture semantic meaning.

Text Generation

Behavior

The process of an AI model creating new text one word or token at a time based on patterns it learned during training.

Text Language Model

Architecture

An AI model trained to understand and generate human language by predicting sequences of words or tokens.

Text Modality

Architecture

The type of data a model can process or generate — in this case, text-only input and output without images, audio, or other formats.

Text Model

Architecture

A language model that processes and generates only text, without support for images, audio, or other media types.

Text Reasoning

Behavior

A model's capability to analyze, interpret, and draw logical conclusions from textual information.

Text Representation

Architecture

The process of converting text into a numerical format that a machine learning model can understand and process.

Text-Based Interaction

Formats

A mode of communication where the model receives and produces only text inputs and outputs, without direct support for images, audio, or other media formats.

Text-Based Model

Architecture

An AI model that processes and generates only text input and output, without support for images, audio, or other media types.

Text-Based Tasks

Behavior

AI operations that work exclusively with written language input and output, such as answering questions, summarizing, or writing content.

Text-Focused Model

Architecture

A model designed specifically to process and generate text, without support for images, audio, or other data types.

Text-Focused Model

Architecture

A language model designed to work exclusively with text input and output, without support for images, audio, or other modalities.

Text-In, Text-Out

Architecture

A model that accepts text as input and produces text as output, without support for images, audio, or other data types.

Text-Only Input

Architecture

A model that accepts only written text as input, without support for images, audio, or other data types.

Text-Only Interface

Deployment

A model that accepts and produces only text inputs and outputs, without support for images, audio, or other media types.

Text-Only Model

Architecture

A language model that processes and generates only text, without support for images, audio, or other data types.

Text-Only Model

Architecture

A model that processes and produces only text input and output, without support for images, audio, or other data types.

Text-to-3D Generation

Techniques

Creating 3D models from natural language descriptions using AI models.

Text-to-Audio-Video Generation

Techniques

Creating synchronized audio and video content from text descriptions or prompts.

Text-to-Code Generation

Behavior

The ability to convert natural language descriptions into executable code automatically.

Text-to-Embedding

Architecture

A process that converts text into numerical vectors (embeddings) that capture semantic meaning in a format models can work with.

Text-to-Image (T2I) Models

Techniques

AI models that generate images from text descriptions or prompts.

Text-to-Image Generation

Behavior

An AI model that creates images from written text descriptions or prompts.

Text-to-Speech (TTS)

Behavior

A technology that converts written text into spoken audio that sounds natural and human-like.

Text-to-SQL

Behavior

A task where a model converts natural language questions into executable SQL database queries.

Text-to-Text

Techniques

A framework where all NLP tasks are treated as converting input text into output text, so translation, summarization, and classification use the same model structure.

Text-to-Text Generation

Behavior

A model task where the input and output are both text, with the model learning to transform one text format into another.

Text-to-Text Model

Architecture

A machine learning model that takes text as input and produces text as output, useful for tasks like translation, summarization, or question answering.

Text-to-Text Transfer Learning

Training

A training approach where all NLP tasks are framed as converting input text to output text, allowing a single model to handle translation, summarization, classification, and other tasks.

Text-to-Video Generation

Techniques

Creating video sequences from text descriptions using neural networks.

Textbook-Quality Data

Training

High-quality, carefully curated training data structured like educational textbooks rather than raw internet text, designed to teach clear concepts and reasoning.

Textual Priors

Techniques

Background knowledge and patterns learned from text that models rely on, sometimes at the expense of visual information.

TF-IDF (Term Frequency-Inverse Document Frequency)

Techniques

A numerical method that converts text into feature vectors by measuring how important each word is in a document relative to a corpus.

The Pile

Training

A large, diverse open-source dataset of English text assembled by EleutherAI from many sources, widely used to train language models.

Thematic Representation

Techniques

A structured summary of document content organized by topic or theme, often created through clustering.

Theorem Proving

Techniques

Using AI to automatically verify or discover mathematical proofs and logical statements.

Theory of Mind

Techniques

The ability to infer and reason about other people's beliefs, desires, and intentions.

Thinking Effort

Performance

A configurable parameter that controls how much computational time and internal deliberation a model dedicates to solving a problem before responding.

Thinking Mode

Techniques

A model operating mode where it explicitly works through problems step-by-step before generating a final answer, improving accuracy on complex tasks.

Thinking Model

Behavior

A language model trained to generate explicit reasoning steps and internal deliberation before producing a final response, rather than answering immediately.

Thinking Pattern Alignment

Techniques

Ensuring student and teacher models generate outputs using compatible reasoning approaches.

Thinking Patterns

Techniques

The characteristic way a model generates reasoning steps and intermediate outputs.

Threat Analysis

Behavior

The process of identifying, evaluating, and reasoning about potential security risks, vulnerabilities, and attack methods in systems or networks.

Threshold Tuning

Techniques

Adjusting the decision boundary for binary classification to optimize performance metrics like F1 score.
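
A minimal sketch of threshold tuning: try each observed score as the cutoff and keep the one with the best F1. The scores and labels are made up, and in practice the search would normally be run on a held-out validation set.

```python
def f1_at_threshold(scores, labels, threshold):
    """Compute F1 when every score at or above the threshold counts as positive."""
    preds = [score >= threshold for score in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if not p and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_threshold(scores, labels):
    """Return the candidate threshold with the highest F1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at_threshold(scores, labels, t))

scores = [0.1, 0.4, 0.35, 0.8]  # hypothetical model confidence scores
labels = [0, 0, 1, 1]           # 1 = positive class
threshold = best_threshold(scores, labels)
```

On this toy data the default cutoff of 0.5 would miss one positive example, while a lower cutoff of 0.35 catches both at the cost of one false alarm.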

Throughput

Performance

The number of tokens a model can generate per second, measuring its processing speed.

Tidal Volume

Techniques

The amount of air breathed in or out during a single normal breath at rest.

Timbre Transfer

Techniques

Changing the tonal quality or color of a sound (for example, making a violin melody sound like a flute) while preserving its pitch and timing.

Time Series Analysis

Techniques

Analyzing data points collected over time to find patterns and make predictions.

Time-Dependent Confounding

Techniques

Bias that occurs when past treatments affect future confounders, making it hard to isolate treatment effects in sequential decisions.

Time-Series Classification

Techniques

The task of assigning a label or category to a sequence of data points ordered by time.

Time-Series Forecasting

Behavior

The task of predicting future values in a sequence of data points ordered by time, such as stock prices or weather patterns.

Time-series Reasoning

Techniques

The ability to understand and make predictions based on data points ordered over time, like stock prices.

Token

Architecture

A small unit of text (a word, subword, or punctuation mark) that a language model breaks input into for processing.

Token Activation

Architecture

The process of selectively activating only certain parts of a model for each individual token processed, rather than using the entire network every time.

Token Allocation

Techniques

Deciding how many tokens (words/subwords) a model should generate for a given problem.

Token Budget

Techniques

The maximum number of tokens available to include retrieved context in a language model prompt.
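
A minimal sketch of fitting retrieved text into a token budget, assuming the chunks arrive already ranked by relevance. Counting whitespace-separated words is only a stand-in for a real tokenizer, and the chunk texts are placeholders.

```python
def pack_context(chunks, budget, count_tokens=lambda text: len(text.split())):
    """Greedily keep ranked chunks that still fit within the token budget."""
    selected, used = [], 0
    for chunk in chunks:
        cost = count_tokens(chunk)
        if used + cost > budget:
            continue  # skip chunks that do not fit; a later, shorter one still might
        selected.append(chunk)
        used += cost
    return selected, used

chunks = ["alpha beta gamma", "delta epsilon zeta eta", "theta"]  # placeholder text
selected, used = pack_context(chunks, budget=4)
```

Skipping rather than stopping at the first oversized chunk is one design choice among several; another is to stop immediately so that only the top-ranked chunks are ever included.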

Token Candidate

Behavior

A predicted next word or subword unit proposed by a draft model for the target model to accept or reject during speculative decoding.

Token Compression

Techniques

Reducing the number of tokens stored or processed by removing redundant or less important ones.

Token Consumption

Performance

The number of text units (tokens) a model processes or generates; longer reasoning processes consume more tokens and may increase latency or cost.

Token Cost

Performance

The computational expense and resource usage required to process or generate tokens, which increases when a model performs additional reasoning steps.

Token Count

Performance

The number of small text chunks (tokens) a model generates; higher token counts mean longer responses and more computational cost.

Token Credit Assignment

Techniques

Determining how much each token in a response should be rewarded or penalized based on overall performance.

Token Distribution

Techniques

The probability distribution over possible next tokens that a language model produces during decoding.

Token Efficiency

Performance

A measure of how many tokens (small units of text) a model needs to use to complete a task; more efficient models use fewer tokens and cost less.

Token Embeddings

Architecture

Numerical representations of individual words or subwords that capture their meaning and relationships in a way machines can process.

Token Entropy

Techniques

A measure of uncertainty in a model's predictions for individual tokens based on probability distributions.
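
A minimal sketch using Shannon entropy, one common way to measure this uncertainty. The probability lists are hypothetical next-token distributions.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    # Terms with zero probability contribute nothing to the sum.
    return -sum(p * math.log(p) for p in probs if p > 0)

confident = token_entropy([1.0, 0.0, 0.0, 0.0])      # model is certain of one token
uncertain = token_entropy([0.25, 0.25, 0.25, 0.25])  # model cannot choose among four
```

Entropy is zero when the model puts all probability on one token and reaches its maximum, log(4) for four options, when every token is equally likely.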

Token Importance

Techniques

A score measuring how much each word or subword unit contributes to a model's prediction.

Token Limit

Architecture

The maximum number of tokens (words or subwords) a model can process in a single input, for example 8K tokens per chunk.

Token Masking

Training

A training technique where random words in text are hidden and the model learns to predict them, commonly used in models like BERT.
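
A simplified sketch of random masking. BERT's actual recipe also sometimes swaps in a random token or leaves the chosen token unchanged; this version only replaces tokens with a mask symbol, and the function name and sentence are illustrative.

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=None):
    """Hide a random subset of tokens and record the originals as prediction targets."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets[i] = tok  # the model is trained to recover this token
        else:
            masked.append(tok)
    return masked, targets

tokens = "the quick brown fox jumps over the lazy dog".split()
masked, targets = mask_tokens(tokens, mask_prob=0.3, seed=0)
```

During training, the model sees `masked` as input and is scored only on how well it predicts the entries in `targets`.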

Token Merging

Techniques

Combining multiple tokens into fewer tokens to reduce computation while preserving model output quality.

Token Mixing

Techniques

A method for aggregating information across input tokens to create contextual representations.

Token Output Limit

Architecture

The maximum number of tokens (words or word pieces) a model can generate in a single response, controlling the length of its output.

Token Positions

Architecture

The spatial coordinates or locations of text elements within a document, used to understand where words and phrases appear on the page.

Token Prediction

Behavior

The core task of predicting what word or subword (token) should come next in a sequence based on previous text.

Token Pricing

Deployment

The cost charged per token (unit of text) processed by a model, which varies based on model capability and complexity.

Token Pruning

Techniques

Removing less important tokens from a model's input or intermediate processing to improve speed and efficiency.

Token Ranking

Techniques

Reordering a model's next-token predictions by likelihood or quality rather than accepting the top-1 choice.

Token Reduction

Techniques

Technique to decrease the number of tokens processed by a model, typically by compressing or filtering visual information.

Token Representation

Architecture

A vector that encodes the meaning and context of a single word or subword unit (token) within a larger piece of text.

Token Representations

Architecture

Numerical vectors that encode the meaning of individual words or subword units within a text.

Token Routing

Techniques

Directing or redistributing information from one modality's tokens to another based on information quality or relevance.

Token Selection

Techniques

Choosing which token positions to train on based on their importance or learning value.

Token Sequence

Behavior

A series of individual tokens (words or subwords) that the model generates one after another to form a complete response.

Token Sparsification

Techniques

Reducing the number of tokens processed by a model to lower computational cost.

Token Throughput

Techniques

The number of tokens a model can generate per unit of time during inference.

Token Usage

Performance

The number of tokens (small units of text) consumed during model inference; higher token usage means more computational cost and longer response times.

Token Vocabulary

Architecture

The complete set of individual text units (tokens) that a model can recognize and process; a larger vocabulary allows the model to handle more diverse languages and specialized terms.

Token Weighting

Techniques

Assigning importance scores to individual words or subwords in text, allowing the model to emphasize semantically significant terms in its representation.

Token-Level Divergence

Techniques

A measure of difference between probability distributions over predicted tokens, used to align model outputs.

Token-Level Embeddings

Architecture

Embeddings that represent individual tokens (words or subwords) rather than entire documents, allowing fine-grained matching during search.

Token-Level Privacy

Techniques

Applying different levels of privacy protection to individual tokens based on their sensitivity and importance.

Token-Level Reward

Techniques

Assigning reward signals to individual tokens in a sequence to guide model training.

Tokenization

Architecture

The process of breaking text into smaller units (like words, subwords, or characters) that a model can understand and process.

Tokenizer

Architecture

The component that splits text into tokens (subwords or characters) that the model can process.

Tokens

Architecture

The basic units of text that a language model processes, typically representing words or word fragments.

Tone Sensitivity

Techniques

Measure of how much a model's output quality changes in response to different politeness levels in input prompts.

Tool Invocation

Techniques

An agent's ability to call external functions or APIs to gather information or perform actions.

Tool Schema

Formats

A structured definition that describes what a tool does, what inputs it accepts, and what outputs it produces.

Tool Use

Techniques

The ability of a model to call external functions or APIs to perform tasks like calculations, searches, or data retrieval.

Tool-Augmented Generation

Techniques

Extending LLM outputs by integrating external tools, APIs, or functions the model can call to solve problems.

Tool-calling

Techniques

When an AI model decides to use external functions or tools (like database queries) to help answer questions or complete tasks.

Tool-Use Loop

Techniques

An iterative process where an agent repeatedly calls external tools (like search) and updates its reasoning based on results.

Topic Modeling

Techniques

Automatically discovering abstract topics or themes that appear across a collection of documents.

Topological Constraint

Techniques

A requirement that a segmented structure maintains correct connectivity and shape properties, not just pixel-level accuracy.

Topological Data Analysis

Techniques

Using mathematical topology to extract shape and structure features from data for analysis and classification.

Topology-Invariant Encoding

Techniques

A representation method that works regardless of how input channels are physically arranged or which channels are present.

Total Variation Distance

Techniques

A metric measuring the largest difference in probability that two distributions assign to the same event, ranging from 0 (identical) to 1 (no overlap).
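
For distributions over a finite set of outcomes, this distance equals half the sum of the absolute differences in probability. A minimal sketch, with made-up distributions stored as dictionaries:

```python
def total_variation_distance(p, q):
    """Half the L1 distance between two discrete distributions given as dicts."""
    outcomes = set(p) | set(q)  # an outcome missing from one dict has probability 0
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in outcomes)

p = {"a": 0.5, "b": 0.5}
q = {"a": 0.9, "b": 0.1}
distance = total_variation_distance(p, q)
```

Here the two distributions differ by 0.4 on each outcome, so the distance is 0.4; distributions with no outcomes in common would give 1.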

Trace Distance

Techniques

A metric measuring the distinguishability between two quantum states, ranging from 0 (identical) to 1 (orthogonal).

Train-Inference Mismatch

Techniques

When a model is trained using one objective but deployed using a different process, causing performance gaps between training and real-world use.

Trainable Depth

Techniques

The number of layers in a neural network that are allowed to update during training.

Training Checkpoint

Training

A saved snapshot of the model's learned weights at a specific point during training, allowing you to see how the model improved over time.

Training Checkpoints

Training

Saved snapshots of a model at different points during training, allowing researchers to observe how the model's abilities change as it learns.

Training Cutoff

Training

The date up to which a model has seen training data; the model has no knowledge of events or information after this date.

Training Data

Training

The examples and information used to teach a model how to perform a task, for example human-written and AI-generated grammatical corrections.

Training Data Curation

Training

The process of carefully selecting, filtering, and organizing training data to improve a model's performance on specific tasks rather than relying solely on larger datasets.

Training Data Cutoff

Training

The date after which information is not included in a model's training data, meaning the model cannot know about events or facts that occurred after that date.

Training Data Transparency

Training

The practice of publicly disclosing what data was used to train a model, enabling researchers to audit and understand potential biases or limitations.

Training Distribution

Training

The range of topics, styles, and types of text a model was trained on; the model performs best on content similar to this distribution and may struggle outside it.

Training Dynamics

Training

The patterns and behaviors that emerge during a model's training process, such as how loss decreases or how capabilities develop over time.

Training Efficiency

Training

The ability to achieve strong model performance while using less computational resources, data, or time during the training process.

Training Epochs

Training

The number of times a model sees the entire training dataset during learning; more epochs can improve performance but may also lead to overfitting if the dataset is small.

Training Pipeline

Training

The complete set of steps, data, and code used to train a model, made transparent so others can reproduce or audit the process.

Training Reward Saturation

Techniques

The point during training where reward signals stop improving, indicating the model may be memorizing rather than generalizing.

Trajectory

Techniques

A sequence of interactions or steps taken by a model during deployment or in an environment.

Trajectory Abstraction

Techniques

Representing a sequence of actions at a higher level of abstraction, like a strategy, rather than individual steps.

Trajectory Forecasting

Techniques

Predicting the future path or location of a person or object over time.

Trajectory Generation

Techniques

Computing a planned path or sequence of movements for an autonomous agent to follow.

Trajectory Guidance

Techniques

Controlling video generation by specifying desired motion paths or object movements frame-by-frame.

Trajectory Synthesis

Techniques

Generating sequences of actions (trajectories) that an agent takes to solve a task, used for training via imitation learning.

Trajectory Warping

Techniques

Adapting recorded action sequences to new situations by adjusting them based on matching visual keypoints between scenes.

Trajectory-Aware Grading

Techniques

Evaluating agent performance by examining the complete sequence of actions taken, not just final outputs.

Transducer

Techniques

A model that converts input sequences into output sequences with aligned timing.

Transfer Learning

Techniques

Using knowledge from one task to improve learning on a different related task.

Transformer

Architecture

The dominant neural network architecture for language models, using self-attention to process sequences.

Transformer Alternative

Architecture

A neural network architecture designed as a different approach to the standard transformer model, often with different trade-offs in speed, memory, or capability.

Transformer Architecture

Architecture

A neural network design that processes text by analyzing relationships between all words simultaneously, forming the foundation of modern large language models.

Transformer Attention

Architecture

A mechanism that allows a model to focus on relevant parts of the input by computing relationships between all pairs of tokens, enabling deep understanding but requiring significant memory.

Transformer Backbone

Architecture

The core neural network architecture based on attention mechanisms that traditionally powers most large language models.

Transformer Encoder

Techniques

A neural network component that processes input sequences using attention mechanisms.

Transformer Layers

Architecture

Stacked blocks of neural network computations that process and transform input text progressively, with more layers generally allowing the model to learn more complex patterns.

Transformer Models

Techniques

A family of neural network models, such as BERT and RoBERTa, widely used for language tasks.

Transformer-based Models

Techniques

Neural networks using attention mechanisms to process and understand relationships between words in text.

Transformer-Based Text Generation

Architecture

A method where a transformer neural network generates text one token at a time by learning patterns from training data.

Transitivity Violation

Techniques

When a judge's scores contradict themselves (e.g., ranking A > B, B > C, but C > A), revealing internal inconsistency.
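
A minimal sketch of checking a judge's pairwise preferences for such cycles. The input format, a set of (winner, loser) pairs, and the function name are assumptions made for illustration.

```python
from itertools import permutations

def find_transitivity_violations(prefers):
    """Return each three-item cycle (a beats b, b beats c, c beats a) once."""
    items = sorted({x for pair in prefers for x in pair})
    cycles = set()
    for a, b, c in permutations(items, 3):
        if (a, b) in prefers and (b, c) in prefers and (c, a) in prefers:
            # The same cycle shows up in three rotations; keep one canonical form.
            cycles.add(min([(a, b, c), (b, c, a), (c, a, b)]))
    return sorted(cycles)

inconsistent = {("A", "B"), ("B", "C"), ("C", "A")}
consistent = {("A", "B"), ("B", "C"), ("A", "C")}
violations = find_transitivity_violations(inconsistent)
no_violations = find_transitivity_violations(consistent)
```

This brute-force check looks at every ordered triple, which is fine for a handful of items but would need a smarter method for large comparison sets.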

Transitivity Violations

Techniques

When a judge ranks A > B, B > C, but C > A, revealing logical inconsistency in scoring.

Transport Dynamics

Techniques

The mathematical rules governing how samples move from one distribution to another in a sampling algorithm.

Treatment Effect Analysis

Techniques

Estimating the causal impact of an intervention or change on outcomes in data.

Tree Search

Techniques

An algorithm that explores possible future states by building a tree of actions and outcomes to find promising paths.

Triage

Techniques

Prioritizing and routing queries by urgency or risk level, directing high-risk cases to human experts.

Triangle Inequality

Techniques

A fundamental property stating that the norm of a sum is at most the sum of the norms: ‖x + y‖ ≤ ‖x‖ + ‖y‖.

Trigger

Techniques

A specific input pattern or condition that activates hidden malicious behavior in a backdoored model.

Trigger Modality Attribution (TMA)

Techniques

A metric measuring which input types the backdoor attack actually depends on.

Trillion Parameters

Architecture

A model with one trillion learnable values that the neural network adjusts during training to improve performance on language tasks.

Truncation Collapse

Techniques

A training failure where generated sequences become so long they get cut off, biasing the training data toward incomplete examples.

Trust Region

Techniques

A local region around the current best solution where the surrogate model is trusted to be accurate.

Tsallis q-logarithm

Techniques

A generalized logarithm parameterized by q that interpolates between different loss functions as q varies.

Turn-Taking

Behavior

The ability to detect when one speaker has finished speaking and another can begin, essential for natural conversation flow.

Turn-Taking Detection

Behavior

The ability to identify when a speaker has finished speaking and it is another person's turn to speak in a conversation.

Tweedie's Formula

Techniques

A statistical result that estimates the clean value behind a noisy observation using the gradient of the log-density of the noisy data; commonly used for denoising in diffusion models.
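
In its common Gaussian-noise form, the formula can be written as follows. The symbols are assumptions chosen for illustration: $x$ is the clean value, $y = x + \sigma\varepsilon$ is the noisy observation with $\varepsilon \sim \mathcal{N}(0, I)$, and $p(y)$ is the density of the noisy data.

```latex
\mathbb{E}[x \mid y] = y + \sigma^{2} \, \nabla_{y} \log p(y)
```

The best guess for the clean value is the noisy observation nudged in the direction where noisy data is more likely, scaled by the noise variance.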

Two-Phase Guidance

Techniques

A strategy that applies different conditioning constraints at different stages of the generation process.

Two-Stage Retrieval

Techniques

A retrieval approach using a fast initial retriever to narrow candidates, followed by a more sophisticated re-ranker for final selection.

Two-Tower Architecture

Architecture

A retrieval system design with separate neural networks for encoding queries and documents independently, allowing efficient comparison between them.

Type-I Error

Techniques

A false positive in hypothesis testing—rejecting a true null hypothesis and claiming a difference exists when it doesn't.

Typicality Bias

Techniques

The tendency of generative models to converge on the most common or typical outputs, reducing diversity.

U

U-Net

Techniques

A convolutional neural network architecture with an encoder-decoder structure designed for image segmentation and restoration tasks.

UI Automation

Behavior

The ability to understand and interact with user interfaces by reading screenshots and generating commands to control applications or websites.

UI Interaction

Behavior

The ability of an AI model to understand and control user interface elements like buttons and forms by interpreting visual layouts and executing appropriate actions.

UI Pattern Recognition

Behavior

The model's ability to identify and apply common design patterns and component structures used in user interfaces.

Unanswerable Questions

Techniques

Questions where the correct answer cannot be found in the given context, testing if models admit uncertainty.

Uncased

Formats

A model variant that treats uppercase and lowercase letters as identical, so 'Hello' and 'hello' are processed the same way.

Uncensored

Behavior

A model without built-in safety filters or content restrictions, allowing it to generate responses on any topic without refusal.

Uncensored Model

Behavior

A model trained without safety filters or content restrictions, making it willing to generate responses on sensitive topics that filtered models would refuse.

Uncertainty estimation

Techniques

Quantifying how confident a model is in its predictions, critical for safe deployment in high-stakes applications.

Uncertainty Quantification

Techniques

Measuring how uncertain a model's predictions are, including uncertainty that carries over from noisy or uncertain inputs.

Under-executed Traces

Techniques

When a model stops executing a procedure before completing all required steps, leaving the computation incomplete.

Undersampling

Techniques

Collecting fewer measurements than needed for perfect image reconstruction, used to speed up MRI scans.

UNet

Techniques

A neural network architecture commonly used in image generation that processes images at multiple scales.

Unified Architecture

Architecture

A single model design that handles multiple different tasks without needing separate specialized models for each task.

Unified Interface

Architecture

A single input format that handles multiple different tasks, rather than requiring separate models for each task.

Unified Multimodal Model

Techniques

An AI model trained to both generate and understand multiple types of data like text and images.

Unified Multimodal Models (UMMs)

Techniques

AI models that can process and generate multiple types of data (text, images, etc.) in a single system.

Unilateral deviation

Techniques

A single player changing their strategy while others keep theirs fixed.

Unit Commitment

Techniques

The optimization problem of deciding which electricity generators to turn on/off over time to meet demand while minimizing cost.

Unit of Analysis

Techniques

The linguistic segment (word, morpheme, character) over which a measurement or prediction is evaluated.

Unit Test

Techniques

Automated code that checks whether a specific piece of software works correctly by testing individual functions.
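
A small example using Python's built-in `unittest` module. The function under test, `slugify`, is hypothetical and exists only to give the tests something to check.

```python
import unittest

def slugify(title: str) -> str:
    """Hypothetical function under test: lowercase a title and join words with hyphens."""
    return "-".join(title.lower().split())

class SlugifyTest(unittest.TestCase):
    def test_joins_words_with_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_collapses_extra_whitespace(self):
        self.assertEqual(slugify("  Many   spaces  "), "many-spaces")
```

Saved in a file, tests like these can be run with `python -m unittest`.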

Universal Approximation

Techniques

The property that a model family can approximate any continuous function to arbitrary accuracy, given sufficient capacity.

Universal Induction

Techniques

Learning general rules from examples that apply broadly across different situations.

Universal Model

Behavior

A model designed to work well across many different tasks and domains without requiring task-specific customization or retraining.

Unnormalized Density

Techniques

A function proportional to a probability distribution but not scaled to sum or integrate to one.
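
As an illustration, the curve exp(-x²/2) is a standard normal density missing its scaling constant. The sketch below estimates that constant numerically; the function names and the integration range are choices made for this example.

```python
import math

def unnormalized_density(x: float) -> float:
    """exp(-x^2 / 2): proportional to a standard normal density."""
    return math.exp(-0.5 * x * x)

def estimate_normalizer(f, lo: float, hi: float, steps: int = 10_000) -> float:
    """Midpoint-rule estimate of the integral of f over [lo, hi]."""
    dx = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * dx) for i in range(steps)) * dx

# Normalizing constant; the exact value for this curve is sqrt(2 * pi).
Z = estimate_normalizer(unnormalized_density, -10.0, 10.0)

def density(x: float) -> float:
    """Properly normalized density: integrates to approximately one."""
    return unnormalized_density(x) / Z
```

In one dimension this integral is cheap; in high-dimensional models it is usually the expensive part.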

Unnormalized Distribution

Techniques

A set of non-negative scores that are proportional to probabilities but do not sum to one; turning them into true probabilities requires a normalization step that can be expensive to compute.

Unscented Kalman Filter (UKF)

Techniques

An algorithm for estimating the state of a system from noisy measurements, designed to handle nonlinear dynamics better than standard Kalman Filters.

Unstructured Data

Behavior

Information that doesn't follow a predefined format or organization, such as raw text documents or photographs.

Unstructured Knowledge

Techniques

Information stored as plain text documents rather than organized databases, like PDFs or policy manuals.

Unsupervised Clustering

Techniques

Grouping data points into categories without labeled training examples, discovering patterns automatically.

Unsupervised Embedding

Techniques

A machine learning technique that discovers hidden structure in data without labeled examples, creating meaningful representations automatically.

Unsupervised Learning

Techniques

Training a model without labeled examples, letting it discover patterns on its own.

Unsupervised RLVR

Techniques

Training language models with reinforcement learning using rewards derived without human labels or ground truth answers.

Untrained Model

Training

A model with the correct structure but no learned knowledge, producing meaningless output because it has never been trained on data.

Upscaling

Techniques

Using sparse local measurements to estimate values across a larger geographic or temporal region.

User Embedding

Techniques

A learned vector representation that captures an individual user's preferences and behavior, such as a particular driver's driving style.

User Simulator

Techniques

A synthetic agent that mimics realistic user behavior and preferences to test AI assistant performance.

User Turn Generation

Techniques

Prompting a model to generate the next user message in a conversation to probe whether it understands interaction dynamics.

V

V-usable Information

Techniques

A generalization of Shannon information that measures how much information a specific family of models can actually extract from the data, given its computational limits.

Validation-Driven Refinement

Techniques

A process that checks generated outputs (like rendered charts) against quality criteria and iteratively improves them based on detected failures.

Value Function

Techniques

A function estimating how good a state or action is for achieving a goal.

Value Pluralism

Techniques

Recognition that multiple legitimate ethical principles (autonomy, beneficence, justice) can conflict, requiring case-by-case navigation rather than single fixed rules.

Value Propagation

Techniques

The process of updating an agent's estimates of state values backward through a trajectory during learning.

Variable Entropy Mechanism

Techniques

A technique that dynamically adjusts how much a model explores new outputs versus exploiting known good ones.

Variable Fixation

Techniques

Reducing a problem's complexity by fixing certain decision variables to specific values based on prior knowledge or predictions.

Variance Reduction

Techniques

Techniques that reduce noise in gradient estimates to improve optimization efficiency and convergence speed.

Variational Autoencoder (VAE)

Techniques

A neural network that learns to compress data into a latent space and reconstruct it, useful for learning smooth representations.

Variational Embedding

Techniques

An embedding learned through a variational approach that optimizes a probabilistic objective function.

Variational Inference

Techniques

A method to approximate complex probability distributions by learning simpler, tractable distributions.

Variational Quantum Classifier

Techniques

A quantum machine learning model that uses parameterized quantum circuits to classify data by optimizing circuit parameters.

Variational Score Distillation

Techniques

An optimization technique that transfers knowledge from a teacher model to improve generation quality by matching score distributions.

VC Dimension

Techniques

A measure of the expressiveness of a hypothesis class in machine learning: the size of the largest set of points the class can label in every possible way.

Vector Dimension

Architecture

The number of individual numerical values used to represent a piece of text; higher dimensions can capture more nuanced meaning but require more computational resources.

Vector Embedding

Architecture

A representation of data (like molecules or text) as a list of numbers that captures its essential features in a form that machine learning models can work with.

Vector Embeddings

Architecture

Numerical representations of text where each word or sentence becomes a list of numbers that capture its meaning in a way computers can process.

Vector Generation

Architecture

The process of converting input data (like text) into numerical vectors that can be stored, compared, and searched efficiently.

Vector Graphics

Formats

Images defined by mathematical shapes and paths rather than pixels, allowing them to scale to any size without losing quality.

Vector Normalization

Techniques

A preprocessing step that scales vectors to a standard length (usually 1), so that comparisons reflect direction rather than magnitude and a plain dot product equals cosine similarity.
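
A minimal sketch of L2 normalization in plain Python. After scaling to length 1, the dot product of two vectors equals their cosine similarity. The function names are illustrative.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(v):
    """Scale v so that its Euclidean length is 1."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]

def cosine_similarity(a, b):
    """Cosine of the angle between a and b, computed without normalizing first."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
```

With `a = [3, 4]` and `b = [4, 3]`, `dot(l2_normalize(a), l2_normalize(b))` and `cosine_similarity(a, b)` agree up to floating-point rounding.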

Vector Output

Formats

The model's output is a single array of numbers (a vector) rather than generated text, which can be efficiently compared with other vectors to measure similarity.

Vector Quantization

Techniques

Compressing data by encoding groups of values together rather than individually, achieving better compression ratios.
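
One common form replaces each vector with the index of its nearest entry in a small codebook. The sketch below assumes the codebook is already given (in practice it is usually learned, for example with k-means); the names are illustrative.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantize(vectors, codebook):
    """Encode each vector as the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda i: squared_distance(v, codebook[i]))
        for v in vectors
    ]

def dequantize(indices, codebook):
    """Decode indices back into (approximate) vectors."""
    return [codebook[i] for i in indices]
```

Storing one small integer per vector instead of several floats is where the compression comes from; the reconstruction is only approximate.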

Vector Representation

Architecture

A way of expressing text as a list of numbers that a computer can process and compare mathematically.

Vector Retrieval

Techniques

A search method that converts text into numerical vectors and finds similar documents by comparing vector distances.

Vector Search

Techniques

A search method that converts queries and documents into numerical vectors and finds matches by measuring similarity between vectors, fast but less nuanced than other ranking approaches.

Vector Similarity

Performance

A measurement of how alike two vectors (number lists) are to each other, used to determine if two pieces of text have similar meanings.

Vector Similarity Search

Techniques

A method that converts text into numerical vectors and finds documents with vectors closest to a query vector, fast but sometimes missing nuanced relevance signals.

Vector Space

Architecture

A mathematical representation where text is converted into points or directions in a multi-dimensional space, enabling comparison and analysis of semantic relationships.

Vector-Based Adaptation

Techniques

A parameter-efficient fine-tuning approach that adapts models using learned vectors instead of full weight matrices, requiring even fewer parameters than LoRA.

Vector-valued Reward

Techniques

A reward signal with multiple dimensions (e.g., correctness per test case) instead of a single scalar score.

Velocity Field

Techniques

In diffusion models, the learned direction and speed that guides the generation process at each step.

Vendor-Specific Operations

Techniques

Operational procedures and knowledge unique to equipment from a particular manufacturer, like GE MRI scanners.

Verbalized confidence

Techniques

Uncertainty estimates based on explicit confidence statements the model generates as part of its reasoning output.

Verifiable Answers

Techniques

Answers that can be checked against external sources like the web to confirm correctness.

Verification Logic

Techniques

The code or rules that check whether an agent's solution correctly solves a benchmark task.

Verifier

Techniques

A model or system that evaluates whether another model's output is correct or high-quality.

Video Diffusion

Techniques

A generative model that creates videos by iteratively refining noise into coherent frames, similar to image diffusion but applied to sequences.

Video Encoder

Architecture

A model component that processes video frames and converts them into compact numerical representations that capture the video's visual and motion content.

Video Generation

Techniques

Creating realistic video sequences using AI based on text or image descriptions.

Video Object Removal

Techniques

Editing technique that deletes objects from video while filling in background and correcting physical interactions.

Video Question Answering

Techniques

A task where AI models watch videos and answer questions about what they see and understand.

Video Segmentation

Behavior

Extending image segmentation to video by identifying and tracking objects across multiple frames over time.

Video Summarization

Techniques

Automatically selecting key frames or clips from a long video that capture the most important content.

Video Tracking

Behavior

The ability to follow and maintain consistent identification of objects as they move across multiple frames in a video sequence.

Video Understanding

Techniques

The ability of AI systems to analyze and extract meaning from video content including visual, temporal, and semantic information.

Video VAE

Techniques

A variational autoencoder designed for video that compresses video frames into a latent representation for efficient processing.

Video-Language Model

Architecture

A specialized AI model trained to understand video content and communicate its understanding through natural language text.

Video-to-Audio Generation

Techniques

Creating sound effects or audio that matches the visual content and timing of a video.

View-Dependent Appearance

Techniques

How an object's appearance changes based on the viewing angle, including effects like reflections and shininess.

Viewpoint Rotation Understanding

Techniques

The capability to track how a viewpoint changes through rotations and predict resulting observations.

Virtual Cell Abstraction

Techniques

Representing biological cells as simplified computational models for simulation.

Virtual Reality (VR)

Techniques

A computer-generated 3D environment that users can interact with using special headsets or controllers.

Virtual Staining

Techniques

Using AI to digitally add color to microscope images without physical staining.

Viscosity Solution

Techniques

A mathematical solution concept for complex equations that handles non-smooth behavior in optimization problems.

Vision Backbone

Architecture

The core neural network component that processes and understands images before passing information to the rest of the model.

Vision Encoder

Architecture

A component that converts images into a numerical representation that a language model can understand and process.

Vision Encoding

Architecture

A process that converts images into numerical representations that a model can understand and process.

Vision Foundation Models

Techniques

Large pre-trained models like DINO and SAM that learn general visual understanding from diverse image data.

Vision Pipeline

Architecture

The specialized component of a model that processes and interprets image data to extract visual information.

Vision Tokens

Techniques

Discrete representations of image patches or regions processed by vision-language models.

Vision Transformer

Architecture

A neural network architecture that processes images by breaking them into small patches and analyzing them similarly to how language models process text.

Vision Transformer (ViT)

Architecture

A neural network architecture that processes images by breaking them into small patches and treating them similarly to how language models process words.

Vision Understanding

Architecture

The ability of an AI model to analyze and interpret visual information from images, identifying objects, scenes, and their relationships.

Vision-Language

Architecture

A model designed to understand and reason about both visual content (images) and natural language text together.

Vision-Language Alignment

Training

Training a model to understand the relationship between images and their text descriptions so it can match them together effectively.

Vision-Language Backbone

Techniques

A pre-trained model that jointly processes and understands both visual and textual information in a unified representation.

Vision-Language Encoder

Architecture

A model that processes both images and text together to create shared numerical representations, rather than generating new text like a full language model would.

Vision-Language Learning

Training

Training a model to understand and connect both images and text together, so it can reason about visual content using language.

Vision-Language Model

Architecture

An AI model that understands both images and text, allowing it to answer questions about images or describe what it sees.

Vision-Language Models (VLMs)

Techniques

AI systems that understand both images and text, allowing them to answer questions about images or describe what they see.

Vision-Language Navigation (VLN)

Techniques

Task where an AI agent navigates physical spaces by following natural language instructions while processing visual input.

Vision-Language Task

Behavior

A task that requires a model to understand and reason about both visual information (images) and textual information together.

Vision-Language Tasks

Behavior

AI tasks that require understanding both visual information from images and textual information together, such as describing images or answering questions about them.

Vision-Language-Action Model

Architecture

A model that combines visual perception, language understanding, and robotic action generation to interpret instructions and control robot movements.

Vision-to-Code Generation

Techniques

Converting visual inputs like screenshots, charts, or diagrams into executable code or structured representations.

Visual Anchoring

Techniques

Grounding abstract concepts (like actions) to concrete visual observations to ensure they have real physical meaning.

Visual decoding

Techniques

Reconstructing or identifying visual stimuli from recorded brain activity patterns.

Visual Encoder

Architecture

A component that converts images into a numerical representation that the model can understand and process.

Visual Foresight

Techniques

Predicting and visualizing what a robot will do next based on its learned policy.

Visual Grounding

Behavior

The ability to connect specific words or concepts in text to the actual objects or regions they refer to in an image.

Visual Instruction Tuning

Training

A training technique that teaches a model to follow instructions about images by learning from examples of image-text instruction pairs.

Visual Question Answering

Behavior

A task where an AI model reads a question and an image, then generates an answer based on what it understands from the image.

Visual Reasoning

Behavior

The capability to analyze images and draw logical conclusions or answer complex questions based on what is depicted in the visual content.

Visual Retrieval-Augmented Generation

Techniques

RAG applied to visually rich documents, allowing models to retrieve and reason over images and multi-page visual content.

Visual Segmentation

Behavior

The task of identifying and separating individual objects or regions in an image or video by assigning each pixel to a specific object or category.

Visual Signal Dilution

Techniques

The degradation of visual understanding in models as generated text accumulates, causing attention to shift away from image tokens.

Visual Token Pruning

Techniques

Removing unnecessary visual tokens from images or videos to reduce computational cost in vision-language models.

Visual Tokens

Techniques

Discrete units representing different regions or features of an image processed by the model.

Visual Understanding

Behavior

The ability of an AI model to interpret and analyze images, including identifying objects, reading text, and answering questions about visual content.

Visual-Language Model

Architecture

A model that processes both images and text together, understanding the relationship between visual content and language to answer questions about images or describe what it sees.

Visual-Textual Attention

Techniques

How a multimodal model allocates focus between visual and text information when processing inputs.

Visualization Rhetoric

Techniques

The persuasive techniques and design choices used in charts and graphs to influence how viewers interpret data.

Visually-Grounded

Behavior

A model's ability to understand and reason about visual information in images, connecting what it sees to language and concepts.

Visuo-Tactile Fusion

Techniques

Combining visual information from cameras with tactile (touch) sensor data to improve robot perception and decision-making.

Visuomotor Pipeline

Techniques

A system that converts visual input into motor control commands for robot manipulation.

Visuomotor Policy

Techniques

A learned control policy that maps visual observations directly to robot motor commands.

vLLM

Deployment

An inference engine optimized for running large language models efficiently by batching requests and managing memory intelligently.

vLLM Inference Engine

Deployment

A high-performance serving framework that efficiently runs language models and embedding models with optimized memory usage and throughput for production deployments.

Vocabulary

Architecture

The complete set of unique words or tokens that a language model can recognize and generate.

Vocabulary Collapse

Techniques

When a model over-predicts only a few options and ignores others, losing diversity in its outputs.

Vocabulary Extension

Techniques

Adding new tokens or words to a language model's vocabulary beyond its original pretrained set.

Vocabulary Size

Architecture

The number of unique tokens (words or word pieces) a model can recognize and process; larger vocabularies provide better coverage of a language.

Vocabulary-Constrained LLM

Techniques

A language model restricted to generate only outputs from a predefined set of allowed terms or concepts.

Voice Synthesis

Behavior

The process of generating natural-sounding human speech from text using machine learning models.

VQ-VAE (Vector Quantized Variational Autoencoder)

Techniques

A neural network that learns discrete, quantized representations by combining VAE principles with vector quantization.

VQA-Based Reward

Techniques

A reward signal derived from visual question answering that uses language-vision reasoning to evaluate image quality.

VRAM

Deployment

Video RAM — the memory on a GPU that stores model weights and intermediate computations during inference.

VRAM Footprint

Deployment

The amount of graphics memory (VRAM) required to load and run a model on a GPU.
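
A rough back-of-the-envelope estimate for the weights alone is parameter count times bits per weight, divided by 8 to get bytes. This sketch ignores activations, the KV cache, and framework overhead, so real usage is higher; the 7-billion-parameter figure below is just an example.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

# Hypothetical 7-billion-parameter model at two precisions.
fp16_gb = weight_memory_gb(7e9, 16)  # 16-bit weights
int4_gb = weight_memory_gb(7e9, 4)   # 4-bit weights
```

Under these assumptions the 16-bit weights take about 14 GB and the 4-bit weights about 3.5 GB.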

Vulnerability detection

Techniques

Automatically identifying security flaws or weaknesses in code that could be exploited by attackers.

Vulnerability Reasoning

Behavior

The ability to understand and explain how security weaknesses in software or systems could be exploited and what their potential impact might be.

W

W4A16

Formats

A quantization format where model weights (W) are stored in 4-bit precision while activations (A) stay in 16-bit precision, balancing efficiency with accuracy.

W4A16 Quantization

Deployment

A specific quantization scheme where weights are stored in 4-bit precision while activations remain in 16-bit precision, balancing memory savings with accuracy.

W8A8 Quantization

Deployment

A specific quantization method where both weights (W) and activations (A) are stored as 8-bit integers, providing a good balance between memory savings and model quality.

W8A8 Quantization

Deployment

A specific quantization method that reduces both weights and activations to 8-bit integers, enabling faster computation on specialized hardware while maintaining reasonable accuracy.

Warm Start

Techniques

Providing an optimization solver with an initial candidate solution to speed up convergence instead of starting from scratch.

Warnsdorff's Algorithm

Techniques

A greedy heuristic that prioritizes moves to positions with fewer onward options to avoid dead ends.
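
The heuristic is best known from the knight's tour puzzle. Below is a minimal sketch on an n-by-n chessboard; ties are broken by move order, which is an arbitrary choice here, and because the heuristic is not guaranteed to succeed the function returns `None` when it hits a dead end.

```python
KNIGHT_MOVES = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def knights_tour(n, start=(0, 0)):
    """Try to visit every square once; return the board of step numbers or None."""
    board = [[-1] * n for _ in range(n)]  # -1 marks an unvisited square

    def free_neighbors(r, c):
        """Unvisited squares reachable from (r, c) by one knight move."""
        return [(r + dr, c + dc) for dr, dc in KNIGHT_MOVES
                if 0 <= r + dr < n and 0 <= c + dc < n
                and board[r + dr][c + dc] == -1]

    r, c = start
    board[r][c] = 0
    for step in range(1, n * n):
        options = free_neighbors(r, c)
        if not options:
            return None  # dead end: the greedy choice failed
        # Warnsdorff's rule: move to the square with the fewest onward moves.
        r, c = min(options, key=lambda p: len(free_neighbors(*p)))
        board[r][c] = step
    return board
```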

Wasserstein Distance

Techniques

A mathematical measure of how different two distributions are, useful for comparing expert and agent behavior.
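
For two equal-sized sets of one-dimensional samples, the 1-Wasserstein distance reduces to the average gap between the sorted values. A minimal sketch under that restriction (general cases need an optimal-transport solver):

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-sized 1-D samples."""
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and the same size")
    # Sorting pairs each point with its counterpart at the same rank.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)
```

Shifting every sample by the same amount changes the distance by exactly that amount, which matches the intuition of "how far the mass has to move".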

Weak Supervision

Techniques

Training with imperfect or limited supervision signals, such as scarce labels, noisy annotations, or self-generated targets.

Weak-to-Strong Reverse Distillation

Techniques

Testing distillation by using a weaker model as teacher to see if a stronger student learns meaningfully.

Web Crawling

Techniques

Automatically browsing and collecting data from websites by following links across the internet.

Web Dataset

Training

Training data collected from publicly available internet sources, which provides broad but sometimes uneven coverage of topics.

Web Interaction

Techniques

An agent's capability to navigate websites, fill forms, click buttons, and extract information from live web pages.

Web Search Augmentation

Behavior

The ability to search the internet in real-time during processing to retrieve current information rather than relying only on training data.

Web Search Integration

Deployment

The capability for a model to query the internet in real-time during response generation, allowing it to access current information beyond its training data.

Weight and Activation Quantization (W8A8)

Deployment

A specific quantization method that compresses both the model's stored weights and its intermediate calculations to 8-bit precision, significantly reducing memory and computation requirements.

Weight Averaging

Techniques

A merging method that combines model weights by taking their average.
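
A minimal sketch with plain Python data: each model is represented as a dictionary from parameter name to a flat list of numbers. That representation, and the parameter names in the usage note, are assumptions for illustration; real checkpoints hold tensors.

```python
def average_weights(models):
    """Element-wise mean of several models with identical parameter names and shapes."""
    n = len(models)
    return {
        name: [sum(values) / n for values in zip(*(m[name] for m in models))]
        for name in models[0]
    }
```

For example, averaging `{"layer.weight": [1.0, 2.0]}` with `{"layer.weight": [3.0, 6.0]}` gives `{"layer.weight": [2.0, 4.0]}`.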

Weight Clustering

Techniques

Grouping similar weight values together and replacing them with shared cluster centers to reduce model size.
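
A toy sketch using one-dimensional k-means. The evenly spaced starting centers and the fixed iteration count are simplifications chosen for this example.

```python
def cluster_weights(weights, k, iters=20):
    """Return (centers, assignments) from a simple 1-D k-means over the weights."""
    lo, hi = min(weights), max(weights)
    # Start with centers spread evenly across the weight range.
    centers = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    assignments = [0] * len(weights)
    for _ in range(iters):
        # Assign each weight to its nearest center.
        assignments = [min(range(k), key=lambda j: abs(w - centers[j]))
                       for w in weights]
        # Move each center to the mean of its members (empty clusters stay put).
        for j in range(k):
            members = [w for w, a in zip(weights, assignments) if a == j]
            if members:
                centers[j] = sum(members) / len(members)
    return centers, assignments
```

The compressed model then stores the short list of centers plus one small index per weight.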

Weight Editing

Techniques

The process of directly modifying a trained model's internal parameters (weights) to change its behavior without retraining from scratch.

Weight Generation

Techniques

The process of using a neural network to produce parameters for another model rather than training those parameters directly.

Weight Importance

Techniques

A measure of how much a specific weight contributes to model predictions and performance.

Weight Initialization

Training

The process of setting the starting values for a neural network's parameters before training begins.

Weight Precision

Architecture

The number of bits used to represent each numerical value in a model's weights; lower precision (like 4-bit) uses less memory but may reduce accuracy.

Weight Quantization

Deployment

A specific type of quantization that compresses only the model's learned parameters (weights) while keeping other calculations at higher precision.
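
A minimal sketch of one simple scheme, symmetric per-tensor rounding to signed integers. Production methods are usually more elaborate (per-group scales, calibration data), so treat this only as an illustration of the idea.

```python
def quantize_symmetric(weights, bits=4):
    """Map floats to signed integers in [-qmax, qmax] with a single shared scale."""
    qmax = 2 ** (bits - 1) - 1          # 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]
```

Each recovered weight differs from the original by at most half the scale, which is the accuracy cost of the smaller storage format.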

Weight Sharing

Techniques

Using the same neural network parameters for multiple tasks to enable knowledge transfer and reduce model size.

Weights

Architecture

The numerical parameters inside a neural network that determine how it processes input and generates output.

Whole-Body Controller (WBC)

Techniques

A system that converts high-level motion commands into executable joint trajectories for robots.

Whole-Slide Image

Techniques

A high-resolution digital scan of an entire microscope slide used in computational pathology for disease diagnosis.

Width Scaling

Techniques

How optimizer behavior changes when you increase the number of neurons in each layer of a neural network.

Wigner Score

Techniques

A quantum version of the score function that describes how to reverse noise in quantum systems.

Winner-Take-All Retrieval

Techniques

A retrieval criterion requiring the correct target to score strictly higher than all other candidates.

Wirelength

Techniques

The total length of connections between components on a chip; shorter wirelength improves performance and power efficiency.

Word sense disambiguation

Techniques

Determining which meaning of a word is intended in a specific context when a word has multiple meanings.

Worked Example

Techniques

A step-by-step demonstration of how to solve a problem, used to help students learn problem-solving strategies.

Workflow Automation

Deployment

Using an AI model to automatically handle repetitive business tasks and processes, reducing manual effort and improving efficiency.

Working Memory (WM)

Techniques

The active, temporary knowledge an AI system uses for the current task, drawn from long-term memory.

Workload Manager

Techniques

Software that schedules and manages job submissions and resource allocation on shared computing clusters.

World Knowledge

Behavior

A model's learned understanding of facts, concepts, and relationships about the real world, typically acquired during training on diverse text data.

World Model

Techniques

An AI system that learns to understand and predict how the physical world works from observations.

World Modeling

Techniques

Predicting future states of the environment based on current observations and actions.

World Models

Behavior

Internal representations learned by AI systems that capture how the physical world works, including how objects move and interact over time.

Worst-Case Analysis

Techniques

Evaluating system behavior on the most dangerous or consequential failures rather than average performance.

Write-back Affordances

Techniques

The ability for modules to update and modify shared state, enabling bidirectional communication between tools.

Wyner-Ziv Coding

Techniques

A lossy compression technique in which the decoder has access to side information that the encoder does not, and uses it to help reconstruction.

X

XLM-RoBERTa

Architecture

A pre-trained language model architecture designed to understand and process text in over 100 languages simultaneously.

xLSTM Architecture

Techniques

A recurrent neural network architecture that extends the LSTM with exponential gating and expanded memory structures, so that cost grows linearly with sequence length rather than quadratically as in standard attention.

Z

Zero One Loss

Techniques

A metric that counts predictions as either completely right or completely wrong with no partial credit.
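
A minimal sketch that reports the loss as the fraction of wrong predictions (the function name is illustrative):

```python
def zero_one_loss(y_true, y_pred):
    """Fraction of predictions that do not exactly match the true label."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and the same length")
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)
```

With true labels `[1, 0, 1, 1]` and predictions `[1, 1, 1, 0]`, two of the four predictions are wrong, so the loss is 0.5.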

Zero Shot Learning

Techniques

Solving a task without any training examples by using knowledge from related tasks or descriptions.

Zero Shot Performance

Techniques

How well an AI model performs on new tasks it has never seen before, without any task-specific training or examples.

Zero-Day Detection

Techniques

Identifying previously unknown security vulnerabilities or attacks that have no existing defenses.

Zero-error capacity

Techniques

The maximum rate at which information can be reliably transmitted over a noisy channel with zero probability of error.

Zero-Order Hypergradient

Techniques

A gradient-free signal derived from comparing function values across different hyperparameter settings.

Zero-pair learning

Techniques

Training without paired examples of two modalities, using only single-modality data.

Zero-shot Autonomous Behavior

Techniques

Agent performing tasks without any external skill retrieval or runtime augmentation, relying only on learned parameters.

Zero-Shot Baseline

Techniques

A comparison model that makes predictions without being trained on the target task or domain.

Zero-Shot Generalization

Techniques

A model's ability to handle new, unseen tasks or data without additional training on those specific examples.

Zero-shot learning

Techniques

Using a model to solve a task without any training examples for that specific task.

Zero-Shot Prediction

Techniques

Making predictions on new tasks without any task-specific training or fine-tuning on labeled examples.

Zero-Shot Sound Generation

Techniques

Creating new sounds the model has never encountered before by using reference audio as a guide.

Zero-shot Task Transfer

Techniques

Performing a new task without any training examples, using only knowledge learned from other tasks or domains.

Zero-Trust Network Access (ZTNA)

Techniques

A security model that requires verification of every access request regardless of source, rather than trusting internal networks.

Zone-Level Modeling

Techniques

Predicting risk at the geographic area level rather than individual policy level, useful when detailed location data is unavailable.