Video RAM — the memory on a GPU that stores model weights and intermediate computations during inference.