For symbolic music generation, autoregressive models with attention outperform VAEs and GANs at producing stylistically coherent Bach compositions, though vector quantization significantly improves VAE performance by preventing posterior collapse.
This paper compares three types of AI models for generating Bach-style piano music from symbolic notation: autoregressive models (which predict one note at a time), latent-variable models (VAEs, which learn compressed representations), and adversarial models (GANs, in which a generator is trained to fool a discriminator).
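To make "predict one note at a time" concrete, here is a minimal sketch of autoregressive generation. It is not the paper's model: it assumes a toy first-order (bigram) model over MIDI pitch numbers with no attention, and the corpus and the names `fit_bigram` and `generate` are hypothetical, introduced only for illustration.

```python
import random
from collections import Counter, defaultdict


def fit_bigram(sequences):
    """Count how often each pitch follows another (a first-order model)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(counts, start, length, rng):
    """Sample one note at a time, each conditioned on the previous note."""
    notes = [start]
    while len(notes) < length:
        followers = counts.get(notes[-1])
        if not followers:  # no observed continuation: stop early
            break
        pitches = list(followers)
        weights = [followers[p] for p in pitches]
        notes.append(rng.choices(pitches, weights=weights, k=1)[0])
    return notes


# Hypothetical toy corpus of MIDI pitch numbers (not data from the paper).
corpus = [
    [60, 62, 64, 65, 67, 65, 64, 62, 60],
    [67, 65, 64, 62, 60, 62, 64, 62, 60],
]
model = fit_bigram(corpus)
melody = generate(model, start=60, length=8, rng=random.Random(0))
print(melody)
```

A model with attention would condition each prediction on the whole preceding sequence rather than only the last note, but the sampling loop has the same shape: predict a distribution over the next note, sample from it, append, and repeat.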