By combining efficient tokenization with geometry-aware attention, you can build crystal generation models that are both faster and more accurate than those based on complex graph neural networks, which makes generative modeling of materials more practical.
Crystalite is a lightweight diffusion Transformer for crystal structure generation. It relies on two key innovations: Subatomic Tokenization, a compact atom representation, and a Geometry Enhancement Module, which encodes crystal geometry directly into the model's attention mechanism.
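
To make "encoding geometry into attention" concrete, here is a minimal NumPy sketch. It is not Crystalite's actual Geometry Enhancement Module; it only illustrates one common way such a mechanism could work, under the assumption that the geometric signal is a pairwise periodic distance subtracted from the attention logits. The function names `min_image_distances` and `geometry_biased_attention` and the `scale` parameter are hypothetical.

```python
import numpy as np


def min_image_distances(frac, lattice):
    """Pairwise distances between atoms under periodic boundary conditions.

    frac:    (n, 3) fractional coordinates.
    lattice: (3, 3) matrix whose rows are the lattice vectors.

    Wrapping fractional differences into [-0.5, 0.5] yields the true
    minimum-image distance for orthogonal or mildly skewed cells; strongly
    skewed cells would need a search over neighboring images (not done here).
    """
    diff = frac[:, None, :] - frac[None, :, :]   # (n, n, 3)
    diff = diff - np.round(diff)                 # wrap across the cell boundary
    cart = diff @ lattice                        # fractional -> Cartesian
    return np.linalg.norm(cart, axis=-1)         # (n, n)


def geometry_biased_attention(q, k, v, dist, scale=1.0):
    """Single-head attention whose logits are penalized by distance.

    ASSUMPTION: the bias is simply -scale * dist. A real model would likely
    learn this mapping; the linear penalty is only for illustration.
    """
    logits = q @ k.T / np.sqrt(q.shape[-1])      # standard scaled dot product
    logits = logits - scale * dist               # closer atoms get larger logits
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Usage: three atoms in a cubic cell with edge length 4.
lattice = 4.0 * np.eye(3)
frac = np.array([[0.05, 0.0, 0.0],
                 [0.95, 0.0, 0.0],   # close to atom 0 through the boundary
                 [0.50, 0.0, 0.0]])
dist = min_image_distances(frac, lattice)

rng = np.random.default_rng(0)
q = k = v = rng.normal(size=(3, 8))              # placeholder token features
out, weights = geometry_biased_attention(q, k, v, dist)
```

The design point is that the model itself stays a plain Transformer: periodicity and lattice shape enter only through the additive bias on the attention logits, so no graph construction or message passing is required.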