Current LLMs have significant gaps in understanding discourse particles in Southeast Asian languages like Malay, but structured linguistic scaffolding can substantially improve their pragmatic competence.
This paper introduces MalayPrag, a benchmark for testing how well large language models understand discourse particles in colloquial Malay: short words and phrases, comparable to English 'well' or 'kind of', that convey emotion and intent.