🎯 TL;DR: State-of-the-art paired encoder and decoder models (17M-1B params) trained identically for fair comparison with open data. Encoders beat ModernBERT. Decoders beat Llama 3.2/SmolLM2. These ...
Large language models (LLMs) have reshaped machine translation (MT). They vary in architecture, from decoder-only designs to encoder-decoder frameworks. Encoder-decoder models, ...
We note that our work focuses on architectural comparisons rather than on competing with recent small language model (SLM) developments (e.g., SmolLM, MobileLLM). Our analysis isolates the fundamental advantages of ...