Current Large Language Models (LLMs) are predominantly designed with English as the primary language, and even the few that are multilingual tend to exhibit strong English-centric biases. Much like speakers who may produce awkward expressions when learning a second language, LLMs often generate unnatural outputs in non-English languages, reflecting English-centric patterns in both vocabulary and grammar. Despite the importance of this issue, the naturalness of multilingual LLM outputs has received limited attention. In this paper, we address this gap by introducing novel automatic corpus-level metrics to assess the lexical and syntactic naturalness of LLM outputs in a multilingual context. Using our new metrics, we evaluate state-of-the-art LLMs on a curated benchmark in French and Chinese, revealing a tendency towards English-influenced patterns. To mitigate this issue, we also propose a simple and effective alignment method to improve the naturalness of an LLM in a target language and domain, achieving consistent improvements in naturalness without compromising performance on general-purpose benchmarks. Our work highlights the importance of developing multilingual metrics, resources and methods for the new wave of multilingual LLMs.
†Sapienza University of Rome
‡‡ Work partially done during an Apple internship