Mohamed bin Zayed University of Artificial Intelligence releases Nanda, Hindi Large Language Model

Eric Xing, President and University Professor, MBZUAI

Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), the world’s first graduate-level Artificial Intelligence university dedicated to research, has released Nanda, the world’s most advanced open-source Hindi Large Language Model (LLM).

The open-source model was developed by the University’s Institute of Foundation Models (IFM) in partnership with Inception, a G42 company, and Cerebras Systems, and was announced earlier this year. The release marks a significant milestone in the ongoing development of India’s AI ecosystem and its journey to equitable AI, with more than half a billion Hindi speakers now able to harness the potential of Generative AI in their mother tongue.

Llama-3-Nanda-10B-Chat, or Nanda for short, is a 10-billion-parameter model that, based on extensive evaluation, demonstrates better knowledge and reasoning capabilities in Hindi than any existing open Hindi or multilingual model of similar size, by a sizable margin. It is also very competitive in English. The model was trained on the Condor Galaxy supercomputer, built by G42 and Cerebras Systems.

The launch of Nanda builds on the success of Jais, the Arabic LLM, and adds to MBZUAI’s zoo of advanced foundation models. Jais transformed Arabic Natural Language Processing (NLP), unlocking access to native-language Generative AI capabilities for over 400 million Arabic speakers globally.

MBZUAI President and University Professor Eric Xing said: “An accurate and efficient LLM for the Hindi language is vital for India’s ambitions for inclusive and accessible AI. With the release of Nanda, we are reinforcing our commitment to open-source LLMs and to making new technology affordable, safe, ethical and standardisable. This is aligned with our mission as an academic institution to lead the Generative AI development for public good and contributing to the UAE’s knowledge-led economy.”

“Nanda is an important advancement for Generative AI for Hindi, which is one of the most widely spoken languages in the world,” said the project’s lead, Preslav Nakov, Department Chair and Professor of Natural Language Processing at MBZUAI.

“We are releasing Nanda as an open model, so people can download it from HuggingFace and run it locally. It is of reasonable size, and thus has modest hardware requirements.”

The project’s co-lead, Monojit Choudhury, Professor of Natural Language Processing at MBZUAI, added: “The current state of LLMs in Hindi is not up to the mark. It is nowhere close to English or several European languages. Building LLMs especially for a language like Hindi, spoken by hundreds of millions of people, to a reasonable level is important. India is one of the world’s largest economies; any LLM that can serve Hindi will benefit communities as it opens new commercial opportunities.”
