NVIDIA Introduces NIM Microservices for Generative AI in Japan and Taiwan

NVIDIA has announced the launch of its NIM microservices for generative AI applications in Japan and Taiwan, according to NVIDIA blog. The new microservices are designed to support the development of high-performing generative AI applications tailored to regional needs.

Supporting Regional AI Development

The introduction of these microservices is aimed at helping developers build and deploy generative AI applications that are sensitive to local languages and cultural nuances. The microservices support popular community models, enhancing user interactions through improved understanding and responses based on regional languages and cultural heritage.

In the Asia-Pacific region, generative AI software revenue is projected to reach $48 billion by 2030, up from $5 billion in 2024, according to ABI Research. NVIDIA's new microservices are expected to play a significant role in this growth by providing advanced tools for AI development.

Regional Language Models

Among the new offerings are the Llama-3-Swallow-70B and Llama-3-Taiwan-70B models, trained on Japanese and Mandarin data respectively. These models are designed to provide a deeper understanding of local laws, regulations, and customs. The RakutenAI 7B family of models, built on Mistral-7B, were trained on English and Japanese datasets and are available as NIM microservices for Chat and Instruct functionalities.

These models have achieved leading scores among open Japanese large language models, as evidenced by their top average score in the LM Evaluation Harness benchmark conducted from January to March 2024.

Global and Local Impact

Nations worldwide, including Singapore, the United Arab Emirates, South Korea, Sweden, France, Italy, and India, are investing in sovereign AI infrastructure. NVIDIA's NIM microservices allow businesses, government agencies, and universities to host native large language models (LLMs) in their own environments, facilitating the development of advanced AI applications.

For example, the Tokyo Institute of Technology has fine-tuned the Llama-3-Swallow 70B model using Japanese-language data. Preferred Networks, a Japanese AI company, is using the model to develop a healthcare-specific AI trained on Japanese medical data, achieving top scores on the Japan National Examination for Physicians.

In Taiwan, Chang Gung Memorial Hospital is building a custom AI Inference Service to centralize LLM applications within the hospital system, using the Llama-3-Taiwan 70B model to improve medical communication. Pegatron, a Taiwan-based electronics manufacturer, is adopting the model for internal and external applications, integrating it with its PEGAAi Agentic AI System to boost efficiency in manufacturing and operations.

Developing Applications With Sovereign AI NIM Microservices

Developers can deploy these sovereign AI models, packaged as NIM microservices, into production while achieving improved performance. The microservices, available with NVIDIA AI Enterprise, are optimized for inference with the NVIDIA TensorRT-LLM open-source library, providing up to 5x higher throughput and lowering the total cost of running the models in production.

The new NIM microservices are available today as hosted application programming interfaces (APIs).

Tapping NVIDIA NIM for Faster, More Accurate Generative AI Outcomes

The NIM microservices accelerate deployments, enhance overall performance, and provide the necessary security for organizations across various global industries, including healthcare, finance, manufacturing, education, and legal sectors.

“LLMs are not mechanical tools that provide the same benefit for everyone. They are rather intellectual tools that interact with human culture and creativity. The influence is mutual where not only are the models affected by the data we train on, but also our culture and the data we generate will be influenced by LLMs,” said Rio Yokota, professor at the Global Scientific Information and Computing Center at the Tokyo Institute of Technology.

Creating Custom Enterprise Models With NVIDIA AI Foundry

NVIDIA AI Foundry offers a platform and service that includes popular foundation models, NVIDIA NeMo for fine-tuning, and dedicated capacity on NVIDIA DGX Cloud. This provides developers with a full-stack solution for creating customized foundation models packaged as NIM microservices.

Developers using NVIDIA AI Foundry have access to the NVIDIA AI Enterprise software platform, which offers security, stability, and support for production deployments. This enables developers to build and deploy custom, regional language NIM microservices more quickly and easily, ensuring culturally and linguistically appropriate results for their users.