NVIDIA Unveils First On-Device Small Language Model to Enhance Digital Humans
At Gamescom this week, NVIDIA announced Nemotron-4 4B Instruct, its first on-device small language model (SLM). The model is designed to make digital humans in games and other interactive experiences more lifelike, according to the NVIDIA Blog.
The SLM Advantage
SLMs such as Nemotron-4 4B are optimized for specific use cases, allowing them to deliver faster and more accurate responses than larger, general-purpose language models. Nemotron-4 4B was distilled from the larger Nemotron-4 15B model, shrinking its memory footprint and improving its speed while preserving accuracy. That efficiency makes it suitable for running locally on GeForce RTX-powered PCs and laptops, as well as NVIDIA RTX-powered workstations.
The model includes advanced features like role-play, retrieval-augmented generation, and function-calling capabilities. These enable game characters to better understand and respond to player instructions, making in-game interactions more intuitive and engaging.
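For a sense of what function calling looks like from a developer's perspective, the sketch below sends a player request to a locally hosted, OpenAI-compatible chat endpoint and dispatches any tool call the model returns. The URL, model name, and tool schema are illustrative assumptions, not values taken from NVIDIA documentation.

```python
import json
import requests

# Illustrative local endpoint and model name (assumptions, not NVIDIA-documented values).
# LLM microservices of this kind generally expose an OpenAI-compatible chat API.
ENDPOINT = "http://localhost:8000/v1/chat/completions"
MODEL = "nemotron-4-4b-instruct"

# Hypothetical tool the game exposes to the model: swap the character's equipment.
tools = [{
    "type": "function",
    "function": {
        "name": "equip_loadout",
        "description": "Equip the player's character with the named loadout.",
        "parameters": {
            "type": "object",
            "properties": {"loadout": {"type": "string"}},
            "required": ["loadout"],
        },
    },
}]

messages = [
    {"role": "system", "content": "You are a workshop NPC who helps players customize their gear."},
    {"role": "user", "content": "Set me up with the heavy assault loadout."},
]

resp = requests.post(ENDPOINT, json={"model": MODEL, "messages": messages, "tools": tools}, timeout=30)
reply = resp.json()["choices"][0]["message"]

# If the model chose to call the tool, hand it to game logic; otherwise have the NPC speak.
if reply.get("tool_calls"):
    call = reply["tool_calls"][0]["function"]
    print("Game action:", call["name"], json.loads(call["arguments"]))
else:
    print("NPC says:", reply["content"])
```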
ACEs Up
NVIDIA's ACE (Avatar Cloud Engine) technology, which now includes Nemotron-4 4B, allows developers to deploy state-of-the-art generative AI models both in the cloud and on RTX AI PCs and workstations. The suite provides key AI models for speech-to-text, language processing, text-to-speech, and facial animation, and its modular design lets developers adopt only the components they need.
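Taken together, those components form a perception-to-animation loop. The skeleton below is only a schematic of that flow under the modular design described above; the function names are placeholders, not the ACE API.

```python
# Schematic digital-human turn: each stub stands in for one of the components
# described above (ASR, language model, TTS, facial animation). These are
# placeholder names, not real ACE interfaces.

def transcribe_speech(audio_frames):      # speech-to-text (ASR)
    ...

def generate_reply(transcript, persona):  # language model (e.g. an SLM such as Nemotron-4 4B)
    ...

def synthesize_voice(reply_text):         # text-to-speech
    ...

def animate_face(voice_audio):            # audio-driven facial animation (e.g. Audio2Face)
    ...

def digital_human_turn(audio_frames, persona):
    """One conversational turn: player audio in, animated spoken reply out."""
    transcript = transcribe_speech(audio_frames)
    reply = generate_reply(transcript, persona)
    voice = synthesize_voice(reply)
    return animate_face(voice)
```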
For instance, in the video game Mecha BREAK, players can converse with a mechanic game character and instruct it to switch and customize mechs, demonstrating the practical applications of Nemotron-4 4B's capabilities.
AI That’s NIMble
ACE also supports hybrid inference, allowing developers to run AI models either in the cloud or locally on the device. The NVIDIA AI Inference Manager software development kit streamlines the deployment and integration of these models, letting developers choose where each model runs based on their specific requirements.
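One simple way to picture hybrid inference is a routing check that prefers a local endpoint when a capable GPU is present and falls back to a hosted one otherwise. The sketch below is a hand-rolled illustration of that idea, not the AI Inference Manager SDK; the endpoint URLs and the VRAM threshold are assumptions.

```python
import shutil
import subprocess

# Illustrative hybrid-inference routing: prefer a local endpoint when a GPU with
# enough memory is present, otherwise use a cloud endpoint. This is NOT the
# AI Inference Manager SDK; the URLs and the 6 GB threshold are placeholder assumptions.
LOCAL_ENDPOINT = "http://localhost:8000/v1/chat/completions"
CLOUD_ENDPOINT = "https://example-cloud-endpoint/v1/chat/completions"

def local_gpu_memory_mib():
    """Return total VRAM of the first NVIDIA GPU in MiB, or 0 if none is found."""
    if shutil.which("nvidia-smi") is None:
        return 0
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    lines = out.stdout.strip().splitlines()
    return int(lines[0]) if lines else 0

def choose_endpoint(min_vram_mib=6144):
    return LOCAL_ENDPOINT if local_gpu_memory_mib() >= min_vram_mib else CLOUD_ENDPOINT

print("Routing inference to:", choose_endpoint())
```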
Current ACE NIM microservices running locally include Audio2Face and the new Nemotron-4 4B Instruct, as well as Whisper ASR, an advanced automatic speech recognition system. These services enhance the interactivity and realism of digital characters in games and other applications.
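For the speech-recognition piece specifically, the open-source Whisper model can already run on-device via the openai-whisper Python package; the snippet below uses it as a stand-in for the ASR step rather than reproducing the NIM microservice's own deployment, and the audio file name is illustrative.

```python
# Local ASR illustration using the open-source openai-whisper package
# (pip install openai-whisper); a stand-in for the Whisper ASR microservice.
import whisper

model = whisper.load_model("base")             # small model that fits on a consumer GPU
result = model.transcribe("player_voice.wav")  # path to a captured voice clip (illustrative)
print(result["text"])                          # transcript handed to the language model next
```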
To Infinity and Beyond
NVIDIA's advancements in digital human technology extend beyond gaming. At the recent SIGGRAPH conference, the company showcased “James,” an interactive digital human capable of connecting with users through emotions and humor. James is built using the ACE framework and demonstrates the potential for digital humans in various industries, including customer service, healthcare, retail, and robotics.
According to Gartner, by 2025, 80% of conversational offerings will embed generative AI, and 75% of customer-facing applications will feature conversational AI with emotional capabilities. This indicates a significant shift towards more engaging and natural human-computer interactions.
For those interested in experiencing this technology, James is available to interact with in real time at ai.nvidia.com.