NVIDIA NIM Microservices Revolutionize AI Model Deployment

Caroline Bishop   Aug 06, 2024 17:07


Delivered as optimized containers, NVIDIA NIM microservices are designed to accelerate AI application development for businesses of all sizes, paving the way for rapid production and deployment of AI technologies. The set of microservices can be used to build and deploy AI solutions across speech AI, data retrieval, digital biology, digital humans, simulation, and large language models (LLMs), according to the NVIDIA Technical Blog.

Speech and Translation NIM Microservices

The latest NIM microservices for speech and translation enable organizations to integrate advanced multilingual speech and translation capabilities into their conversational applications. These include automatic speech recognition (ASR), text-to-speech (TTS), and neural machine translation (NMT), catering to diverse industry needs.

Parakeet ASR

The Parakeet-CTC-1.1B-EnUS ASR model, with 1.1 billion parameters, provides record-setting English-language transcription. It delivers exceptional accuracy and robustness, adeptly handling diverse speech patterns and noise levels, enabling businesses to advance their voice-based services.

FastPitch-HiFiGAN TTS

FastPitch-HiFiGAN-EN integrates FastPitch and HiFiGAN models to generate high-fidelity audio from text. It enables businesses to create natural-sounding voices, elevating user engagement and delivering immersive experiences.

Megatron NMT

The Megatron 1B-En32 is a powerful NMT model excelling in real-time translation across multiple languages, facilitating seamless multilingual communication. It enables organizations to extend their global reach and engage diverse audiences.

Retrieval NIM Microservices

The latest NVIDIA NeMo Retriever NIM microservices help developers efficiently fetch the best proprietary data to generate knowledgeable responses for their AI applications. NeMo Retriever enables organizations to seamlessly connect custom models to diverse business data and deliver highly accurate responses using retrieval-augmented generation (RAG).

Embedding QA E5

The NVIDIA NeMo Retriever QA E5 embedding model is optimized for text question-answering retrieval. It transforms textual information into dense vector representations, crucial for a text retrieval system.
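To make the role of dense vectors concrete, here is a minimal sketch of embedding-based retrieval: documents and queries become vectors, and relevance is measured by cosine similarity. The tiny 3-dimensional vectors stand in for real model output (actual embedding models produce vectors with hundreds of dimensions); this is an illustration of the technique, not the E5 model itself.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, doc_vecs, k=2):
    """Return indices of the k documents most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy embeddings standing in for model output.
docs = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.9]]
query = [0.85, 0.15, 0.05]
print(retrieve(query, docs, k=1))  # → [0]
```

In a real RAG pipeline, the embedding microservice would produce these vectors and a vector database would handle the similarity search at scale.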

Embedding QA Mistral 7B

The NVIDIA NeMo Retriever QA Mistral 7B embedding model is a multilingual community base model fine-tuned for high-accuracy question-answering. This model is suitable for users building a question-and-answer application over a large text corpus.

Snowflake Arctic Embed

Snowflake Arctic Embed is a suite of text embedding models for high-quality retrieval, optimized for performance. These models are ready for commercial use, free of charge, and have achieved state-of-the-art performance on the MTEB/BEIR leaderboard.

Reranking QA Mistral 4B

The NVIDIA NeMo Retriever QA Mistral 4B reranking model provides a logit score representing document relevance to a query. It improves the overall accuracy of text retrieval systems, often deployed in combination with embedding models.
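The retrieve-then-rerank pattern described above can be sketched as a two-stage pipeline: a fast scorer recalls a short candidate list, then a slower, more accurate scorer reorders only those candidates. The word-overlap scorers below are hypothetical stand-ins for the embedding and reranking models, chosen so the example runs self-contained; the pipeline shape is the point.

```python
def retrieve_then_rerank(query, docs, embed_score, rerank_score, k=10):
    """Two-stage retrieval: recall top-k candidates with a cheap score,
    then reorder that short list with a finer-grained reranker.
    Only the k candidates pay the reranking cost."""
    candidates = sorted(docs, key=lambda d: embed_score(query, d),
                        reverse=True)[:k]
    return sorted(candidates, key=lambda d: rerank_score(query, d),
                  reverse=True)

# Stand-in scorers; a real system would call the embedding and
# reranking microservices instead.
def fast_score(q, d):
    # Crude recall signal: number of shared words.
    return len(set(q.split()) & set(d.split()))

def slow_score(q, d):
    # Finer signal: query coverage, lightly penalizing longer documents.
    q_words = set(q.split())
    return len(q_words & set(d.split())) / len(q_words) - 0.01 * len(d.split())

docs = ["gpu inference with nim", "cooking pasta at home",
        "deploying llm inference microservices", "nim inference microservices"]
top = retrieve_then_rerank("nim inference", docs, fast_score, slow_score, k=3)
print(top[0])  # → nim inference microservices
```

Note that the reranker changes the order the first stage produced, which is exactly the accuracy improvement the reranking model contributes in practice.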

Digital Biology NIM Microservices

In healthcare and life sciences, NVIDIA NIM microservices are transforming digital biology. These AI tools empower pharmaceutical, biotechnology, and healthcare organizations to expedite innovation and deliver life-saving medicines to patients.

MolMIM

MolMIM is a transformer-based model for controlled small molecule generation, optimizing and sampling molecules for improved values of desired scoring functions. It can be deployed in the cloud or on-premises for computational drug discovery workflows.
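The "optimize against a scoring function" idea can be illustrated with a toy sample-and-select loop over a latent vector. This is not MolMIM's actual algorithm: the model couples such a loop with a learned decoder that maps latents back to valid molecules, and the scoring function here is a made-up stand-in for a property score such as drug-likeness.

```python
import random

def optimize_latent(score, dim=4, iters=200, seed=0):
    """Toy sample-and-select loop: perturb a latent vector and keep
    whichever candidate scores best. Controlled generation couples a
    loop like this with a decoder from latents to molecules; the
    decoder is omitted here."""
    rng = random.Random(seed)
    best = [rng.uniform(-1, 1) for _ in range(dim)]
    for _ in range(iters):
        cand = [x + rng.gauss(0, 0.1) for x in best]
        if score(cand) > score(best):
            best = cand
    return best

# Hypothetical scoring function: prefer latents near the all-ones point.
target_score = lambda z: -sum((x - 1.0) ** 2 for x in z)
z = optimize_latent(target_score)
print(target_score(z))
```

Because only improving candidates are accepted, the returned latent always scores at least as well as the starting point.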

DiffDock

NVIDIA DiffDock NIM microservice is built for high-performance, scalable molecular docking. It predicts up to 7x more poses per second compared to baseline models, reducing the cost of computational drug discovery workflows.

LLM NIM Microservices

New NVIDIA NIM microservices for LLMs offer unprecedented performance and accuracy across various applications and languages.

Llama 3.1 8B and 70B

The Llama 3.1 8B and 70B models provide cutting-edge text generation and language understanding capabilities, serving as powerful tools for creating engaging and informative content. Deploying the Llama 3.1 8B NIM microservice on NVIDIA H100 data center GPUs can deliver up to 2.5x higher throughput (tokens per second) for content generation.
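A deployed LLM NIM microservice exposes an OpenAI-compatible HTTP API, so calling it amounts to posting a standard chat-completions request. The sketch below assembles such a request; the base URL, port, and model identifier are assumptions for a hypothetical local deployment, and the actual network call is shown commented out since it needs a running endpoint.

```python
import json

# Assumed values for a local NIM deployment.
BASE_URL = "http://localhost:8000/v1"
MODEL = "meta/llama-3.1-8b-instruct"

def build_chat_request(prompt, max_tokens=128, temperature=0.2):
    """Assemble the JSON body for an OpenAI-style /chat/completions call."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

body = build_chat_request("Summarize NIM microservices in one sentence.")
print(json.dumps(body, indent=2))

# To send it against a running endpoint:
#   import urllib.request
#   req = urllib.request.Request(f"{BASE_URL}/chat/completions",
#                                data=json.dumps(body).encode(),
#                                headers={"Content-Type": "application/json"})
#   resp = json.load(urllib.request.urlopen(req))
#   print(resp["choices"][0]["message"]["content"])
```

Because the interface mirrors the OpenAI API, existing client code can typically be pointed at a NIM endpoint by changing only the base URL and model name.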

Llama 3.1 405B

Llama 3.1 405B is the largest openly available model, suited to a range of use cases including synthetic data generation. The Llama 3.1 405B NIM microservice can be downloaded from the NVIDIA API catalog and run anywhere.

Simulation NIM Microservices

New NVIDIA USD NIM microservices offer the ability to leverage generative AI copilots and agents to develop Universal Scene Description (OpenUSD) tools that accelerate the creation of 3D worlds.

USD Code

USD Code is a state-of-the-art LLM that answers OpenUSD knowledge queries and generates USD-Python code.

USD Search

USD Search provides AI-powered search for OpenUSD data, 3D models, images, and assets using text- or image-based inputs.

USD Validate

USD Validate verifies the compatibility of OpenUSD assets, combining instant RTX rendering with rule-based validation.

Video Conferencing NIM Microservices

NVIDIA Maxine simplifies the deployment of AI features that enhance audio, video, and augmented reality effects for video conferencing and telepresence.

Maxine Audio2Face-2D

Maxine Audio2Face-2D animates a 2D image in real time using speech audio. It enables head pose animation for natural delivery and can be coupled with chatbot output or translated speech.

Eye Contact

The NVIDIA Maxine Eye Contact NIM microservice uses AI to apply a real-time filter to the user's webcam feed, redirecting their eye gaze toward the camera to enhance the user experience.

Accelerate AI Application Development

NVIDIA NIM streamlines the creation of complex AI applications by enabling the integration of specialized microservices across domains. Using NIM microservices, organizations can bypass the complexities of building AI models from scratch, saving time and resources. This allows for the assembly of customized AI solutions that meet specific business needs.

For example, a company can combine ACE NIM microservices, including speech recognition, with LLM NIM microservices to create digital humans for personalized customer service across industries such as healthcare, finance, and retail.

NIM microservices can also be integrated into supply chain management systems, combining cuOpt NIM microservice for route optimization with NeMo Retriever NIM microservices for retrieval-augmented generation and LLM NIM microservices for business communication.
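The composition described above (retriever feeding an LLM) can be sketched as a pipeline of interchangeable stages. The stub retriever and generator below are hypothetical placeholders; a real pipeline would call the NeMo Retriever and LLM NIM endpoints at those two points.

```python
def rag_answer(question, retrieve, generate, k=3):
    """Compose retrieval and generation: fetch supporting passages,
    then ask the LLM to answer grounded in them."""
    passages = retrieve(question, k)
    context = "\n".join(f"- {p}" for p in passages)
    prompt = (f"Answer using only this context:\n{context}\n\n"
              f"Question: {question}")
    return generate(prompt)

# Stub stages standing in for NIM endpoint calls.
kb = ["Routes are optimized nightly.", "Deliveries ship from Reno."]
retrieve = lambda q, k: kb[:k]
generate = lambda prompt: f"(model answer grounded in {prompt.count('- ')} passages)"

print(rag_answer("Where do deliveries ship from?", retrieve, generate))
# → (model answer grounded in 2 passages)
```

Keeping each stage behind a plain function boundary is what lets individual microservices be swapped (for example, exchanging the generator for a larger Llama 3.1 model) without touching the rest of the pipeline.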

Get Started

NVIDIA NIM empowers enterprises to fully harness AI, accelerating innovation, maintaining a competitive edge, and delivering superior customer experiences. Explore the latest AI models available with NIM microservices and discover how these powerful tools can transform your business.
