NVIDIA Unleashes NeMo Microservices to Supercharge AI Agents

TechWire
25 April 2025 · 3 min read
NeMo Microservices

NVIDIA announced on April 23, 2025 that NeMo Microservices is now publicly available, offering developers and enterprises a modular platform to build intelligent AI Agents that learn from experience and leverage existing data for deeper insights.

NVIDIA today announced the general availability of NeMo (Neural Modules) Microservices, a modular platform for building and customizing generative AI models and AI Agents in enterprise environments (CIO).

In a post on NVIDIA’s Developer Forum, the company said, “Today, we announced the availability of the NVIDIA NeMo™ microservices—an end-to-end, fully accelerated platform for building data flywheels, ensuring that AI Agents continuously deliver peak performance” (NVIDIA Developer Forums).


The initial release includes NeMo Customizer, which accelerates large language model fine-tuning, delivering up to 1.8× higher training throughput; NeMo Evaluator, which simplifies model evaluation to just five API calls; and NeMo Guardrails, which enforces compliance and security measures while adding only about half a second of latency.
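For illustration only, the sketch below shows what a client-side workflow against microservices of this kind could look like. The host name, endpoint paths and payload fields are hypothetical placeholders, not NVIDIA's documented API; the official NeMo Microservices documentation defines the real interface.

import requests

# Hypothetical host and routes; a real NeMo Microservices deployment defines its own.
BASE_URL = "http://nemo.example.internal"

# Submit a fine-tuning (customization) job; placeholder payload, not the documented schema.
customization = requests.post(
    f"{BASE_URL}/v1/customization/jobs",
    json={
        "base_model": "mistral-7b",        # model to fine-tune
        "dataset": "support-tickets-v1",   # named training dataset
        "hyperparameters": {"epochs": 3},
    },
    timeout=30,
)
job_id = customization.json()["job_id"]

# Evaluate the resulting checkpoint against a benchmark suite.
evaluation = requests.post(
    f"{BASE_URL}/v1/evaluation/jobs",
    json={"model": job_id, "benchmark": "customer-support-qa"},
    timeout=30,
)
print(evaluation.json())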

According to the company, NeMo Microservices integrate seamlessly into the NVIDIA AI Enterprise platform and can run on any accelerated computing infrastructure, both on-premises and in cloud environments, while providing enterprise-grade security, stability and support.

NeMo Microservices support a broad set of models and frameworks, including Meta’s Llama, Microsoft’s Phi family, Google’s Gemma and Mistral’s models, as well as NVIDIA’s Llama Nemotron Ultra. Leading AI frameworks such as LangChain, Haystack, LlamaIndex and Llamastack can interface with the microservices, and technology partners including Cloudera, DataStax, DataRobot, SuperAnnotate and Weights & Biases have embedded NeMo components into their offerings.

Enterprise adopters have begun deploying NeMo Microservices in production. AT&T used NeMo Customizer and Evaluator to fine-tune a Mistral 7B model for customer support, improving agent accuracy by 40%. BlackRock integrated the microservices into its Aladdin platform to unify investment management workflows through a common data language. Cisco’s Outshift team, working with Galileo, reported 40% fewer tool selection errors and up to 10× faster response times in its coding assistant pilot.

NVIDIA views NeMo Microservices as a key building block for agentic AI systems that leverage data flywheels—a continuous loop where interaction data guides model refinement, evaluation and redeployment with guardrails to maintain performance and compliance (CIO).
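To make the flywheel idea concrete, here is a minimal conceptual sketch of one such loop in Python. Every function is a hypothetical stub standing in for a call to a customization, evaluation or deployment service, and the quality threshold is an arbitrary assumption rather than anything NVIDIA prescribes.

# Conceptual data-flywheel loop; all functions are hypothetical stubs, not NVIDIA APIs.

def collect_interactions(model_id: str) -> list[dict]:
    """Gather recent production prompts and responses for the deployed model."""
    return []  # placeholder

def fine_tune(model_id: str, interactions: list[dict]) -> str:
    """Submit a customization job and return the candidate model's identifier."""
    return model_id + "-refined"  # placeholder

def evaluate(model_id: str) -> float:
    """Score the candidate on a benchmark; higher is better."""
    return 1.0  # placeholder

def deploy_with_guardrails(model_id: str) -> None:
    """Redeploy the candidate behind policy and safety guardrails."""

def flywheel_iteration(current_model: str, quality_bar: float = 0.9) -> str:
    interactions = collect_interactions(current_model)  # 1. harvest interaction data
    candidate = fine_tune(current_model, interactions)  # 2. refine the model
    if evaluate(candidate) >= quality_bar:               # 3. evaluate before promotion
        deploy_with_guardrails(candidate)                # 4. redeploy with guardrails
        return candidate
    return current_model  # keep the existing model if the candidate falls short

Each pass through such a loop feeds production interaction data back into refinement and re-evaluation, which is the continuous cycle the company describes.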

The NeMo Microservices announcement follows NVIDIA’s March launch of its Llama Nemotron family of open reasoning models and the AI-Q Blueprint for enterprise data flywheels, underscoring the company’s strategy to combine hardware and software solutions for the next generation of enterprise AI (NVIDIA Investor Relations).

NVIDIA made NeMo Microservices immediately available for download via its Developer Forum and the NGC container registry. Developers and enterprises can access the services as part of the NVIDIA AI Enterprise platform, with ongoing updates and expanded toolsets planned throughout 2025.