Today at the VMware Explore 2025 General Session, we shared several exciting private AI developments.
It’s been three years since we first introduced private AI as an important market category. While we have our own offering, we always knew that private AI was bigger than Broadcom. The need to be able to deploy and operate AI models within your private cloud – while preserving privacy, flexibility and control – matters to all industries across all geographies. Greater government interest globally around sovereign AI is driving new data residency requirements while accelerating private AI market growth. In the past six months, multiple hyperscalers have introduced solutions that bring their AI services on-premises to meet growing demand for private AI use cases.
Even in the face of increased hyperscaler competition, customers are buying our solution because, in addition to better enabling privacy and control, it also offers choice and a more vibrant ecosystem. As the AI market continues to evolve, our customers are able to quickly pivot to new models or accelerators from their vendors or providers of choice. They don’t want to make a large infrastructure investment and have it bound to a single AI solution. They want the freedom to switch to whatever best meets their needs in the future without having to start over from scratch. When you combine the AI choice and flexibility we provide with the best on-premises private cloud platform in VMware Cloud Foundation (VCF), it’s no wonder customers see tremendous value in partnering with us. In fact, today we shared that we’ve added over 80 new customers in the past year, including Loomis, NTT Data, Rio Tinto, The University of Bristol, and the United States Senate Federal Credit Union.
Customers have shared that the TCO of running VCF Private AI services can be as much as 3-5x better than competing solutions. They also like that they can run the most demanding NVIDIA deployments, such as NVIDIA HGX servers on VCF, scaling to serve thousands to tens of thousands of tokens per second for AI inference, all while maintaining a single pane of glass for managing and operating all apps and services.
To streamline adoption and enable our customers to onboard new services at the speed demanded by AI innovation, we announced that customers are now entitled to VMware’s Private AI services as part of their VCF subscription. With this, organizations get everything from advanced infrastructure scheduling and load balancing delivered by our Distributed Resource Scheduler (DRS) to automation blueprints that deploy complex AI services in a matter of minutes, all with optimized device drivers, accelerator-specific kernels, and more. Customers will also get all of the new AI capabilities introduced in VCF 9.0 that are required to build an end-to-end AI service, including our model runtime, model store for model governance, data indexing and retrieval service, vector database, API gateway, and AI agent builder.
And we’re certainly not slowing down. In fact, we previewed several new features that will be available in VCF in the future, including an integrated AI support assistant which was demonstrated at Explore earlier today.
Introducing VCF Intelligent Assist
From the screen below, you can see that with our AI-driven VCF Intelligent Assist, IT administrators will be able to diagnose and resolve challenges they run into. The assistant points them to the documents or KB articles that will help them resolve their issue, and going forward we will also add automated remediation. Further, we architected the solution to allow choice of language model, including models that can be deployed on-premises as well as models hosted in the cloud.
We are sharing all of the code samples used to create our support service demo from the Explore General Session. Learn more from our blog “Building your GenAI Agents on VCF: Getting Started” from Tasha Drew.
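To make the pattern behind such an assistant concrete, here is a minimal, self-contained sketch of the retrieval step: score KB articles against an administrator's question and surface the best matches as context for a language model. This is an illustration only, not the VCF implementation; the article titles, bodies, and function names are hypothetical.

```python
# Sketch of a support assistant's retrieval step: rank KB articles
# by keyword overlap with the admin's question, then hand the top
# matches to a language model as context. All KB content below is
# a hypothetical placeholder, not real VMware documentation.

def tokenize(text: str) -> set[str]:
    return {w.strip(".,?!").lower() for w in text.split()}

def rank_kb_articles(question: str, articles: list[dict], top_k: int = 2) -> list[dict]:
    q = tokenize(question)
    scored = [(len(q & tokenize(a["title"] + " " + a["body"])), a) for a in articles]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep only articles that share at least one term with the question.
    return [a for score, a in scored[:top_k] if score > 0]

kb = [
    {"title": "Resolving vMotion timeouts", "body": "Check network latency between hosts."},
    {"title": "GPU driver mismatch on ESXi", "body": "Align the vGPU driver with the host driver."},
]

hits = rank_kb_articles("Why does my vGPU driver fail after an ESXi upgrade?", kb)
for article in hits:
    print(article["title"])  # prints "GPU driver mismatch on ESXi"
```

A production assistant would replace the keyword overlap with embedding similarity against a vector database, but the flow (question in, ranked grounding documents out, model answer last) is the same.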

New VCF Private AI Services Features
On stage at Explore, we demonstrated several new features that will be available soon, including:
- Model Context Protocol (MCP) - MCP has gained massive momentum as the standard method for integrating AI services with a variety of structured and unstructured data sets. MCP servers can be built quickly, but speed can bring risk. With VCF Private AI services, you’ll have the ability to govern and secure MCP-based data collection across your organization.
- Multi-accelerator Model runtime – We have continued to expand the capabilities of our model runtime, allowing organizations to deploy AI models to NVIDIA and AMD GPUs, as well as to CPUs. This allows infrastructure teams to deploy models to a variety of accelerators without having to refactor AI applications.
- Multi-tenant Models-as-a-service - In many on-premises AI deployments, IT teams have resorted to deploying separate instances of the same model as a way to ensure that private data is not compromised. The challenge with this approach is that it can dramatically drive up costs and power consumption. We solve that by allowing tenants or separate lines of business to access shared models all while providing controls to enforce privacy and security.
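One way to picture the multi-tenant pattern described above: a single shared model instance fronted by per-tenant policy checks, so separate lines of business share GPU capacity without being able to reach each other's data. The sketch below is hypothetical, with illustrative class, tenant, and collection names; it is not the VCF implementation.

```python
# Hypothetical sketch of multi-tenant models-as-a-service: one shared
# model instance serves every tenant, and a per-tenant policy gate
# decides which data collections each tenant may query. All names
# here are illustrative placeholders.

class SharedModelService:
    def __init__(self):
        # A single model instance (stubbed) serves all tenants,
        # avoiding one dedicated deployment per line of business.
        self._infer = lambda prompt: f"completion for: {prompt}"
        # Per-tenant policy: which data collections a tenant may query.
        self._allowed_collections = {
            "finance": {"ledger-docs"},
            "hr": {"policy-docs"},
        }

    def generate(self, tenant: str, collection: str, prompt: str) -> str:
        allowed = self._allowed_collections.get(tenant, set())
        if collection not in allowed:
            raise PermissionError(f"{tenant} may not query {collection}")
        # In a real system, retrieval would be scoped to the tenant's
        # collection here, before the shared model is invoked.
        return self._infer(prompt)

svc = SharedModelService()
print(svc.generate("finance", "ledger-docs", "Summarize Q3 spend"))
```

The cost and power savings come from consolidating model replicas; the privacy guarantee comes from enforcing the tenant boundary in front of the shared instance rather than by duplicating it.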

New Integrations with NVIDIA and AMD
Finally, we have also been very busy with partners. With NVIDIA and AMD, we announced the following new capabilities:
- Expanded GPU support: VCF will support NVIDIA’s next-generation GPU architecture, engineered for massive AI training, inference, and high-performance computing (HPC), including support for NVIDIA Blackwell 200 (B200) and NVIDIA RTX™ PRO 6000 Server Edition GPUs.
- GPU Passthrough support: Our customers have shared with us that they like to use vGPU to share GPU capacity in some use cases and for other use cases prefer passthrough (e.g., when a single model is loaded onto one or more GPUs). We are working with NVIDIA to ensure that all customer design patterns are supported.
- High-Speed Networking with DirectPath I/O: VCF will incorporate support for NVIDIA ConnectX®-7 and NVIDIA BlueField®-3 400G NICs with DirectPath I/O. This enables customers to leverage advanced capabilities like GPUDirect® RDMA and GPUDirect Storage for high-speed, multi-host AI model training and data transfer, crucial for demanding Generative AI workloads.
- Support for AMD Instinct MI350 Series GPUs: We’re also excited to announce the expansion of our longstanding partnership with AMD by offering future virtualization support for AMD Instinct MI350 Series GPUs. This allows organizations to deploy and run AMD GPUs, as well as leverage AMD Enterprise AI software, while serving AI models using our model runtime, all while maintaining the massive operational and consolidation benefits provided through our virtualization technology.
Looking Ahead
This past year has been extremely exciting. Not only have we made it simple for our customers and partners to build, deploy and operate their own AI services, we are also deeply integrating AI into VCF. Our AI roadmap is entirely customer- and partner-driven, so please reach out if there is more you need or if you have any questions. Expect even more innovations and partnership announcements from us in the coming months. In the meantime, take VCF Private AI Services for a spin and let us know what you think!
Ready to get started on your AI journey with VMware Private AI? Check out these helpful resources:
- Learn more about VMware Private AI
- Learn more about VMware Private AI Foundation with NVIDIA
- Explore VMware Private AI Ecosystem Partners
- Contact us using this Interest Request form.
- Connect with us on Twitter at @VMwareVCF and on LinkedIn at VMware VCF