While practically all organizations are looking to take advantage of AI technologies, their concerns are generally threefold: protecting their intellectual property, keeping their private data from being shared externally, and maintaining complete control over access to their AI models.
These concerns are driving the necessity of Private AI. Private AI is an architectural approach that aims to balance the business gains from AI with the practical privacy and compliance needs of the organization, and comprises the following core tenets:
- Highly distributed: Compute capacity and trained AI models reside adjacent to where data is created, processed, and/or consumed, whether that is in a public cloud, virtual private cloud, enterprise data center, or at the edge. This requires an AI infrastructure capable of seamlessly connecting disparate data locales to ensure centralized governance and operations for all AI services.
- Data privacy and control: An organization’s data remains private to the organization and is not used to train, tune, or augment any commercial or OSS models without the organization’s consent. The organization maintains full control of its data and can leverage other AI models for a shared data set as its business needs require.
- Access control and auditability: Controls are in place to govern who can view and change AI models, associated training data, and applications. Audit logs and associated controls are also essential to ensure that compliance mandates – regulatory or otherwise – are satisfied.
To be clear, Private AI is about the platform and infrastructure architecture built in support of AI, which can be deployed in public clouds, virtual private clouds, data centers, and edge sites. Private clouds can be architected to satisfy the requirements of Private AI but are not a requirement; what is important is that the privacy and control requirements are satisfied, regardless of where AI models and data are deployed.
VMware’s Role in Private AI
Since our inception, VMware has worked to make it simple for organizations to run and manage heterogeneous workloads on shared infrastructure pools, reducing costs and energy consumption while improving security, compliance, privacy, and resiliency. The challenges of making Private AI successful fall squarely within areas where VMware has decades of engineering expertise. To that end, we see our role in Private AI as going beyond the core tenets listed above to include the following additional benefits:
- Choice: Organizations can change commercial AI services or leverage different open source models as business needs evolve. In addition, organizations might desire to leverage models trained in a public cloud for inferencing against their private data sets that reside in another locale. Further, the organization can curate its own catalog of models trained on private data representing a variety of business functions. This can help an organization build more differentiated capabilities to stand out from the competition.
- Confidentiality: Training sets, model weights, and inference data can be protected by modern confidential computing constructs, enabling data confidentiality at rest, in flight, and in use. In the future, our solutions aim to support trust management with a portable, vendor-independent, cloud-independent developer and management framework that allows customers to retain control over their trusted computing base.
- Performance: Performance matches and, in some ML inference use cases, even exceeds bare metal, as demonstrated in recent industry benchmarks.
- Unified management and operations: Management and operations are unified across Private AI and all other enterprise services, reducing total cost of ownership, complexity, and additive risk.
- Time-to-value: AI environments can be spun up and torn down in seconds, so available resources are provisioned the moment they are needed.
- Lower cost/Increased efficiency: The platform maximizes utilization of all compute resources (e.g., GPU and CPU), intelligently making use of idle capacity to lower overall costs. As a result, organizations can pool existing compute resources spread across teams into a shared environment that is used efficiently under policy and governance enforced by IT teams.
VMware Private AI Foundation with NVIDIA
VMware is collaborating with NVIDIA to offer VMware Private AI Foundation with NVIDIA. This solution will enable enterprises to fine-tune large language models, produce more secure and private models for internal use, offer generative AI as a service to their users, and run inference workloads at scale more securely. The solution is built on VMware Cloud Foundation and NVIDIA AI Enterprise, and offers the following benefits:
- Data-Center Scale: Multi-GPU IO pathing allows AI workloads to scale across up to 16 vGPUs/GPUs in a single virtual machine and across multiple nodes to speed generative AI model customization and deployment (see the training sketch after this list).
- Accelerated Storage: VMware vSAN Express Storage Architecture provides performance-optimized NVMe storage and supports GPUDirect Storage over RDMA, allowing direct IO transfer from storage to GPUs without CPU involvement.
- vSphere Deep Learning VM images: Enable fast prototyping by providing a stable, turnkey image with frameworks and libraries pre-installed alongside version-compatible drivers.
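To make the multi-GPU scaling concrete, here is a minimal, hypothetical PyTorch DistributedDataParallel sketch of the kind of training job that could run inside a single VM exposing many vGPUs. The model, data, and hyperparameters are placeholders, not part of the VMware solution; it assumes a standard PyTorch install launched with torchrun.

```python
# Minimal DDP sketch. Launch with, e.g.:
#   torchrun --nproc_per_node=16 train.py
# on a VM that exposes 16 vGPUs. Model and data below are stand-ins.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")   # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(4096, 4096).cuda(local_rank)  # placeholder model
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):                       # placeholder training loop
        x = torch.randn(32, 4096, device=local_rank)
        loss = model(x).pow(2).mean()
        opt.zero_grad()
        loss.backward()                       # gradients all-reduced across GPUs
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```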
The solution will feature NVIDIA NeMo, an end-to-end, cloud-native framework included in NVIDIA AI Enterprise — the operating system of the NVIDIA AI platform — that allows enterprises to build, customize and deploy generative AI models anywhere. NeMo combines customization frameworks, guardrail toolkits, data curation tools and pre-trained models to offer enterprises an easy, cost-effective and fast way to adopt generative AI.
To learn more about our exciting upcoming solution with NVIDIA, see Introducing VMware Private AI Foundation.
VMware Private AI Reference Architecture
VMware AI Labs has worked across VMware R&D and with our industry partners to create a solution for AI services that enables privacy and control of corporate data, choice of open source and commercial AI solutions, quick time-to-value, and integrated security and management. The Private AI Reference Architecture offers customers and partners the flexibility to:
- Leverage best-of-breed models, frameworks, app and data services, tooling and hardware tailored to their business needs.
- Realize quick time-to-value by leveraging a fully documented architecture and associated code samples.
- Take advantage of popular open source projects such as ray.io, Kubeflow, PyTorch, pgvector, and models served from Hugging Face (a sketch of local model serving follows this list).
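As one illustration of bringing an open-source model to your data, the following minimal, hypothetical sketch pulls a model from Hugging Face once and then runs inference entirely on private infrastructure. The model name is only an example; adapt it, and the offline setting, to whatever your model catalog approves.

```python
# Minimal sketch of private, local inference with a Hugging Face model.
# Requires: pip install transformers accelerate torch
import os

# Once the first download is cached, this keeps the Hub client fully offline
# so no further calls leave your environment. Comment out for the initial pull.
os.environ["HF_HUB_OFFLINE"] = "1"

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # example model only

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Summarize the key tenets of Private AI in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```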
The reference architecture includes the flexibility to start with Kubeflow – a very popular open source MLOps toolkit for Kubernetes – or to take advantage of the many commercial MLOps tools from VMware partners (e.g., Anyscale, cnvrg.io, Domino Data Lab, NVIDIA, One Convergence, Run:ai, and Weights & Biases). PyTorch – the most popular open source framework for generative AI – is included, but again customers and partners are free to choose what’s best for their needs. Ray is the most popular open source AI cluster scheduler and orchestration tool; it allows organizations to quickly scale out AI compute across distributed systems and is used in some of the largest AI deployments today, such as at OpenAI.
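The sketch below shows that basic Ray pattern at work: a minimal, hypothetical example that fans batches of work out across whatever cluster ray.init() attaches to. The score_batch function and its data are illustrative placeholders.

```python
# Minimal Ray sketch: fan batches of work out across a cluster.
# Requires: pip install ray
import ray

# With no address, this starts a local Ray instance; pass
# address="ray://<head-node>:10001" to attach to a remote cluster instead.
ray.init()

@ray.remote  # add e.g. num_gpus=1 to pin each task to a GPU in the pool
def score_batch(batch):
    # Placeholder for real model inference on one shard of data.
    return [len(item) for item in batch]

batches = [["alpha", "beta"], ["gamma"], ["delta", "epsilon"]]
futures = [score_batch.remote(b) for b in batches]  # tasks run in parallel
print(ray.get(futures))  # [[5, 4], [5], [5, 7]]
```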
VMware is partnering with Anyscale to expand the reach of Ray to on-premises use cases. Most importantly, our Ray integration with VMware vCenter allows IT organizations to quickly onboard new AI ISVs, because their software can run on VMware Private AI infrastructure via Ray; native VMware API integrations are not required. You can see the breadth of the Ray AI software ecosystem here.
Finally, our collaboration with Hugging Face will provide speed and simplicity in how you bring open source models to your data. Our collaboration has resulted in today’s Hugging Face SafeCoder announcement, which allows organizations to offer AI-assisted software development with their software code repos remaining on-premises. VMware has been running this solution in our own data centers for the past couple of months, and we have seen extremely impressive results in cost, scale, and developer productivity.
To get started with VMware Private AI, consult our reference architecture and code samples.
With VMware Private AI Foundation with NVIDIA, our reference architecture, and a large partner ecosystem, there is no reason to debate trade-offs in choice, privacy, and control. You can have it all, and future-proof your AI infrastructure to take advantage of the next disruptive AI model that will inevitably come along. Let’s go!
This article may contain hyperlinks to non-VMware websites that are created and maintained by third parties who are solely responsible for the content on such websites.