Edgeless Systems and NVIDIA Enhance AI Security with Continuum AI Framework
Edgeless Systems has introduced Continuum AI, a pioneering generative AI (GenAI) framework designed to keep prompts encrypted at all times through confidential computing. This innovative solution integrates confidential virtual machines (VMs) with NVIDIA H100 GPUs and secure sandboxing, according to the NVIDIA Technical Blog.
Ensuring Data Privacy and Security
The launch of Continuum AI marks a significant advancement in AI deployment, enabling businesses to leverage powerful large language models (LLMs) without compromising data privacy and security. By collaborating with NVIDIA, Edgeless Systems aims to empower organizations across various sectors to integrate AI securely. This platform is not just a technological breakthrough but also a critical step towards a future where AI can be utilized securely, even with the most sensitive data.
Security Goals of Continuum AI
Continuum AI has two primary security objectives: to protect user data and to safeguard AI model weights against infrastructure and service providers. Infrastructure encompasses all the underlying hardware and software stacks that an AI application runs on, such as cloud platforms like Microsoft Azure. Service providers control the AI applications, such as OpenAI for ChatGPT.
How Continuum AI Operates
Continuum AI relies on two core mechanisms: confidential computing and advanced sandboxing. Confidential computing is a hardware-based technology that keeps data encrypted even during processing and makes the integrity of workloads verifiable. This approach, powered by NVIDIA H100 Tensor Core GPUs, creates a secure environment that shields data and models from both infrastructure and service providers. It also supports popular AI inference services like the NVIDIA Triton Inference Server.
Despite these security measures, AI code from third parties could potentially leak prompts accidentally or maliciously. A thorough review of AI code is impractical due to its complexity and frequent updates. Continuum addresses this by running AI code inside a sandbox on a confidential computing-protected AI worker, using an adapted version of Google's gVisor sandbox. Because the sandbox's only connection to the outside world is the encryption proxy, only encrypted prompts and responses ever cross its boundary, preventing plaintext data leaks.
System Architecture
Continuum AI consists of two main components: the server side, which hosts the AI service and processes prompts securely, and the client side, which encrypts prompts and verifies the server. The server-side architecture includes worker nodes and an attestation service.
Worker nodes, central to the backend, host AI models and serve inference requests. Each worker operates within a confidential VM (CVM) running Continuum OS, a minimal operating system whose integrity can be verified through remote attestation. The CVM hosts workloads in a sandbox and mediates all network traffic through an encryption proxy, ensuring secure data handling.
The attestation service ensures the integrity and authenticity of worker nodes, allowing both service providers and clients to verify that they are interacting with a secure deployment. The service runs in a CVM and manages key exchanges for prompt encryption.
Workflow and User Interaction
Admins verify the attestation service's integrity through the CLI and configure AI code using the worker API. Verified workers receive inference secrets and can serve requests securely. Users interact with the attestation service and worker nodes, verifying deployments and sending encrypted prompts for processing. The encryption proxy decrypts these prompts, processes them in the sandbox, and re-encrypts responses before sending them back to users.
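The user-facing flow can be condensed into a sketch. All names here are hypothetical, and the XOR cipher with a SHA-256-derived keystream is a toy stand-in for the authenticated encryption a real deployment would use; it only illustrates that plaintext exists solely at the user's machine and inside the protected worker.

```python
import hashlib
import secrets

def xor_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Toy symmetric cipher: XOR with a SHA-256-derived keystream.
    # Illustration only; a real system would use an AEAD such as AES-GCM.
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def user_send_prompt(key: bytes, prompt: bytes) -> bytes:
    # 1. The user encrypts the prompt before it leaves their machine.
    nonce = secrets.token_bytes(16)
    return nonce + xor_cipher(key, nonce, prompt)

def worker_process(key: bytes, blob: bytes, model) -> bytes:
    # 2. The encryption proxy decrypts inside the CVM,
    # 3. the sandboxed model handles the plaintext,
    # 4. the proxy re-encrypts the response before it leaves.
    nonce, ciphertext = blob[:16], blob[16:]
    prompt = xor_cipher(key, nonce, ciphertext)
    response = model(prompt)
    out_nonce = secrets.token_bytes(16)
    return out_nonce + xor_cipher(key, out_nonce, response)

def user_read_response(key: bytes, blob: bytes) -> bytes:
    # 5. The user decrypts the response locally.
    nonce, ciphertext = blob[:16], blob[16:]
    return xor_cipher(key, nonce, ciphertext)
```

In this sketch the shared key would be established only after the attestation check succeeds, which is what binds the encrypted channel to a verified deployment.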
For further insights into this cutting-edge technology, visit the Continuum page.