near-ai-cloud skill

This skill helps you integrate verifiable private AI inference with NEAR AI Cloud, ensuring encrypted prompts, attested hardware, and signed responses.

npx playbooks add skill near/agent-skills --skill near-ai-cloud

SKILL.md

---
name: near-ai-cloud
description: NEAR AI Cloud private inference and verification. Use when integrating NEAR AI Cloud API for verifiable private AI inference, verifying model or gateway TEE attestation (NVIDIA NRAS, Intel TDX), verifying chat message signatures, implementing end-to-end encrypted chat, or using the OpenAI-compatible API with NEAR AI Cloud.
metadata:
  author: near
  version: "1.0.0"
---

# NEAR AI Cloud

Verifiable private AI inference through Trusted Execution Environments (TEEs). All inference runs inside Intel TDX confidential VMs with NVIDIA TEE GPUs — your data stays encrypted and isolated from infrastructure providers, model providers, and NEAR itself.

## Quick Start

The API is OpenAI-compatible. Point any OpenAI SDK at `https://cloud-api.near.ai/v1`:

```python
import openai

client = openai.OpenAI(
    base_url="https://cloud-api.near.ai/v1",
    api_key="YOUR_API_KEY"  # from cloud.near.ai dashboard
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3.1",
    messages=[{"role": "user", "content": "Hello, NEAR AI!"}]
)
print(response.choices[0].message.content)
```

```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
    baseURL: 'https://cloud-api.near.ai/v1',
    apiKey: 'YOUR_API_KEY',
});

const completion = await openai.chat.completions.create({
    model: 'deepseek-ai/DeepSeek-V3.1',
    messages: [{ role: 'user', content: 'Hello, NEAR AI!' }]
});
console.log(completion.choices[0].message.content);
```

## How It Works

- All inference runs inside **Intel TDX** confidential VMs with **NVIDIA TEE** GPUs
- TLS terminates **inside the TEE**, not at a load balancer, so prompts are never exposed in plaintext outside it
- TEEs generate **cryptographic attestation proofs** verifiable via NVIDIA NRAS and Intel TDX
- Every chat response is **signed by a key that never leaves the TEE**
- You can independently verify hardware attestation and bind it to message signatures

## Verification Flow

```
1. Generate nonce
2. Request model attestation  →  get signing_address, nvidia_payload, intel_quote
3. Verify GPU attestation     →  submit nvidia_payload to NVIDIA NRAS, check JWT fields
4. Verify CPU attestation     →  verify intel_quote via dcap-qvl or TEE Explorer
5. Verify GPU-CPU binding     →  signing_address + nonce bound in TDX report data; same nonce in NRAS eat_nonce
6. Make chat request          →  use the API as normal
7. Fetch chat signature       →  GET /v1/signature/{chat_id}
8. Verify signature           →  recover signer, compare to attested signing_address
```
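Steps 1 and 2 of the flow can be sketched as follows. This is a minimal sketch, assuming the attestation endpoint accepts `nonce` and `signing_algo` as query parameters alongside `model`. Those two parameter names and the helper names `new_nonce` and `attestation_url` are illustrative and not confirmed by this file.

```python
import secrets
from urllib.parse import urlencode

BASE_URL = "https://cloud-api.near.ai"


def new_nonce() -> str:
    """Return 32 random bytes as a 64-char hex string."""
    return secrets.token_hex(32)


def attestation_url(model: str, nonce: str, signing_algo: str = "ecdsa") -> str:
    """Build the model attestation URL (the request itself is not sent here)."""
    # Assumption: `nonce` and `signing_algo` are passed as query parameters.
    query = urlencode({"model": model, "nonce": nonce, "signing_algo": signing_algo})
    return f"{BASE_URL}/v1/attestation/report?{query}"


nonce = new_nonce()
url = attestation_url("deepseek-ai/DeepSeek-V3.1", nonce)
print(url)
```

Keep the nonce around after the request: the later binding check compares it against the values embedded in the attestation artifacts.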

## API Endpoints

Base URL: `https://cloud-api.near.ai`

| Endpoint                               | Method | Description                        |
|----------------------------------------|--------|------------------------------------|
| `/v1/chat/completions`                 | POST   | OpenAI-compatible chat completions |
| `/v1/models`                           | GET    | List available models              |
| `/v1/attestation/report?model={model}` | GET    | Model attestation (GPU + CPU)      |
| `/v1/attestation/report`               | GET    | Gateway attestation                |
| `/v1/signature/{chat_id}`              | GET    | Chat message signature             |
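
For endpoints that OpenAI SDKs do not wrap, such as the signature endpoint, a plain HTTP request works. The sketch below only builds the request. It assumes the API key is sent as a Bearer token, as OpenAI-compatible clients do. `signature_request` and the chat id `chatcmpl-example` are hypothetical names used for illustration.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://cloud-api.near.ai"


def signature_request(chat_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) GET /v1/signature/{chat_id}."""
    # Percent-encode the id so it cannot alter the URL path.
    url = f"{BASE_URL}/v1/signature/{quote(chat_id, safe='')}"
    # Assumption: Bearer-token authentication, as used by OpenAI SDKs.
    headers = {"Authorization": f"Bearer {api_key}"}
    return urllib.request.Request(url, headers=headers, method="GET")


req = signature_request("chatcmpl-example", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` sends it; the JSON body can then be parsed with `json.load`.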

## Critical Knowledge

- Base URL for OpenAI SDKs is `https://cloud-api.near.ai/v1` (the `/v1` prefix is part of the base URL)
- `signing_algo` can be `ecdsa` or `ed25519`
- Nonce should be a random 64-char hex string (32 bytes) for attestation freshness
- NRAS response is a two-part array: `[["JWT", "..."], {"GPU-0": "..."}]` — overall JWT + per-GPU JWTs
- The `signing_address` from model attestation **must match** the address that signed chat messages
- Chat signatures are persistent and can be queried at any time after completion
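
The two-part NRAS response and the `eat_nonce` check can be sketched as below. This only decodes JWT payloads to read their claims; it does not verify the JWT signatures, which still has to be done against NVIDIA's published keys. The helper names are hypothetical, and the assumption that `eat_nonce` is a top-level claim of the overall JWT comes from the flow described above.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode a JWT's payload segment WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    # Restore the base64 padding that JWTs strip.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def split_nras_response(nras: list) -> tuple[str, dict]:
    """Split [["JWT", "<overall>"], {"GPU-0": "<jwt>", ...}] into its parts."""
    (label, overall_jwt), per_gpu = nras
    if label != "JWT":
        raise ValueError(f"unexpected NRAS label: {label!r}")
    return overall_jwt, per_gpu


def nonce_matches(nras: list, nonce: str) -> bool:
    """Check that the overall JWT carries the nonce we generated."""
    overall_jwt, _ = split_nras_response(nras)
    claims = decode_jwt_payload(overall_jwt)
    return str(claims.get("eat_nonce", "")).lower() == nonce.lower()
```

A mismatch here means the GPU attestation was not produced for this request and should be rejected.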

## References

| Topic                            | File                                                                 |
|----------------------------------|----------------------------------------------------------------------|
| **Private vs Anonymised Models** | [references/private-vs-anonymised.md](references/model-list.md)      |
| **Model TEE verification**       | [references/model-verification.md](references/model-verification.md) |

**Planned:**

- Gateway verification (TDX attestation for the API gateway + source provenance)
- Chat verification (request/response hashing + signature verification)
- E2E encrypted chat (ECDH key exchange, AES-256-GCM / ChaCha20-Poly1305)
- OpenAI compatibility (streaming, reasoning models, Files API)

## Resources

- NEAR AI Cloud: https://cloud.near.ai
- Documentation: https://docs.near.ai/cloud/introduction
- Verification Example: https://github.com/near-examples/nearai-cloud-verification-example
- Full Verifier: https://github.com/nearai/nearai-cloud-verifier
- NVIDIA NRAS API: https://docs.api.nvidia.com/attestation/reference/attestmultigpu_1
- TEE Attestation Explorer: https://proof.t16z.com/
- DCAP QVL (TDX verification): https://github.com/Phala-Network/dcap-qvl

Overview

This skill integrates NEAR AI Cloud for verifiable private AI inference using Trusted Execution Environments. It provides an OpenAI-compatible API endpoint, attestation verification for Intel TDX and NVIDIA NRAS, and persistent chat signature verification. Use it to run private model inference where data, model execution, and responses are cryptographically verifiable.

How this skill works

Requests are routed to Intel TDX confidential VMs with NVIDIA TEE GPUs. TLS terminates inside the TEE, so prompts are never exposed in plaintext outside it. The platform issues attestation artifacts (NVIDIA NRAS payloads and Intel quotes) and a signing_address tied to an in-TEE key; each chat response is signed by that key. You verify freshness with a nonce, validate GPU and CPU attestations, bind attestations to the signing_address, then recover the signer from the persistent chat signature and compare it to that address.
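
The binding and comparison steps can be sketched as below. This is a sketch under a stated assumption: the 64-byte TDX report data is laid out as the signing address right-padded with zero bytes to 32 bytes, followed by the 32-byte nonce. That layout is not specified on this page and should be checked against the verifier repositories listed under Resources. Signature recovery itself needs a crypto library and is left out; `report_data_binds` and `same_signer` are hypothetical helper names.

```python
import hmac


def report_data_binds(report_data_hex: str, signing_address: str, nonce_hex: str) -> bool:
    """Check that TDX report data embeds the signing address and our nonce."""
    report_data = bytes.fromhex(report_data_hex.removeprefix("0x"))
    address = bytes.fromhex(signing_address.removeprefix("0x"))
    # Assumed layout: address padded to 32 bytes, then the 32-byte nonce.
    expected = address.ljust(32, b"\x00") + bytes.fromhex(nonce_hex)
    return hmac.compare_digest(report_data, expected)


def same_signer(recovered: str, attested: str) -> bool:
    """Compare hex addresses, ignoring case and an optional 0x prefix."""
    normalize = lambda a: a.lower().removeprefix("0x")
    return normalize(recovered) == normalize(attested)
```

Only trust a response when both checks pass: the attestation binds the address to this nonce, and the recovered signer equals that address.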

When to use it

- You need verifiable, private inference where cloud operators and model providers cannot read inputs.
- You must verify hardware attestation for regulatory or compliance reasons (Intel TDX, NVIDIA NRAS).
- You want OpenAI-compatible integration but with cryptographic assurances and message signatures.
- You’re implementing end-to-end encrypted chat or need to bind responses to a specific TEE instance.
- You need persistent, auditable chat signatures for post-hoc verification.

Best practices

- Always generate a high-entropy nonce (64 hex chars / 32 bytes) per attestation request to prevent replay.
- Verify both NVIDIA NRAS JWTs and Intel TDX quotes using official validators (NRAS APIs, DCAP QVL or trusted attestation explorers).
- Confirm signing_address from the attestation report matches the recovered signer from the chat signature before trusting output.
- Use standard OpenAI SDKs pointed at https://cloud-api.near.ai/v1 to minimize integration work; treat the service as an OpenAI-compatible backend.
- Store attestation reports, nonces, and chat signatures alongside any audit logs to enable later verification.
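
One way to follow the storage practice is an append-only JSON Lines file, sketched below. The `VerificationRecord` type, its field names, and the file layout are hypothetical choices for illustration and not part of the NEAR AI Cloud API.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class VerificationRecord:
    """Everything needed to re-verify one chat completion later."""
    chat_id: str
    model: str
    nonce: str
    attestation_report: dict
    chat_signature: dict
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def append_record(path: str, record: VerificationRecord) -> None:
    """Append one record as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record), sort_keys=True) + "\n")


def load_records(path: str) -> list[dict]:
    """Read all stored records back as dictionaries."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Storing the raw attestation report and signature, rather than only a pass/fail flag, lets an auditor repeat the verification independently.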

Example use cases

- Securely running a customer’s PII-sensitive inference workloads while proving the run occurred inside a TEE.
- Verifying a model provider’s TEE attestation before using responses in regulated workflows.
- Implementing an end-to-end encrypted chat where the cloud signs messages and you verify the signer and attestation chain.
- Replacing an OpenAI endpoint with a TEE-backed drop-in that provides cryptographic proof of execution and signed outputs.
- Auditing historical chat completions by fetching persistent signatures and validating them against stored attestation reports.

FAQ

What base URL do I use with existing OpenAI SDKs?

Point SDKs at https://cloud-api.near.ai/v1 and supply your NEAR AI Cloud API key in place of an OpenAI key.

How do I prove a chat response came from the attested TEE?

Fetch the model attestation report (signing_address + NRAS payload + Intel quote), perform GPU and CPU verification, then GET /v1/signature/{chat_id} and verify the recovered signer matches signing_address.