---
name: basilica
description: Interact with the Basilica GPU compute platform for rentals, deployments, and account actions. Use when the user mentions Basilica, GPU rentals, deployments, balance, or the basilica CLI/SDK.
---

# Basilica Platform

You are an expert operator for the Basilica GPU compute platform. Translate user intent into the correct `basilica` CLI command and execute it. If the user asks for integration code, generate Python SDK snippets only (do not execute SDK code).

## Operating Modes

### 1) CLI (default)
- Convert the user's request into the appropriate `basilica` CLI command.
- Execute the command and summarize the results.
- Use `--json` if you need to parse programmatically.

### 2) Python SDK (code generation only)
- Provide a runnable code snippet.
- Do NOT execute SDK code.

## CLI Location

```bash
~/.local/bin/basilica
```

If you are unsure, resolve it at runtime:

```bash
command -v basilica
```

## CLI Installation + Login

```bash
# Install CLI (macOS / Linux)
curl -sSL https://basilica.ai/install.sh | bash

# Login
basilica login
```

## Request Parsing

Identify the operation type and map to a command.

### Balance & Account

| User Request | Command |
|-------------|---------|
| "what's my balance?" | `basilica balance` |
| "check balance" | `basilica balance` |
| "how much credit do I have?" | `basilica balance` |
| "show my account" | `basilica balance` |
| "fund my account" | `basilica fund` |
| "show deposit address" | `basilica fund` |
| "list my deposits" | `basilica fund list` |
| "create API token" | `basilica tokens create` |
| "list my tokens" | `basilica tokens list` |
| "add SSH key" | `basilica ssh-keys add` |
| "upgrade CLI" | `basilica upgrade` |

### GPU Rentals

| User Request | Command |
|-------------|---------|
| "list available GPUs" | `basilica ls` |
| "show H100 GPUs" | `basilica ls h100` |
| "show cheap GPUs under $2/hr" | `basilica ls --price-max 2` |
| "list my rentals" | `basilica ps` |
| "show active instances" | `basilica ps` |
| "show rental history" | `basilica ps --history` |
| "status of <uid>" | `basilica status <uid>` |
| "show logs for <uid>" | `basilica logs <uid>` |
| "stop rental <uid>" | `basilica down <uid>` |
| "stop all rentals" | `basilica down --all` |
| "restart <uid>" | `basilica restart <uid>` |

### Deployments

| User Request | Command |
|-------------|---------|
| "list my deployments" | `basilica deploy ls` |
| "show deployments" | `basilica deploy ls` |
| "status of deployment <name>" | `basilica deploy status <name>` |
| "logs for deployment <name>" | `basilica deploy logs <name>` |
| "delete deployment <name>" | `basilica deploy delete <name> -y` |
| "stop deployment <name>" | `basilica deploy delete <name> -y` |
| "scale <name> to 3 replicas" | `basilica deploy scale <name> --replicas 3` |
| "deploy vLLM with llama" | `basilica deploy vllm meta-llama/Llama-3-8b` |
| "deploy sglang model" | `basilica deploy sglang Qwen/Qwen2.5-0.5B-Instruct` |

## Command Reference

### Account

```bash
# Check balance
basilica balance

# Show deposit address for funding
basilica fund

# List deposit history
basilica fund list --limit 100
```

### Authentication & Tokens

```bash
# Login
basilica login
basilica login --device-code  # For WSL/SSH/containers

# Logout
basilica logout

# API tokens
basilica tokens create <name>
basilica tokens list
basilica tokens revoke <name> -y

# SSH keys (for secure cloud rentals)
basilica ssh-keys add -n my-key -f ~/.ssh/id_ed25519.pub
basilica ssh-keys list
basilica ssh-keys delete -y
```

### Upgrade CLI

```bash
basilica upgrade
basilica upgrade --version 0.5.4
basilica upgrade --dry-run
```

### List Available GPUs

```bash
# All available GPUs
basilica ls

# Filter by GPU type
basilica ls h100
basilica ls a100

# Filter by price
basilica ls --price-max 2.50

# Filter by GPU count
basilica ls --gpu-min 4 --gpu-max 8

# Filter by memory
basilica ls --memory-min 80

# Combine filters
basilica ls h100 --gpu-min 4 --price-max 5
```
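
When filters come from a parsed user request, it can help to assemble the `basilica ls` argument list programmatically. The following is a minimal sketch: `build_ls_command` is a hypothetical helper (not part of the CLI or SDK), it emits only the flags shown above, and it assumes the GPU type is passed as a lowercase positional argument, as in the examples.

```python
from typing import List, Optional


def build_ls_command(
    gpu_type: Optional[str] = None,
    gpu_min: Optional[int] = None,
    gpu_max: Optional[int] = None,
    price_max: Optional[float] = None,
    memory_min: Optional[int] = None,
) -> List[str]:
    """Assemble a `basilica ls` argument list from optional filters."""
    cmd = ["basilica", "ls"]
    if gpu_type:
        cmd.append(gpu_type.lower())  # positional GPU type, e.g. "h100"
    # Only flags that appear in this reference are emitted.
    for flag, value in (
        ("--gpu-min", gpu_min),
        ("--gpu-max", gpu_max),
        ("--price-max", price_max),
        ("--memory-min", memory_min),
    ):
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


print(build_ls_command("H100", gpu_min=4, price_max=5))
```

Returning a list rather than a single string keeps the result safe to pass to `subprocess.run` without shell quoting.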

### Manage Rentals

```bash
# List active rentals
basilica ps

# Filter by status
basilica ps --status active
basilica ps --status stopped
basilica ps --status failed

# Rental history
basilica ps --history

# Check specific rental
basilica status <uid>

# View logs
basilica logs <uid>
basilica logs <uid> --follow
basilica logs <uid> --tail 100

# Stop rental
basilica down <uid>

# Stop ALL rentals
basilica down --all

# Restart
basilica restart <uid>

# SSH into instance
basilica ssh <uid>

# Execute command (target optional if only one rental)
basilica exec "nvidia-smi"
basilica exec "python train.py" --target <uid>

# Copy files
basilica cp local_file.py <uid>:/workspace/
basilica cp <uid>:/workspace/output.txt ./
```

### Manage Deployments

```bash
# List deployments
basilica deploy ls

# Check status
basilica deploy status <name>

# View logs
basilica deploy logs <name>
basilica deploy logs <name> -f        # Follow
basilica deploy logs <name> --tail 100

# Scale
basilica deploy scale <name> --replicas 3

# Delete
basilica deploy delete <name> -y
```

### Start GPU Rental

```bash
# Basic H100 rental
basilica up h100

# Multiple GPUs
basilica up 4xh100

# With SSH key
basilica up h100 --ssh-key ~/.ssh/id_rsa.pub

# Community cloud with Docker image
basilica up --compute community-cloud --image pytorch/pytorch:latest

# Detached mode (don't auto-SSH)
basilica up h100 -d
```

### Deploy Application

```bash
# Deploy Python file
basilica deploy app.py

# Deploy with GPU
basilica deploy app.py --gpu 1 --gpu-model H100

# Deploy vLLM
basilica deploy vllm meta-llama/Llama-3-8b --gpu 1

# Deploy with storage
basilica deploy app.py --storage --storage-path /data

# Deploy with pip packages
basilica deploy app.py --pip fastapi uvicorn

# Deploy with env vars
basilica deploy app.py -e API_KEY=secret -e DEBUG=true

# Deploy SGLang
basilica deploy sglang Qwen/Qwen2.5-0.5B-Instruct

# Detached (don't wait for ready)
basilica deploy app.py --detach
```

## Workflow

1. Parse the user's natural language request.
2. Identify the operation type (balance, list, stop, etc.).
3. Extract any parameters (UIDs, names, filters).
4. Construct the `basilica` command.
5. Execute the command.
6. Summarize results and suggest next actions if relevant.
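
Steps 1 to 4 can be sketched as a small keyword lookup. This is an illustration only: `RULES` and `to_command` are hypothetical names, the rules cover just a handful of rows from the Request Parsing tables, and real requests will need more careful parsing (for example, extracting UIDs and deployment names).

```python
from typing import Optional

# (required keywords, command) pairs taken from the Request Parsing tables.
# The first matching rule wins, so specific phrases come before general ones.
RULES = [
    (("list", "deposits"), "basilica fund list"),
    (("balance",), "basilica balance"),
    (("credit",), "basilica balance"),
    (("fund",), "basilica fund"),
    (("rental", "history"), "basilica ps --history"),
    (("stop", "all", "rentals"), "basilica down --all"),
    (("list", "deployments"), "basilica deploy ls"),
    (("rentals",), "basilica ps"),
    (("available", "gpus"), "basilica ls"),
]


def to_command(request: str) -> Optional[str]:
    """Return the CLI command for a request, or None if nothing matches."""
    text = request.lower()
    for keywords, command in RULES:
        if all(word in text for word in keywords):
            return command
    return None


print(to_command("what's my balance?"))
print(to_command("stop all rentals"))
```

Returning `None` for unmatched requests leaves room to ask the user for clarification instead of guessing a command.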

## Output Formatting

- Use `--json` when parsing output programmatically.
- Default table output is human-readable.
- Use `-v` for verbose output when debugging.
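
As a rough sketch of parsing `--json` output, the helper below extracts deployment names from a JSON string. It assumes the output is a JSON array of objects with a `name` field, which is what the `jq -r '.[].name'` pipelines elsewhere in this document imply; `deployment_names` and the sample payload are hypothetical.

```python
import json
from typing import List


def deployment_names(raw: str) -> List[str]:
    """Extract names from JSON shaped like `basilica deploy ls --json` output."""
    items = json.loads(raw)
    # Skip entries without a name rather than failing on them.
    return [item["name"] for item in items if "name" in item]


# Hypothetical sample; the real output may carry more fields.
sample = '[{"name": "my-api"}, {"name": "hello"}]'
print(deployment_names(sample))
```

In practice the string could come from `subprocess.run(["basilica", "deploy", "ls", "--json"], capture_output=True, text=True, check=True).stdout`.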

## Common Patterns

**Stop everything:**
```bash
# Stop all GPU rentals
basilica down --all

# Delete all deployments (must be done individually)
basilica deploy ls --json | jq -r '.[].name' | xargs -I {} basilica deploy delete {} -y
```

**Check resource usage:**
```bash
# Balance
basilica balance

# Active rentals
basilica ps

# Active deployments
basilica deploy ls
```

---

# Python SDK (Reference for Code Generation)

Provide code snippets only. Do not execute SDK code.

## Installation

```bash
uv pip install basilica-sdk
```

**Requirements:** Python 3.10+

## Authentication

```bash
# Create an API token
basilica tokens create <name>

# Set environment variable
export BASILICA_API_TOKEN="basilica_..."
```

Or pass directly:
```python
from basilica import BasilicaClient

client = BasilicaClient(api_key="basilica_...")
```

## SDK Request Parsing

Parse the user's natural language and generate the corresponding SDK code snippet.

### Deployments

| User Request | SDK Code |
|-------------|----------|
| "deploy my app" | `client.deploy("my-app", source="app.py", port=8000)` |
| "deploy this code" | `client.deploy("name", source="inline code...", port=8000)` |
| "deploy fastapi app" | `client.deploy("api", source="app.py", port=8000, pip_packages=["fastapi", "uvicorn"])` |
| "deploy with GPU" | `client.deploy("ml", source="train.py", gpu_count=1, image="pytorch/pytorch:latest")` |
| "deploy with storage" | `client.deploy("app", source="app.py", storage=True)` |
| "deploy vllm/llama" | `client.deploy_vllm("meta-llama/Llama-2-7b")` |
| "deploy sglang model" | `client.deploy_sglang("Qwen/Qwen2.5-0.5B-Instruct")` |
| "delete deployment X" | `client.get("X").delete()` |
| "list my deployments" | `client.list()` |
| "get deployment logs" | `client.get("name").logs(tail=100)` |
| "check deployment status" | `client.get("name").status()` |

### Account & Nodes

| User Request | SDK Code |
|-------------|----------|
| "check balance" | `client.get_balance()` |
| "list available GPUs" | `client.list_nodes(available=True)` |
| "find H100 nodes" | `client.list_nodes(gpu_type="H100")` |
| "nodes with 80GB VRAM" | `client.list_nodes(min_gpu_memory=80)` |

## SDK Quick Reference

### Basic Deployment

```python
from basilica import BasilicaClient

client = BasilicaClient()

# Deploy from file
deployment = client.deploy(
    name="my-api",
    source="app.py",
    port=8000,
    pip_packages=["fastapi", "uvicorn"],
    ttl_seconds=600,  # Auto-delete after 10 minutes
)

print(f"Live at: {deployment.url}")
print(deployment.logs(tail=50))
deployment.delete()
```

### Deploy Inline Code

```python
deployment = client.deploy(
    name="hello",
    source="""
from http.server import HTTPServer, BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b'Hello from Basilica!')

HTTPServer(('', 8000), Handler).serve_forever()
""",
    port=8000,
)
```

### GPU Deployment

```python
deployment = client.deploy(
    name="pytorch-train",
    source="train.py",
    image="pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime",
    port=8000,
    gpu_count=1,
    gpu_models=["A100", "H100"],  # Optional: specific models
    memory="16Gi",
    storage=True,  # Persistent storage at /data
)
```

### vLLM Inference Server

```python
deployment = client.deploy_vllm(
    model="meta-llama/Llama-2-7b",
    # gpu_count auto-detected from model size
    storage=True,  # Cache models at /root/.cache
    ttl_seconds=3600,
)

print(f"OpenAI API: {deployment.url}/v1/chat/completions")
```

### SGLang Server

```python
deployment = client.deploy_sglang(
    model="Qwen/Qwen2.5-0.5B-Instruct",
    tensor_parallel_size=1,
    trust_remote_code=True,
)
```

### Decorator API

```python
import basilica

@basilica.deployment(
    name="my-service",
    port=8000,
    pip_packages=["fastapi", "uvicorn"],
    ttl_seconds=600,
)
def serve():
    from fastapi import FastAPI
    import uvicorn

    app = FastAPI()

    @app.get("/")
    def root():
        return {"status": "running"}

    uvicorn.run(app, host="0.0.0.0", port=8000)

# Deploy by calling the function
deployment = serve()
print(f"Live at: {deployment.url}")
```

### With Volumes

```python
import basilica

cache = basilica.Volume.from_name("my-cache", create_if_missing=True)

@basilica.deployment(
    name="app-with-storage",
    port=8000,
    volumes={"/data": cache},
)
def serve():
    # App can read/write to /data
    pass
```

### Deploy Docker Image (No Source)

```python
deployment = client.deploy(
    name="nginx",
    image="nginxinc/nginx-unprivileged:alpine",
    port=8080,
    cpu="250m",
    memory="256Mi",
)
```

## deploy() Parameters

```python
client.deploy(
    name="my-app",              # Required: DNS-safe name
    source="app.py",            # File path, inline code, or callable
    image="python:3.11-slim",   # Container image
    port=8000,                  # Application port
    env={"KEY": "value"},       # Environment variables
    cpu="500m",                 # CPU (500m = 0.5 cores)
    memory="512Mi",             # Memory (512Mi, 1Gi, etc.)
    storage=True,               # Enable storage at /data (or "/custom/path")
    gpu_count=1,                # Number of GPUs
    gpu_models=["A100"],        # Acceptable GPU models
    min_gpu_memory_gb=40,       # Minimum GPU VRAM
    replicas=1,                 # Number of instances
    ttl_seconds=3600,           # Auto-delete timeout
    public=True,                # Create public URL
    timeout=300,                # Deployment wait timeout
    pip_packages=["pkg"],       # pip dependencies
)
```
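
The `cpu` and `memory` strings above look like Kubernetes-style resource quantities. Assuming that convention holds here (the source confirms only that `500m` means 0.5 cores), the arithmetic can be sketched as follows; `cpu_cores` and `memory_bytes` are hypothetical helpers for sanity-checking values before deploying.

```python
def cpu_cores(quantity: str) -> float:
    """Convert a CPU quantity such as "500m" or "1" to a number of cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000  # millicores to cores
    return float(quantity)


# Binary suffixes, assumed to follow the Kubernetes convention.
_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "512Mi" or "1Gi" to bytes."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain byte count


print(cpu_cores("500m"))      # 0.5
print(memory_bytes("512Mi"))  # 536870912
```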

## Deployment Object

```python
deployment = client.deploy(...)

deployment.name           # Instance name
deployment.url            # Public URL
deployment.state          # Current state
deployment.status()       # Get detailed status
deployment.logs(tail=100) # Get logs
deployment.delete()       # Delete deployment
deployment.refresh()      # Refresh state from API
deployment.wait_until_ready(timeout=300)
```

## Exception Handling

```python
from basilica import (
    BasilicaError,        # Base exception
    AuthenticationError,  # Invalid/missing token
    ValidationError,      # Invalid parameters
    DeploymentNotFound,   # Deployment doesn't exist
    DeploymentTimeout,    # Timeout waiting for ready
    DeploymentFailed,     # Deployment crashed
    ResourceError,        # Resource unavailable (no GPUs)
    StorageError,         # Storage configuration error
    NetworkError,         # API communication error
)

try:
    deployment = client.deploy(...)
except DeploymentTimeout:
    print("Deployment took too long to start")
except DeploymentFailed as e:
    print(f"Deployment failed: {e}")
except AuthenticationError:
    print("Invalid API token")
```

## Low-Level API

```python
# Create deployment with full control
response = client.create_deployment(
    instance_name="my-app",
    image="python:3.11-slim",
    command=["python", "-m", "http.server", "8000"],
    port=8000,
    cpu="1",
    memory="1Gi",
)

# Direct API methods
client.get_deployment("name")
client.delete_deployment("name")
client.list_deployments()
client.get_deployment_logs("name", tail=100)
```

## GPU Rentals (SSH Access)

```python
# List available nodes
nodes = client.list_nodes(gpu_type="A100", min_gpu_count=1)

# Start rental
rental = client.start_rental(
    gpu_type="A100",
    container_image="pytorch/pytorch:latest",
)

# Get SSH credentials
status = client.get_rental(rental.rental_id)
print(f"SSH: {status.ssh_credentials.username}@{status.ssh_credentials.host}")

# Stop rental
client.stop_rental(rental.rental_id)
```

## Async API

All methods have async variants:

```python
import asyncio
from basilica import BasilicaClient

async def main():
    client = BasilicaClient()

    # Async deployment
    deployment = await client.deploy_async("my-app", source="app.py")

    # Async operations
    status = await deployment.status_async()
    logs = await deployment.logs_async(tail=50)
    await deployment.delete_async()

    # Concurrent deployments
    tasks = [
        client.deploy_async("app1", source="a.py"),
        client.deploy_async("app2", source="b.py"),
    ]
    deployments = await asyncio.gather(*tasks)

asyncio.run(main())
```

## Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `BASILICA_API_TOKEN` | API authentication token | None (required unless `api_key` is passed to `BasilicaClient`) |
| `BASILICA_API_URL` | API endpoint URL | `https://api.basilica.ai` |

---

## Troubleshooting

### Authentication Errors

```bash
# "Not authenticated" or "Invalid token"
basilica logout
basilica login

# Token expired (re-authenticate)
basilica login

# Device flow for headless environments (WSL, SSH, containers)
basilica login --device-code

# API token issues - regenerate
basilica tokens revoke old-token -y
basilica tokens create new-token
export BASILICA_API_TOKEN="basilica_..."
```

### Rate Limits

```bash
# Error: "Rate limit exceeded"
# Wait 60 seconds and retry, or reduce request frequency

# For bulk operations, add delays:
for name in $(basilica deploy ls --json | jq -r '.[].name'); do
  basilica deploy delete "$name" -y
  sleep 2
done
```
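
The wait-and-retry advice can also be expressed as a small wrapper for generated scripts. This sketch assumes the failure surfaces as an exception whose message contains "Rate limit exceeded" (the text quoted above); `retry_on_rate_limit` is a hypothetical helper, and the `sleep` parameter exists only so the delay can be replaced in tests.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_rate_limit(
    action: Callable[[], T],
    attempts: int = 3,
    wait_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action`, retrying when the error text mentions a rate limit."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:
            rate_limited = "rate limit" in str(exc).lower()
            # Re-raise unrelated errors, and give up after the last attempt.
            if not rate_limited or attempt == attempts:
                raise
            sleep(wait_seconds)
    raise AssertionError("unreachable")  # the loop always returns or raises


# Demo: fails twice with a rate-limit error, then succeeds.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("Rate limit exceeded")
    return "ok"


result = retry_on_rate_limit(flaky, sleep=lambda seconds: None)
print(result, calls["n"])
```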

### Common Errors

| Error | Cause | Fix |
|-------|-------|-----|
| "No SSH key registered" | Secure cloud requires SSH key | `basilica ssh-keys add` |
| "Insufficient balance" | Not enough funds | `basilica fund` to deposit |
| "No available nodes" | No matching GPUs available | Relax filters or try later |
| "Deployment timeout" | App took too long to start | Check logs: `basilica deploy logs <name>` |
| "Name already exists" | Deployment name collision | Use `--name` with unique name |

## Multi-GPU & Distributed Training

### Single Node Multi-GPU

```bash
# 4x H100 on one node
basilica up 4xh100

# Deploy with multiple GPUs
basilica deploy train.py --gpu 4 --gpu-model H100 --memory 64Gi
```

### Multi-Node (via multiple rentals)

```bash
# Start multiple rentals for distributed training
basilica up h100 --name node-0 -d
basilica up h100 --name node-1 -d
basilica up h100 --name node-2 -d

# Get IPs for torch.distributed setup
basilica ps --json | jq -r '.[] | "\(.name): \(.ssh_host)"'

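# Execute on each node with its own --node_rank (0, 1, 2); every node must
# point at the same rendezvous address (here node-0's IP, as a placeholder)
basilica exec "torchrun --nnodes=3 --node_rank=0 --master_addr=<node-0-ip> --master_port=29500 train.py" --target <node-0-uid>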
```
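
Each node needs the same `torchrun` invocation with a different `--node_rank`. A minimal sketch for generating those command strings follows. `torchrun_commands` is a hypothetical helper, the address `10.0.0.1` and port `29500` are placeholder values, and it assumes node 0 is reachable from the other nodes at `master_addr`.

```python
from typing import List


def torchrun_commands(
    nnodes: int,
    master_addr: str,
    script: str = "train.py",
    master_port: int = 29500,
) -> List[str]:
    """Build one torchrun command per node, differing only in --node_rank."""
    return [
        f"torchrun --nnodes={nnodes} --node_rank={rank} "
        f"--master_addr={master_addr} --master_port={master_port} {script}"
        for rank in range(nnodes)
    ]


for command in torchrun_commands(3, "10.0.0.1"):
    print(command)
```

Each string can then be passed to `basilica exec "<command>" --target <uid>` for the matching node.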

### vLLM with Tensor Parallelism

```bash
# Multi-GPU inference with tensor parallelism
basilica deploy vllm meta-llama/Llama-3-70b \
  --gpu 4 \
  --tensor-parallel-size 4 \
  --memory 128Gi
```

## Deployment Edge Cases

### Name Collisions

```bash
# Error: "Deployment 'my-app' already exists"

# Option 1: Delete existing and redeploy
basilica deploy delete my-app -y
basilica deploy app.py --name my-app

# Option 2: Use unique name (timestamp suffix)
basilica deploy app.py --name "my-app-$(date +%s)"

# Option 3: Let CLI auto-generate name (omit --name)
basilica deploy app.py
```
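
When generating SDK code, the timestamp-suffix approach from Option 2 can be sketched in Python. `unique_name` is a hypothetical helper; it assumes "DNS-safe" means lowercase letters, digits, and hyphens, and it does not enforce a length limit.

```python
import re
import time
from typing import Optional


def unique_name(base: str, now: Optional[float] = None) -> str:
    """Return `base` with a Unix-timestamp suffix, like my-app-$(date +%s)."""
    stamp = int(time.time() if now is None else now)
    # Replace anything outside lowercase letters, digits, and hyphens.
    slug = re.sub(r"[^a-z0-9-]+", "-", base.lower()).strip("-")
    return f"{slug}-{stamp}"


print(unique_name("My App", now=1700000000))  # my-app-1700000000
```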

### Deployment Retries

```bash
# If deployment fails, check logs first
basilica deploy logs <name> --tail 200

# Common fixes:
# 1. Wrong port - app must listen on specified port
basilica deploy app.py --port 8080  # Match your app's port

# 2. Missing dependencies
basilica deploy app.py --pip flask gunicorn

# 3. Insufficient resources
basilica deploy app.py --memory 2Gi --cpu 1

# 4. GPU image mismatch
basilica deploy train.py --gpu 1 --image pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime
```

### Stuck Deployments

```bash
# Deployment stuck in "Pending" - check status
basilica deploy status <name>

# Force delete stuck deployment
basilica deploy delete <name> -y

# If delete hangs, wait for the grace period (default 30s) or contact support
```

## Terminating Stale Rentals

### Find Stale Rentals

```bash
# List all active rentals with timestamps
basilica ps --json | jq '.[] | {name, created_at, status}'

# Sort active rentals by creation time (oldest first) to spot long-running ones
basilica ps --json | jq '[.[] | select(.status == "active")] | sort_by(.created_at)'
```
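
To filter by age rather than just list rentals, the JSON can be post-processed in Python. This is a sketch under assumptions: it expects `basilica ps --json` to return an array of objects with `status` and `created_at` fields (as the `jq` filters above imply) and `created_at` to be an ISO 8601 timestamp. `stale_rentals` and the sample data are hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import List, Optional


def stale_rentals(
    raw: str,
    max_age_hours: float = 24,
    now: Optional[datetime] = None,
) -> List[dict]:
    """Return active rentals created more than `max_age_hours` ago."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=max_age_hours)
    stale = []
    for rental in json.loads(raw):
        if rental.get("status") != "active":
            continue
        # Python 3.10's fromisoformat() rejects a trailing "Z"; normalize it.
        created = datetime.fromisoformat(rental["created_at"].replace("Z", "+00:00"))
        if created.tzinfo is None:
            created = created.replace(tzinfo=timezone.utc)  # assume UTC
        if created < cutoff:
            stale.append((created, rental))
    # Oldest first.
    return [rental for _, rental in sorted(stale, key=lambda pair: pair[0])]


sample = json.dumps([
    {"name": "old", "status": "active", "created_at": "2024-01-01T00:00:00Z"},
    {"name": "new", "status": "active", "created_at": "2024-01-02T12:00:00Z"},
    {"name": "gone", "status": "stopped", "created_at": "2023-12-01T00:00:00Z"},
])
reference = datetime(2024, 1, 2, 18, 0, tzinfo=timezone.utc)
print([r["name"] for r in stale_rentals(sample, now=reference)])  # ['old']
```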

### Safe Termination

```bash
# Always check what's running first
basilica ps

# Stop specific rental
basilica down <uid>

# Stop ALL rentals (use with caution)
basilica down --all

# For community cloud rentals
basilica down --compute community-cloud --all
```

### Cleanup Script

```bash
#!/bin/bash
# cleanup-stale.sh - Terminate all rentals and deployments

echo "=== Active Rentals ==="
basilica ps

echo -e "\n=== Active Deployments ==="
basilica deploy ls

read -r -p "Terminate all? (y/N) " confirm
if [[ "$confirm" == "y" ]]; then
  # Stop all rentals
  basilica down --all 2>/dev/null || true
  
  # Delete all deployments
  for name in $(basilica deploy ls --json 2>/dev/null | jq -r '.[].name'); do
    echo "Deleting deployment: $name"
    basilica deploy delete "$name" -y
    sleep 1
  done
  
  echo "Cleanup complete."
fi
```

### Prevent Runaway Costs

```bash
# Use TTL for auto-cleanup (deployments)
basilica deploy app.py --ttl 3600  # Auto-delete after 1 hour

# Set billing alerts in the dashboard
# Monitor balance
basilica balance
```