azure-cosmos-db-py skill

This skill helps you implement production-grade Azure Cosmos DB services in Python with dual authentication, clean code, and TDD patterns.

npx playbooks add skill sickn33/antigravity-awesome-skills --skill azure-cosmos-db-py

SKILL.md
---
name: azure-cosmos-db-py
description: Build Azure Cosmos DB NoSQL services with Python/FastAPI following production-grade patterns. Use when implementing database client setup with dual auth (DefaultAzureCredential + emulator), service layer classes with CRUD operations, partition key strategies, parameterized queries, or TDD patterns for Cosmos. Triggers on phrases like "Cosmos DB", "NoSQL database", "document store", "add persistence", "database service layer", or "Python Cosmos SDK".
package: azure-cosmos
---

# Cosmos DB Service Implementation

Build production-grade Azure Cosmos DB NoSQL services following clean code, security best practices, and TDD principles.

## Installation

```bash
pip install azure-cosmos azure-identity
```

## Environment Variables

```bash
COSMOS_ENDPOINT=https://<account>.documents.azure.com:443/
COSMOS_DATABASE_NAME=<database-name>
COSMOS_CONTAINER_ID=<container-id>
# For emulator only (not production)
COSMOS_KEY=<emulator-key>
```

## Authentication

**DefaultAzureCredential (preferred)**:
```python
import os

from azure.cosmos import CosmosClient
from azure.identity import DefaultAzureCredential

client = CosmosClient(
    url=os.environ["COSMOS_ENDPOINT"],
    credential=DefaultAzureCredential()
)
```

**Emulator (local development)**:
```python
import os

from azure.cosmos import CosmosClient

client = CosmosClient(
    url="https://localhost:8081",
    credential=os.environ["COSMOS_KEY"],
    connection_verify=False,  # emulator uses a self-signed certificate; never disable in production
)
```

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                         FastAPI Router                          │
│  - Auth dependencies (get_current_user,                         │
│    get_current_user_required)                                   │
│  - HTTP error responses (HTTPException)                         │
└──────────────────────────────┬──────────────────────────────────┘
                               │
┌──────────────────────────────▼──────────────────────────────────┐
│                        Service Layer                            │
│  - Business logic and validation                                │
│  - Document ↔ Model conversion                                  │
│  - Graceful degradation when Cosmos unavailable                 │
└──────────────────────────────┬──────────────────────────────────┘
                               │
┌──────────────────────────────▼──────────────────────────────────┐
│                     Cosmos DB Client Module                     │
│  - Singleton container initialization                           │
│  - Dual auth: DefaultAzureCredential (Azure) / Key (emulator)   │
│  - Async wrapper via run_in_threadpool                          │
└─────────────────────────────────────────────────────────────────┘
```

## Quick Start

### 1. Client Module Setup

Create a singleton Cosmos client with dual authentication:

```python
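# db/cosmos.py
from azure.cosmos import CosmosClient
from azure.identity import DefaultAzureCredential
from starlette.concurrency import run_in_threadpool

from app.config import settings  # assumption: wherever your app defines its settings object

_cosmos_container = None

def _is_emulator_endpoint(endpoint: str) -> bool:
    return "localhost" in endpoint or "127.0.0.1" in endpoint

def _create_container():
    """Build the container client. Blocking: the SDK calls are synchronous."""
    if _is_emulator_endpoint(settings.cosmos_endpoint):
        # Emulator: key auth; its self-signed certificate fails TLS verification
        client = CosmosClient(
            url=settings.cosmos_endpoint,
            credential=settings.cosmos_key,
            connection_verify=False
        )
    else:
        # Azure: RBAC via DefaultAzureCredential, no keys
        client = CosmosClient(
            url=settings.cosmos_endpoint,
            credential=DefaultAzureCredential()
        )
    db = client.get_database_client(settings.cosmos_database_name)
    return db.get_container_client(settings.cosmos_container_id)

async def get_container():
    global _cosmos_container
    if _cosmos_container is None:
        # Run the blocking setup in a worker thread, off the event loop
        _cosmos_container = await run_in_threadpool(_create_container)
    return _cosmos_container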
```

**Full implementation**: See [references/client-setup.md](references/client-setup.md)
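
The substring test in `_is_emulator_endpoint` also matches hostnames that merely contain `localhost`, such as `notlocalhost.example.com`. One possible tightening, shown here as a sketch and not as the skill's reference implementation, compares the parsed hostname against an allow-list. The name `is_emulator_endpoint` and the `_EMULATOR_HOSTS` set are assumptions made for illustration.

```python
from urllib.parse import urlparse

# Assumption: hostnames treated as "local emulator". Extend this set for your
# own setup (for example, a Docker Compose service name).
_EMULATOR_HOSTS = {"localhost", "127.0.0.1", "::1"}

def is_emulator_endpoint(endpoint: str) -> bool:
    """Return True only when the URL's hostname is a known local host."""
    hostname = urlparse(endpoint).hostname  # None when the URL has no host part
    return hostname is not None and hostname.lower() in _EMULATOR_HOSTS
```

Because this check decides whether TLS verification is disabled, an exact hostname match is the more conservative choice: a false positive would turn off certificate checks against a remote host.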

### 2. Pydantic Model Hierarchy

Use a five-tier model pattern for clean separation:

```python
from datetime import datetime
from typing import Optional

from pydantic import BaseModel, Field

class ProjectBase(BaseModel):           # Shared fields
    name: str = Field(..., min_length=1, max_length=200)

class ProjectCreate(ProjectBase):       # Creation request
    workspace_id: str = Field(..., alias="workspaceId")

class ProjectUpdate(BaseModel):         # Partial updates (all optional)
    name: Optional[str] = Field(None, min_length=1)

class Project(ProjectBase):             # API response
    id: str
    created_at: datetime = Field(..., alias="createdAt")

class ProjectInDB(Project):             # Internal with docType
    doc_type: str = "project"
```

### 3. Service Layer Pattern

```python
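# get_container and get_document come from the client module (db/cosmos.py)
class ProjectService:
    async def _use_cosmos(self) -> bool:
        # get_container() is a coroutine function, so it must be awaited;
        # without await the check would always be True
        return (await get_container()) is not None

    async def get_by_id(self, project_id: str, workspace_id: str) -> Project | None:
        if not await self._use_cosmos():
            return None  # graceful degradation
        # Point read: document id plus partition key
        doc = await get_document(project_id, partition_key=workspace_id)
        if doc is None:
            return None
        return self._doc_to_model(doc)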
```

**Full patterns**: See [references/service-layer.md](references/service-layer.md)
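
The `get_document` helper called in `get_by_id` is not shown in this file. A minimal sketch of what such a helper might look like follows, under several assumptions: the container is passed in explicitly (the skill's version appears to resolve it internally), `asyncio.to_thread` stands in for Starlette's `run_in_threadpool` so the snippet needs only the standard library, and `ItemNotFound` is a placeholder for `azure.cosmos.exceptions.CosmosResourceNotFoundError`.

```python
import asyncio
from typing import Any, Optional

class ItemNotFound(Exception):
    """Placeholder for azure.cosmos.exceptions.CosmosResourceNotFoundError."""

async def get_document(container: Any, doc_id: str, partition_key: str) -> Optional[dict]:
    """Point-read one document without blocking the event loop."""
    try:
        # read_item is a blocking SDK call, so run it in a worker thread.
        return await asyncio.to_thread(
            container.read_item, item=doc_id, partition_key=partition_key
        )
    except ItemNotFound:
        # A missing document is an expected outcome, so report it as None.
        return None
```

Returning `None` for a missing item keeps the "not found" decision in the service or router layer, which can then choose the HTTP response.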

## Core Principles

### Security Requirements

1. **RBAC Authentication**: Use `DefaultAzureCredential` in Azure — never store keys in code
2. **Emulator-Only Keys**: Use key authentication only against the local emulator; its well-known key is public, so hardcoding it is acceptable for local development and nowhere else
3. **Parameterized Queries**: Always use `@parameter` syntax — never string concatenation
4. **Partition Key Validation**: Validate partition key access matches user authorization
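
To illustrate the parameterized-query rule, here is a sketch of a partition-scoped query. The helper name `list_projects_by_name` and the `docType` and `name` property names are assumptions; `container` is expected to be a Cosmos container client exposing `query_items`.

```python
from typing import Any

def list_projects_by_name(container: Any, workspace_id: str, name: str) -> list[dict]:
    """Query projects by name inside a single partition (hypothetical helper)."""
    # Placeholders only: no user-supplied text is spliced into the query string.
    query = "SELECT * FROM c WHERE c.docType = @docType AND c.name = @name"
    parameters = [
        {"name": "@docType", "value": "project"},
        {"name": "@name", "value": name},  # user input travels as data, not as SQL text
    ]
    items = container.query_items(
        query=query,
        parameters=parameters,
        partition_key=workspace_id,  # scopes the query to one logical partition
    )
    return list(items)
```

Since `query_items` is a blocking call in the synchronous SDK, an async service would typically run a helper like this through the same thread-pool wrapper it uses for point reads.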

### Clean Code Conventions

1. **Single Responsibility**: Client module handles connection; services handle business logic
2. **Graceful Degradation**: Services return `None`/`[]` when Cosmos is unavailable
3. **Consistent Naming**: `_doc_to_model()`, `_model_to_doc()`, `_use_cosmos()`
4. **Type Hints**: Full typing on all public methods
5. **CamelCase Aliases**: Use `Field(alias="camelCase")` for JSON serialization

### TDD Requirements

Write tests BEFORE implementation using these patterns:

```python
from unittest.mock import AsyncMock

import pytest

@pytest.fixture
def mock_cosmos_container(mocker):
    container = mocker.MagicMock()
    # get_container is async, so the patched replacement must be awaitable
    mocker.patch("app.db.cosmos.get_container", new=AsyncMock(return_value=container))
    return container

@pytest.mark.asyncio
async def test_get_project_by_id_returns_project(mock_cosmos_container):
    # Arrange
    mock_cosmos_container.read_item.return_value = {"id": "123", "name": "Test"}

    # Act (project_service is the ProjectService instance under test)
    result = await project_service.get_by_id("123", "workspace-1")

    # Assert
    assert result is not None
    assert result.id == "123"
    assert result.name == "Test"
```

**Full testing guide**: See [references/testing.md](references/testing.md)
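
The graceful-degradation path can be tested in the same spirit. The sketch below is self-contained and uses only the standard library; it swaps the module-level patching shown above for constructor injection so that it runs without the `app` package. The trimmed-down `ProjectService` here is a stand-in written for this example, not the skill's service class.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock

class ProjectService:
    """Stand-in service: the container getter is injected for testability."""

    def __init__(self, get_container):
        self._get_container = get_container

    async def get_by_id(self, project_id: str, workspace_id: str):
        container = await self._get_container()
        if container is None:
            return None  # graceful degradation when Cosmos is unavailable
        # A real service would run this blocking call in a thread pool.
        return container.read_item(item=project_id, partition_key=workspace_id)

def test_returns_none_when_cosmos_unavailable():
    service = ProjectService(get_container=AsyncMock(return_value=None))
    assert asyncio.run(service.get_by_id("123", "workspace-1")) is None

def test_reads_with_partition_key():
    container = MagicMock()
    container.read_item.return_value = {"id": "123", "name": "Test"}
    service = ProjectService(get_container=AsyncMock(return_value=container))

    result = asyncio.run(service.get_by_id("123", "workspace-1"))

    assert result == {"id": "123", "name": "Test"}
    container.read_item.assert_called_once_with(item="123", partition_key="workspace-1")
```

Asserting on the `partition_key` argument, and not only on the returned value, catches the case where a service accidentally reads from the wrong partition.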

## Reference Files

| File | When to Read |
|------|--------------|
| [references/client-setup.md](references/client-setup.md) | Setting up Cosmos client with dual auth, SSL config, singleton pattern |
| [references/service-layer.md](references/service-layer.md) | Implementing full service class with CRUD, conversions, graceful degradation |
| [references/testing.md](references/testing.md) | Writing pytest tests, mocking Cosmos, integration test setup |
| [references/partitioning.md](references/partitioning.md) | Choosing partition keys, cross-partition queries, move operations |
| [references/error-handling.md](references/error-handling.md) | Handling CosmosResourceNotFoundError, logging, HTTP error mapping |

## Template Files

| File | Purpose |
|------|---------|
| [assets/cosmos_client_template.py](assets/cosmos_client_template.py) | Ready-to-use client module |
| [assets/service_template.py](assets/service_template.py) | Service class skeleton |
| [assets/conftest_template.py](assets/conftest_template.py) | pytest fixtures for Cosmos mocking |

## Quality Attributes (NFRs)

### Reliability
- Graceful degradation when Cosmos is unavailable
- Retry logic with exponential backoff for transient failures
- Connection pooling via singleton pattern
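
As a rough illustration of retry with exponential backoff, the sketch below wraps an arbitrary callable. It is an application-level helper written for this example, not part of the azure-cosmos SDK (which has its own retry handling for some failures, such as throttling). `TransientError` is a placeholder for whichever SDK exceptions you decide to treat as transient, and the default attempt count and delays are arbitrary.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

class TransientError(Exception):
    """Placeholder for whichever errors you classify as transient."""

def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (0.5s, 1s, 2s, ...) plus up to 10% jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("max_attempts must be at least 1")
```

The `sleep` parameter exists so tests can record delays without waiting. This version blocks the calling thread, so inside an async service it would need to run in a worker thread or be rewritten around `asyncio.sleep`.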

### Security
- Zero secrets in code (RBAC via DefaultAzureCredential)
- Parameterized queries prevent injection
- Partition key isolation enforces data boundaries
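
Partition key isolation only protects data if the caller is checked against the partition before the query runs. A minimal sketch of such a check is shown below; `CurrentUser`, its fields, and `PartitionAccessDenied` are hypothetical names introduced for this example, and it assumes the workspace id is the partition key, as in the `get_by_id` example.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class CurrentUser:
    """Hypothetical authenticated principal; field names are assumptions."""
    user_id: str
    workspace_ids: frozenset = field(default_factory=frozenset)

class PartitionAccessDenied(Exception):
    """Raised when a caller requests a partition they are not authorized for."""

def authorize_partition(user: CurrentUser, workspace_id: str) -> str:
    """Return the partition key if the user may access it, otherwise raise."""
    if workspace_id not in user.workspace_ids:
        raise PartitionAccessDenied(
            f"user {user.user_id} cannot access workspace {workspace_id}"
        )
    return workspace_id
```

A router could call `authorize_partition` before invoking the service and translate `PartitionAccessDenied` into an `HTTPException`, for example with status 403.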

### Maintainability
- Five-tier model pattern enables schema evolution
- Service layer decouples business logic from storage
- Consistent patterns across all entity services

### Testability
- Dependency injection via `get_container()`
- Easy mocking with module-level globals
- Clear separation enables unit testing without Cosmos

### Performance
- Partition key queries avoid cross-partition scans
- Async wrapping prevents blocking the FastAPI event loop
- Minimal document conversion overhead

Overview

This skill teaches how to build production-grade Azure Cosmos DB NoSQL services in Python and FastAPI, with secure dual authentication (DefaultAzureCredential + emulator). It provides a tested architecture: a singleton client module, a clear service layer with CRUD, five-tier Pydantic models, and TDD-ready testing patterns. The focus is pragmatic: secure defaults, partition key strategies, parameterized queries, and graceful degradation when Cosmos is unavailable.

How this skill works

The implementation creates a singleton Cosmos client that selects authentication based on the endpoint: DefaultAzureCredential in Azure, or key authentication with TLS verification disabled for the local emulator. Service classes encapsulate business logic, perform document↔model conversions, validate partition keys, and run parameterized queries. Blocking SDK calls are wrapped so they do not block the FastAPI event loop, and tests mock the container via fixtures so services can be exercised without a live Cosmos instance.

When to use it

  • Building a production-ready Cosmos DB client module with RBAC and emulator support.
  • Implementing a service layer that exposes CRUD for a document store in FastAPI.
  • Enforcing partition key validation and parameterized queries for security.
  • Writing tests that mock Cosmos container interactions before integrating with Azure.
  • Migrating a simple datastore to a scalable NoSQL pattern with clear separation of concerns.

Best practices

  • Use DefaultAzureCredential for cloud deployments; use emulator key only for local development.
  • Keep a singleton container client and wrap blocking SDK calls with run_in_threadpool.
  • Always use parameterized queries (@param) to prevent injection.
  • Validate partition keys against user authorization and return graceful defaults if Cosmos is unavailable.
  • Follow a five-tier Pydantic model pattern (Base / Create / Update / Response / InDB) for schema evolution.
  • Write tests first and mock get_container() to isolate service logic from Cosmos.

Example use cases

  • Create a ProjectService that implements get_by_id, list_by_workspace, create, update, and delete with partition key enforcement.
  • Set up CI tests that run unit tests with mocked Cosmos container fixtures without requiring network access.
  • Switch between Azure RBAC and local emulator seamlessly during development and deployment.
  • Implement cross-service consistency by converting documents to typed Pydantic models before returning API responses.
  • Add retry with exponential backoff and graceful fallback responses when Cosmos is temporarily unreachable.

FAQ

How do I authenticate in production versus local development?

In production use DefaultAzureCredential; for local development use the emulator endpoint and key, disabling SSL verification only for the emulator.

How should I structure Pydantic models for DB and API?

Use a five-tier pattern: shared Base, Create, Update, API Response, and InDB with internal doc_type to keep concerns separated and support migrations.