
azure-storage-blob-py skill

/skills/azure-storage-blob-py

This skill helps you manage Azure Blob Storage in Python by uploading, downloading, listing, and organizing blobs and containers efficiently.

This is most likely a fork of the azure-storage-blob-py skill from openclaw.
npx playbooks add skill sickn33/antigravity-awesome-skills --skill azure-storage-blob-py

Review the files below or copy the command above to add this skill to your agents.

Files (1): SKILL.md (5.8 KB)
---
name: azure-storage-blob-py
description: |
  Azure Blob Storage SDK for Python. Use for uploading, downloading, listing blobs, managing containers, and blob lifecycle.
  Triggers: "blob storage", "BlobServiceClient", "ContainerClient", "BlobClient", "upload blob", "download blob".
package: azure-storage-blob
---

# Azure Blob Storage SDK for Python

Client library for Azure Blob Storage — object storage for unstructured data.

## Installation

```bash
pip install azure-storage-blob azure-identity
```

## Environment Variables

```bash
AZURE_STORAGE_ACCOUNT_NAME=<your-storage-account>
# Or use full URL
AZURE_STORAGE_ACCOUNT_URL=https://<account>.blob.core.windows.net
```
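
If you follow this convention, the account URL can be assembled from the environment at startup. A minimal sketch, assuming `AZURE_STORAGE_ACCOUNT_NAME` is set as shown above:

```python
import os

# Build the account URL from the environment
# (assumes AZURE_STORAGE_ACCOUNT_NAME is set; AZURE_STORAGE_ACCOUNT_URL overrides it)
account_name = os.environ["AZURE_STORAGE_ACCOUNT_NAME"]
account_url = os.environ.get(
    "AZURE_STORAGE_ACCOUNT_URL",
    f"https://{account_name}.blob.core.windows.net",
)
```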

## Authentication

```python
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

credential = DefaultAzureCredential()
account_url = "https://<account>.blob.core.windows.net"

blob_service_client = BlobServiceClient(account_url, credential=credential)
```

## Client Hierarchy

| Client | Purpose | Get From |
|--------|---------|----------|
| `BlobServiceClient` | Account-level operations | Direct instantiation |
| `ContainerClient` | Container operations | `blob_service_client.get_container_client()` |
| `BlobClient` | Single blob operations | `container_client.get_blob_client()` |
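
As a quick illustration of how the three clients relate, building on the authentication snippet above (container and blob names are placeholders):

```python
# Account-level client; narrower clients are derived from it
blob_service_client = BlobServiceClient(account_url, credential=credential)

container_client = blob_service_client.get_container_client("mycontainer")
blob_client = container_client.get_blob_client("sample.txt")

# A BlobClient can also be obtained directly from the service client
same_blob_client = blob_service_client.get_blob_client("mycontainer", "sample.txt")
```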

## Core Workflow

### Create Container

```python
container_client = blob_service_client.get_container_client("mycontainer")
container_client.create_container()
```
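
`create_container()` raises if the container already exists. A tolerant sketch using the SDK's `ResourceExistsError`:

```python
from azure.core.exceptions import ResourceExistsError

container_client = blob_service_client.get_container_client("mycontainer")
try:
    container_client.create_container()
except ResourceExistsError:
    pass  # container already exists, safe to continue
```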

### Upload Blob

```python
# From file path
blob_client = blob_service_client.get_blob_client(
    container="mycontainer",
    blob="sample.txt"
)

with open("./local-file.txt", "rb") as data:
    blob_client.upload_blob(data, overwrite=True)

# From bytes/string
blob_client.upload_blob(b"Hello, World!", overwrite=True)

# From stream
import io
stream = io.BytesIO(b"Stream content")
blob_client.upload_blob(stream, overwrite=True)
```

### Download Blob

```python
blob_client = blob_service_client.get_blob_client(
    container="mycontainer",
    blob="sample.txt"
)

# To file
with open("./downloaded.txt", "wb") as file:
    download_stream = blob_client.download_blob()
    file.write(download_stream.readall())

# To memory
download_stream = blob_client.download_blob()
content = download_stream.readall()  # bytes

# Read into an existing buffer
import io
stream = io.BytesIO()
num_bytes = blob_client.download_blob().readinto(stream)
```
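
For very large blobs, you can also iterate the download chunk by chunk instead of materializing the whole payload in memory. A sketch using the downloader's `chunks()` iterator (assumes a recent 12.x SDK):

```python
# Stream a large blob to disk chunk by chunk (keeps memory usage bounded)
downloader = blob_client.download_blob()
with open("./large-download.bin", "wb") as file:
    for chunk in downloader.chunks():
        file.write(chunk)
```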

### List Blobs

```python
container_client = blob_service_client.get_container_client("mycontainer")

# List all blobs
for blob in container_client.list_blobs():
    print(f"{blob.name} - {blob.size} bytes")

# List with prefix (folder-like)
for blob in container_client.list_blobs(name_starts_with="logs/"):
    print(blob.name)

# Walk blob hierarchy (virtual directories)
from azure.storage.blob import BlobPrefix

for item in container_client.walk_blobs(delimiter="/"):
    if isinstance(item, BlobPrefix):
        print(f"Directory: {item.name}")
    else:
        print(f"Blob: {item.name}")
```
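
Containers themselves can be listed at the account level; a short sketch:

```python
# Enumerate containers in the storage account
for container in blob_service_client.list_containers():
    print(container.name)
```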

### Delete Blob

```python
blob_client.delete_blob()

# Delete with snapshots
blob_client.delete_blob(delete_snapshots="include")
```
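
If blob soft delete is enabled on the storage account, a deleted blob can be restored within the retention window; a hedged sketch:

```python
# Restore a soft-deleted blob (requires soft delete enabled on the account)
blob_client.undelete_blob()
```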

## Performance Tuning

```python
from azure.storage.blob import BlobClient

# Configure chunk sizes for large uploads/downloads
blob_client = BlobClient(
    account_url=account_url,
    container_name="mycontainer",
    blob_name="large-file.zip",
    credential=credential,
    max_block_size=4 * 1024 * 1024,  # 4 MiB blocks
    max_single_put_size=64 * 1024 * 1024  # 64 MiB single upload limit
)

# Parallel upload
blob_client.upload_blob(data, max_concurrency=4)

# Parallel download
download_stream = blob_client.download_blob(max_concurrency=4)
```

## SAS Tokens

```python
from datetime import datetime, timedelta, timezone
from azure.storage.blob import generate_blob_sas, BlobSasPermissions

sas_token = generate_blob_sas(
    account_name="<account>",
    container_name="mycontainer",
    blob_name="sample.txt",
    account_key="<account-key>",  # Or use user delegation key
    permission=BlobSasPermissions(read=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=1)
)

# Use SAS token
blob_url = f"https://<account>.blob.core.windows.net/mycontainer/sample.txt?{sas_token}"
```
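
The comment above mentions user delegation keys: when you authenticate with Azure AD rather than an account key, the SAS can be signed with a user delegation key instead. A sketch, assuming `blob_service_client` was created with `DefaultAzureCredential`:

```python
from datetime import datetime, timedelta, timezone
from azure.storage.blob import generate_blob_sas, BlobSasPermissions

# Request a user delegation key valid for one hour
start = datetime.now(timezone.utc)
expiry = start + timedelta(hours=1)
delegation_key = blob_service_client.get_user_delegation_key(start, expiry)

sas_token = generate_blob_sas(
    account_name="<account>",
    container_name="mycontainer",
    blob_name="sample.txt",
    user_delegation_key=delegation_key,
    permission=BlobSasPermissions(read=True),
    expiry=expiry,
)
```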

## Blob Properties and Metadata

```python
# Get properties
properties = blob_client.get_blob_properties()
print(f"Size: {properties.size}")
print(f"Content-Type: {properties.content_settings.content_type}")
print(f"Last modified: {properties.last_modified}")

# Set metadata
blob_client.set_blob_metadata(metadata={"category": "logs", "year": "2024"})

# Set content type
from azure.storage.blob import ContentSettings
blob_client.set_http_headers(
    content_settings=ContentSettings(content_type="application/json")
)
```
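
Content type (and other HTTP headers) can also be set at upload time rather than in a second call; a sketch:

```python
from azure.storage.blob import ContentSettings

# Set the content type in the same request as the upload
blob_client.upload_blob(
    b'{"status": "ok"}',
    overwrite=True,
    content_settings=ContentSettings(content_type="application/json"),
)
```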

## Async Client

```python
from azure.identity.aio import DefaultAzureCredential
from azure.storage.blob.aio import BlobServiceClient

async def upload_async():
    credential = DefaultAzureCredential()
    
    async with BlobServiceClient(account_url, credential=credential) as client:
        blob_client = client.get_blob_client("mycontainer", "sample.txt")
        
        with open("./file.txt", "rb") as data:
            await blob_client.upload_blob(data, overwrite=True)

# Download async
async def download_async():
    credential = DefaultAzureCredential()

    async with BlobServiceClient(account_url, credential=credential) as client:
        blob_client = client.get_blob_client("mycontainer", "sample.txt")

        stream = await blob_client.download_blob()
        data = await stream.readall()
```
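
Listing also works asynchronously via `async for`; a minimal sketch reusing the same account URL and credential:

```python
# List blobs asynchronously
async def list_async():
    credential = DefaultAzureCredential()

    async with BlobServiceClient(account_url, credential=credential) as client:
        container_client = client.get_container_client("mycontainer")
        async for blob in container_client.list_blobs():
            print(blob.name)
```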

## Best Practices

1. **Use DefaultAzureCredential** instead of connection strings
2. **Use context managers** for async clients
3. **Set `overwrite=True`** explicitly when re-uploading
4. **Use `max_concurrency`** for large file transfers
5. **Prefer `readinto()`** over `readall()` for memory efficiency
6. **Use `walk_blobs()`** for hierarchical listing
7. **Set appropriate content types** for web-served blobs

Overview

This skill provides a compact, practical guide to using the Azure Blob Storage SDK for Python. It covers authentication, the client hierarchy, core operations (upload, download, list, delete), performance tuning, SAS tokens, properties/metadata, and async usage. The content targets developers building file storage, backups, and streaming scenarios with Azure Blob Storage.

How this skill works

The skill explains how to authenticate with DefaultAzureCredential and construct BlobServiceClient, ContainerClient, and BlobClient for account-, container-, and blob-level operations. It shows patterns for uploading from files, bytes, and streams; downloading to files or memory; listing blobs and walking virtual directories; and deleting blobs. It also covers performance options (chunk sizes, concurrency), SAS token generation, setting metadata and content types, and async client usage.

When to use it

  • Upload and download files to Azure Blob Storage from Python apps
  • Stream large objects efficiently and control memory usage
  • List containers and blobs, and walk virtual directory hierarchies
  • Generate SAS tokens for short-lived external access
  • Tune parallel uploads/downloads for large files
  • Manage blob metadata, content types, and lifecycle operations

Best practices

  • Prefer DefaultAzureCredential over static connection strings for production security
  • Use context managers for async clients to ensure proper cleanup
  • Set overwrite=True explicitly when replacing blobs
  • Tune max_block_size and max_single_put_size for large objects to balance memory and performance
  • Use max_concurrency for parallel uploads/downloads to improve throughput
  • Prefer readinto() or streaming reads over readall() to avoid high memory usage

Example use cases

  • Back up application logs and rotate blobs using container-level operations
  • Serve static assets by setting proper Content-Type headers and using SAS tokens for access
  • Stream large media files to avoid loading full objects into memory
  • Implement folder-like listings and cleanup by walking blobs with a delimiter
  • Provide temporary shareable URLs for clients using generated SAS tokens

FAQ

How do I authenticate without storing keys in code?

Use DefaultAzureCredential to pick up managed identities, environment variables, or Azure CLI credentials instead of embedding account keys.

When should I use async clients?

Use async clients in asyncio-based applications or high-concurrency services to avoid blocking threads during large uploads/downloads.

How do I avoid high memory usage when downloading large blobs?

Stream the download and use readinto() with a preallocated buffer, or write directly to a file, instead of loading the whole blob with readall().