home / skills / openclaw / skills / audio-upload-aioz-stream

audio-upload-aioz-stream skill

safe

/skills/vinhbui3004/audio-upload-aioz-stream

This skill uploads audio to AIOZ Stream via API, creates objects, uploads data, and returns a streaming link for easy playback.

npx playbooks add skill openclaw/skills --skill audio-upload-aioz-stream

Review the files below or copy the command above to add this skill to your agents.

Files (8)

SKILL.md

7.1 KB

---
name: aioz-stream-audio-upload
description: Quick upload audio to AIOZ Stream API. Create audio objects with default or custom encoding configurations, upload the file, complete the upload, then return the audio link to the user.
metadata:
  openclaw:
    emoji: "🎵"
    requires:
      bins: 
        - curl
        - jq
        - md5sum
---

# AIOZ Stream Audio Upload

Upload audio to AIOZ Stream API quickly with API key authentication. The full upload flow requires 3 API calls: Create → Upload Part → Complete.

## When to use this skill

- User wants to upload or create an audio on AIOZ Stream
- User mentions "upload audio", "create audio", "aioz stream audio"
- User wants to get an HLS streaming link for their audio

## Authentication

This skill uses API key authentication. The user must provide:

- `stream-public-key`: their AIOZ Stream public key
- `stream-secret-key`: their AIOZ Stream secret key

Ask the user for these keys if not provided. They will be sent as HTTP headers on ALL API calls.

## Usage Options

When the user wants to upload audio, ask them to choose:

### Option 1: Default Upload (Quick)

Creates an audio object with minimal config — just a title. Then uploads the file.

Example user prompt:
> "Upload audio file /path/to/audio.mp3 with title My Podcast"

### Option 2: Custom Upload (Advanced)

Creates an audio object with full encoding configuration including quality presets, bitrate, sample rate, tags, metadata, etc. Then uploads the file.

Example user prompt:
> "Upload audio with custom config: title My Podcast, highest quality HLS, 320kbps, 48000Hz, tags podcast,tech"

## Full Upload Flow (3 Steps)

### Step 1: Create Audio Object

**Default:**

```bash
curl -s -X POST 'https://api-w3stream.attoaioz.cyou/api/videos/create' \
  -H 'stream-public-key: PUBLIC_KEY' \
  -H 'stream-secret-key: SECRET_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "title": "AUDIO_TITLE",
    "type": "audio"
  }'
```

**Custom (with encoding config):**

```bash
curl -s -X POST 'https://api-w3stream.attoaioz.cyou/api/videos/create' \
  -H 'stream-public-key: PUBLIC_KEY' \
  -H 'stream-secret-key: SECRET_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "title": "AUDIO_TITLE",
    "type": "audio",
    "description": "DESCRIPTION",
    "is_public": true,
    "tags": ["tag1", "tag2"],
    "metadata": [
      {"key": "KEY", "value": "VALUE"}
    ],
    "qualities": [
      {
        "resolution": "highest",
        "type": "hls",
        "container_type": "mpegts",
        "audio_config": {
          "codec": "aac",
          "bitrate": 320000,
          "channels": "2",
          "sample_rate": 48000,
          "language": "en",
          "index": 0
        }
      },
      {
        "resolution": "standard",
        "type": "hls",
        "container_type": "mpegts",
        "audio_config": {
          "codec": "aac",
          "bitrate": 128000,
          "channels": "2",
          "sample_rate": 44100,
          "language": "en",
          "index": 0
        }
      }
    ]
  }'
```

Response: Extract `data.id` — this is the `AUDIO_ID` used in the next steps.

### Step 2: Upload File Part

Upload the actual audio file binary to the created audio object.

First, get the file size and compute the MD5 hash:

```bash
# Get file size (cross-platform compatible)
FILE_SIZE=$(stat -f%z /path/to/audio.mp3 2>/dev/null || stat -c%s /path/to/audio.mp3)
END_POS=$((FILE_SIZE - 1))

# Compute MD5 hash
HASH=$(md5sum /path/to/audio.mp3 | awk '{print $1}')
```

Then upload via multipart form-data with the Content-Range header:

```bash
curl -s -X POST "https://api-w3stream.attoaioz.cyou/api/videos/AUDIO_ID/part" \
  -H 'stream-public-key: PUBLIC_KEY' \
  -H 'stream-secret-key: SECRET_KEY' \
  -H "Content-Range: bytes 0-$END_POS/$FILE_SIZE" \
  -F "file=@/path/to/audio.mp3" \
  -F "index=0" \
  -F "hash=$HASH"
```

**Important:** The `Content-Range` header is required for the upload to succeed. Format: `bytes {start}-{end}/{total_size}` where:
- For single-part uploads: `start=0`, `end=file_size-1`, `total_size=file_size`
- For multi-part uploads: adjust start/end positions for each chunk

Form-data fields:
- `file`: the audio file binary (use `@/path/to/file`)
- `index`: 0 (for single-part upload, increment for multi-part)
- `hash`: MD5 hash of the file part

### Step 3: Complete Upload

After the file part is uploaded, call the complete endpoint to finalize:

```bash
curl -s -X GET "https://api-w3stream.attoaioz.cyou/api/videos/AUDIO_ID/complete" \
  -H 'accept: application/json' \
  -H 'stream-public-key: PUBLIC_KEY' \
  -H 'stream-secret-key: SECRET_KEY'
```

This triggers transcoding. The upload is now considered successful.

## After Upload — Get Audio Link

After completing the upload, fetch the audio detail to get the streaming URL:

```bash
curl -s 'https://api-w3stream.attoaioz.cyou/api/videos/AUDIO_ID' \
  -H 'stream-public-key: PUBLIC_KEY' \
  -H 'stream-secret-key: SECRET_KEY'
```

Parse the response to find the HLS URL from the `assets` or `hls` field and return it to the user.

**Important:** Audio outputs do NOT have an `mp4_url` field. Only HLS/DASH streaming links are available.

## Custom Upload Config Reference

### Quality Presets (`resolution` field):
- `standard` — Standard quality
- `good` — Good quality
- `highest` — Highest quality
- `lossless` — Lossless quality

### Streaming Formats (`type` field):
- `hls` — HTTP Live Streaming (container: `mpegts` or `mp4`)
- `dash` — Dynamic Adaptive Streaming (container: `fmp4`)

### Audio Config:
- `codec`: `aac` (only supported codec)
- `bitrate`: integer in bits/sec (e.g., 128000, 256000, 320000)
- `channels`: "2" (stereo)
- `sample_rate`: 8000, 11025, 16000, 22050, 32000, 44100, 48000, 88200, 96000
- `language`: BCP 47 code (e.g., `en`, `vi`)
- `index`: 0

### Recommended bitrates:
- Podcast/Voice: 64000 - 128000 bps
- Music standard: 128000 - 192000 bps
- Music high quality: 192000 - 256000 bps
- Music highest: 256000 - 320000 bps

### Recommended sample rates:
- Voice: 22050 or 32000
- Music: 44100 or 48000

## Response Handling

1. Parse the JSON response from the create call → extract `data.id`
2. Compute MD5 hash of the audio file
3. Upload the file part with the hash
4. Call complete endpoint
5. Fetch audio detail to get streaming URL
6. Return the audio link to the user
7. If the audio is still transcoding (status: transcoding), inform the user and suggest checking back later

## Error Handling

- **401**: Invalid API keys — ask user to verify their public and secret keys
- **400**: Bad request — check the request body format
- **500**: Server error — suggest retrying

## Example Interaction Flow

1. User: "Upload my audio to AIOZ Stream"
2. Ask for API keys (public + secret) if not known
3. Ask for the audio file path
4. Ask: "Default upload (quick) or custom config?"
   - If default: ask for title only
   - If custom: ask for title, quality preset, bitrate, sample rate, tags, etc.
5. **Step 1:** Create audio object → get `AUDIO_ID`
6. **Step 2:** Compute file hash, upload file part
7. **Step 3:** Call complete endpoint
8. Fetch audio detail → return streaming URL to user

Overview

This skill uploads audio files to the AIOZ Stream API and returns an HLS streaming link. It supports a quick default flow or a detailed custom encoding configuration, handling create → upload part → complete steps automatically. API key authentication (public + secret) is required.

How this skill works

The skill creates an audio object on AIOZ Stream (minimal or custom encoding), uploads the audio file as one or more parts with Content-Range and MD5 hash verification, then calls the complete endpoint to trigger transcoding. Finally it fetches the audio details and extracts the HLS streaming URL to return to the user.

When to use it

You need an HLS streaming link for an audio file.
You want a fast default upload with minimal inputs.
You need custom encoding settings (bitrate, sample rate, tags, metadata).
You want automated multi-step upload handling (create → part → complete).
You need guidance when uploads fail or are transcoding.

Best practices

Collect the AIOZ Stream public and secret keys before starting; both are sent as headers on every request.
For simple use, use the default create flow with a title; use custom only when specific quality or metadata is required.
Compute the MD5 hash and set Content-Range correctly; single-part uploads use bytes 0-(size-1)/size.
Choose recommended bitrates and sample rates: voice (64–128 kbps, 22050/32000 Hz), music (128–320 kbps, 44100/48000 Hz).
After complete, poll the audio detail and inform the user if status is transcoding; return the HLS URL when available.

Example use cases

Quickly upload a podcast episode and get an HLS link for embedding.
Upload a music track with high-quality settings (320 kbps, 48 kHz) and tags/metadata for discovery.
Batch or scripted uploads where you compute MD5 and post parts programmatically.
Support a web uploader that creates the audio object, streams the file part, and finalizes the upload for playback.

FAQ

What API keys are required?

You must provide your stream-public-key and stream-secret-key; those headers are required on every API call.

What streaming formats are returned?

Audio uploads return HLS (and possibly DASH) streaming URLs. There is no mp4_url for audio outputs.

What causes a 401 or 400 error?

401 means invalid API keys; ask the user to verify them. 400 indicates a malformed request body or missing required fields.