home / skills / supercent-io / skills-template / kling-ai

kling-ai skill

/.agent-skills/utilities/kling-ai

This skill helps you generate AI images and videos from text prompts or existing images using Kling AI, accelerating media creation workflows.

npx playbooks add skill supercent-io/skills-template --skill kling-ai

Review the files below or copy the command above to add this skill to your agents.

Files (2)
SKILL.md
16.6 KB
---
name: kling-ai
description: Generate images and videos using Kling AI API. Use when creating AI-generated images from text prompts, converting images to videos, or generating videos from text descriptions.
tags: [ai, image-generation, video-generation, kling, text-to-image, text-to-video, image-to-video]
platforms: [Claude, ChatGPT, Gemini]
---

# Kling AI - Image & Video Generation

## When to use this skill
- Generating images from text prompts (Text-to-Image)
- Creating videos from text descriptions (Text-to-Video)
- Animating static images into videos (Image-to-Video)
- Producing high-quality AI video content (up to 1080p, 30fps)
- Building applications with AI-generated media

## Instructions

### Step 1: Authentication Setup

Kling AI uses JWT (JSON Web Tokens) for authentication.

**Required credentials**:
- Access Key (AK) - Public identifier
- Secret Key (SK) - Private signing key

**Get credentials**: [app.klingai.com/global/dev](https://app.klingai.com/global/dev)

**Generate JWT token (Python)**:
```python
import jwt
import time

def generate_kling_token(access_key: str, secret_key: str) -> str:
    """Generate JWT token for Kling AI API."""
    headers = {
        "alg": "HS256",
        "typ": "JWT"
    }
    payload = {
        "iss": access_key,
        "exp": int(time.time()) + 1800,  # 30 minutes
        "nbf": int(time.time()) - 5
    }
    return jwt.encode(payload, secret_key, algorithm="HS256", headers=headers)

# Usage
access_key = "YOUR_ACCESS_KEY"
secret_key = "YOUR_SECRET_KEY"
token = generate_kling_token(access_key, secret_key)
```

**Headers for all requests**:
```python
headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json"
}
```

### Step 2: Text-to-Image Generation

**Endpoint**: `POST https://api.klingai.com/v1/images/generations`

**Request**:
```python
import requests

def generate_image(prompt: str, token: str) -> dict:
    """Generate image from text prompt."""
    url = "https://api.klingai.com/v1/images/generations"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }
    data = {
        "model_name": "kling-v1",
        "prompt": prompt,
        "negative_prompt": "blur, distortion, low quality",
        "n": 1,
        "aspect_ratio": "16:9",  # 16:9, 9:16, 1:1, 4:3, 3:4, 3:2, 2:3
        "image_fidelity": 0.5,   # 0-1
        "callback_url": ""       # Optional webhook
    }
    response = requests.post(url, json=data, headers=headers)
    return response.json()

# Usage
result = generate_image(
    prompt="A futuristic cityscape at sunset, cinematic lighting, ultra-realistic",
    token=token
)
task_id = result["data"]["task_id"]
```

**Parameters**:
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `model_name` | string | Yes | `kling-v1` |
| `prompt` | string | Yes | Image description (max 500 chars) |
| `negative_prompt` | string | No | Elements to avoid |
| `n` | integer | No | Number of images (1-9, default 1) |
| `aspect_ratio` | string | No | `16:9`, `9:16`, `1:1`, `4:3`, `3:4` |
| `image_fidelity` | float | No | 0-1, default 0.5 |
| `callback_url` | string | No | Webhook for completion |

### Step 3: Text-to-Video Generation

**Endpoint**: `POST https://api.klingai.com/v1/videos/text2video`

**Request**:
```python
def generate_video_from_text(prompt: str, token: str, duration: int = 5) -> dict:
    """Generate video from text prompt."""
    url = "https://api.klingai.com/v1/videos/text2video"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }
    data = {
        "model_name": "kling-v1",           # kling-v1, kling-v1-5, kling-v1-6
        "prompt": prompt,
        "negative_prompt": "blur, distortion",
        "duration": duration,                # 5 or 10 seconds
        "aspect_ratio": "16:9",             # 16:9, 9:16, 1:1
        "mode": "std",                       # std (standard) or pro
        "cfg_scale": 0.5,                    # 0-1
        "camera_control": {
            "type": "simple",
            "config": {
                "horizontal": 0,             # -10 to 10
                "vertical": 0,               # -10 to 10
                "zoom": 0,                   # -10 to 10
                "tilt": 0,                   # -10 to 10
                "pan": 0,                    # -10 to 10
                "roll": 0                    # -10 to 10
            }
        },
        "callback_url": ""
    }
    response = requests.post(url, json=data, headers=headers)
    return response.json()

# Usage
result = generate_video_from_text(
    prompt="A dragon flying over mountains at sunset, epic cinematic shot",
    token=token,
    duration=5
)
task_id = result["data"]["task_id"]
```

**Camera control presets**:
```python
camera_presets = {
    "zoom_in": {"zoom": 5},
    "zoom_out": {"zoom": -5},
    "pan_left": {"horizontal": -5},
    "pan_right": {"horizontal": 5},
    "tilt_up": {"vertical": 5},
    "tilt_down": {"vertical": -5},
    "dolly_forward": {"zoom": 3, "vertical": 2},
    "orbit_right": {"horizontal": 5, "pan": 3}
}
```

### Step 4: Image-to-Video Generation

**Endpoint**: `POST https://api.klingai.com/v1/videos/image2video`

**Request**:
```python
import base64

def generate_video_from_image(
    image_path: str,
    prompt: str,
    token: str,
    duration: int = 5
) -> dict:
    """Animate an image into a video."""
    url = "https://api.klingai.com/v1/videos/image2video"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }
    
    # Load and encode image
    with open(image_path, "rb") as f:
        image_base64 = base64.b64encode(f.read()).decode()
    
    data = {
        "model_name": "kling-v1",
        "image": image_base64,              # Base64 or URL
        "image_tail": "",                   # Optional: end frame image
        "prompt": prompt,
        "negative_prompt": "blur, distortion",
        "duration": duration,
        "mode": "std",
        "cfg_scale": 0.5,
        "callback_url": ""
    }
    response = requests.post(url, json=data, headers=headers)
    return response.json()

# Usage
result = generate_video_from_image(
    image_path="portrait.jpg",
    prompt="Person smiling and waving at camera",
    token=token
)
```

**With URL instead of base64**:
```python
data = {
    "model_name": "kling-v1",
    "image": "https://example.com/image.jpg",
    "prompt": "The subject slowly turns their head",
    "duration": 5
}
```

### Step 5: Check Task Status (Polling)

All generation tasks are asynchronous. Poll for completion:

**Endpoint**: `GET https://api.klingai.com/v1/videos/text2video/{task_id}`

```python
import time

def poll_task_status(task_id: str, token: str, task_type: str = "text2video") -> dict:
    """Poll task status until completion."""
    url = f"https://api.klingai.com/v1/videos/{task_type}/{task_id}"
    headers = {"Authorization": f"Bearer {token}"}
    
    while True:
        response = requests.get(url, headers=headers)
        result = response.json()
        status = result["data"]["task_status"]
        
        print(f"Status: {status}")
        
        if status == "succeed":
            return result
        elif status == "failed":
            raise Exception(f"Task failed: {result['data'].get('task_status_msg')}")
        
        time.sleep(10)  # Poll every 10 seconds

# Usage
result = poll_task_status(task_id, token)
video_url = result["data"]["task_result"]["videos"][0]["url"]
print(f"Video URL: {video_url}")
```

**Task status values**:
| Status | Description |
|--------|-------------|
| `submitted` | Task queued |
| `processing` | Currently generating |
| `succeed` | Completed successfully |
| `failed` | Generation failed |

### Step 6: Download Generated Media

```python
def download_media(url: str, output_path: str) -> None:
    """Download generated image or video."""
    response = requests.get(url, stream=True)
    with open(output_path, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
    print(f"Downloaded: {output_path}")

# Usage
download_media(video_url, "output.mp4")
```

## Examples

### Example 1: Complete Text-to-Video Workflow

```python
import jwt
import time
import requests

class KlingClient:
    BASE_URL = "https://api.klingai.com/v1"
    
    def __init__(self, access_key: str, secret_key: str):
        self.access_key = access_key
        self.secret_key = secret_key
    
    def _get_token(self) -> str:
        headers = {"alg": "HS256", "typ": "JWT"}
        payload = {
            "iss": self.access_key,
            "exp": int(time.time()) + 1800,
            "nbf": int(time.time()) - 5
        }
        return jwt.encode(payload, self.secret_key, algorithm="HS256", headers=headers)
    
    def _headers(self) -> dict:
        return {
            "Authorization": f"Bearer {self._get_token()}",
            "Content-Type": "application/json"
        }
    
    def text_to_video(self, prompt: str, duration: int = 5, mode: str = "std") -> str:
        """Generate video and return URL."""
        # Create task
        response = requests.post(
            f"{self.BASE_URL}/videos/text2video",
            json={
                "model_name": "kling-v1",
                "prompt": prompt,
                "duration": duration,
                "mode": mode,
                "aspect_ratio": "16:9"
            },
            headers=self._headers()
        )
        task_id = response.json()["data"]["task_id"]
        
        # Poll for completion
        while True:
            result = requests.get(
                f"{self.BASE_URL}/videos/text2video/{task_id}",
                headers=self._headers()
            ).json()
            
            status = result["data"]["task_status"]
            if status == "succeed":
                return result["data"]["task_result"]["videos"][0]["url"]
            elif status == "failed":
                raise Exception("Generation failed")
            
            time.sleep(10)

# Usage
client = KlingClient("YOUR_ACCESS_KEY", "YOUR_SECRET_KEY")
video_url = client.text_to_video(
    prompt="A majestic eagle soaring through clouds at golden hour",
    duration=5
)
print(f"Generated video: {video_url}")
```

### Example 2: Batch Image Generation

```python
def batch_generate_images(prompts: list, token: str) -> list:
    """Generate multiple images in parallel."""
    import concurrent.futures
    
    results = []
    
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        futures = {
            executor.submit(generate_image, prompt, token): prompt
            for prompt in prompts
        }
        
        for future in concurrent.futures.as_completed(futures):
            prompt = futures[future]
            try:
                result = future.result()
                results.append({"prompt": prompt, "task_id": result["data"]["task_id"]})
            except Exception as e:
                results.append({"prompt": prompt, "error": str(e)})
    
    return results

# Usage
prompts = [
    "A serene Japanese garden with cherry blossoms",
    "A cyberpunk street scene at night",
    "An underwater coral reef teeming with life"
]
tasks = batch_generate_images(prompts, token)
```

### Example 3: Image-to-Video with Motion Control

```python
def animate_portrait(image_url: str, token: str) -> str:
    """Animate a portrait with subtle motion."""
    url = "https://api.klingai.com/v1/videos/image2video"
    
    response = requests.post(
        url,
        json={
            "model_name": "kling-v1",
            "image": image_url,
            "prompt": "Subtle head movement, natural blinking, gentle smile",
            "negative_prompt": "distortion, unnatural movement, glitch",
            "duration": 5,
            "mode": "pro",  # Higher quality for portraits
            "cfg_scale": 0.7
        },
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json"
        }
    )
    
    task_id = response.json()["data"]["task_id"]
    return poll_task_status(task_id, token, "image2video")

# Usage
result = animate_portrait("https://example.com/portrait.jpg", token)
```

### Example 4: Video with Camera Movement

```python
def cinematic_video(prompt: str, camera_motion: str, token: str) -> str:
    """Generate video with specific camera movement."""
    camera_configs = {
        "epic_reveal": {"zoom": -5, "vertical": 3},
        "dramatic_push": {"zoom": 5},
        "orbit_subject": {"horizontal": 7, "pan": 3},
        "crane_up": {"vertical": 5, "zoom": 2},
        "dolly_back": {"zoom": -3, "horizontal": 2}
    }
    
    config = camera_configs.get(camera_motion, {})
    
    response = requests.post(
        "https://api.klingai.com/v1/videos/text2video",
        json={
            "model_name": "kling-v1-6",
            "prompt": prompt,
            "duration": 10,
            "mode": "pro",
            "aspect_ratio": "16:9",
            "camera_control": {
                "type": "simple",
                "config": {
                    "horizontal": config.get("horizontal", 0),
                    "vertical": config.get("vertical", 0),
                    "zoom": config.get("zoom", 0),
                    "tilt": config.get("tilt", 0),
                    "pan": config.get("pan", 0),
                    "roll": config.get("roll", 0)
                }
            }
        },
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json"
        }
    )
    
    task_id = response.json()["data"]["task_id"]
    return poll_task_status(task_id, token)

# Usage
video_url = cinematic_video(
    prompt="Ancient temple emerging from jungle mist",
    camera_motion="epic_reveal",
    token=token
)
```

## Model Comparison

### Video Models

| Model | Resolution | Duration | Quality | Cost |
|-------|------------|----------|---------|------|
| `kling-v1` (std) | 720p | 5-10s | Good | $ |
| `kling-v1` (pro) | 1080p | 5-10s | Better | $$ |
| `kling-v1-5` (std) | 720p | 5-10s | Good | $ |
| `kling-v1-5` (pro) | 1080p | 5-10s | Better | $$ |
| `kling-v1-6` (std) | 720p | 5-10s | Great | $$ |
| `kling-v1-6` (pro) | 1080p | 5-10s | Excellent | $$$ |

### Image Models

| Model | Max Resolution | Quality |
|-------|----------------|---------|
| `kling-v1` | 1024x1024 | Standard |
| `kling-v1` (2k) | 2048x2048 | High |

## Pricing Reference

| Feature | Standard Mode | Pro Mode |
|---------|---------------|----------|
| 5s video | ~$0.14 | ~$0.49 |
| 10s video | ~$0.28 | ~$0.98 |
| Image | ~$0.01 | - |

## Best practices

1. **Prompt quality**: Be specific and descriptive. Include style keywords like "cinematic", "photorealistic", "4K"
2. **Negative prompts**: Always include to avoid common artifacts: "blur, distortion, low quality, watermark"
3. **Start with standard mode**: Test with `std` mode first, upgrade to `pro` for final output
4. **Use 5-second videos**: More cost-effective for testing; use 10s only when needed
5. **Handle rate limits**: Implement exponential backoff for retries
6. **Cache tokens**: Reuse JWT tokens within their validity period (30 min recommended)
7. **Webhook for production**: Use `callback_url` instead of polling in production environments

## Common pitfalls

- **Token expiration**: JWT tokens expire. Regenerate before making requests if expired
- **Prompt length**: Text-to-image max 500 chars, text-to-video max 2500 chars
- **Image format**: Image-to-video supports JPEG, PNG. Max size varies by model
- **Polling too fast**: Don't poll faster than every 5-10 seconds to avoid rate limits
- **Missing negative prompt**: Always include to prevent common artifacts
- **Wrong aspect ratio**: Match aspect ratio to your use case (16:9 for landscape, 9:16 for portrait)

## Error Handling

```python
class KlingAPIError(Exception):
    pass

def handle_kling_response(response: requests.Response) -> dict:
    """Handle API response with proper error handling."""
    data = response.json()
    
    if response.status_code == 401:
        raise KlingAPIError("Authentication failed. Check your credentials.")
    elif response.status_code == 429:
        raise KlingAPIError("Rate limit exceeded. Wait and retry.")
    elif response.status_code >= 400:
        error_msg = data.get("message", "Unknown error")
        raise KlingAPIError(f"API Error: {error_msg}")
    
    if data.get("code") != 0:
        raise KlingAPIError(f"Request failed: {data.get('message')}")
    
    return data
```

## References

- [Kling AI Developer Portal](https://app.klingai.com/global/dev)
- [API Documentation](https://app.klingai.com/global/dev/document-api/apiReference)
- [Python SDK (Community)](https://github.com/TechWithTy/kling)
- [Pricing Guide](https://klingaiprompt.com/blog/kling-ai-api-pricing/)

Overview

This skill integrates with the Kling AI API to generate high-quality images and short videos from text prompts or animate existing images. It supports text-to-image, text-to-video, and image-to-video workflows with configurable aspects like aspect ratio, duration, camera motion, and fidelity. Use it to quickly prototype visual assets or produce final media up to 1080p at 30fps.

How this skill works

Authenticate using an Access Key (AK) and Secret Key (SK) to generate a short-lived JWT used in all requests. Submit async generation tasks to endpoints for images, text-to-video, or image-to-video. Poll the provided task ID until the task status is succeed or failed, then download the resulting media URL. Camera control, fidelity, negative prompts, and callbacks are supported to tune outputs.

When to use it

  • Generate concept art, hero images, or thumbnails from text prompts.
  • Produce short marketing or social videos directly from descriptions.
  • Animate portraits or static artwork into subtle motion videos.
  • Batch-generate multiple images for A/B creative testing.
  • Prototype cinematic camera moves with simple presets.

Best practices

  • Keep prompts clear and under 500 characters; use negative_prompt to exclude unwanted artifacts.
  • Start with image_fidelity ~0.5 and adjust toward 0 or 1 to trade creativity vs. faithfulness.
  • Use short polling intervals (10s) and handle failed states with retries or fallbacks.
  • Prefer pro mode for portrait or high-detail video outputs (1080p) and std for faster, cheaper results.
  • Encode images as base64 or provide a stable URL when using image-to-video to avoid upload issues.

Example use cases

  • Text-to-image: create 16:9 hero images for blog headers or ads.
  • Text-to-video: generate 5–10s cinematic clips for social media from a script line.
  • Image-to-video: animate a portrait with natural blink and head turns for an about page.
  • Batch image generation: produce multiple style variants in parallel for creative review.
  • Cinematic workflow: apply camera_presets (zoom, pan, orbit) to create dynamic scene reveals.

FAQ

How do I authenticate requests?

Generate a JWT using your Access Key (AK) and Secret Key (SK); include it as Authorization: Bearer <token> in request headers.

How long do generation tasks take?

Tasks are asynchronous; typical turnaround is a few tens of seconds to a few minutes depending on model, mode, and queue; poll every 10 seconds for updates.