geo-multimodal-tagger skill

This skill generates AI-optimized metadata for images, videos, and audio to improve discoverability on Google Lens and ChatGPT Vision.

npx playbooks add skill openclaw/skills --skill geo-multimodal-tagger

---
name: geo-multimodal-tagger
description: Generate AI-optimized Alt Text, file names, captions, and Schema markup for images, videos, and audio assets. Improves AI discoverability on Google Lens, ChatGPT Vision, and Perplexity. Use whenever the user mentions optimizing images for AI, writing Alt Text, generating video Schema, tagging assets for AI discoverability, or making images visible in ChatGPT Vision and Google Lens.
---

# Multimodal Asset Tagger

> Methodology by **GEOly AI** (geoly.ai) — every image and video is a citation opportunity AI can either read or miss.

Generate optimized metadata for images, videos, and audio files for AI platforms.

## Quick Start

```bash
python scripts/optimize_asset.py --type image --description "dashboard showing metrics" --output optimized.md
```

## Why Multimodal Matters

AI platforms increasingly read visual content:

| Platform | Visual Capability | Citation Type |
|----------|-------------------|---------------|
| Google Lens | Image search | Direct image citation |
| ChatGPT Vision | Image understanding | Contextual reference |
| Perplexity | Video transcripts | Transcript citations |
| Gemini | Native image processing | Multimodal answers |

## Image Optimization

### Alt Text Formula

```
[Descriptive subject] + [Brand if relevant] + [Context/use case]
```

**Examples:**

❌ `alt="image1.jpg"`  
❌ `alt="product photo"`  
✅ `alt="GEOly AI dashboard showing AIGVR score trend over 30 days"`  
✅ `alt="Brand visibility comparison chart across ChatGPT and Perplexity — GEOly AI"`
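The formula can be sketched in Python. This is a minimal illustration, not the skill's actual implementation: the function name `build_alt_text`, its parameters, and the choice to append the brand as a suffix (as in the second example above) are all assumptions.

```python
from typing import Optional


def build_alt_text(subject: str, context: str, brand: Optional[str] = None,
                   max_len: int = 150) -> str:
    """Compose alt text from a subject, a context, and an optional brand."""
    # Join subject and context, collapsing any stray whitespace.
    text = " ".join(f"{subject} {context}".split())
    if brand:
        # Assumption: the brand is appended as a suffix when relevant.
        text = f"{text} - {brand}"
    if len(text) > max_len:
        # Cut at the last space inside the limit so no word is split.
        text = text[:max_len].rsplit(" ", 1)[0].rstrip(" -,")
    return text


alt = build_alt_text(
    "Brand visibility comparison chart",
    "across ChatGPT and Perplexity",
    brand="GEOly AI",
)
print(alt)
```

The `max_len` default of 150 follows the length guidance in the Scoring table; pass `brand=None` for generic imagery.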

### Filename Formula

```
[primary-keyword]-[secondary-keyword]-[brand]-[descriptor].[ext]
```

**Examples:**

❌ `IMG_3847.jpg`  
✅ `geo-brand-visibility-dashboard-geoly-ai.png`  
✅ `aigvr-score-chart-ai-search-monitoring.jpg`
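One way to apply the filename formula programmatically is to slugify each part and join the results with hyphens. The helper names `slugify` and `build_filename` are hypothetical, and the sketch assumes ASCII keywords.

```python
import re


def slugify(text: str) -> str:
    """Lowercase the text and replace runs of non-alphanumerics with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(primary: str, secondary: str, brand: str,
                   descriptor: str, ext: str = "jpg") -> str:
    """Join the non-empty parts in formula order and append the extension."""
    slugs = [slugify(part) for part in (primary, secondary, brand, descriptor)]
    stem = "-".join(s for s in slugs if s)  # skip parts that are empty
    return f"{stem}.{ext.lower().lstrip('.')}"


filename = build_filename("AIGVR score", "chart", "", "AI search monitoring")
print(filename)
```

Leaving `brand` empty simply drops that segment, which reproduces the third example above.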

### ImageObject Schema

```json
{
  "@context": "https://schema.org",
  "@type": "ImageObject",
  "name": "AIGVR Score Dashboard",
  "description": "Dashboard showing brand visibility scores across AI platforms",
  "contentUrl": "https://example.com/images/dashboard.jpg",
  "author": {
    "@type": "Organization",
    "name": "GEOly AI"
  },
  "keywords": "AIGVR, brand visibility, AI search, dashboard"
}
```
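The same markup can be assembled from plain values and serialized with the standard `json` module. The function name `image_object` and its parameters are illustrative assumptions; only the property names come from the schema above.

```python
import json


def image_object(name, description, content_url, author, keywords):
    """Return a schema.org ImageObject as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "ImageObject",
        "name": name,
        "description": description,
        "contentUrl": content_url,
        "author": {"@type": "Organization", "name": author},
        # The schema above stores keywords as one comma-separated string.
        "keywords": ", ".join(keywords),
    }


markup = image_object(
    name="AIGVR Score Dashboard",
    description="Dashboard showing brand visibility scores across AI platforms",
    content_url="https://example.com/images/dashboard.jpg",
    author="GEOly AI",
    keywords=["AIGVR", "brand visibility", "AI search", "dashboard"],
)
print(json.dumps(markup, indent=2))
```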

## Video Optimization

### Checklist

- [ ] Title contains primary keyword
- [ ] Description: first 150 characters include the primary keyword and brand
- [ ] Transcript/captions attached (SRT/VTT)
- [ ] Chapters/timestamps for long videos
- [ ] Thumbnail: keyword-rich filename
- [ ] VideoObject Schema added

### VideoObject Schema

```json
{
  "@context": "https://schema.org",
  "@type": "VideoObject",
  "name": "How to Optimize for AI Search",
  "description": "Complete guide to GEO strategies...",
  "thumbnailUrl": "https://example.com/thumbs/geo-guide.jpg",
  "uploadDate": "2024-01-15",
  "duration": "PT12M30S",
  "contentUrl": "https://example.com/videos/geo-guide.mp4"
}
```
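The `duration` value uses the ISO 8601 duration format (`PT12M30S` is 12 minutes 30 seconds). If the video length is only known in seconds, a small helper can produce the string. The helper name `iso8601_duration` is hypothetical, and the sketch covers hours, minutes, and seconds only.

```python
def iso8601_duration(total_seconds: int) -> str:
    """Convert a non-negative number of seconds to an ISO 8601 duration."""
    if total_seconds < 0:
        raise ValueError("duration must be non-negative")
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    out = "PT"
    if hours:
        out += f"{hours}H"
    if minutes:
        out += f"{minutes}M"
    # Emit seconds when non-zero, or when nothing else was emitted (PT0S).
    if seconds or out == "PT":
        out += f"{seconds}S"
    return out


print(iso8601_duration(750))
```

For 750 seconds this yields the `PT12M30S` value used in the schema above.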

## Audio/Podcast Optimization

- Descriptive episode titles (not "Episode 47")
- 150+ word descriptions, keyword-rich
- Full transcript as page content
- Guest names and topics as entities
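A comparable sketch for audio builds `AudioObject` markup and checks the description length. The function names are hypothetical, and representing guests through the `contributor` property is an assumption rather than something this skill specifies.

```python
import json


def audio_object(name, description, content_url, transcript, guests=()):
    """Return a schema.org AudioObject as a plain dict."""
    data = {
        "@context": "https://schema.org",
        "@type": "AudioObject",
        "name": name,
        "description": description,
        "contentUrl": content_url,
        "transcript": transcript,
    }
    if guests:
        # Assumption: guests are listed as Person entities under "contributor".
        data["contributor"] = [{"@type": "Person", "name": g} for g in guests]
    return data


def description_long_enough(description: str, min_words: int = 150) -> bool:
    """Check the 150+ word guideline for episode descriptions."""
    return len(description.split()) >= min_words


episode = audio_object(
    name="How AI Search Engines Cite Podcasts",
    description="A short placeholder description.",
    content_url="https://example.com/audio/episode.mp3",
    transcript="Full transcript text goes here.",
    guests=["Jane Doe"],
)
print(json.dumps(episode, indent=2))
print(description_long_enough(episode["description"]))
```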

## Asset Optimization Tool

```bash
python scripts/optimize_asset.py \
  --type [image|video|audio] \
  --description "Asset description" \
  --brand "BrandName" \
  --keywords "keyword1,keyword2"
```

**Output:**
- Optimized Alt Text
- Recommended filename
- Schema markup
- Discoverability score (Before/After)
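The source of `scripts/optimize_asset.py` is not shown here. As a hypothetical sketch only, flags like the ones above could be parsed with `argparse` as follows; the defaults and the choice of required flags are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse flags mirroring the commands shown above (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Generate asset metadata")
    parser.add_argument("--type", choices=["image", "video", "audio"], required=True)
    parser.add_argument("--description", required=True)
    parser.add_argument("--brand", default=None)
    parser.add_argument("--keywords", default="")
    parser.add_argument("--output", default=None)
    args = parser.parse_args(argv)
    # Split the comma-separated keyword string into a clean list.
    args.keywords = [k.strip() for k in args.keywords.split(",") if k.strip()]
    return args


args = parse_args([
    "--type", "image",
    "--description", "dashboard showing metrics",
    "--keywords", "keyword1, keyword2",
])
print(args.type, args.keywords)
```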

## Scoring

| Factor | Weight | Best Practice |
|--------|--------|---------------|
| Descriptiveness | 30% | Specific, detailed |
| Keyword presence | 25% | Natural inclusion |
| Brand mention | 20% | When relevant |
| Context | 15% | Use case clear |
| Length | 10% | 100-150 characters for Alt Text |

**Discoverability Score**: 0-10
- 8-10: Excellent
- 6-7: Good
- 4-5: Fair
- <4: Poor
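The weighted score can be sketched as below. The table gives the weights and the bands but not how each factor is rated, so this sketch assumes each factor is rated from 0.0 to 1.0. The names `WEIGHTS`, `discoverability_score`, and `band`, and the treatment of fractional scores between bands, are also assumptions.

```python
# Weights taken from the Scoring table; they sum to 1.0.
WEIGHTS = {
    "descriptiveness": 0.30,
    "keyword_presence": 0.25,
    "brand_mention": 0.20,
    "context": 0.15,
    "length": 0.10,
}


def discoverability_score(ratings: dict) -> float:
    """Weighted sum of per-factor ratings (each clamped to 0..1), scaled to 0-10."""
    total = 0.0
    for factor, weight in WEIGHTS.items():
        rating = min(max(ratings.get(factor, 0.0), 0.0), 1.0)
        total += weight * rating
    return round(total * 10, 1)


def band(score: float) -> str:
    """Map a score to its band; fractional scores fall into the band below."""
    if score >= 8:
        return "Excellent"
    if score >= 6:
        return "Good"
    if score >= 4:
        return "Fair"
    return "Poor"


score = discoverability_score({
    "descriptiveness": 1.0,
    "keyword_presence": 0.8,
    "brand_mention": 0.0,
    "context": 1.0,
    "length": 0.5,
})
print(score, band(score))
```

In this example the missing brand mention costs the full 20% weight, so strong alt text still lands in the "Good" band.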

Overview

This skill generates AI-optimized Alt Text, filenames, captions, and Schema markup for images, videos, and audio assets to improve discoverability on AI-first platforms like Google Lens, ChatGPT Vision, and Perplexity. It produces concise, search-friendly metadata and recommended filenames, plus ImageObject/VideoObject/AudioObject markup and a simple discoverability score. Use it to make visual and audio assets readable and citable by multimodal AI systems.

How this skill works

The tool inspects the asset type and a short description, then composes descriptive Alt Text following a subject + brand + context formula, generates keyword-rich filenames, and outputs JSON-LD Schema for ImageObject, VideoObject, or AudioObject. It also recommends transcripts/captions for video and audio, creates thumbnail filename suggestions, and produces a discoverability score based on descriptiveness, keyword presence, brand mention, context, and length.

When to use it

- Preparing images, thumbnails, or product photos for AI search and visual discovery
- Publishing videos that need VideoObject schema, transcripts, and keyword-first descriptions
- Adding Alt Text and filenames to improve visibility in ChatGPT Vision, Google Lens, and visual search
- Optimizing podcast or audio episodes with transcripts, rich descriptions, and AudioObject schema
- Batch-tagging archived media for long-term AI discoverability and citation

Best practices

- Write Alt Text as: descriptive subject + brand (if relevant) + clear context/use case
- Keep Alt Text between ~100–150 characters and include the primary keyword naturally
- Use hyphenated, keyword-first filenames: primary-secondary-brand-descriptor.jpg
- Attach transcripts/captions (SRT/VTT) for any video or audio asset and include chapters for long videos
- Publish JSON-LD ImageObject/VideoObject/AudioObject on the same page as the asset
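To publish JSON-LD on the same page as the asset, the markup is usually embedded in a `<script type="application/ld+json">` element. A minimal sketch, assuming a hypothetical helper named `jsonld_script_tag`:

```python
import json


def jsonld_script_tag(data: dict) -> str:
    """Serialize a dict as JSON-LD wrapped in a script element for HTML pages."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a value containing "</script>" cannot close the element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


tag = jsonld_script_tag({
    "@context": "https://schema.org",
    "@type": "ImageObject",
    "name": "AIGVR Score Dashboard",
    "contentUrl": "https://example.com/images/dashboard.jpg",
})
print(tag)
```

The `<\/` sequence is a valid JSON escape for `/`, so parsers still read the original value.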

Example use cases

- Generate Alt Text and ImageObject schema for product images to increase Google Lens citations
- Create VideoObject schema, thumbnail filename, and transcript checklist for a how-to video
- Convert podcast episode metadata into AudioObject schema with a full transcript and guest entities
- Batch-optimize an image archive: filenames, alt text, keywords, and discoverability score
- Produce keyword-rich captions and filenames for marketing assets to improve multimodal search

FAQ

What length should Alt Text be for AI discoverability?

Aim for roughly 100–150 characters: specific, descriptive, and naturally including the primary keyword.

Do I always include the brand in Alt Text?

Include the brand when it adds context or authority; omit it for generic or neutral imagery to avoid keyword stuffing.