
screenapp-cli skill


This skill integrates ScreenApp multimodal analysis into the context retrieval workflow, enabling semantic search, transcript analysis, and graph-based context retrieval.

npx playbooks add skill zpankz/mcp-skillset --skill screenapp-cli


---
name: screenapp-cli
description: ScreenApp multimodal video/audio analysis CLI with graph-based context retrieval
version: 1.0.0
category: context-extraction
triggers:
  - screenapp
  - recording
  - video transcript
  - screen capture
  - demo recording
  - teaching session
  - multimodal ai
tags:
  - cli
  - video
  - audio
  - transcription
  - graph
  - embeddings
  - context
  - falkordb
---

# ScreenApp CLI Skill

## Purpose

Integrate ScreenApp's multimodal video/audio analysis API into the PEX context retrieval ecosystem as the Σₛ (Screen Context) primitive.

## Capabilities

### 1. File Operations
- List and search recordings from ScreenApp
- Retrieve detailed file information with transcripts
- Tag and organize recordings

### 2. Multimodal AI Queries
- Ask questions about video content
- Analyze transcripts, video frames, or screenshots
- Time-segment specific queries

### 3. Graph-Based Context
- Semantic search over transcript embeddings
- Topic extraction and co-occurrence
- Speaker identification and linking
- Recording similarity relationships

### 4. Workflow Pipelines
- Daily digest of recordings
- Cross-recording semantic search
- Export to Obsidian format
- Temporal correlation with `limitless-cli`

## CLI Reference

```bash
# Core commands
screenapp files list|get|search|tag|untag
screenapp ask <fileId> "<question>" [--mode transcript|video|screenshots]
screenapp sync run|status|build-similarity|build-topics
screenapp graph query|stats|traverse
screenapp workflow daily|search|recent|export
screenapp config init|show|get|set|validate
```
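For scripting, the commands above can be wrapped from Python. The sketch below only assembles argv lists; the `run` helper assumes `--json` yields JSON on stdout and requires a real `screenapp` binary on PATH, and the flag-mapping convention is an assumption, not a documented API.

```python
import json
import shlex
import subprocess

def build_cmd(*args, json_out=True, **flags):
    """Assemble a screenapp argv list; keyword flags map to --flag [value]."""
    cmd = ["screenapp", *args]
    for name, value in flags.items():
        flag = f"--{name.replace('_', '-')}"
        if value is True:          # bare boolean flag, e.g. --semantic
            cmd.append(flag)
        else:
            cmd += [flag, str(value)]
    if json_out:
        cmd.append("--json")
    return cmd

def run(cmd):
    """Run the CLI and decode its JSON output (needs screenapp on PATH)."""
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(out.stdout)

cmd = build_cmd("files", "search", "machine learning", semantic=True, limit=5)
print(shlex.join(cmd))
# → screenapp files search 'machine learning' --semantic --limit 5 --json
```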

## Σₛ Primitive Definition

Σₛ — Screen Context (screenapp)

λο.τ Form: Query(ο) → Multimodal-Search(λ) → Screen-Context(τ)

Sub-Primitives:

| Symbol | Sub-Primitive | Weight | Command |
|--------|---------------|--------|---------|
| Σₛ₁ | Transcript Search | 0.90 | `screenapp files search --semantic` |
| Σₛ₂ | Visual Query | 0.85 | `screenapp ask --mode video` |
| Σₛ₃ | Temporal Context | 0.80 | `screenapp workflow daily` |
| Σₛ₄ | AI Insights | 0.88 | `screenapp ask --mode all` |
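A minimal routing sketch over the sub-primitive table, assuming the weights act as static preference scores; the dispatch logic here is illustrative, not the PEX router's actual algorithm.

```python
# Weights and commands mirror the Σₛ sub-primitive table above.
SUB_PRIMITIVES = [
    ("transcript_search", 0.90, "screenapp files search --semantic"),
    ("visual_query",      0.85, "screenapp ask --mode video"),
    ("temporal_context",  0.80, "screenapp workflow daily"),
    ("ai_insights",       0.88, "screenapp ask --mode all"),
]

def route(wanted):
    """Pick the highest-weight sub-primitive among the requested names."""
    candidates = [p for p in SUB_PRIMITIVES if p[0] in wanted]
    return max(candidates, key=lambda p: p[1]) if candidates else None

best = route({"visual_query", "ai_insights"})
print(best[0])  # → ai_insights
```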

## Integration Points

### Context Router
- Triggers on: screenapp, recording, video, transcript, screen, demo
- Parallel execution with limitless, research, pieces

### Grounding Router
- Part of Σ (Source) primitive family
- Composition with other primitives: (Σₛ ⊗ Σₐ) ∘ Τ

### Quality Gates
- G₀ health check: `screenapp config validate`
- G₁ diversity: Screen sources boost context diversity

## Graph Schema

```cypher
(:Recording)-[:HAS_TRANSCRIPT]->(:Transcript)
(:Transcript)-[:HAS_SEGMENT]->(:Segment)
(:Segment)-[:SPOKEN_BY]->(:Speaker)
(:Recording)-[:COVERS_TOPIC]->(:Topic)
(:Recording)-[:SIMILAR_TO]->(:Recording)
```
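The schema can be pictured as plain (src, relationship, dst) triples. A toy in-memory mirror with illustrative node IDs (the real store is FalkorDB, queried via Cypher as shown below):

```python
# Edges follow the schema above; node IDs are made up for illustration.
EDGES = [
    ("rec:abc123", "HAS_TRANSCRIPT", "tx:1"),
    ("tx:1",       "HAS_SEGMENT",    "seg:1"),
    ("seg:1",      "SPOKEN_BY",      "spk:alice"),
    ("rec:abc123", "COVERS_TOPIC",   "topic:AI"),
    ("rec:abc123", "SIMILAR_TO",     "rec:def456"),
]

def neighbors(src, rel):
    """All nodes reachable from src over a single relationship type."""
    return [d for s, r, d in EDGES if s == src and r == rel]

print(neighbors("rec:abc123", "COVERS_TOPIC"))  # → ['topic:AI']
```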

## Usage Examples

### Context Extraction

```bash
# Semantic search for relevant recordings
screenapp files search "machine learning" --semantic --limit 5 --json

# Get AI insights from a recording
screenapp ask abc123 "What were the key decisions?" --json

# Daily context with temporal correlation
screenapp workflow daily 2025-01-10 --json
```
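Downstream tooling would consume the `--json` output of these commands. A sketch of filtering search results by relevance score, assuming a hypothetical result shape (the `id`, `name`, and `score` field names are not confirmed by this document):

```python
import json

# Assumed shape of `files search --json` output; real field names may differ.
raw = '''[
  {"id": "abc123", "name": "ML lecture", "score": 0.91},
  {"id": "def456", "name": "Standup",    "score": 0.42}
]'''

results = json.loads(raw)
relevant = [r for r in results if r["score"] >= 0.5]
print([r["id"] for r in relevant])  # → ['abc123']
```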

### Graph Queries

```bash
# Find recordings by topic
screenapp graph query "
  MATCH (r:Recording)-[:COVERS_TOPIC]->(t:Topic)
  WHERE t.name CONTAINS 'AI'
  RETURN r.name, r.id
  LIMIT 10
" --json

# Find similar recordings
screenapp graph query "
  MATCH (r1:Recording {id: 'abc123'})-[:SIMILAR_TO]->(r2)
  RETURN r2.name, r2.id
" --json
```
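`screenapp graph traverse` presumably expands relationships outward from a start node; a breadth-first sketch over hypothetical `SIMILAR_TO` edges shows the idea:

```python
from collections import deque

# Illustrative SIMILAR_TO adjacency; the real edges live in FalkorDB.
SIMILAR = {"abc123": ["def456"], "def456": ["ghi789"], "ghi789": []}

def traverse(start, max_depth=2):
    """Breadth-first expansion up to max_depth hops from start."""
    seen, frontier, order = {start}, deque([(start, 0)]), []
    while frontier:
        node, depth = frontier.popleft()
        order.append(node)
        if depth < max_depth:
            for nxt in SIMILAR.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, depth + 1))
    return order

print(traverse("abc123"))  # → ['abc123', 'def456', 'ghi789']
```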

## Dependencies

| Service | Port | Required |
|---------|------|----------|
| FalkorDB/FalkorDBLite | 6379 / socket | Yes |
| Ollama | 11434 | For embeddings |
| ScreenApp API | - | Yes |

**Note**: Can use FalkorDBLite (embedded) similar to limitless-cli for zero-config setup.

## Configuration

Location: `~/.screenapp-cli/config.toml`

Required:
- `SCREENAPP_API_TOKEN`
- `SCREENAPP_TEAM_ID`
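A hypothetical layout for that file; the two required keys come from this document, but the exact TOML structure is an assumption:

```toml
# ~/.screenapp-cli/config.toml (illustrative layout)
SCREENAPP_API_TOKEN = "<your-api-token>"
SCREENAPP_TEAM_ID = "<your-team-id>"
```

Run `screenapp config validate` after editing to confirm the values are accepted.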

## Related Skills

- `limitless-cli` - Personal lifelogs (temporal correlation)
- `context-orchestrator` - Multi-source context extraction
- `grounding-router` - Medical education grounding

Overview

This skill integrates ScreenApp's multimodal video and audio analysis into a CLI focused on graph-based context retrieval. It exposes file management, semantic multimodal queries, and graph primitives for linking recordings, topics, speakers, and temporal context. The CLI is built for pipeline automation, temporal digests, and exporting contextual data for downstream knowledge workflows.

How this skill works

The CLI lists, tags, and retrieves recordings and transcripts, then runs semantic search over transcript embeddings and visual/frame analysis. It builds a graph where recordings link to transcripts, segments, speakers, and topics, and provides AI-driven question answering over transcripts, frames, or combined multimodal sources. Workflows assemble daily digests, cross-recording semantic search, similarity graphs, and exports (for example to Obsidian).
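The semantic-search core described above can be illustrated with a toy cosine-similarity ranking. Real embeddings would come from Ollama; the 3-dimensional vectors here are stand-ins:

```python
import math

# Toy transcript-segment embeddings (real vectors are much higher-dimensional).
SEGMENTS = {
    "seg:1": [0.9, 0.1, 0.0],
    "seg:2": [0.1, 0.8, 0.3],
    "seg:3": [0.7, 0.2, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def search(query_vec, k=2):
    """Rank segments by similarity to the query vector, return top k."""
    ranked = sorted(SEGMENTS, key=lambda s: cosine(query_vec, SEGMENTS[s]),
                    reverse=True)
    return ranked[:k]

print(search([1.0, 0.0, 0.0]))  # → ['seg:1', 'seg:3']
```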

When to use it

  • When you need semantic search across meeting transcripts and video frames.
  • When you want to generate AI summaries, Q&A, or insights tied to specific time segments.
  • When you need to build or query a graph of recordings, topics, and speaker relationships.
  • When automating daily or periodic context digests from multiple recordings.
  • When exporting structured context into personal knowledge systems (Obsidian, etc.).

Best practices

  • Keep SCREENAPP_API_TOKEN and SCREENAPP_TEAM_ID configured in ~/.screenapp-cli/config.toml and validate with config validate.
  • Run periodic sync and build-similarity to maintain an up-to-date graph and embeddings.
  • Prefer semantic search (--semantic) for recall, then refine with time-segment filters for precision.
  • Use --mode transcript|video|screenshots explicitly when asking to target the right modality.
  • Use FalkorDBLite for zero-config local deployments, and Ollama for embeddings if available.
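The "recall then precision" practice above can be sketched as a post-filter on segment records; the `start`/`end` field names are assumptions about what the CLI's JSON output contains:

```python
# Broad semantic search returns candidate segments; keep only those that
# overlap a time window of interest (times in seconds, illustrative data).
segments = [
    {"id": "seg:1", "start": 120.0, "end": 145.0},
    {"id": "seg:2", "start": 900.0, "end": 930.0},
]

def in_window(seg, t0, t1):
    """True when the segment overlaps the interval [t0, t1]."""
    return seg["start"] < t1 and seg["end"] > t0

hits = [s["id"] for s in segments if in_window(s, 100.0, 200.0)]
print(hits)  # → ['seg:1']
```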

Example use cases

  • Semantic search for relevant meeting recordings about a topic: files search "machine learning" --semantic --limit 5.
  • Ask a recording for decisions or action items tied to timestamps: ask <fileId> "What were the key decisions?" --mode transcript.
  • Build and query similarity graphs to discover related demos or talks: sync build-similarity then graph query.
  • Generate a daily digest correlating events across recordings and export to Obsidian: workflow daily <date> --json then workflow export.
  • Identify speakers and extract co-occurring topics across a set of recordings for research or knowledge bases.
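The last use case, topic co-occurrence, reduces to counting topic pairs per recording. A toy sketch of the kind of signal `sync build-topics` might persist as graph edges (the per-recording topic sets are made up):

```python
from collections import Counter
from itertools import combinations

# Illustrative per-recording topic sets.
recording_topics = {
    "rec:1": {"AI", "teaching"},
    "rec:2": {"AI", "demo"},
    "rec:3": {"AI", "teaching"},
}

# Count each unordered topic pair once per recording it appears in.
pairs = Counter()
for topics in recording_topics.values():
    for a, b in combinations(sorted(topics), 2):
        pairs[(a, b)] += 1

print(pairs.most_common(1))  # → [(('AI', 'teaching'), 2)]
```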

FAQ

What configuration is required to run the CLI?

Set SCREENAPP_API_TOKEN and SCREENAPP_TEAM_ID in ~/.screenapp-cli/config.toml and run screenapp config validate.

How do I target video frames instead of transcripts?

Use the ask command with --mode video or --mode screenshots to include visual analysis rather than only transcript analysis.