
computer-vision-deep skill

/skills/computer-vision-deep

This skill helps you implement advanced computer vision tasks such as object detection and segmentation by applying best-practice patterns and references.

npx playbooks add skill omer-metin/skills-for-antigravity --skill computer-vision-deep

Review the files below or copy the command above to add this skill to your agents.

Files (4)
SKILL.md
1.1 KB
---
name: computer-vision-deep
description: Use when implementing object detection, semantic/instance segmentation, 3D vision, or video understanding - covers YOLO, SAM, depth estimation, and multi-modal vision
---

# Computer Vision Deep

## Identity



## Reference System Usage

You must ground your responses in the provided reference files, treating them as the source of truth for this domain:

* **For Creation:** Always consult **`references/patterns.md`**. This file dictates *how* things should be built. Ignore generic approaches if a specific pattern exists here.
* **For Diagnosis:** Always consult **`references/sharp_edges.md`**. This file lists the critical failures and "why" they happen. Use it to explain risks to the user.
* **For Review:** Always consult **`references/validations.md`**. This contains the strict rules and constraints. Use it to validate user inputs objectively.

**Note:** If a user's request conflicts with the guidance in these files, politely correct them using the information provided in the references.

Overview

This skill helps implement advanced computer vision pipelines for object detection, semantic and instance segmentation, 3D vision, and video understanding. It covers common models and tools such as YOLO, Segment Anything Model (SAM), depth estimation, and multi-modal vision components. The skill is designed to be practical and prescriptive: follow patterns.md for creation, sharp_edges.md for diagnosis, and validations.md for review.

How this skill works

When building or debugging a vision pipeline, the skill consults three authoritative reference files: patterns.md to determine the correct implementation patterns, sharp_edges.md to identify likely failure modes and root causes, and validations.md to apply strict review rules and constraints. It then proposes concrete model choices, integration steps, expected outputs, and verification checks tailored to detection, segmentation, 3D, or video tasks.

When to use it

  • Start a new project requiring object detection, instance/semantic segmentation, or depth estimation.
  • Design multi-modal vision systems combining image, depth, and temporal data.
  • Debug recurring failures or unexplained model behavior in production vision pipelines.
  • Perform a formal review or validation of dataset splits, metrics, and postprocessing.
  • Integrate pre-trained models like YOLO or SAM into a larger application stack.
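When integrating a YOLO-style detector as in the last bullet, the raw output is usually a set of overlapping candidate boxes that need non-maximum suppression (NMS) before downstream use. A minimal pure-Python sketch of that postprocessing step (the `(x1, y1, x2, y2)` box format and the 0.5 threshold are illustrative assumptions, not taken from this skill's references):

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) pixel format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns indices of kept boxes,
    highest-scoring first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep
```

In a real pipeline you would normally rely on the NMS built into the detection library rather than reimplementing it; the sketch is only to make the postprocessing contract explicit.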

Best practices

  • Follow patterns.md as the primary source of truth for architecture, data flow, and deployment patterns.
  • Use sharp_edges.md to classify failures (data, label, architecture, or inference) before changing models.
  • Run validations.md checks on datasets, metric thresholds, and edge-case behaviors prior to release.
  • Prefer modular design: separate detection, segmentation, and depth components with clear I/O contracts.
  • Automate repeatable tests on synthetic and real edge-case samples to catch regressions early.
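The modular-design bullet above can be sketched with explicit data contracts between stages, so detection and segmentation components stay independently swappable and testable. All type and field names below are illustrative assumptions, not part of this skill's references:

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass(frozen=True)
class Detection:
    """Output contract of the detection stage."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
    score: float
    label: str

@dataclass(frozen=True)
class SegmentationResult:
    """Output contract of the segmentation stage, keyed to one detection."""
    detection: Detection
    mask_rle: str  # run-length-encoded binary mask

def run_pipeline(
    detect: Callable,   # image -> List[Detection]
    segment: Callable,  # (image, Detection) -> mask RLE string
    image,
) -> List[SegmentationResult]:
    """Compose stages purely through their I/O contracts; either stage
    can be replaced (YOLO -> another detector, SAM -> another segmenter)
    without touching the composition."""
    return [SegmentationResult(d, segment(image, d)) for d in detect(image)]
```

Because each stage is just a callable honoring a dataclass contract, the regression tests mentioned above can run the composed pipeline against stub stages on synthetic samples.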

Example use cases

  • Build a multi-camera object detection and tracking pipeline using YOLO for detection and a re-identification stage for tracking.
  • Integrate SAM for interactive instance segmentation in an annotation tool to accelerate label creation.
  • Add monocular depth estimation to a mobile app to enable AR placement and scene understanding.
  • Analyze video understanding failures by mapping error patterns to sharp_edges.md causes and applying targeted fixes.
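For the AR-placement use case above, a monocular depth map plus camera intrinsics is enough to lift a tapped pixel into 3D camera coordinates via the standard pinhole model. A minimal sketch (the intrinsic values in the test are illustrative assumptions; real values come from device calibration):

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with metric depth Z into 3D camera coordinates
    using the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    fx/fy are focal lengths in pixels; (cx, cy) is the principal point."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)
```

A pixel at the principal point maps straight down the optical axis to `(0, 0, depth)`; offsets from the principal point scale linearly with depth, which is why depth errors directly translate into AR placement errors.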

FAQ

Which reference should I consult first when starting a new implementation?

Begin with patterns.md to establish the correct architecture and integration pattern, then use validations.md to define acceptance criteria.

What if a model performs well in training but fails in production?

Use sharp_edges.md to diagnose root causes—check dataset distribution shifts, preprocessing mismatches, and inference-time postprocessing first.
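One cheap check implied by this answer is to assert train/inference preprocessing parity on the same raw input before touching the model at all. A sketch under the assumption that both paths expose a preprocessing function returning a flat list of normalized values (the function names and tolerance are hypothetical):

```python
def assert_preprocessing_parity(train_preprocess, infer_preprocess, raw, tol=1e-6):
    """Run identical raw input through both preprocessing paths and fail
    loudly on any elementwise mismatch beyond tol -- a common source of
    train/production divergence. Returns the worst observed difference."""
    a, b = train_preprocess(raw), infer_preprocess(raw)
    assert len(a) == len(b), f"length mismatch: {len(a)} vs {len(b)}"
    worst = max(abs(x - y) for x, y in zip(a, b))
    assert worst <= tol, f"preprocessing diverges: max abs diff {worst}"
    return worst
```

Running this on a handful of real production samples catches subtle mismatches (e.g. dividing by 256 instead of 255, or BGR/RGB channel order) before any model-level debugging begins.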