
senior-computer-vision skill


This skill helps you build production-grade vision systems with PyTorch, OpenCV, and deployment pipelines for real-time object detection and segmentation.

npx playbooks add skill questnova502/claude-skills-sync --skill senior-computer-vision


---
name: senior-computer-vision
description: World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines.
---

# Senior Computer Vision Engineer

A senior computer vision engineering skill for building production-grade image and video AI systems.

## Quick Start

### Main Capabilities

```bash
# Core Tool 1: vision model trainer
python scripts/vision_model_trainer.py --input data/ --output results/

# Core Tool 2: inference optimizer
python scripts/inference_optimizer.py --target project/ --analyze

# Core Tool 3: dataset pipeline builder
python scripts/dataset_pipeline_builder.py --config config.yaml --deploy
```

## Core Expertise

This skill covers world-class capabilities in:

- Advanced production patterns and architectures
- Scalable system design and implementation
- Performance optimization at scale
- MLOps and DataOps best practices
- Real-time processing and inference
- Distributed computing frameworks
- Model deployment and monitoring
- Security and compliance
- Cost optimization
- Team leadership and mentoring

## Tech Stack

- **Languages:** Python, SQL, R, Scala, Go
- **ML Frameworks:** PyTorch, TensorFlow, Scikit-learn, XGBoost
- **Data Tools:** Spark, Airflow, dbt, Kafka, Databricks
- **LLM Frameworks:** LangChain, LlamaIndex, DSPy
- **Deployment:** Docker, Kubernetes, AWS/GCP/Azure
- **Monitoring:** MLflow, Weights & Biases, Prometheus
- **Databases:** PostgreSQL, BigQuery, Snowflake, Pinecone

## Reference Documentation

### 1. Computer Vision Architectures

Comprehensive guide available in `references/computer_vision_architectures.md` covering:

- Advanced patterns and best practices
- Production implementation strategies
- Performance optimization techniques
- Scalability considerations
- Security and compliance
- Real-world case studies

### 2. Object Detection Optimization

Complete workflow documentation in `references/object_detection_optimization.md` including:

- Step-by-step processes
- Architecture design patterns
- Tool integration guides
- Performance tuning strategies
- Troubleshooting procedures

### 3. Production Vision Systems

Technical reference guide in `references/production_vision_systems.md` with:

- System design principles
- Implementation examples
- Configuration best practices
- Deployment strategies
- Monitoring and observability

## Production Patterns

### Pattern 1: Scalable Data Processing

Enterprise-scale data processing with distributed computing:

- Horizontal scaling architecture
- Fault-tolerant design
- Real-time and batch processing
- Data quality validation
- Performance monitoring
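
As one illustration of the data quality validation step, the sketch below checks bounding-box annotations before they enter a training pipeline. The `BoxAnnotation` record, its field names, and the specific checks are assumptions made for this example; they are not taken from the skill's scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxAnnotation:
    """Hypothetical annotation record: a pixel-space box plus its class label."""
    image_width: int
    image_height: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str


def validate_annotation(ann, known_labels):
    """Return a list of problems found; an empty list means the record passed."""
    problems = []
    if ann.image_width <= 0 or ann.image_height <= 0:
        problems.append("non-positive image size")
    if ann.x_max <= ann.x_min or ann.y_max <= ann.y_min:
        problems.append("box has zero or negative area")
    if (ann.x_min < 0 or ann.y_min < 0
            or ann.x_max > ann.image_width or ann.y_max > ann.image_height):
        problems.append("box extends outside the image")
    if ann.label not in known_labels:
        problems.append(f"unknown label: {ann.label!r}")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to aggregate failure counts per check across a large dataset.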

### Pattern 2: ML Model Deployment

Production ML system with high availability:

- Model serving with low latency
- A/B testing infrastructure
- Feature store integration
- Model monitoring and drift detection
- Automated retraining pipelines
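
Drift detection can take many forms. A minimal sketch, assuming drift is measured on a single scalar signal (for example mean image brightness or detection confidence), is the Population Stability Index between a reference sample and a recent sample. The function name, bin count, and epsilon floor below are illustrative choices, and any alerting threshold would need tuning per system.

```python
import math


def population_stability_index(reference, current, bins=10, eps=1e-6):
    """Compare two samples of a scalar signal; 0.0 means identical histograms.

    Bin edges are derived from the reference sample, and values outside that
    range are clipped into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / bins

    def proportions(values):
        counts = [0] * bins
        for value in values:
            index = int((value - lo) / width)
            counts[min(max(index, 0), bins - 1)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(count / len(values), eps) for count in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))
```

A monitoring job could compute this per time window and trigger the retraining pipeline when the value stays above a chosen threshold.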

### Pattern 3: Real-Time Inference

High-throughput inference system:

- Batching and caching strategies
- Load balancing
- Auto-scaling
- Latency optimization
- Cost optimization
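
The batching and caching strategies above can be sketched as a small wrapper around a model call. This is an illustrative, single-threaded sketch: `CachedBatchPredictor` and the `predict_batch` callable are hypothetical names, inputs are assumed to be encoded image bytes, and a real service would also need timeouts and concurrency control.

```python
import hashlib
from collections import OrderedDict


class CachedBatchPredictor:
    """Wrap a batch prediction function with deduplication and an LRU cache."""

    def __init__(self, predict_batch, max_batch_size=8, cache_size=1024):
        self._predict_batch = predict_batch  # callable: list[bytes] -> list of outputs
        self._max_batch_size = max_batch_size
        self._cache_size = cache_size
        self._cache = OrderedDict()  # content hash -> output, oldest first

    def predict(self, images):
        keys = [hashlib.sha256(image).hexdigest() for image in images]
        resolved = {}  # results for this call, safe from cache eviction
        pending = {}   # unique cache misses, in first-seen order
        for key, image in zip(keys, images):
            if key in self._cache:
                self._cache.move_to_end(key)  # mark as recently used
                resolved[key] = self._cache[key]
            elif key not in pending:
                pending[key] = image
        pending_keys = list(pending)
        for start in range(0, len(pending_keys), self._max_batch_size):
            chunk = pending_keys[start:start + self._max_batch_size]
            outputs = self._predict_batch([pending[key] for key in chunk])
            for key, output in zip(chunk, outputs):
                resolved[key] = output
                self._cache[key] = output
                if len(self._cache) > self._cache_size:
                    self._cache.popitem(last=False)  # evict least recently used
        return [resolved[key] for key in keys]
```

Keying the cache on a content hash means repeated frames, which are common in static camera feeds, skip the model entirely.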

## Best Practices

### Development

- Test-driven development
- Code reviews and pair programming
- Documentation as code
- Version control everything
- Continuous integration

### Production

- Monitor everything critical
- Automate deployments
- Feature flags for releases
- Canary deployments
- Comprehensive logging
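
For canary deployments, one common approach is to route a fixed percentage of traffic to the new model version in a deterministic way, so the same request or user always lands on the same version. The sketch below assumes requests carry a string identifier; `route_to_canary` is a hypothetical helper, not part of the skill's scripts.

```python
import hashlib


def route_to_canary(request_id, canary_percent):
    """Return True if this request should be served by the canary version."""
    if not 0.0 <= canary_percent <= 100.0:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the hash onto 10,000 buckets, giving 0.01% routing granularity.
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < canary_percent * 100
```

Because routing depends only on the identifier, raising `canary_percent` gradually keeps already-routed requests on the canary while adding new ones.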

### Team Leadership

- Mentor junior engineers
- Drive technical decisions
- Establish coding standards
- Foster learning culture
- Cross-functional collaboration

## Performance Targets

**Latency:**
- P50: < 50ms
- P95: < 100ms
- P99: < 200ms
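
To check measured latencies against these targets, a minimal sketch using the nearest-rank percentile definition might look like the following. The function names and report shape are assumptions for illustration; other percentile definitions (for example, interpolated) will give slightly different values.

```python
import math

# Targets from the list above, in milliseconds, keyed by percentile.
LATENCY_TARGETS_MS = {50: 50.0, 95: 100.0, 99: 200.0}


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty sample, pct given as an integer."""
    if not samples:
        raise ValueError("no latency samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_latency(samples_ms):
    """Map each percentile name to (observed_ms, meets_target)."""
    report = {}
    for pct, target in LATENCY_TARGETS_MS.items():
        observed = percentile(samples_ms, pct)
        report[f"P{pct}"] = (observed, observed < target)
    return report
```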

**Throughput:**
- Requests/second: > 1000
- Concurrent users: > 10,000

**Availability:**
- Uptime: 99.9%
- Error rate: < 0.1%
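
The availability targets translate into concrete budgets. As a rough worked example, 99.9% uptime over a 30-day window allows about 43 minutes of downtime, and a 0.1% error rate allows about 1,000 failed requests per million. The helper names and the 30-day window below are assumptions chosen for illustration.

```python
def downtime_budget_minutes(uptime_percent, window_days=30):
    """Minutes of allowed downtime in the window for a given uptime target."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


def allowed_errors(total_requests, error_rate_percent):
    """Maximum failed requests that still meet the error-rate target."""
    return total_requests * error_rate_percent / 100
```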

## Security & Compliance

- Authentication & authorization
- Data encryption (at rest & in transit)
- PII handling and anonymization
- GDPR/CCPA compliance
- Regular security audits
- Vulnerability management

## Common Commands

```bash
# Development
python -m pytest tests/ -v --cov
python -m black src/
python -m pylint src/

# Training
python scripts/train.py --config prod.yaml
python scripts/evaluate.py --model best.pth

# Deployment
docker build -t service:v1 .
kubectl apply -f k8s/
helm upgrade service ./charts/

# Monitoring
kubectl logs -f deployment/service
python scripts/health_check.py
```

## Resources

- Advanced Patterns: `references/computer_vision_architectures.md`
- Implementation Guide: `references/object_detection_optimization.md`
- Technical Reference: `references/production_vision_systems.md`
- Automation Scripts: `scripts/` directory

## Senior-Level Responsibilities

As a world-class senior professional:

1. **Technical Leadership**
   - Drive architectural decisions
   - Mentor team members
   - Establish best practices
   - Ensure code quality

2. **Strategic Thinking**
   - Align with business goals
   - Evaluate trade-offs
   - Plan for scale
   - Manage technical debt

3. **Collaboration**
   - Work across teams
   - Communicate effectively
   - Build consensus
   - Share knowledge

4. **Innovation**
   - Stay current with research
   - Experiment with new approaches
   - Contribute to community
   - Drive continuous improvement

5. **Production Excellence**
   - Ensure high availability
   - Monitor proactively
   - Optimize performance
   - Respond to incidents

Overview

This skill delivers world-class computer vision expertise for building, optimizing, and operating production-grade image and video AI systems. It covers model development, real-time inference, 3D and video analysis, and end-to-end deployment using PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. The focus is on practical, scalable solutions that meet latency, throughput, and reliability targets for production environments.

How this skill works

The skill inspects project requirements, designs appropriate vision architectures, and implements training, evaluation, and inference pipelines with production-grade tooling. It optimizes datasets, model architectures, and inference paths (quantization, batching, hardware offload) and integrates monitoring, CI/CD, and security practices. Hands-on scripts and patterns streamline dataset processing, model training, deployment to containerized clusters, and runtime monitoring.
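
To make the quantization step concrete, here is a framework-free sketch of symmetric per-tensor int8 quantization applied to a flat list of weights. It is a conceptual illustration only: the function names are hypothetical, and a real pipeline would rely on the quantization tooling of its framework rather than hand-rolled code.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats; the error per value is at most scale / 2."""
    return [q * scale for q in quantized]
```

The trade-off is visible in the scale: a single large outlier weight stretches the range and costs precision for every other value, which is one reason per-channel schemes exist.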

When to use it

  • Building or re-architecting a vision system for production
  • Implementing object detection, segmentation, tracking, or 3D vision features
  • Optimizing inference latency, throughput, or cost at scale
  • Designing CI/CD, monitoring, and automated retraining for models
  • Deploying real-time video pipelines or distributed vision workloads

Best practices

  • Adopt test-driven development and version control for data, models, and code
  • Design modular pipelines: dataset ingestion, augmentation, training, serving, monitoring
  • Profile and optimize inference: quantization, batching, hardware placement, caching
  • Automate deployment with containers, Kubernetes, and canary or blue/green releases
  • Instrument models and data flows for drift detection, latency, and error monitoring
  • Enforce security and privacy: auth, encryption, PII handling, and compliance checks

Example use cases

  • Real-time object detection on edge devices with YOLO and quantized PyTorch models
  • High-accuracy instance segmentation for manufacturing QA using SAM and custom finetuning
  • Video analytics pipeline for multi-camera tracking and anomaly detection with distributed processing
  • 3D reconstruction and pose estimation for robotics or AR applications
  • Production deployment of a vision microservice with autoscaling, observability, and automated retraining

FAQ

What performance targets can I expect?

The stated inference latency targets are P50 < 50ms, P95 < 100ms, and P99 < 200ms. Achievable latency and throughput depend on model size and hardware; the throughput target is > 1000 requests per second with appropriate scaling.
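
As a rough capacity sketch, the number of serving replicas needed for a throughput target can be estimated from batch size and per-batch latency. The `replicas_needed` helper and the 70% headroom factor are assumptions for illustration; real numbers should come from load testing on the target hardware.

```python
import math


def replicas_needed(target_rps, batch_size, batch_latency_ms, headroom=0.7):
    """Estimate replica count, assuming each replica runs one batch at a time.

    headroom is the fraction of peak per-replica throughput to plan for,
    leaving the remainder as slack for traffic spikes.
    """
    per_replica_rps = batch_size * 1000 / batch_latency_ms
    return math.ceil(target_rps / (per_replica_rps * headroom))
```

For example, a hypothetical model that processes a batch of 8 images in 40 ms peaks at 200 requests per second per replica, so a 1000 RPS target with 70% headroom needs 8 replicas.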

Which frameworks and tools are recommended?

Use PyTorch for model development, OpenCV for preprocessing, YOLO/SAM/ViT for task-specific models, Docker/Kubernetes for deployment, and MLflow or W&B for experiment tracking and monitoring.