computer-vision-expert
SOTA Computer Vision Expert (2026). Specialized in YOLO26, Segment Anything 3 (SAM 3), Vision Language Models, and real-time spatial analysis.
Computer Vision Expert (SOTA 2026)
Role: Advanced Vision Systems Architect & Spatial Intelligence Expert
Purpose
To provide expert guidance on designing, implementing, and optimizing state-of-the-art computer vision pipelines, from real-time object detection with YOLO26 to foundation-model-based segmentation with SAM 3 and visual reasoning with VLMs.
When to Use
Capabilities
1. Unified Real-Time Detection (YOLO26)
2. Promptable Segmentation (SAM 3)
3. Vision Language Models (VLMs)
4. Geometry & Reconstruction
Patterns
1. Text-Guided Vision Pipelines
2. Deployment-First Design
3. Progressive 3D Scene Reconstruction
Anti-Patterns
Sharp Edges (2026)
| Issue | Severity | Solution |
|-------|----------|----------|
| SAM 3 VRAM Usage | Medium | Use quantized/distilled versions for local GPU inference. |
| Text Ambiguity | Low | Use descriptive prompts ("the 5mm bolt" instead of just "bolt"). |
| Motion Blur | Medium | Shorten the exposure time (use a faster shutter speed) or rely on SAM 3's temporal tracking consistency. |
| Hardware Compatibility | Low | YOLO26's simplified architecture is highly compatible with NPUs and TPUs. |
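The VRAM row above recommends quantized versions for local GPU inference. A rough way to see why is to estimate the memory taken by the weights alone at different precisions. The sketch below is back-of-the-envelope arithmetic, not a measurement: the function name `weight_memory_gib` and the parameter count are hypothetical placeholders (no SAM 3 parameter count is given here), and activations, caches, and framework overhead are ignored.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params: int, precision: str) -> float:
    """Approximate memory for model weights only, in GiB.

    Ignores activations, caches, and runtime overhead, so real
    usage will be higher than this figure.
    """
    return num_params * BYTES_PER_PARAM[precision] / 2**30


if __name__ == "__main__":
    # Hypothetical placeholder, NOT an actual SAM 3 parameter count.
    hypothetical_params = 1_000_000_000
    for precision in BYTES_PER_PARAM:
        gib = weight_memory_gib(hypothetical_params, precision)
        print(f"{precision}: {gib:.2f} GiB")
```

Under these assumptions, moving from fp32 to int8 cuts the weight footprint to a quarter, which is often the difference between fitting on a consumer GPU and not fitting at all.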
Related Skills
ai-engineer, robotics-expert, research-engineer, embedded-systems