
Computer Vision

AI tasks involving images, videos, and spatial understanding

Computer vision tasks work with images and videos. They are grouped below by the kind of output they produce.

Classification Tasks

  • Image Classification: Assign labels or categories to images
  • Video Classification: Classify actions or scenes in video content
  • Zero-Shot Image Classification: Classify images without task-specific training
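One common way to classify images without task-specific training is to compare an image embedding against embeddings of candidate label texts and pick the closest one. The sketch below illustrates only that comparison step. The function names and the three-dimensional toy vectors are assumptions made for illustration; in practice the embeddings would come from a model that maps images and text into a shared space.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(image_embedding, label_embeddings):
    """Return the label whose embedding is most similar to the image embedding.

    label_embeddings maps each candidate label to its (hypothetical) text embedding.
    """
    scores = {
        label: cosine_similarity(image_embedding, embedding)
        for label, embedding in label_embeddings.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Toy vectors, invented for this example (not produced by a real model).
label_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.1, 0.9, 0.0],
    "car": [0.0, 0.1, 0.9],
}
image_embedding = [0.8, 0.3, 0.1]

best_label, scores = zero_shot_classify(image_embedding, label_embeddings)
print(best_label)
```

Because the label set is just a dictionary of embeddings, new categories can be added at inference time without retraining, which is the point of the zero-shot setting.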

Detection and Localization

  • Object Detection: Detect and localize objects within images
  • Zero-Shot Object Detection: Localize objects from categories not seen during training
  • Keypoint Detection: Detect body, pose, or facial keypoints
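Localization outputs are usually bounding boxes, and a standard way to compare a predicted box with a reference box is intersection over union (IoU). The following is a minimal sketch, assuming boxes are given as (x_min, y_min, x_max, y_max) tuples in pixel coordinates; the function name `box_iou` is chosen for this example.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max). Returns a value in [0, 1].
    """
    # Corners of the overlapping rectangle, if there is one.
    inter_x_min = max(box_a[0], box_b[0])
    inter_y_min = max(box_a[1], box_b[1])
    inter_x_max = min(box_a[2], box_b[2])
    inter_y_max = min(box_a[3], box_b[3])

    # Clamp to zero so that disjoint boxes give an empty intersection.
    inter_width = max(0, inter_x_max - inter_x_min)
    inter_height = max(0, inter_y_max - inter_y_min)
    intersection = inter_width * inter_height

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


predicted = (0, 0, 10, 10)
reference = (5, 5, 15, 15)
print(box_iou(predicted, reference))  # 25 / 175, roughly 0.143
```

A detection is often counted as correct when its IoU with a reference box exceeds a chosen threshold; the threshold itself depends on the benchmark and is not fixed by this sketch.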

Segmentation

  • Image Segmentation: Pixel-level labeling for object or region boundaries
  • Mask Generation: Generate segmentation masks automatically
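Pixel-level labeling can be pictured as choosing, for every pixel, the class with the highest score, and then reading off a binary mask per class. The sketch below assumes the per-class scores are already available as nested lists indexed as scores[class][row][column]; the score values and function names are made up for illustration and do not come from a real model.

```python
def label_map(scores):
    """Assign each pixel the index of its highest-scoring class."""
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


def binary_mask(labels, class_id):
    """Mask with 1 where the pixel belongs to class_id and 0 elsewhere."""
    return [[1 if value == class_id else 0 for value in row] for row in labels]


# Toy scores for 2 classes on a 2x3 image (invented values).
scores = [
    [[0.9, 0.2, 0.1], [0.8, 0.7, 0.3]],  # class 0
    [[0.1, 0.8, 0.9], [0.2, 0.3, 0.7]],  # class 1
]

labels = label_map(scores)
mask = binary_mask(labels, class_id=1)
print(labels)  # [[0, 1, 1], [0, 0, 1]]
print(mask)    # [[0, 1, 1], [0, 0, 1]]
```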

Generation Tasks

  • Text-to-Image: Generate images from text prompts
  • Text-to-Video: Generate videos from text descriptions
  • Image-to-Image: Modify or restyle images using another image or prompt
  • Image-to-Video: Generate videos based on input images
  • Video-to-Video: Transform or modify video content
  • Unconditional Image Generation: Generate images without any prompt or condition

3D Tasks

  • Text-to-3D: Generate 3D models from text descriptions
  • Image-to-3D: Reconstruct 3D shapes from images

Other Vision Tasks

  • Depth Estimation: Predict a per-pixel depth map from images
  • Image-to-Text: Convert images into natural language descriptions
  • Image Feature Extraction: Generate embeddings or semantic features from images
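A per-pixel depth map is a grid of distance values, and it is often rescaled to an 8-bit range so it can be viewed as a grayscale image. The following is a small sketch of that rescaling, assuming the depth map is a nested list of floats; the values and the function name `depth_to_grayscale` are hypothetical.

```python
def depth_to_grayscale(depth):
    """Min-max scale a depth map to integer values in 0..255."""
    values = [value for row in depth for value in row]
    nearest, farthest = min(values), max(values)

    # A constant depth map has no range to scale, so map it to all zeros.
    if farthest == nearest:
        return [[0 for _ in row] for row in depth]

    scale = 255.0 / (farthest - nearest)
    return [[round((value - nearest) * scale) for value in row] for row in depth]


# Toy depth values in arbitrary units (invented for this example).
depth = [
    [1.0, 2.0],
    [3.0, 5.0],
]
gray = depth_to_grayscale(depth)
print(gray)
```

With this convention the nearest pixel maps to 0 and the farthest to 255; some visualizations invert this so that nearer surfaces appear brighter.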
  • OCR: Extract text from images and documents
