COCO Processing
The COCO Processing module handles loading, parsing, and extracting data from COCO-format annotation files.
Overview
COCO (Common Objects in Context) is a widely-used dataset format for object detection, segmentation, and captioning tasks. This module provides tools to work with COCO-format data.
Features
- Load COCO annotation JSON files
- Extract instance segmentation masks
- Filter by category or image
- Convert annotations to standard formats
- Validate dataset integrity
Usage
Basic Loading
from src.coco_processing import COCOProcessor
processor = COCOProcessor('data/coco/annotations.json')
annotations = processor.load_annotations()
Extracting Masks
# Extract all instance masks
masks = processor.extract_masks()
# Extract masks for specific category
person_masks = processor.extract_masks(category='person')
# Extract masks for specific image
image_masks = processor.extract_masks(image_id=12345)
Category Information
# Get all categories
categories = processor.get_categories()
# Get instances by category
cars = processor.get_instances_by_category('car')
COCO Format Structure
A typical COCO annotation file has the following structure:
{
"images": [
{
"id": 1,
"file_name": "image1.jpg",
"width": 640,
"height": 480
}
],
"annotations": [
{
"id": 1,
"image_id": 1,
"category_id": 1,
"segmentation": [[x1, y1, x2, y2, ...]],
"area": 1234.5,
"bbox": [x, y, width, height]
}
],
"categories": [
{
"id": 1,
"name": "person",
"supercategory": "person"
}
]
}
Advanced Features
Filtering
# Filter by minimum area
large_objects = processor.filter_by_area(min_area=1000)
# Filter by bounding box size
filtered = processor.filter_by_bbox(min_width=50, min_height=50)
Visualization
# Visualize annotations on image
processor.visualize_annotations(image_id=12345, save_path='output.png')
# Show mask overlay
processor.show_mask_overlay(annotation_id=67890)
Configuration
Configure COCO processing via configs/coco_processing.yaml:
coco_processing:
# Filter settings
min_area: 100
max_area: 50000
# Category filtering
categories: ["person", "car", "bicycle"]
# Image filtering
min_image_width: 640
min_image_height: 480
# Preprocessing
normalize_masks: true
resize_to: [256, 256]
API Reference
For detailed API documentation, see the COCO Processing API Reference.
Next Steps
Once you have extracted masks from COCO annotations, proceed to 3D Generation to create 3D models.