Explore COCO dataset and manipulate elements in the context of semantic segmentation
This notebook explores the COCO (Common Objects in Context) image dataset and provides helper functions for semantic image segmentation in Python. It builds on the tools and approach described in two publications by Viraf Patrawala. For the originals, you can visit his GitHub repo here. You will also find links to his excellent COCO walk-through papers.
“COCO is a large-scale object detection, segmentation, and captioning dataset.”
COCO provides multi-object labeling, segmentation mask annotations, image captioning, keypoint detection and panoptic segmentation annotations, with 80 object categories (81 when the background class is counted), making it a versatile, multi-purpose dataset.
The COCO 2017 dataset comes with nearly 120,000 training images, each with at least 5 captions, pixelwise semantic segmentation, keypoints and more, as well as 40,670 test images and 5,000 validation images.
Visit the COCO website for more information on COCO, including the data, paper, and tutorials. I recommend checking the exact format of the annotations, which is described on the COCO website here.
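To give a feel for that annotation format before touching the real files, here is a minimal sketch using only the standard library. It assumes the instance-annotation layout with top-level `images`, `categories` and `annotations` lists. The sample JSON is hand-written for illustration (ids, file names and coordinates are made up), and the helper names `category_names`, `annotations_by_image` and `labels_per_image` are hypothetical, not part of any COCO tooling.

```python
import json
from collections import defaultdict

# Tiny hand-written stand-in for an instance-annotation file.
# All ids, file names and coordinates are made up for illustration.
SAMPLE_JSON = """
{
  "images": [
    {"id": 1, "file_name": "example_1.jpg", "width": 640, "height": 480},
    {"id": 2, "file_name": "example_2.jpg", "width": 500, "height": 375}
  ],
  "categories": [
    {"id": 1, "name": "person", "supercategory": "person"},
    {"id": 18, "name": "dog", "supercategory": "animal"}
  ],
  "annotations": [
    {"id": 101, "image_id": 1, "category_id": 1, "iscrowd": 0,
     "bbox": [10.0, 20.0, 100.0, 200.0], "area": 20000.0,
     "segmentation": [[10.0, 20.0, 110.0, 20.0, 110.0, 220.0, 10.0, 220.0]]},
    {"id": 102, "image_id": 1, "category_id": 18, "iscrowd": 0,
     "bbox": [300.0, 250.0, 80.0, 60.0], "area": 4800.0,
     "segmentation": [[300.0, 250.0, 380.0, 250.0, 380.0, 310.0, 300.0, 310.0]]},
    {"id": 103, "image_id": 2, "category_id": 18, "iscrowd": 0,
     "bbox": [50.0, 40.0, 120.0, 90.0], "area": 10800.0,
     "segmentation": [[50.0, 40.0, 170.0, 40.0, 170.0, 130.0, 50.0, 130.0]]}
  ]
}
"""


def category_names(dataset):
    """Map each category id to its human-readable name."""
    return {cat["id"]: cat["name"] for cat in dataset["categories"]}


def annotations_by_image(dataset):
    """Group annotation records by the image they belong to."""
    grouped = defaultdict(list)
    for ann in dataset["annotations"]:
        grouped[ann["image_id"]].append(ann)
    return grouped


def labels_per_image(dataset):
    """Return {file_name: sorted list of category names present in the image}."""
    names = category_names(dataset)
    grouped = annotations_by_image(dataset)
    return {
        img["file_name"]: sorted(names[a["category_id"]] for a in grouped[img["id"]])
        for img in dataset["images"]
    }


dataset = json.loads(SAMPLE_JSON)
for file_name, labels in labels_per_image(dataset).items():
    print(file_name, labels)
```

Each annotation points to its image through `image_id` and to its class through `category_id`, the `bbox` is given as `[x, y, width, height]`, and a polygon `segmentation` is a flat list of `x, y` coordinates. With a real annotation file, the same functions should work after replacing `json.loads(SAMPLE_JSON)` with `json.load` on the opened file.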
I used this folder structure, which you can replicate if you wish. The top folder is my working drive, and it is where the notebook goes.