Useful Data Sets

"Natural" Images

  • 80 Million Tiny Images (80MTI): 79,302,017 32x32 color images collected from the web
  • CIFAR-10: A labeled subset of 80MTI with 60,000 images in ten classes
  • CIFAR-100: A labeled subset of 80MTI with 60,000 images in 100 classes. The 100 classes are additionally grouped into 20 "super classes".
  • Caltech-101: A labeled set of 9,144 images in 102 classes (101 object categories plus a background class).
  • Caltech-256: A labeled set of 30,608 images in 257 classes (256 object categories plus a clutter class).
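
The class counts above give a rough sense of how much labeled data each class has. The sketch below divides the totals from this list by the number of classes. The dictionary and the helper name mean_images_per_class are illustrative and not part of any data set's tooling. The result is only an average: it does not imply that classes are equally sized.

```python
# Counts copied from the list above: (total images, number of classes).
DATASETS = {
    "CIFAR-10": (60_000, 10),
    "CIFAR-100": (60_000, 100),
    "Caltech-101": (9_144, 102),
    "Caltech-256": (30_608, 257),
}


def mean_images_per_class(n_images: int, n_classes: int) -> float:
    """Average number of images per class (says nothing about class balance)."""
    return n_images / n_classes


for name, (n_images, n_classes) in DATASETS.items():
    mean = mean_images_per_class(n_images, n_classes)
    print(f"{name}: about {mean:.1f} images per class on average")
```

The averages differ by well over an order of magnitude between CIFAR-10 and the Caltech sets, which is worth keeping in mind when comparing results across them.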

Stereo Images

  • Small NORB: A set of 96x96 greyscale stereo pairs of 50 unique objects in five categories, split into 24,300 training pairs and 24,300 test pairs. Variations in lighting and camera pose are labeled.

Faces

Handwriting Pen Trajectories

Text Corpora