Supermarket Product Images Dataset

The largest dataset of grocery product images from North American supermarkets (15,000+ items)

Download it on Kaggle ->

My latest grocery store images dataset contains over 15,000 labeled images of grocery products commonly found in U.S. and other North American retail stores. (I plan on getting it to a million!) This is the largest supermarket product images dataset that you can find online, and it contains real product images.

It came out of a need to prototype a web application I am building. My use case was a simple one: inserting the images into a relational database for semantic search. For everyone else, the dataset is annotated for computer vision and image classification.

Viewing sample grocery product images from the Supermarket Product Images dataset in the Kaggle Data Explorer.

The dataset consists of JPEG images, organized by category, alongside CSV files. The CSVs provide metadata and labels, plus the same images encoded as base64 strings. Go right ahead and read these images into pandas, Jupyter jockeys 😉

Contents of the Supermarket Product Images Dataset

Image Files

All images are stored as JPEGs using a predictable directory layout:

./jpg/:category_id/:id.jpg

  • Images are grouped by category
  • Each image has a stable numeric ID
  • Image sizes may vary slightly (see metadata)

This layout makes the dataset easy to load with common computer vision pipelines, PyTorch datasets, or custom data loaders.
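Here is a minimal sketch of walking that layout with nothing but the Python standard library. It assumes the images sit under a local `jpg` folder exactly as shown above; `ImageRef` and `iter_images` are names I made up for this illustration, not part of the dataset.

```python
from pathlib import Path
from typing import Iterator, NamedTuple


class ImageRef(NamedTuple):
    """One image on disk, identified by the two path components of the layout."""
    category_id: str
    image_id: str
    path: Path


def iter_images(root: Path) -> Iterator[ImageRef]:
    """Yield one ImageRef per file matching <root>/<category_id>/<id>.jpg."""
    # The glob only matches .jpg files exactly one directory below root.
    for path in sorted(root.glob("*/*.jpg")):
        yield ImageRef(
            category_id=path.parent.name,  # folder name is the category ID
            image_id=path.stem,            # file name without ".jpg" is the image ID
            path=path,
        )
```

Calling `list(iter_images(Path("./jpg")))` should give you every image together with its category ID, ready to hand to whatever loader you prefer.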


CSV Files

The Supermarket Product Images dataset includes multiple CSV files, so you can pick whichever fits how you prefer to work with data:

metadata.csv

Contains image‑level metadata:

  • Category ID (tokenized)
  • Image ID
  • Width (pixels)
  • Height (pixels)

tokens.csv

A lookup table mapping:

  • category_id → category name in plain English

images.csv

  • Base64‑encoded image data
  • Reading images from a single file can be simple and fast!
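A rough sketch of decoding those base64 strings back into JPEG bytes follows. The column names `id` and `base64` are my assumptions for illustration, so check the actual header of images.csv before running it. The helper names are hypothetical too.

```python
import base64
import csv
from pathlib import Path
from typing import Iterator, Tuple

# Base64 image strings can exceed the csv module's default 128 KiB field limit,
# so raise it before reading.
csv.field_size_limit(2**31 - 1)

ID_COLUMN = "id"        # assumed column name, verify against the real header
DATA_COLUMN = "base64"  # assumed column name, verify against the real header


def iter_decoded_images(csv_path: Path) -> Iterator[Tuple[str, bytes]]:
    """Yield (image_id, raw_image_bytes) for each row of the CSV."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            yield row[ID_COLUMN], base64.b64decode(row[DATA_COLUMN])


def looks_like_jpeg(data: bytes) -> bool:
    """JPEG files start with the two-byte start-of-image marker FF D8."""
    return data[:2] == b"\xff\xd8"
```

The decoded bytes are the same thing you would get by reading the matching .jpg from disk, so you can write them to a file or pass them to an image library through an in-memory buffer.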

Schema

metadata.csv ──┐
               ├── id ──> :category_id/:id.jpg
images.csv ────┘


tokens.csv: category_id ──> category (plain English)

Notice that you can:

  • Load images directly from disk
  • Join metadata and labels (and even images) via CSV
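To make the join concrete, here is a small sketch that attaches plain-English category names to each metadata row and rebuilds the image path. The column names (`category_id`, `id`, `width`, `height`, `category`) are assumptions based on the field descriptions above, and the function names are hypothetical.

```python
import csv
from pathlib import Path
from typing import Dict, List


def load_category_names(tokens_path: Path) -> Dict[str, str]:
    """Build a category_id -> category name lookup from tokens.csv."""
    with tokens_path.open(newline="", encoding="utf-8") as handle:
        # Column names are assumed; adjust them to the real header.
        return {row["category_id"]: row["category"] for row in csv.DictReader(handle)}


def label_metadata(metadata_path: Path, tokens_path: Path, image_root: Path) -> List[dict]:
    """Join metadata.csv with tokens.csv and add the on-disk path of each image."""
    names = load_category_names(tokens_path)
    records = []
    with metadata_path.open(newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            records.append({
                "id": row["id"],
                # None when a category ID has no entry in tokens.csv
                "category": names.get(row["category_id"]),
                "width": int(row["width"]),
                "height": int(row["height"]),
                # Mirrors the ./jpg/:category_id/:id.jpg layout
                "path": image_root / row["category_id"] / f"{row['id']}.jpg",
            })
    return records
```

The same join is a one-liner with a dataframe merge if you prefer pandas; the standard-library version just keeps the example free of dependencies.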

Viewing the combined CSV file for the Supermarket Product Images dataset in the Kaggle Data Explorer.

Use Cases for the Supermarket Product Images Dataset

Use the Supermarket Product Images dataset for a wide range of applications:

  • Image classification
  • Grocery and retail product recognition
  • Computer vision model benchmarking
  • Educational machine learning projects
  • Rapid prototyping with CNNs or vision transformers

The dataset fills a stubborn gap: it reflects real consumer products. I personally could not find anything like it elsewhere on the internet. Among the datasets I looked at, none was a self-contained collection of real product images. Cheers to shopping? Maybe.

Alexander Wei
