The largest dataset of grocery product images from North American supermarkets (15,000+ items)
My latest grocery store images dataset contains over 15,000 labeled images of grocery products commonly found in U.S. and other North American retail stores. (I plan on getting it to a million!) As far as I know, this is the largest supermarket product images dataset that you can find online. And it contains real images.
It came out of a need to prototype a web application I am building. My use case was a simple one: inserting the images into a relational database for semantic search. For everyone else, the dataset is annotated for computer vision and image classification.

Viewing sample grocery product images from the Supermarket Product Images dataset in the Kaggle Data Explorer.
The dataset consists of JPEG images, organized by category, alongside CSV files. The CSVs provide metadata and labels, and they also contain the same images encoded as base64 strings. Go right ahead and read these images into Pandas, Jupyter jockeys 😉
Contents of the Supermarket Product Images Dataset
Image Files
All images are stored as JPEGs using a predictable directory layout:
./jpg/:category_id/:id.jpg
- Images are grouped by category
- Each image has a stable numeric ID
- Image sizes may vary slightly (see metadata)
This layout makes the dataset easy to load with common computer vision pipelines, PyTorch datasets, or custom data loaders.
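As a rough illustration of how that layout can be walked, here is a minimal sketch using only the Python standard library. The function name index_images is hypothetical, and the sketch assumes every file sits exactly one directory below the jpg root with a .jpg extension, as the pattern above suggests.

```python
from pathlib import Path


def index_images(root):
    """List images stored as <root>/<category_id>/<id>.jpg.

    Assumption: the directory name is the category ID and the file
    stem is the image ID, per the layout described in the text.
    """
    records = []
    for path in sorted(Path(root).glob("*/*.jpg")):
        records.append(
            {
                "category_id": path.parent.name,  # directory name
                "id": path.stem,                  # file name without .jpg
                "path": path,
            }
        )
    return records
```

Calling index_images("./jpg") should give one record per image, which can then be handed to whatever loader or framework you prefer.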
CSV Files
The Supermarket Product Images dataset includes multiple CSV files depending on how you prefer to work with data:
metadata.csv
Contains image‑level metadata:
- Category ID (tokenized)
- Image ID
- Width (pixels)
- Height (pixels)
tokens.csv
A lookup table mapping:
category_id → category name in plain English
images.csv
- Base64‑encoded image data
- Reading images from a single file can be simple and fast!
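Decoding the base64 column might look something like the sketch below. The column names id and image are assumptions (check the real header of images.csv), and iter_images is a hypothetical helper name. The field size limit is raised because a base64-encoded image can easily exceed the csv module's default per-field limit.

```python
import base64
import csv

# Base64 image strings are long; lift the csv module's default field limit.
csv.field_size_limit(2**31 - 1)


def iter_images(csv_path, id_column="id", data_column="image"):
    """Yield (image_id, jpeg_bytes) pairs from a CSV of base64 images.

    Assumption: the column names are placeholders, not the confirmed
    header of images.csv.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            yield row[id_column], base64.b64decode(row[data_column])
```

Each yielded bytes object is the raw JPEG, so it can be written to disk or passed to an image library as an in-memory buffer.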
Schema
metadata.csv ──┐
               ├── id ──> :category_id/:id.jpg
images.csv ────┘
tokens.csv: category_id ──> category (plain English)
Notice that you can:
- Load images directly from disk
- Join metadata and labels (and even images) via CSV
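The join between metadata and labels can be sketched with the standard library alone. The column names used here (id, category_id, category) are assumptions based on the schema above, not the confirmed CSV headers, and both function names are hypothetical.

```python
import csv


def load_labels(tokens_path):
    """Build a category_id -> plain-English category lookup."""
    with open(tokens_path, newline="", encoding="utf-8") as handle:
        return {
            row["category_id"]: row["category"]
            for row in csv.DictReader(handle)
        }


def label_metadata(metadata_path, tokens_path):
    """Attach a category name and relative image path to each metadata row.

    Assumption: metadata.csv has 'id' and 'category_id' columns.
    """
    labels = load_labels(tokens_path)
    rows = []
    with open(metadata_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            # None marks a category_id missing from tokens.csv.
            row["category"] = labels.get(row["category_id"])
            row["path"] = f"jpg/{row['category_id']}/{row['id']}.jpg"
            rows.append(row)
    return rows
```

The same join is a one-line merge on category_id if you load both files into Pandas instead.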

Viewing the combined CSV file for the Supermarket Product Images dataset in the Kaggle Data Explorer.
Use Cases for the Supermarket Product Images Dataset
Use the Supermarket Product Images dataset for a wide range of applications:
- Image classification
- Grocery and retail product recognition
- Computer vision model benchmarking
- Educational machine learning projects
- Rapid prototyping with CNNs or vision transformers
The dataset fills a gap by reflecting real consumer products. I personally could not find anything like it elsewhere on the internet; as far as I can tell, there is no other self-contained dataset of real product images. Cheers to shopping? Maybe.

