DJL - Basic Dataset


This module contains a number of basic and standard datasets for the Deep Java Library (DJL). These datasets are used to train deep learning models.

List of datasets

This module contains the following datasets:

  • MNIST - A handwritten digits dataset
  • CIFAR10 - A dataset consisting of 60,000 32x32 color images in 10 classes
  • Coco - A large-scale object detection, segmentation, and captioning dataset that contains 1.5 million object instances
    • You have to manually add the com.twelvemonkeys.imageio:imageio-jpeg:3.5 dependency to your project.
  • ImageNet - An image database organized according to the WordNet hierarchy

    Note: You have to manually download the ImageNet dataset due to licensing requirements.

  • Pikachu - 1000 Pikachu images of different angles and sizes created using an open source 3D Pikachu model


The latest javadocs can be found on the website.

You can also build the latest javadocs locally using the following command:

# for Linux/macOS:
./gradlew javadoc

# for Windows:
..\gradlew javadoc

The javadocs output is generated in the build/doc/javadoc folder.


You can pull the module from the central Maven repository by including the following dependency in your pom.xml file:
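
A minimal sketch of such a dependency entry, assuming the module is published under the `ai.djl` group with the artifact ID `basicdataset`. The `djl.version` property is a hypothetical placeholder, not a real release number; define it in your `<properties>` section and set it to the DJL release you use.

```xml
<!-- Sketch only: coordinates assumed to be ai.djl:basicdataset. -->
<!-- djl.version is a placeholder property you must define yourself. -->
<dependency>
    <groupId>ai.djl</groupId>
    <artifactId>basicdataset</artifactId>
    <version>${djl.version}</version>
</dependency>
```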


Some datasets (e.g. COCO) contain non-standard image files that OpenJDK may fail to load. The TwelveMonkeys ImageIO plugins support a wide range of image formats. If you need to load images that the default JDK does not support, consider adding the following dependencies to your project:
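
As an illustration, the JPEG plugin named in the dataset list (com.twelvemonkeys.imageio:imageio-jpeg:3.5) could be declared as follows. This is a sketch covering only that one plugin; other TwelveMonkeys plugins would presumably follow the same pattern with a different artifact ID, and a newer version may suit your project better.

```xml
<!-- Sketch: the JPEG plugin at the version mentioned for the Coco dataset. -->
<dependency>
    <groupId>com.twelvemonkeys.imageio</groupId>
    <artifactId>imageio-jpeg</artifactId>
    <version>3.5</version>
</dependency>
```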