ISLR Datasets

This section lists the supported pose-based isolated sign language recognition (ISLR) datasets available for training and explains how to add support for your own dataset.

Supported Datasets

The following pose datasets are available out-of-the-box.

Dataset                        Sign Language  Download Pose Data
ASLLVDDataset                  American       0.3 GB
AUTSLDataset                   Turkish        2.2 GB
Bosphorus22kDataset            Turkish        Not public
CSLDataset                     Chinese        Not public
DeviSignDataset                Chinese        Not public
GSLDataset                     Greek          0.7 GB
INCLUDEDataset                 Indian         0.6 GB
LSA64Dataset                   Argentinian    0.2 GB
MSASLDataset                   American       1.3 GB
RWTH_Phoenix_Signer03_Dataset  German         0.03 GB
WLASLDataset                   American       1 GB

Usage

To use one of the datasets listed above, follow these steps:

  • Download the zip from above for the required sign language.
    • Extract it to any desired folder.

  • Specify the dataset class and the path to the extracted dataset in the config.
    • For example configs, click here.
    • Feel free to change any other dataset-related parameters in the config.

You can now directly proceed to train!
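
The extraction step above can also be scripted. The following is a minimal sketch using only the Python standard library; the function name `extract_pose_dataset` and the file names in the usage comment are hypothetical and not part of the toolkit.

```python
import zipfile
from pathlib import Path


def extract_pose_dataset(zip_path, target_dir):
    """Extract a downloaded pose-data zip and return the extraction folder."""
    target = Path(target_dir)
    # Create the destination folder if it does not exist yet
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


# Hypothetical usage; substitute the zip you actually downloaded:
# dataset_dir = extract_pose_dataset("AUTSL.zip", "data/AUTSL")
```

The returned folder is the path you would then reference in the config alongside the dataset class.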

Custom Datasets

To add support for your own dataset, create a class with the following structure:

from .base import BaseIsolatedDataset

class MyDatasetDataset(BaseIsolatedDataset):
    def read_glosses(self):
        self.glosses = ... # Populate the list of all glosses

    def read_original_dataset(self):
        self.data = ... # Populate the list of all video files and gloss IDs as tuples

    def read_video_data(self, index):
        # Read the following ...
        return imgs, label, video_name

  • For implementation examples, check this folder in the source code.

  • This class can now be referenced in your config file appropriately, and used for training or inference.
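
As an illustration of how the first two methods might be filled in, here is a minimal sketch that assumes a hypothetical layout with one sub-folder per gloss, each containing `.mp4` files. `StandInBaseDataset` is a stand-in written only so the sketch runs on its own; it is not the library's `BaseIsolatedDataset`, whose constructor and attributes may differ.

```python
from pathlib import Path


class StandInBaseDataset:
    """Hypothetical stand-in for BaseIsolatedDataset (assumption, not the real class)."""

    def __init__(self, root_dir):
        self.root_dir = Path(root_dir)
        self.glosses = []
        self.data = []
        self.read_glosses()
        self.read_original_dataset()


class FolderPerGlossDataset(StandInBaseDataset):
    def read_glosses(self):
        # Each sub-folder name is treated as a gloss; sorting keeps gloss IDs stable
        self.glosses = sorted(p.name for p in self.root_dir.iterdir() if p.is_dir())

    def read_original_dataset(self):
        # Collect (video file, gloss ID) tuples, one per video found
        self.data = [
            (str(video), gloss_id)
            for gloss_id, gloss in enumerate(self.glosses)
            for video in sorted((self.root_dir / gloss).glob("*.mp4"))
        ]
```

Here the gloss ID is simply the index of the gloss in the sorted list, so the same folder layout always yields the same label mapping.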

Finger-spelling Datasets

This section lists the supported pose-based finger-spelling datasets.

Supported Datasets

The following pose datasets are available out-of-the-box.

Sign Language  Download Pose Data
American       63 MB
Argentine      25 MB
Chinese        Not public
German         46 MB
Greek          19 MB
Indian         28 MB
Turkish        40 MB