MID-1K

Multimodal ISR Dataset [MID-1K]

https://github.com/kennedyk1/MID-1K

For RAC/RML2024 Course - Google Notebook

This dataset, collected by the ISR (Institute of Systems and Robotics) Team, is a new multi-sensory dataset that was organized, calibrated, curated, and annotated. The sensory data was collected using ROS and a Clearpath Jackal mobile robot (see Fig. 1), operating in two indoor environments: three floors of the DEEC building and two floors of the DEI building at the University of Coimbra, Polo 2, Portugal.

To clone this dataset to your local environment, open a terminal and run:

   git clone https://github.com/kennedyk1/MID-1K

Fig. 1: Sensors on the Clearpath Jackal mobile robot: (RGB) Ximea MQ013CG-E2; (Thermal) FLIR Boson, 640x512 pixels, 50° 8.7 mm lens, 60 fps; (LiDAR) Ouster OS1-64.

The dataset consists of 1,100 selected frames, each containing RGB, thermal and depth-map images (the latter generated from LiDAR), for a total of 3,300 image files (see Fig. 2). The sensors have been calibrated, but a small temporal misalignment is present due to hardware limitations.

Fig. 2: Example dataset frames in the Depth (LiDAR), RGB and Thermal modalities.

The dataset was split into training and test sets. The training set contains 703 images and 2,191 annotations, while the test set contains 397 images and 1,629 annotations. The depth image representation was generated from the projected LiDAR point clouds using a modified bilateral filter. In particular, each LiDAR point cloud is projected onto the RGB image plane using the camera-LiDAR calibration matrix; a sliding-window weighting function, dependent on the range dispersion, is then used to interpolate the points inside the mask, generating a dense representation.
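The projection and interpolation steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' modified bilateral filter: the function names, the window size, and the `sigma_s` / `sigma_r` parameters are assumptions made for the example, and the range term simply down-weights samples that lie far behind the nearest sample in the window. `K` (3x3 intrinsics) and `T_cam_lidar` (4x4 extrinsics) stand in for the camera-LiDAR calibration, whose actual values are not given here.

```python
import numpy as np

def project_points(points_xyz, K, T_cam_lidar, width=640, height=512):
    """Project LiDAR points (N, 3) into the image; return pixel coords and depths."""
    n = points_xyz.shape[0]
    homog = np.hstack([points_xyz, np.ones((n, 1))])      # (N, 4) homogeneous points
    cam = (T_cam_lidar @ homog.T).T[:, :3]                # points in the camera frame
    cam = cam[cam[:, 2] > 0]                              # keep points in front of the camera
    uvw = (K @ cam.T).T                                   # pinhole projection
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    return u[inside], v[inside], cam[inside, 2]

def sparse_depth(u, v, depth, width=640, height=512):
    """Rasterise projected points; 0 marks pixels without a LiDAR return."""
    img = np.zeros((height, width), dtype=np.float64)
    order = np.argsort(-depth)                            # far first, so the nearest point wins
    img[v[order], u[order]] = depth[order]
    return img

def densify(sparse, window=7, sigma_s=2.0, sigma_r=0.5):
    """Bilateral-style sliding-window interpolation (illustrative weighting only)."""
    h, w = sparse.shape
    r = window // 2
    padded = np.pad(sparse, r, mode="constant")
    yy, xx = np.mgrid[-r:r + 1, -r:r + 1]
    w_spatial = np.exp(-(xx**2 + yy**2) / (2 * sigma_s**2))
    dense = np.zeros_like(sparse)
    for y in range(h):
        for x in range(w):
            patch = padded[y:y + window, x:x + window]
            valid = patch > 0
            if not valid.any():
                continue                                  # no samples nearby: leave empty
            ref = patch[valid].min()                      # nearest surface in the window
            w_range = np.exp(-((patch - ref) ** 2) / (2 * sigma_r**2))
            weights = w_spatial * w_range * valid
            dense[y, x] = (weights * patch).sum() / weights.sum()
    return dense
```

A typical call chain would be `densify(sparse_depth(*project_points(points, K, T_cam_lidar)))`. The per-pixel Python loop is kept for readability and is slow on full 640x512 frames.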

| Dataset Details | |
| --- | --- |
| Total Files | 6,600 files |
| Total Images | 3,300 images |
| Total Annotations | 3,300 txt files |
| Dataset Size | Approximately 750 MB |
| Images Resolution | 640x512 pixels |
| Annotation Format | YOLO |

| Data Distribution | #Images |
| --- | --- |
| Depth Images | 1,100 |
| RGB Images | 1,100 |
| Thermal Images | 1,100 |

| | RGB | Thermal | Depth |
| --- | --- | --- | --- |
| Model | Ximea MQ013CG-E2 | FLIR BOSON 640 LWIR | OUSTER OS1-64-U |
| Type | Colour Camera | Thermal Camera | LiDAR |
| Spec. | 1280x1024 pixels, 1.3 MP | 640x512 pixels | Vert. Res.: 64 channels; Hor. Res.: 1024 points |
| Image Type | .png | .png | .png |
| Total Images | 1,100 | 1,100 | 1,100 |
| Total Files Annotations | 1,100 txt files | 1,100 txt files | 1,100 txt files |
| Total Annotations (people) | 3,820 | 3,797 | 3,820¹ |
| Images Resolution | 640x512 | 640x512 | 640x512 |
| Annotation Format | YOLO xywhn² | YOLO xywhn² | YOLO xywhn² |

¹ The depth images reuse the labels from the RGB modality, because the LiDAR was calibrated with the RGB camera.
² YOLO normalized xywh format: class x_center y_center width height, with the four box values given as fractions of the image width and height.
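
As an illustration of the annotation format, the sketch below converts one normalized YOLO line into pixel corner coordinates for a 640x512 image. The helper names `parse_yolo_line` and `load_labels` are hypothetical, and the sample line is made up for the example rather than taken from the dataset.

```python
def parse_yolo_line(line, img_w=640, img_h=512):
    """Convert 'class x_center y_center width height' (normalized) to pixel corners."""
    cls, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w   # scale horizontal values by image width
    yc, h = float(yc) * img_h, float(h) * img_h   # scale vertical values by image height
    x1, y1 = xc - w / 2, yc - h / 2               # top-left corner
    x2, y2 = xc + w / 2, yc + h / 2               # bottom-right corner
    return int(cls), (x1, y1, x2, y2)

def load_labels(path, img_w=640, img_h=512):
    """Read every non-empty line of a YOLO .txt annotation file."""
    with open(path) as f:
        return [parse_yolo_line(l, img_w, img_h) for l in f if l.strip()]

# Made-up example line: a box centred in the image, a quarter as wide and half as tall.
print(parse_yolo_line("0 0.5 0.5 0.25 0.5"))  # (0, (240.0, 128.0, 400.0, 384.0))
```

The default size matches the 640x512 image resolution listed above; pass other dimensions if the images are resized.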

References