TomatoMAP: Solanum lycopersicum (Tomato) Multi-Angle Multi-Pose Dataset

A Novel Dataset for Fine-Grained Phenotyping

1Institute for Breeding Research on Horticultural Crops, Julius Kuehn-Institute, Erwin-Baur-Street 27, 06484, Quedlinburg, Germany
2Computer Graphics Group, Center for Sensor Systems (ZESS), University of Siegen, 57076 Siegen, Germany

Abstract

Observer bias and inconsistencies in traditional plant phenotyping methods limit the accuracy and reproducibility of fine-grained plant analysis. To overcome these challenges, we developed TomatoMAP, a comprehensive dataset for Solanum lycopersicum acquired with an Internet of Things (IoT) based imaging system and standardized data acquisition protocols. The dataset contains 64,464 RGB images that capture 12 different plant poses from four camera elevation angles. Each image includes manually annotated bounding boxes for seven regions of interest (ROIs): leaves, panicle, batch of flowers, batch of fruits, axillary shoot, shoot, and whole plant area. Each image is also assigned one of 50 fine-grained growth stage classes based on the BBCH scale. Additionally, we provide a subset of 3,616 high-resolution images with pixel-wise semantic and instance segmentation annotations for fine-grained phenotyping. We validated the dataset using a cascading-model deep learning framework that combines MobileNetV3 for classification, YOLOv11 for object detection, and Mask R-CNN for segmentation. Through an AI vs. Human analysis involving five domain experts, we demonstrate that models trained on our dataset achieve accuracy and speed comparable to those of the experts. Cohen's kappa and an inter-rater agreement heatmap confirm the reliability of automated fine-grained phenotyping using our approach.


Keywords: phenotyping, fine-grained, Solanum lycopersicum, Internet of Things, cascading model, AI vs. Human
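
The abstract reports Cohen's kappa as the agreement measure between the models and the experts. As a reminder of what that statistic computes, here is a minimal sketch in plain Python. The function name `cohens_kappa` and the example labels are hypothetical and are not taken from the TomatoMAP code or data. The sketch only illustrates the standard formula kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both raters must label the same non-empty item list")

    n = len(labels_a)

    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies,
    # summed over all labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up growth-stage labels for four images (illustration only).
expert = ["51", "51", "61", "61"]
model = ["51", "51", "61", "51"]
print(cohens_kappa(expert, model))  # 0.5
```

A value of 1 means perfect agreement and 0 means agreement no better than chance. With many classes, as in a 50-class growth-stage task, chance agreement is typically small, so kappa stays close to the raw agreement rate.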

Showcase

Phenotyper v1.0

To enable multi-angle, multi-pose imaging of S. lycopersicum, the data acquisition system integrates a synchronized multi-camera array with a rotational platform, allowing systematic and repeatable image capture across both spatial and temporal dimensions. The imaging station comprises four OV5647 5-megapixel color CMOS image sensors: three equipped with a 90° lens and one with a 170° fisheye lens. The cameras are mounted at vertical inclination angles of 45°, 135°, and 180°, each offering an adjustable focal length, an aperture of F/2.2, and fields of view of 90° (diagonal) and 72° (horizontal), allowing comprehensive coverage of the entire plant structure at short range.
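
To make the capture geometry concrete, the following sketch enumerates the camera and pose combinations implied by the four sensors described above and the 12 poses reported in the abstract. Two things here are assumptions rather than statements from the source: the poses are taken to be evenly spaced around the platform (a 30° step), and a "session" is a hypothetical term for one full rotation in which every camera records every pose. All names in the code are illustrative.

```python
from itertools import product

NUM_CAMERAS = 4        # four OV5647 sensors (from the text)
NUM_POSES = 12         # plant poses per rotation (from the abstract)
TOTAL_IMAGES = 64_464  # RGB images in the dataset (from the abstract)

# Assumption: poses are evenly spaced over a full turn of the platform.
STEP_DEG = 360 / NUM_POSES


def capture_grid():
    """Return (camera_index, pose_index, platform_angle_deg) for every view."""
    return [
        (cam, pose, pose * STEP_DEG)
        for cam, pose in product(range(NUM_CAMERAS), range(NUM_POSES))
    ]


views_per_session = len(capture_grid())
sessions, remainder = divmod(TOTAL_IMAGES, views_per_session)

print(views_per_session)    # 48
print(sessions, remainder)  # 1343 0
```

Under these assumptions the image count divides evenly into 1,343 full sessions of 48 views each, which is consistent with a complete camera-by-pose grid per capture. This is an inference from the reported numbers, not a figure stated by the authors.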

Fine-Grained Phenotyping

Citation

BibTeX
@misc{zhang2025tomatomultianglemultiposedataset,
      title={Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping}, 
      author={Yujie Zhang and Sabine Struckmeyer and Andreas Kolb and Sven Reichardt},
      year={2025},
      eprint={2507.11279},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2507.11279}, 
}

@dataset{tomatomap,
  title={TomatoMAP: Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping},
  author={Yujie Zhang and Sabine Struckmeyer and Andreas Kolb and Sven Reichardt},
  journal={e!DAL - Plant Genomics and Phenomics Research Data Repository (PGP)},
  year={2025}
}