The synthetic dataset Synset Signset Germany addresses traffic sign recognition for Germany. It combines the advantages of data-driven and analytical modeling: GAN-based texture generation produces data-driven dirt and wear artifacts that yield unique and realistic traffic sign surfaces, while analytical scene modulation provides physically correct lighting and proper geometric transformations and allows detailed parameterization.
Synset Signset Germany contains a total of 105,500 images of 211 different German traffic sign classes, including newly issued (2020) and thus comparatively rare traffic signs. In addition to a mask and a segmentation image, we provide extensive metadata, including the stochastically selected environment and imaging effect parameters for each image. The dataset is among the largest and most diverse for traffic sign recognition and, to the best of our knowledge, one of the first publicly available large-scale synthetic datasets for this task.
A subset of 43 classes is designed as a “synthetic twin” of the well-known “German Traffic Sign Recognition Benchmark” (GTSRB) dataset, with similar imaging parameters. The dataset is therefore well suited for training traffic sign recognition models and for comparing real-world data with synthetic data. Thanks to the extensive metadata, it can also be used for applications in explainable AI (XAI), robustness analyses, and systematic tests.
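As a rough illustration of how per-image metadata could support a robustness analysis, the following minimal sketch groups a classifier's per-image correctness by bins of one imaging parameter. It does not reflect the dataset's actual metadata format: the record layout, the parameter name `blur_sigma`, the `correct` flag, and the helper `accuracy_by_bin` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def accuracy_by_bin(records, param, edges):
    """Compute accuracy per bin of one metadata parameter.

    records: iterable of dicts holding the (hypothetical) key `param`
             and a boolean "correct" flag from some classifier.
    edges:   ascending bin edges; bin i covers [edges[i], edges[i + 1]).
    Returns {bin_index: accuracy} for all non-empty bins.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for rec in records:
        value = rec[param]
        for i in range(len(edges) - 1):
            if edges[i] <= value < edges[i + 1]:
                totals[i] += 1
                hits[i] += int(rec["correct"])
                break  # each value falls into at most one bin
    return {i: hits[i] / totals[i] for i in sorted(totals)}


# Toy records; "blur_sigma" is a made-up parameter name, not a dataset field.
records = [
    {"blur_sigma": 0.2, "correct": True},
    {"blur_sigma": 0.4, "correct": True},
    {"blur_sigma": 1.5, "correct": True},
    {"blur_sigma": 1.8, "correct": False},
    {"blur_sigma": 2.5, "correct": False},
]
result = accuracy_by_bin(records, "blur_sigma", [0.0, 1.0, 2.0, 3.0])
print(result)
```

In a real analysis, the records would come from joining model predictions with the metadata shipped alongside each image, and a drop in accuracy across bins would indicate sensitivity to that parameter.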