As more detection and investigation systems for Advanced Persistent Threats (APTs) are developed, the lack of sound and authentic datasets has become increasingly visible. New datasets for research on APT detection and investigation have been released at an accelerating pace over the past few years. Yet our examination of the existing datasets finds a significant gap between their attack scenarios and real-world APT attacks. Recognizing the flaws of prior datasets, particularly in attack scenario complexity and authenticity, we develop a novel, sound dataset called Aviator, which is backed by MITRE emulation plans. MITRE has released nearly a dozen emulation plans, which closely reproduce real-world attack campaigns by APT groups observed in the past. However, MITRE has not published any datasets. We therefore implement these emulation plans rigorously ourselves. Further, we extend these emulation plans to include an industrial control system and attack steps against it, mimicking the APT groups best known for their past attacks on critical infrastructure. Compared with existing datasets, Aviator has the highest attack scenario complexity and authenticity. Moreover, Aviator is designed with operability, usability, reproducibility, and extensibility in mind, areas in which existing datasets lag far behind. Specifically, along with the Aviator dataset, we provide log shipping tools, log parsing tools, and logging configuration files to encourage other researchers to build their own datasets, which may better suit the evaluation of their detection systems. We also plan to add more log types in future versions of Aviator. We are committed to maintaining Aviator as a living dataset.