In: Proceedings of the 16th European Conference on Computer Vision (ECCV), 2020.
Samuel S. Sohn, Honglu Zhou, Seonghyeon Moon,
Sejong Yoon, Vladimir Pavlovic, and Mubbasir Kapadia
Manuscript (5.6 MB PDF) / Supplementary (10.6 MB PDF) / GitHub / NSF Project
Predicting crowd behavior in complex environments is a key requirement for crowd and disaster management, architectural design, and urban planning. Given a crowd's immediate state, current approaches must be applied repeatedly over many time-steps to produce long-term predictions, which makes the results computationally expensive and error-prone. Most applications, however, require the ability to accurately predict hundreds of possible simulation outcomes (e.g., under different environment and crowd situations) at real-time rates, for which these approaches are prohibitively expensive. We propose the first deep framework to instantly predict the long-term flow of crowds in arbitrarily large, realistic environments. Central to our approach are a novel representation, CAGE, which efficiently encodes crowd scenarios into compact, fixed-size representations that losslessly preserve the environment, and a modified SegNet architecture for instant long-term crowd flow prediction. We conduct comprehensive experiments on novel synthetic and real datasets. Our results indicate that our approach captures the essence of real crowd movement over very long time periods while generalizing to never-before-seen environments and crowd contexts.