Current proprietary models struggle to generate complex human actions



[Video panels: Fig. 2 in the main paper, SORA-2, VEO-3]