Learning Task-Sufficient Representation Of Video Dynamics