Upload two videos, enter your Gemini API key and a task description, then shuffle the frames and click "Get Response" to analyze the frames of the query video.
After receiving the response, click "Parse Response" to see the predicted task completion percentage for each frame of the query video. You can then toggle back to the ground-truth (GT) order to examine the predicted value function along with the caption for each frame.
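
The shuffle, parse, and toggle-back steps can be sketched in Python as below. This is only an illustration, not the app's actual code: the function names (`shuffle_frames`, `parse_response`, `restore_gt_order`) are hypothetical, and the response format (one line per frame, such as `Frame 2: <caption> Task Completion: 40%`) is an assumption about what the model returns.

```python
import random
import re


def shuffle_frames(num_frames, seed=0):
    """Return a permutation: shuffled position i shows original frame perm[i]."""
    perm = list(range(num_frames))
    random.Random(seed).shuffle(perm)
    return perm


def parse_response(text):
    """Extract one completion percentage per frame, ordered by frame number.

    Assumes (hypothetically) one line per frame that starts with
    'Frame <n>' and ends with a percentage such as '40%'.
    """
    pattern = re.compile(r"Frame\s+(\d+).*?(\d+(?:\.\d+)?)\s*%", re.IGNORECASE)
    preds = {}
    for match in pattern.finditer(text):
        preds[int(match.group(1))] = float(match.group(2))
    return [preds[i] for i in sorted(preds)]


def restore_gt_order(perm, shuffled_preds):
    """Map predictions made in shuffled order back to the original frame order."""
    values = [None] * len(perm)
    for position, original_index in enumerate(perm):
        values[original_index] = shuffled_preds[position]
    return values


if __name__ == "__main__":
    # Hypothetical response for three shuffled frames.
    response = (
        "Frame 1: cup placed on shelf. Task Completion: 100%\n"
        "Frame 2: arm idle. Task Completion: 0%\n"
        "Frame 3: cup lifted. Task Completion: 50%"
    )
    perm = [2, 0, 1]  # shuffled position -> original frame index
    print(restore_gt_order(perm, parse_response(response)))
```

Once the predictions are back in GT order, a value function that rises steadily over the video suggests the model has recovered the temporal progress of the task from the shuffled frames.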