Send an image and a text prompt, and get back the (x, y) coordinates to click. Uses YOLO + CLIP models.
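As a rough illustration of that exchange, the sketch below packs an image and a prompt into a JSON body and reads a coordinate pair back out of a reply. The field names `image`, `prompt`, `x`, and `y`, and both helper functions, are assumptions made for this example; they are not taken from the service's actual schema, so check the task reference before relying on them.

```python
import base64
import json


def build_click_request(image_bytes: bytes, prompt: str) -> str:
    """Serialize an image and a text prompt into a JSON body.

    The keys "image" and "prompt" are hypothetical placeholders.
    """
    payload = {
        # Raw bytes are not valid JSON, so the image is base64-encoded.
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "prompt": prompt,
    }
    return json.dumps(payload)


def parse_click_response(body: str) -> tuple[int, int]:
    """Extract the click point from a JSON reply.

    Assumes the reply carries top-level "x" and "y" numbers (hypothetical).
    """
    data = json.loads(body)
    return int(data["x"]), int(data["y"])


if __name__ == "__main__":
    request_body = build_click_request(b"\x89PNG", "click the red car")
    print(request_body)
    print(parse_click_response('{"x": 120, "y": 48}'))
```

Base64 is used here only because it is a common way to carry binary data in JSON; the real API may expect a different encoding or an image URL instead.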
Notes
How to test with Capzy.ai
createTask