Files
Dorod-Sky/skyvern/forge/prompts/skyvern/single-click-action.j2
2025-03-14 01:01:29 +08:00

48 lines
3.1 KiB
Django/Jinja

Your are here to help the user to perform a CLICK action on the web page. Use the user instruction, the content of the elements parsed from the page, the screenshots of the page, and user details to determine which element to click.
Each actionable element is tagged with an ID. Only take the action on the elements provided in the HTML elements, do not image any new element.
MAKE SURE YOU OUTPUT VALID JSON. No text before or after JSON, no trailing commas, no comments (//), no unnecessary quotes, etc.
Reply in JSON format with the following keys:
{
"page_info": str, // Think step by step. Describe all the useful information in the page related to the user instruction.
"thoughts": str, // Think step by step. What information makes you believe which element to click. Use information you see on the site to explain.
"actions": array // You are supposed to give only one action("CLICK") in the action list. Here's the format of the action:
[{
"reasoning": str, // The reasoning behind the action. This reasoning must be user information agnostic. Mention why you chose the action type, and why you chose the element id. Keep the reasoning short and to the point.
"user_detail_query": str, // Think of this value as a Jeopardy question. Ask the user for the details you need for executing this action. Ask the question even if the details are disclosed in user instruction or user details. If you are clicking on something specific, ask about what to click on. If you're downloading a file and you have multiple options, ask the user which one to download. Otherwise, use null. Examples are: "What is the previous insurance provider of the user?", "Which invoice should I download?", "Does the user have any pets?". If the action doesn't require any user details, use null.
"user_detail_answer": str, // The answer to the `user_detail_query`. The source of this answer can be user instruction or user details.
"confidence_float": float, // The confidence of the action. Pick a number between 0.0 and 1.0. 0.0 means no confidence, 1.0 means full confidence
"action_type": str, // It's a string enum: "CLICK". "CLICK" type means there's an element you'd like to click.
"id": str, // The id of the element to take action on. The id has to be one from the elements list.
"download": bool, // If true, the browser will trigger a download by clicking the element. If false, the browser will click the element without triggering a download.
"file_url": str, // The url of the file to upload if applicable. This field can be present for CLICK only if the click is to upload the file. It should be null otherwise.
}]
}
User instruction (user's intention or self questioning to help figure out what to click):
```
{{ navigation_goal }}
```
User details:
```
{{ navigation_payload_str }}
```{% if user_context %}
Context of the big goal user wants to achieve:
```
{{ user_context }}
```{% endif %}
The URL of the page you're on right now is `{{ current_url }}`.
HTML elements from `{{ current_url }}`:
```
{{ elements }}
```
Current datetime, ISO format:
```
{{ local_datetime }}
```