infer action type from instruction (#1231)

LawyZheng
2024-11-21 17:38:42 +08:00
committed by GitHub
parent 9cd1f15763
commit bb6d3e6a37
4 changed files with 47 additions and 7 deletions

@@ -0,0 +1,16 @@
You are a browser agent performing actions on the web. You are instructed to take a single action. Identify which action type the action instruction calls for.
MAKE SURE YOU OUTPUT VALID JSON. No text before or after JSON, no trailing commas, no comments (//), no unnecessary quotes, etc.
Reply in the following JSON format:
{
"thought": str, // A string describing how you inferred the action type from the action instruction.
"confidence_float": float, // Your confidence in the inferred action type. Pick a number between 0.0 and 1.0. 0.0 means no confidence, 1.0 means full confidence.
"action_type": str, // A string enum: "CLICK", "INPUT_TEXT", "UPLOAD_FILE", "SELECT_OPTION". "CLICK" means the user wants to click. "INPUT_TEXT" means the user wants to input text. "UPLOAD_FILE" means the user wants to upload a file. "SELECT_OPTION" means the user wants to select an option.
"error": str, // A string enum describing the error. Null if you can identify the action as one of the defined action types. Use "UNKNOWN_ACTION" if none of the defined action types matches. Use "MULTIPLE_ACTIONS" if the instruction includes multiple actions.
}
Action instruction:
```
{{ navigation_goal }}
```
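
The reply format described by this prompt could be checked on the caller's side roughly as follows. This is a minimal sketch, not code from this commit: the function name `parse_action_reply` and the two constant sets are hypothetical, and it assumes the model returns exactly the four fields listed above.

```python
import json

# Enum values copied from the prompt text above.
ACTION_TYPES = {"CLICK", "INPUT_TEXT", "UPLOAD_FILE", "SELECT_OPTION"}
ERROR_TYPES = {"UNKNOWN_ACTION", "MULTIPLE_ACTIONS"}


def parse_action_reply(raw: str) -> dict:
    """Parse a model reply and validate it against the prompt's JSON format.

    Hypothetical helper for illustration; raises ValueError on any mismatch.
    """
    reply = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(reply, dict):
        raise ValueError("reply must be a JSON object")

    if not isinstance(reply.get("thought"), str):
        raise ValueError("'thought' must be a string")

    confidence = reply.get("confidence_float")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("'confidence_float' must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("'confidence_float' must be between 0.0 and 1.0")

    error = reply.get("error")
    if error is not None and error not in ERROR_TYPES:
        raise ValueError(f"unexpected error value: {error!r}")

    # The action type only has to be valid when no error is reported.
    if error is None and reply.get("action_type") not in ACTION_TYPES:
        raise ValueError(f"unexpected action type: {reply.get('action_type')!r}")

    return reply


if __name__ == "__main__":
    sample = '{"thought": "The user asks to press a button.", "confidence_float": 0.9, "action_type": "CLICK", "error": null}'
    print(parse_action_reply(sample)["action_type"])
```

Validating the enum fields before acting on the reply keeps an unexpected value from being treated as a real action; how low-confidence replies are handled is left to the caller.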