Fix caching bug do not use user data (#4708)

This commit is contained in:
Shuchang Zheng
2026-02-11 20:16:29 -08:00
committed by GitHub
parent 384b8c0ac5
commit a58847c27c
6 changed files with 12 additions and 12 deletions

View File

@@ -15,8 +15,8 @@ Reply in JSON format with the following keys:
"actions": array // An array of actions. Here's the format of each action:
[{
"reasoning": str, // The reasoning behind the action. This reasoning must be user information agnostic. Mention why you chose the action type, and why you chose the element id. Keep the reasoning short and to the point.
"user_detail_query": str, // Think of this value as a Jeopardy question and the intention behind the action. Ask the user for the details you need for executing this action. Ask the question even if the details are disclosed in user goal or user details. If it's a text field, ask for the text. If it's a file upload, ask for the file. If it's a dropdown, ask for the relevant information. If you are clicking on something specific, ask about what the intention is behind the click and what to click on. If you're downloading a file and you have multiple options, ask the user which one to download. Examples are: "What product ID should I input into the search bar?", "What file should I upload?", "What is the previous insurance provider of the user?", "Which invoice should I download?", "Does the user have any pets?". If the action doesn't require any user details, describe the intention behind the action.
"user_detail_answer": str, // The answer to the `user_detail_query`. The source of this answer can be user goal or user details.
"user_detail_query": str, // Think of this value as a Jeopardy question and the intention behind the action. Ask the user for the details you need for executing this action. IMPORTANT: This question must be user information agnostic and must NOT contain specific user data (like names, IDs, specific values, etc.). Ask what information is needed generically. Ask the question even if the details are disclosed in user goal or user details. If it's a text field, ask for the text. If it's a file upload, ask for the file. If it's a dropdown, ask for the relevant information. If you are clicking on something specific, ask about what the intention is behind the click and what to click on. If you're downloading a file and you have multiple options, ask the user which one to download. Examples are: "What product ID should I input into the search bar?", "What file should I upload?", "What is the previous insurance provider of the user?", "Which invoice should I download?", "Does the user have any pets?", "Which member should I take action on?". NEVER include specific user data in this question - keep it generic so it can be answered with different user contexts.
"user_detail_answer": str, // The answer to the `user_detail_query`. The source of this answer can be user goal or user details. This answer CAN contain specific user data.
"confidence_float": float, // The confidence of the action. Pick a number between 0.0 and 1.0. 0.0 means no confidence, 1.0 means full confidence
"action_type": str, // It's a string enum: "CLICK", "HOVER", "INPUT_TEXT", "UPLOAD_FILE", "SELECT_OPTION", "WAIT", "SOLVE_CAPTCHA", "COMPLETE", "TERMINATE", "KEYPRESS"{{', "CLOSE_PAGE"' if has_magic_link_page else ""}}. "CLICK" is an element you'd like to click. "HOVER" is used to move the mouse over an element without clicking, particularly when revealing hover-only menus or buttons before clicking, or when the UI hints that a control (like a CTA button) only appears after hovering a card, tile, or model name. "INPUT_TEXT" is an element you'd like to input text into. "UPLOAD_FILE" is an element you'd like to upload a file into. "SELECT_OPTION" is an element you'd like to select an option from. "WAIT" action should be used if there are no actions to take and there is some indication on screen that waiting could yield more actions. "WAIT" should not be used if there are actions to take. "SOLVE_CAPTCHA" should be used if there's a captcha to solve on the screen. "COMPLETE" is used when the {{"complete criterion has been met" if complete_criterion else "user goal has been achieved"}} AND if there's any data extraction goal, you should be able to get data from the page. Never return a COMPLETE action unless the {{ "complete criterion is met" if complete_criterion else "user goal is achieved" }}. "TERMINATE" is used to terminate the whole task with a failure when it doesn't seem like the user goal can be achieved. Do not use "TERMINATE" if waiting could lead the user towards the goal. Only return "TERMINATE" if you are on a page where the user goal cannot be achieved. All other actions are ignored when "TERMINATE" is returned. "KEYPRESS" is used to press a keyboard key when no clickable button or element achieves the same result. Only use KEYPRESS when pressing a key is the sole way to proceed (e.g., pressing Enter to submit a search with no search button, or Escape to close a modal with no close button). KEYPRESS does not require an element id. Requires the "key" field.{{' "CLOSE_PAGE" is used to close the current page when it is impossible to achieve the user goal on the current page.' if has_magic_link_page else ''}}
"id": str, // The id of the element to take action on. The id has to be one from the elements list

View File

@@ -15,8 +15,8 @@ Reply in JSON format with the following keys:
"actions": array // An array of actions. Here's the format of each action:
[{
"reasoning": str, // The reasoning behind the action. This reasoning must be user information agnostic. Mention why you chose the action type, and why you chose the element id. Keep the reasoning short and to the point.
"user_detail_query": str, // Think of this value as a Jeopardy question and the intention behind the action. Ask the user for the details you need for executing this action. Ask the question even if the details are disclosed in user goal or user details. If it's a text field, ask for the text. If it's a file upload, ask for the file. If it's a dropdown, ask for the relevant information. If you are clicking on something specific, ask about what the intention is behind the click and what to click on. If you're downloading a file and you have multiple options, ask the user which one to download. Examples are: "What product ID should I input into the search bar?", "What file should I upload?", "What is the previous insurance provider of the user?", "Which invoice should I download?", "Does the user have any pets?". If the action doesn't require any user details, describe the intention behind the action.
"user_detail_answer": str, // The answer to the `user_detail_query`. The source of this answer can be user goal or user details.
"user_detail_query": str, // Think of this value as a Jeopardy question and the intention behind the action. Ask the user for the details you need for executing this action. IMPORTANT: This question must be user information agnostic and must NOT contain specific user data (like names, IDs, specific values, etc.). Ask what information is needed generically. Ask the question even if the details are disclosed in user goal or user details. If it's a text field, ask for the text. If it's a file upload, ask for the file. If it's a dropdown, ask for the relevant information. If you are clicking on something specific, ask about what the intention is behind the click and what to click on. If you're downloading a file and you have multiple options, ask the user which one to download. Examples are: "What product ID should I input into the search bar?", "What file should I upload?", "What is the previous insurance provider of the user?", "Which invoice should I download?", "Does the user have any pets?", "Which member should I take action on?". NEVER include specific user data in this question - keep it generic so it can be answered with different user contexts.
"user_detail_answer": str, // The answer to the `user_detail_query`. The source of this answer can be user goal or user details. This answer CAN contain specific user data.
"confidence_float": float, // The confidence of the action. Pick a number between 0.0 and 1.0. 0.0 means no confidence, 1.0 means full confidence
"action_type": str, // It's a string enum: "CLICK", "HOVER", "INPUT_TEXT", "UPLOAD_FILE", "SELECT_OPTION", "WAIT", "SOLVE_CAPTCHA", "COMPLETE", "TERMINATE", "KEYPRESS"{{', "CLOSE_PAGE"' if has_magic_link_page else ""}}. "CLICK" is an element you'd like to click. "HOVER" is used to move the mouse over an element without clicking, particularly when revealing hover-only menus or buttons before clicking, or when the UI hints that a control (like a CTA button) only appears after hovering a card, tile, or model name. "INPUT_TEXT" is an element you'd like to input text into. "UPLOAD_FILE" is an element you'd like to upload a file into. "SELECT_OPTION" is an element you'd like to select an option from. "WAIT" action should be used if there are no actions to take and there is some indication on screen that waiting could yield more actions. "WAIT" should not be used if there are actions to take. "SOLVE_CAPTCHA" should be used if there's a captcha to solve on the screen. "COMPLETE" is used when the {{"complete criterion has been met" if complete_criterion else "user goal has been achieved"}} AND if there's any data extraction goal, you should be able to get data from the page. Never return a COMPLETE action unless the {{ "complete criterion is met" if complete_criterion else "user goal is achieved" }}. "TERMINATE" is used to terminate the whole task with a failure when it doesn't seem like the user goal can be achieved. Do not use "TERMINATE" if waiting could lead the user towards the goal. Only return "TERMINATE" if you are on a page where the user goal cannot be achieved. All other actions are ignored when "TERMINATE" is returned. "KEYPRESS" is used to press a keyboard key when no clickable button or element achieves the same result. Only use KEYPRESS when pressing a key is the sole way to proceed (e.g., pressing Enter to submit a search with no search button, or Escape to close a modal with no close button). KEYPRESS does not require an element id. Requires the "key" field.{{' "CLOSE_PAGE" is used to close the current page when it is impossible to achieve the user goal on the current page.' if has_magic_link_page else ''}}
"id": str, // The id of the element to take action on. The id has to be one from the elements list

View File

@@ -9,8 +9,8 @@ Reply in JSON format with the following keys:
"actions": array // You are supposed to give only one action("CLICK") in the action list. Here's the format of the action:
[{
"reasoning": str, // The reasoning behind the action. This reasoning must be user information agnostic. Mention why you chose the action type, and why you chose the element id. Keep the reasoning short and to the point.
"user_detail_query": str, // Think of this value as a Jeopardy question. Ask the user for the details you need for executing this action. Ask the question even if the details are disclosed in user instruction or user details. If you are clicking on something specific, ask about what to click on. If you're downloading a file and you have multiple options, ask the user which one to download. Otherwise, use null. Examples are: "What is the previous insurance provider of the user?", "Which invoice should I download?", "Does the user have any pets?". If the action doesn't require any user details, use null.
"user_detail_answer": str, // The answer to the `user_detail_query`. The source of this answer can be user instruction or user details.
"user_detail_query": str, // Think of this value as a Jeopardy question. Ask the user for the details you need for executing this action. IMPORTANT: This question must be user information agnostic and must NOT contain specific user data (like names, IDs, specific values, etc.). Ask what information is needed generically. Ask the question even if the details are disclosed in user instruction or user details. If you are clicking on something specific, ask about what to click on. If you're downloading a file and you have multiple options, ask the user which one to download. Otherwise, use null. Examples are: "What is the previous insurance provider of the user?", "Which invoice should I download?", "Does the user have any pets?". If the action doesn't require any user details, use null.
"user_detail_answer": str, // The answer to the `user_detail_query`. The source of this answer can be user instruction or user details. This answer CAN contain specific user data.
"confidence_float": float, // The confidence of the action. Pick a number between 0.0 and 1.0. 0.0 means no confidence, 1.0 means full confidence
"action_type": str, // It's a string enum: "CLICK". "CLICK" type means there's an element you'd like to click.
"id": str, // The id of the element to take action on. The id has to be one from the elements list.

View File

@@ -9,8 +9,8 @@ Reply in JSON format with the following keys:
"actions": array // You are supposed to give only one action("INPUT_TEXT") in the action list. Here's the format of the action:
[{
"reasoning": str, // The reasoning behind the action. This reasoning must be user information agnostic. Mention why you chose the action type, and why you chose the element id. Keep the reasoning short and to the point.
"user_detail_query": str, // Think of this value as a Jeopardy question. Ask the user for the details you need for executing this action. Ask the question even if the details are disclosed in user instruction or user details. If it's a text field, ask for the text. Otherwise, use null. Examples are: "What product ID should I input into the search bar?", "What is the previous insurance provider of the user?", "Does the user have any pets?". If the action doesn't require any user details, use null.
"user_detail_answer": str, // The answer to the `user_detail_query`. The source of this answer can be user instruction or user details.
"user_detail_query": str, // Think of this value as a Jeopardy question. Ask the user for the details you need for executing this action. IMPORTANT: This question must be user information agnostic and must NOT contain specific user data (like names, IDs, specific values, etc.). Ask what information is needed generically. Ask the question even if the details are disclosed in user instruction or user details. If it's a text field, ask for the text. Otherwise, use null. Examples are: "What product ID should I input into the search bar?", "What is the previous insurance provider of the user?", "Does the user have any pets?". If the action doesn't require any user details, use null.
"user_detail_answer": str, // The answer to the `user_detail_query`. The source of this answer can be user instruction or user details. This answer CAN contain specific user data.
"confidence_float": float, // The confidence of the action. Pick a number between 0.0 and 1.0. 0.0 means no confidence, 1.0 means full confidence
"action_type": str, // It's a string enum: "INPUT_TEXT". "INPUT_TEXT" is an element you'd like to input text into.
"id": str, // The id of the element to take action on. The id has to be one from the elements list.

View File

@@ -9,8 +9,8 @@ Reply in JSON format with the following keys:
"actions": array // You are supposed to give only one action("SELECT_OPTION") in the action list. Here's the format of the action:
[{
"reasoning": str, // The reasoning behind the action. This reasoning must be user information agnostic. Mention why you chose the action type, and why you chose the element id. Keep the reasoning short and to the point.
"user_detail_query": str, // Think of this value as a Jeopardy question. Ask the user for the details you need for executing this action. Ask the question even if the details are disclosed in user instruction or user details. If it's a dropdown, ask for the relevant information. Otherwise, use null. Examples are: "Which invoice should I download?", "Does the user have any pets?". If the action doesn't require any user details, use null.
"user_detail_answer": str, // The answer to the `user_detail_query`. The source of this answer can be user instruction or user details.
"user_detail_query": str, // Think of this value as a Jeopardy question. Ask the user for the details you need for executing this action. IMPORTANT: This question must be user information agnostic and must NOT contain specific user data (like names, IDs, specific values, etc.). Ask what information is needed generically. Ask the question even if the details are disclosed in user instruction or user details. If it's a dropdown, ask for the relevant information. Otherwise, use null. Examples are: "Which invoice should I download?", "Does the user have any pets?". If the action doesn't require any user details, use null.
"user_detail_answer": str, // The answer to the `user_detail_query`. The source of this answer can be user instruction or user details. This answer CAN contain specific user data.
"confidence_float": float, // The confidence of the action. Pick a number between 0.0 and 1.0. 0.0 means no confidence, 1.0 means full confidence
"action_type": str, // It's a string enum: "SELECT_OPTION". "SELECT_OPTION" is an element you'd like to select an option from.
"id": str, // The id of the element to take action on. The id has to be one from the elements list.

View File

@@ -9,8 +9,8 @@ Reply in JSON format with the following keys:
"actions": array // You are supposed to give only one action("UPLOAD_FILE") in the action list. Here's the format of the action:
[{
"reasoning": str, // The reasoning behind the action. This reasoning must be user information agnostic. Mention why you chose the action type, and why you chose the element id. Keep the reasoning short and to the point.
"user_detail_query": str, // Think of this value as a Jeopardy question. Ask the user for the details you need for executing this action. Ask the question even if the details are disclosed in user instruction or user details. If it's a file upload, ask for the file. Otherwise, use null. Examples are: "What file should I upload?". If the action doesn't require any user details, use null.
"user_detail_answer": str, // The answer to the `user_detail_query`. The source of this answer can be user instruction or user details.
"user_detail_query": str, // Think of this value as a Jeopardy question. Ask the user for the details you need for executing this action. IMPORTANT: This question must be user information agnostic and must NOT contain specific user data (like names, IDs, specific values, etc.). Ask what information is needed generically. Ask the question even if the details are disclosed in user instruction or user details. If it's a file upload, ask for the file. Otherwise, use null. Examples are: "What file should I upload?". If the action doesn't require any user details, use null.
"user_detail_answer": str, // The answer to the `user_detail_query`. The source of this answer can be user instruction or user details. This answer CAN contain specific user data.
"confidence_float": float, // The confidence of the action. Pick a number between 0.0 and 1.0. 0.0 means no confidence, 1.0 means full confidence
"action_type": str, // It's a string enum: "UPLOAD_FILE". "UPLOAD_FILE" is an element you'd like to upload a file into.
"id": str, // The id of the element to take action on. The id has to be one from the elements list.