cloud ui docs + cookbooks (#4759)
Co-authored-by: Ritik Sahni <ritiksahni0203@gmail.com> Co-authored-by: Kunal Mishra <kunalm2345@gmail.com>
80
docs/cloud/getting-started/monitor-a-run.mdx
Normal file
@@ -0,0 +1,80 @@
---
title: Watching Live Execution
subtitle: Monitor, interact with, and control running tasks
slug: cloud/monitor-a-run
---

When you run a task from the [Discover page](/cloud/getting-started/run-a-task), you're taken to the live execution screen where you can watch the browser in real time.

<img src="/images/cloud/live-execution-overview.png" alt="Live execution screen" />

---

## The execution screen

The execution view has three panels:

| Panel | What it shows |
|-------|---------------|
| **Left: Task configuration** | The block being executed, its URL, and prompt. A status badge shows the current state. |
| **Center: Live browser** | Real-time view of the browser. You see pages load, forms fill in, and buttons get clicked. |
| **Right: Agent logs** | Real-time LLM reasoning and action decisions. Shows why the AI made each choice. |

---

## When the live view is available

The live browser stream is active while the task is `running`. The table shows what the live view displays in each state:

| Status | Live view |
|--------|-----------|
| `created` | Waiting to start |
| `queued` | Waiting for a browser |
| `running` | **Active**: the browser is navigating |
| `paused` | Waiting for human interaction |
| `completed` | Stream closed. View the recording instead. |
| `failed` | Stream closed. View the recording instead. |
| `terminated` | Stream closed. View the recording instead. |
| `timed_out` | Stream closed. View the recording instead. |
| `canceled` | Stream closed. View the recording instead. |

Once a task reaches a final state, the live stream closes. Open the run from **Runs** in the sidebar to access the full recording, screenshots, and action history.

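
If you track run statuses in your own tooling, the table above can be expressed as a small lookup. The following is a minimal sketch: the status strings come from the table, while the `LiveView` enum and the `live_view_for` helper are hypothetical names introduced here for illustration and are not part of Skyvern.

```python
from enum import Enum


class LiveView(Enum):
    WAITING = "waiting"  # stream not started yet, or waiting on a person
    ACTIVE = "active"    # browser is navigating and streaming
    CLOSED = "closed"    # final state: use the recording instead


# Status strings copied from the table above.
FINAL_STATUSES = {"completed", "failed", "terminated", "timed_out", "canceled"}
WAITING_STATUSES = {"created", "queued", "paused"}


def live_view_for(status: str) -> LiveView:
    """Return what the live view shows for a given run status."""
    if status == "running":
        return LiveView.ACTIVE
    if status in WAITING_STATUSES:
        return LiveView.WAITING
    if status in FINAL_STATUSES:
        return LiveView.CLOSED
    raise ValueError(f"unknown run status: {status!r}")


print(live_view_for("running").value)
print(live_view_for("timed_out").value)
```

Raising on an unknown status, rather than guessing, keeps the sketch honest if new statuses are added later.
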

---

## Taking control of the browser

The **Take Control** button lets you interact directly with the browser. This is useful when:

- A CAPTCHA appears that the AI can't solve
- The site has an unusual login flow
- You need to navigate past an unexpected popup

Click **Take Control** to start interacting. Your mouse and keyboard input goes directly to the browser. Click **Stop Controlling** to hand control back to the AI.

<Warning>
Taking control pauses the AI agent. Remember to release control so the agent can resume.
</Warning>

---

## Stopping a running task

You can cancel a task at any time while it's running or queued. Click the **Cancel** button in the task header and confirm in the dialog that appears. The task then transitions to `canceled`, and any configured webhook fires with the `canceled` status.

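
On the receiving side, a webhook consumer usually branches on the reported status. The sketch below is illustrative only: it assumes the request body is a JSON object with a top-level `status` key, which this page does not specify, and `summarize_webhook` is a hypothetical helper name. Check the actual payload before relying on any field name.

```python
import json

# Final statuses as listed in the status table on this page.
FINAL_STATUSES = {"completed", "failed", "terminated", "timed_out", "canceled"}


def summarize_webhook(body: str) -> str:
    """Turn a webhook request body into a one-line summary.

    Assumes a JSON object with a top-level "status" key; the real
    payload layout may differ.
    """
    payload = json.loads(body)
    status = payload.get("status", "unknown")
    if status == "canceled":
        return "run was canceled; no further actions will be taken"
    if status in FINAL_STATUSES:
        return f"run finished with status {status}"
    return f"run is not final yet (status: {status})"


print(summarize_webhook('{"status": "canceled"}'))
```
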

<Note>
Credits for actions already taken are still consumed. Canceling stops future actions but does not refund past ones.
</Note>

---

## Reviewing results

Once a task finishes, open it from **Runs** to see the full results. The run detail page has five tabs:

- **Overview**: The AI's reasoning timeline alongside browser screenshots. Each Thought, Block, and Action card shows what the agent saw and why it acted.
- **Output**: The complete JSON output and any downloaded files.
- **Parameters**: The configuration you submitted: URL, prompt, engine, proxy location, webhook URL, data schema, and other settings.
- **Recording**: Full video replay of the browser session. Every task is recorded automatically.
- **Code**: Auto-generated Python code to reproduce this task via the API or SDK (when code generation is enabled).

For a full walkthrough of each tab, see [Run Details](/cloud/viewing-results/run-details).
81
docs/cloud/getting-started/overview.mdx
Normal file
@@ -0,0 +1,81 @@
---
title: UI Overview
slug: cloud/overview
subtitle: Navigate the Skyvern Cloud dashboard
---

Skyvern Cloud ([app.skyvern.com](https://app.skyvern.com)) lets you automate any website from your browser. Describe what you want in plain English, watch an AI-powered browser do it live, and get structured results back. No code required.

<Note>
Looking to integrate Skyvern into your own app? See the [API Quickstart](/getting-started/quickstart) instead.
</Note>

## The dashboard

Sign in and you'll land on the **Discover** page, the starting point for running automations.

<img src="/images/cloud/skyvern-cloud-discover.png" alt="Skyvern Cloud dashboard showing the Discover page" />

The left sidebar is your navigation hub. Here's what each section does:

### Build

Where you create and monitor automations.

| Page | Purpose |
|------|---------|
| **Discover** | Run one-off tasks. Type your instructions and target URL into a single prompt, pick an engine, and hit send. |
| **Workflows** | Build multi-step automations with the visual workflow editor. Add loops, conditionals, and data passing between steps. |
| **Runs** | Execution history for every task and workflow. Filter by status, drill into any run to see actions, recordings, and extracted data. |
| **Browsers** | Active browser sessions. Useful for persistent sessions that keep login state across tasks. |

### Agents

Ready-made automation templates. Each agent is preconfigured with a prompt, target URL, and settings. Pick one to see it work or use it as a starting point for your own task.

### General

| Page | Purpose |
|------|---------|
| **Billing** | Usage, remaining credits, and plan management. |
| **Credentials** | Store website logins securely. Skyvern uses these to authenticate automatically when it encounters a login page. |
| **Settings** | API key, account preferences, and organization management. |

## How it works

Every automation in Skyvern Cloud follows the same pattern:

<Steps>
<Step title="Describe your task">
Type what you want into the prompt bar. Include the target URL and your instructions in one go. Something like "Get the top post from https://news.ycombinator.com" or "Fill out the contact form at https://example.com/contact with my details."
</Step>
<Step title="Watch it happen">
A cloud browser opens and you see it navigate in real time. Pages load, elements highlight, actions fire. An agent log streams the AI's reasoning (what it sees on the page, what it plans to do, and why) so you can follow along. If the AI gets stuck, hit **Take Control** to jump in and help.
</Step>
<Step title="Get your results">
Extracted data appears as structured JSON on the run detail page. Every run also includes an output view, full recording, the parameters you submitted, and auto-generated code to reproduce the task via API.
</Step>
</Steps>

That's it. The next guide walks you through this flow with a real example.

---

## Next steps

<CardGroup cols={2}>
<Card
title="Run Your First Task"
icon="play"
href="/cloud/getting-started/run-your-first-task"
>
Follow along with a real example to see Skyvern Cloud in action
</Card>
<Card
title="Core Concepts"
icon="book"
href="/getting-started/core-concepts"
>
Understand tasks, workflows, and other foundational concepts
</Card>
</CardGroup>
185
docs/cloud/getting-started/run-a-task.mdx
Normal file
@@ -0,0 +1,185 @@
---
title: The Discover Page
subtitle: Run ad-hoc browser automations with natural language
slug: cloud/run-a-task
---

The **Discover** page is where you run one-off browser automations. Type what you want done in plain language, and Skyvern opens a browser and does it for you.

<img src="/images/cloud/discover-page-overview.png" alt="Discover page overview" />

---

## The prompt box

Type a natural language instruction describing what you want automated. Be specific about the goal and any data you want extracted.

**Examples:**
- "Go to amazon.com and find the price of the MacBook Air M4"
- "Fill out the contact form at example.com/contact with name John Doe and email john@example.com"
- "Get an insurance quote from geico.com for a 2020 Toyota Camry"

<img src="/images/cloud/prompt-box-filled.png" alt="Prompt box with a sample prompt" />

Click the **send button** or press **Enter** to start.

Below the prompt box, **quick-action buttons** offer pre-built examples like "Add a product to cart" or "Get an insurance quote." Click one to run it immediately or use it as a starting point.

---

## Choosing an engine

The dropdown next to the send button controls which engine runs the task.

| Engine | Best for |
|--------|----------|
| **Skyvern 2.0 with Code** | Complex, multi-step tasks. Generates reusable scripts. **(Default)** |
| **Skyvern 2.0** | Complex tasks without script generation |
| **Skyvern 1.0** | Simple, single-objective tasks. Faster and cheaper. |

<Tip>
Start with the default. Switch to Skyvern 1.0 when you have a straightforward, single-page task and want faster execution.
</Tip>

---

## Advanced settings

Click the **gear icon** next to the prompt box to expand the settings panel.

<img src="/images/cloud/advanced-settings-panel.png" alt="Advanced settings panel" />

| Setting | What it does |
|---------|-------------|
| **Proxy Location** | Route the browser through a residential proxy in a specific country. Default is `RESIDENTIAL` (US). Set to `NONE` to disable. Available: US, UK, Germany, France, Spain, Ireland, India, Japan, Australia, Canada, Brazil, Mexico, Argentina, New Zealand, South Africa, Italy, Netherlands, Philippines, Turkey. |
| **Webhook URL** | URL that receives a POST request when the task finishes. The payload includes status, extracted data, screenshots, and recording URL. |
| **Browser Session ID** | Run inside an existing persistent browser session (`pbs_xxx`). Preserves cookies and login state across multiple tasks. |
| **CDP Address** | Connect to your own browser via Chrome DevTools Protocol (e.g., `http://127.0.0.1:9222`). For local development. |
| **2FA Identifier** | Links your TOTP credentials to this task. Skyvern uses it to retrieve the correct code when a 2FA prompt appears. |
| **Extra HTTP Headers** | Custom headers sent with every browser request, as JSON (e.g., `{"Authorization": "Bearer token"}`). |
| **Publish Workflow** | Save a reusable workflow alongside the task run. Re-run the same automation later from the Workflows page. |
| **Max Steps Override** | Cap the number of AI reasoning steps. Each step is one screenshot-analyze-act cycle. Useful for controlling cost during development. |
| **Max Screenshot Scrolls** | Number of scrolls for post-action screenshots. Increase for pages with lazy-loaded content. `0` captures the viewport only. |

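
Because **Extra HTTP Headers** expects JSON, it can help to check the value locally before pasting it into the field. This is a minimal sketch using only the Python standard library; `parse_extra_headers` is a hypothetical helper, and the rule that every value must be a string is an assumption made here, not something this page states about Skyvern's own validation.

```python
import json


def parse_extra_headers(raw: str) -> dict[str, str]:
    """Parse a headers string into a name -> value mapping.

    Raises ValueError for malformed JSON (json.JSONDecodeError is a
    ValueError subclass), for non-object JSON, and for non-string values.
    """
    headers = json.loads(raw)
    if not isinstance(headers, dict):
        raise ValueError("headers must be a JSON object")
    for name, value in headers.items():
        if not isinstance(value, str):
            raise ValueError(f"header {name!r} must have a string value")
    return headers


print(parse_extra_headers('{"Authorization": "Bearer token"}'))
```
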

---

## Data extraction schema

The **Data Schema** field in advanced settings lets you define the structure of extracted output as [JSON Schema](https://json-schema.org/).

Without a schema, the AI returns data in whatever format it chooses. With a schema, output conforms to your structure, making it predictable for downstream use.

<img src="/images/cloud/data-schema-field.png" alt="Data schema field with JSON" />

```json
{
  "type": "object",
  "properties": {
    "product_name": {
      "type": "string",
      "description": "The name of the product"
    },
    "price": {
      "type": "number",
      "description": "The price in USD"
    },
    "in_stock": {
      "type": "boolean",
      "description": "Whether the product is in stock"
    }
  }
}
```

Use the `description` field on each property to guide the AI on what to extract.

<Accordion title="Example: Extracting a list of items">
```json
{
  "type": "object",
  "properties": {
    "quotes": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "premium_amount": {
            "type": "string",
            "description": "Total premium in USD (e.g., '$321.57')"
          },
          "coverage_type": {
            "type": "string",
            "description": "Type of coverage (e.g., 'Full Coverage')"
          },
          "deductible": {
            "type": "string",
            "description": "Deductible amount"
          }
        }
      }
    }
  }
}
```
</Accordion>

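
To see what a schema buys you downstream, here is a rough sketch that checks extracted output against the `type` of each top-level property in the product schema shown earlier in this section. It is deliberately not a full JSON Schema validator: `type_errors` is a hypothetical helper that handles only the five type names in `JSON_TYPES` and ignores nesting, `required`, and every other keyword. For real validation, a dedicated JSON Schema library is the better choice.

```python
# Maps JSON Schema type names to the Python types json.loads produces.
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def type_errors(schema: dict, data: dict) -> list[str]:
    """List top-level properties in `data` whose type disagrees with `schema`."""
    errors = []
    for name, spec in schema.get("properties", {}).items():
        if name not in data:
            continue  # the example schema declares no required fields
        value = data[name]
        # bool is a subclass of int in Python, so reject it for "number".
        if isinstance(value, bool) and spec["type"] != "boolean":
            errors.append(name)
        elif not isinstance(value, JSON_TYPES[spec["type"]]):
            errors.append(name)
    return errors


schema = {
    "type": "object",
    "properties": {
        "product_name": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
}

print(type_errors(schema, {"product_name": "Mouse", "price": 19.99, "in_stock": True}))  # prints []
print(type_errors(schema, {"product_name": "Mouse", "price": "$19.99"}))  # prints ['price']
```

The second call shows the kind of mismatch a schema guards against: a price returned as a formatted string instead of a number.
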

---

## Workflow templates

Below the prompt box, the Discover page shows a gallery of **workflow templates**: pre-built automations for common use cases.

<img src="/images/cloud/workflow-templates.png" alt="Workflow template gallery" />

Click any template to launch it with pre-filled configuration, or use it as a starting point and customize.

---

## Tips for better results

**Write specific prompts.** Include the exact goal, target fields, and what "done" looks like.

| Instead of | Write |
|-----------|-------|
| "Get some data from this site" | "Extract the product name, price, and availability from the first 5 results on amazon.com/s?k=wireless+mouse" |
| "Fill out the form" | "Fill the contact form at example.com/contact with name 'Jane Doe', email 'jane@example.com', and message 'Demo request'" |

**Control cost with Max Steps.** Set **Max Steps Override** to a reasonable limit (e.g., 10–20 for simple tasks) during development. Each step consumes one credit. Remove the cap once you've confirmed the task works.

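
As a quick budgeting aid, the one-credit-per-step rule above gives a simple upper bound for a batch of capped test runs. The helper below is a hypothetical illustration of that arithmetic only; it assumes steps are the sole cost and does not account for any other charges your plan may include.

```python
def worst_case_credits(max_steps: int, runs: int = 1) -> int:
    """Upper bound on credits for `runs` capped runs, at one credit per step."""
    if max_steps < 0 or runs < 0:
        raise ValueError("max_steps and runs must be non-negative")
    return max_steps * runs


# Ten test runs capped at 15 steps each cost at most 150 credits.
print(worst_case_credits(15, runs=10))  # prints 150
```
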

**Debug failures in order.** If a task fails or produces wrong results:

1. Check the **Failure Reason** at the top of the run detail page
2. Read the **Thought cards** in the Overview timeline to find where the AI went off track
3. Watch the **Recording** to see what actually happened on screen
4. Review **Parameters** to confirm the inputs were correct

---

## What happens next

1. Your prompt is sent to Skyvern
2. A cloud browser opens and navigates to the target URL (or finds one from your prompt)
3. The AI analyzes the page, plans actions, and executes them step by step
4. You're taken to the [live execution view](/cloud/getting-started/monitor-a-run) where you can watch it happen in real time
5. When complete, results appear on the run detail page under **Runs**

---

## Next steps

<CardGroup cols={2}>
<Card
title="Watching Live Execution"
icon="eye"
href="/cloud/getting-started/monitor-a-run"
>
Monitor runs, take control of the browser, and review results
</Card>
<Card
title="Build a Workflow"
icon="diagram-project"
href="/cloud/building-workflows/build-a-workflow"
>
Turn a successful task into a reusable multi-step workflow
</Card>
</CardGroup>
133
docs/cloud/getting-started/run-your-first-task.mdx
Normal file
@@ -0,0 +1,133 @@
---
title: Your First Task
slug: cloud/run-your-first-task
subtitle: Run a browser automation from start to finish
---

Let's run a real automation. You'll tell Skyvern to visit a website, extract data, and return it as JSON. Then watch the entire thing happen live.

## Step 1: Write your prompt

Open [app.skyvern.com](https://app.skyvern.com) and you'll land on the **Discover** page.

<img src="/images/cloud/skyvern-cloud-discover.png" alt="Discover page with a prompt entered" />

The Discover page has a single input field. Type your instructions and include the target URL in the same prompt. For this example, enter:

```
Get the title of the #1 post on the front page for https://news.ycombinator.com
```

That's it. Skyvern parses the URL and figures out how to navigate the page and extract the data.

Below the input, you'll see quick-action chips like "Add a product to cart" and "What's the top post on hackernews". Click any of these to try a pre-filled example instead.

<Tip>
The more specific your prompt, the better. "Get the title of the #1 post" works much better than "get some data." Include the exact fields you want, what success looks like, and any constraints.
</Tip>

## Step 2: Pick an engine and run

Next to your prompt, you'll see an engine selector. Click it to switch engines:

| Engine | When to use it |
|--------|---------------|
| **Skyvern 1.0** | Tasks with a simple, single goal: filling a form, searching for information on Google, reading content from a page |
| **Skyvern 2.0** | Complex, multi-step tasks. Scores state-of-the-art 85.85% on the WebVoyager benchmark |
| **Skyvern 2.0 with Code** | The default engine. Same capabilities as Skyvern 2.0, plus auto-generates reusable code and a workflow from the run |

For this example, keep the default **Skyvern 2.0 with Code** selected.

Click the **send button** (arrow icon to the right of the input). Skyvern generates a workflow from your prompt and opens it in the workflow editor. Click **Run** in the top right, confirm the parameters, then click **Run workflow** to start execution.

<Accordion title="Optional: Advanced settings">
Click the **gear icon** next to send to configure additional options before running:

| Setting | What it does |
|---------|-------------|
| **Webhook Callback URL** | Endpoint to receive the extracted data when the run completes |
| **Proxy Location** | Route Skyvern through one of the available proxies |
| **Browser Session ID** | Reuse a persistent browser session to keep login state |
| **CDP Address** | Connect to your own browser via Chrome DevTools Protocol |
| **2FA Identifier** | Identifier for a 2FA code to handle two-factor auth automatically |
| **Extra HTTP Headers** | Custom HTTP request headers (dict format) |
| **Generate Script** | Auto-generate reusable scripts from a successful run |
| **Publish Workflow** | Create a workflow alongside this task run |
| **Max Steps Override** | Cap the number of steps the AI can take |
| **Data Schema** | Define structured JSON output format |
| **Max Screenshot Scrolls** | Limit scrolls for post-action screenshots (default: 3) |

These are all optional. The defaults work for most tasks.
</Accordion>

## Step 3: Watch the live browser

This is where it gets interesting. Once the task starts, you'll see the run detail page with a live view of the browser:

<img src="/images/cloud/discover-prompt-in-process.png" alt="Run detail page showing a live browser navigating Hacker News" />

On the left is a **live browser view**, where you'll see pages load, elements highlight, and actions fire.

On the right is the **agent log**, a running stream of the AI's Thoughts, Decisions, and block executions. If something goes wrong, this is where you'll figure out why.

## Step 4: Review the results

When the task finishes, the status badge flips to **completed** and the extracted data appears at the top of the page.

<img src="/images/cloud/discover-workflow-completed.png" alt="Completed run showing extracted data and result tabs" />

### Extracted data

The **Extracted Information** block shows your results as structured JSON:

```json
[
  {
    "top_post_title": "Don't rent the cloud, own instead"
  }
]
```

Your result will differ because the #1 post changes constantly. The structure is what matters.

The agent log on the right confirms what happened. You'll see a final Thought summarizing the result.

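
If you copy the extracted JSON into a script, reading it back takes a few lines of standard-library Python. This is a minimal sketch: `top_post_titles` is a hypothetical helper, and the `top_post_title` key is simply the one from the sample result above. Without a data schema the AI chooses the output format, so your key names may differ.

```python
import json

# Sample output copied from the Extracted Information block above.
raw_output = """
[
  {
    "top_post_title": "Don't rent the cloud, own instead"
  }
]
"""


def top_post_titles(raw: str) -> list[str]:
    """Collect every "top_post_title" value from a JSON array of objects."""
    results = json.loads(raw)
    return [item["top_post_title"] for item in results if "top_post_title" in item]


print(top_post_titles(raw_output))
```

Skipping items that lack the key, rather than raising, keeps the sketch tolerant of output whose shape varies between runs.
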

### Tabs

Below the extracted data, five tabs give you different views of the run:

- **Overview**: The AI's reasoning timeline alongside browser screenshots. Each Thought, Block, and Action card shows what the agent saw and why it acted.
- **Output**: The complete JSON output and any downloaded files.
- **Parameters**: The exact configuration that was submitted (URL, prompt, engine, schema). Useful for reproducing or tweaking the run.
- **Recording**: Full video replay of the browser session, start to finish.
- **Code**: Auto-generated Python code to reproduce this task via the API or SDK.

## Try something bigger

Now that you've seen the basic flow, here are a few ideas to try next:

- **Fill a form**: Point Skyvern at a contact form and tell it what to enter in each field
- **Compare prices**: Extract product names and prices from an e-commerce page using a data schema
- **Navigate a flow**: Use Skyvern 2.0 to walk through a multi-page checkout or signup process
- **Use an Agent template**: Check the **Agents** section in the sidebar for pre-built automations you can run instantly

---

## Next steps

<CardGroup cols={2}>
<Card
title="Run a Task via API"
icon="code"
href="/running-automations/run-a-task"
>
Trigger automations programmatically with the Skyvern API
</Card>
<Card
title="Core Concepts"
icon="book"
href="/getting-started/core-concepts"
>
Understand tasks, workflows, and other building blocks
</Card>
</CardGroup>