diff --git a/docs/cloud/getting-started.mdx b/docs/cloud/getting-started.mdx deleted file mode 100644 index 8ddd6724..00000000 --- a/docs/cloud/getting-started.mdx +++ /dev/null @@ -1,144 +0,0 @@ ---- -title: Getting Started -slug: cloud/getting-started ---- - -Run browser automations without writing code. This guide walks you through running your first task in the Cloud UI. - - -Prefer to integrate via code? Check out the [SDK Quickstart](/getting-started/quickstart) instead. - - -## Open Skyvern Cloud - -Visit [app.skyvern.com](https://app.skyvern.com) and sign-up or sign-in to your account. - -You'll land on the **Discover** page—your starting point for running automations. - -discover page in skyvern - -## Understand the Interface - -The left sidebar organizes everything you need: - -**Build** -- **Discover** — Run one-off tasks with natural language prompts -- **Workflows** — Create and manage multi-step automations -- **Runs** — View execution history for all your tasks and workflows -- **Browsers** — Monitor active browser sessions - -**Agents** -- Pre-built templates for common use cases: form filling, data extraction, job applications, insurance quotes, and more. Select one to start with a working example. - -**General** -- **Billing** — View usage and manage your plan -- **Credentials** — Store logins securely for authenticated sites -- **Settings** — Copy your API key and configure account settings - -## Run Your First Task - -The fastest way to see Skyvern in action is to run a task directly from the Discover page. - -### Step 1: Enter your prompt - - - -In the main input field, describe what you want to accomplish. - - -Be specific about the goal, completion criteria, visual indicators, and all necessary data. - - -Check out our detailed [Prompting Guide](/prompting-guide). - -You can also select one of the quick-action buttons below the input for common example tasks like "Add a product to cart" or "Get an insurance quote." 
- -### Step 2: Configure settings (optional) - -Click the gear icon to access additional options: - -additional settings - -| Setting | What it does | -|---------|-------------| -| **Proxy Location** | Run automations from different geographic locations | -| **Browser Session ID** | Share login state between workflows | -| **2FA Identifier** | Handle two-factor authentication automatically | -| **Publish Workflow** | Save this task as a reusable template | -| **Data Schema** | Define structured JSON output format | - - -Use a JSON schema to get typed output: - -```json -{ - "type": "object", - "properties": { - "product_name": { "type": "string" }, - "price": { "type": "number" } - } -} -``` - - -### Step 3: Run the task - -Click the send button to start. - -In the background, Skyvern opens a browser, navigates to the URL you specified (or finds it using search), and interacts with the website component to get the job done. - -## Watch the Live Browser - -Once your task starts, you'll see the execution screen with two main panels: - - - -| **Left: Task configuration** | **Center: Live Browser** | **Right: Agent Logs** | -|---------|---------|-------------| -| Shows the block being executed with its URL and prompt. The status badge indicates whether the task is running, completed, or failed. | A real-time view of the browser as Skyvern navigates. You'll see pages load, forms fill, and buttons click—exactly as if you were doing it yourself. | A real-time log of LLM's reasoning output and tool usage. Useful for troubleshooting complex workflows. | - -The **"take control"** button lets you intervene and use the agent's browser yourself if it needs help with something unexpected, like a CAPTCHA or unusual login flow. - -## Review the Results - -After the task completes, go to **Runs** and open your latest task. 
- - - -### Extracted Data - -The AI extracts data based on your prompt and returns it as structured JSON: - -```json -{ - "post_title": "There's a ridiculous amount of tech in a disposable vape" -} -``` - -### Results Tabs - -The results page has several tabs: - -- **Actions** — Step-by-step breakdown of every action the AI took, with screenshots and reasoning -- **Recording** — Full video replay of the browser session -- **Parameters** — The task configuration you submitted (URL, prompt, webhooks, proxy, schema) -- **Diagnostics** — Debug info for troubleshooting: LLM prompts, element trees, annotated screenshots - -## What's Next? - - - - Create reusable, multi-step automations with the visual workflow builder - - - Store credentials securely for sites that require authentication - - diff --git a/docs/cloud/ui-overview.mdx b/docs/cloud/ui-overview.mdx new file mode 100644 index 00000000..3d903ac8 --- /dev/null +++ b/docs/cloud/ui-overview.mdx @@ -0,0 +1,84 @@ +--- +title: UI Overview +slug: cloud/ui-overview +subtitle: Navigate the Skyvern Cloud dashboard +--- + +Skyvern Cloud ([app.skyvern.com](https://app.skyvern.com)) lets you automate any website from your browser. Describe what you want in plain English, watch an AI-powered browser do it live, and get structured results back — no code required. + + +Looking to integrate Skyvern into your own app? See the [API Quickstart](/getting-started/quickstart) instead. + + +## The dashboard + +Sign in and you'll land on the **Discover** page — the starting point for running automations. + +Skyvern Cloud dashboard showing the Discover page + +The left sidebar is your navigation hub. Here's what each section does: + +### Build + +Where you create and monitor automations. + +| Page | Purpose | +|------|---------| +| **Discover** | Run one-off tasks. Type your instructions and target URL into a single prompt, pick an engine, and hit send. | +| **Workflows** | Build multi-step automations with the visual workflow editor. 
Add loops, conditionals, and data passing between steps. | +| **Runs** | Execution history for every task and workflow. Filter by status, drill into any run to see actions, recordings, and extracted data. | +| **Browsers** | Active browser sessions. Useful for persistent sessions that keep login state across tasks. | + +### Agents + +{/* TODO: Replace with screenshot of Agents section */} +Agents page showing pre-built automation templates + +Ready-made automation templates. Each agent is preconfigured with a prompt, target URL, and settings — pick one to see it work or use it as a starting point for your own task. + +### General + +| Page | Purpose | +|------|---------| +| **Billing** | Usage, remaining credits, and plan management. | +| **Credentials** | Store website logins securely. Skyvern uses these to authenticate automatically when it encounters a login page. | +| **Settings** | API key, account preferences, and organization management. | + +## How it works + +Every automation in Skyvern Cloud follows the same pattern: + + + + Type what you want into the prompt bar — include the target URL and your instructions in one go. Something like "Get the top post from https://news.ycombinator.com" or "Fill out the contact form at https://example.com/contact with my details." + + + A cloud browser opens and you see it navigate in real time. Pages load, elements highlight, actions fire. An agent log streams the AI's reasoning — every Thought and Decision — so you can follow along. If the AI gets stuck, hit **Take Control** to jump in and help. + + + Extracted data appears as structured JSON on the run detail page. Every run also includes an output view, full recording, the parameters you submitted, and auto-generated code to reproduce the task via API. + + + +That's it. The next guide walks you through this flow with a real example. 
+ +--- + +## Next steps + + + + Follow along with a real example to see Skyvern Cloud in action + + + Understand tasks, workflows, blocks, and other building blocks + + diff --git a/docs/cloud/your-first-task.mdx b/docs/cloud/your-first-task.mdx new file mode 100644 index 00000000..f31bdafe --- /dev/null +++ b/docs/cloud/your-first-task.mdx @@ -0,0 +1,131 @@ +--- +title: Your First Task +slug: cloud/your-first-task +subtitle: Run a browser automation from start to finish +--- + +Let's run a real automation. You'll tell Skyvern to visit a website, extract data, and return it as JSON. Then watch the entire thing happen live. + +## Step 1: Write your prompt + +Open [app.skyvern.com](https://app.skyvern.com) and you'll land on the **Discover** page. + +Discover page with a prompt entered + +The Discover page has a single input field. Type your instructions and include the target URL in the same prompt. For this example, enter: + +``` +Get the title of the #1 post on the front page for https://news.ycombinator.com +``` + +That's it. Skyvern parses the URL and figures out how to navigate the page and extract the data. + +Below the input, you'll see quick-action chips like "Add a product to cart" and "What's the top post on hackernews". Click any of these to try a pre-filled example instead. + + +The more specific your prompt, the better. "Get the title of the #1 post" works much better than "get some data." Include the exact fields you want, what success looks like, and any constraints. + + +## Step 2: Pick an engine and run + +Next to your prompt, you'll see an engine selector. Click it to switch engines: + +| Engine | When to use it | +|--------|---------------| +| **Skyvern 1.0** | Tasks with a simple, single goal: filling a form, searching for information on Google, reading content from a page | +| **Skyvern 2.0** | Complex, multi-step tasks. Scores state-of-the-art 85.85% on the WebVoyager benchmark | +| **Skyvern 2.0 with code** | The default engine. 
Same capabilities as Skyvern 2.0, plus auto-generates reusable code and a workflow from the run | + +For this example, keep the default **Skyvern 2.0 with code** selected. + +Click the **send button** (arrow icon to the right of the input). Skyvern generates a workflow from your prompt and opens it in the workflow editor. Click **Run** in the top right, confirm the parameters, then click **Run workflow** to start execution. + + +Click the **gear icon** next to send to configure additional options before running: + +| Setting | What it does | +|---------|-------------| +| **Webhook Callback URL** | Endpoint to receive the extracted data when the run completes | +| **Proxy Location** | Route Skyvern through one of the available proxies | +| **Browser Session ID** | Reuse a persistent browser session to keep login state | +| **Browser Address** | Connect to a specific browser server for the task run | +| **2FA Identifier** | Identifier for a 2FA code to handle two-factor auth automatically | +| **Extra HTTP Headers** | Custom HTTP request headers (dict format) | +| **Generate Script** | Auto-generate reusable scripts from a successful run | +| **Publish Workflow** | Create a workflow alongside this task run | +| **Max Steps Override** | Cap the number of steps the AI can take | +| **Data Schema** | Define structured JSON output format | +| **Max Screenshot Scrolls** | Limit scrolls for post-action screenshots (default: 3) | + +These are all optional. The defaults work for most tasks. + + +## Step 3: Watch the live browser + +This is where it gets interesting. Once the task starts, you'll see the run detail page with a live view of the browser: + +Run detail page showing a live browser navigating Hacker News + +On the left, a **live browser view**. You'll see pages load, elements highlight, and actions fire. + +On the right, the **agent log**. A running stream of the AI's Thoughts, Decisions, and block executions. 
If something goes wrong, this is where you'll figure out why. + +## Step 4: Review the results + +When the task finishes, the status badge flips to **completed** and the extracted data appears at the top of the page. + +Completed run showing extracted data and result tabs + +### Extracted data + +The **Extracted Information** block shows your results as structured JSON: + +```json +[ + { + "top_post_title": "Don't rent the cloud, own instead" + } +] +``` + +The agent log on the right confirms what happened. You'll see a final Thought summarizing the result. + +### Tabs + +Below the extracted data, five tabs give you different views of the run: + +- **Overview**: The default view. Shows extracted data and the agent log with every Thought and Decision. +- **Output**: The raw JSON output from the run. +- **Parameters**: The exact configuration that was submitted (URL, prompt, engine, schema). Useful for reproducing or tweaking the run. +- **Recording**: Full video replay of the browser session, start to finish. +- **Code**: Auto-generated code snippets to reproduce this task via the API or SDK. 
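If you copy the JSON from the **Output** tab into a script, a quick shape check can catch a run that returned the wrong fields. The sketch below is an illustration under stated assumptions, not part of Skyvern: `DATA_SCHEMA` and `matches_schema` are hypothetical names, the schema is one plausible Data Schema for this example, and only the Python standard library is used.

```python
import json

# Hypothetical schema for the Hacker News example (an assumption, not taken
# from Skyvern); the field name mirrors the sample output shown above.
DATA_SCHEMA = {
    "type": "object",
    "properties": {"top_post_title": {"type": "string"}},
    "required": ["top_post_title"],
}

# Maps the JSON Schema type names used here to Python types.
_TYPES = {"string": str, "number": (int, float), "object": dict}


def matches_schema(item, schema):
    """Return True if `item` has every required field with the declared type.

    A deliberately small check that covers flat objects only; a full
    JSON Schema validator handles far more cases.
    """
    if not isinstance(item, _TYPES[schema["type"]]):
        return False
    for field in schema.get("required", []):
        if field not in item:
            return False
    for field, spec in schema.get("properties", {}).items():
        if field in item and not isinstance(item[field], _TYPES[spec["type"]]):
            return False
    return True


# Sample output copied from the Extracted Information block above.
raw_output = '[{"top_post_title": "Don\'t rent the cloud, own instead"}]'
rows = json.loads(raw_output)

# The run returns a list of objects, so each entry is checked on its own.
all_valid = all(matches_schema(row, DATA_SCHEMA) for row in rows)
print(all_valid)
```

Supplying a schema in the **Data Schema** setting before the run should make this kind of after-the-fact check less necessary, since the output format is then defined up front.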
+ +## Try something bigger + +Now that you've seen the basic flow, here are a few ideas to try next: + +- **Fill a form**: Point Skyvern at a contact form and tell it what to enter in each field +- **Compare prices**: Extract product names and prices from an e-commerce page using a data schema +- **Navigate a flow**: Use the Skyvern 2.0 engine to walk through a multi-page checkout or signup process +- **Use an Agent template**: Check the **Agents** section in the sidebar for pre-built automations you can run instantly + +--- + +## Next steps + + + + Trigger automations programmatically with the Skyvern API + + + Understand tasks, workflows, and other building blocks + + diff --git a/docs/docs.json b/docs/docs.json index 389878f2..2c710905 100644 --- a/docs/docs.json +++ b/docs/docs.json @@ -76,9 +76,14 @@ }, { "tab": "Cloud UI", - "pages": [ - "cloud/getting-started", - "cloud/running-tasks" + "groups": [ + { + "group": "Getting Started", + "pages": [ + "cloud/ui-overview", + "cloud/your-first-task" + ] + } ] }, { diff --git a/docs/getting-started/quickstart.mdx b/docs/getting-started/quickstart.mdx index 92250625..4e1ac155 100644 --- a/docs/getting-started/quickstart.mdx +++ b/docs/getting-started/quickstart.mdx @@ -6,7 +6,7 @@ slug: getting-started/quickstart Run your first browser automation in 5 minutes. By the end of this guide, you'll scrape the top post from Hacker News using Skyvern's AI agent. -Prefer a visual interface? Try the [Cloud UI](/cloud/getting-started) instead — no code required. +Prefer a visual interface? Try the [Cloud UI](/cloud/ui-overview) instead — no code required. 
## Step 1: Get your API key diff --git a/docs/images/cloud/discover-prompt-in-process.png b/docs/images/cloud/discover-prompt-in-process.png new file mode 100644 index 00000000..3054a243 Binary files /dev/null and b/docs/images/cloud/discover-prompt-in-process.png differ diff --git a/docs/images/cloud/discover-workflow-completed.png b/docs/images/cloud/discover-workflow-completed.png new file mode 100644 index 00000000..a7895ffc Binary files /dev/null and b/docs/images/cloud/discover-workflow-completed.png differ diff --git a/docs/images/cloud/skyvern-cloud-discover.png b/docs/images/cloud/skyvern-cloud-discover.png new file mode 100644 index 00000000..3b2373ff Binary files /dev/null and b/docs/images/cloud/skyvern-cloud-discover.png differ