From 374a1003d7b211682082a7435c8fe67affda5e19 Mon Sep 17 00:00:00 2001
From: Suchintan
Date: Mon, 26 May 2025 23:09:54 -0400
Subject: [PATCH] Update README (#2470)
---
README.md | 338 ++++++++++++++----------------
fern/credentials/introduction.mdx | 8 +-
fern/credentials/totp.mdx | 2 +-
fern/docs.yml | 2 +-
skyvern/library/skyvern.py | 67 ++++++
5 files changed, 231 insertions(+), 186 deletions(-)
diff --git a/README.md b/README.md
index da94cc86..20971de4 100644
--- a/README.md
+++ b/README.md
@@ -31,91 +31,41 @@
Traditional approaches to browser automations required writing custom scripts for websites, often relying on DOM parsing and XPath-based interactions which would break whenever the website layouts changed.
-Instead of only relying on code-defined XPath interactions, Skyvern relies on prompts in addition to computer vision and LLMs to parse items in the viewport in real-time, create a plan for interaction and interact with them.
-
-This approach gives us a few advantages:
-
-1. Skyvern can operate on websites it’s never seen before, as it’s able to map visual elements to actions necessary to complete a workflow, without any customized code
-1. Skyvern is resistant to website layout changes, as there are no pre-determined XPaths or other selectors our system is looking for while trying to navigate
-1. Skyvern is able to take a single workflow and apply it to a large number of websites, as it’s able to reason through the interactions necessary to complete the workflow
-1. Skyvern leverages LLMs to reason through interactions to ensure we can cover complex situations. Examples include:
- 1. If you wanted to get an auto insurance quote from Geico, the answer to a common question “Were you eligible to drive at 18?” could be inferred from the driver receiving their license at age 16
- 1. If you were doing competitor analysis, it’s understanding that an Arnold Palmer 22 oz can at 7/11 is almost definitely the same product as a 23 oz can at Gopuff (even though the sizes are slightly different, which could be a rounding error!)
-
+Instead of relying only on code-defined XPath interactions, Skyvern uses vision LLMs to interact with websites.
Want to see examples of Skyvern in action? Jump to [#real-world-examples-of-skyvern](#real-world-examples-of-skyvern)
-# How it works
-Skyvern was inspired by the Task-Driven autonomous agent design popularized by [BabyAGI](https://github.com/yoheinakajima/babyagi) and [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT) -- with one major bonus: we give Skyvern the ability to interact with websites using browser automation libraries like [Playwright](https://playwright.dev/).
-
-Skyvern uses a swarm of agents to comprehend a website, and plan and execute its actions:
-1. **Interactable Element Agent**: This agent is responsible for parsing the HTML of a website and extracting the interactable elements.
-2. **Navigation Agent**: This agent is responsible for planning the navigation to complete a task. Examples include clicking buttons, inserting text, selecting options, etc.
-3. **Data Extraction Agent**: This agent is responsible for extracting data from a website. It's capable of reading the tables and text on the page, and extracting the output in a user-defined structured format
-4. **Password Agent**: This agent is responsible for filling out password forms on a website. It's capable of reading the username and password from a password manager, and filling out the form while preserving the privacy of the user-defined secrets.
-5. **2FA Agent**: This agent is responsible for filling out 2FA forms on a website. It's capable of intercepting website requests for 2FAs, and either requesting user-defined APIs for 2FA codes or waiting for users to feed 2FA codes into it, and then completing the login process.
-6. **Dynamic Auto-complete Agent**: This agent is responsible for filling out dynamic auto-complete forms on a website. It's capable of reading the options presented to it, selecting the appropriate option based on the user's input, and adjusting its inputs based on the feedback from inside the form. Popular examples include: Address forms, university dropdowns, and more.
-
-
-
-
-
-
-# Demo
-
-https://github.com/user-attachments/assets/5cab4668-e8e2-4982-8551-aab05ff73a7f
-
-# Skyvern Cloud
-We offer a managed cloud version of Skyvern that allows you to run Skyvern without having to manage the infrastructure. It allows you to run multiple Skyvern instances in parallel and comes bundled with anti-bot detection mechanisms, proxy network, and CAPTCHA solvers.
-
-If you'd like to try it out,
-1. Navigate to [app.skyvern.com](https://app.skyvern.com)
-1. Create an account & Get $5 of credits on us
-1. Kick off your first task and see Skyvern in action!
-
# Quickstart
-This quickstart guide will walk you through getting Skyvern up and running on your local machine.
-## Install & Run
-> ⚠️ **REQUIREMENT**: This project requires Python 3.11 ⚠️
+## Skyvern Cloud
+[Skyvern Cloud](https://app.skyvern.com) is a managed cloud version of Skyvern that allows you to run Skyvern without worrying about the infrastructure. It allows you to run multiple Skyvern instances in parallel and comes bundled with anti-bot detection mechanisms, proxy network, and CAPTCHA solvers.
+
+If you'd like to try it out, navigate to [app.skyvern.com](https://app.skyvern.com) and create an account.
+
+## Local Install & Run
+> ⚠️ **REQUIREMENT**: This project requires at least Python 3.11 ⚠️
1. **Install Skyvern**
- ```bash
- pip install skyvern
- ```
-2. **Configure Skyvern** Run the setup wizard which will guide you through the configuration process, including Skyvern [MCP](https://github.com/Skyvern-AI/skyvern/blob/main/integrations/mcp/README.md) integration. This will generate a `.env` as the configuration settings file.
- ```bash
- skyvern init
- ```
+```bash
+pip install skyvern
+```
-3. **Launch the Skyvern Server**
+2. **Run Skyvern**
- ```bash
- skyvern run server
- ```
+```bash
+skyvern quickstart
+```
-4. **Launch the Skyvern UI**
-
- ```bash
- skyvern run ui
- ```
-
-5. **Check component status**
-
- ```bash
- skyvern status
- ```
-
-6. **Run task**
+3. **Run task**
Run a skyvern task locally:
```python
from skyvern import Skyvern
skyvern = Skyvern()
- task = await skyvern.run_task(prompt="Find the top post on hackernews today")
+ task = await skyvern.local.run_task(prompt="Find the top post on hackernews today")
print(task)
```
A local browser will pop up. Skyvern will start executing the task in the browser and close it when the task is done. You will be able to review the task from http://localhost:8080/history
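
The snippet above uses a bare `await`, which only works where an event loop is already running (for example a notebook or an async REPL). In a plain script, one option is to wrap the call in `asyncio.run`. The following is a minimal sketch under that assumption; `run_prompt` and `main` are hypothetical helper names introduced here for illustration, not part of the Skyvern library, and the runner is passed in as a parameter so the helper does not depend on how Skyvern is constructed.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def run_prompt(run_task: Callable[..., Awaitable[Any]], prompt: str) -> Any:
    """Await an async task runner with a prompt and return its result.

    `run_task` is any coroutine function accepting a `prompt` keyword,
    such as `skyvern.local.run_task` from the snippet above.
    """
    return await run_task(prompt=prompt)


def main() -> None:
    # Imported lazily so this file can be loaded without Skyvern installed.
    from skyvern import Skyvern

    skyvern = Skyvern()
    # asyncio.run creates an event loop, runs the coroutine, and closes the loop.
    task = asyncio.run(
        run_prompt(skyvern.local.run_task, "Find the top post on hackernews today")
    )
    print(task)
```

Calling `main()` from a script should then behave like the quickstart snippet, assuming `pip install skyvern` and `skyvern quickstart` have already been run.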
@@ -135,6 +85,28 @@ This quickstart guide will walk you through getting Skyvern up and running on yo
print(task)
```
+### Helpful commands to debug issues
+
+
+**Launch the Skyvern Server Separately**
+
+```bash
+skyvern run server
+```
+
+**Launch the Skyvern UI**
+
+```bash
+skyvern run ui
+```
+
+**Check status of the Skyvern service**
+
+```bash
+skyvern status
+```
+
+
## Docker Compose setup
1. Make sure you have [Docker Desktop](https://www.docker.com/products/docker-desktop/) installed and running on your machine
@@ -147,45 +119,6 @@ This quickstart guide will walk you through getting Skyvern up and running on yo
```
3. Navigate to `http://localhost:8080` in your browser to start using the UI
-## Local vs. Docker
-
-Skyvern supports two startup options for local development:
-
-**Before Option 1**
-* If using Option 1, working in a virtual environment may improve the overall experience. Here's how to activate it:
-
- ```bash
- python3 -m venv skyvern-env/
- cd skyvern
- source ../skyvern-env/bin/activate
- pip install skyvern (Only if not already installed)
- pip install -e .
- ```
-
- Deactivation:
- ```bash
- deactivate
- ```
-
-**Option 1: Native (Postgres created by the CLI wizard)**
-
-* Use the CLI wizard to spin up a disposable Postgres container, then run the backend natively.
-* **Start with:**
- ```bash
- skyvern init
- skyvern run server
- ```
-* This reuses the Postgres container created by the wizard.
-
-**Option 2: Docker Compose (Postgres created with Compose)**
-
-* Use Docker Compose to start all services, including a new Postgres container.
-* **Start with:**
- ```bash
- docker compose up -d
- ```
-* This launches a new Postgres instance inside Compose.
-
> **Important:** Only one Postgres container can run on port 5432 at a time. If you switch from the CLI-managed Postgres to Docker Compose, you must first remove the original container:
> ```bash
> docker rm -f postgresql-container
@@ -193,21 +126,32 @@ Skyvern supports two startup options for local development:
If you encounter any database related errors while using Docker to run Skyvern, check which Postgres container is running with `docker ps`.
-## Model Context Protocol (MCP)
-See the MCP documentation [here](https://github.com/Skyvern-AI/skyvern/blob/main/integrations/mcp/README.md)
-## Prompting Tips
+# How it works
+Skyvern was inspired by the Task-Driven autonomous agent design popularized by [BabyAGI](https://github.com/yoheinakajima/babyagi) and [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT) -- with one major bonus: we give Skyvern the ability to interact with websites using browser automation libraries like [Playwright](https://playwright.dev/).
-Here are some tips that may help you on your adventure:
-1. Skyvern is really good at carrying out a single goal. If you give it too many instructions to do, it has a high likelihood of hallucinating along the way.
-2. Being really explicit about goals is very important. For example, if you're generating an insurance quote, let it know very clearly how it can identify it has accomplished its goals. Use words like "COMPLETE" or "TERMINATE" to indicate success and failure modes, respectively.
-3. Workflows can be used if you'd like to do more advanced things such as chaining multiple instructions together, or securely logging in. If you need any help with this, please feel free to book some time with us! We're always happy to help
+Skyvern uses a swarm of agents to comprehend a website, and plan and execute its actions:
+
+
+
+
+
+
+This approach has a few advantages:
+
+1. Skyvern can operate on websites it's never seen before, as it's able to map visual elements to actions necessary to complete a workflow, without any customized code
+1. Skyvern is resistant to website layout changes, as there are no pre-determined XPaths or other selectors our system is looking for while trying to navigate
+1. Skyvern is able to take a single workflow and apply it to a large number of websites, as it's able to reason through the interactions necessary to complete the workflow
+1. Skyvern leverages LLMs to reason through interactions to ensure we can cover complex situations. Examples include:
+ 1. If you wanted to get an auto insurance quote from Geico, the answer to a common question "Were you eligible to drive at 18?" could be inferred from the driver receiving their license at age 16
+ 1. If you were doing competitor analysis, it's understanding that an Arnold Palmer 22 oz can at 7/11 is almost definitely the same product as a 23 oz can at Gopuff (even though the sizes are slightly different, which could be a rounding error!)
-# Supported Functionality
+# Demo
+
+https://github.com/user-attachments/assets/5cab4668-e8e2-4982-8551-aab05ff73a7f
-## Skyvern 2.0
-Skyvern 2.0 is a major overhaul of Skyvern that includes a multi-agent architecture with a planner + validator agent, allowing Skyvern to complete more complex tasks with a zero-shot prompt.
+# Skyvern Features
## Skyvern Tasks
Tasks are the fundamental building block inside Skyvern. Each task is a single request to Skyvern, instructing it to navigate through a website and accomplish a specific goal.
@@ -257,26 +201,42 @@ You can also specify a `data_extraction_schema` directly within the main prompt
## File Downloading
Skyvern is also capable of downloading files from a website. All downloaded files are automatically uploaded to block storage (if configured), and you can access them via the UI.
-## Authentication (Beta)
+## Authentication
Skyvern supports a number of different authentication methods to make it easier to automate tasks behind a login. If you'd like to try it out, please reach out to us [via email](mailto:founders@skyvern.com) or [discord](https://discord.gg/fG2XXEuQX3).
+
+
+
+
+
+### 🔐 2FA Support (TOTP)
+Skyvern supports a number of different 2FA methods to allow you to automate workflows that require 2FA.
+
+Examples include:
+1. QR-based 2FA (e.g. Google Authenticator, Authy)
+1. Email-based 2FA
+1. SMS-based 2FA
+
+🔐 Learn more about 2FA support [here](https://docs.skyvern.com/credentials/totp).
+
### Password Manager Integrations
Skyvern currently supports the following password manager integrations:
- [x] Bitwarden
- [ ] 1Password
- [ ] LastPass
-
-
-
-### 2FA
-Skyvern supports a number of different 2FA methods to allow you to automate workflows that require 2FA.
+## Model Context Protocol (MCP)
+Skyvern supports the Model Context Protocol (MCP) to allow you to use any LLM that supports MCP.
-Examples include:
-1. QR-based 2FA (e.g. Google Authenticator, Authy)
-1. Email based 2FA
-1. SMS based 2FA
+See the MCP documentation [here](https://github.com/Skyvern-AI/skyvern/blob/main/integrations/mcp/README.md)
+
+## Zapier / Make.com / N8N Integration
+Skyvern supports Zapier, Make.com, and N8N to allow you to connect your Skyvern workflows to other apps.
+
+* [Zapier](https://docs.skyvern.com/integrations/zapier)
+* [Make.com](https://docs.skyvern.com/integrations/make.com)
+* [N8N](https://docs.skyvern.com/integrations/n8n)
# Real-world examples of Skyvern
@@ -328,53 +288,19 @@ We love to see how Skyvern is being used in the wild. Here are some examples of
# Contributor Setup
-### Prerequisites
-> :warning: :warning: MAKE SURE YOU ARE USING PYTHON 3.11 :warning: :warning:
-:warning: :warning: Only well-tested on MacOS :warning: :warning:
+The following command sets up your development environment to use pre-commit (our commit hook handler)
+```bash
+skyvern quickstart contributors
+```
-Before you begin, make sure you have the following installed:
-- [Brew (if you're on a Mac)](https://brew.sh/)
-- [Poetry](https://python-poetry.org/docs/#installation)
- - `brew install poetry`
-- [node](https://nodejs.org/en/download/)
-- [Docker](https://docs.docker.com/engine/install/)
-
-
-Note: Our setup script does these two for you, but they are here for reference.
-- [Python 3.11](https://www.python.org/downloads/)
- - `poetry env use 3.11`
-- [PostgreSQL 14](https://www.postgresql.org/download/) (if you're on a Mac, setup script will install it for you if you have homebrew installed)
- - `brew install postgresql`
-
-## Setup (Contributors)
-1. Clone the repository and navigate to the root directory
-1. Open Docker Desktop (Works for Windows, macOS, and Linux) or run Docker Daemon
-1. Run the setup script to install the necessary dependencies and setup your environment
- ```bash
- ./setup.sh
- ```
-1. Start the server
- ```bash
- ./run_skyvern.sh
- ```
-1. You can start sending requests to the server, but we built a simple UI to help you get started. To start the UI, run the following command:
- ```bash
- ./run_ui.sh
- ```
1. Navigate to `http://localhost:8080` in your browser to start using the UI
*The Skyvern CLI supports Windows, WSL, macOS, and Linux environments.*
-## Additional Setup for Contributors
-If you're looking to contribute to Skyvern, you'll need to install the pre-commit hooks to ensure code quality and consistency. You can do this by running the following command:
-```bash
-pre-commit install
-```
-
# Documentation
-More extensive documentation can be found on our [docs page](https://docs.skyvern.com). Please let us know if something is unclear or missing by opening an issue or reaching out to us [via email](mailto:founders@skyvern.com) or [discord](https://discord.gg/fG2XXEuQX3).
+More extensive documentation can be found on our [📕 docs page](https://docs.skyvern.com). Please let us know if something is unclear or missing by opening an issue or reaching out to us [via email](mailto:founders@skyvern.com) or [discord](https://discord.gg/fG2XXEuQX3).
# Supported LLMs
| Provider | Supported Models |
@@ -391,47 +317,99 @@ More extensive documentation can be found on our [docs page](https://docs.skyver
| OpenAI-compatible | Any custom API endpoint that follows OpenAI's API format (via [liteLLM](https://docs.litellm.ai/docs/providers/openai_compatible)) |
#### Environment Variables
+
+##### OpenAI
| Variable | Description| Type | Sample Value|
| -------- | ------- | ------- | ------- |
| `ENABLE_OPENAI`| Register OpenAI models | Boolean | `true`, `false` |
-| `ENABLE_ANTHROPIC` | Register Anthropic models| Boolean | `true`, `false` |
-| `ENABLE_AZURE` | Register Azure OpenAI models | Boolean | `true`, `false` |
-| `ENABLE_BEDROCK` | Register AWS Bedrock models. To use AWS Bedrock, you need to make sure your [AWS configurations](https://github.com/boto/boto3?tab=readme-ov-file#using-boto3) are set up correctly first. | Boolean | `true`, `false` |
-| `ENABLE_GEMINI` | Register Gemini models| Boolean | `true`, `false` |
-| `ENABLE_NOVITA`| Register Novita AI models | Boolean | `true`, `false` |
-| `ENABLE_OLLAMA`| Register local models via Ollama | Boolean | `true`, `false` |
-| `ENABLE_OPENROUTER`| Register OpenRouter models | Boolean | `true`, `false` |
-| `ENABLE_OPENAI_COMPATIBLE`| Register a custom OpenAI-compatible API endpoint | Boolean | `true`, `false` |
-| `LLM_KEY` | The name of the model you want to use | String | Currently supported llm keys: `OPENAI_GPT4_TURBO`, `OPENAI_GPT4V`, `OPENAI_GPT4O`, `OPENAI_GPT4O_MINI`, `ANTHROPIC_CLAUDE3`, `ANTHROPIC_CLAUDE3_OPUS`, `ANTHROPIC_CLAUDE3_SONNET`, `ANTHROPIC_CLAUDE3_HAIKU`, `ANTHROPIC_CLAUDE3.5_SONNET`, `BEDROCK_ANTHROPIC_CLAUDE3_OPUS`, `BEDROCK_ANTHROPIC_CLAUDE3_SONNET`, `BEDROCK_ANTHROPIC_CLAUDE3_HAIKU`, `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET`, `AZURE_OPENAI`, `GEMINI_PRO`, `GEMINI_FLASH`, `BEDROCK_AMAZON_NOVA_PRO`, `BEDROCK_AMAZON_NOVA_LITE`, `OLLAMA`, `OPENROUTER`, `OPENAI_COMPATIBLE`|
-| `SECONDARY_LLM_KEY` | The name of the model for mini agents skyvern runs with | String | Currently supported llm keys: `OPENAI_GPT4_TURBO`, `OPENAI_GPT4V`, `OPENAI_GPT4O`, `OPENAI_GPT4O_MINI`, `ANTHROPIC_CLAUDE3`, `ANTHROPIC_CLAUDE3_OPUS`, `ANTHROPIC_CLAUDE3_SONNET`, `ANTHROPIC_CLAUDE3_HAIKU`, `ANTHROPIC_CLAUDE3.5_SONNET`, `BEDROCK_ANTHROPIC_CLAUDE3_OPUS`, `BEDROCK_ANTHROPIC_CLAUDE3_SONNET`, `BEDROCK_ANTHROPIC_CLAUDE3_HAIKU`, `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET`, `AZURE_OPENAI`, `GEMINI_PRO`, `GEMINI_FLASH`, `NOVITA_DEEPSEEK_R1`, `NOVITA_DEEPSEEK_V3`, `NOVITA_LLAMA_3_3_70B`, `NOVITA_LLAMA_3_2_1B`, `NOVITA_LLAMA_3_2_3B`, `NOVITA_LLAMA_3_2_11B_VISION`, `NOVITA_LLAMA_3_1_8B`, `NOVITA_LLAMA_3_1_70B`, `NOVITA_LLAMA_3_1_405B`, `NOVITA_LLAMA_3_8B`, `NOVITA_LLAMA_3_70B`, `OLLAMA`, `OPENROUTER`, `OPENAI_COMPATIBLE`|
| `OPENAI_API_KEY` | OpenAI API Key | String | `sk-1234567890` |
| `OPENAI_API_BASE` | OpenAI API Base, optional | String | `https://openai.api.base` |
| `OPENAI_ORGANIZATION` | OpenAI Organization ID, optional | String | `your-org-id` |
+
+Supported LLM Keys: `OPENAI_GPT4_TURBO`, `OPENAI_GPT4V`, `OPENAI_GPT4O`, `OPENAI_GPT4O_MINI`
+
+##### Anthropic
+| Variable | Description| Type | Sample Value|
+| -------- | ------- | ------- | ------- |
+| `ENABLE_ANTHROPIC` | Register Anthropic models| Boolean | `true`, `false` |
| `ANTHROPIC_API_KEY` | Anthropic API key| String | `sk-1234567890` |
+
+Supported LLM Keys: `ANTHROPIC_CLAUDE3`, `ANTHROPIC_CLAUDE3_OPUS`, `ANTHROPIC_CLAUDE3_SONNET`, `ANTHROPIC_CLAUDE3_HAIKU`, `ANTHROPIC_CLAUDE3.5_SONNET`
+
+##### Azure OpenAI
+| Variable | Description| Type | Sample Value|
+| -------- | ------- | ------- | ------- |
+| `ENABLE_AZURE` | Register Azure OpenAI models | Boolean | `true`, `false` |
| `AZURE_API_KEY` | Azure deployment API key | String | `sk-1234567890` |
| `AZURE_DEPLOYMENT` | Azure OpenAI Deployment Name | String | `skyvern-deployment`|
| `AZURE_API_BASE` | Azure deployment api base url| String | `https://skyvern-deployment.openai.azure.com/`|
| `AZURE_API_VERSION` | Azure API Version| String | `2024-02-01`|
+
+Supported LLM Key: `AZURE_OPENAI`
+
+##### AWS Bedrock
+| Variable | Description| Type | Sample Value|
+| -------- | ------- | ------- | ------- |
+| `ENABLE_BEDROCK` | Register AWS Bedrock models. To use AWS Bedrock, you need to make sure your [AWS configurations](https://github.com/boto/boto3?tab=readme-ov-file#using-boto3) are set up correctly first. | Boolean | `true`, `false` |
+
+Supported LLM Keys: `BEDROCK_ANTHROPIC_CLAUDE3_OPUS`, `BEDROCK_ANTHROPIC_CLAUDE3_SONNET`, `BEDROCK_ANTHROPIC_CLAUDE3_HAIKU`, `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET`, `BEDROCK_AMAZON_NOVA_PRO`, `BEDROCK_AMAZON_NOVA_LITE`
+
+##### Gemini
+| Variable | Description| Type | Sample Value|
+| -------- | ------- | ------- | ------- |
+| `ENABLE_GEMINI` | Register Gemini models| Boolean | `true`, `false` |
| `GEMINI_API_KEY` | Gemini API Key| String | `your_google_gemini_api_key`|
+
+Supported LLM Keys: `GEMINI_PRO`, `GEMINI_FLASH`
+
+##### Novita AI
+| Variable | Description| Type | Sample Value|
+| -------- | ------- | ------- | ------- |
+| `ENABLE_NOVITA`| Register Novita AI models | Boolean | `true`, `false` |
| `NOVITA_API_KEY` | Novita AI API Key| String | `your_novita_api_key`|
+
+Supported LLM Keys: `NOVITA_DEEPSEEK_R1`, `NOVITA_DEEPSEEK_V3`, `NOVITA_LLAMA_3_3_70B`, `NOVITA_LLAMA_3_2_1B`, `NOVITA_LLAMA_3_2_3B`, `NOVITA_LLAMA_3_2_11B_VISION`, `NOVITA_LLAMA_3_1_8B`, `NOVITA_LLAMA_3_1_70B`, `NOVITA_LLAMA_3_1_405B`, `NOVITA_LLAMA_3_8B`, `NOVITA_LLAMA_3_70B`
+
+##### Ollama
+| Variable | Description| Type | Sample Value|
+| -------- | ------- | ------- | ------- |
+| `ENABLE_OLLAMA`| Register local models via Ollama | Boolean | `true`, `false` |
| `OLLAMA_SERVER_URL` | URL for your Ollama server | String | `http://host.docker.internal:11434` |
| `OLLAMA_MODEL` | Ollama model name to load | String | `qwen2.5:7b-instruct` |
+
+Supported LLM Key: `OLLAMA`
+
+##### OpenRouter
+| Variable | Description| Type | Sample Value|
+| -------- | ------- | ------- | ------- |
+| `ENABLE_OPENROUTER`| Register OpenRouter models | Boolean | `true`, `false` |
| `OPENROUTER_API_KEY` | OpenRouter API key | String | `sk-1234567890` |
| `OPENROUTER_MODEL` | OpenRouter model name | String | `mistralai/mistral-small-3.1-24b-instruct` |
| `OPENROUTER_API_BASE` | OpenRouter API base URL | String | `https://api.openrouter.ai/v1` |
-| `LLM_CONFIG_MAX_TOKENS` | Override the max tokens used by the LLM | Integer | `128000` |
+
+Supported LLM Key: `OPENROUTER`
+
+##### OpenAI-Compatible
+| Variable | Description| Type | Sample Value|
+| -------- | ------- | ------- | ------- |
+| `ENABLE_OPENAI_COMPATIBLE`| Register a custom OpenAI-compatible API endpoint | Boolean | `true`, `false` |
| `OPENAI_COMPATIBLE_MODEL_NAME` | Model name for OpenAI-compatible endpoint | String | `yi-34b`, `gpt-3.5-turbo`, `mistral-large`, etc.|
| `OPENAI_COMPATIBLE_API_KEY` | API key for OpenAI-compatible endpoint | String | `sk-1234567890`|
| `OPENAI_COMPATIBLE_API_BASE` | Base URL for OpenAI-compatible endpoint | String | `https://api.together.xyz/v1`, `http://localhost:8000/v1`, etc.|
-
-#### Environment Variables (OpenAI-Compatible model - additional config)
-| Variable | Description| Type | Sample Value|
-| -------- | ------- | ------- | ------- |
| `OPENAI_COMPATIBLE_API_VERSION` | API version for OpenAI-compatible endpoint, optional| String | `2023-05-15`|
| `OPENAI_COMPATIBLE_MAX_TOKENS` | Maximum tokens for completion, optional| Integer | `4096`, `8192`, etc.|
| `OPENAI_COMPATIBLE_TEMPERATURE` | Temperature setting, optional| Float | `0.0`, `0.5`, `0.7`, etc.|
| `OPENAI_COMPATIBLE_SUPPORTS_VISION` | Whether model supports vision, optional| Boolean | `true`, `false`|
+Supported LLM Key: `OPENAI_COMPATIBLE`
+
+##### General LLM Configuration
+| Variable | Description| Type | Sample Value|
+| -------- | ------- | ------- | ------- |
+| `LLM_KEY` | The name of the model you want to use | String | See supported LLM keys above |
+| `SECONDARY_LLM_KEY` | The name of the model Skyvern uses for its mini agents | String | See supported LLM keys above |
+| `LLM_CONFIG_MAX_TOKENS` | Override the max tokens used by the LLM | Integer | `128000` |
+
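Because the LLM keys are now listed per provider, it can help to check that the provider behind `LLM_KEY` (and `SECONDARY_LLM_KEY`, if set) is actually registered through its `ENABLE_*` flag. The sketch below is only an illustration built from the variable names in the tables above. The helper names `required_flag` and `check_llm_env` are hypothetical, and the rule that a key needs its provider's flag set to `true` is an assumption drawn from the "Register ... models" descriptions, not confirmed Skyvern behavior.

```python
from typing import List, Mapping, Optional

# Prefixes are checked in order: OPENAI_COMPATIBLE must come before OPENAI_,
# because the former also starts with the latter.
_PREFIX_TO_FLAG = [
    ("OPENAI_COMPATIBLE", "ENABLE_OPENAI_COMPATIBLE"),
    ("OPENAI_", "ENABLE_OPENAI"),
    ("ANTHROPIC_", "ENABLE_ANTHROPIC"),
    ("AZURE_OPENAI", "ENABLE_AZURE"),
    ("BEDROCK_", "ENABLE_BEDROCK"),
    ("GEMINI_", "ENABLE_GEMINI"),
    ("NOVITA_", "ENABLE_NOVITA"),
    ("OLLAMA", "ENABLE_OLLAMA"),
    ("OPENROUTER", "ENABLE_OPENROUTER"),
]


def required_flag(llm_key: str) -> Optional[str]:
    """Return the ENABLE_* variable for an LLM key, or None if no prefix matches."""
    for prefix, flag in _PREFIX_TO_FLAG:
        if llm_key.startswith(prefix):
            return flag
    return None


def check_llm_env(env: Mapping[str, str]) -> List[str]:
    """List mismatches between the configured LLM keys and the ENABLE_* flags."""
    problems = []
    for var in ("LLM_KEY", "SECONDARY_LLM_KEY"):
        key = env.get(var)
        if not key:
            continue  # unset variables are skipped rather than reported
        flag = required_flag(key)
        if flag is None:
            problems.append(f"{var}={key} does not match any provider listed above")
        elif env.get(flag, "").strip().lower() != "true":
            problems.append(f"{var}={key} expects {flag}=true")
    return problems
```

Passing `os.environ` (after `import os`) or the parsed contents of a `.env` file to `check_llm_env` returns an empty list when every configured key has its provider flag enabled.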
# Feature Roadmap
This is our planned roadmap for the next few months. If you have any suggestions or would like to see a feature added, please don't hesitate to reach out to us [via email](mailto:founders@skyvern.com) or [discord](https://discord.gg/fG2XXEuQX3).
diff --git a/fern/credentials/introduction.mdx b/fern/credentials/introduction.mdx
index d652231b..99529d6a 100644
--- a/fern/credentials/introduction.mdx
+++ b/fern/credentials/introduction.mdx
@@ -24,7 +24,7 @@ In many scenarios, agents need access to sensitive information to complete tasks
-## TOTP (2FA/MFA/Verification Code) Support
+## 2FA Support (TOTP)
Many websites require entering a TOTP (2FA/MFA/Verification) code during login. Skyvern supports TOTP (2FA/MFA/Verification Code) natively.
@@ -37,15 +37,15 @@ Many websites require entering a TOTP (2FA/MFA/Verification) code during login.
- Confirmation link sent to email. Click the link and create an account. (Talk to Skyvern Support if you need this)
- One time login link sent to email. Click and login. (Talk to Skyvern Support if you need this)
-See [TOTP (2FA/MFA/Verification Code)](/credentials/totp) for more details.
+See [2FA Support (TOTP)](/credentials/totp) for more details.
- Manage and use TOTP (2FA/MFA/Verification Code) with Skyvern Agent
+ Manage and use 2FA (TOTP) with Skyvern Agent
diff --git a/fern/credentials/totp.mdx b/fern/credentials/totp.mdx
index 60d18b8c..2be29725 100644
--- a/fern/credentials/totp.mdx
+++ b/fern/credentials/totp.mdx
@@ -1,5 +1,5 @@
---
-title: TOTP (2FA/MFA/Verification Code)
+title: 2FA Support (TOTP)
subtitle: How to send TOTP codes (2FA/MFA/Verification Code) to Skyvern
slug: credentials/totp
---
diff --git a/fern/docs.yml b/fern/docs.yml
index ef9f53fc..17cca4e5 100644
--- a/fern/docs.yml
+++ b/fern/docs.yml
@@ -108,7 +108,7 @@ navigation:
path: credentials/passwords.mdx
- page: Credit Card Management
path: credentials/credit-cards.mdx
- - page: TOTP (2FA/MFA/Verification Code)
+ - page: 2FA Support (TOTP)
path: credentials/totp.mdx
- page: Bitwarden
path: credentials/bitwarden.mdx
diff --git a/skyvern/library/skyvern.py b/skyvern/library/skyvern.py
index 5559e3de..6632c2d1 100644
--- a/skyvern/library/skyvern.py
+++ b/skyvern/library/skyvern.py
@@ -30,6 +30,73 @@ from skyvern.utils import migrate_db
class Skyvern(AsyncSkyvern):
+ class local:
+ """Internal namespace for local mode operations."""
+
+ @staticmethod
+ async def run_task(
+ prompt: str,
+ engine: RunEngine = RunEngine.skyvern_v2,
+ url: str | None = None,
+ webhook_url: str | None = None,
+ totp_identifier: str | None = None,
+ totp_url: str | None = None,
+ title: str | None = None,
+ error_code_mapping: dict[str, str] | None = None,
+ data_extraction_schema: dict[str, Any] | str | None = None,
+ proxy_location: ProxyLocation | None = None,
+ max_steps: int | None = None,
+ wait_for_completion: bool = True,
+ timeout: float = DEFAULT_AGENT_TIMEOUT,
+ browser_session_id: str | None = None,
+ user_agent: str | None = None,
+ ) -> TaskRunResponse:
+ """
+ Run a task using Skyvern in local mode.
+ This is a wrapper around Skyvern.run_task that ensures it's used in local mode.
+
+ Args:
+ prompt: The prompt describing the task to run
+ engine: The engine to use for running the task
+ url: Optional URL to navigate to
+ webhook_url: Optional webhook URL for callbacks
+ totp_identifier: Optional TOTP identifier
+ totp_url: Optional TOTP verification URL
+ title: Optional title for the task
+ error_code_mapping: Optional mapping of error codes to messages
+ data_extraction_schema: Optional schema for data extraction
+ proxy_location: Optional proxy location
+ max_steps: Optional maximum number of steps
+ wait_for_completion: Whether to wait for task completion
+ timeout: Timeout in seconds
+ browser_session_id: Optional browser session ID
+ user_agent: Optional user agent string
+
+ Returns:
+ TaskRunResponse: The response from running the task
+
+ Raises:
+ ValueError: If an API key is provided (this function is for local mode only)
+ """
+ skyvern = Skyvern() # Initialize in local mode (no API key)
+ return await skyvern.run_task(
+ prompt=prompt,
+ engine=engine,
+ url=url,
+ webhook_url=webhook_url,
+ totp_identifier=totp_identifier,
+ totp_url=totp_url,
+ title=title,
+ error_code_mapping=error_code_mapping,
+ data_extraction_schema=data_extraction_schema,
+ proxy_location=proxy_location,
+ max_steps=max_steps,
+ wait_for_completion=wait_for_completion,
+ timeout=timeout,
+ browser_session_id=browser_session_id,
+ user_agent=user_agent,
+ )
+
def __init__(
self,
*,