Improve TOTP docs & README & Make CLI actually support typer + py3.11 (#2791)

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Suchintan
2025-06-25 12:59:56 -04:00
committed by GitHub
parent 60dcd6bcb1
commit 9c9760d6ca
9 changed files with 307 additions and 179 deletions


@@ -57,7 +57,8 @@ repos:
- id: pyupgrade - id: pyupgrade
exclude: | exclude: |
(?x)( (?x)(
^skyvern/client/.* ^skyvern/client/.*|
^skyvern/cli/.*
) )
- repo: https://github.com/pre-commit/mirrors-mypy - repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.16.0 rev: v1.16.0
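The `exclude` value in the hunk above is a verbose-mode (`(?x)`) Python regular expression, which pre-commit matches against each file path. A minimal sketch of how the new two-branch pattern behaves, using only the standard library; the sample paths and the helper name `is_excluded` are illustrative assumptions, not taken from the repository:

```python
import re

# The exclude pattern as it reads after this change. In verbose mode,
# whitespace and newlines inside the pattern are ignored.
EXCLUDE = r"""(?x)(
    ^skyvern/client/.*|
    ^skyvern/cli/.*
)"""


def is_excluded(path: str) -> bool:
    """Return True if the hook would skip this path (hypothetical helper)."""
    return re.search(EXCLUDE, path) is not None


# Sample paths are made up for illustration.
for sample in ("skyvern/client/api.py", "skyvern/cli/run.py", "skyvern/forge/app.py"):
    print(sample, is_excluded(sample))
```

Both branches are needed: `^skyvern/cli/.*` requires a slash right after `cli`, so it does not cover `skyvern/client/`.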

180
README.md

@@ -31,11 +31,10 @@
Traditional approaches to browser automations required writing custom scripts for websites, often relying on DOM parsing and XPath-based interactions which would break whenever the website layouts changed. Traditional approaches to browser automations required writing custom scripts for websites, often relying on DOM parsing and XPath-based interactions which would break whenever the website layouts changed.
Instead of only relying on code-defined XPath interactions, Skyvern relies on Vision LLMs to interact with the websites. Instead of only relying on code-defined XPath interactions, Skyvern relies on Vision LLMs to learn and interact with the websites.
Want to see examples of Skyvern in action? Jump to [#real-world-examples-of-skyvern](#real-world-examples-of-skyvern) Want to see examples of Skyvern in action? Jump to [#real-world-examples-of-skyvern](#real-world-examples-of-skyvern)
# Quickstart # Quickstart
## Skyvern Cloud ## Skyvern Cloud
@@ -44,7 +43,6 @@ Want to see examples of Skyvern in action? Jump to [#real-world-examples-of-skyv
If you'd like to try it out, navigate to [app.skyvern.com](https://app.skyvern.com) and create an account. If you'd like to try it out, navigate to [app.skyvern.com](https://app.skyvern.com) and create an account.
## Install & Run ## Install & Run
> ⚠️ **Supported Python Versions**: Python 3.11, 3.12, 3.13 ⚠️
### 1. Install Skyvern ### 1. Install Skyvern
@@ -60,6 +58,18 @@ skyvern quickstart
### 3. Run task ### 3. Run task
#### UI (Recommended)
Start the Skyvern service and UI
```bash
skyvern run all
```
Go to http://localhost:8080 and use the UI to run a task
#### Code
```python ```python
from skyvern import Skyvern from skyvern import Skyvern
@@ -67,27 +77,66 @@ skyvern = Skyvern()
task = await skyvern.run_task(prompt="Find the top post on hackernews today") task = await skyvern.run_task(prompt="Find the top post on hackernews today")
print(task) print(task)
``` ```
Skyvern starts running the task in a browser that pops up and closes it when the task is done. You will be able to review the task from http://localhost:8080/history Skyvern starts running the task in a browser that pops up and closes it when the task is done. You will be able to view the task from http://localhost:8080/history
You can also run a task on Skyvern Cloud: You can also run a task on different targets:
```python ```python
from skyvern import Skyvern from skyvern import Skyvern
# Run on Skyvern Cloud
skyvern = Skyvern(api_key="SKYVERN API KEY") skyvern = Skyvern(api_key="SKYVERN API KEY")
task = await skyvern.run_task(prompt="Find the top post on hackernews today")
print(task)
```
Or your local Skyvern service from step 2: # Local Skyvern service
```python
# Find your API KEY in .env
skyvern = Skyvern(base_url="http://localhost:8000", api_key="LOCAL SKYVERN API KEY") skyvern = Skyvern(base_url="http://localhost:8000", api_key="LOCAL SKYVERN API KEY")
task = await skyvern.run_task(prompt="Find the top post on hackernews today") task = await skyvern.run_task(prompt="Find the top post on hackernews today")
print(task) print(task)
``` ```
Check out more features to use for Skyvern task in our [official doc](https://docs.skyvern.com/running-tasks/run-tasks). Here are a couple of interesting examples: # How it works
#### Control your own browser (Chrome) Skyvern was inspired by the Task-Driven autonomous agent design popularized by [BabyAGI](https://github.com/yoheinakajima/babyagi) and [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT) -- with one major bonus: we give Skyvern the ability to interact with websites using browser automation libraries like [Playwright](https://playwright.dev/).
Skyvern uses a swarm of agents to comprehend a website, and plan and execute its actions:
<picture>
<source media="(prefers-color-scheme: dark)" srcset="fern/images/skyvern_2_0_system_diagram.png" />
<img src="fern/images/skyvern_2_0_system_diagram.png" />
</picture>
This approach has a few advantages:
1. Skyvern can operate on websites it's never seen before, as it's able to map visual elements to actions necessary to complete a workflow, without any customized code
1. Skyvern is resistant to website layout changes, as there are no pre-determined XPaths or other selectors our system is looking for while trying to navigate
1. Skyvern is able to take a single workflow and apply it to a large number of websites, as it's able to reason through the interactions necessary to complete the workflow
1. Skyvern leverages LLMs to reason through interactions to ensure we can cover complex situations. Examples include:
1. If you wanted to get an auto insurance quote from Geico, the answer to a common question "Were you eligible to drive at 18?" could be inferred from the driver receiving their license at age 16
1. If you were doing competitor analysis, it's understanding that an Arnold Palmer 22 oz can at 7/11 is almost definitely the same product as a 23 oz can at Gopuff (even though the sizes are slightly different, which could be a rounding error!)
A detailed technical report can be found [here](https://blog.skyvern.com/skyvern-2-0-state-of-the-art-web-navigation-with-85-8-on-webvoyager-eval/).
# Demo
<!-- Redo demo -->
https://github.com/user-attachments/assets/5cab4668-e8e2-4982-8551-aab05ff73a7f
# Performance & Evaluation
Skyvern has SOTA performance on the [WebBench benchmark](https://webbench.ai) with 64.4% accuracy. The technical report + evaluation can be found [here](https://blog.skyvern.com/web-bench-a-new-way-to-compare-ai-browser-agents/)
<p align="center">
<img src="fern/images/performance/webbench_overall.png"/>
</p>
## Performance on WRITE tasks (eg filling out forms, logging in, downloading files, etc)
Skyvern is the best performing agent on WRITE tasks (e.g. filling out forms, logging in, downloading files, etc.), which are primarily used for RPA (Robotic Process Automation) adjacent work.
<p align="center">
<img src="fern/images/performance/webbench_write.png"/>
</p>
## Advanced Usage
### Control your own browser (Chrome)
> ⚠️ WARNING: Since [Chrome 136](https://developer.chrome.com/blog/remote-debugging-port), Chrome refuses any CDP connect to the browser using the default user_data_dir. In order to use your browser data, Skyvern copies your default user_data_dir to `./tmp/user_data_dir` the first time connecting to your local browser. ⚠️ > ⚠️ WARNING: Since [Chrome 136](https://developer.chrome.com/blog/remote-debugging-port), Chrome refuses any CDP connect to the browser using the default user_data_dir. In order to use your browser data, Skyvern copies your default user_data_dir to `./tmp/user_data_dir` the first time connecting to your local browser. ⚠️
1. Just With Python Code 1. Just With Python Code
@@ -115,20 +164,9 @@ CHROME_EXECUTABLE_PATH="/Applications/Google Chrome.app/Contents/MacOS/Google Ch
BROWSER_TYPE=cdp-connect BROWSER_TYPE=cdp-connect
``` ```
Restart Skyvern service `skyvern run all` and run the task through UI or code: Restart Skyvern service `skyvern run all` and run the task through UI or code
```python
from skyvern import Skyvern
skyvern = Skyvern( ### Run Skyvern with any remote browser
base_url="http://localhost:8000",
api_key="YOUR_API_KEY",
)
task = await skyvern.run_task(
prompt="Find the top post on hackernews today",
)
```
#### Run Skyvern with any remote browser
Grab the cdp connection url and pass it to Skyvern Grab the cdp connection url and pass it to Skyvern
```python ```python
@@ -140,7 +178,7 @@ task = await skyvern.run_task(
) )
``` ```
#### Get consistent output schema from your run ### Get consistent output schema from your run
You can do this by adding the `data_extraction_schema` parameter: You can do this by adding the `data_extraction_schema` parameter:
```python ```python
from skyvern import Skyvern from skyvern import Skyvern
@@ -170,37 +208,24 @@ task = await skyvern.run_task(
### Helpful commands to debug issues ### Helpful commands to debug issues
**Launch the Skyvern Server Separately**
```bash ```bash
# Launch the Skyvern Server Separately
skyvern run server skyvern run server
```
**Launch the Skyvern UI** # Launch the Skyvern UI
```bash
skyvern run ui skyvern run ui
```
**Check status of the Skyvern service** # Check status of the Skyvern service
```bash
skyvern status skyvern status
```
**Stop the Skyvern service**
```bash # Stop the Skyvern service
skyvern stop all skyvern stop all
```
**Stop the Skyvern UI**
```bash # Stop the Skyvern UI
skyvern stop ui skyvern stop ui
```
**Stop the Skyvern Server Separately** # Stop the Skyvern Server Separately
```bash
skyvern stop server skyvern stop server
``` ```
@@ -225,29 +250,6 @@ skyvern stop server
If you encounter any database related errors while using Docker to run Skyvern, check which Postgres container is running with `docker ps`. If you encounter any database related errors while using Docker to run Skyvern, check which Postgres container is running with `docker ps`.
# How it works
Skyvern was inspired by the Task-Driven autonomous agent design popularized by [BabyAGI](https://github.com/yoheinakajima/babyagi) and [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT) -- with one major bonus: we give Skyvern the ability to interact with websites using browser automation libraries like [Playwright](https://playwright.dev/).
Skyvern uses a swarm of agents to comprehend a website, and plan and execute its actions:
<picture>
<source media="(prefers-color-scheme: dark)" srcset="fern/images/skyvern_2_0_system_diagram.png" />
<img src="fern/images/skyvern_2_0_system_diagram.png" />
</picture>
This approach has a few advantages:
1. Skyvern can operate on websites it's never seen before, as it's able to map visual elements to actions necessary to complete a workflow, without any customized code
1. Skyvern is resistant to website layout changes, as there are no pre-determined XPaths or other selectors our system is looking for while trying to navigate
1. Skyvern is able to take a single workflow and apply it to a large number of websites, as it's able to reason through the interactions necessary to complete the workflow
1. Skyvern leverages LLMs to reason through interactions to ensure we can cover complex situations. Examples include:
1. If you wanted to get an auto insurance quote from Geico, the answer to a common question "Were you eligible to drive at 18?" could be inferred from the driver receiving their license at age 16
1. If you were doing competitor analysis, it's understanding that an Arnold Palmer 22 oz can at 7/11 is almost definitely the same product as a 23 oz can at Gopuff (even though the sizes are slightly different, which could be a rounding error!)
# Demo
<!-- Redo demo -->
https://github.com/user-attachments/assets/5cab4668-e8e2-4982-8551-aab05ff73a7f
# Skyvern Features # Skyvern Features
@@ -336,6 +338,8 @@ Skyvern supports Zapier, Make.com, and N8N to allow you to connect your Skyvern
* [Make.com](https://docs.skyvern.com/integrations/make.com) * [Make.com](https://docs.skyvern.com/integrations/make.com)
* [N8N](https://docs.skyvern.com/integrations/n8n) * [N8N](https://docs.skyvern.com/integrations/n8n)
🔐 Learn more about 2FA support [here](https://docs.skyvern.com/credentials/totp).
# Real-world examples of Skyvern # Real-world examples of Skyvern
We love to see how Skyvern is being used in the wild. Here are some examples of how Skyvern is being used to automate workflows in the real world. Please open PRs to add your own examples! We love to see how Skyvern is being used in the wild. Here are some examples of how Skyvern is being used to automate workflows in the real world. Please open PRs to add your own examples!
@@ -410,11 +414,9 @@ More extensive documentation can be found on our [📕 docs page](https://docs.s
| Anthropic | Claude 3 (Haiku, Sonnet, Opus), Claude 3.5 (Sonnet) | | Anthropic | Claude 3 (Haiku, Sonnet, Opus), Claude 3.5 (Sonnet) |
| Azure OpenAI | Any GPT models. Better performance with a multimodal llm (azure/gpt4-o) | | Azure OpenAI | Any GPT models. Better performance with a multimodal llm (azure/gpt4-o) |
| AWS Bedrock | Anthropic Claude 3 (Haiku, Sonnet, Opus), Claude 3.5 (Sonnet) | | AWS Bedrock | Anthropic Claude 3 (Haiku, Sonnet, Opus), Claude 3.5 (Sonnet) |
| Gemini | Gemini 2.5 Pro and flash, Gemini 2.0 |
| Ollama | Run any locally hosted model via [Ollama](https://github.com/ollama/ollama) | | Ollama | Run any locally hosted model via [Ollama](https://github.com/ollama/ollama) |
| OpenRouter | Access models through [OpenRouter](https://openrouter.ai) | | OpenRouter | Access models through [OpenRouter](https://openrouter.ai) |
| Gemini | Coming soon (contributions welcome) |
| Llama 3.2 | Coming soon (contributions welcome) |
| Novita AI | Llama 3.1 (8B, 70B), Llama 3.2 (1B, 3B, 11B Vision) |
| OpenAI-compatible | Any custom API endpoint that follows OpenAI's API format (via [liteLLM](https://docs.litellm.ai/docs/providers/openai_compatible)) | | OpenAI-compatible | Any custom API endpoint that follows OpenAI's API format (via [liteLLM](https://docs.litellm.ai/docs/providers/openai_compatible)) |
#### Environment Variables #### Environment Variables
@@ -427,7 +429,7 @@ More extensive documentation can be found on our [📕 docs page](https://docs.s
| `OPENAI_API_BASE` | OpenAI API Base, optional | String | `https://openai.api.base` | | `OPENAI_API_BASE` | OpenAI API Base, optional | String | `https://openai.api.base` |
| `OPENAI_ORGANIZATION` | OpenAI Organization ID, optional | String | `your-org-id` | | `OPENAI_ORGANIZATION` | OpenAI Organization ID, optional | String | `your-org-id` |
Supported LLM Keys: `OPENAI_GPT4_TURBO`, `OPENAI_GPT4V`, `OPENAI_GPT4O`, `OPENAI_GPT4O_MINI` Recommended `LLM_KEY`: `OPENAI_GPT4O`, `OPENAI_GPT4O_MINI`, `OPENAI_GPT4_1`, `OPENAI_O4_MINI`, `OPENAI_O3`
##### Anthropic ##### Anthropic
| Variable | Description| Type | Sample Value| | Variable | Description| Type | Sample Value|
@@ -435,7 +437,7 @@ Supported LLM Keys: `OPENAI_GPT4_TURBO`, `OPENAI_GPT4V`, `OPENAI_GPT4O`, `OPENAI
| `ENABLE_ANTHROPIC` | Register Anthropic models| Boolean | `true`, `false` | | `ENABLE_ANTHROPIC` | Register Anthropic models| Boolean | `true`, `false` |
| `ANTHROPIC_API_KEY` | Anthropic API key| String | `sk-1234567890` | | `ANTHROPIC_API_KEY` | Anthropic API key| String | `sk-1234567890` |
Supported LLM Keys: `ANTHROPIC_CLAUDE3`, `ANTHROPIC_CLAUDE3_OPUS`, `ANTHROPIC_CLAUDE3_SONNET`, `ANTHROPIC_CLAUDE3_HAIKU`, `ANTHROPIC_CLAUDE3.5_SONNET` Recommended `LLM_KEY`: `ANTHROPIC_CLAUDE3.5_SONNET`, `ANTHROPIC_CLAUDE3.7_SONNET`, `ANTHROPIC_CLAUDE4_OPUS`, `ANTHROPIC_CLAUDE4_SONNET`
##### Azure OpenAI ##### Azure OpenAI
| Variable | Description| Type | Sample Value| | Variable | Description| Type | Sample Value|
@@ -446,14 +448,14 @@ Supported LLM Keys: `ANTHROPIC_CLAUDE3`, `ANTHROPIC_CLAUDE3_OPUS`, `ANTHROPIC_CL
| `AZURE_API_BASE` | Azure deployment api base url| String | `https://skyvern-deployment.openai.azure.com/`| | `AZURE_API_BASE` | Azure deployment api base url| String | `https://skyvern-deployment.openai.azure.com/`|
| `AZURE_API_VERSION` | Azure API Version| String | `2024-02-01`| | `AZURE_API_VERSION` | Azure API Version| String | `2024-02-01`|
Supported LLM Key: `AZURE_OPENAI` Recommended `LLM_KEY`: `AZURE_OPENAI`
##### AWS Bedrock ##### AWS Bedrock
| Variable | Description| Type | Sample Value| | Variable | Description| Type | Sample Value|
| -------- | ------- | ------- | ------- | | -------- | ------- | ------- | ------- |
| `ENABLE_BEDROCK` | Register AWS Bedrock models. To use AWS Bedrock, you need to make sure your [AWS configurations](https://github.com/boto/boto3?tab=readme-ov-file#using-boto3) are set up correctly first. | Boolean | `true`, `false` | | `ENABLE_BEDROCK` | Register AWS Bedrock models. To use AWS Bedrock, you need to make sure your [AWS configurations](https://github.com/boto/boto3?tab=readme-ov-file#using-boto3) are set up correctly first. | Boolean | `true`, `false` |
Supported LLM Keys: `BEDROCK_ANTHROPIC_CLAUDE3_OPUS`, `BEDROCK_ANTHROPIC_CLAUDE3_SONNET`, `BEDROCK_ANTHROPIC_CLAUDE3_HAIKU`, `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET`, `BEDROCK_AMAZON_NOVA_PRO`, `BEDROCK_AMAZON_NOVA_LITE` Recommended `LLM_KEY`: `BEDROCK_ANTHROPIC_CLAUDE3.7_SONNET_INFERENCE_PROFILE`, `BEDROCK_ANTHROPIC_CLAUDE4_OPUS_INFERENCE_PROFILE`, `BEDROCK_ANTHROPIC_CLAUDE4_SONNET_INFERENCE_PROFILE`
##### Gemini ##### Gemini
| Variable | Description| Type | Sample Value| | Variable | Description| Type | Sample Value|
@@ -461,15 +463,7 @@ Supported LLM Keys: `BEDROCK_ANTHROPIC_CLAUDE3_OPUS`, `BEDROCK_ANTHROPIC_CLAUDE3
| `ENABLE_GEMINI` | Register Gemini models| Boolean | `true`, `false` | | `ENABLE_GEMINI` | Register Gemini models| Boolean | `true`, `false` |
| `GEMINI_API_KEY` | Gemini API Key| String | `your_google_gemini_api_key`| | `GEMINI_API_KEY` | Gemini API Key| String | `your_google_gemini_api_key`|
Supported LLM Keys: `GEMINI_PRO`, `GEMINI_FLASH` Recommended `LLM_KEY`: `GEMINI_2.5_PRO_PREVIEW`, `GEMINI_2.5_FLASH_PREVIEW`
##### Novita AI
| Variable | Description| Type | Sample Value|
| -------- | ------- | ------- | ------- |
| `ENABLE_NOVITA`| Register Novita AI models | Boolean | `true`, `false` |
| `NOVITA_API_KEY` | Novita AI API Key| String | `your_novita_api_key`|
Supported LLM Keys: `NOVITA_DEEPSEEK_R1`, `NOVITA_DEEPSEEK_V3`, `NOVITA_LLAMA_3_3_70B`, `NOVITA_LLAMA_3_2_1B`, `NOVITA_LLAMA_3_2_3B`, `NOVITA_LLAMA_3_2_11B_VISION`, `NOVITA_LLAMA_3_1_8B`, `NOVITA_LLAMA_3_1_70B`, `NOVITA_LLAMA_3_1_405B`, `NOVITA_LLAMA_3_8B`, `NOVITA_LLAMA_3_70B`
##### Ollama ##### Ollama
| Variable | Description| Type | Sample Value| | Variable | Description| Type | Sample Value|
@@ -478,7 +472,9 @@ Supported LLM Keys: `NOVITA_DEEPSEEK_R1`, `NOVITA_DEEPSEEK_V3`, `NOVITA_LLAMA_3_
| `OLLAMA_SERVER_URL` | URL for your Ollama server | String | `http://host.docker.internal:11434` | | `OLLAMA_SERVER_URL` | URL for your Ollama server | String | `http://host.docker.internal:11434` |
| `OLLAMA_MODEL` | Ollama model name to load | String | `qwen2.5:7b-instruct` | | `OLLAMA_MODEL` | Ollama model name to load | String | `qwen2.5:7b-instruct` |
Supported LLM Key: `OLLAMA` Recommended `LLM_KEY`: `OLLAMA`
Note: Ollama does not support vision yet.
##### OpenRouter ##### OpenRouter
| Variable | Description| Type | Sample Value| | Variable | Description| Type | Sample Value|
@@ -488,7 +484,7 @@ Supported LLM Key: `OLLAMA`
| `OPENROUTER_MODEL` | OpenRouter model name | String | `mistralai/mistral-small-3.1-24b-instruct` | | `OPENROUTER_MODEL` | OpenRouter model name | String | `mistralai/mistral-small-3.1-24b-instruct` |
| `OPENROUTER_API_BASE` | OpenRouter API base URL | String | `https://api.openrouter.ai/v1` | | `OPENROUTER_API_BASE` | OpenRouter API base URL | String | `https://api.openrouter.ai/v1` |
Supported LLM Key: `OPENROUTER` Recommended `LLM_KEY`: `OPENROUTER`
##### OpenAI-Compatible ##### OpenAI-Compatible
| Variable | Description| Type | Sample Value| | Variable | Description| Type | Sample Value|
@@ -515,7 +511,7 @@ Supported LLM Key: `OPENAI_COMPATIBLE`
This is our planned roadmap for the next few months. If you have any suggestions or would like to see a feature added, please don't hesitate to reach out to us [via email](mailto:founders@skyvern.com) or [discord](https://discord.gg/fG2XXEuQX3). This is our planned roadmap for the next few months. If you have any suggestions or would like to see a feature added, please don't hesitate to reach out to us [via email](mailto:founders@skyvern.com) or [discord](https://discord.gg/fG2XXEuQX3).
- [x] **Open Source** - Open Source Skyvern's core codebase - [x] **Open Source** - Open Source Skyvern's core codebase
- [x] **[BETA] Workflow support** - Allow support to chain multiple Skyvern calls together - [x] **Workflow support** - Allow support to chain multiple Skyvern calls together
- [x] **Improved context** - Improve Skyvern's ability to understand content around interactable elements by introducing feeding relevant label context through the text prompt - [x] **Improved context** - Improve Skyvern's ability to understand content around interactable elements by introducing feeding relevant label context through the text prompt
- [x] **Cost Savings** - Improve Skyvern's stability and reduce the cost of running Skyvern by optimizing the context tree passed into Skyvern - [x] **Cost Savings** - Improve Skyvern's stability and reduce the cost of running Skyvern by optimizing the context tree passed into Skyvern
- [x] **Self-serve UI** - Deprecate the Streamlit UI in favour of a React-based UI component that allows users to kick off new jobs in Skyvern - [x] **Self-serve UI** - Deprecate the Streamlit UI in favour of a React-based UI component that allows users to kick off new jobs in Skyvern
@@ -523,14 +519,14 @@ This is our planned roadmap for the next few months. If you have any suggestions
- [x] **Chrome Viewport streaming** - Introduce a way to live-stream the Chrome viewport to the user's browser (as a part of the self-serve UI) - [x] **Chrome Viewport streaming** - Introduce a way to live-stream the Chrome viewport to the user's browser (as a part of the self-serve UI)
- [x] **Past Runs UI** - Deprecate the Streamlit UI in favour of a React-based UI that allows you to visualize past runs and their results - [x] **Past Runs UI** - Deprecate the Streamlit UI in favour of a React-based UI that allows you to visualize past runs and their results
- [X] **Auto workflow builder ("Observer") mode** - Allow Skyvern to auto-generate workflows as it's navigating the web to make it easier to build new workflows - [X] **Auto workflow builder ("Observer") mode** - Allow Skyvern to auto-generate workflows as it's navigating the web to make it easier to build new workflows
- [ ] **Prompt Caching** - Introduce a caching layer to the LLM calls to dramatically reduce the cost of running Skyvern (memorize past actions and repeat them!) - [x] **Prompt Caching** - Introduce a caching layer to the LLM calls to dramatically reduce the cost of running Skyvern (memorize past actions and repeat them!)
- [ ] **Web Evaluation Dataset** - Integrate Skyvern with public benchmark tests to track the quality of our models over time - [x] **Web Evaluation Dataset** - Integrate Skyvern with public benchmark tests to track the quality of our models over time
- [ ] **Improved Debug mode** - Allow Skyvern to plan its actions and get "approval" before running them, allowing you to debug what it's doing and more easily iterate on the prompt - [ ] **Improved Debug mode** - Allow Skyvern to plan its actions and get "approval" before running them, allowing you to debug what it's doing and more easily iterate on the prompt
- [ ] **Chrome Extension** - Allow users to interact with Skyvern through a Chrome extension (incl voice mode, saving tasks, etc.) - [ ] **Chrome Extension** - Allow users to interact with Skyvern through a Chrome extension (incl voice mode, saving tasks, etc.)
- [ ] **Skyvern Action Recorder** - Allow Skyvern to watch a user complete a task and then automatically generate a workflow for it - [ ] **Skyvern Action Recorder** - Allow Skyvern to watch a user complete a task and then automatically generate a workflow for it
- [ ] **Interactable Livestream** - Allow users to interact with the livestream in real-time to intervene when necessary (such as manually submitting sensitive forms) - [ ] **Interactable Livestream** - Allow users to interact with the livestream in real-time to intervene when necessary (such as manually submitting sensitive forms)
- [ ] **Integrate LLM Observability tools** - Integrate LLM Observability tools to allow back-testing prompt changes with specific data sets + visualize the performance of Skyvern over time - [ ] **Integrate LLM Observability tools** - Integrate LLM Observability tools to allow back-testing prompt changes with specific data sets + visualize the performance of Skyvern over time
- [ ] **Langchain Integration** - Create langchain integration in langchain_community to use Skyvern as a "tool". - [x] **Langchain Integration** - Create langchain integration in langchain_community to use Skyvern as a "tool".
# Contributing # Contributing
@@ -547,7 +543,7 @@ By Default, Skyvern collects basic usage statistics to help us understand how Sk
# License # License
Skyvern's open source repository is supported via a managed cloud. All of the core logic powering Skyvern is available in this open source repository licensed under the [AGPL-3.0 License](LICENSE), with the exception of anti-bot measures available in our managed cloud offering. Skyvern's open source repository is supported via a managed cloud. All of the core logic powering Skyvern is available in this open source repository licensed under the [AGPL-3.0 License](LICENSE), with the exception of anti-bot measures available in our managed cloud offering.
If you have any questions or concerns around licensing, please [contact us](mailto:founders@skyvern.com) and we would be happy to help. If you have any questions or concerns around licensing, please [contact us](mailto:support@skyvern.com) and we would be happy to help.
# Star History # Star History


@@ -4,10 +4,29 @@ subtitle: Never send your credentials to LLMs.
slug: credentials/introduction slug: credentials/introduction
--- ---
Agents need access to sensitive information to complete tasks. For example, usernames and passwords to login, credit cards for payments, etc. With Skyvern's credential management tool, you can run agents securely without exposing your credentials to LLMs. Need to give Skyvern access to your credentials? Usernames and passwords, 2FA, credit cards for payments, etc. Skyvern's credential management provides a secure way to manage and use credentials. Agents can then complete tasks without exposing those credentials to LLMs.
## Credential Support ### 2FA Support (TOTP)
<CardGroup cols={2}>
Many websites require entering a TOTP (2FA/MFA/Verification) code during login. Skyvern has TOTP (2FA/MFA/Verification Code) support natively.
**Supported authentication methods**:
- Phone verification code
- Email verification code
- Authenticator app
- Confirmation link sent to email. Click the link and create an account
- One time login link sent to email. Click and login
If you have any questions about how to set these up, please contact [Skyvern Support](mailto:support@skyvern.com).
## Credit Card Management
Skyvern can manage your credit cards and use them to complete tasks.
**Supported credit card types**:
- Visa
- Mastercard
<CardGroup cols={3}>
<Card <Card
title="Password Management" title="Password Management"
icon="key" icon="key"
@@ -22,24 +41,6 @@ Agents need access to sensitive information to complete tasks. For example, user
> >
Manage and use credit cards with Skyvern Agent Manage and use credit cards with Skyvern Agent
</Card> </Card>
</CardGroup>
## 2FA Support (TOTP)
Many websites require entering a TOTP (2FA/MFA/Verification) code during login. Skyvern has the TOTP (2FA/MFA/Verification Code) support natively.
**Supported authentication methods**:
- Phone verification code
- Email verification code
- Authenticator app
**Coming soon**:
- Confirmation link sent to email. Click the link and create an account. (Talk to Skyvern Support if you need this)
- One time login link sent to email. Click and login. (Talk to Skyvern Support if you need this)
See [2FA Support (TOTP)](/credentials/totp) for more details.
<CardGroup cols={1}>
<Card <Card
title="2FA Support (TOTP)" title="2FA Support (TOTP)"
icon="pager" icon="pager"
@@ -49,14 +50,63 @@ See [2FA Support (TOTP)](/credentials/totp) for more details.
</Card> </Card>
</CardGroup> </CardGroup>
## Bitwarden Integration
Skyvern can integrate with your Bitwarden account. Skyvern agent can read the credentials on the fly to complete tasks while keeping your credentials secure. Skyvern never stores your Bitwarden credentials or sends them to LLMs. ## Password Manager Integrations
See [Bitwarden Integration](/credentials/bitwarden) for more details. If you have your own password manager, Skyvern can integrate with it. Skyvern can read the credentials on the fly to complete tasks while keeping your credentials secure. Skyvern never stores your credentials or sends them to any third parties (including LLMs).
## Coming Soon **Supported password manager types**:
(Contact support@skyvern.com if you need any password integration to help us prioritize) - Bitwarden
- 1Password Integration (Private beta)
- 1Password Integration **Coming Soon**:
- LastPass Integration - LastPass Integration
- Keeper Integration
- Azure Key Vault Integration
Contact [Skyvern Support](mailto:support@skyvern.com) if you want access to the private beta for these integrations.
<CardGroup cols={3}>
<Card
title="Bitwarden Integration"
icon="shield-keyhole"
href="/credentials/bitwarden"
>
Securely manage your passwords with Bitwarden
</Card>
<Card
title="1Password Integration"
icon="fingerprint"
href="mailto:sales@skyvern.com"
>
Securely manage your passwords with 1Password (Private beta)
</Card>
<Card
title="LastPass Integration"
icon="vault"
href="mailto:sales@skyvern.com"
>
(coming soon) Securely manage your passwords with LastPass
</Card>
<Card
title="Keeper Integration"
icon="lock-keyhole"
href="mailto:sales@skyvern.com"
>
(coming soon) Securely manage your passwords with Keeper
</Card>
<Card
title="Azure Key Vault Integration"
icon="cloud"
href="mailto:sales@skyvern.com"
>
(coming soon) Securely manage your secrets with Azure Key Vault
</Card>
<Card
title="AWS Secret Manager Integration"
icon="key"
href="mailto:sales@skyvern.com"
>
(coming soon) Securely manage your secrets with AWS Secret Manager
</Card>
</CardGroup>


@@ -1,58 +1,32 @@
--- ---
title: 2FA Support (TOTP) title: 2FA Support (TOTP)
subtitle: How to send 2FA codes (TOTP) to Skyvern subtitle: How to send TOTP codes (2FA/MFA/Verification Code) to Skyvern
slug: credentials/totp slug: credentials/totp
--- ---
Skyvern supports one-time password (see https://www.twilio.com/docs/glossary/totp for more information), also known as 2FA/MFA. For Skyvern to get the code, there are three options: Skyvern supports logging into websites that require a 2FA/MFA/Verification code. There are 5 kinds of 2FA we support today:
- [Option 1: Store your 2FA/MFA secret in Skyvern Credential tool](#option-1-store-your-2famfa-secret-in-the-skyvern-credential-tool) - [Option 1: Google Authenticator (TOTP)](#option-1-google-authenticator-totp)
- [Option 2: Skyvern gets the code from your endpoint](#option-2-get-code-from-your-endpoint) - [Option 2: Email Verification Code](#option-2-email-verification-code)
- [Option 3: You push the code to Skyvern](#option-3-push-code-to-skyvern) - [Option 3: Phone Verification Code](#option-3-phone-verification-code)
- [Option 4: Let Skyvern get the code from your server (webhook)](#option-4-let-skyvern-get-the-code-from-your-server-webhook)
- [Option 5: One Time Login Link](#option-5-one-time-login-link)
## Option 1: Store your 2FA/MFA secret in the Skyvern Credential tool ## Option 1: Google Authenticator (TOTP)
Save your username and password in [Skyvern Credential](https://app.skyvern.com/credentials) where you can also store your 2FA/MFA key/secret. Step 1: Save your username and password in [Skyvern Credential](https://app.skyvern.com/credentials). See [Password Management](/credentials/passwords#manage-passwords-in-skyvern-cloud) for more details.
See [Password Management](/credentials/passwords#manage-passwords-in-skyvern-cloud) for more details. Step 2: Add your account by manually entering the secret key (extracted from the QR code). Not sure how to get it? [Follow this guide](https://bitwarden.com/help/integrated-authenticator/).
## Option 2: Get Code From Your Endpoint
You can pass `totp_url` when running [a task](/api-reference/api-reference/agent/run-task) or a [workflow](/api-reference/api-reference/agent/run-workflow). Inside this endpoint hosted by you, you have to conform to the following schema:
### Set Up Your TOTP Endpoint > 💡 Don't have the key? Contact [Skyvern Support](mailto:support@skyvern.com) and we can help you get it.
For websites that require a verification code to complete a task, you have to set up a TOTP endpoint for Skyvern to fetch the verification code.
Here's the TOTP endpoint contract you should use: ## Option 2: Email Verification Code
Email verification codes require you to set up a forwarding rule that forwards these emails to a Skyvern endpoint.
Request (POST): The forwarding rule can be set up using [Gmail + Zapier](https://zapier.com/app/home) or similar tools. (instructions below)
| Parameter | Type | Required? | Sample Value | Description |
| --- | --- | --- | --- | --- |
| task_id | String | yes | tsk_123 | The task ID that needs the verification to be done |
Response: > 💡 *Coming Soon*: We plan to provide email forwarding addresses that make this easier to set up
| Parameter | Type | Required? | Sample Value | Description |
| --- | --- | --- | --- | --- |
| task_id | String | yes | tsk_123 | The task ID that needs the verification to be done |
| verification_code | String | no | 123456 | The verification code |
### Validate The Sender of The Request You can pass `totp_identifier` when running [a task](/api-reference/api-reference/agent/run-task) or a [workflow](/api-reference/api-reference/agent/run-workflow). When the TOTP code arrives at your inbox, all you need to do is to send the email/message to Skyvern's [TOTP endpoint](/api-reference/api-reference/credentials/send-totp-code).
Same as the webhook API, your server needs to make sure its Skyvern thats making the request.
- a python example for how to generate and validate the signature:
```python
def validate_skyvern_request_headers(request: Request) -> bool:
header_skyvern_signature = request.headers["x-skyvern-signature"]
payload = request.body() # this is a bytes
hash_obj = hmac.new(SKYVERN_API_KEY.encode("utf-8"), msg=payload, digestmod=hashlib.sha256)
client_generated_signature = hash_obj.hexdigest()
return header_skyvern_signature == client_generated_signature
```
SKYVERN_API_KEY: you can get the API KEY from [Skyvern Settings](https://app.skyvern.com/settings).
## Option 3: Push Code To Skyvern
Find TOTP API doc [here](/api-reference/api-reference/credentials/send-totp-code).
You can pass `totp_identifier` when running [a task](/api-reference/api-reference/agent/run-task) or a [workflow](/api-reference/api-reference/agent/run-workflow). When the TOTP code arrives at your inbox, all you need to do is to send the email/message (Gmail + Zapier integration can be a good solution to set up email forwarding) to Skyvern's TOTP endpoint.
### Forwarding Your Email To Skyvern (Gmail + Zapier) ### Forwarding Your Email To Skyvern (Gmail + Zapier)
This setup requires a Zapier Pro plan account. This setup requires a Zapier Pro plan account.
@@ -133,3 +107,107 @@ In Zapier: under the “Test” of the Webhooks action, send a request to test i
<p align="center"> <p align="center">
<img src="../images/totp/test_end_to_end.png"/> <img src="../images/totp/test_end_to_end.png"/>
</p> </p>
## Option 3: Phone Verification Code
Phone verification codes are supported the same way as email verification codes. You will need to set up a forwarding rule that forwards these messages to a Skyvern endpoint.
One way to set up this forwarding rule is to use a virtual phone number service such as [Twilio](https://www.twilio.com/en-us/docs/usage/tutorials/how-to-use-your-free-trial-account) or [Plivo](https://www.pilvo.com/en/us/virtual-phone-number).
Make sure you pass `totp_identifier` when running [a task](/api-reference/api-reference/agent/run-task) or a [workflow](/api-reference/api-reference/agent/run-workflow). When the TOTP code arrives at your virtual phone number, all you need to do is send the message to Skyvern's TOTP endpoint.
You can use the following code to forward the message to Skyvern:
```javascript
// Twilio Function to post 2FA data to Skyvern API
exports.handler = async function(context, event, callback) {
  const axios = require('axios');
  const apiUrl = 'https://api.skyvern.com/v1/credentials/totp';
  const apiKey = '{{your api key}}';
  const totpIdentifier = '{{your totp identifier (could be phone number)}}';
  const requestBody = {
    totp_identifier: totpIdentifier,
    content: event.Body || "Default 2FA message",
    source: "phone"
  };
  const response = new Twilio.Response();
  response.appendHeader('Content-Type', 'application/json');
  try {
    const apiResponse = await axios.post(apiUrl, requestBody, {
      headers: {
        'Content-Type': 'application/json',
        'x-api-key': apiKey
      }
    });
    response.setStatusCode(200);
    response.setBody({
      status: 'success',
      message: '2FA message sent',
      data: apiResponse.data
    });
  } catch (error) {
    response.setStatusCode(500);
    response.setBody({
      status: 'error',
      message: error.message,
      details: error.response?.data || null
    });
  }
  return callback(null, response);
};
```
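If your forwarding service runs Python rather than a Twilio Function, the same call can be sketched as below. This is a minimal sketch, not an official client: it reuses the URL, the `x-api-key` header, and the `totp_identifier` / `content` / `source` fields from the Twilio example, while the function names `build_totp_request` and `forward_message` are hypothetical.

```python
import json
import urllib.request

# Same endpoint as in the Twilio Function example.
SKYVERN_TOTP_URL = "https://api.skyvern.com/v1/credentials/totp"


def build_totp_request(
    api_key: str, totp_identifier: str, content: str, source: str = "phone"
) -> urllib.request.Request:
    """Build (but do not send) the POST request that forwards a message to Skyvern."""
    body = {
        "totp_identifier": totp_identifier,
        "content": content,  # the raw SMS text containing the code
        "source": source,
    }
    return urllib.request.Request(
        SKYVERN_TOTP_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def forward_message(api_key: str, totp_identifier: str, content: str) -> int:
    """Send the message to Skyvern and return the HTTP status code."""
    request = build_totp_request(api_key, totp_identifier, content)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Splitting request construction from sending keeps the payload easy to inspect before anything goes over the network.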
## Option 4: Let Skyvern get the code from your server (webhook)
You can pass `totp_url` when running [a task](/api-reference/api-reference/agent/run-task) or a [workflow](/api-reference/api-reference/agent/run-workflow). The endpoint, which you host, must conform to the following schema:
### Set Up Your TOTP Endpoint
For websites that require a verification code to complete a task, you have to set up a TOTP endpoint for Skyvern to fetch the verification code.
Here's the TOTP endpoint contract you should use:
Request (POST):
| Parameter | Type | Required? | Sample Value | Description |
| --- | --- | --- | --- | --- |
| task_id | String | yes | tsk_123 | The task ID that needs the verification to be done |
Response:
| Parameter | Type | Required? | Sample Value | Description |
| --- | --- | --- | --- | --- |
| task_id | String | yes | tsk_123 | The task ID that needs the verification to be done |
| verification_code | String | no | 123456 | The verification code |
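The contract above can be sketched as a small framework-independent function. This is an illustration under assumptions: `build_totp_response` and the in-memory `pending_codes` store are hypothetical, and omitting `verification_code` when no code is available is one reading of the "Required? no" column, not confirmed behavior.

```python
from typing import Callable, Optional


def build_totp_response(
    request_body: dict, lookup_code: Callable[[str], Optional[str]]
) -> dict:
    """Turn the POST body sent by Skyvern into the response body of the contract."""
    task_id = request_body["task_id"]  # required in the request
    response = {"task_id": task_id}  # required in the response
    code = lookup_code(task_id)
    if code is not None:
        # verification_code is optional, so it is only included once a code exists
        response["verification_code"] = code
    return response


# Hypothetical store of codes your own system has received, keyed by task ID.
pending_codes = {"tsk_123": "123456"}
print(build_totp_response({"task_id": "tsk_123"}, pending_codes.get))
# {'task_id': 'tsk_123', 'verification_code': '123456'}
```

In a real server this function would sit behind the POST route you pass as `totp_url`, with the lookup backed by wherever your codes actually arrive.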
### Validate The Sender of The Request
As with the webhook API, your server needs to verify that it's Skyvern that's making the request.
- A Python example of how to generate and validate the signature:
```python
import hashlib
import hmac

def validate_skyvern_request_headers(request: Request) -> bool:
    header_skyvern_signature = request.headers["x-skyvern-signature"]
    payload = request.body()  # raw body bytes; in async frameworks such as FastAPI use `await request.body()`
    hash_obj = hmac.new(SKYVERN_API_KEY.encode("utf-8"), msg=payload, digestmod=hashlib.sha256)
    client_generated_signature = hash_obj.hexdigest()
    # Constant-time comparison avoids leaking signature bytes through timing
    return hmac.compare_digest(header_skyvern_signature, client_generated_signature)
```
`SKYVERN_API_KEY`: you can get the API key from [Skyvern Settings](https://app.skyvern.com/settings).
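To exercise the validation locally without a live request, you can produce a signature yourself and check that it round-trips. The sketch below assumes, as the function above implies, that the signature is the hex HMAC-SHA256 of the raw request body keyed by your API key; `sign_payload`, `is_valid_signature`, and the `test-api-key` value are hypothetical.

```python
import hashlib
import hmac


def sign_payload(api_key: str, payload: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body, keyed by the API key."""
    return hmac.new(api_key.encode("utf-8"), msg=payload, digestmod=hashlib.sha256).hexdigest()


def is_valid_signature(api_key: str, payload: bytes, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    return hmac.compare_digest(sign_payload(api_key, payload), signature)


payload = b'{"task_id": "tsk_123"}'
signature = sign_payload("test-api-key", payload)
print(is_valid_signature("test-api-key", payload, signature))         # True
print(is_valid_signature("test-api-key", payload + b" ", signature))  # False
```

Note that the signature covers the exact bytes received, so validate before parsing or re-serializing the JSON body.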
## Option 5: One Time Login Link
One time login links are supported by breaking your workflow / task into two parts:
1. Log in to trigger the one time login link
2. Trigger the rest of your task / workflow based on the one time login link as the starting point
You will need to set up an automation (e.g. Zapier) that monitors the email inbox for incoming magic links. Once you get the link, you can use it to trigger the rest of your task / workflow.
When triggering the rest of your task / workflow, you can pass the one time login link as the starting point (e.g. url parameter), and Skyvern will start the new session already logged in.
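The hand-off between the two parts can be sketched as follows. This is only an illustration under assumptions: the helper names, the `expected_host` check, and the `prompt` field in the follow-up payload are hypothetical, and only the idea of passing the link as the `url` starting point comes from the text above.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Matches https URLs up to the next whitespace, quote, or angle bracket.
URL_PATTERN = re.compile(r"https://[^\s<>\"']+")


def extract_login_link(email_body: str, expected_host: str) -> Optional[str]:
    """Return the first link in the email that points at the expected host."""
    for match in URL_PATTERN.finditer(email_body):
        link = match.group(0).rstrip(".,)")  # drop trailing sentence punctuation
        if urlparse(link).hostname == expected_host:
            return link
    return None


def build_follow_up_task(login_link: str, prompt: str) -> dict:
    """Hypothetical payload for the second task, starting from the login link."""
    return {"url": login_link, "prompt": prompt}


email_body = "Click to sign in: https://example.com/login?token=abc123. It expires in 10 minutes."
link = extract_login_link(email_body, "example.com")
print(link)  # https://example.com/login?token=abc123
```

Checking the host before using the link guards against picking up tracking pixels or unrelated URLs in the same email.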

Binary file not shown.


Binary file not shown.


View File

@@ -1,5 +1,6 @@
import os import os
import subprocess import subprocess
from typing import Optional
from urllib.parse import urlparse from urllib.parse import urlparse
import requests # type: ignore import requests # type: ignore
@@ -26,7 +27,7 @@ def get_default_chrome_location(host_system: str) -> str:
return "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe" return "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe"
def setup_browser_config() -> tuple[str, str | None, str | None]: def setup_browser_config() -> tuple[str, Optional[str], Optional[str]]:
"""Configure browser settings for Skyvern.""" """Configure browser settings for Skyvern."""
console.print(Panel("\n[bold blue]Configuring web browser for scraping...[/bold blue]", border_style="cyan")) console.print(Panel("\n[bold blue]Configuring web browser for scraping...[/bold blue]", border_style="cyan"))
browser_types = ["chromium-headless", "chromium-headful", "cdp-connect"] browser_types = ["chromium-headless", "chromium-headful", "cdp-connect"]

View File

@@ -4,6 +4,7 @@ from __future__ import annotations
import json import json
import os import os
from typing import Optional
import typer import typer
from dotenv import load_dotenv from dotenv import load_dotenv
@@ -20,7 +21,7 @@ tasks_app = typer.Typer(help="Manage Skyvern tasks and operations.")
@tasks_app.callback() @tasks_app.callback()
def tasks_callback( def tasks_callback(
ctx: typer.Context, ctx: typer.Context,
api_key: str | None = typer.Option( api_key: Optional[str] = typer.Option(
None, None,
"--api-key", "--api-key",
help="Skyvern API key", help="Skyvern API key",
@@ -31,7 +32,7 @@ def tasks_callback(
ctx.obj = {"api_key": api_key} ctx.obj = {"api_key": api_key}
def _get_client(api_key: str | None = None) -> Skyvern: def _get_client(api_key: Optional[str] = None) -> Skyvern:
"""Instantiate a Skyvern SDK client using environment variables.""" """Instantiate a Skyvern SDK client using environment variables."""
load_dotenv() load_dotenv()
load_dotenv(".env") load_dotenv(".env")

View File

@@ -4,6 +4,7 @@ from __future__ import annotations
import json import json
import os import os
from typing import Optional
import typer import typer
from dotenv import load_dotenv from dotenv import load_dotenv
@@ -21,7 +22,7 @@ workflow_app = typer.Typer(help="Manage Skyvern workflows.")
@workflow_app.callback() @workflow_app.callback()
def workflow_callback( def workflow_callback(
ctx: typer.Context, ctx: typer.Context,
api_key: str | None = typer.Option( api_key: Optional[str] = typer.Option(
None, None,
"--api-key", "--api-key",
help="Skyvern API key", help="Skyvern API key",
@@ -32,7 +33,7 @@ def workflow_callback(
ctx.obj = {"api_key": api_key} ctx.obj = {"api_key": api_key}
def _get_client(api_key: str | None = None) -> Skyvern: def _get_client(api_key: Optional[str] = None) -> Skyvern:
"""Instantiate a Skyvern SDK client using environment variables.""" """Instantiate a Skyvern SDK client using environment variables."""
load_dotenv() load_dotenv()
load_dotenv(".env") load_dotenv(".env")
@@ -45,8 +46,8 @@ def run_workflow(
ctx: typer.Context, ctx: typer.Context,
workflow_id: str = typer.Argument(..., help="Workflow permanent ID"), workflow_id: str = typer.Argument(..., help="Workflow permanent ID"),
parameters: str = typer.Option("{}", "--parameters", "-p", help="JSON parameters for the workflow"), parameters: str = typer.Option("{}", "--parameters", "-p", help="JSON parameters for the workflow"),
title: str | None = typer.Option(None, "--title", help="Title for the workflow run"), title: Optional[str] = typer.Option(None, "--title", help="Title for the workflow run"),
max_steps: int | None = typer.Option(None, "--max-steps", help="Override the workflow max steps"), max_steps: Optional[int] = typer.Option(None, "--max-steps", help="Override the workflow max steps"),
) -> None: ) -> None:
"""Run a workflow.""" """Run a workflow."""
try: try: