Co-authored-by: Kunal Mishra <kunalm2345@gmail.com> Co-authored-by: Suchintan <suchintan@users.noreply.github.com>
---
title: File Operations
subtitle: Download, parse, and upload files in workflows
slug: multi-step-automations/file-operations
---

Workflows often need to handle files: downloading invoices, parsing spreadsheets, uploading documents to storage. This page covers the four core file operations: downloading, parsing, saving, and passing files between blocks.

<Info>
This page uses workflow template syntax (`{{ parameter }}`, `{{ label_output }}`) and assumes familiarity with blocks. See [Build a Workflow](/multi-step-automations/build-a-workflow) for an introduction.
</Info>

---

## Downloading files

When your workflow needs to retrieve documents from a website (invoices from a vendor portal, confirmation letters after form submission, compliance certificates), use `file_download`. Downloaded files land in `SKYVERN_DOWNLOAD_DIRECTORY`, a temporary directory created for each workflow run. Every file downloaded during a run accumulates there, and you can reference this directory in later blocks to upload or email the files.

### Downloading a single file

Say you've just filed an SS-4 form on the IRS website and need to grab the EIN confirmation letter. The `file_download` block navigates the page and triggers the download:

<CodeGroup>
```python Python
import os
import asyncio
from skyvern import Skyvern

async def main():
    client = Skyvern(api_key=os.getenv("SKYVERN_API_KEY"))

    # Create the workflow
    workflow = await client.create_workflow(
        json_definition={
            "title": "Download EIN Letter",
            "workflow_definition": {
                "parameters": [],
                "blocks": [
                    {
                        "block_type": "file_download",
                        "label": "download_ein_letter",
                        "url": "https://sa.www4.irs.gov/modiein/individual/index.jsp",
                        "navigation_goal": (
                            "Find and download the EIN confirmation letter PDF. "
                            "Look for a Download or Print button. "
                            "COMPLETE when the download starts."
                        ),
                        "download_suffix": "ein_confirmation.pdf"
                    }
                ]
            }
        }
    )
    print(f"Workflow ID: {workflow.workflow_permanent_id}")

    # Run the workflow
    run = await client.run_workflow(workflow_id=workflow.workflow_permanent_id)
    print(f"Run ID: {run.run_id}")

    # Poll until complete
    while True:
        result = await client.get_run(run.run_id)
        if result.status in ["completed", "failed", "terminated", "timed_out", "canceled"]:
            break
        print(f"Status: {result.status}")
        await asyncio.sleep(5)

    print(f"Final status: {result.status}")
    print(f"Output: {result.output}")

asyncio.run(main())
```

```typescript TypeScript
import { SkyvernClient } from "@skyvern/client";

async function main() {
  const client = new SkyvernClient({
    apiKey: process.env.SKYVERN_API_KEY,
  });

  // Create the workflow
  const workflow = await client.createWorkflow({
    body: {
      json_definition: {
        title: "Download EIN Letter",
        workflow_definition: {
          parameters: [],
          blocks: [
            {
              block_type: "file_download",
              label: "download_ein_letter",
              url: "https://sa.www4.irs.gov/modiein/individual/index.jsp",
              navigation_goal: [
                "Find and download the EIN confirmation letter PDF.",
                "Look for a Download or Print button.",
                "COMPLETE when the download starts.",
              ].join(" "),
              download_suffix: "ein_confirmation.pdf",
            },
          ],
        },
      },
    },
  });
  console.log(`Workflow ID: ${workflow.workflow_permanent_id}`);

  // Run the workflow
  const run = await client.runWorkflow({
    body: { workflow_id: workflow.workflow_permanent_id },
  });
  console.log(`Run ID: ${run.run_id}`);

  // Poll until complete
  while (true) {
    const result = await client.getRun(run.run_id);
    if (
      ["completed", "failed", "terminated", "timed_out", "canceled"].includes(
        result.status
      )
    ) {
      console.log(`Final status: ${result.status}`);
      console.log(`Output: ${JSON.stringify(result.output, null, 2)}`);
      break;
    }
    console.log(`Status: ${result.status}`);
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
}

main();
```

```bash cURL
# Create the workflow
WORKFLOW=$(curl -s -X POST "https://api.skyvern.com/v1/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "json_definition": {
      "title": "Download EIN Letter",
      "workflow_definition": {
        "parameters": [],
        "blocks": [
          {
            "block_type": "file_download",
            "label": "download_ein_letter",
            "url": "https://sa.www4.irs.gov/modiein/individual/index.jsp",
            "navigation_goal": "Find and download the EIN confirmation letter PDF. Look for a Download or Print button. COMPLETE when the download starts.",
            "download_suffix": "ein_confirmation.pdf"
          }
        ]
      }
    }
  }')

WORKFLOW_ID=$(echo "$WORKFLOW" | jq -r '.workflow_permanent_id')
echo "Workflow ID: $WORKFLOW_ID"

# Run the workflow
RUN=$(curl -s -X POST "https://api.skyvern.com/v1/run/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"workflow_id\": \"$WORKFLOW_ID\"}")

RUN_ID=$(echo "$RUN" | jq -r '.run_id')
echo "Run ID: $RUN_ID"

# Poll until complete
while true; do
  RESPONSE=$(curl -s -X GET "https://api.skyvern.com/v1/runs/$RUN_ID" \
    -H "x-api-key: $SKYVERN_API_KEY")
  STATUS=$(echo "$RESPONSE" | jq -r '.status')
  echo "Status: $STATUS"
  if [[ "$STATUS" == "completed" || "$STATUS" == "failed" || "$STATUS" == "terminated" || "$STATUS" == "timed_out" || "$STATUS" == "canceled" ]]; then
    echo "$RESPONSE" | jq '.output'
    break
  fi
  sleep 5
done
```
</CodeGroup>

Three fields drive this block:

- **`navigation_goal`** tells Skyvern what to do on the page: find the download button, click it, and mark the task complete when the download starts. Write this the way you'd explain the task to a colleague looking at the screen.
- **`download_suffix`** names the downloaded file. Without it, you'd get a generic filename. Use a descriptive suffix so you can identify the file later when you upload or email it.
- **`url`** sets the starting page. If your previous block already navigated to the right page, omit `url` and the download block continues from there.

| Parameter | Type | Use this to |
|-----------|------|-------------|
| `navigation_goal` | string | Describe how to find and click the download button |
| `url` | string | Set the starting page (omit to continue from previous block) |
| `download_suffix` | string | Set the filename for the downloaded file |
| `download_timeout` | number | Allow more time for large files (in seconds) |
| `parameter_keys` | array | List workflow parameters this block can access |
| `max_retries` | integer | Retry the block on failure |
| `max_steps_per_run` | integer | Limit how many actions the block can take |
| `engine` | string | Set the AI engine (`skyvern-1.0` or `skyvern-2.0`) |

For the full parameter list, see [File Download](/multi-step-automations/workflow-blocks-reference#file-download) in the blocks reference.

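The optional fields in the table can be easy to mix up when blocks are written by hand. As a minimal sketch, the helper below builds a `file_download` block dict and leaves out any optional field that was not supplied, so an omitted `url` is genuinely absent. The function name `make_file_download_block`, the `download_receipt` label, and the 120-second timeout are hypothetical and chosen only for illustration. The field names come from the table above.

```python
from typing import Any, Optional


def make_file_download_block(
    label: str,
    navigation_goal: str,
    url: Optional[str] = None,
    download_suffix: Optional[str] = None,
    download_timeout: Optional[float] = None,
) -> dict[str, Any]:
    """Build a file_download block dict, leaving out any field that was not set."""
    block: dict[str, Any] = {
        "block_type": "file_download",
        "label": label,
        "navigation_goal": navigation_goal,
    }
    optional_fields = {
        "url": url,
        "download_suffix": download_suffix,
        "download_timeout": download_timeout,
    }
    # Only include optional fields that have a value, so an omitted `url`
    # is absent from the definition rather than sent as null.
    block.update({k: v for k, v in optional_fields.items() if v is not None})
    return block


# First block: starts from an explicit URL.
first = make_file_download_block(
    label="download_ein_letter",
    navigation_goal="Find and download the EIN confirmation letter PDF.",
    url="https://sa.www4.irs.gov/modiein/individual/index.jsp",
    download_suffix="ein_confirmation.pdf",
)

# Hypothetical follow-up block: no `url`, so it would continue from the
# page the previous block left open; the timeout value is illustrative.
follow_up = make_file_download_block(
    label="download_receipt",
    navigation_goal="Download the receipt PDF from the current page.",
    download_timeout=120,
)

print(first)
print(follow_up)
```

The resulting dicts can be placed directly in the `blocks` list of a `json_definition` like the ones shown earlier.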
### Downloading multiple files

When you need to download a batch of files (say, all monthly invoices from a vendor portal), pair a `for_loop` with `file_download`. A prior block, such as an extraction, returns a list of URLs, and the loop downloads each one.

Here, `get_invoices` is an extraction block that scrapes invoice URLs from the portal. The loop iterates over its output and downloads each one:

<CodeGroup>
```python Python
import os
import asyncio
from skyvern import Skyvern

async def main():
    client = Skyvern(api_key=os.getenv("SKYVERN_API_KEY"))

    # Create the workflow
    workflow = await client.create_workflow(
        json_definition={
            "title": "Download All Invoices",
            "workflow_definition": {
                "parameters": [
                    {
                        "key": "portal_url",
                        "parameter_type": "workflow",
                        "workflow_parameter_type": "string"
                    }
                ],
                "blocks": [
                    {
                        "block_type": "extraction",
                        "label": "get_invoices",
                        "url": "{{ portal_url }}",
                        "data_extraction_goal": "Extract all invoice download URLs from this page"
                    },
                    {
                        "block_type": "for_loop",
                        "label": "download_all_invoices",
                        "loop_over_parameter_key": "get_invoices_output",
                        "continue_on_failure": True,
                        "loop_blocks": [
                            {
                                "block_type": "file_download",
                                "label": "download_invoice",
                                "url": "{{ download_invoice.current_value }}",
                                "navigation_goal": "Download the invoice PDF from this page"
                            }
                        ]
                    }
                ]
            }
        }
    )
    print(f"Workflow ID: {workflow.workflow_permanent_id}")

    # Run the workflow with parameters
    run = await client.run_workflow(
        workflow_id=workflow.workflow_permanent_id,
        parameters={"portal_url": "https://vendor.example.com/invoices"}
    )
    print(f"Run ID: {run.run_id}")

    # Poll until complete
    while True:
        result = await client.get_run(run.run_id)
        if result.status in ["completed", "failed", "terminated", "timed_out", "canceled"]:
            break
        print(f"Status: {result.status}")
        await asyncio.sleep(5)

    print(f"Final status: {result.status}")
    print(f"Output: {result.output}")

asyncio.run(main())
```

```typescript TypeScript
import { SkyvernClient } from "@skyvern/client";

async function main() {
  const client = new SkyvernClient({
    apiKey: process.env.SKYVERN_API_KEY,
  });

  // Create the workflow
  const workflow = await client.createWorkflow({
    body: {
      json_definition: {
        title: "Download All Invoices",
        workflow_definition: {
          parameters: [
            {
              key: "portal_url",
              parameter_type: "workflow",
              workflow_parameter_type: "string",
            },
          ],
          blocks: [
            {
              block_type: "extraction",
              label: "get_invoices",
              url: "{{ portal_url }}",
              data_extraction_goal:
                "Extract all invoice download URLs from this page",
            },
            {
              block_type: "for_loop",
              label: "download_all_invoices",
              loop_over_parameter_key: "get_invoices_output",
              continue_on_failure: true,
              loop_blocks: [
                {
                  block_type: "file_download",
                  label: "download_invoice",
                  url: "{{ download_invoice.current_value }}",
                  navigation_goal: "Download the invoice PDF from this page",
                },
              ],
            },
          ],
        },
      },
    },
  });
  console.log(`Workflow ID: ${workflow.workflow_permanent_id}`);

  // Run the workflow with parameters
  const run = await client.runWorkflow({
    body: {
      workflow_id: workflow.workflow_permanent_id,
      parameters: { portal_url: "https://vendor.example.com/invoices" },
    },
  });
  console.log(`Run ID: ${run.run_id}`);

  // Poll until complete
  while (true) {
    const result = await client.getRun(run.run_id);
    if (
      ["completed", "failed", "terminated", "timed_out", "canceled"].includes(
        result.status
      )
    ) {
      console.log(`Final status: ${result.status}`);
      console.log(`Output: ${JSON.stringify(result.output, null, 2)}`);
      break;
    }
    console.log(`Status: ${result.status}`);
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
}

main();
```

```bash cURL
# Create the workflow
WORKFLOW=$(curl -s -X POST "https://api.skyvern.com/v1/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "json_definition": {
      "title": "Download All Invoices",
      "workflow_definition": {
        "parameters": [
          {
            "key": "portal_url",
            "parameter_type": "workflow",
            "workflow_parameter_type": "string"
          }
        ],
        "blocks": [
          {
            "block_type": "extraction",
            "label": "get_invoices",
            "url": "{{ portal_url }}",
            "data_extraction_goal": "Extract all invoice download URLs from this page"
          },
          {
            "block_type": "for_loop",
            "label": "download_all_invoices",
            "loop_over_parameter_key": "get_invoices_output",
            "continue_on_failure": true,
            "loop_blocks": [
              {
                "block_type": "file_download",
                "label": "download_invoice",
                "url": "{{ download_invoice.current_value }}",
                "navigation_goal": "Download the invoice PDF from this page"
              }
            ]
          }
        ]
      }
    }
  }')

WORKFLOW_ID=$(echo "$WORKFLOW" | jq -r '.workflow_permanent_id')
echo "Workflow ID: $WORKFLOW_ID"

# Run the workflow with parameters
RUN=$(curl -s -X POST "https://api.skyvern.com/v1/run/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"workflow_id\": \"$WORKFLOW_ID\",
    \"parameters\": {
      \"portal_url\": \"https://vendor.example.com/invoices\"
    }
  }")

RUN_ID=$(echo "$RUN" | jq -r '.run_id')
echo "Run ID: $RUN_ID"

# Poll until complete
while true; do
  RESPONSE=$(curl -s -X GET "https://api.skyvern.com/v1/runs/$RUN_ID" \
    -H "x-api-key: $SKYVERN_API_KEY")
  STATUS=$(echo "$RESPONSE" | jq -r '.status')
  echo "Status: $STATUS"
  if [[ "$STATUS" == "completed" || "$STATUS" == "failed" || "$STATUS" == "terminated" || "$STATUS" == "timed_out" || "$STATUS" == "canceled" ]]; then
    echo "$RESPONSE" | jq '.output'
    break
  fi
  sleep 5
done
```
</CodeGroup>

- **`loop_over_parameter_key`** points to the output of the prior extraction block (`get_invoices_output`).
- **`{{ download_invoice.current_value }}`** gives you one URL per iteration: the inner block's label (`download_invoice`) followed by `.current_value`.
- **`continue_on_failure: true`** on the loop means a single failed download won't stop the rest. All successful downloads still land in `SKYVERN_DOWNLOAD_DIRECTORY`.

<Note>
A failed download fails its block. Without `continue_on_failure: true` on the loop (as shown above), that one failure stops the entire workflow.
</Note>

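To make the loop semantics concrete, here is a purely local simulation, assuming the loop substitutes one list item per iteration into `{{ download_invoice.current_value }}` and that `continue_on_failure` decides whether an error aborts the remaining iterations. It does not call Skyvern and is not Skyvern's templating or loop engine. The names `render`, `run_loop`, and `fake_download`, along with the sample URLs, are hypothetical.

```python
import re
from typing import Callable


def render(template: str, context: dict[str, str]) -> str:
    """Replace each `{{ name }}` placeholder with its value from `context`."""
    return re.sub(
        r"\{\{\s*([\w.]+)\s*\}\}",
        lambda match: context[match.group(1)],
        template,
    )


def run_loop(
    urls: list[str],
    url_template: str,
    download: Callable[[str], None],
    continue_on_failure: bool,
) -> list[str]:
    """Call `download` once per URL and return the URLs that failed."""
    failed = []
    for current in urls:
        url = render(url_template, {"download_invoice.current_value": current})
        try:
            download(url)
        except Exception:
            if not continue_on_failure:
                raise  # one failure ends the whole loop
            failed.append(url)  # record it and move on to the next item
    return failed


def fake_download(url: str) -> None:
    # Stand-in for a real download: one URL is made to fail on purpose.
    if url.endswith("feb.pdf"):
        raise RuntimeError(f"download failed: {url}")


invoice_urls = [
    "https://vendor.example.com/invoices/jan.pdf",
    "https://vendor.example.com/invoices/feb.pdf",
    "https://vendor.example.com/invoices/mar.pdf",
]
failed = run_loop(
    invoice_urls,
    "{{ download_invoice.current_value }}",
    fake_download,
    continue_on_failure=True,
)
print(failed)
```

With `continue_on_failure=True` the March invoice is still attempted after February fails. Passing `False` would raise on February and never reach March.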
---

## Parsing files

When your workflow needs to act on information inside a file (filling forms with resume data, processing orders from a spreadsheet, extracting fields from a PDF), use `file_url_parser`. It downloads the file, parses it, and returns structured data you can use in subsequent blocks.

### Parsing a PDF with a schema

Say you're building a workflow that applies to jobs on Lever. You have a candidate's resume as a PDF URL and need to extract their name, email, and work history so a later navigation block can fill the application form.

Define the fields you want in a `json_schema`, and Skyvern uses an LLM to extract them:

<CodeGroup>
```python Python
import os
import asyncio
from skyvern import Skyvern

async def main():
    client = Skyvern(api_key=os.getenv("SKYVERN_API_KEY"))

    # Create the workflow
    workflow = await client.create_workflow(
        json_definition={
            "title": "Parse Resume",
            "workflow_definition": {
                "parameters": [
                    {
                        "key": "resume",
                        "parameter_type": "workflow",
                        "workflow_parameter_type": "file_url",
                        "description": "URL to the resume PDF"
                    }
                ],
                "blocks": [
                    {
                        "block_type": "file_url_parser",
                        "label": "parse_resume",
                        "file_url": "{{ resume }}",
                        "file_type": "pdf",
                        "json_schema": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "email": {"type": "string"},
                                "work_experience": {
                                    "type": "array",
                                    "items": {
                                        "type": "object",
                                        "properties": {
                                            "company": {"type": "string"},
                                            "role": {"type": "string"}
                                        }
                                    }
                                }
                            }
                        }
                    }
                ]
            }
        }
    )
    print(f"Workflow ID: {workflow.workflow_permanent_id}")

    # Run the workflow with a resume URL
    run = await client.run_workflow(
        workflow_id=workflow.workflow_permanent_id,
        parameters={
            "resume": "https://writing.colostate.edu/guides/documents/resume/functionalsample.pdf"
        }
    )
    print(f"Run ID: {run.run_id}")

    # Poll until complete
    while True:
        result = await client.get_run(run.run_id)
        if result.status in ["completed", "failed", "terminated", "timed_out", "canceled"]:
            break
        print(f"Status: {result.status}")
        await asyncio.sleep(5)

    print(f"Final status: {result.status}")
    print(f"Output: {result.output}")

asyncio.run(main())
```

```typescript TypeScript
import { SkyvernClient } from "@skyvern/client";

async function main() {
  const client = new SkyvernClient({
    apiKey: process.env.SKYVERN_API_KEY,
  });

  // Create the workflow
  const workflow = await client.createWorkflow({
    body: {
      json_definition: {
        title: "Parse Resume",
        workflow_definition: {
          parameters: [
            {
              key: "resume",
              parameter_type: "workflow",
              workflow_parameter_type: "file_url",
              description: "URL to the resume PDF",
            },
          ],
          blocks: [
            {
              block_type: "file_url_parser",
              label: "parse_resume",
              file_url: "{{ resume }}",
              file_type: "pdf",
              json_schema: {
                type: "object",
                properties: {
                  name: { type: "string" },
                  email: { type: "string" },
                  work_experience: {
                    type: "array",
                    items: {
                      type: "object",
                      properties: {
                        company: { type: "string" },
                        role: { type: "string" },
                      },
                    },
                  },
                },
              },
            },
          ],
        },
      },
    },
  });
  console.log(`Workflow ID: ${workflow.workflow_permanent_id}`);

  // Run the workflow with a resume URL
  const run = await client.runWorkflow({
    body: {
      workflow_id: workflow.workflow_permanent_id,
      parameters: {
        resume:
          "https://writing.colostate.edu/guides/documents/resume/functionalsample.pdf",
      },
    },
  });
  console.log(`Run ID: ${run.run_id}`);

  // Poll until complete
  while (true) {
    const result = await client.getRun(run.run_id);
    if (
      ["completed", "failed", "terminated", "timed_out", "canceled"].includes(
        result.status
      )
    ) {
      console.log(`Final status: ${result.status}`);
      console.log(`Output: ${JSON.stringify(result.output, null, 2)}`);
      break;
    }
    console.log(`Status: ${result.status}`);
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
}

main();
```

```bash cURL
# Create the workflow
WORKFLOW=$(curl -s -X POST "https://api.skyvern.com/v1/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "json_definition": {
      "title": "Parse Resume",
      "workflow_definition": {
        "parameters": [
          {
            "key": "resume",
            "parameter_type": "workflow",
            "workflow_parameter_type": "file_url",
            "description": "URL to the resume PDF"
          }
        ],
        "blocks": [
          {
            "block_type": "file_url_parser",
            "label": "parse_resume",
            "file_url": "{{ resume }}",
            "file_type": "pdf",
            "json_schema": {
              "type": "object",
              "properties": {
                "name": {"type": "string"},
                "email": {"type": "string"},
                "work_experience": {
                  "type": "array",
                  "items": {
                    "type": "object",
                    "properties": {
                      "company": {"type": "string"},
                      "role": {"type": "string"}
                    }
                  }
                }
              }
            }
          }
        ]
      }
    }
  }')

WORKFLOW_ID=$(echo "$WORKFLOW" | jq -r '.workflow_permanent_id')
echo "Workflow ID: $WORKFLOW_ID"

# Run the workflow with a resume URL
RUN=$(curl -s -X POST "https://api.skyvern.com/v1/run/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"workflow_id\": \"$WORKFLOW_ID\",
    \"parameters\": {
      \"resume\": \"https://writing.colostate.edu/guides/documents/resume/functionalsample.pdf\"
    }
  }")

RUN_ID=$(echo "$RUN" | jq -r '.run_id')
echo "Run ID: $RUN_ID"

# Poll until complete
while true; do
  RESPONSE=$(curl -s -X GET "https://api.skyvern.com/v1/runs/$RUN_ID" \
    -H "x-api-key: $SKYVERN_API_KEY")
  STATUS=$(echo "$RESPONSE" | jq -r '.status')
  echo "Status: $STATUS"
  if [[ "$STATUS" == "completed" || "$STATUS" == "failed" || "$STATUS" == "terminated" || "$STATUS" == "timed_out" || "$STATUS" == "canceled" ]]; then
    echo "$RESPONSE" | jq '.output'
    break
  fi
  sleep 5
done
```
</CodeGroup>

- **`file_url`** points to the file. Here, `{{ resume }}` is a workflow parameter passed at runtime; it could be a public URL, an S3 presigned URL, or a Google Drive link.
- **`file_type`** tells Skyvern how to read the file. Use `pdf`, `csv`, or `excel`.
- **`json_schema`** defines exactly what to extract. The LLM reads the PDF and returns data matching this structure. The `work_experience` array means you'll get one object per job with `company` and `role` fields.

The output is available as `{{ parse_resume_output }}` in subsequent blocks and looks like this:

```json
{
  "name": "Jane Doe",
  "email": "jane@example.com",
  "work_experience": [
    { "company": "Acme Corp", "role": "Software Engineer" },
    { "company": "Globex Inc", "role": "Tech Lead" }
  ]
}
```

Without a `json_schema`, you get the raw parsed content instead: plain text for PDFs, or an array of row objects for CSVs. The schema is what turns unstructured file content into typed, structured data.

| Parameter | Type | Use this to |
|-----------|------|-------------|
| `file_url` | string | Point to the file (HTTP URL, S3 URI, or parameter like `{{ resume }}`) |
| `file_type` | string | Specify format: `csv`, `excel`, or `pdf` |
| `json_schema` | object | Define the structure you want extracted |

**Supported sources for `file_url`:**
- HTTP/HTTPS URLs
- S3 URIs (`s3://bucket/path/file.csv`)
- Azure Blob URIs (`azure://container/path/file.xlsx`)
- Google Drive links (auto-converted)
- Workflow parameters (`{{ parameter_name }}`)

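Because the structured output comes from an LLM, it can be worth sanity-checking a `parse_resume_output` payload in your own code before relying on it. The sketch below is one way to do that with the standard library only. The `check_resume_output` function is hypothetical, and it treats every field as required, which is an assumption: the schema in the example does not mark any field as required.

```python
from typing import Any


def check_resume_output(data: Any) -> list[str]:
    """Return a list of problems found in a parsed-resume payload (empty if none)."""
    if not isinstance(data, dict):
        return ["top level is not an object"]

    problems = []
    for key in ("name", "email"):
        if not isinstance(data.get(key), str):
            problems.append(f"{key} is missing or not a string")

    jobs = data.get("work_experience")
    if not isinstance(jobs, list):
        problems.append("work_experience is missing or not an array")
        return problems

    # Each job entry is expected to be an object with string fields.
    for index, job in enumerate(jobs):
        for key in ("company", "role"):
            if not isinstance(job, dict) or not isinstance(job.get(key), str):
                problems.append(
                    f"work_experience[{index}].{key} is missing or not a string"
                )
    return problems


sample = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "work_experience": [
        {"company": "Acme Corp", "role": "Software Engineer"},
        {"company": "Globex Inc", "role": "Tech Lead"},
    ],
}
print(check_resume_output(sample))
print(check_resume_output({"name": "Jane Doe"}))
```

An empty list means the payload has the expected shape. Anything else names the fields to inspect.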
### Parsing a CSV and processing each row

Say your procurement team exports purchase orders as a CSV and you need to enter each one into your web-based ERP. First parse the CSV, then loop over the rows:

<CodeGroup>
```python Python
import os
import asyncio
from skyvern import Skyvern

async def main():
    client = Skyvern(api_key=os.getenv("SKYVERN_API_KEY"))

    # Create the workflow
    workflow = await client.create_workflow(
        json_definition={
            "title": "Import Purchase Orders",
            "workflow_definition": {
                "parameters": [
                    {
                        "key": "orders_csv",
                        "parameter_type": "workflow",
                        "workflow_parameter_type": "file_url"
                    }
                ],
                "blocks": [
                    {
                        "block_type": "file_url_parser",
                        "label": "parse_orders",
                        "file_url": "{{ orders_csv }}",
                        "file_type": "csv"
                    },
                    {
                        "block_type": "for_loop",
                        "label": "process_orders",
                        "loop_over_parameter_key": "parse_orders_output",
                        "loop_blocks": [
                            {
                                "block_type": "navigation",
                                "label": "enter_order",
                                "url": "https://erp.example.com/orders/new",
                                "navigation_goal": "Create a new order with: {{ enter_order.current_value }}"
                            }
                        ]
                    }
                ]
            }
        }
    )
    print(f"Workflow ID: {workflow.workflow_permanent_id}")

    # Run the workflow with a CSV URL
    run = await client.run_workflow(
        workflow_id=workflow.workflow_permanent_id,
        parameters={
            "orders_csv": "https://example.com/purchase_orders.csv"
        }
    )
    print(f"Run ID: {run.run_id}")

    # Poll until complete
    while True:
        result = await client.get_run(run.run_id)
        if result.status in ["completed", "failed", "terminated", "timed_out", "canceled"]:
            break
        print(f"Status: {result.status}")
        await asyncio.sleep(5)

    print(f"Final status: {result.status}")
    print(f"Output: {result.output}")

asyncio.run(main())
```

```typescript TypeScript
import { SkyvernClient } from "@skyvern/client";

async function main() {
  const client = new SkyvernClient({
    apiKey: process.env.SKYVERN_API_KEY,
  });

  // Create the workflow
  const workflow = await client.createWorkflow({
    body: {
      json_definition: {
        title: "Import Purchase Orders",
        workflow_definition: {
          parameters: [
            {
              key: "orders_csv",
              parameter_type: "workflow",
              workflow_parameter_type: "file_url",
            },
          ],
          blocks: [
            {
              block_type: "file_url_parser",
              label: "parse_orders",
              file_url: "{{ orders_csv }}",
              file_type: "csv",
            },
            {
              block_type: "for_loop",
              label: "process_orders",
              loop_over_parameter_key: "parse_orders_output",
              loop_blocks: [
                {
                  block_type: "navigation",
                  label: "enter_order",
                  url: "https://erp.example.com/orders/new",
                  navigation_goal:
                    "Create a new order with: {{ enter_order.current_value }}",
                },
              ],
            },
          ],
        },
      },
    },
  });
  console.log(`Workflow ID: ${workflow.workflow_permanent_id}`);

  // Run the workflow with a CSV URL
  const run = await client.runWorkflow({
    body: {
      workflow_id: workflow.workflow_permanent_id,
      parameters: {
        orders_csv: "https://example.com/purchase_orders.csv",
      },
    },
  });
  console.log(`Run ID: ${run.run_id}`);

  // Poll until complete
  while (true) {
    const result = await client.getRun(run.run_id);
    if (
      ["completed", "failed", "terminated", "timed_out", "canceled"].includes(
        result.status
      )
    ) {
      console.log(`Final status: ${result.status}`);
      console.log(`Output: ${JSON.stringify(result.output, null, 2)}`);
      break;
    }
    console.log(`Status: ${result.status}`);
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
}

main();
```

```bash cURL
# Create the workflow
WORKFLOW=$(curl -s -X POST "https://api.skyvern.com/v1/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "json_definition": {
      "title": "Import Purchase Orders",
      "workflow_definition": {
        "parameters": [
          {
            "key": "orders_csv",
            "parameter_type": "workflow",
            "workflow_parameter_type": "file_url"
          }
        ],
        "blocks": [
          {
            "block_type": "file_url_parser",
            "label": "parse_orders",
            "file_url": "{{ orders_csv }}",
            "file_type": "csv"
          },
          {
            "block_type": "for_loop",
            "label": "process_orders",
            "loop_over_parameter_key": "parse_orders_output",
            "loop_blocks": [
              {
                "block_type": "navigation",
                "label": "enter_order",
                "url": "https://erp.example.com/orders/new",
                "navigation_goal": "Create a new order with: {{ enter_order.current_value }}"
              }
            ]
          }
        ]
      }
    }
  }')

WORKFLOW_ID=$(echo "$WORKFLOW" | jq -r '.workflow_permanent_id')
echo "Workflow ID: $WORKFLOW_ID"

# Run the workflow with a CSV URL
RUN=$(curl -s -X POST "https://api.skyvern.com/v1/run/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"workflow_id\": \"$WORKFLOW_ID\",
    \"parameters\": {
      \"orders_csv\": \"https://example.com/purchase_orders.csv\"
    }
  }")

RUN_ID=$(echo "$RUN" | jq -r '.run_id')
echo "Run ID: $RUN_ID"

# Poll until complete
while true; do
  RESPONSE=$(curl -s -X GET "https://api.skyvern.com/v1/runs/$RUN_ID" \
    -H "x-api-key: $SKYVERN_API_KEY")
  STATUS=$(echo "$RESPONSE" | jq -r '.status')
  echo "Status: $STATUS"
  if [[ "$STATUS" == "completed" || "$STATUS" == "failed" || "$STATUS" == "terminated" || "$STATUS" == "timed_out" || "$STATUS" == "canceled" ]]; then
    echo "$RESPONSE" | jq '.output'
    break
  fi
  sleep 5
done
```
</CodeGroup>

CSV parsing returns an array of objects, one per row, with column headers as keys. CSVs need no `json_schema`, since the column headers already provide structure:

```json
[
  { "order_id": "ORD-001", "product": "Widget", "quantity": "10" },
  { "order_id": "ORD-002", "product": "Gadget", "quantity": "5" }
]
```

The `for_loop` iterates over this array. On each iteration, `{{ enter_order.current_value }}` contains one row object (e.g., `{ "order_id": "ORD-001", "product": "Widget", "quantity": "10" }`), which the navigation block uses to fill the ERP form. Note that the values in this sample are strings (`"10"`, not `10`).

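Python's standard `csv.DictReader` produces the same row-per-object shape, which makes it a convenient way to preview what a CSV will look like before handing its URL to a workflow. This is an analogy for illustration only, not a claim about how Skyvern parses CSVs internally. The sample data mirrors the output shown above.

```python
import csv
import io
import json

# Inline stand-in for a purchase-order export; the first line is the header row.
csv_text = """order_id,product,quantity
ORD-001,Widget,10
ORD-002,Gadget,5
"""

# DictReader uses the header row as keys and yields one dict per data row.
rows = list(csv.DictReader(io.StringIO(csv_text)))
print(json.dumps(rows, indent=2))

# Every value is a string, so convert explicitly where a number is needed.
total_quantity = sum(int(row["quantity"]) for row in rows)
print(total_quantity)
```

The string-typed `quantity` is why any arithmetic on parsed rows needs an explicit conversion first.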
---

## Saving files

When your workflow needs to store downloaded files permanently (archiving invoices, sending reports to a shared bucket, integrating with other systems), use one of the upload blocks.

| Block | When to use it |
|-------|---------------|
| `upload_to_s3` | Store files in Skyvern's managed S3. Simplest option: no credentials needed. |
| `file_upload` | Store files in your own S3 bucket or Azure Blob Storage. Use when you need files in your infrastructure. |
| `download_to_s3` | Save a file from a URL directly to Skyvern's S3, skipping the browser. Use when you already have the file URL. |

### Upload to Skyvern's S3

After downloading EIN confirmation letters from the IRS, archive them to Skyvern's managed S3. The example shows the upload block on its own; in a real workflow it would follow the `file_download` blocks that produce the files. Pass `SKYVERN_DOWNLOAD_DIRECTORY` as the path and the block uploads every file downloaded during this run:

<CodeGroup>
```python Python
import os
import asyncio
from skyvern import Skyvern

async def main():
    client = Skyvern(api_key=os.getenv("SKYVERN_API_KEY"))

    # Create the workflow
    workflow = await client.create_workflow(
        json_definition={
            "title": "Archive Downloads to S3",
            "workflow_definition": {
                "parameters": [],
                "blocks": [
                    {
                        "block_type": "upload_to_s3",
                        "label": "save_invoices",
                        "path": "SKYVERN_DOWNLOAD_DIRECTORY"
                    }
                ]
            }
        }
    )
    print(f"Workflow ID: {workflow.workflow_permanent_id}")

    # Run the workflow
    run = await client.run_workflow(workflow_id=workflow.workflow_permanent_id)
    print(f"Run ID: {run.run_id}")

    # Poll until complete
    while True:
        result = await client.get_run(run.run_id)
        if result.status in ["completed", "failed", "terminated", "timed_out", "canceled"]:
            break
        print(f"Status: {result.status}")
        await asyncio.sleep(5)

    print(f"Final status: {result.status}")
    print(f"Output: {result.output}")

asyncio.run(main())
```

```typescript TypeScript
import { SkyvernClient } from "@skyvern/client";

async function main() {
  const client = new SkyvernClient({
    apiKey: process.env.SKYVERN_API_KEY,
  });

  // Create the workflow
  const workflow = await client.createWorkflow({
    body: {
      json_definition: {
        title: "Archive Downloads to S3",
        workflow_definition: {
          parameters: [],
          blocks: [
            {
              block_type: "upload_to_s3",
              label: "save_invoices",
              path: "SKYVERN_DOWNLOAD_DIRECTORY",
            },
          ],
        },
      },
    },
  });
  console.log(`Workflow ID: ${workflow.workflow_permanent_id}`);

  // Run the workflow
  const run = await client.runWorkflow({
    body: { workflow_id: workflow.workflow_permanent_id },
  });
  console.log(`Run ID: ${run.run_id}`);

  // Poll until complete
  while (true) {
    const result = await client.getRun(run.run_id);
    if (
      ["completed", "failed", "terminated", "timed_out", "canceled"].includes(
        result.status
      )
    ) {
      console.log(`Final status: ${result.status}`);
      console.log(`Output: ${JSON.stringify(result.output, null, 2)}`);
      break;
    }
    console.log(`Status: ${result.status}`);
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
}

main();
```

```bash cURL
# Create the workflow
WORKFLOW=$(curl -s -X POST "https://api.skyvern.com/v1/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "json_definition": {
      "title": "Archive Downloads to S3",
      "workflow_definition": {
        "parameters": [],
        "blocks": [
          {
            "block_type": "upload_to_s3",
            "label": "save_invoices",
            "path": "SKYVERN_DOWNLOAD_DIRECTORY"
          }
        ]
      }
    }
  }')

WORKFLOW_ID=$(echo "$WORKFLOW" | jq -r '.workflow_permanent_id')
echo "Workflow ID: $WORKFLOW_ID"

# Run the workflow
RUN=$(curl -s -X POST "https://api.skyvern.com/v1/run/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"workflow_id\": \"$WORKFLOW_ID\"}")

RUN_ID=$(echo "$RUN" | jq -r '.run_id')
echo "Run ID: $RUN_ID"

# Poll until complete
while true; do
  RESPONSE=$(curl -s -X GET "https://api.skyvern.com/v1/runs/$RUN_ID" \
    -H "x-api-key: $SKYVERN_API_KEY")
  STATUS=$(echo "$RESPONSE" | jq -r '.status')
  echo "Status: $STATUS"
  if [[ "$STATUS" == "completed" || "$STATUS" == "failed" || "$STATUS" == "terminated" || "$STATUS" == "timed_out" || "$STATUS" == "canceled" ]]; then
    echo "$RESPONSE" | jq '.output'
    break
  fi
  sleep 5
done
```
</CodeGroup>

The output (`{{ save_invoices_output }}`) is a list of S3 URIs you can return from your workflow or pass to subsequent blocks.

| Parameter | Type | Use this to |
|-----------|------|-------------|
| `path` | string | Specify what to upload (`SKYVERN_DOWNLOAD_DIRECTORY` or a specific file path) |

**Limit:** 50 files maximum per upload.

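As a minimal sketch of consuming the list of S3 URIs that `upload_to_s3` returns, the helper below splits each URI into a bucket and a key so a later step can fetch or log the object. The `s3://bucket/key` shape, the `split_s3_uri` helper, and the sample URIs are assumptions for illustration; check a real run's output for the exact format.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical output from the `save_invoices` block (invented values)
uris = [
    "s3://example-bucket/runs/wr_123/invoice-01.pdf",
    "s3://example-bucket/runs/wr_123/invoice-02.pdf",
]

for uri in uris:
    bucket, key = split_s3_uri(uri)
    print(bucket, key)
```

Rejecting anything that is not an `s3://` URI keeps a malformed output from silently producing an empty bucket or key.
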
### Upload to your own storage

When compliance certificates need to land in your company's own S3 bucket (not Skyvern's), use `file_upload` with your AWS credentials:

<CodeGroup>
```python Python
import os
import asyncio
from skyvern import Skyvern

async def main():
    client = Skyvern(api_key=os.getenv("SKYVERN_API_KEY"))

    # Create the workflow with AWS Secrets Manager-backed credentials
    workflow = await client.create_workflow(
        json_definition={
            "title": "Upload to Company S3",
            "workflow_definition": {
                "parameters": [
                    {
                        "key": "aws_key",
                        "parameter_type": "aws_secret",
                        "aws_key": "skyvern/aws/access_key_id"
                    },
                    {
                        "key": "aws_secret",
                        "parameter_type": "aws_secret",
                        "aws_key": "skyvern/aws/secret_access_key"
                    }
                ],
                "blocks": [
                    {
                        "block_type": "file_upload",
                        "label": "upload_to_company_bucket",
                        "storage_type": "s3",
                        "s3_bucket": "company-documents",
                        "aws_access_key_id": "{{ aws_key }}",
                        "aws_secret_access_key": "{{ aws_secret }}",
                        "region_name": "us-west-2",
                        "path": "SKYVERN_DOWNLOAD_DIRECTORY"
                    }
                ]
            }
        }
    )
    print(f"Workflow ID: {workflow.workflow_permanent_id}")

    # Run the workflow (credentials auto-resolve from AWS Secrets Manager)
    run = await client.run_workflow(workflow_id=workflow.workflow_permanent_id)
    print(f"Run ID: {run.run_id}")

    # Poll until complete
    while True:
        result = await client.get_run(run.run_id)
        if result.status in ["completed", "failed", "terminated", "timed_out", "canceled"]:
            break
        print(f"Status: {result.status}")
        await asyncio.sleep(5)

    print(f"Final status: {result.status}")
    print(f"Output: {result.output}")

asyncio.run(main())
```

```typescript TypeScript
import { SkyvernClient } from "@skyvern/client";

async function main() {
  const client = new SkyvernClient({
    apiKey: process.env.SKYVERN_API_KEY,
  });

  // Create the workflow with AWS Secrets Manager-backed credentials
  const workflow = await client.createWorkflow({
    body: {
      json_definition: {
        title: "Upload to Company S3",
        workflow_definition: {
          parameters: [
            {
              key: "aws_key",
              parameter_type: "aws_secret",
              aws_key: "skyvern/aws/access_key_id",
            },
            {
              key: "aws_secret",
              parameter_type: "aws_secret",
              aws_key: "skyvern/aws/secret_access_key",
            },
          ],
          blocks: [
            {
              block_type: "file_upload",
              label: "upload_to_company_bucket",
              storage_type: "s3",
              s3_bucket: "company-documents",
              aws_access_key_id: "{{ aws_key }}",
              aws_secret_access_key: "{{ aws_secret }}",
              region_name: "us-west-2",
              path: "SKYVERN_DOWNLOAD_DIRECTORY",
            },
          ],
        },
      },
    },
  });
  console.log(`Workflow ID: ${workflow.workflow_permanent_id}`);

  // Run the workflow (credentials auto-resolve from AWS Secrets Manager)
  const run = await client.runWorkflow({
    body: {
      workflow_id: workflow.workflow_permanent_id,
    },
  });
  console.log(`Run ID: ${run.run_id}`);

  // Poll until complete
  while (true) {
    const result = await client.getRun(run.run_id);
    if (
      ["completed", "failed", "terminated", "timed_out", "canceled"].includes(
        result.status
      )
    ) {
      console.log(`Final status: ${result.status}`);
      console.log(`Output: ${JSON.stringify(result.output, null, 2)}`);
      break;
    }
    console.log(`Status: ${result.status}`);
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
}

main();
```

```bash cURL
# Create the workflow with AWS Secrets Manager-backed credentials
WORKFLOW=$(curl -s -X POST "https://api.skyvern.com/v1/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "json_definition": {
      "title": "Upload to Company S3",
      "workflow_definition": {
        "parameters": [
          {
            "key": "aws_key",
            "parameter_type": "aws_secret",
            "aws_key": "skyvern/aws/access_key_id"
          },
          {
            "key": "aws_secret",
            "parameter_type": "aws_secret",
            "aws_key": "skyvern/aws/secret_access_key"
          }
        ],
        "blocks": [
          {
            "block_type": "file_upload",
            "label": "upload_to_company_bucket",
            "storage_type": "s3",
            "s3_bucket": "company-documents",
            "aws_access_key_id": "{{ aws_key }}",
            "aws_secret_access_key": "{{ aws_secret }}",
            "region_name": "us-west-2",
            "path": "SKYVERN_DOWNLOAD_DIRECTORY"
          }
        ]
      }
    }
  }')

WORKFLOW_ID=$(echo "$WORKFLOW" | jq -r '.workflow_permanent_id')
echo "Workflow ID: $WORKFLOW_ID"

# Run the workflow (credentials auto-resolve from AWS Secrets Manager)
RUN=$(curl -s -X POST "https://api.skyvern.com/v1/run/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"workflow_id\": \"$WORKFLOW_ID\"
  }")

RUN_ID=$(echo "$RUN" | jq -r '.run_id')
echo "Run ID: $RUN_ID"

# Poll until complete
while true; do
  RESPONSE=$(curl -s -X GET "https://api.skyvern.com/v1/runs/$RUN_ID" \
    -H "x-api-key: $SKYVERN_API_KEY")
  STATUS=$(echo "$RESPONSE" | jq -r '.status')
  echo "Status: $STATUS"
  if [[ "$STATUS" == "completed" || "$STATUS" == "failed" || "$STATUS" == "terminated" || "$STATUS" == "timed_out" || "$STATUS" == "canceled" ]]; then
    echo "$RESPONSE" | jq '.output'
    break
  fi
  sleep 5
done
```
</CodeGroup>

- **`storage_type`** selects the destination: `s3` for AWS, `azure` for Azure Blob Storage.
- **Credentials** should use `aws_secret` parameters backed by AWS Secrets Manager, so they resolve at runtime without being passed in the run payload.
- **`path`** works the same as in `upload_to_s3`: pass `SKYVERN_DOWNLOAD_DIRECTORY` to upload everything, or a specific file path.

| Parameter | Type | Use this to |
|-----------|------|-------------|
| `storage_type` | string | Choose `s3` or `azure` |
| `path` | string | Specify what to upload (`SKYVERN_DOWNLOAD_DIRECTORY` or a file path) |

**S3 parameters:** `s3_bucket`, `aws_access_key_id`, `aws_secret_access_key`, `region_name`

**Azure parameters:** `azure_storage_account_name`, `azure_storage_account_key`, `azure_blob_container_name`, `azure_folder_path`

**Limit:** 50 files maximum per upload.

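The Azure variant of the block takes the same shape as the S3 example, with the Azure parameters swapped in. The sketch below assembles such a block and rejects settings that belong to the other storage type. It assumes only the parameter names listed above; the `build_file_upload_block` helper, the container and folder values, and the `azure_account` / `azure_key` workflow parameters are hypothetical and would need to be defined in your own workflow's `parameters` list.

```python
import json

# Parameter names per storage type, as listed in this section
PARAMS_BY_STORAGE = {
    "s3": {"s3_bucket", "aws_access_key_id", "aws_secret_access_key", "region_name"},
    "azure": {
        "azure_storage_account_name",
        "azure_storage_account_key",
        "azure_blob_container_name",
        "azure_folder_path",
    },
}


def build_file_upload_block(
    label: str,
    storage_type: str,
    path: str = "SKYVERN_DOWNLOAD_DIRECTORY",
    **settings: str,
) -> dict:
    """Assemble a file_upload block dict, refusing mismatched settings."""
    allowed = PARAMS_BY_STORAGE.get(storage_type)
    if allowed is None:
        raise ValueError(f"Unknown storage_type: {storage_type!r}")
    unexpected = set(settings) - allowed
    if unexpected:
        raise ValueError(f"Not valid for {storage_type}: {sorted(unexpected)}")
    return {
        "block_type": "file_upload",
        "label": label,
        "storage_type": storage_type,
        "path": path,
        **settings,
    }


# Hypothetical values: the templated names must exist as workflow parameters
block = build_file_upload_block(
    "upload_to_azure",
    "azure",
    azure_storage_account_name="{{ azure_account }}",
    azure_storage_account_key="{{ azure_key }}",
    azure_blob_container_name="company-documents",
    azure_folder_path="invoices",
)
print(json.dumps(block, indent=2))
```

The resulting dict can be placed in the `blocks` list of a `json_definition` in the same way as the S3 block above.
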
### Quick URL-to-S3

When you already have a direct URL to a file (say, a Google Analytics export link) and just need to save it without opening a browser, use `download_to_s3`:

<CodeGroup>
```python Python
import os
import asyncio
from skyvern import Skyvern

async def main():
    client = Skyvern(api_key=os.getenv("SKYVERN_API_KEY"))

    # Create the workflow
    workflow = await client.create_workflow(
        json_definition={
            "title": "Save Report to S3",
            "workflow_definition": {
                "parameters": [],
                "blocks": [
                    {
                        "block_type": "download_to_s3",
                        "label": "save_report",
                        "url": "https://analytics.example.com/report.pdf"
                    }
                ]
            }
        }
    )
    print(f"Workflow ID: {workflow.workflow_permanent_id}")

    # Run the workflow
    run = await client.run_workflow(workflow_id=workflow.workflow_permanent_id)
    print(f"Run ID: {run.run_id}")

    # Poll until complete
    while True:
        result = await client.get_run(run.run_id)
        if result.status in ["completed", "failed", "terminated", "timed_out", "canceled"]:
            break
        print(f"Status: {result.status}")
        await asyncio.sleep(5)

    print(f"Final status: {result.status}")
    print(f"Output: {result.output}")

asyncio.run(main())
```

```typescript TypeScript
import { SkyvernClient } from "@skyvern/client";

async function main() {
  const client = new SkyvernClient({
    apiKey: process.env.SKYVERN_API_KEY,
  });

  // Create the workflow
  const workflow = await client.createWorkflow({
    body: {
      json_definition: {
        title: "Save Report to S3",
        workflow_definition: {
          parameters: [],
          blocks: [
            {
              block_type: "download_to_s3",
              label: "save_report",
              url: "https://analytics.example.com/report.pdf",
            },
          ],
        },
      },
    },
  });
  console.log(`Workflow ID: ${workflow.workflow_permanent_id}`);

  // Run the workflow
  const run = await client.runWorkflow({
    body: { workflow_id: workflow.workflow_permanent_id },
  });
  console.log(`Run ID: ${run.run_id}`);

  // Poll until complete
  while (true) {
    const result = await client.getRun(run.run_id);
    if (
      ["completed", "failed", "terminated", "timed_out", "canceled"].includes(
        result.status
      )
    ) {
      console.log(`Final status: ${result.status}`);
      console.log(`Output: ${JSON.stringify(result.output, null, 2)}`);
      break;
    }
    console.log(`Status: ${result.status}`);
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
}

main();
```

```bash cURL
# Create the workflow
WORKFLOW=$(curl -s -X POST "https://api.skyvern.com/v1/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "json_definition": {
      "title": "Save Report to S3",
      "workflow_definition": {
        "parameters": [],
        "blocks": [
          {
            "block_type": "download_to_s3",
            "label": "save_report",
            "url": "https://analytics.example.com/report.pdf"
          }
        ]
      }
    }
  }')

WORKFLOW_ID=$(echo "$WORKFLOW" | jq -r '.workflow_permanent_id')
echo "Workflow ID: $WORKFLOW_ID"

# Run the workflow
RUN=$(curl -s -X POST "https://api.skyvern.com/v1/run/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"workflow_id\": \"$WORKFLOW_ID\"}")

RUN_ID=$(echo "$RUN" | jq -r '.run_id')
echo "Run ID: $RUN_ID"

# Poll until complete
while true; do
  RESPONSE=$(curl -s -X GET "https://api.skyvern.com/v1/runs/$RUN_ID" \
    -H "x-api-key: $SKYVERN_API_KEY")
  STATUS=$(echo "$RESPONSE" | jq -r '.status')
  echo "Status: $STATUS"
  if [[ "$STATUS" == "completed" || "$STATUS" == "failed" || "$STATUS" == "terminated" || "$STATUS" == "timed_out" || "$STATUS" == "canceled" ]]; then
    echo "$RESPONSE" | jq '.output'
    break
  fi
  sleep 5
done
```
</CodeGroup>

| Parameter | Type | Use this to |
|-----------|------|-------------|
| `url` | string | Specify the file URL to download and save |

This skips the browser entirely: it fetches the file directly and stores it in Skyvern's S3. Use it when you already have the URL and don't need Skyvern to navigate a page. **Limit:** 10 MB maximum file size.

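Because of the size cap, it can help to check a file's size before routing it through `download_to_s3`. The sketch below assumes the file server answers `HEAD` requests with a `Content-Length` header, and that 10 MB means 10 × 1024 × 1024 bytes (this page does not say which definition applies). `fits_download_limit` and `remote_size` are hypothetical helpers, not part of the Skyvern SDK.

```python
import urllib.request
from typing import Optional

# Assumed interpretation of the 10 MB limit (binary megabytes)
MAX_BYTES = 10 * 1024 * 1024


def fits_download_limit(content_length: Optional[str]) -> bool:
    """Return True only when a Content-Length header value is within the limit."""
    if content_length is None or not content_length.strip().isdigit():
        return False  # unknown or malformed size: treat as not safe
    return int(content_length) <= MAX_BYTES


def remote_size(url: str) -> Optional[str]:
    """Fetch the Content-Length header via HEAD, without downloading the body."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Length")


# Offline checks with literal header values (no network call needed)
print(fits_download_limit("5242880"))   # a 5 MiB file
print(fits_download_limit("20971520"))  # a 20 MiB file
```

In practice you would call `fits_download_limit(remote_size(url))` and fall back to a browser-based `file_download` block, or another route, when the check fails.
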

---

## Passing files between blocks

Files flow through workflows in two ways: through `SKYVERN_DOWNLOAD_DIRECTORY` for downloaded files, and through output parameters for parsed data.

### Using downloaded files

Say you downloaded invoices from a vendor portal earlier in the workflow and now need to email them to your accounting team. Any file downloaded during the run sits in `SKYVERN_DOWNLOAD_DIRECTORY`, and you can attach the entire directory to an email:

<CodeGroup>
```python Python
import os
import asyncio
from skyvern import Skyvern

async def main():
    client = Skyvern(api_key=os.getenv("SKYVERN_API_KEY"))

    # Create the workflow
    workflow = await client.create_workflow(
        json_definition={
            "title": "Email Downloaded Invoices",
            "workflow_definition": {
                "parameters": [
                    {
                        "key": "smtp_host",
                        "parameter_type": "aws_secret",
                        "aws_key": "skyvern/smtp/host"
                    },
                    {
                        "key": "smtp_port",
                        "parameter_type": "aws_secret",
                        "aws_key": "skyvern/smtp/port"
                    },
                    {
                        "key": "smtp_username",
                        "parameter_type": "aws_secret",
                        "aws_key": "skyvern/smtp/username"
                    },
                    {
                        "key": "smtp_password",
                        "parameter_type": "aws_secret",
                        "aws_key": "skyvern/smtp/password"
                    }
                ],
                "blocks": [
                    {
                        "block_type": "send_email",
                        "label": "email_documents",
                        "sender": "automation@company.com",
                        "recipients": ["accounting@company.com"],
                        "subject": "Monthly invoices attached",
                        "body": "See attached invoices for this month.",
                        "smtp_host_secret_parameter_key": "smtp_host",
                        "smtp_port_secret_parameter_key": "smtp_port",
                        "smtp_username_secret_parameter_key": "smtp_username",
                        "smtp_password_secret_parameter_key": "smtp_password",
                        "file_attachments": ["SKYVERN_DOWNLOAD_DIRECTORY"]
                    }
                ]
            }
        }
    )
    print(f"Workflow ID: {workflow.workflow_permanent_id}")

    # Run the workflow (SMTP secrets are auto-resolved from AWS Secrets Manager)
    run = await client.run_workflow(workflow_id=workflow.workflow_permanent_id)
    print(f"Run ID: {run.run_id}")

    # Poll until complete
    while True:
        result = await client.get_run(run.run_id)
        if result.status in ["completed", "failed", "terminated", "timed_out", "canceled"]:
            break
        print(f"Status: {result.status}")
        await asyncio.sleep(5)

    print(f"Final status: {result.status}")
    print(f"Output: {result.output}")

asyncio.run(main())
```

```typescript TypeScript
import { SkyvernClient } from "@skyvern/client";

async function main() {
  const client = new SkyvernClient({
    apiKey: process.env.SKYVERN_API_KEY,
  });

  // Create the workflow
  const workflow = await client.createWorkflow({
    body: {
      json_definition: {
        title: "Email Downloaded Invoices",
        workflow_definition: {
          parameters: [
            {
              key: "smtp_host",
              parameter_type: "aws_secret",
              aws_key: "skyvern/smtp/host",
            },
            {
              key: "smtp_port",
              parameter_type: "aws_secret",
              aws_key: "skyvern/smtp/port",
            },
            {
              key: "smtp_username",
              parameter_type: "aws_secret",
              aws_key: "skyvern/smtp/username",
            },
            {
              key: "smtp_password",
              parameter_type: "aws_secret",
              aws_key: "skyvern/smtp/password",
            },
          ],
          blocks: [
            {
              block_type: "send_email",
              label: "email_documents",
              sender: "automation@company.com",
              recipients: ["accounting@company.com"],
              subject: "Monthly invoices attached",
              body: "See attached invoices for this month.",
              smtp_host_secret_parameter_key: "smtp_host",
              smtp_port_secret_parameter_key: "smtp_port",
              smtp_username_secret_parameter_key: "smtp_username",
              smtp_password_secret_parameter_key: "smtp_password",
              file_attachments: ["SKYVERN_DOWNLOAD_DIRECTORY"],
            },
          ],
        },
      },
    },
  });
  console.log(`Workflow ID: ${workflow.workflow_permanent_id}`);

  // Run the workflow (SMTP secrets are auto-resolved from AWS Secrets Manager)
  const run = await client.runWorkflow({
    body: { workflow_id: workflow.workflow_permanent_id },
  });
  console.log(`Run ID: ${run.run_id}`);

  // Poll until complete
  while (true) {
    const result = await client.getRun(run.run_id);
    if (
      ["completed", "failed", "terminated", "timed_out", "canceled"].includes(
        result.status
      )
    ) {
      console.log(`Final status: ${result.status}`);
      console.log(`Output: ${JSON.stringify(result.output, null, 2)}`);
      break;
    }
    console.log(`Status: ${result.status}`);
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
}

main();
```

```bash cURL
# Create the workflow
WORKFLOW=$(curl -s -X POST "https://api.skyvern.com/v1/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "json_definition": {
      "title": "Email Downloaded Invoices",
      "workflow_definition": {
        "parameters": [
          {
            "key": "smtp_host",
            "parameter_type": "aws_secret",
            "aws_key": "skyvern/smtp/host"
          },
          {
            "key": "smtp_port",
            "parameter_type": "aws_secret",
            "aws_key": "skyvern/smtp/port"
          },
          {
            "key": "smtp_username",
            "parameter_type": "aws_secret",
            "aws_key": "skyvern/smtp/username"
          },
          {
            "key": "smtp_password",
            "parameter_type": "aws_secret",
            "aws_key": "skyvern/smtp/password"
          }
        ],
        "blocks": [
          {
            "block_type": "send_email",
            "label": "email_documents",
            "sender": "automation@company.com",
            "recipients": ["accounting@company.com"],
            "subject": "Monthly invoices attached",
            "body": "See attached invoices for this month.",
            "smtp_host_secret_parameter_key": "smtp_host",
            "smtp_port_secret_parameter_key": "smtp_port",
            "smtp_username_secret_parameter_key": "smtp_username",
            "smtp_password_secret_parameter_key": "smtp_password",
            "file_attachments": ["SKYVERN_DOWNLOAD_DIRECTORY"]
          }
        ]
      }
    }
  }')

WORKFLOW_ID=$(echo "$WORKFLOW" | jq -r '.workflow_permanent_id')
echo "Workflow ID: $WORKFLOW_ID"

# Run the workflow (SMTP secrets are auto-resolved from AWS Secrets Manager)
RUN=$(curl -s -X POST "https://api.skyvern.com/v1/run/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"workflow_id\": \"$WORKFLOW_ID\"}")

RUN_ID=$(echo "$RUN" | jq -r '.run_id')
echo "Run ID: $RUN_ID"

# Poll until complete
while true; do
  RESPONSE=$(curl -s -X GET "https://api.skyvern.com/v1/runs/$RUN_ID" \
    -H "x-api-key: $SKYVERN_API_KEY")
  STATUS=$(echo "$RESPONSE" | jq -r '.status')
  echo "Status: $STATUS"
  if [[ "$STATUS" == "completed" || "$STATUS" == "failed" || "$STATUS" == "terminated" || "$STATUS" == "timed_out" || "$STATUS" == "canceled" ]]; then
    echo "$RESPONSE" | jq '.output'
    break
  fi
  sleep 5
done
```
</CodeGroup>

`file_attachments: ["SKYVERN_DOWNLOAD_DIRECTORY"]` attaches every file downloaded during the run. The SMTP credentials use `aws_secret` parameters, which reference keys stored in AWS Secrets Manager so your credentials are never exposed in the workflow definition.

### Using parsed data

When you parse a file, the extracted data becomes available as `{{ label_output }}`, where `label` is the parsing block's label. This is how you chain a parse step with a navigation step. For example, you can parse a resume PDF and then use the candidate's info to fill a Lever job application:

<CodeGroup>
```python Python
import os
import asyncio
from skyvern import Skyvern

async def main():
    client = Skyvern(api_key=os.getenv("SKYVERN_API_KEY"))

    # Create the workflow
    workflow = await client.create_workflow(
        json_definition={
            "title": "Parse Resume and Apply",
            "workflow_definition": {
                "parameters": [
                    {
                        "key": "resume",
                        "parameter_type": "workflow",
                        "workflow_parameter_type": "file_url"
                    },
                    {
                        "key": "job_url",
                        "parameter_type": "workflow",
                        "workflow_parameter_type": "string"
                    }
                ],
                "blocks": [
                    {
                        "block_type": "file_url_parser",
                        "label": "parse_resume",
                        "file_url": "{{ resume }}",
                        "file_type": "pdf"
                    },
                    {
                        "block_type": "navigation",
                        "label": "fill_application",
                        "url": "{{ job_url }}",
                        "navigation_goal": (
                            "Fill out the application form using "
                            "this candidate's information:\n\n"
                            "{{ parse_resume_output }}"
                        )
                    }
                ]
            }
        }
    )
    print(f"Workflow ID: {workflow.workflow_permanent_id}")

    # Run the workflow with a resume and job URL
    run = await client.run_workflow(
        workflow_id=workflow.workflow_permanent_id,
        parameters={
            "resume": "https://writing.colostate.edu/guides/documents/resume/functionalsample.pdf",
            "job_url": "https://jobs.lever.co/company/position"
        }
    )
    print(f"Run ID: {run.run_id}")

    # Poll until complete
    while True:
        result = await client.get_run(run.run_id)
        if result.status in ["completed", "failed", "terminated", "timed_out", "canceled"]:
            break
        print(f"Status: {result.status}")
        await asyncio.sleep(5)

    print(f"Final status: {result.status}")
    print(f"Output: {result.output}")

asyncio.run(main())
```

```typescript TypeScript
import { SkyvernClient } from "@skyvern/client";

async function main() {
  const client = new SkyvernClient({
    apiKey: process.env.SKYVERN_API_KEY,
  });

  // Create the workflow
  const workflow = await client.createWorkflow({
    body: {
      json_definition: {
        title: "Parse Resume and Apply",
        workflow_definition: {
          parameters: [
            {
              key: "resume",
              parameter_type: "workflow",
              workflow_parameter_type: "file_url",
            },
            {
              key: "job_url",
              parameter_type: "workflow",
              workflow_parameter_type: "string",
            },
          ],
          blocks: [
            {
              block_type: "file_url_parser",
              label: "parse_resume",
              file_url: "{{ resume }}",
              file_type: "pdf",
            },
            {
              block_type: "navigation",
              label: "fill_application",
              url: "{{ job_url }}",
              navigation_goal: [
                "Fill out the application form using this candidate's information:",
                "",
                "{{ parse_resume_output }}",
              ].join("\n"),
            },
          ],
        },
      },
    },
  });
  console.log(`Workflow ID: ${workflow.workflow_permanent_id}`);

  // Run the workflow with a resume and job URL
  const run = await client.runWorkflow({
    body: {
      workflow_id: workflow.workflow_permanent_id,
      parameters: {
        resume:
          "https://writing.colostate.edu/guides/documents/resume/functionalsample.pdf",
        job_url: "https://jobs.lever.co/company/position",
      },
    },
  });
  console.log(`Run ID: ${run.run_id}`);

  // Poll until complete
  while (true) {
    const result = await client.getRun(run.run_id);
    if (
      ["completed", "failed", "terminated", "timed_out", "canceled"].includes(
        result.status
      )
    ) {
      console.log(`Final status: ${result.status}`);
      console.log(`Output: ${JSON.stringify(result.output, null, 2)}`);
      break;
    }
    console.log(`Status: ${result.status}`);
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
}

main();
```

```bash cURL
# Create the workflow
WORKFLOW=$(curl -s -X POST "https://api.skyvern.com/v1/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "json_definition": {
      "title": "Parse Resume and Apply",
      "workflow_definition": {
        "parameters": [
          {
            "key": "resume",
            "parameter_type": "workflow",
            "workflow_parameter_type": "file_url"
          },
          {
            "key": "job_url",
            "parameter_type": "workflow",
            "workflow_parameter_type": "string"
          }
        ],
        "blocks": [
          {
            "block_type": "file_url_parser",
            "label": "parse_resume",
            "file_url": "{{ resume }}",
            "file_type": "pdf"
          },
          {
            "block_type": "navigation",
            "label": "fill_application",
            "url": "{{ job_url }}",
            "navigation_goal": "Fill out the application form using this candidate'\''s information:\n\n{{ parse_resume_output }}"
          }
        ]
      }
    }
  }')

WORKFLOW_ID=$(echo "$WORKFLOW" | jq -r '.workflow_permanent_id')
echo "Workflow ID: $WORKFLOW_ID"

# Run the workflow with a resume and job URL
RUN=$(curl -s -X POST "https://api.skyvern.com/v1/run/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"workflow_id\": \"$WORKFLOW_ID\",
    \"parameters\": {
      \"resume\": \"https://writing.colostate.edu/guides/documents/resume/functionalsample.pdf\",
      \"job_url\": \"https://jobs.lever.co/company/position\"
    }
  }")

RUN_ID=$(echo "$RUN" | jq -r '.run_id')
echo "Run ID: $RUN_ID"

# Poll until complete
while true; do
  RESPONSE=$(curl -s -X GET "https://api.skyvern.com/v1/runs/$RUN_ID" \
    -H "x-api-key: $SKYVERN_API_KEY")
  STATUS=$(echo "$RESPONSE" | jq -r '.status')
  echo "Status: $STATUS"
  if [[ "$STATUS" == "completed" || "$STATUS" == "failed" || "$STATUS" == "terminated" || "$STATUS" == "timed_out" || "$STATUS" == "canceled" ]]; then
    echo "$RESPONSE" | jq '.output'
    break
  fi
  sleep 5
done
```
</CodeGroup>

The first block parses the resume and stores the result. The second block references `{{ parse_resume_output }}` in its `navigation_goal`. Skyvern replaces this placeholder with the full parsed data (name, email, work experience) and uses it to fill the form fields.

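To make the substitution concrete, here is a rough mental model of what happens to the `navigation_goal` string. This is not Skyvern's template engine: `render` is a hypothetical stand-in that replaces `{{ name }}` placeholders with JSON-serialized values, and the parsed resume fields are invented for illustration.

```python
import json
import re


def render(template: str, values: dict) -> str:
    """Replace each {{ name }} placeholder with its value (JSON for non-strings)."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            return match.group(0)  # leave unknown placeholders untouched
        value = values[name]
        return value if isinstance(value, str) else json.dumps(value, indent=2)

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


goal = (
    "Fill out the application form using this candidate's information:\n\n"
    "{{ parse_resume_output }}"
)

# Invented sample data standing in for the parser block's output
parsed = {"name": "Jane Doe", "email": "jane@example.com"}

print(render(goal, {"parse_resume_output": parsed}))
```

The point of the sketch is only that the downstream block receives the parsed data inline as text, so the prompt around the placeholder should tell the agent how to use it.
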
### Accepting files as workflow inputs

To accept a file when the workflow runs, define a parameter with `workflow_parameter_type: file_url`:

<CodeGroup>
```python Python
workflow = await client.create_workflow(
    json_definition={
        "title": "Resume Parser",
        "workflow_definition": {
            "parameters": [
                {
                    "key": "resume",
                    "parameter_type": "workflow",
                    "workflow_parameter_type": "file_url",
                    "description": "URL to the applicant's resume"
                }
            ],
            "blocks": [...]
        }
    }
)
```

```typescript TypeScript
const workflow = await client.createWorkflow({
  body: {
    json_definition: {
      title: "Resume Parser",
      workflow_definition: {
        parameters: [
          {
            key: "resume",
            parameter_type: "workflow",
            workflow_parameter_type: "file_url",
            description: "URL to the applicant's resume",
          },
        ],
        blocks: [...],
      },
    },
  },
});
```

```bash cURL
curl -X POST "https://api.skyvern.com/v1/workflows" \
  -H "x-api-key: $SKYVERN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "json_definition": {
      "title": "Resume Parser",
      "workflow_definition": {
        "parameters": [
          {
            "key": "resume",
            "parameter_type": "workflow",
            "workflow_parameter_type": "file_url",
            "description": "URL to the applicant'\''s resume"
          }
        ],
        "blocks": [...]
      }
    }
  }'
```
</CodeGroup>

When running the workflow, pass any URL that Skyvern can reach: public links, S3 presigned URLs, or links to files in your own storage.

---

## Next steps

<CardGroup cols={2}>
  <Card
    title="Workflow Blocks Reference"
    icon="cube"
    href="/multi-step-automations/workflow-blocks-reference"
  >
    Complete parameter reference for all file blocks
  </Card>
  <Card
    title="Build a Workflow"
    icon="hammer"
    href="/multi-step-automations/build-a-workflow"
  >
    Learn how blocks pass data to each other
  </Card>
</CardGroup>