Making file parser flexible to deprecate pdf parser (#3073)

Co-authored-by: Suchintan <suchintan@users.noreply.github.com>
PHSB
2025-08-06 11:15:04 -06:00
committed by GitHub
parent 31aa7d6973
commit 468f5c6051
15 changed files with 555 additions and 49 deletions

@@ -19,7 +19,7 @@ Building blocks supported today:
 - TextPromptBlock: A text only prompt block.
 - SendEmailBlock: Send an email.
 - FileDownloadBlock: Given a goal, Skyvern downloads a file from the website.
-- FileParserBlock: Given a file url, Skyvern downloads the file from the url, and returns the parsed content as the output of the block. Currently only support CSV file format.
+- FileParserBlock: Given a file url, Skyvern downloads the file from the url, and returns the parsed content as the output of the block. Supports CSV, Excel, and PDF file formats.
 - PDFParserBlock: Given a pdf url, Skyvern downloads the PDF file from the url and returns the parsed content as the output of the block.
 - FileUploadBlock: Upload all the downloaded files to a desired destination. Currently only AWS S3 is supported. Please contact support@skyvern.com if you need more integrations.
 - WaitBlock: Wait for a given amount of time.

@@ -43,7 +43,7 @@ This block sends an email.
 This block downloads a file from the website.
 ## FileParserBlock
-This block parses a file from the website.
+This block parses PDFs, CSVs, and Excel files from the website.
 ## PDFParserBlock
 This block parses a PDF file from the website.

@@ -228,16 +228,16 @@ Inputs:
 Downloads and parses a file to be used within other workflow blocks.
-**Supported types:** CSV
+**Supported types:** CSV, TSV, Excel, PDF
 ```
 - block_type: file_url_parser
-  label: csv_parser
-  file_type: csv
-  file_url: <csv_file_url>
+  label: file_parser
+  file_type: csv # Auto-detected from URL extension
+  file_url: <file_url>
 ```
 Inputs:
-1. **File URL *(required):*** This block allows you to use a CSV within your workflow.
+1. **File URL *(required):*** This block allows you to use CSV, TSV, Excel, and PDF files within your workflow.
 * Since we're still in beta, you will need to [contact us](https://meetings.hubspot.com/skyvern/demo?uuid=7c83865f-1a92-4c44-9e52-1ba0dbc04f7a) to load a value into this block.
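
The `file_type` comment in the updated example says the type is auto-detected from the URL extension. A minimal sketch of how such detection could work is shown below. This is not Skyvern's implementation: the names `EXTENSION_TO_FILE_TYPE`, `detect_file_type`, and `parse_delimited`, the extension mapping, and the fallback to `csv` are all assumptions made for illustration.

```python
import csv
import io
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical mapping, for illustration only; not taken from the Skyvern codebase.
EXTENSION_TO_FILE_TYPE = {
    ".csv": "csv",
    ".tsv": "tsv",
    ".xlsx": "excel",
    ".xls": "excel",
    ".pdf": "pdf",
}


def detect_file_type(file_url: str, default: str = "csv") -> str:
    """Guess a file_type value from the extension of the URL path.

    Falling back to `default` for unknown extensions is an assumption.
    """
    path = urlparse(file_url).path  # ignores the query string and fragment
    suffix = PurePosixPath(path).suffix.lower()
    return EXTENSION_TO_FILE_TYPE.get(suffix, default)


def parse_delimited(text: str, file_type: str) -> list[dict[str, str]]:
    """Parse CSV or TSV text into a list of row dictionaries."""
    delimiter = "\t" if file_type == "tsv" else ","
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
```

Under these assumptions, `detect_file_type("https://example.com/report.pdf")` would select the PDF path, and a URL with no recognizable extension would fall back to CSV. Excel and PDF parsing need third-party libraries and are left out of the sketch.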