---
title: Storage Configuration
subtitle: Configure where Skyvern stores artifacts and recordings
slug: self-hosted/storage
---

Skyvern generates several types of artifacts during task execution: screenshots, browser recordings, HAR files, and extracted data. By default, these are stored on the local filesystem. For production deployments, you can configure S3 or Azure Blob Storage.

## Storage types

Skyvern supports three storage backends:

| Type | `SKYVERN_STORAGE_TYPE` | Best for |
|------|------------------------|----------|
| Local filesystem | `local` | Development, single-server deployments |
| AWS S3 | `s3` | Production on AWS, multi-server deployments |
| Azure Blob | `azureblob` | Production on Azure |

---

## Local storage (default)

By default, Skyvern stores all artifacts in a local directory.

```bash .env
SKYVERN_STORAGE_TYPE=local
ARTIFACT_STORAGE_PATH=/data/artifacts
VIDEO_PATH=/data/videos
HAR_PATH=/data/har
LOG_PATH=/data/log
```

### Docker volume mounts

When using Docker Compose, these paths are mounted from your host:

```yaml docker-compose.yml
volumes:
  - ./artifacts:/data/artifacts
  - ./videos:/data/videos
  - ./har:/data/har
  - ./log:/data/log
```

### Limitations

Local storage works well for single-server deployments but has limitations:

- Not accessible across multiple servers
- No automatic backup or redundancy
- Requires manual cleanup to manage disk space

---

## AWS S3

Store artifacts in S3 for durability, scalability, and access from multiple servers.
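Before cutting over to the `s3` backend, a quick preflight check can confirm that every required setting is present. A minimal sketch (the variable names are the `.env` keys described in this guide; the helper function itself is not part of Skyvern):

```shell
# check_s3_env: print any S3-related setting that is still unset.
# Variable names mirror the .env keys used for the s3 backend.
check_s3_env() {
  for var in SKYVERN_STORAGE_TYPE AWS_REGION \
             AWS_S3_BUCKET_ARTIFACTS AWS_S3_BUCKET_SCREENSHOTS \
             AWS_S3_BUCKET_BROWSER_SESSIONS AWS_S3_BUCKET_UPLOADS; do
    # Indirect expansion via eval keeps this POSIX-sh compatible.
    eval "val=\${$var:-}"
    if [ -z "$val" ]; then
      echo "missing: $var"
    fi
  done
}
```

Run it in the same shell that sources your `.env`, before starting Skyvern.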
### Configuration

```bash .env
SKYVERN_STORAGE_TYPE=s3
AWS_REGION=us-east-1
AWS_S3_BUCKET_ARTIFACTS=your-skyvern-artifacts
AWS_S3_BUCKET_SCREENSHOTS=your-skyvern-screenshots
AWS_S3_BUCKET_BROWSER_SESSIONS=your-skyvern-browser-sessions
AWS_S3_BUCKET_UPLOADS=your-skyvern-uploads

# Pre-signed URL expiration (seconds) - default 24 hours
PRESIGNED_URL_EXPIRATION=86400

# Maximum upload file size (bytes) - default 10MB
MAX_UPLOAD_FILE_SIZE=10485760
```

### Authentication

Skyvern uses the standard AWS credential chain. Configure credentials using one of these methods:

**Environment variables:**

```bash .env
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
```

**IAM role (recommended for EC2/ECS/EKS):**

Attach an IAM role with S3 permissions to your instance or pod. No credentials needed in environment.

**AWS profile:**

```bash .env
AWS_PROFILE=your-profile-name
```

### Required IAM permissions

Create an IAM policy with these permissions:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-skyvern-artifacts",
        "arn:aws:s3:::your-skyvern-artifacts/*",
        "arn:aws:s3:::your-skyvern-screenshots",
        "arn:aws:s3:::your-skyvern-screenshots/*",
        "arn:aws:s3:::your-skyvern-browser-sessions",
        "arn:aws:s3:::your-skyvern-browser-sessions/*",
        "arn:aws:s3:::your-skyvern-uploads",
        "arn:aws:s3:::your-skyvern-uploads/*"
      ]
    }
  ]
}
```

### Creating the buckets

Create the S3 buckets in your AWS account:

```bash
aws s3 mb s3://your-skyvern-artifacts --region us-east-1
aws s3 mb s3://your-skyvern-screenshots --region us-east-1
aws s3 mb s3://your-skyvern-browser-sessions --region us-east-1
aws s3 mb s3://your-skyvern-uploads --region us-east-1
```

Bucket names must be globally unique across all AWS accounts. Add a unique prefix or suffix (e.g., your company name or a random string).
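One way to do that is to derive all four names from a single random suffix, so they stay recognizably grouped. A sketch (the `your-skyvern` prefix is just the placeholder used in this guide):

```shell
# Generate one random 8-character hex suffix and apply it to every bucket,
# so all four names share the same unique identifier.
suffix=$(od -An -N4 -tx4 /dev/urandom | tr -d ' \n')
for name in artifacts screenshots browser-sessions uploads; do
  echo "your-skyvern-${name}-${suffix}"
done
```

Feed the printed names into the `aws s3 mb` commands and the matching `.env` entries.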
### Bucket configuration recommendations

**Lifecycle rules:** Configure automatic deletion of old artifacts to control costs.

```bash
aws s3api put-bucket-lifecycle-configuration \
  --bucket your-skyvern-artifacts \
  --lifecycle-configuration '{
    "Rules": [
      {
        "ID": "DeleteOldArtifacts",
        "Status": "Enabled",
        "Filter": {},
        "Expiration": { "Days": 30 }
      }
    ]
  }'
```

**Encryption:** Enable server-side encryption for data at rest.

**Access logging:** Enable access logging for audit trails.

---

## Azure Blob Storage

Store artifacts in Azure Blob Storage for Azure-based deployments.

### Configuration

```bash .env
SKYVERN_STORAGE_TYPE=azureblob
AZURE_STORAGE_ACCOUNT_NAME=yourstorageaccount
AZURE_STORAGE_ACCOUNT_KEY=your-storage-account-key
AZURE_STORAGE_CONTAINER_ARTIFACTS=skyvern-artifacts
AZURE_STORAGE_CONTAINER_SCREENSHOTS=skyvern-screenshots
AZURE_STORAGE_CONTAINER_BROWSER_SESSIONS=skyvern-browser-sessions
AZURE_STORAGE_CONTAINER_UPLOADS=skyvern-uploads

# Pre-signed URL expiration (seconds) - default 24 hours
PRESIGNED_URL_EXPIRATION=86400

# Maximum upload file size (bytes) - default 10MB
MAX_UPLOAD_FILE_SIZE=10485760
```

### Creating the storage account and containers

Using the Azure CLI:

```bash
# Create resource group
az group create --name skyvern-rg --location eastus

# Create storage account
az storage account create \
  --name yourstorageaccount \
  --resource-group skyvern-rg \
  --location eastus \
  --sku Standard_LRS

# Get the account key
az storage account keys list \
  --account-name yourstorageaccount \
  --resource-group skyvern-rg \
  --query '[0].value' -o tsv

# Create containers
az storage container create --name skyvern-artifacts --account-name yourstorageaccount
az storage container create --name skyvern-screenshots --account-name yourstorageaccount
az storage container create --name skyvern-browser-sessions --account-name yourstorageaccount
az storage container create --name skyvern-uploads --account-name yourstorageaccount
```

### Using Managed Identity (recommended)

For
Azure VMs or AKS, use Managed Identity instead of storage account keys:

1. Enable Managed Identity on your VM or AKS cluster
2. Grant the identity the "Storage Blob Data Contributor" role on the storage account
3. Omit `AZURE_STORAGE_ACCOUNT_KEY` from your configuration

---

## What gets stored where

| Artifact type | S3 bucket / Azure container | Contents |
|---------------|-----------------------------|----------|
| Artifacts | `*-artifacts` | Extracted data, HTML snapshots, logs |
| Screenshots | `*-screenshots` | Page screenshots at each step |
| Browser sessions | `*-browser-sessions` | Saved browser state for profiles |
| Uploads | `*-uploads` | User-uploaded files for workflows |

Videos (recordings) are currently always stored locally in `VIDEO_PATH`, regardless of storage type.

---

## Pre-signed URLs

When artifacts are stored in S3 or Azure Blob, Skyvern generates pre-signed URLs for access. These URLs:

- Expire after `PRESIGNED_URL_EXPIRATION` seconds (default: 24 hours)
- Allow direct download without additional authentication
- Are included in task responses (`recording_url`, `screenshot_urls`)

Adjust the expiration based on your needs:

```bash .env
# 1 hour
PRESIGNED_URL_EXPIRATION=3600

# 7 days
PRESIGNED_URL_EXPIRATION=604800
```

---

## Migrating from local to cloud storage

To migrate existing artifacts from local storage to S3 or Azure:

### S3

```bash
# Sync local artifacts to S3
aws s3 sync ./artifacts s3://your-skyvern-artifacts/

# Update configuration
# SKYVERN_STORAGE_TYPE=s3
# ...

# Restart Skyvern
docker compose restart skyvern
```

### Azure

```bash
# Install azcopy
# https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10

# Sync local artifacts to Azure
azcopy copy './artifacts/*' 'https://yourstorageaccount.blob.core.windows.net/skyvern-artifacts' --recursive

# Update configuration and restart
```

After migration, new artifacts will be stored in cloud storage, but existing local artifacts won't be automatically moved.
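For a reviewable migration, the commands above can be wrapped in a small dry-run script that prints each step instead of executing it. A sketch (paths and bucket name are the same placeholders used in this guide; set `RUN=1` to actually execute):

```shell
#!/bin/sh
# Migration sketch: each command is printed unless RUN=1 is set,
# so the plan can be reviewed before any data moves.
SRC=./artifacts
DEST=s3://your-skyvern-artifacts

run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "+ $*"
  fi
}

run aws s3 sync "$SRC" "$DEST/"
run docker compose restart skyvern
```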
The sync is a one-time operation.

---

## Disk space management

### Local storage

Monitor disk usage and clean up old artifacts periodically:

```bash
# Check disk usage
du -sh ./artifacts ./videos ./har ./log

# Remove artifacts older than 30 days
find ./artifacts -type f -mtime +30 -delete
find ./videos -type f -mtime +30 -delete
```

### Cloud storage

Use lifecycle policies to automatically delete old objects:

**S3:** Configure lifecycle rules to expire objects after N days.

**Azure:** Configure lifecycle management policies in the Azure portal or via the CLI.

---

## Troubleshooting

### "Access Denied" errors

- Verify your credentials are correct
- Check that the IAM permissions include all required actions
- Ensure the buckets/containers exist
- For S3, verify that `AWS_REGION` matches your bucket's region

### Pre-signed URLs not working

- Check that the URL hasn't outlived `PRESIGNED_URL_EXPIRATION`
- Verify the server's clock is accurate; significant clock skew produces invalid signatures
- Pre-signed URLs carry their own authentication, so they work even with S3 Block Public Access enabled; however, URLs signed with temporary credentials (e.g., an assumed IAM role) stop working when those credentials expire, even if `PRESIGNED_URL_EXPIRATION` is longer

### Artifacts not appearing

- Check the Skyvern logs for storage errors: `docker compose logs skyvern | grep -i storage`
- Verify that `SKYVERN_STORAGE_TYPE` is set correctly
- Ensure network connectivity to the storage endpoint

---

## Next steps

- Return to the Docker setup guide
- Deploy Skyvern at scale