Open-Source No-Code Web Data Extraction Platform
Maxun lets you train a robot in 2 minutes and scrape the web on auto-pilot. Web data extraction doesn't get easier than this!
Installation
Environment Variables
| Variable | Mandatory | Description | If Not Set |
|---|---|---|---|
| NODE_ENV | Yes | Sets whether the app runs locally or in production. | |
| JWT_SECRET | Yes | Secret used to generate authentication tokens (JWTs). | |
| DB_NAME | Yes | Name of the database. | |
| DB_USER | Yes | Database user name. | |
| DB_PASSWORD | Yes | Password for the database user. | |
| DB_HOST | Yes | Host of the database server. | |
| DB_PORT | Yes | Port of the database server. | |
| ENCRYPTION_KEY | Yes | Key used for encryption. | |
| MINIO_ENDPOINT | Yes | Endpoint of the MinIO server. | |
| MINIO_PORT | Yes | Port of the MinIO server. | |
| MINIO_ACCESS_KEY | Yes | Access key for MinIO. | |
| GOOGLE_CLIENT_ID | Yes | Google OAuth client ID. | |
| GOOGLE_CLIENT_SECRET | Yes | Google OAuth client secret. | |
| GOOGLE_REDIRECT_URI | Yes | Google OAuth redirect URI. | |
| REDIS_HOST | Yes | Host of the Redis server. | |
| REDIS_PORT | Yes | Port of the Redis server. | |
| MAXUN_TELEMETRY | No | Controls Maxun telemetry. | |
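Since most of these variables are mandatory, it can help to check them before starting the app. The following is a minimal sketch of such a pre-flight check, not part of Maxun itself: the function name `missing_env_vars` and the list `REQUIRED_VARS` are assumptions made for illustration, and the variable names are taken from the table above.

```python
import os

# Variables marked "Yes" in the Mandatory column of the table above.
# MAXUN_TELEMETRY is optional and therefore not listed.
REQUIRED_VARS = [
    "NODE_ENV", "JWT_SECRET",
    "DB_NAME", "DB_USER", "DB_PASSWORD", "DB_HOST", "DB_PORT",
    "ENCRYPTION_KEY",
    "MINIO_ENDPOINT", "MINIO_PORT", "MINIO_ACCESS_KEY",
    "GOOGLE_CLIENT_ID", "GOOGLE_CLIENT_SECRET", "GOOGLE_REDIRECT_URI",
    "REDIS_HOST", "REDIS_PORT",
]


def missing_env_vars(env=None):
    """Return the mandatory variables that are unset or blank in `env`.

    `env` defaults to the current process environment; passing a plain
    dict makes the check easy to test. (Hypothetical helper.)
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` with no arguments inspects the current environment and returns a list of names; an empty list means every mandatory variable is set.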
How Does It Work?
Maxun lets you create custom robots that emulate user actions and extract data. A robot can perform any of these actions: Capture List, Capture Text, or Capture Screenshot. Once a robot is created, it keeps extracting data for you without manual intervention.
1. Robot Actions
- Capture List: Extracts structured, bulk items from a website. Example: scraping products from Amazon.
- Capture Text: Extracts individual pieces of text content from a website.
- Capture Screenshot: Takes full-page or visible-section screenshots of a website.
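To make the relationship between a robot and its actions concrete, here is a purely illustrative sketch. It is not Maxun's internal data model or API: the classes `ActionType`, `Action`, and `Robot`, the `target` field, and the example URL are all hypothetical. Only the three action names come from the list above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ActionType(Enum):
    # The three action names listed above.
    CAPTURE_LIST = "Capture List"
    CAPTURE_TEXT = "Capture Text"
    CAPTURE_SCREENSHOT = "Capture Screenshot"


@dataclass
class Action:
    type: ActionType
    target: str  # hypothetical: a human-readable label for what to capture


@dataclass
class Robot:
    name: str
    url: str
    actions: List[Action] = field(default_factory=list)

    def add(self, action_type: ActionType, target: str) -> "Robot":
        """Append an action and return the robot so calls can be chained."""
        self.actions.append(Action(action_type, target))
        return self

    def summary(self) -> List[str]:
        """Describe each configured action, in the order it was added."""
        return [f"{a.type.value}: {a.target}" for a in self.actions]


# Hypothetical robot that captures a product list and a screenshot.
robot = (
    Robot("products", "https://example.com/products")
    .add(ActionType.CAPTURE_LIST, "product cards")
    .add(ActionType.CAPTURE_SCREENSHOT, "full page")
)
```

In this sketch a robot is simply an ordered list of actions bound to a URL, matching the description of one robot performing any combination of the three actions.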
2. BYOP
BYOP (Bring Your Own Proxy) lets you connect external proxies to bypass anti-bot protection. Currently, proxies are configured per user. Soon you'll be able to configure a proxy per robot.
Features
- ✨ Extract Data With No-Code
- ✨ Handle Pagination & Scrolling
- ✨ Run Robots On A Specific Schedule
- ✨ Turn Websites to APIs
- ✨ Turn Websites to Spreadsheets
- ✨ Adapt To Website Layout Changes (coming soon)
- ✨ Extract Behind Login, With Two-Factor Authentication Support (coming soon)
- ✨ Integrations (currently Google Sheets)
- +++ A lot of amazing things soon!
Cloud
We offer a managed cloud version so you can run Maxun and extract data at scale without managing the infrastructure. Maxun Cloud also handles anti-bot detection, provides a large proxy network with automatic proxy rotation, and solves CAPTCHAs. If this interests you, join the cloud waitlist; we launch soon.
Note
This project is in the early stages of development. We're actively working to improve the product.
Contributing
Please refer to the Contribution Guide.
License
This project is licensed under AGPLv3.
Contributors
Thank you to the combined efforts of everyone who contributes!