Rohit Rajan | fa961c5f03 | feat: reuse existing page instance | 2025-11-21 13:21:18 +05:30
Rohit Rajan | 71bbca16df | fix: merge conflict | 2025-11-21 00:41:04 +05:30
amhsirak | 8b17ef7e90 | chore: remove test.ts | 2025-11-21 00:23:11 +05:30
Karishma Shukla | a1515c2abf | Merge pull request #889 from getmaxun/markdownify (feat: scrape [html + markdown]) | 2025-11-21 00:14:31 +05:30
Rohit | 8171e517d7 | Merge branch 'develop' into persist-fix | 2025-11-20 21:08:22 +05:30
Rohit Rajan | b2b5a914e7 | chore: add telemetry for scrape robots and runs | 2025-11-20 19:40:48 +05:30
Rohit Rajan | 0987183bac | chore: increase goto timeout scrape 100s | 2025-11-20 18:59:32 +05:30
Rohit Rajan | e90cd9961e | feat: add html scrape support | 2025-11-20 18:49:39 +05:30
amhsirak | 7f48e276f1 | chore: lint | 2025-11-20 17:23:04 +05:30
amhsirak | 691dedc351 | fix: lesser restrictions | 2025-11-20 17:22:33 +05:30
amhsirak | 930c7b6c74 | fix: lesser restrictions | 2025-11-20 16:56:43 +05:30
amhsirak | 8346c9637a | chore: cleanup | 2025-11-20 15:37:26 +05:30
amhsirak | 28d2288f6e | Merge branch 'markdownify' of https://github.com/getmaxun/maxun into markdownify | 2025-11-20 15:35:45 +05:30
amhsirak | ddcb3dfe4b | feat: extend turndown + clean | 2025-11-20 15:35:31 +05:30
Rohit Rajan | d444756f67 | chore: add static markdown import | 2025-11-20 13:33:10 +05:30
Rohit Rajan | 05d2d1b7fe | feat: add optional type and url fields | 2025-11-20 13:25:43 +05:30
Rohit Rajan | b19e02f137 | feat: add markdown route | 2025-11-20 13:22:54 +05:30
Rohit Rajan | 0d45d1d7f1 | feat: markdownify manual, scheduled, api runs | 2025-11-20 13:19:12 +05:30
amhsirak | 9257b1564e | feat: pass url param | 2025-11-20 04:22:06 +05:30
amhsirak | 0a7a1eb9b8 | fix: make baseUrl optional param | 2025-11-20 04:21:41 +05:30
amhsirak | 839f9fa5ce | fix: plugin imports | 2025-11-20 04:10:12 +05:30
amhsirak | b14d84d83a | fix: -rm debug turndown | 2025-11-20 03:51:53 +05:30
amhsirak | b4644ba106 | feat: use turndown | 2025-11-20 03:51:27 +05:30
amhsirak | 767fa5fe4f | chore: del go | 2025-11-20 03:48:30 +05:30
amhsirak | 1a291c22b6 | chore: cleanup | 2025-11-20 03:38:19 +05:30
amhsirak | 3fd9bb5e0e | chore(debug): test | 2025-11-20 03:01:42 +05:30
amhsirak | 1d65f90033 | feat: use parser to scrape | 2025-11-20 03:01:18 +05:30
amhsirak | 66d8291282 | fix: export convert fxn | 2025-11-20 02:47:20 +05:30
amhsirak | 0837ac50b9 | fix: go parser path | 2025-11-20 02:44:05 +05:30
amhsirak | 6c93cbc9a2 | feat: html -> markdown | 2025-11-20 02:42:44 +05:30
amhsirak | 713d37465d | feat: to markdown | 2025-11-20 00:00:56 +05:30
amhsirak | da48d46f2a | chore: build | 2025-11-19 23:59:39 +05:30
amhsirak | f0d6712c3e | chore: build | 2025-11-19 23:59:33 +05:30
amhsirak | ec49565c44 | chore: ignore build files | 2025-11-19 23:59:14 +05:30
amhsirak | 7da464755d | wip: to markdown | 2025-11-19 22:50:46 +05:30
amhsirak | 6c8850a0a7 | chore: link replace | 2025-11-19 22:35:25 +05:30
amhsirak | 4158896e3c | chore: link replace | 2025-11-19 22:34:18 +05:30
amhsirak | 0fa5397b45 | debug(temporary): test url -> llm text | 2025-11-18 23:42:20 +05:30
amhsirak | f22f6ef83d | debug(temporary): turndown x amzn | 2025-11-18 23:41:27 +05:30
Rohit Rajan | 801ae5a365 | fix: scrapeList pagination persistence and action data separation | 2025-11-18 14:25:09 +05:30
amhsirak | 1651763fc2 | fix: better markdown output | 2025-11-17 21:53:04 +05:30
amhsirak | 28f1bf8510 | fix: better markdown output | 2025-11-17 21:52:39 +05:30
amhsirak | dae4e83412 | wip: markdown + plain text | 2025-11-17 21:18:11 +05:30
amhsirak | a3891f6813 | wip: markdown + plain text | 2025-11-17 21:14:23 +05:30
amhsirak | af9570659f | fix: get important content | 2025-11-17 20:50:25 +05:30
amhsirak | 191ac52ee3 | fix: return empty empty str on error | 2025-11-17 19:55:17 +05:30
amhsirak | 9b71cfc40c | fix: return empty empty str on error | 2025-11-17 19:54:28 +05:30
amhsirak | 560f5a3300 | feat: get llm ready text | 2025-11-17 19:51:34 +05:30
amhsirak | 0c9dc899c3 | feat: get input text for llm | 2025-11-17 19:50:28 +05:30
amhsirak | 994142ae40 | fix: define browser context | 2025-11-17 19:49:38 +05:30