Commit Graph

1814 Commits

Author SHA1 Message Date
amhsirak
6efd92ace9 feat: shared schedule utils to avoid connection leaks 2025-11-28 15:40:59 +05:30
amhsirak
58e6da8d6a feat: shared pgboss singleton for job queue ops 2025-11-28 15:40:13 +05:30
amhsirak
447314abb0 feat: add pool config 2025-11-28 15:37:13 +05:30
Rohit Rajan
71bbca16df fix: merge conflict 2025-11-21 00:41:04 +05:30
amhsirak
8b17ef7e90 chore: remove test.ts 2025-11-21 00:23:11 +05:30
Karishma Shukla
a1515c2abf Merge pull request #889 from getmaxun/markdownify: feat: scrape [html + markdown] 2025-11-21 00:14:31 +05:30
Rohit
8171e517d7 Merge branch 'develop' into persist-fix 2025-11-20 21:08:22 +05:30
Rohit Rajan
b2b5a914e7 chore: add telemetry for scrape robots and runs 2025-11-20 19:40:48 +05:30
Rohit Rajan
0987183bac chore: increase goto timeout scrape 100s 2025-11-20 18:59:32 +05:30
Rohit Rajan
e90cd9961e feat: add html scrape support 2025-11-20 18:49:39 +05:30
amhsirak
7f48e276f1 chore: lint 2025-11-20 17:23:04 +05:30
amhsirak
691dedc351 fix: lesser restrictions 2025-11-20 17:22:33 +05:30
amhsirak
930c7b6c74 fix: lesser restrictions 2025-11-20 16:56:43 +05:30
amhsirak
8346c9637a chore: cleanup 2025-11-20 15:37:26 +05:30
amhsirak
28d2288f6e Merge branch 'markdownify' of https://github.com/getmaxun/maxun into markdownify 2025-11-20 15:35:45 +05:30
amhsirak
ddcb3dfe4b feat: extend turndown + clean 2025-11-20 15:35:31 +05:30
Rohit Rajan
d444756f67 chore: add static markdown import 2025-11-20 13:33:10 +05:30
Rohit Rajan
05d2d1b7fe feat: add optional type and url fields 2025-11-20 13:25:43 +05:30
Rohit Rajan
b19e02f137 feat: add markdown route 2025-11-20 13:22:54 +05:30
Rohit Rajan
0d45d1d7f1 feat: markdownify manual, scheduled, api runs 2025-11-20 13:19:12 +05:30
amhsirak
9257b1564e feat: pass url param 2025-11-20 04:22:06 +05:30
amhsirak
0a7a1eb9b8 fix: make baseUrl optional param 2025-11-20 04:21:41 +05:30
amhsirak
839f9fa5ce fix: plugin imports 2025-11-20 04:10:12 +05:30
amhsirak
b14d84d83a fix: -rm debug turndown 2025-11-20 03:51:53 +05:30
amhsirak
b4644ba106 feat: use turndown 2025-11-20 03:51:27 +05:30
amhsirak
767fa5fe4f chore: del go 2025-11-20 03:48:30 +05:30
amhsirak
1a291c22b6 chore: cleanup 2025-11-20 03:38:19 +05:30
amhsirak
3fd9bb5e0e chore(debug): test 2025-11-20 03:01:42 +05:30
amhsirak
1d65f90033 feat: use parser to scrape 2025-11-20 03:01:18 +05:30
amhsirak
66d8291282 fix: export convert fxn 2025-11-20 02:47:20 +05:30
amhsirak
0837ac50b9 fix: go parser path 2025-11-20 02:44:05 +05:30
amhsirak
6c93cbc9a2 feat: html -> markdown 2025-11-20 02:42:44 +05:30
amhsirak
713d37465d feat: to markdown 2025-11-20 00:00:56 +05:30
amhsirak
da48d46f2a chore: build 2025-11-19 23:59:39 +05:30
amhsirak
f0d6712c3e chore: build 2025-11-19 23:59:33 +05:30
amhsirak
ec49565c44 chore: ignore build files 2025-11-19 23:59:14 +05:30
amhsirak
7da464755d wip: to markdown 2025-11-19 22:50:46 +05:30
amhsirak
6c8850a0a7 chore: link replace 2025-11-19 22:35:25 +05:30
amhsirak
4158896e3c chore: link replace 2025-11-19 22:34:18 +05:30
amhsirak
0fa5397b45 debug(temporary): test url -> llm text 2025-11-18 23:42:20 +05:30
amhsirak
f22f6ef83d debug(temporary): turndown x amzn 2025-11-18 23:41:27 +05:30
Rohit Rajan
801ae5a365 fix: scrapeList pagination persistence and action data separation 2025-11-18 14:25:09 +05:30
amhsirak
1651763fc2 fix: better markdown output 2025-11-17 21:53:04 +05:30
amhsirak
28f1bf8510 fix: better markdown output 2025-11-17 21:52:39 +05:30
amhsirak
dae4e83412 wip: markdown + plain text 2025-11-17 21:18:11 +05:30
amhsirak
a3891f6813 wip: markdown + plain text 2025-11-17 21:14:23 +05:30
amhsirak
af9570659f fix: get important content 2025-11-17 20:50:25 +05:30
amhsirak
191ac52ee3 fix: return empty empty str on error 2025-11-17 19:55:17 +05:30
amhsirak
9b71cfc40c fix: return empty empty str on error 2025-11-17 19:54:28 +05:30
amhsirak
560f5a3300 feat: get llm ready text 2025-11-17 19:51:34 +05:30