Commit Graph

240 Commits

Author SHA1 Message Date
pedrohsdb
52c38a66c1 reverting lawys PR to fix workable (#3579) 2025-10-01 11:51:07 -07:00
pedrohsdb
80c7ea2577 Revert "skip malformed css selector" (#3578) 2025-10-01 11:38:07 -07:00
pedrohsdb
0fce84a384 skip malformed css selector (#3577) 2025-10-01 11:24:43 -07:00
LawyZheng
6b90f10221 remove valid css selector check (#3524) 2025-09-25 11:06:28 +08:00
LawyZheng
55bc6bd367 helper function for wait animation (#3240) 2025-08-20 14:28:01 +08:00
LawyZheng
7823ff9c46 start build tree from HTML element (#3237) 2025-08-20 10:58:18 +08:00
LawyZheng
458b7e43ab remove hard wait time in input action (#3229) 2025-08-19 14:26:25 +08:00
LawyZheng
1588d8018b improve dynamic wait when multiple frames (#3228) 2025-08-19 14:09:03 +08:00
LawyZheng
9a359ebfde decrease parse input prompt token (#3210) 2025-08-16 10:05:38 +08:00
LawyZheng
654cdb14e4 fix wait for animation end (#3201) 2025-08-15 15:24:54 +08:00
LawyZheng
6b8d29a23d fix stop waiting bug (#3197) 2025-08-15 03:51:39 +08:00
LawyZheng
cac4792f38 remove hard waiting time in scraping (#3195) 2025-08-15 02:24:59 +08:00
LawyZheng
f971cf8e58 optimize cache element tree logic (#3194) 2025-08-15 02:06:08 +08:00
LawyZheng
81767e3189 optimize scraping part 4 (#3192) 2025-08-15 01:55:59 +08:00
LawyZheng
04fd540cd5 stop building element tree again and again when drawing boudingbox (#3191) 2025-08-15 01:40:39 +08:00
LawyZheng
2556d04e70 fix scraping edge case (#3186) 2025-08-14 15:04:15 +08:00
LawyZheng
65e9cb10e9 optimize scraping part 2 (#3185) 2025-08-14 14:51:43 +08:00
LawyZheng
30606645ea optimize scraping part 1 (#3184) 2025-08-14 14:24:21 +08:00
LawyZheng
b88cf18590 optimize scraping part 3 (#3183) 2025-08-14 14:12:16 +08:00
Shuchang Zheng
52dc5a510b fix economy element tree trimming (#3182) 2025-08-13 21:45:14 -07:00
Shuchang Zheng
434bbff459 add support_empty_page and wait_seconds to the scrape_website interface (#3181) 2025-08-13 19:22:50 -07:00
devsy-bot[bot]
e3a3309e9c fix: change scraper log level from info to debug (#3143) 2025-08-08 08:56:30 -07:00
Co-authored-by: devsy-bot <no-reply@devsy.ai>
Co-authored-by: Claude <noreply@anthropic.com>
LawyZheng
f33906509f fix dom listener bug (#3095) 2025-08-04 11:10:49 +08:00
LawyZheng
ecc0e2e17d better failure reason for blank page (#3049) 2025-07-29 14:40:54 +08:00
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
LawyZheng
bff7544b83 fix scraping issue (#3035) 2025-07-25 12:53:22 +08:00
LawyZheng
4093a7fab0 fix style map parsing (#3029) 2025-07-25 00:50:06 +08:00
LawyZheng
fcd22017b7 make scraping timeout configurable (#2991) 2025-07-19 13:18:12 +08:00
Jonathan Dobson
f5d7639de8 allow empty urls (#2984) 2025-07-18 10:20:33 -04:00
Jonathan Dobson
c13c36f99e distinctify failed scrapes due to no url (#2977) 2025-07-17 16:19:16 -04:00
LawyZheng
5363d33dcc fix interactable detecting (#2941) 2025-07-15 03:31:34 +08:00
LawyZheng
dd9710eb9f add force textural element as interactable exp (#2936) 2025-07-14 13:09:40 +08:00
LawyZheng
95ab8295ce laminar integration (#2887) 2025-07-07 14:43:10 +08:00
Shuchang Zheng
cb17dbbb6f extend select agent to support date picker (#2849) 2025-07-01 13:12:39 +08:00
Shuchang Zheng
e448721468 add pointer detect for h element (#2846) 2025-06-30 14:20:20 +08:00
Co-authored-by: lawyzheng <lawyzheng1106@gmail.com>
Asher Foa
a6bf217559 Fix typos (#2807) 2025-06-28 01:26:21 +00:00
Shuchang Zheng
775da18878 current viewpoint screenshot and scrolling n screenshot (#2716) 2025-06-14 14:59:50 +08:00
Co-authored-by: lawyzheng <lawyzheng1106@gmail.com>
Asher Foa
effd0c4911 Add pyupgrade pre-commit hook + modernize python code (#2611) 2025-06-10 18:52:38 +00:00
Shuchang Zheng
c531395c39 fix input interactable detect (#2621) 2025-06-09 06:45:32 +00:00
Co-authored-by: lawyzheng <lawyzheng1106@gmail.com>
Shuchang Zheng
4fd8f5fdad add error handle for click event check (#2596) 2025-06-05 14:17:28 +08:00
Shuchang Zheng
a46f7248a7 enhance clickable element detect (#2595) 2025-06-05 14:05:34 +08:00
Shuchang Zheng
47709dc0d8 support cross domain css sheet parse (#2535) 2025-05-30 09:51:59 +08:00
Shuchang Zheng
49ef1aaa07 fix li interactable (#2511) 2025-05-29 23:03:39 +08:00
Shuchang Zheng
cf08ca951e Fix chrome user data dir problem (#2503) 2025-05-28 22:41:06 -07:00
Shuchang Zheng
31d6dbdacd stop removing target attr when scraping (#2495) 2025-05-28 15:55:01 +08:00
Shuchang Zheng
cca2772765 fix new tab a issue (#2437) 2025-05-23 13:18:42 +08:00
Shuchang Zheng
24a73b7af0 select option on click (#2391) 2025-05-20 00:08:55 +08:00
Shuchang Zheng
193df54c6e fix dom listener bug (#2344) 2025-05-15 01:22:38 +08:00
Shuchang Zheng
e927940800 no to build full tree but still adding structure representive div (#2342) 2025-05-14 16:54:04 +08:00
Shuchang Zheng
6148f6cd72 add structured div in the element tree (#2339) 2025-05-14 16:24:37 +08:00
Shuchang Zheng
d3ea8ef85b generate element xpath (#2335) 2025-05-14 02:11:16 +08:00