Releases: apify/crawlee
v3.2.2
v3.2.1
v3.2.0
3.2.0 (2023-02-07)
Bug Fixes
- allow `userData` option in `enqueueLinksByClickingElements` (#1749) (736f85d), closes #1617
- clone `request.userData` when creating new request object (#1728) (222ef59), closes #1725
- Correctly compute `pendingRequestCount` in request queue (#1765) (946535f)
- declare missing dependency on `tslib` (27e96c8), closes #1747
- ensure CrawlingContext interface is inferred correctly in route handlers (aa84633)
- KeyValueStore: big buffers should not crash (#1734) (2f682f7), closes #1732 #1710
- memory-storage: don't fail when storage already purged (#1737) (8694027), closes #1736
- update playwright to 1.29.2 and make peer dep. less strict (#1735) (c654fcd), closes #1723
- utils: add missing dependency on `ow` (bf0e03c), closes #1716
Features
v3.1.4
v3.1.3
v3.1.2
3.1.2 (2022-11-15)
Bug Fixes
- injectJQuery in context does not survive navs (#1661) (493a7cf)
- make router error message more helpful for undefined routes (#1678) (ab359d8)
- MemoryStorage: correctly respect the desc option (#1666) (b5f37f6)
- requestHandlerTimeout timing (#1660) (493ea0c)
- shallow clone browserPoolOptions before normalization (#1665) (22467ca)
- support headfull mode in playwright js project template (ea2e61b)
- support headfull mode in puppeteer js project template (e6aceb8)
Features
v3.1.1
3.1.1 (2022-11-07)
Bug Fixes
- `utils.playwright.blockRequests` warning message (#1632) (76549eb)
- concurrency option override order (#1649) (7bbad03)
- handle non-error objects thrown gracefully (#1652) (c3a4e1a)
- mark session as bad on failed requests (#1647) (445ae43)
- support reloading of sessions with lots of retries (ebc89d2)
- fix type errors when `playwright` is not installed (#1637) (de9db0c)
- upgrade to puppeteer@19.x (#1623) (ce36d6b)
Features
v3.1.0
3.1.0 (2022-10-13)
Bug Fixes
- add overload for `KeyValueStore.getValue` with defaultValue (#1541) (e3cb509)
- add retry attempts to methods in CLI (#1588) (9142e59)
- allow `label` in `enqueueLinksByClickingElements` options (#1525) (18b7c25)
- basic-crawler: handle `request.noRetry` after `errorHandler` (#1542) (2a2040e)
- build storage classes by using `this` instead of the class (#1596) (2b14eb7)
- correct some typing exports (#1527) (4a136e5)
- do not hide stack trace of (retried) Type/Syntax/ReferenceErrors (469b4b5)
- enqueueLinks: ensure the enqueue strategy is respected alongside user patterns (#1509) (2b0eeed)
- enqueueLinks: prevent useless request creations when filtering by user patterns (#1510) (cb8fe36)
- export `Cookie` from `crawlee` metapackage (7b02ceb)
- handle redirect cookies (#1521) (2f7fc7c)
- http-crawler: do not hang on POST without payload (#1546) (8c87390)
- remove undeclared dependency on core package from puppeteer utils (827ae60)
- support TypeScript 4.8 (#1507) (4c3a504)
- wait for persist state listeners to run when event manager closes (#1481) (aa550ed)
Features
- add `Dataset.exportToCSV` and `Dataset.exportToJSON`
- add `Dataset.getData()` shortcut (522ed6e)
- add `utils.downloadListOfUrls` to crawlee metapackage (7b33b0a)
- add `utils.parseOpenGraph()` (#1555) (059f85e)
- add `utils.playwright.compileScript` (#1559) (2e14162)
- add `utils.playwright.infiniteScroll` (#1543) (60c8289), closes #1528
- add `utils.playwright.saveSnapshot` (#1544) (a4ceef0)
- add global `useState` helper (#1551) (2b03177)
- allow disabling storage persistence (#1539) (f65e3c6)
- bump puppeteer support to 17.x (#1519) (b97a852)
- core: add `forefront` option to `enqueueLinks` helper (f8755b6), closes #1595
- don't close page before calling errorHandler (#1548) (1c8cd82)
- enqueue links by clicking for Playwright (#1545) (3d25ade)
- error tracker (#1467) (6bfe1ce)
- make the CLI download directly from GitHub (#1540) (3ff398a)
- router: add userdata generic to addHandler (#1547) (19cdf13)
- use JSON5 for `INPUT.json` to support comments (#1538) (09133ff)
v3.0.4
v3.0.3
What's Changed
- fix: add missing configuration to CheerioCrawler constructor by @AndreyBykov in #1432
- fix: sendRequest types by @szmarczak in #1445
- fix: respect `headless` option in browser crawlers by @B4nan in #1455
- fix: make `CheerioCrawlerOptions` type more loose by @B4nan in d871d8c
- fix: improve dockerfiles and project templates by @B4nan in 7c21a64
- feat: add `utils.playwright.blockRequests()` by @barjin in #1447
- feat: http-crawler by @szmarczak in #1440
- feat: prefer `/INPUT.json` files for `KeyValueStore.getInput()` by @vladfrangu in #1453
- feat: jsdom-crawler by @szmarczak in #1451
- feat: add `RetryRequestError` + add error to the context for BC by @vladfrangu in #1443
- feat: add `keepAlive` to crawler options by @B4nan in #1452
Full Changelog: v3.0.2...v3.0.3