Releases: apify/crawlee
v3.5.2
v3.5.1
v3.5.0
3.5.0 (2023-07-31)
Bug Fixes
- cleanup worker stuff from memory storage to fix `vitest` (#2004) (d2e098c), closes #1999
- core: add requests from URL list (`requestsFromUrl`) to the queue in batches (418fbf8), closes #1995
- core: support relative links in `enqueueLinks` explicitly provided via `urls` option (#2014) (cbd9d08), closes #2005
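
The batching fix above can be pictured with a small sketch. This is only an illustration of splitting a URL list into fixed-size batches, not Crawlee's actual implementation; `toBatches` and the batch size of 2 are hypothetical names and values chosen for the example.

```typescript
// Hypothetical helper (not part of Crawlee): split a list into fixed-size batches
// so that each batch can be added to a queue in a separate call.
function toBatches<T>(items: readonly T[], batchSize: number): T[][] {
    if (!Number.isInteger(batchSize) || batchSize < 1) {
        throw new RangeError('batchSize must be a positive integer');
    }
    const batches: T[][] = [];
    for (let i = 0; i < items.length; i += batchSize) {
        batches.push(items.slice(i, i + batchSize));
    }
    return batches;
}

// Five placeholder URLs split into batches of two.
const exampleUrls = Array.from({ length: 5 }, (_, i) => `https://example.com/page/${i + 1}`);
const exampleBatches = toBatches(exampleUrls, 2);
console.log(exampleBatches.map((batch) => batch.length)); // [ 2, 2, 1 ]
```

Adding requests in batches keeps any single queue operation small, which is presumably the motivation when a URL list is large.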
Features
- add `closeCookieModals` context helper for Playwright and Puppeteer (#1927) (98d93bb)
- add support for `sameDomainDelaySecs` (#2003) (e796883), closes #1993
- basic-crawler: allow configuring the automatic status message (#2001) (3eb4e4c)
- core: use `RequestQueue.addBatchedRequests()` in `enqueueLinks` helper (4d61ca9), closes #1995
- retire session on proxy error (#2002) (8c0928b), closes #1912
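
As a rough illustration of what a per-domain delay such as `sameDomainDelaySecs` implies, the sketch below tracks the last access time per hostname and reports how long a caller should wait. It is a simplified model under stated assumptions, not Crawlee's code: the `DomainThrottle` class, its `reserve` method, and the use of the hostname as the "domain" key are all hypothetical.

```typescript
// Hypothetical sketch (not Crawlee's implementation): enforce a minimum delay
// between two requests to the same hostname.
class DomainThrottle {
    private readonly lastAccess = new Map<string, number>();
    private readonly delayMillis: number;

    constructor(delaySecs: number) {
        this.delayMillis = delaySecs * 1000;
    }

    // Returns 0 and records the access if `url` may be requested at time `now` (ms);
    // otherwise returns the number of milliseconds still to wait.
    reserve(url: string, now: number): number {
        const host = new URL(url).hostname;
        const last = this.lastAccess.get(host);
        if (last !== undefined && now - last < this.delayMillis) {
            return this.delayMillis - (now - last);
        }
        this.lastAccess.set(host, now);
        return 0;
    }
}

const throttle = new DomainThrottle(2);
console.log(throttle.reserve('https://example.com/a', 0));    // 0
console.log(throttle.reserve('https://example.com/b', 500));  // 1500
console.log(throttle.reserve('https://example.org/a', 500));  // 0
console.log(throttle.reserve('https://example.com/b', 2000)); // 0
```

Timestamps are passed in explicitly so the behaviour is deterministic; a real crawler would read the clock and reschedule the request instead.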
v3.4.2
v3.4.1
v3.4.0
3.4.0 (2023-06-12)
Bug Fixes
- respect `<base>` when enqueuing (#1936) (aeef572)
- stop lerna from overwriting the copy.ts results (#1946) (69bed40)
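
To see why a `<base>` element matters when enqueuing, here is a minimal sketch that resolves a link with the standard WHATWG `URL` class. The `resolveLink` helper and the example URLs are hypothetical and only illustrate the resolution rule; they do not reproduce how Crawlee implements the fix.

```typescript
// Hypothetical helper (not part of Crawlee): resolve `href` against the page URL,
// taking an optional <base href="..."> value into account.
// Returns null when a URL cannot be parsed.
function resolveLink(href: string, pageUrl: string, baseHref?: string): string | null {
    try {
        // The <base> value may itself be relative to the page URL.
        const base = baseHref ? new URL(baseHref, pageUrl) : new URL(pageUrl);
        return new URL(href, base).href;
    } catch {
        return null;
    }
}

const page = 'https://example.com/shop/list.html';
console.log(resolveLink('item/1', page));             // https://example.com/shop/item/1
console.log(resolveLink('item/1', page, '/catalog/')); // https://example.com/catalog/item/1
```

Without the `<base>` value the same relative link points at a different path, so ignoring it would enqueue the wrong URL.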
Features
- add LinkeDOMCrawler (#1907) (1c69560), closes /github.com/apify/crawlee/pull/1890#issuecomment-1533271694
- infiniteScroll has maxScrollHeight limit (#1945) (44997bb)
v3.3.3
v3.3.2
3.3.2 (2023-05-11)
Bug Fixes
- MemoryStorage: cache requests in `RequestQueue` (#1899) (063dcd1)
- respect config object when creating `SessionPool` (#1881) (db069df)
Features
- allow running single crawler instance multiple times (#1844) (9e6eb1e), closes #765
- HttpCrawler: add `parseWithCheerio` helper to `HttpCrawler` (#1906) (ff5f76f)
- router: allow inline router definition (#1877) (2d241c9)
- RQv2 memory storage support (#1874) (049486b)
- support alternate storage clients when opening storages (#1901) (661e550)
v3.3.1
3.3.1 (2023-04-11)
Bug Fixes
- infiniteScroll() not working in Firefox (#1826) (4286c5d), closes #1821
- jsdom: add timeout to the window.load wait when `runScripts` are enabled (806de31)
- jsdom: delay closing of the window and add some polyfills (2e81618)
- jsdom: use no-op `enqueueLinks` in http crawlers when parsing fails (fd35270)
- MemoryStorage: handling of readable streams for key-value stores when setting records (#1852) (a5ee37d), closes #1843
- start status message logger after the crawl actually starts (5d1df7a)
- status message - total requests (#1842) (710f734)
- Storage: queue up opening storages to prevent issues in concurrent calls (#1865) (044c740)
- templates: added missing '@types/node' peer dependency (#1860) (d37a7e2)
- try to detect stuck request queue and fix its state (#1837) (95a9f94)
Features
v3.3.0
3.3.0 (2023-03-09)
Bug Fixes
- add `proxyUrl` to `DownloadListOfUrlsOptions` (779be1e), closes #1780
- CheerioCrawler: pass ixXml down to response parser (#1807) (af7a5c4), closes #1794
- ignore invalid URLs in `enqueueLinks` in browser crawlers (#1803) (5ac336c)
- MemoryStorage: request queues race conditions causing crashes (#1806) (083a9db), closes #1792
- MemoryStorage: RequestQueue should respect `forefront` (#1816) (b68e86a), closes #1787
- MemoryStorage: RequestQueue#handledRequestCount should update (#1817) (a775e4a), closes #1764
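
The `forefront` fix above concerns ordering: a request added with `forefront` should be picked up before requests that are already waiting. The toy queue below illustrates that expectation only. `TinyQueue` is a hypothetical in-memory stand-in and does not mirror the MemoryStorage internals.

```typescript
// Hypothetical toy queue (not Crawlee's MemoryStorage): requests added with
// `forefront: true` go to the front, all others to the back.
class TinyQueue {
    private readonly pending: string[] = [];

    add(url: string, options: { forefront?: boolean } = {}): void {
        if (options.forefront) {
            this.pending.unshift(url);
        } else {
            this.pending.push(url);
        }
    }

    // Returns the next URL to process, or undefined when the queue is empty.
    next(): string | undefined {
        return this.pending.shift();
    }
}

const queue = new TinyQueue();
queue.add('https://example.com/1');
queue.add('https://example.com/2');
queue.add('https://example.com/urgent', { forefront: true });
console.log(queue.next()); // https://example.com/urgent
console.log(queue.next()); // https://example.com/1
```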