- Improved how the :ref:`default retry policy <default-retry-policy>` handles :ref:`temporary download errors <zapi-temporary-download-errors>`. Before, 3 HTTP 429 responses followed by a single HTTP 520 response would have prevented a retry. Now, unrelated responses and errors do not count towards the HTTP 520 retry limit.
- Improved how the :ref:`default retry policy <default-retry-policy>` handles network errors. Before, after 15 minutes of unsuccessful responses (e.g. HTTP 429), any network error would prevent a retry. Now, network errors must happen 15 minutes in a row, without different errors in between, to stop retries.
- Implemented an optional :ref:`aggressive retry policy <aggressive-retry-policy>`, which retries more errors more often, and could be useful for long crawls or websites with a low success rate.
- Improved the exception that is raised when passing an invalid retrying policy object to a :ref:`Python client <api>`.
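The retry-counting changes described in the entries above can be illustrated with a small toy model. This is only a sketch, not the library's implementation: the function names, the string labels for outcomes, and the limit of 4 attempts (that is, 3 retries) for HTTP 520 are assumptions chosen for illustration.

```python
from typing import Sequence, Tuple

# Hypothetical limit: 4 attempts in total, i.e. 3 retries of HTTP 520.
HTTP_520_ATTEMPT_LIMIT = 4
# The 15-minute window for network errors, in seconds.
NETWORK_ERROR_WINDOW = 15 * 60


def should_retry_520_old(outcomes: Sequence[str]) -> bool:
    """Old behavior (toy model): every attempt counts towards the limit."""
    return outcomes[-1] != "520" or len(outcomes) < HTTP_520_ATTEMPT_LIMIT


def should_retry_520_new(outcomes: Sequence[str]) -> bool:
    """New behavior (toy model): only HTTP 520 outcomes count."""
    return outcomes[-1] != "520" or outcomes.count("520") < HTTP_520_ATTEMPT_LIMIT


def should_retry_network_new(events: Sequence[Tuple[int, str]]) -> bool:
    """New behavior (toy model) for a trailing network error.

    ``events`` holds (seconds_since_first_attempt, kind) tuples. Only the
    uninterrupted run of network errors at the end counts towards the window.
    """
    run_start = events[-1][0]
    for timestamp, kind in reversed(events):
        if kind != "network":
            break
        run_start = timestamp
    return events[-1][0] - run_start < NETWORK_ERROR_WINDOW
```

With this model, the sequence ``["429", "429", "429", "520"]`` stops retrying under the old rule but is retried under the new one, and a network error that arrives after 15 minutes of HTTP 429 responses is still retried because the run of network errors has only just started.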
- :class:`~zyte_api.RequestError` now has a :data:`~zyte_api.RequestError.query` attribute with the Zyte API request parameters that caused the error.
- :class:`~zyte_api.ZyteAPI` and :class:`~zyte_api.AsyncZyteAPI` sessions no
  longer need to be used as context managers, and can instead be closed with a
  ``close()`` method.
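A minimal sketch of closing a session explicitly instead of using a context manager. The helper name ``fetch_with_explicit_close``, the example request parameters, and the lazy import are assumptions made for illustration; running it requires the ``zyte-api`` package and a valid API key.

```python
def fetch_with_explicit_close(url: str, api_key: str) -> dict:
    """Fetch one URL through a session that is closed explicitly."""
    # Imported inside the function so that merely defining this sketch
    # does not require zyte-api to be installed.
    from zyte_api import ZyteAPI

    client = ZyteAPI(api_key=api_key)
    session = client.session()
    try:
        # The request parameters here are only an example query.
        return session.get({"url": url, "httpResponseBody": True})
    finally:
        # Release the session's resources even if the request fails.
        session.close()
```

The ``try``/``finally`` block plays the role the context manager used to play: the session is closed whether or not the request raises.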
- Removed Python 3.7 support.
- Added :class:`~zyte_api.ZyteAPI` and :class:`~zyte_api.AsyncZyteAPI` to provide both sync and async Python interfaces with a cleaner API.
- Deprecated ``zyte_api.aio``:

  - Replace ``zyte_api.aio.client.AsyncClient`` with the new :class:`~zyte_api.AsyncZyteAPI` class.
  - Replace ``zyte_api.aio.client.create_session`` with the new :meth:`AsyncZyteAPI.session <zyte_api.AsyncZyteAPI.session>` method.
  - Import ``zyte_api.aio.errors.RequestError``, ``zyte_api.aio.retry.RetryFactory`` and ``zyte_api.aio.retry.zyte_api_retrying`` directly from ``zyte_api`` now.
- When using the command-line interface, you can now use ``--store-errors`` to have error responses be stored alongside successful responses.
- Improved the documentation.
- Include the Zyte API request ID value in a new ``.request_id`` attribute in ``zyte_api.aio.errors.RequestError``.
- ``AsyncClient`` now lets you set a custom user agent to send to Zyte API.
- Increased the client timeout to match the server’s.
- Mentioned the ``api_key`` parameter of ``AsyncClient`` in the docs example.
- w3lib >= 2.1.1 is required in install_requires, to ensure that URLs are escaped properly.
- unnecessary ``requests`` library is removed from install_requires
- fixed tox 4 support
- Fixed an issue with submitting URLs which contain unescaped symbols
- New "retrying" argument for AsyncClient.__init__, which allows setting a custom retrying policy for the client
- ``--dont-retry-errors`` argument in the CLI tool
- Connections are no longer reused between requests.
  This reduces the number of ``ServerDisconnectedError`` exceptions.
- Bump minimum ``aiohttp`` version to 3.8.0, as earlier versions don't support brotli decompression of responses
- Declared Python 3.11 support
- Network errors, like server timeouts or disconnections, are now retried for up to 15 minutes, instead of 5 minutes.
- ``Brotli`` is now a required dependency. As a result, requests are sent with ``Accept-Encoding: br``
  and brotli responses are decompressed automatically.
- Internal AggStats class is cleaned up:

  - ``AggStats.n_extracted_queries`` attribute is removed, as it was a duplicate of ``AggStats.n_results``
  - ``AggStats.n_results`` is renamed to ``AggStats.n_success``
  - ``AggStats.n_input_queries`` is removed as redundant and misleading; AggStats got a new ``AggStats.n_processed`` property instead.

  This change is backwards incompatible if you used stats directly.
- ``aiohttp.client_exceptions.ClientConnectorError`` is now treated as a network error and retried accordingly.
- Removed the unused ``zyte_api.sync`` module.
- Temporary download errors are now retried 3 times by default. They were not retried in previous releases.
This release contains usability improvements to the command-line script:

- Instead of ``python -m zyte_api`` you can now run it as ``zyte-api``;
- the type of the input file (``--intype`` argument) is guessed now, based on file extension and content; .jl, .jsonl and .txt files are supported.
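One way such input-type guessing could work is sketched below. This is an illustrative assumption, not the command-line script's actual code: the function names, the ``"jl"`` and ``"txt"`` return values, and the content heuristic (every non-empty line parses as a JSON object) are all hypothetical.

```python
import json
from pathlib import Path
from typing import Sequence


def _is_json_object(line: str) -> bool:
    """Return True if the line parses as a JSON object."""
    try:
        return isinstance(json.loads(line), dict)
    except json.JSONDecodeError:
        return False


def guess_intype(file_name: str, lines: Sequence[str]) -> str:
    """Guess the input type from the file extension, then from the content."""
    suffix = Path(file_name).suffix.lower()
    if suffix in {".jl", ".jsonl"}:
        return "jl"
    if suffix == ".txt":
        return "txt"
    # Unknown extension: fall back to inspecting the content.
    non_empty = [line for line in lines if line.strip()]
    if non_empty and all(_is_json_object(line) for line in non_empty):
        return "jl"
    return "txt"
```

Checking the extension first keeps the common cases cheap, and the content check only matters for files with an unrecognized or missing extension.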
- Minor documentation fix
- Remove support for Python 3.6
- Added support for Python 3.10
- Default timeouts changed
- CHANGES.rst updated properly
- Initial release.