Splash authentication credentials potentially leaked to target websites
High severity
GitHub Reviewed
Published Oct 5, 2021 in scrapy-plugins/scrapy-splash • Updated Oct 26, 2024
Reviewed: Oct 5, 2021
Published by the National Vulnerability Database: Oct 5, 2021
Published to the GitHub Advisory Database: Oct 6, 2021
Last updated: Oct 26, 2024
Description
Impact
If you use HttpAuthMiddleware (i.e. the http_user and http_pass spider attributes) for Splash authentication, any non-Splash request will expose your credentials to the request target. This includes robots.txt requests sent by Scrapy when the ROBOTSTXT_OBEY setting is set to True.
Patches
Upgrade to scrapy-splash 0.8.0 and use the new SPLASH_USER and SPLASH_PASS settings instead to set your Splash authentication credentials safely.
Workarounds
If you cannot upgrade, set your Splash request credentials on a per-request basis, using the splash_headers request parameter, instead of defining them globally using the HttpAuthMiddleware.
Alternatively, make sure all your requests go through Splash. That includes disabling the robots.txt middleware.
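The per-request workaround can be sketched as follows. This is a minimal illustration, not the project's reference code: it assumes the Splash instance is protected with HTTP Basic authentication, and the helper names basic_auth_value and splash_auth_kwargs, as well as the sample credentials, are hypothetical. The helpers use only the standard library to build the Authorization header and wrap it in a dict intended for the splash_headers request parameter.

```python
import base64


def basic_auth_value(user: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def splash_auth_kwargs(user: str, password: str) -> dict:
    """Build keyword arguments carrying credentials for a single Splash request.

    Hypothetical helper: the result is meant to be unpacked into a
    scrapy-splash SplashRequest, so the header travels with that
    request's Splash headers rather than being set globally.
    """
    return {"splash_headers": {"Authorization": basic_auth_value(user, password)}}


if __name__ == "__main__":
    # Placeholder credentials, for illustration only.
    print(splash_auth_kwargs("user", "userpass"))
```

In a spider, this might be used as yield SplashRequest(url, self.parse, **splash_auth_kwargs(user, password)), with the http_user and http_pass spider attributes left unset so that HttpAuthMiddleware does not attach credentials to non-Splash requests.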
For more information
If you have any questions or comments about this advisory: