Scraper with Selenium config behaves differently with or without headless option #65
Enhancement suggestion:
After some research, it seems quite complicated to change request headers in Selenium (the easiest way is to route traffic through a local proxy). There also appear to be several differences between Chrome and headless Chrome, possibly by design. The best solution would therefore be to offer an option to use Firefox (geckodriver) instead of Chrome, which actually solves the problem here (tested).
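A minimal sketch of what such an option could look like. The `use_firefox` flag and the helper function are hypothetical illustrations, not part of the project's actual API:

```python
# Sketch: selecting geckodriver (Firefox) instead of chromedriver.
# The flag name and function are illustrative, not the project's code.
from selenium import webdriver

def make_driver(use_firefox=False):
    if use_firefox:
        options = webdriver.FirefoxOptions()
        options.add_argument('-headless')  # Firefox's headless flag
        return webdriver.Firefox(options=options)
    options = webdriver.ChromeOptions()
    options.add_argument('headless')
    return webdriver.Chrome(options=options)
```

Requires geckodriver on the PATH for the Firefox branch; Selenium 4 accepts `options=` for both browsers.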
Crawling/testing is not working with that config: the XPath doesn't seem to find any element.
The crawler works when I comment out
options_selenium.add_argument('headless')
in masterspider.py line 101. This is very odd, as chromedriver is supposed to behave identically with or without headless.
PagesJaunes is known to have implemented scraping protections. This may be related.
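One plausible explanation: headless Chrome advertises itself in its default user agent (it contains "HeadlessChrome" instead of "Chrome"), which anti-bot systems can match on. A common, though not guaranteed, workaround is overriding the user-agent string; the UA value below is an illustrative example, not one taken from the project:

```python
# Sketch: headless Chrome's default user agent contains "HeadlessChrome",
# which scraping protections can detect. Overriding the UA is a common
# workaround; the exact UA string here is an illustrative assumption.
from selenium import webdriver

options_selenium = webdriver.ChromeOptions()
options_selenium.add_argument('headless')
options_selenium.add_argument(
    'user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36'
)
driver = webdriver.Chrome(options=options_selenium)
```

If the site also fingerprints rendering or JavaScript properties, a UA override alone may not be enough, which is why switching to Firefox was suggested above.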