Cookies not accepted #145

Open
stevenbaert opened this issue Feb 7, 2024 · 2 comments
Comments

@stevenbaert

Hi,

Great project! However, my crawl results are poor, because almost every page returns the cookie banner instead of the actual content:
"html": "We use cookies on this site to enhance your user experience\n\nBy clicking any link on this page you are giving your consent for us to set cookies.\n\nI ...

Is there any way to make the crawler accept the cookies?
S

@aztack

aztack commented Mar 1, 2024

I found that an empty resourceExclusions prevents config.cookie from being respected.
So I set resourceExclusions to the value in the README:

{   
  resourceExclusions: ['png','jpg','jpeg','gif','svg','css','js','ico','woff','woff2','ttf','eot','otf','mp4','mp3','webm','ogg','wav','flac','aac','zip','tar','gz','rar','7z','exe','dmg','apk','csv','xls','xlsx','doc','docx','pdf','epub','iso','dmg','bin','ppt','pptx','odt','avi','mkv','xml','json','yml','yaml','rss','atom','swf','txt','dart','webp','bmp','tif','psd','ai','indd','eps','ps','zipx','srt','wasm','m4v','m4a','webp','weba','m4b','opus','ogv','ogm','oga','spx','ogx','flv','3gp','3g2','jxr','wdp','jng','hief','avif','apng','avifs','heif','heic','cur','ico','ani','jp2','jpm','jpx','mj2','wmv','wma','aac','tif','tiff','mpg','mpeg','mov','avi','wmv','flv','swf','mkv','m4v','m4p','m4b','m4r','m4a','mp3','wav','wma','ogg','oga','webm','3gp','3g2','flac','spx','amr','mid','midi','mka','dts','ac3','eac3','weba','m3u','m3u8','ts','wpl','pls','vob','ifo','bup','svcd','drc','dsm','dsv','dsa','dss','vivo','ivf','dvd','fli','flc','flic','flic','mng','asf','m2v','asx','ram','ra','rm','rpm','roq','smi','smil','wmf','wmz','wmd','wvx','wmx','movie','wri','ins','isp','acsm','djvu','fb2','xps','oxps','ps','eps','ai','prn','svg','dwg','dxf','ttf','fnt','fon','otf','cab'],
  cookie
}

After that, the cookie was accepted.
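
The workaround above can be sketched as a small guard that fills in resourceExclusions whenever a cookie is configured but the exclusion list is empty. This is only an illustration: the CrawlConfigSketch and CookieOption shapes, the name/value cookie format, and the ensureCookieIsApplied helper are assumptions, not the project's actual types, and the fallback list is just a short subset of the README value quoted above.

```typescript
// Hypothetical shapes: this thread only names `resourceExclusions` and `cookie`.
interface CookieOption {
  name: string; // assumed field
  value: string; // assumed field
}

interface CrawlConfigSketch {
  resourceExclusions?: string[];
  cookie?: CookieOption;
}

// A short subset of the README list quoted above, used only as a fallback.
const FALLBACK_EXCLUSIONS: string[] = [
  "png", "jpg", "jpeg", "gif", "svg", "css", "ico", "woff", "woff2",
];

// Returns a config whose exclusion list is non-empty whenever a cookie is set,
// so the behaviour reported in this comment (cookie ignored) is avoided.
function ensureCookieIsApplied(config: CrawlConfigSketch): CrawlConfigSketch {
  const hasCookie = config.cookie !== undefined;
  const exclusionsEmpty = (config.resourceExclusions ?? []).length === 0;
  if (hasCookie && exclusionsEmpty) {
    return { ...config, resourceExclusions: [...FALLBACK_EXCLUSIONS] };
  }
  return config; // nothing to fix
}
```

A config that already lists at least one exclusion, or that has no cookie, is returned unchanged.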

@litong1

litong1 commented Nov 22, 2024

WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. browserContext.addCookies: Cookie should have a url or a domain/path pair
at PlaywrightCrawler.preNavigationHooks
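
For context, the message in this log comes from Playwright's browserContext.addCookies, which requires every cookie to carry either a url or both a domain and a path. A minimal sketch of satisfying that requirement, assuming the cookie starts out as a bare name/value pair and that the start URL of the crawl is known; the SimpleCookie and ScopedCookie types and the scopeCookieToUrl helper are hypothetical names used only for illustration:

```typescript
// Hypothetical input shape: a bare name/value pair with no scope information.
interface SimpleCookie {
  name: string;
  value: string;
}

// The same cookie with the domain/path pair that addCookies asks for.
interface ScopedCookie extends SimpleCookie {
  domain: string;
  path: string;
}

// Derives the domain from a known page URL and scopes the cookie to the
// whole site ("/"). Throws a TypeError if pageUrl is not a valid absolute URL.
function scopeCookieToUrl(cookie: SimpleCookie, pageUrl: string): ScopedCookie {
  const { hostname } = new URL(pageUrl);
  return {
    name: cookie.name,
    value: cookie.value,
    domain: hostname,
    path: "/",
  };
}
```

The resulting object has the name, value, domain, and path fields that addCookies accepts. Whether this resolves the warning here depends on how the crawler builds its cookie, which this thread does not show.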
