False negative for a PDF? #192
Comments
Works fine on the latest git commit and on v1.2.0, with the PDF downloaded directly from the browser. |
That's strange. I tried it again with v1.2.0 and the result is still None. Maybe my file somehow got modified.
Environment: Windows 11 |
your hex |
Okay, it seems that I have a different version of the pdf compared to Nature's. The wget pdf has the |
3.pdf has HTML in front of the actual PDF data (Firefox opens it fine):
<html>
<head>
<meta http-equiv="refresh"
content="0;url=http://dnserrorassist.att.net/search/?q=http://bulk%2F%26srchgdeCid%3Daaaaaaaa%26t%3D0%26bc%3D" />
</head>
<body>
<script
type="text/javascript">window.location = "http://dnserrorassist.att.net/search/?q=" + escape(window.location) + "&r=" + escape(document.referrer) + "&t=0&srchgdeCid=aaaaaaaa&bc=";</script>
</body>
</html> |
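A minimal sketch of why a prefix like the one above can cause the miss, assuming the detector only compares the first bytes of the file against the `%PDF-` signature (an assumption about how the check works, not a quote of filetype's source). More lenient readers appear to scan further into the file for the header; the 1024-byte window used here is a commonly cited tolerance and is also an assumption. The helper names `starts_with_pdf_magic` and `find_pdf_header` are hypothetical, and the sample bytes are made up for illustration.

```python
# Hypothetical helpers for illustration; not part of the filetype library.
PDF_MAGIC = b"%PDF-"


def starts_with_pdf_magic(data: bytes) -> bool:
    """Strict check: the signature must sit at offset 0."""
    return data.startswith(PDF_MAGIC)


def find_pdf_header(data: bytes, window: int = 1024) -> int:
    """Lenient check: offset of the signature within the first
    `window` bytes, or -1 if it does not occur there."""
    return data[:window].find(PDF_MAGIC)


# Made-up sample: an HTML stub followed by the start of a PDF.
html_prefix = b"<html>\n<head>\n</head>\n<body>\n</body>\n</html>\n"
sample = html_prefix + b"%PDF-1.4\n%%EOF\n"

strict_hit = starts_with_pdf_magic(sample)   # prefix hides the signature
lenient_offset = find_pdf_header(sample)     # signature found after prefix
```

If the lenient offset is non-negative, slicing the buffer from that offset before running the strict check would be one way to work around leading junk on the caller's side.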
Huh, interesting. I would still prefer that filetype recognize this as a PDF, though. For example, PyMuPDF is able to open the PDF with no problem. I was able to reproduce the PDF miss on Google Colab with this code:
|
Hmm, I ran your wget command in an Ubuntu VM and got a clean PDF file. |
I have a pdf for which filetype is unable to recognize the extension.