urlfinderlib

urlfinderlib

Python library for finding URLs in documents and arbitrary data and checking their validity.

Basic usage

from urlfinderlib import find_urls

with open('/path/to/file', 'rb') as f:
    print(find_urls(f.read())

base_url usage

If you are trying to find URLs inside of an HTML file, the paths in the URLs are likely relative to their location on the server hosting the HTML. You can use the base_url parameter in this case to extract these "relative" URLs.

from urlfinderlib import find_urls

with open('/path/to/file', 'rb') as f:
    print(find_urls(f.read(), base_url='http://somewebsite.com/')

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

THIS REPOSITORY HAS MOVED TO https://github.com/ace-ecosystem/urlfinderlib

urlfinderlib

Files

README.md

Latest commit

History

README.md

File metadata and controls

THIS REPOSITORY HAS MOVED TO https://github.com/ace-ecosystem/urlfinderlib

urlfinderlib