jarelllama/Scam-Blocklist

Jarelllama's Scam Blocklist

Blocklist for newly created scam and phishing domains automatically retrieved daily using Google Search API, automated detection, and other public sources.

This blocklist aims to be an alternative to blocking all newly registered domains (NRDs), since many, but not all, NRDs are malicious. It does this by detecting new malicious domains shortly after their registration date. Sources include:

  • Public databases
  • Google Search indexing to find common scam site templates
  • Open source tools such as dnstwist to detect cybersquatting techniques like typosquatting, doppelganger domains, and IDN homograph attacks
  • Regular expression matching for phishing NRDs. See the list of expressions here
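
The regular expression source above can be illustrated with a minimal sketch. The two patterns and the function name `match_nrds` are hypothetical and chosen only for illustration; the project's actual expressions are kept in its own list and are not reproduced here.

```python
import re

# Hypothetical patterns for illustration only; they are NOT the
# expressions used by the blocklist.
PATTERNS = [
    re.compile(r"paypal[-.]?(login|secure|verify)"),
    re.compile(r"(login|verify)[-.]?paypal"),
]


def match_nrds(nrds, patterns=PATTERNS):
    """Return the newly registered domains that match any pattern."""
    return [d for d in nrds if any(p.search(d) for p in patterns)]


if __name__ == "__main__":
    sample = ["paypal-login-help.com", "example.org", "verify-paypal.net"]
    for domain in match_nrds(sample):
        print(domain)
```

Only NRDs are matched in this sketch, which mirrors the idea of flagging suspicious names among recent registrations rather than scanning all known domains.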

A list of all sources can be found in SOURCES.md with config files here.

The automated retrieval is done daily at 16:00 UTC.

Downloads

Format             Syntax
Adblock Plus       ||scam.com^
Wildcard Domains   scam.com
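
The two syntaxes differ only in the wrapping around the domain, so converting between them is mechanical. The sketch below assumes one entry per line; the helper names `to_adblock` and `to_wildcard` are hypothetical and not part of the repository.

```python
def to_adblock(domain):
    """Wrap a bare domain in Adblock Plus syntax, e.g. ||scam.com^."""
    return f"||{domain.strip()}^"


def to_wildcard(rule):
    """Strip Adblock Plus wrapping to recover the bare domain."""
    rule = rule.strip()
    if rule.startswith("||") and rule.endswith("^"):
        return rule[2:-1]
    return rule  # already a bare domain


if __name__ == "__main__":
    print(to_adblock("scam.com"))
    print(to_wildcard("||scam.com^"))
```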

This blocklist is integrated into Hagezi's Threat Intelligence Feed (full version). For extended protection, please use his list instead.

Statistics

Total domains: 178877
Light version: 18065

New domains after filtering:
Today | Monthly | %Monthly | %Filtered | Source
   19 |    1974 |      2 % |      36 % | Emerging Threats
  146 |    2132 |      2 % |      18 % | FakeWebshopListHUN
    0 |     446 |      0 % |       3 % | Google Search
  794 |   15414 |     19 % |       9 % | Jeroengui phishing feed
    9 |      98 |      0 % |       8 % | Jeroengui scam feed
  586 |   35457 |     45 % |      22 % | PhishStats
   96 |    9415 |     11 % |       0 % | PhishStats (NRDs)
  348 |   19897 |     25 % |       1 % | Regex Matching (NRDs)
    9 |     154 |      0 % |      11 % | aa419.org
   26 |     940 |      1 % |       1 % | dnstwist (NRDs)
    0 |    1478 |      1 % |      32 % | guntab.com
    0 |     166 |      0 % |       8 % | scam.directory
    0 |      46 |      0 % |      32 % | scamadviser.com
    0 |       8 |      0 % |       5 % | stopgunscams.com
 1937 |   78514 |    100 % |      19 % | All sources

- %Monthly: percentage out of total domains from all sources.
- %Filtered: percentage of dead, whitelisted and parked domains.
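
As a worked example of the %Monthly column, the sketch below divides a source's monthly count by the "All sources" monthly total. Truncating to a whole percent is an assumption; it is consistent with the rows shown above (for example, PhishStats (NRDs) at 9415 of 78514 is listed as 11 %), but the project's own calculation may differ. The function name `monthly_share` is hypothetical.

```python
def monthly_share(source_monthly, total_monthly):
    """Percentage of the all-sources monthly total, truncated to an integer."""
    return source_monthly * 100 // total_monthly


if __name__ == "__main__":
    total = 78514  # "All sources" monthly count from the table
    for name, count in [
        ("Emerging Threats", 1974),
        ("Jeroengui phishing feed", 15414),
        ("PhishStats (NRDs)", 9415),
    ]:
        print(f"{name}: {monthly_share(count, total)} %")
```

Note that the per-source monthly counts add up to more than the "All sources" figure, so the shares are computed against the listed total rather than against a sum of the rows.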

[Chart: domains over time (days)]

Courtesy of iam-py-test/blocklist_stats.

Other blocklists

Light version

For maintainers of collated blocklists who are concerned about size, a light version of the blocklist is available in the lists directory. Sources excluded from the light version are marked in SOURCES.md.

Note that dead and parked domains that become alive/unparked are not added back into the light version due to limitations in the way these domains are recorded.

NSFW Blocklist

A blocklist for NSFW domains is available in Adblock Plus format here: nsfw.txt.

Details
  • Domains are automatically retrieved from the Tranco Top Sites Ranking daily
  • Dead domains are removed daily
  • Note that resurrected domains are not added back into the blocklist
  • Note that parked domains are not checked for in this blocklist

Total domains: 12516

This blocklist includes not only adult videos but also NSFW content of the artistic variety (rule34, illustrations, etc.).

Malware Blocklist

A blocklist for malicious domains extracted from Proofpoint's Emerging Threats rulesets can be found here: jarelllama/Emerging-Threats.

Automated filtering process

  • Domains are filtered against an actively maintained whitelist
  • Domains are checked against the Tranco Top Sites Ranking for potential false positives which are then vetted manually
  • Common subdomains like 'www' are stripped
  • Only domains are included in the blocklist; URLs are stripped down to their domains and IP addresses are manually checked for resolving DNS records
  • Redundant rules are removed via wildcard matching. For example, 'abc.example.com' is a wildcard match of 'example.com' and, therefore, is redundant and removed. Wildcards are occasionally added to the blocklist manually to further optimize the number of entries
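
The automated steps above can be sketched roughly as follows. This is a simplified illustration, not the repository's code: the function names, the single `www.` prefix, and the sample whitelist are assumptions, and the Tranco check and the manual IP address review are left out.

```python
from urllib.parse import urlsplit

# Assumption: only "www" is stripped here; the real list of common
# subdomains may be longer.
COMMON_SUBDOMAINS = ("www.",)


def to_domain(entry):
    """Reduce a URL or hostname to a bare lowercase domain."""
    entry = entry.strip().lower()
    # urlsplit only parses the host when a "//" prefix is present.
    host = urlsplit(entry if "//" in entry else "//" + entry).hostname or ""
    for prefix in COMMON_SUBDOMAINS:
        if host.startswith(prefix):
            host = host[len(prefix):]
    return host


def remove_redundant(domains):
    """Drop any domain whose parent domain is already in the set."""
    unique = set(domains)
    kept = []
    for domain in sorted(unique):
        labels = domain.split(".")
        parents = {".".join(labels[i:]) for i in range(1, len(labels))}
        if not parents & unique:
            kept.append(domain)
    return kept


def filter_domains(entries, whitelist):
    """Normalize entries, apply the whitelist, then remove redundant rules."""
    domains = {to_domain(e) for e in entries}
    domains.discard("")
    domains -= set(whitelist)
    return remove_redundant(domains)


if __name__ == "__main__":
    raw = ["https://www.example.com/login", "abc.example.com", "scam.net", "good.org"]
    print(filter_domains(raw, whitelist={"good.org"}))
```

Here 'abc.example.com' is dropped because 'example.com' is already present, matching the wildcard example in the list above.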

The maintainer is notified of entries that require manual verification or intervention so that they can be remediated quickly.

The full filtering process can be viewed in the repository's code.

Dead domains

Dead domains are removed daily using AdGuard's Dead Domains Linter.

Dead domains that resolve again are added back to the blocklist.
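
The daily bookkeeping described above can be sketched as moving domains between an active set and a dead set. This is not how AdGuard's Dead Domains Linter works internally; the plain DNS lookup in `resolves` is a stand-in, and the names `resolves` and `update_dead` are hypothetical.

```python
import socket


def resolves(domain):
    """Stand-in liveness check: does the domain resolve via DNS?"""
    try:
        socket.getaddrinfo(domain, None)
        return True
    except (socket.gaierror, UnicodeError):
        return False


def update_dead(active, dead, is_alive=resolves):
    """Move dead domains out of the active set and resurrected ones back in.

    Returns the new (active, dead) sets.
    """
    newly_dead = {d for d in active if not is_alive(d)}
    resurrected = {d for d in dead if is_alive(d)}
    new_active = (active - newly_dead) | resurrected
    new_dead = (dead - resurrected) | newly_dead
    return new_active, new_dead


if __name__ == "__main__":
    # A fake liveness check keeps the example independent of the network.
    alive = {"a.com", "c.com"}
    print(update_dead({"a.com", "b.com"}, {"c.com", "d.com"}, lambda d: d in alive))
```

Passing the liveness check as a parameter keeps the set logic testable without network access.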

Dead domains removed today: 161
Resurrected domains added today: 411

Parked domains

Parked domains are removed weekly. A list of common parked domain messages is used to automatically detect these domains. This list can be viewed here: parked_terms.txt.

Parked sites that no longer contain any of the parked messages are assumed to be unparked and are added back to the blocklist.
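
A minimal sketch of this term-based check is shown below. The two sample terms are hypothetical and are not taken from parked_terms.txt; fetching the page is left out, so the function takes the page text directly. The names `is_parked` and `split_parked` are assumptions.

```python
# Hypothetical sample terms for illustration; the real list is parked_terms.txt.
SAMPLE_PARKED_TERMS = ["this domain is for sale", "domain is parked"]


def is_parked(page_text, parked_terms=SAMPLE_PARKED_TERMS):
    """Return True if the page contains any known parked-domain message."""
    text = page_text.lower()
    return any(term.lower() in text for term in parked_terms)


def split_parked(pages, parked_terms=SAMPLE_PARKED_TERMS):
    """Split a {domain: page_text} mapping into (parked, unparked) lists."""
    parked = sorted(d for d, text in pages.items() if is_parked(text, parked_terms))
    unparked = sorted(d for d in pages if d not in parked)
    return parked, unparked


if __name__ == "__main__":
    pages = {
        "old-scam.com": "<h1>This domain is for sale</h1>",
        "live-scam.net": "Welcome to our shop",
    }
    print(split_parked(pages))
```

Running the same check on previously parked domains gives the unparked set that would be added back to the blocklist.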

Tip

For list maintainers interested in integrating the parked domains as a source, a list of weekly-updated parked domains can be found here: parked_domains.txt (capped to newest 50000 entries).

Parked domains removed this month: 15835
Unparked domains added this month: 517
