From f0fa3ffd28d424f5192cb8831985b940d02db33c Mon Sep 17 00:00:00 2001
From: J <91372088+jarelllama@users.noreply.github.com>
Date: Mon, 23 Dec 2024 16:19:33 +0800
Subject: [PATCH] Finalise adding 165antifraud source and cleanup sources.md

---
 SOURCES.md                  | 52 ++++++++++++++++++-------------------
 scripts/retrieve_domains.sh | 17 +++++++++++-
 scripts/update_readme.sh    |  1 +
 3 files changed, 43 insertions(+), 27 deletions(-)

diff --git a/SOURCES.md b/SOURCES.md
index 0aabe7c9a..15edc3c37 100644
--- a/SOURCES.md
+++ b/SOURCES.md
@@ -6,32 +6,32 @@ Any data hidden behind account creation/commercial licenses is never used.
 
 Only active sources are used to automatically retrieve domains.
 
-| Source | Type | Active | Excluded from light |
-|:--- |:--- |:--- |:--- |
-| [ANFRAS](https://anfras.com/fakeshops/) | Fake | | |
-| [Artists Against 419](https://db.aa419.org/fakebankslist.php) | Advance-fee | Yes | |
-| [DFPI's Crypto Scam Tracker](https://dfpi.ca.gov/crypto-scams/) | Crypto | | |
-| [DGA Detector](https://github.com/exp0se/dga_detector) | DGA | Yes | Yes |
-| [Emerging Threats](https://rules.emergingthreats.net/) | Malware | Yes | |
-| [Fake Website Buster](https://fakewebsitebuster.com/) | Fake | | |
-| [FakeWebshopListHUN](https://github.com/FakesiteListHUN/FakeWebshopListHUN) | Fake | Yes | |
-| [Google's Custom Search JSON API](https://developers.google.com/custom-search/v1/introduction) | Fake | Yes | |
-| [Greek Tax Scam](https://github.com/hagezi/dns-blocklists/issues/4191) | Phishing | | |
-| [GunTab](https://www.guntab.com/scam-websites) | Fake | Yes | Yes |
-| [Jeroen Gui's phishing & scam feeds](https://jeroengui.be/anti-phishing-project/)[^1] | Phishing | Yes | ~ |
-| [PetScams.com](https://petscams.com/) | Fake | | |
-| [PhishStats](https://phishstats.info/)[^2] | Phishing | Yes | ~ |
-| [Regex Matching](https://github.com/jarelllama/Scam-Blocklist/blob/main/config/phishing_targets.csv) | Phishing | Yes | Yes |
-| [Scam Directory](https://scam.directory/) | Any | Yes | |
-| [Scam.Delivery](https://scam.delivery/) | Non-delivery | | |
-| [ScamAdvisor](https://www.scamadviser.com/) | Any | Yes | |
-| [Stop 419 Scams and Scammers](https://www.stop419scams.com/) | Any | | |
-| [StopGunScams.com](https://stopgunscams.com/) | Fake | Yes | |
-| [URLCrazy](https://github.com/urbanadventurer/urlcrazy) | Cybersquatting | Yes | |
-| [dnstwist](https://github.com/elceef/dnstwist) | Cybersquatting | Yes | |
-| [openSquat](https://github.com/atenreiro/opensquat) | Phishing | | |
-| [r/Scams](https://www.reddit.com/r/Scams/) | Any | | |
-| [xRuffKez's NRD List](https://github.com/xRuffKez/NRD) | NRD | - | - |
+| Source | Active | Excluded from light |
+|:--- |:--- |:--- |
+| [165 Anti-fraud](https://data.gov.tw/dataset/160055) | Yes | Yes |
+| [ANFRAS](https://anfras.com/fakeshops/) | | |
+| [Artists Against 419](https://db.aa419.org/fakebankslist.php) | Yes | |
+| [DFPI's Crypto Scam Tracker](https://dfpi.ca.gov/crypto-scams/) | | |
+| [DGA Detector](https://github.com/exp0se/dga_detector) | Yes | Yes |
+| [Emerging Threats](https://rules.emergingthreats.net/) | Yes | |
+| [Fake Website Buster](https://fakewebsitebuster.com/) | | |
+| [FakeWebshopListHUN](https://github.com/FakesiteListHUN/FakeWebshopListHUN) | Yes | |
+| [Google's Custom Search JSON API](https://developers.google.com/custom-search/v1/introduction) | Yes | |
+| [GunTab](https://www.guntab.com/scam-websites) | Yes | Yes |
+| [Jeroen Gui's phishing & scam feeds](https://jeroengui.be/anti-phishing-project/)[^1] | Yes | ~ |
+| [PetScams.com](https://petscams.com/) | | |
+| [PhishStats](https://phishstats.info/)[^2] | Yes | ~ |
+| [Regex Matching](https://github.com/jarelllama/Scam-Blocklist/blob/main/config/phishing_targets.csv) | Yes | Yes |
+| [Scam Directory](https://scam.directory/) | Yes | |
+| [Scam.Delivery](https://scam.delivery/) | | |
+| [ScamAdvisor](https://www.scamadviser.com/) | Yes | |
+| [Stop 419 Scams and Scammers](https://www.stop419scams.com/) | | |
+| [StopGunScams.com](https://stopgunscams.com/) | Yes | |
+| [URLCrazy](https://github.com/urbanadventurer/urlcrazy) | Yes | |
+| [dnstwist](https://github.com/elceef/dnstwist) | Yes | |
+| [openSquat](https://github.com/atenreiro/opensquat) | | |
+| [r/Scams](https://www.reddit.com/r/Scams/) | | |
+| [xRuffKez's NRD List](https://github.com/xRuffKez/NRD) | - | - |
 
 [^1]: Only the scam feed is used for the light version.
 [^2]: Only domains found in the NRD feed are used for the light version.
diff --git a/scripts/retrieve_domains.sh b/scripts/retrieve_domains.sh
index 4779f1321..ba25d47cc 100644
--- a/scripts/retrieve_domains.sh
+++ b/scripts/retrieve_domains.sh
@@ -34,7 +34,22 @@ readonly STRICT_DOMAIN_REGEX='[[:alnum:]][[:alnum:].-]+\.[[:alnum:]-]*[a-z]{2,}[
 
 readonly -a SOURCES=(
     source_165antifraud
-
+    source_aa419
+    source_dga_detector
+    source_cybersquatting
+    source_emerging_threats
+    source_fakewebshoplisthun
+    source_guntab
+    source_jeroengui_phishing
+    source_jeroengui_scam
+    source_manual
+    source_phishstats
+    source_phishstats_nrd
+    source_regex
+    source_scamadviser
+    source_scamdirectory
+    source_stopgunscams
+    source_google_search
 )
 
 # Function 'source' calls on the respective functions of each source to
diff --git a/scripts/update_readme.sh b/scripts/update_readme.sh
index fb2644646..0f9b5b2a0 100644
--- a/scripts/update_readme.sh
+++ b/scripts/update_readme.sh
@@ -45,6 +45,7 @@ Light version: $(grep -cF '||' lists/adblock/scams_light.txt)
 
 New domains after filtering:
 Today | Monthly | %Monthly | %Filtered | Source
+$(print_stats '165 Anti-fraud')
 $(print_stats Cybersquatting)
 $(print_stats 'DGA Detector')
 $(print_stats 'Emerging Threats')