sudo docker-compose ps
sudo docker-compose down
The configuration file at cli/config/config.yaml is used by the CLI tool and defines several settings, such as:
- Certstream logging options (include issuer CA, include root CA, include log source, etc.)
- Active classifier
- Data sources for features and training data
- Classifier thresholds for phishing predictions
Here's an example of what it looks like:
certstream:
  colors: true
  include_issuer_ca_name: true
  include_log_source: false
  include_root_ca_name: false
  include_seen_timestamp: false
classifier:
  active: 4_24_v1
data:
  benign_dir: /opt/streamingphish/training_data/benign/
  fqdn_keywords_dir: /opt/streamingphish/training_data/fqdn_keywords/
  keywords_dir: /opt/streamingphish/training_data/keywords/
  malicious_dir: /opt/streamingphish/training_data/malicious/
  similarity_words_dir: /opt/streamingphish/training_data/similarity_words/
  targeted_brands_dir: /opt/streamingphish/training_data/targeted_brands/
  tld_dir: /opt/streamingphish/training_data/tlds/
logging:
  enabled: true
  path: /opt/streamingphish/predictions/
logging_tiers:
  high:
    color: red
    threshold: 0.9
  low:
    color: cyan
    threshold: 0.6
  suspicious:
    color: yellow
    threshold: 0.75
system:
  log_path: /opt/streamingphish/system/
version: 1
The training_data/ folder is bind-mounted from the host system directly into the cli container. Any changes to the training data, features, keywords, targeted brands, or TLDs persist on the host system in the training_data/ folder regardless of the state of the underlying container.
The phishing domains used for training are in the training_data/malicious/ folder. The benign domains used for training are in the training_data/benign/ folder.
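As a rough illustration of how these two folders could be turned into a labeled training set, here is a minimal Python sketch. It assumes each folder contains plain-text files with one domain per line, which this page does not confirm. The function names load_domains and load_training_set are hypothetical and not part of the project. Labels follow the scoring convention used by the classifier (1 for phishing, 0 for benign).

```python
from pathlib import Path


def load_domains(folder):
    """Collect non-empty, stripped lines from every regular file in a folder.

    Assumption: files are plain text with one domain per line.
    """
    domains = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            for line in path.read_text(encoding="utf-8").splitlines():
                line = line.strip()
                if line:
                    domains.append(line)
    return domains


def load_training_set(root):
    """Return (domains, labels), with label 1 for malicious and 0 for benign."""
    malicious = load_domains(Path(root) / "malicious")
    benign = load_domains(Path(root) / "benign")
    labels = [1] * len(malicious) + [0] * len(benign)
    return malicious + benign, labels
```

Inside the cli container, root would be the directory that holds the malicious/ and benign/ folders, for example /opt/streamingphish/training_data as listed in the data section of the configuration.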
Fully-qualified domain names (FQDNs) predicted as phishing are written to a bind-mounted folder named predictions/. Log files are generated in this folder based on the scoring thresholds defined in cli/config/config.yaml. The score the classifier produces for an FQDN is always between 0 and 1 (1 == phishing, 0 == benign); FQDNs with higher scores are more likely to be phishing. The default thresholds are as follows:
- "High" threshold is 0.90 and above
- "Suspicious" threshold is between 0.75 and 0.90
- "Low" threshold is between 0.60 and 0.75
Any FQDN with a score of 0.60 or lower will not be logged.
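The tiering described above can be sketched in a few lines of Python. This is an illustration only, not the project's actual code: the function name tier_for_score is hypothetical, and the constants are copied from the default logging_tiers values. The handling of a score of exactly 0.75 is an assumption, since the description above does not say which tier it falls into.

```python
# Default thresholds copied from the logging_tiers section of the example config.
HIGH = 0.9
SUSPICIOUS = 0.75
LOW = 0.6


def tier_for_score(score):
    """Map a classifier score in [0, 1] to a logging tier name, or None if not logged."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score >= HIGH:
        return "high"        # 0.90 and above
    if score >= SUSPICIOUS:
        return "suspicious"  # assumption: exactly 0.75 counts as suspicious
    if score > LOW:
        return "low"         # scores of 0.60 or lower are not logged
    return None
```

For example, tier_for_score(0.8) returns "suspicious", while tier_for_score(0.6) returns None because a score of 0.60 or lower is not logged.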