VTISearch is a small utility for running a VirusTotal Intelligence search query. A query can include powerful search modifiers (listed in the documentation) that permit efficient threat research and hunting operations.
The program leverages v3 of the VirusTotal API. Please note that for Intelligence Search (and most other features of the program), you need a private API key, i.e., access to VirusTotal Enterprise. The API key is requested upon the first start and saved to the keyring of the system for security reasons.
By default, VTISearch retrieves information about the first 20 samples that are associated with the search query. However, results for up to 300 samples can be requested as well with the help of the -l
(--limit
) parameter.
Information includes the list of sample hashes (MD5, SHA1, SHA256, and - if existing - the VirusTotal vhash similarity hash), the type and size of the artifact, dates of (first and last) submission, and also detection statistics.
Additional details, e.g., scanning results per vendor, can be displayed when speciying the verbose (-v
) parameter. Up to three different verbosity levels are supported.
VTISearch is capable of downloading the samples as well as behavioral (dynamic analysis) reports for an Intelligence search. Dynamic analysis reports are also automatically parsed in order to extract network-based Indicators of Compromise (IOCs).
When using the --csv
option, results can be exported in CSV format for subsequent import in, e.g., Maltego or other graph visualization programs.
- Retrieves information for up to 300 artifacts (samples, domains, URLs) that are related to the search query.
- Information includes meta data as well as detailed scanning and detection results upon request.
- Supports the automatic download of associated samples and behavioral (dynamic analysis) reports.
- Behavioral reports are automatically scanned for network-based Indicators of Compromise (IOCs).
- Use of multiple workers to speed up operations.
- All information is categorized in different sub-folders. Detailed logs facilitate post-processing.
- Results can be exported in CSV format for subsequent relationship visualization with, e.g., Maltego.
-
Linux operating system (tested on Ubuntu 18.04)
-
Python 3.7+
-
pip3
-
vt-py
-
keyring
- By default, Python 3.6 is still installed on Ubuntu 18.04. You can install version 3.7 (or 3.8) with
$ sudo apt-get install python3.7
and then change to the new version with*
$ sudo update-alternatives --config python3
(* In case you should get an error message that no alternative had been found, please run sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.7 1
.)
If you subsequently run python3 --version
you should see the new version. Please note that you might have to reinstall respective packages for this version.
- Please reinstall pip3 with
$ sudo apt-get install --reinstall python3-pip
- Once Python 3.7 is avaialble, you can comfortably install all required packages as follows:
$ sudo pip3 install vt-py keyring
(I am globally installing the packages in this example. Please feel free to set up a virtual environment instead if you prefer.)
- Clone the VTISearch repository, and start the program:
$ git clone https://github.com/svo80/vti_search.git .
$ cd vti_search && python3 vti_search -h
usage: vti_search.py [-h] [-q QUERY] [-l LIMIT] [--logfile LOG]
[--download-dir DOWNLOAD_DIR] [-d] [-f SAMPLE_FILE]
[--no-behavior] [-v] [--csv]
optional arguments:
-h, --help Show this help message and exit
-q QUERY, --query QUERY Run a VirusTotal Intelligence search query.
-l LIMIT, --limit LIMIT Limits the number of samples to return.
--logfile LOG Name of the log file.
--download-dir DOWNLOAD_DIR Name of the directory where retrieved information will
be stored in.
-d, --download If set, also downloads samples from VirusTotal that
are referenced in an Intelligence search.
-f SAMPLE_FILE, --file SAMPLE_FILE Downloads samples that are referenced in a file.
--no-behavior If set, does not download behavior reports for
samples.
-v, --verbose If set, display verbose information about reports. Use
-vvv to see detailed scan results.
-u, --update-key If set, offers to enter a new API key.
-w, --workers WORKERS Number of concurrent workers.
--csv If set, display results as comma-separated values.
In the majority of cases, VTISearch will be executed with the -q
(--query
) parameter. This query is sent to VirusTotal via the v3
API. Respective samples will not be downloaded by default. However, this procedure can be easily activated with the -d
parameter.
$ python3 vti_search.py -q "evil.exe" -d
Rather than performing an Intelligence search, it is also possible processing a list of hashes that are stored in a file. As such, the program can be used as a quick sample downloader and IOC processor:
$ python3 python3 vti_search.py -f ./iocs.txt
The approaches can also be mixed. For instance, you might want to first check the results of a query slightly more in detail, adapt the list of samples in scope, and then re-run the program with the download option enabled for the updated sample list.
Alternatively, you might want to combine the results of an Intelligence search with indicators highlighted in a (third-party) report in order to create a more detailed overview of a specific campaign or operation.
By default, all log files, samples, and reports are stored in a separate directory (identified by its timestamp) that is created at program startup in the downloads
folder. If you prefer rather updating an existing directory, you can explicitly set the --download-dir
parameter.
For instance, assuming you would like to investigate an APT campaign, you can perform an Intelligence search, retrieve the first 100 results in detailed format, and store all information in a specific folder as follows:
$ python3 vt_search.py -d -q <query> -l 100 -vvv --download-dir=downloads/apt
The following queries are solely for demonstration purposes to illustrate search capabilities and possible use cases for the program:
- Show samples with detection statistics that were submitted after May 1, 2020 and were detected by more than five but less than 10 vendors.
$ python3 vti_search.py -q "ls:2020-05-01+ positives:5+ positives:10-" -v --no-behavior
- Show PDF documents in German that were delivered as an email attachment and contain an embedded JavaScript.
$ python3 vti_search.py -q "tag:attachment type:pdf lang:german tag:js-embedded"
- Show signed executables with a size of less than 300KB that were detected by more than five vendors.
$ python3 vti_search.py -q "size:300KB- positives:5+ tag:signed type:peexe"
- Show up to five samples, representing Microsoft Office documents that execute code upon opening and likely set an AutoRun key for persistence.
$ python3 vti_search.py -q "behavior:'currentversion\run\' type:docx tag:auto-open" -l 5
VTISearch supports exporting all information in CSV format. Exported contents are dependent on the verbosity level.
For instance, when specifying the -vvv
parameter, detailed anti-virus scanning reports will be exported into CSV format. On the other hand, when solely specifying the -v
parameter, higher level summary statistics will be created.
The list of network indicators retrieved from dynamic analysis sandbox reports can be exported in CSV format as well. This information can subsequently be loaded with, e.g., Maltego in order to visualize respective relationships.
$ python3 vti_search.py -d -q evil.exe -l 10 -vv
VTISearch - VirusTotal Intelligence Search - Version 0.1.0
Written by Stefan Voemel.
------------------------------------------------------------------------------------------
2axxxxxxxxxe4b2be454ed0dxxxxxxxxxx7db18e9780xxxxxxxx10dcabxxxxxx
MD5: xxxxx09dxxxxxc271cxxxxx5cb6xxxxx
Sha1: xxxxxx71bxxxxx4aaxxxx383xxxxce8xxxxe00xx
VHash: xxx04xx5xdxx1xx8xxxx2txxxx
Type: PE32 executable for MS Windows (GUI) Intel 80386 32-bit
Type Tag: peexe
Size: 73802
First submission: 2020-05-07 11:16:58
Last submission: 2020-05-07 11:16:58
Number of submissions: 1
Unique sources: 1
Malicious: 58
Suspicious: 0
Undetected: 14
[Host] 1xx.16.xxx.xxx:4444
798xxxx29xxxx4xxxe3dxxxa8xfxx3x2excxxxe7xxc4cxxxd4x4fx4x05xxxxxx
MD5: xxxx27xxxx28xxxx14xxxb34xxx13xxx
Sha1: xxxxb6xxx1f4xxxxdb26xxxx94xxxx5dxx61cxxx
VHash: xxx03xxx7dxxx2xx
Type: PE32 executable for MS Windows (console) Intel 80386 32-bit
Type Tag: peexe
Size: 4752
First submission: 2011-07-04 22:00:08
Last submission: 2020-05-06 13:39:21
Number of submissions: 1951
Unique sources: 1472
Malicious: 58
Suspicious: 0
Undetected: 14
[Host] 1xx.1xx.221.22:80
[Host] 1xx.1xx.131.241:80
[Host] 1xx.xxx.78.24:443
[Host] 1xx.xxx.78.25:443
[URL] hxxp://www.xxxxxxxx.com/ad.html
├── downloads Program data
│ └── <timestamp>
│ ├── artifacts.txt List of artifacts that were in scope
│ ├── behavior/ Directory for behavioral reports
│ ├── csv Directory with CSV files
│ │ ├── domains.csv Exported domains (if existing)
│ │ ├── network_iocs.csv Exported network indicators
│ │ ├── samples.csv Exported samples (if existing)
│ │ └── urls.csv Exorted URLs (if existing)
│ ├── log.txt Detailed log file with program runtime messages
│ ├── reports/ Directory for summary reports and network indicators (*.ioc)
│ │ ├── <sample> Textual summary report for a sample
│ │ ├── <sample.ioc> Extracted network indicators for a sample
│ │ ├── <sample.raw> Static analysis report for a sample in JSON format
│ ├── samples/ Directory for malware samples
│
├── lib Program libraries
│ ├── auxiliary.py
│ ├── sandboxes.py
│ └── vt.py
│
├── README.md
└── vti_search.py Main program file
I am not a professional developer or software engineer, and this program should be seen as a small helper tool. While I do enjoy periodically writing smaller utilities in my free time for Incident Response, malware analysis, and Threat Intelligence scenarios, I very rarely upload any of them.
The only reason why I did so for this program is, because the number of alternatives for the v3 VirusTotal API is currently still very much limited. This being said, I spend the vast majority of my time (i.e., my professional life) with leading security teams and offering strategic advice and guidance on a higher level. As such, if you believe that the code is <beep>, you are probably right.