A utility to retrieve infrastructure and system information from a cloud provider (Azure for now), store it in a graph database (Neo4j) and visualize it using Dash.
- Python 3.6
- Microsoft Azure Python SDK
- Microsoft Authentication Library (MSAL)
- neomodel
- dash-cytoscape
- For more details see the requirements.txt file. Also, just in case, you should create an environment for the installation using something like virtualenv or conda.
- Neo4j >= 3.4 and supported by neomodel (tested using kernel version 3.4.0):
  - The compatible APOC plugin needs to be installed too (tested using 3.4.0.8). Note: Some Neo4j configuration is needed to run some queries using APOC. Please add the following lines at the end of your neo4j.conf file:

        #***********************************************************
        # APOC
        #***********************************************************
        dbms.security.procedures.unrestricted=apoc.*
        apoc.export.file.enabled=true
        apoc.import.file.use_neo4j_config=false
- Install the IIS Administration API in the accessible VMs (see the IIS Administration API documentation for more info). You should grant a user read access to the API. To do that, the appsettings.json of the IIS Administration API needs to be changed (CORS settings, file access settings and security settings).
- Clone the repository.
- Install the required Python packages referenced above (pip install -r requirements.txt).
- Check that a Neo4j instance is running on the default port, has a user and password set (for example, user -> neo4j and pass -> ne@4j), and is properly configured (APOC plugin) as stated above.
- Configure the relevant options in the config.json file (you can specify the path to the file using a .env file with the env var CONFIG_FILE_PATH, or by adding a config.json file to the system_mapper/ dir). For example, a config that uses Azure and the IIS Administration API to get the information:
{
  "initial_rule": "RULE_0_MULTIPLE_RESOURCE_GROUPS",
  "rules": [
    "RULE_0_MULTIPLE_RESOURCE_GROUPS",
    "RULE_1_MULTIPLE_SUSCRIPTIONS",
    "RULE_2_ORPHAN_NODES",
    "RULE_3_MAX_DEPENDENCIES"
  ],
  "rules_mapping": {
    "RULE_0_MULTIPLE_RESOURCE_GROUPS": [
      "MATCH (n)-[r]-(m) MATCH (n)-[rg1:ELEMENT_RESOURCE_GROUP]-(nrg1) MATCH (m)-[rg2:ELEMENT_RESOURCE_GROUP]-(nrg2) WHERE NOT nrg1 = nrg2 RETURN n, r, m ",
      "n,r,m"
    ],
    "RULE_1_MULTIPLE_SUSCRIPTIONS": [
      "MATCH (n)-[r]-(m) MATCH (n)-[]-(np:Property {key: 'subscriptionId'}) MATCH (m)-[]-(mp:Property {key: 'subscriptionId'}) WHERE NOT np.value = mp.value RETURN n, r, m, np, mp ",
      "n,r,m,np,mp"
    ],
    "RULE_2_ORPHAN_NODES": [
      "MATCH (n) WHERE NOT (n)-[]-() RETURN n",
      "n"
    ],
    "RULE_3_MAX_DEPENDENCIES": [
      "MATCH (n)-[]-(m) RETURN n, COLLECT(m) as others ORDER BY SIZE(others) DESC LIMIT 1",
      "n"
    ]
  },
  "neo4j_database_url": "bolt://neo4j:ne@4j@localhost:7687",
  "database_strings": ["database", "base de datos", "MicrosoftSQLServer"],
  "port": 55539,
  "app_container_url": "/api/webserver/websites/",
  "app_container_token": "some token to access the IIS API. More info: https://docs.microsoft.com/en-us/IIS-Administration/management-portal/connecting",
  "app_container_user": "<windows username>",
  "app_container_password": "<windows user password>",
  "visualization_port": "80",
  "visualization_n_threads": "100",
  "visualization_dev": false,
  "visualization_host": "0.0.0.0"
}
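
The path lookup and the relationship between rules and rules_mapping can be illustrated with a minimal sketch. It is not the project's own loader: the function names and the default path constant are hypothetical, and it assumes CONFIG_FILE_PATH is already present in the process environment (for example, loaded from the .env file by a tool such as python-dotenv).

```python
import json
import os
from pathlib import Path

# Hypothetical default, following the "config.json in system_mapper/" option.
DEFAULT_CONFIG_PATH = Path("system_mapper") / "config.json"


def resolve_config_path(environ=os.environ):
    """Return the path from CONFIG_FILE_PATH, or the default location."""
    return Path(environ.get("CONFIG_FILE_PATH") or DEFAULT_CONFIG_PATH)


def check_rules(config):
    """Return a list of problems found in the rule-related options."""
    problems = []
    rules = config.get("rules", [])
    mapping = config.get("rules_mapping", {})
    for name in rules:
        if name not in mapping:
            problems.append(f"rule {name!r} has no entry in rules_mapping")
    if config.get("initial_rule") not in rules:
        problems.append("initial_rule is not listed in rules")
    return problems


def load_config(path=None):
    """Load the JSON config and fail early if the rules are inconsistent."""
    path = Path(path) if path else resolve_config_path()
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    problems = check_rules(config)
    if problems:
        raise ValueError("; ".join(problems))
    return config
```

Calling load_config() with no argument reads whichever file resolve_config_path() points at; checking the rules up front is a design choice of this sketch so that a misspelled rule name surfaces before any query is run.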
Some notes regarding the config file:
- initial_rule is the name of the initial rule to apply when checking the rule visualization.
- rules is the list of available rules (which need to match the rules_mapping dict).
- rules_mapping holds custom Cypher queries. Each entry has (1) the query and (2) the variables returned by the query.
- The other values are related to:
  - neo4j_database_url: Connection string to connect to the Neo4j database.
  - database_strings: Strings to find in the VM information to classify a Virtual Machine as a Database node.
  - IIS related config:
    - port: Connection port for the IIS management API (the API needs to be enabled in the Virtual Machines).
    - app_container_url: Relative URL to use to start getting the info. For now, the only supported one is /api/webserver/websites/.
    - app_container_token: Token used to authenticate the requests made to the IIS Administration API or relevant API. See IIS Administration access tokens.
    - app_container_user: Windows user used to authenticate via NTLM in the case of the IIS Administration API.
    - app_container_password: Windows user password used to authenticate via NTLM in the case of the IIS Administration API.
  - Visualization dashboard related config:
    - visualization_port: Port for the server that launches the Dash app.
    - visualization_n_threads: Number of threads the server will use in prod mode.
    - visualization_dev: Whether the Dash app is launched in dev mode (run from Dash) or prod mode (waitress).
    - visualization_host: Host for the server.
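
To make the two-part shape of a rules_mapping entry concrete, here is a small sketch of how such an entry could be split into its query and variable names, and how result rows could then be labelled. The helper names are hypothetical and the rows are stand-in values; how the project itself consumes these entries is not described here.

```python
def parse_rule(entry):
    """Split a rules_mapping entry into its query and returned variable names."""
    query, returned = entry
    variables = [name.strip() for name in returned.split(",") if name.strip()]
    return query.strip(), variables


def label_rows(variables, rows):
    """Pair each value in a result row with the variable it was returned as."""
    return [dict(zip(variables, row)) for row in rows]


entry = ["MATCH (n) WHERE NOT (n)-[]-() RETURN n", "n"]
query, variables = parse_rule(entry)

# Stand-in rows; in practice they would come from the database driver,
# for example neomodel's db.cypher_query(query), which returns (rows, meta).
rows = [["vm-1"], ["vm-2"]]
print(label_rows(variables, rows))
```

Keeping the variable list next to the query means the consumer does not have to parse the RETURN clause to know which columns to expect.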
After setting up the environment, run the following from the root directory:
python -m system_mapper.main
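
Once the dashboard starts, the visualization_* options decide how it is served. The sketch below shows one way those options might be normalised and handed to waitress in prod mode. The function names and fallback defaults are assumptions for illustration, not the project's code; dev mode would instead use Dash's own development server.

```python
def server_settings(config):
    """Normalise the visualization_* options into typed values."""
    return {
        # The fallback values below are assumptions made for this sketch.
        "dev": bool(config.get("visualization_dev", False)),
        "host": config.get("visualization_host", "0.0.0.0"),
        # The example config stores these two as strings, so cast them.
        "port": int(config.get("visualization_port", 80)),
        "threads": int(config.get("visualization_n_threads", 100)),
    }


def serve_prod(wsgi_app, settings):
    """Serve a WSGI app (e.g. a Dash app's .server attribute) with waitress."""
    from waitress import serve  # imported lazily; only needed in prod mode

    serve(
        wsgi_app,
        host=settings["host"],
        port=settings["port"],
        threads=settings["threads"],
    )
```

Casting the port and thread count once, in a single place, avoids passing the string values from the JSON file directly to the server.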