
This Python script interacts with the Microsoft Azure Resource Graph service to query Azure Resource Manager for information on Azure resources (although the effort is moving towards a provider-agnostic query).

dalthviz/query-resource-graph

System mapper

A utility to retrieve infrastructure and system information from a cloud provider (Azure for now), store it in a graph database (Neo4j), and visualize it using Dash.

Requirements

Python Libraries and CLI utilities

Python

CLI

Data persistence and API usage (database and IIS Administration API)

  • Neo4j >= 3.4, in a version supported by neomodel (tested using kernel version 3.4.0):
    • The compatible APOC plugin needs to be installed too (tested using 3.4.0.8). Note: some Neo4j configuration is needed to run certain queries that use APOC. Add the following lines at the end of your neo4j.conf file:
    #***********************************************************
    # APOC
    #***********************************************************
    dbms.security.procedures.unrestricted=apoc.*
    apoc.export.file.enabled=true
    apoc.import.file.use_neo4j_config=false
  • Install the IIS Administration API on the accessible VMs (more info about the IIS API). You should grant a user read access to the API. To do that, the appsettings.json of the IIS Administration API needs to be changed (CORS settings, file access settings, and security settings).
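
The neo4j.conf lines listed above can be checked before starting the application. The following is a minimal sketch, not part of the repository: the function names `parse_conf` and `missing_apoc_settings` are hypothetical, and the check is deliberately strict (it reports a setting whose value differs from the one shown above, even if a broader value would also work).

```python
# Settings this README asks to append to neo4j.conf.
REQUIRED_APOC_SETTINGS = {
    "dbms.security.procedures.unrestricted": "apoc.*",
    "apoc.export.file.enabled": "true",
    "apoc.import.file.use_neo4j_config": "false",
}


def parse_conf(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def missing_apoc_settings(conf_text):
    """Return the required settings that are absent or have another value."""
    found = parse_conf(conf_text)
    return {
        key: expected
        for key, expected in REQUIRED_APOC_SETTINGS.items()
        if found.get(key) != expected
    }
```

Passing the contents of neo4j.conf to `missing_apoc_settings` returns an empty dict when all three lines are present as written.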

Setup

  • Clone the repository.
  • Install the required Python packages referenced above (pip install -r requirements.txt).
  • Check that a Neo4j instance is running on the default port, has a user and password set (for example, user -> neo4j and pass -> ne@4j), and is properly configured (APOC plugin) as stated above.
  • Set the relevant options in the config.json file (you can specify the path to the file using a .env file with the env var CONFIG_FILE_PATH, or add a config.json file to the system_mapper/ dir). For example, a config that uses Azure and the IIS Administration API to get the information:
{
    "initial_rule": "RULE_0_MULTIPLE_RESOURCE_GROUPS",
    "rules": [
        "RULE_0_MULTIPLE_RESOURCE_GROUPS",
        "RULE_1_MULTIPLE_SUSCRIPTIONS",
        "RULE_2_ORPHAN_NODES",
        "RULE_3_MAX_DEPENDENCIES"],
    "rules_mapping": {
        "RULE_0_MULTIPLE_RESOURCE_GROUPS": [
            "MATCH (n)-[r]-(m) MATCH (n)-[rg1:ELEMENT_RESOURCE_GROUP]-(nrg1) MATCH (m)-[rg2:ELEMENT_RESOURCE_GROUP]-(nrg2) WHERE NOT nrg1 = nrg2 RETURN n, r, m ",
            "n,r,m"
            ],
        "RULE_1_MULTIPLE_SUSCRIPTIONS": [
            "MATCH (n)-[r]-(m) MATCH (n)-[]-(np:Property {key: 'subscriptionId'}) MATCH (m)-[]-(mp:Property {key: 'subscriptionId'}) WHERE NOT np.value = mp.value RETURN n, r, m, np, mp ",
            "n,r,m,np,mp"
            ],
        "RULE_2_ORPHAN_NODES": [
            "MATCH (n) WHERE NOT (n)-[]-() RETURN n",
            "n"
            ],
        "RULE_3_MAX_DEPENDENCIES": [
            "MATCH (n)-[]-(m) RETURN n, COLLECT(m) as others ORDER BY SIZE(others) DESC LIMIT 1",
            "n"
            ]
        },
    "neo4j_database_url": "bolt://neo4j:ne@4j@localhost:7687",
    "database_strings": ["database", "base de datos", "MicrosoftSQLServer"],
    "port": 55539,
    "app_container_url": "/api/webserver/websites/",
    "app_container_token": "some token to access the IIS API. More info: https://docs.microsoft.com/en-us/IIS-Administration/management-portal/connecting",
    "app_container_user": "<windows username>",
    "app_container_password": "<windows user password>",
    "visualization_port": "80",
    "visualization_n_threads": "100",
    "visualization_dev": false,
    "visualization_host": "0.0.0.0"
}
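
The lookup order described in the setup step (CONFIG_FILE_PATH first, then a config.json inside system_mapper/) could look roughly like the sketch below. It is an illustration under assumptions, not the repository's actual loader: `resolve_config_path` and `load_config` are hypothetical names, and the .env file is assumed to have been loaded into the environment already.

```python
import json
import os
from pathlib import Path

# Assumed fallback location, based on the setup step above.
DEFAULT_CONFIG_PATH = Path("system_mapper") / "config.json"


def resolve_config_path(environ=None):
    """Return the path from CONFIG_FILE_PATH, else the default location."""
    environ = os.environ if environ is None else environ
    custom_path = environ.get("CONFIG_FILE_PATH")
    return Path(custom_path) if custom_path else DEFAULT_CONFIG_PATH


def load_config(environ=None):
    """Read and parse the JSON config file."""
    path = resolve_config_path(environ)
    with path.open(encoding="utf-8") as config_file:
        return json.load(config_file)
```

Accepting an optional `environ` mapping keeps the functions easy to exercise without touching the real process environment.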

Some notes regarding the config file:

  • initial_rule is the name of the initial rule to apply when checking the rule visualization.

  • rules is the list of available rules (each one needs a matching entry in the rules_mapping dict).

  • rules_mapping holds custom Cypher queries. Each entry has (1) the query and (2) the variables returned by the query.

  • The other values relate to:

    • neo4j_database_url: Connection string to connect to the Neo4j database
    • database_strings: Strings to find in the VM information to classify a Virtual Machine as a Database node
    • IIS related config:
      • port: Connection port for the IIS management API (the API needs to be enabled in the Virtual Machines)
      • app_container_url: relative URL from which to start retrieving the information. For now, the only supported value is /api/webserver/websites/.
      • app_container_token: token used to authenticate requests made to the IIS Administration API or relevant API. See IIS Administration access tokens.
      • app_container_user: Windows user used to authenticate via NTLM in the case of the IIS Administration API.
      • app_container_password: Windows user password used to authenticate via NTLM in the case of the IIS Administration API.
    • Visualization dashboard related config:
      • visualization_port: Port for the server to launch the dash app.
      • visualization_n_threads: Number of threads the server in prod mode will use.
      • visualization_dev: Whether the Dash app is launched in dev mode (run from Dash) or prod mode (waitress).
      • visualization_host: Host for the server.
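
Because rules, rules_mapping, and initial_rule have to agree with each other, a small consistency check can catch mistakes in the config early. The sketch below is an assumption-labeled illustration: `check_rules` and `returned_variables` are hypothetical helpers, not functions from the repository, and they operate on the already parsed config dict.

```python
def check_rules(config):
    """Return a list of problems found in the rule-related config keys."""
    problems = []
    rules = config.get("rules", [])
    mapping = config.get("rules_mapping", {})

    if config.get("initial_rule") not in rules:
        problems.append("initial_rule is not listed in rules")

    # Every listed rule needs a [query, variables] entry.
    for name in rules:
        entry = mapping.get(name)
        if entry is None:
            problems.append(f"{name} has no entry in rules_mapping")
        elif len(entry) != 2:
            problems.append(f"{name} must map to [query, variables]")

    # Every mapped rule should also be listed.
    for name in mapping:
        if name not in rules:
            problems.append(f"{name} is in rules_mapping but not in rules")
    return problems


def returned_variables(entry):
    """Split the second element of a rules_mapping entry, e.g. "n,r,m"."""
    return [name.strip() for name in entry[1].split(",")]
```

An empty list from `check_rules` means the three keys are consistent with each other; it does not validate the Cypher queries themselves.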

Run

After setting up the environment, run the following from the root directory:

python -m system_mapper.main

