Welcome to the state-owned ASes dataset repository. Here you can find a dataset containing a list of ASes of State-Owned Internet Operators.
The list of ASes of State-Owned Internet Operators is a product of the research published in the paper "Identifying ASes of State-owned Internet operators" by E. Carisimo, A. Gamero-Garrido, A. Snoeren and A. Dainotti, appearing in the Proceedings of the ACM Internet Measurement Conference (IMC) 2021, November 2021, Virtual Event.
.
├── LICENSE
├── README.md
├── data
│   ├── minority-state-owned-ases
│   │   ├── csv
│   │   │   ├── minority_state_owned_ases.csv
│   │   │   ├── minority_state_owned_ases_table_ases.csv
│   │   │   └── minority_state_owned_ases_table_organizations.csv
│   │   ├── json
│   │   │   ├── minority_state_owned_ases_table_ases.json
│   │   │   └── minority_state_owned_ases_table_organizations.json
│   │   └── sqlite
│   │       └── minority_state_owned_ases.sqlite
│   └── state-owned-ases
│       ├── csv
│       │   ├── state_owned_ases.csv
│       │   ├── state_owned_ases_table_ases.csv
│       │   └── state_owned_ases_table_organizations.csv
│       ├── json
│       │   ├── state_owned_ases_table_ases.json
│       │   └── state_owned_ases_table_organizations.json
│       └── sqlite
│           └── state_owned_ases.sqlite
├── examples
│   ├── minority-state-owned-ases
│   │   ├── load_from_json.ipynb
│   │   ├── load_from_organizations_and_ases_csv_files.ipynb
│   │   ├── load_from_single_table_csv_file.ipynb
│   │   └── load_from_sqlite.ipynb
│   └── state-owned-ases
│       ├── load_from_json.ipynb
│       ├── load_from_organizations_and_ases_csv_files.ipynb
│       ├── load_from_single_table_csv_file.ipynb
│       └── load_from_sqlite.ipynb
└── requirements.txt

12 directories, 23 files
In this repository, we provide two datasets:
- (Main dataset) A list, examined in depth, of ASes operated by state-owned Internet providers.
- (Additional dataset) A preliminary and partial list of ASes operated by minority state-owned Internet providers.
We provide the datasets of state-owned ASes in three formats: CSV, JSON and sqlite.
We use the following example to illustrate the data structure of the records in our dataset.
# Ownership details of an identified
# state-owned organization
{
"conglomerate_name": "NO-TELENOR",
"org_id": "ORG-NA38-RIPE",
"org_name": "Telenor Norge AS",
"ownership_cc": "NO",
"ownership_country_name": "Norway",
"rir": "RIPE",
"source": "Company's website",
"quote": "Major Shareholdings: Government of Norway (54,7%)",
"quote_lang": "English",
"url": "https://www.telenor.com/investors/share-information/major-shareholdings",
"additional_info": "",
"inputs": ["G", "E", "W", "O"],
"parent_org": "",
"target_cc": "",
"target_country_name": ""
}
# List of ASes operated by the identified
# state-owned organization
{
"org_id": "ORG-NA38-RIPE",
"asn": [2119, 8210, 8394, 8786, 39197, 197943, 200168]
}
Our dataset is composed of two data entities, organization and ases, which are linked by the key org_id. Next, we describe each of the fields in both entities.
Organization
- conglomerate_name: Name of the conglomerate the company belongs to.
- org_id: Org ID from CAIDA's AS2Org.
- org_name: The name of the organization according to CAIDA's AS2Org records.
- ownership_cc: ISO 3166 country code.
- ownership_country_name: Country name.
- rir: The country's RIR.
- source: Type of confirmation source that validated the inference.
- quote: The exact quote we use to determine the state ownership.
- quote_lang: Language of the quote.
- url: The URL of the confirmation data source.
- additional_info: (optional) In some cases, this field adds details that help to understand the state ownership (e.g., specifying that a hedge fund is state-owned).
- inputs: The input data source(s) that caused this organization to be initially added to the candidate list (the associated research paper describes candidate lists). We abbreviate the input sources using the following convention:
  - G: Country-level AS geolocation.
  - E: APNIC eyeballs dataset.
  - C: Country-Level Transit Influence.
  - O: Orbis.
  - W: Wikipedia & Freedom House.
- parent_org: (optional, only for foreign subsidiaries) The parent company's Org ID.
- target_cc: (optional, only for foreign subsidiaries) The ISO 3166 country code of the country where the company is intended to operate.
- target_country_name: (optional, only for foreign subsidiaries) The name of the country where the company is intended to operate.
ASes
- org_id: CAIDA AS2Org's Org ID.
- asn: Autonomous System Number associated with that org_id.
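As a minimal sketch of how the two entities fit together, the snippet below joins organization and ases records on org_id to build an ASN-to-owner lookup. The two records are copied from the example above (organization fields trimmed for brevity); the helper name index_asns_by_owner is hypothetical and not part of the repository.

```python
# Records copied from the data structure example above (fields trimmed for brevity).
organizations = [
    {"org_id": "ORG-NA38-RIPE", "org_name": "Telenor Norge AS", "ownership_cc": "NO"},
]
ases = [
    {"org_id": "ORG-NA38-RIPE", "asn": [2119, 8210, 8394, 8786, 39197, 197943, 200168]},
]


def index_asns_by_owner(organizations, ases):
    """Map each ASN to the organization record that operates it, joining on org_id.

    Hypothetical helper for illustration; assumes each ases record holds a list of ASNs.
    """
    orgs_by_id = {org["org_id"]: org for org in organizations}
    asn_to_org = {}
    for entry in ases:
        org = orgs_by_id.get(entry["org_id"])
        if org is None:
            continue  # skip ases records with no matching organization
        for asn in entry["asn"]:
            asn_to_org[asn] = org
    return asn_to_org


asn_to_org = index_asns_by_owner(organizations, ases)
print(asn_to_org[2119]["org_name"])
```

A lookup keyed by ASN is convenient when annotating routing or traceroute data, where the AS number is usually the starting point.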
Minority state-owned Internet providers. In the dataset of minority state-owned ASes, we include one additional field, ownership_percentage, in organizations to indicate the share of state ownership in the company operating the Autonomous System.
We provide two alternatives to access the dataset using CSV files.
- Single-table file: Some users of our dataset might prefer to have all the information in a single file.
  - The list of state-owned ASes: data/state-owned-ases/csv/state_owned_ases.csv
  - The preliminary and partial list of minority state-owned ASes: data/minority-state-owned-ases/csv/minority_state_owned_ases.csv
- Organizations and ASes files: Following the data structure we described above, CSV files containing Organizations and ASes are available here:
  - State-owned ASes: state_owned_ases_table_organizations.csv (Orgs) and state_owned_ases_table_ases.csv (ASes), both in data/state-owned-ases/csv.
  - The preliminary and partial list of minority state-owned ASes: minority_state_owned_ases_table_organizations.csv (Orgs) and minority_state_owned_ases_table_ases.csv (ASes), both in data/minority-state-owned-ases/csv.
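The two-file CSV layout can be read with Python's standard csv module, as in the rough sketch below. It assumes that the CSV headers match the field names described above and that the ASes table has one row per (org_id, asn) pair; neither assumption is confirmed here, so check the headers of the actual files. The in-memory samples and the helper names read_rows and group_asns_by_org are hypothetical stand-ins.

```python
import csv
import io
from collections import defaultdict


def read_rows(handle):
    """Read a CSV file object into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle))


def group_asns_by_org(ases_rows):
    """Group ASNs by org_id, assuming one (org_id, asn) pair per row."""
    grouped = defaultdict(list)
    for row in ases_rows:
        grouped[row["org_id"]].append(int(row["asn"]))
    return dict(grouped)


# In-memory stand-ins for the Orgs and ASes CSV files (assumed column names).
orgs_csv = io.StringIO(
    "org_id,org_name,ownership_cc\n"
    "ORG-NA38-RIPE,Telenor Norge AS,NO\n"
)
ases_csv = io.StringIO(
    "org_id,asn\n"
    "ORG-NA38-RIPE,2119\n"
    "ORG-NA38-RIPE,8210\n"
)

orgs = read_rows(orgs_csv)
asns_by_org = group_asns_by_org(read_rows(ases_csv))

for org in orgs:
    print(org["org_name"], asns_by_org.get(org["org_id"], []))
```

To run this against the real files, replace each io.StringIO object with open(path, newline="") pointing at the corresponding CSV file.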
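Following the data structure described above, JSON files containing Organizations and ASes are available here:
- State-owned ASes: state_owned_ases_table_organizations.json (Orgs) and state_owned_ases_table_ases.json (ASes), both in data/state-owned-ases/json.
- The preliminary and partial list of minority state-owned ASes: minority_state_owned_ases_table_organizations.json (Orgs) and minority_state_owned_ases_table_ases.json (ASes), both in data/minority-state-owned-ases/json.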
We also provide access to the dataset using sqlite. We provide two separate databases for clarity. The first database includes the list of state-owned ASes. The second database includes the preliminary and partial list of minority state-owned ASes. Each database includes two tables, organizations and ases. The files containing the sqlite databases are located here:
- State-owned ASes: data/state-owned-ases/sqlite/state_owned_ases.sqlite
- The preliminary and partial list of minority state-owned ASes: data/minority-state-owned-ases/sqlite/minority_state_owned_ases.sqlite
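The sketch below illustrates how the organizations and ases tables could be joined on org_id using Python's built-in sqlite3 module. It builds a small in-memory database so that it runs on its own; the column names and types are assumptions based on the field descriptions above and may differ from the schema of the shipped databases.

```python
import sqlite3

# In-memory stand-in for one of the sqlite files (assumed schema, trimmed columns).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE organizations (org_id TEXT, org_name TEXT, ownership_cc TEXT);
    CREATE TABLE ases (org_id TEXT, asn INTEGER);
    INSERT INTO organizations VALUES ('ORG-NA38-RIPE', 'Telenor Norge AS', 'NO');
    INSERT INTO ases VALUES ('ORG-NA38-RIPE', 2119);
    INSERT INTO ases VALUES ('ORG-NA38-RIPE', 8210);
    """
)

# List every ASN operated by organizations owned by a given state.
query = """
    SELECT a.asn, o.org_name, o.ownership_cc
    FROM ases AS a
    JOIN organizations AS o ON o.org_id = a.org_id
    WHERE o.ownership_cc = ?
    ORDER BY a.asn
"""
rows = conn.execute(query, ("NO",)).fetchall()
conn.close()

for asn, org_name, cc in rows:
    print(asn, org_name, cc)
```

To query the real data, pass the path of one of the sqlite files to sqlite3.connect instead of ":memory:" and drop the executescript call.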
Identifying ASes operated by minority state-owned Internet Operators was beyond the scope of this project. However, during our manual examination we discovered some ASes with partial state ownership, that is, where the state directly or indirectly owns less than 50% of the company's shares. We share this additional outcome of our research but acknowledge its limitations.
Completeness. We did not systematically look for minority state-owned Internet Operators, since they were beyond the scope of the process. As a result, the coverage of this list is unknown, although we assume that the list is very incomplete.
Depth. We did not investigate the parent-child structure of minority state-owned companies. Therefore, we do not list subsidiaries of the companies in this list. There are some prominent examples of large companies (that may have many subsidiaries) in our list, such as Orange (AS5511, PeeringDB entry), Deutsche Telekom (AS3320, PeeringDB entry) and NTT DoCoMo (AS9605, PeeringDB entry). We are aware that these minority state-owned companies have a large international footprint, but their identification was incidental to our main goal of identifying majority state-owned companies.
To help you get familiar with the datasets, this repo includes four Jupyter notebooks for each dataset. The notebooks use Python and provide methods for loading and interacting with the dataset. These examples are here:
- State-Owned ASes: examples/state-owned-ases
- Minority State-Owned ASes: examples/minority-state-owned-ases
We highly recommend using a Python virtual environment to run these examples. This repository also includes a requirements.txt file to install all the Python packages needed to run the examples.
To set up this virtual environment, run the following commands:
$ python3 -m venv .state-owned-ases
$ source .state-owned-ases/bin/activate
$ pip3 install ipykernel
$ ipython kernel install --user --name=.state-owned-ases
$ pip3 install -r requirements.txt
If you use our dataset, please cite it as:
@inproceedings{10.1145/3487552.3487822,
author = {Carisimo, Esteban and Gamero-Garrido, Alexander and Snoeren, Alex C. and Dainotti, Alberto},
title = {Identifying ASes of State-Owned Internet Operators},
year = {2021},
isbn = {9781450391290},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3487552.3487822},
doi = {10.1145/3487552.3487822},
booktitle = {Proceedings of the 21st ACM Internet Measurement Conference},
pages = {687–702},
numpages = {16},
location = {Virtual Event},
series = {IMC '21}
}