Congresovisible.org is a great project which provides information about:
- Colombian bills (proyectos de ley)
- How those bills were voted on
- Votes cast by Senators and Congressmen
Sadly, they don't provide an API for this valuable information. So this repo provides:
- code to scrape their website and extract that information
- data dumps (in JSON format)
Every line of the JSON dump is a JSON dictionary representing a voting event. Every event contains the following data:
[
  {
    "camara": "Cámara de Representantes",
    "estado": "aprobado",
    "id": 3014,
    "ano": "2014",
    "mes_dia": "Sep 03",
    "desacuerdo": "1%",
    "comisiones": "",
    "acuerdo": "99%",
    "procedimiento": "Descripcion proyecto de ley",
    "detailed": {
      "Álvaro Uribe": {"party": "Centro Democratico", "vote": "Aprobado"},
      ....
      ....
    }
  }
]
camara
: Which legislature voted
id
: Congresovisible.org database identifier
ano
: Year in which the voting took place
mes_dia
: Month and day on which the voting took place
detailed
: Dictionary with the names of politicians as keys and, as values, a JSON object describing their party and vote.
Each line of the file should be a parsable JSON object.
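Since each line is meant to parse on its own, a dump can be processed one event at a time. The following is a minimal sketch, not code from this repo: the function name `tally_votes` is hypothetical, and the inline sample only reuses the fields shown in the example above instead of reading a real dump file.

```python
import json
from collections import Counter


def tally_votes(lines):
    """Count how often each (party, vote) pair occurs across voting events.

    `lines` is any iterable of strings, one JSON object per line.
    (Hypothetical helper, not part of this repo.)
    """
    tally = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        event = json.loads(line)
        # "detailed" maps politician name -> {"party": ..., "vote": ...}
        for info in event.get("detailed", {}).values():
            tally[(info["party"], info["vote"])] += 1
    return tally


# Inline sample standing in for one line of a dump (values taken from the example above).
sample = [
    '{"id": 3014, "camara": "Cámara de Representantes", '
    '"detailed": {"Álvaro Uribe": {"party": "Centro Democratico", "vote": "Aprobado"}}}',
]
print(tally_votes(sample))
```

To run it on a real dump, pass an open file object (opened with `encoding="utf-8"`) from the dumps folder instead of `sample`.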
The TSV data is split into two files:
votes.tsv
: contains the votes of politicians in sessions; each session is an identifier referencing a session description in sessions.tsv
sessions.tsv
: contains a session description, date, and legislature.
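The two files can be joined on the session identifier. The sketch below shows one way to do that with the standard `csv` module. It assumes both files have a header row, and the column names (`session_id`, `politician`, `vote`, `id`, `description`, `date`, `legislature`) are guesses made for illustration only; check the actual headers and adjust the `votes_key` and `sessions_key` arguments. The sample rows are illustrative, not real data.

```python
import csv
import io


def join_votes_with_sessions(votes_file, sessions_file,
                             votes_key="session_id", sessions_key="id"):
    """Yield one dict per vote, enriched with the fields of its session.

    Both arguments are open text files (or file-like objects) in TSV format.
    The default key names are assumptions, not the repo's actual headers.
    """
    # Index sessions by identifier so each vote lookup is a dict access.
    sessions = {
        row[sessions_key]: row
        for row in csv.DictReader(sessions_file, delimiter="\t")
    }
    for vote in csv.DictReader(votes_file, delimiter="\t"):
        session = sessions.get(vote[votes_key])
        if session is None:
            continue  # vote refers to a session that is not described
        yield {**session, **vote}


# Illustrative in-memory stand-ins for sessions.tsv and votes.tsv.
SESSIONS_TSV = (
    "id\tdescription\tdate\tlegislature\n"
    "3014\tDescripcion proyecto de ley\t2014 Sep 03\tCámara de Representantes\n"
)
VOTES_TSV = (
    "session_id\tpolitician\tvote\n"
    "3014\tÁlvaro Uribe\tAprobado\n"
    "9999\tÁlvaro Uribe\tAprobado\n"
)

for row in join_votes_with_sessions(io.StringIO(VOTES_TSV), io.StringIO(SESSIONS_TSV)):
    print(row["politician"], row["vote"], row["legislature"])
```

Votes whose session is missing from the sessions file are skipped here; raising an error instead may be preferable if the files are expected to be consistent.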
- If you just want to use the data, clone this repo, go to the folder dumps, and pick your file ^^.
- If you want to generate a new dump:
  - Create a virtualenv with python3.4
    pip install -r requirements.txt
    python main.py
clustering.r:
- Set your working folder to the clustering sample:
  setwd("path...to..repo/congresovisible/samples/senators_clustering/")
- Run the clustering by doing:
  source("clustering.r")
- Note: please install the needed R packages.