Fetching speeches from Babel

This repository contains a CLI for retrieving speech data from the Babel API.

Requirements

Python 3
The latest version of pip

Install

sudo pip install pipenv # Install pipenv on your system
pipenv install # Install all requirements in a virtual environment
pipenv shell # Enter the virtualenv created above

Usage

speeches.py [OPTIONS] INITIAL_DATE END_DATE

Options:
  -s, --stage TEXT  Initials from speech stage. For example, PE to 'Pequeno
                    Expediente'
  --help            Show this message and exit.

INITIAL_DATE and END_DATE must be in yyyy-mm-dd format.
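As a rough illustration of the date format above, the following sketch parses two yyyy-mm-dd strings before they would be passed to the CLI. The helper names `parse_cli_date` and `validate_range` are hypothetical and are not part of speeches.py, and the check that INITIAL_DATE is not later than END_DATE is an assumption, not something this README states.

```python
from datetime import date, datetime
from typing import Tuple


def parse_cli_date(value: str) -> date:
    """Parse a yyyy-mm-dd string; raises ValueError if it cannot be parsed."""
    return datetime.strptime(value, "%Y-%m-%d").date()


def validate_range(initial: str, end: str) -> Tuple[date, date]:
    """Parse both dates and check their order (the ordering rule is an assumption)."""
    start_date = parse_cli_date(initial)
    end_date = parse_cli_date(end)
    if start_date > end_date:
        raise ValueError("INITIAL_DATE must not be later than END_DATE")
    return start_date, end_date
```

A string such as 01/02/2018 is rejected with a ValueError, which makes it easy to catch a wrong format before any request is sent.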

After retrieving and processing all speech data in the given date range, the script creates a CSV file called speeches.csv.

Preprocessing

After fetching the speeches you need, you can preprocess them. Preprocessing removes all numbers, accents, and stopwords (including words that appear in more than 90% of documents or in fewer than 1%) and stems all tokens in the speeches. To do this, run:

./pre_process.py
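To give an idea of what the cleaning steps described above involve, here is a minimal sketch using only the standard library. It is not the code in pre_process.py: the function names, the `stopwords` argument, and the exact tokenization rule are assumptions, and stemming is left out because it needs a third-party stemmer.

```python
import re
import unicodedata
from collections import Counter


def strip_accents(text):
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokenize(text, stopwords):
    # Lowercase, remove accents, and keep alphabetic runs only (this drops numbers).
    # `stopwords` is assumed to hold lowercase, accent-free words.
    cleaned = strip_accents(text.lower())
    return [tok for tok in re.findall(r"[a-z]+", cleaned) if tok not in stopwords]


def filter_by_document_frequency(docs, max_ratio=0.9, min_ratio=0.01):
    # Count in how many documents each token appears.
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    total = len(docs)
    # Keep tokens that are neither too common nor too rare.
    keep = {tok for tok, n in doc_freq.items() if min_ratio <= n / total <= max_ratio}
    return [[tok for tok in doc if tok in keep] for doc in docs]
```

Each speech would first go through `tokenize`, and the resulting token lists would then be passed together to `filter_by_document_frequency`, since document frequency can only be computed over the whole collection.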

pre_process.py reads speeches.csv, generated by the previous script, and generates four CSV files:

stem.csv - list of all stems used (format: id,stem)
stemmed-speeches.csv - list of all preprocessed speeches. There are two rows per speech: the first lists the stem IDs and the second lists the frequency of each stem. Both rows start with the speech ID
metadatas.csv - metadata for all speeches (format: id,author_name,author_party,author_region,date,updated_at,stage)
full-speeches.csv - list of all speeches without any processing (format: id,original)
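The two-rows-per-speech layout of stemmed-speeches.csv can be read back as in the sketch below. It assumes the file is comma-delimited, has no header row, and stores frequencies as integers; none of this is confirmed here, and `read_stemmed_speeches` is a hypothetical helper. The sample rows are made up for illustration.

```python
import csv
import io


def read_stemmed_speeches(lines):
    """Return {speech_id: {stem_id: frequency}} from pairs of CSV rows."""
    rows = [row for row in csv.reader(lines) if row]
    speeches = {}
    # Rows come in pairs: stem IDs first, then the matching frequencies.
    for ids_row, freq_row in zip(rows[0::2], rows[1::2]):
        if ids_row[0] != freq_row[0]:
            raise ValueError("paired rows must start with the same speech ID")
        speeches[ids_row[0]] = {
            stem_id: int(freq) for stem_id, freq in zip(ids_row[1:], freq_row[1:])
        }
    return speeches


# Made-up sample: speech 10 uses stems 3, 7 and 9; speech 11 uses stem 7.
sample = io.StringIO("10,3,7,9\n10,2,1,4\n11,7\n11,5\n")
print(read_stemmed_speeches(sample))
```

For the real file, pass an open file object instead of the `io.StringIO` sample, for example `open("stemmed-speeches.csv", newline="")`. The stem IDs can then be looked up in stem.csv.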
