
Exploratory data analysis of the Cadastro Unico dataset (2018): pre-processing, basic statistics, data visualization, and variable correlations.


danielgribel/CadastroUnico


CadastroUnico: Exploratory data analysis

In this notebook, we explore the 2018 Cadastro Unico dataset. The Cadastro Unico of the Federal Government's Social Programs records information about low-income families, including details on their income, homes, and other socioeconomic indicators. The 2018 dataset consists of de-identified samples, i.e., the records are released in a form that protects personal information.

The Cadastro Unico datasets contain 30 variables related to families and 34 related to individuals. The datasets are available on the Ministry of Citizenship website, with data from 2012 to 2018:

https://aplicacoes.mds.gov.br/sagi/portal/index.php?grupo=212.

Scope and objective. This notebook analyzes only the data related to families (more than 4 million entries); we plan to examine it together with the individuals' dataset in future work. The main goal is to provide an overall view of the dataset, highlighting the most prominent findings through data visualization and basic statistics. The notebook can therefore serve as the first step in a more detailed data analysis.
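As a rough illustration of the "basic statistics" and "variable correlation" steps mentioned above, here is a minimal, dependency-free Python sketch. It is not the notebook's own code: the inline sample data, the column names (`per_capita_income`, `household_size`, `rooms`), and the helper functions are all hypothetical and do not correspond to the real Cadastro Unico variables. It only shows the general shape of such an analysis, including how missing values might be skipped.

```python
import csv
import io
import math
import statistics

# Hypothetical miniature sample. These column names are illustrative only
# and are NOT the real Cadastro Unico variable names.
SAMPLE_CSV = """family_id,per_capita_income,household_size,rooms
1,89.0,4,3
2,0.0,6,2
3,178.0,2,4
4,,3,3
5,250.0,1,5
"""


def load_numeric_columns(text, columns):
    """Parse CSV text into {column: [float or None, ...]}; blanks become None."""
    reader = csv.DictReader(io.StringIO(text))
    data = {name: [] for name in columns}
    for row in reader:
        for name in columns:
            raw = row[name].strip()
            data[name].append(float(raw) if raw else None)
    return data


def describe(values):
    """Return basic statistics for one column, ignoring missing entries."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "min": min(present),
        "max": max(present),
    }


def pearson(xs, ys):
    """Pearson correlation over rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    cols = ["per_capita_income", "household_size", "rooms"]
    data = load_numeric_columns(SAMPLE_CSV, cols)
    for name in cols:
        print(name, describe(data[name]))
    print("income vs. household size:",
          pearson(data["per_capita_income"], data["household_size"]))
```

For a file with millions of rows, a dataframe library would usually be the more practical choice; the sketch avoids one only to stay self-contained.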
