Basics of data visualisation, exploratory data analysis, ggplot2 and case study.

imsalmanmalik/Data-Visualisation-Using-R

Data Visualisation Using R

This repository contains a data visualisation project in R that explores ggplot2, a popular data visualisation package for the R statistical programming language.

Motivation

With the growing availability of informative datasets and software tools, data visualisation has become increasingly important across many industries, academia, and government. Data visualisation provides a powerful way to communicate data-driven findings, motivate analyses, or detect flaws.

This project serves as a worked example of applying the basics of data visualisation and exploratory data analysis.

Data and Tools

The project explores a case study of infectious disease trends in the United States. The ggplot2 package is used for data visualisation and exploratory data analysis.
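As a rough illustration of this kind of analysis, the sketch below plots measles incidence for a single state using the `us_contagious_diseases` data from dslabs. It is not taken from the repository's scripts. The choice of state, the per-10,000 scaling, the adjustment for weeks reporting, and the 1963 reference line (the year the measles vaccine was introduced in the US) are all assumptions made for the example.

```r
library(dplyr)
library(ggplot2)
library(dslabs)  # provides the us_contagious_diseases data frame

# Keep measles rows with at least one reporting week, then convert counts
# to yearly cases per 10,000 people, scaling up for partial reporting.
measles <- us_contagious_diseases %>%
  filter(disease == "Measles", weeks_reporting > 0) %>%
  mutate(rate = count / population * 10000 * 52 / weeks_reporting)

# "California" is an arbitrary example state (assumption, not from the repo).
p <- measles %>%
  filter(state == "California") %>%
  ggplot(aes(x = year, y = rate)) +
  geom_line() +
  # Reference line for the 1963 vaccine introduction (added context)
  geom_vline(xintercept = 1963, linetype = "dashed") +
  labs(x = "Year", y = "Cases per 10,000 people",
       title = "Measles incidence in California")

print(p)
```

Working with a rate rather than raw counts keeps states with different populations comparable, which matters once more than one state is plotted.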

Mistakes, biases, systematic errors, and other unexpected problems often produce data that should be handled with care. Because such errors can be difficult or impossible to notice from the reported results alone, data visualisation is particularly important.

Contents

This repository contains the following files:

README.md: provides an overview of the project

basics_ggplot: R script containing code for data visualisation and analysis using the ggplot2 package.

gapminder_dataset: R script that applies the ggplot2 techniques using the gapminder package, shows how fixed scales across plots ease comparisons, and modifies graphs to improve the visualisation.

principles_of_visualisation.R: R script illustrating basic principles of effective data visualisation: keeping the goal in mind when choosing a visualisation approach; encoding data through position, aligned lengths, angles, area, brightness, and colour hue; knowing when to include zero; and easing comparisons by using common axes, placing the visual cues to be compared adjacent to one another, and using colour effectively.

Distributions.R: R script that uses distributions to summarise data, uses the average and standard deviation to describe the normal distribution, assesses how well a normal distribution fits the data with a quantile-quantile plot, and interprets boxplots.

vaccine_case_study.R: R script that visualises measles incidence to demonstrate the impact of vaccination programmes on disease rates, using the dslabs library and its 'us_contagious_diseases' data.
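To make the quantile-quantile check mentioned for Distributions.R more concrete, here is a minimal sketch on simulated data. The simulated values, the quantile grid, and all variable names are assumptions for illustration and do not come from the repository's scripts.

```r
library(ggplot2)

set.seed(1)  # reproducible simulated data standing in for a real variable
x <- rnorm(500, mean = 69, sd = 3)

# Compare sample quantiles with the quantiles of a normal distribution
# that has the same average and standard deviation as the data.
probs <- seq(0.05, 0.95, by = 0.05)
qq <- data.frame(
  theoretical = qnorm(probs, mean = mean(x), sd = sd(x)),
  sample = as.numeric(quantile(x, probs))
)

# Points close to the identity line suggest the normal approximation fits.
qq_plot <- ggplot(qq, aes(x = theoretical, y = sample)) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1) +
  labs(x = "Theoretical quantiles", y = "Sample quantiles")

print(qq_plot)
```

With real data, systematic departures from the line (for example in the tails) would indicate that the average and standard deviation alone do not summarise the distribution well.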

Conclusion

This project demonstrates the power of data visualisation in communicating data-driven findings, motivating analyses, and detecting flaws. It also highlights the importance of handling data with care, and how data visualisation can help in detecting errors and biases.
