News Article Data Analysis and Interpretation

Note

This README is currently being updated. Some sections may be incomplete or undergoing revisions. Thank you for your patience!

News Article Data Analysis and Interpretation

The goal of this project is to use R to analyze and evaluate news article data. The project is divided into two components: data analysis and data interpretation. The analysis was carried out in R, where I parsed JSON data, examined the dataset, and created visualizations. Understanding the results and generating conclusions from the examined data are both part of the interpretation process.

Project Structure

project-root
├── data
│   ├── articles.json
│   └── source_counts.csv
├── video
│   └── console.log.gif
├── outputs
│   └── Rplots.pdf
├── main.r
└── README.md

data/articles.json: JSON file containing the raw news articles data.
data/source_counts.csv: CSV file with the count of articles from each source.
video/console.log.gif: GIF showing the console log of the R script execution.
outputs/Rplots.pdf: PDF file containing the bar plot visualizing the number of articles by source.
main.r: R script for data analysis and visualization.

Technology Used

This project was developed using the following technologies:

R: For data analysis and visualization.
Replit: The development environment where the program was written and executed.

Data Analysis

Reading JSON Data

The analysis begins by reading the news articles data from a JSON file using the jsonlite package in R.

# Load necessary packages (if not already installed)
if (!require("jsonlite")) {
  install.packages("jsonlite")
}
library(jsonlite)

# Read Data from JSON File 
articles_data <- fromJSON("data/articles.json")

Cleaning and Transforming Data

The articles are converted into a data frame, and initial exploratory data analysis is performed to check for missing values and understand the structure of the data.

# Access the articles as a data frame
articles_df <- as.data.frame(articles_data$articles)

# Check for missing values
colSums(is.na(articles_df))

# Count of articles from each source
source_counts <- table(articles_df$source)

Analyzing Source Counts

The source counts are calculated, and the data is saved to a CSV file for further analysis and interpretation.

source_counts_df <- as.data.frame(source_counts)
colnames(source_counts_df) <- c("Source", "Count")  # Rename columns for clarity

# Write the data frame to a CSV file
write.csv(source_counts_df, file = "data/source_counts.csv", row.names = FALSE)

Visualizing Source Distribution

A bar plot is generated to visualize the number of articles from each source. The plot is saved as a PDF in the outputs directory.

# Create a bar plot for source counts (sorted in descending order)
barplot(sort(source_counts, decreasing = TRUE),
        main = "Number of Articles by Source",
        ylab = "Count",
        las = 2,  
        cex.names = 0.6)
mtext("Source", side = 1, line = 5, cex = 0.8)

# Move the Rplots.pdf file to the outputs folder
file.rename("Rplots.pdf", "outputs/Rplots.pdf")

Interpreting News Source Counts

The analysis provides insights into the distribution of news articles across different sources. By examining the source_counts.csv and the Rplots.pdf, you can interpret which sources contribute most heavily to the dataset and explore potential biases or trends.

How to Run the Analysis

Clone the repository: git clone https://github.com/yourusername/project-name.git cd project-name
Run the R script to perform the analysis: Rscript main.r
Review the output files:
- data/source_counts.csv for the article counts per source.
- outputs/Rplots.pdf for the visual representation of the source counts.

Example Output

Example console output

Author

Carisa Saenz-Videtto

Contact

carisasaenz@gmail.com

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

News Article Data Analysis and Interpretation

Table of Contents

Project Structure

Technology Used

Data Analysis

Reading JSON Data

Cleaning and Transforming Data

Analyzing Source Counts

Visualizing Source Distribution

Interpreting News Source Counts

How to Run the Analysis

Example Output

Author

Contact

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
data		data
outputs		outputs
video		video
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
main.r		main.r

License

FizzyMo/BeyondBackend.Data-Analysis-and-Interpretation

Folders and files

Latest commit

History

Repository files navigation

News Article Data Analysis and Interpretation

Table of Contents

Project Structure

Technology Used

Data Analysis

Reading JSON Data

Cleaning and Transforming Data

Analyzing Source Counts

Visualizing Source Distribution

Interpreting News Source Counts

How to Run the Analysis

Example Output

Author

Contact

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages