Hacker News Posts - Data Cleaning
In this project, we're diving into a dataset of submissions made to the renowned tech website, Hacker News.
Hacker News, founded by the startup incubator Y Combinator, operates much like Reddit. Users submit stories, called "posts," which can garner votes and comments. The site enjoys significant popularity in tech and startup spheres. Stories that climb to the top of the Hacker News rankings can draw in immense traffic, sometimes reaching hundreds of thousands of views.
This is one of the guided projects from Dataquest, and the data can be downloaded from here.
Skills I learnt from this project -
- How to work with strings
- Object-oriented programming
- How to work with dates and times
- How to work with lists
The dataset contains the following columns:
id: the unique identifier from Hacker News for the post
title: the title of the post
url: the URL that the post links to, if the post has a URL
num_points: the number of points the post acquired, calculated as the total number of upvotes minus the total number of downvotes
num_comments: the number of comments on the post
author: the username of the person who submitted the post
created_at: the date and time of the post's submission
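To make the column descriptions above concrete, here is a minimal sketch of how rows with these columns could be read and converted to useful Python types. It rests on several assumptions that are not confirmed by the text: the data is a CSV with a header row, `created_at` uses a month/day/year hour:minute layout, and a missing `url` is stored as an empty string. The two sample rows and the `parse_posts` helper are made up for illustration and are not taken from the real dataset.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows, invented for illustration only.
SAMPLE_CSV = """id,title,url,num_points,num_comments,author,created_at
1,Ask HN: How do you clean messy data?,,12,7,example_user,8/4/2016 11:52
2,Show HN: A tiny CSV parser,https://example.com/parser,30,3,another_user,9/26/2016 3:14
"""

# Assumed timestamp layout; adjust it to match the actual file.
DATE_FORMAT = "%m/%d/%Y %H:%M"


def parse_posts(file_obj, date_format=DATE_FORMAT):
    """Read posts from a CSV file object and convert each column to a Python type."""
    posts = []
    for row in csv.DictReader(file_obj):
        posts.append({
            "id": int(row["id"]),
            "title": row["title"],
            # Assumption: posts without a link have an empty url field.
            "url": row["url"] or None,
            "num_points": int(row["num_points"]),
            "num_comments": int(row["num_comments"]),
            "author": row["author"],
            "created_at": datetime.strptime(row["created_at"], date_format),
        })
    return posts


posts = parse_posts(io.StringIO(SAMPLE_CSV))
for post in posts:
    print(post["created_at"].hour, post["num_comments"], post["title"])
```

To run this against the downloaded file, pass an open file handle instead of the `io.StringIO` sample, for example `parse_posts(open(path, newline="", encoding="utf-8"))`, where `path` is wherever the CSV was saved.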