An analysis of the last 150 years of Major League Baseball and the role that batters' slugging and hitting percentages play in Sabermetrics and other sports statistics.

matthewjchin/baseballstats

baseballstats

Baseball Statistics

The Jupyter notebook Slugging.ipynb covers the following statistics: On-Base Percentage (OBP), Slugging Percentage (SLG), and On-base Plus Slugging (OPS).
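For readers new to these three statistics, here is a minimal sketch of how they are conventionally computed from counting stats. It uses the standard definitions, OBP = (H + BB + HBP) / (AB + BB + HBP + SF), SLG = total bases / AB, and OPS = OBP + SLG. The `BattingLine` container and the function names are hypothetical and are not taken from Slugging.ipynb.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Counting stats for one batter (hypothetical container for illustration)."""
    at_bats: int
    hits: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int


def on_base_percentage(b: BattingLine) -> float:
    # OBP = (H + BB + HBP) / (AB + BB + HBP + SF)
    denominator = b.at_bats + b.walks + b.hit_by_pitch + b.sacrifice_flies
    if denominator == 0:
        return 0.0
    return (b.hits + b.walks + b.hit_by_pitch) / denominator


def slugging_percentage(b: BattingLine) -> float:
    # SLG = total bases / AB, where singles = H - 2B - 3B - HR
    if b.at_bats == 0:
        return 0.0
    singles = b.hits - b.doubles - b.triples - b.home_runs
    total_bases = singles + 2 * b.doubles + 3 * b.triples + 4 * b.home_runs
    return total_bases / b.at_bats


def on_base_plus_slugging(b: BattingLine) -> float:
    # OPS is the sum of the two rates, so it is not itself a percentage.
    return on_base_percentage(b) + slugging_percentage(b)


# Made-up numbers for illustration only, not a real player's season.
example = BattingLine(at_bats=500, hits=150, doubles=30, triples=5,
                      home_runs=20, walks=50, hit_by_pitch=5, sacrifice_flies=5)
print(f"OBP={on_base_percentage(example):.3f} "
      f"SLG={slugging_percentage(example):.3f} "
      f"OPS={on_base_plus_slugging(example):.3f}")
```

Returning 0.0 for an empty denominator is a convenience chosen for this sketch; a real analysis might prefer to skip such rows or mark them as missing.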


Over time, this repository will contain code for prediction models, statistical analysis, player profiles, and data visualization of certain key components of Sabermetrics. The visualizations were created with the pyplot module of the matplotlib library.
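As a rough illustration of the kind of pyplot chart described here, the sketch below draws one player's OPS by season. The function name `plot_ops_by_season` and the sample values are hypothetical; the notebook's actual plotting code may differ.

```python
import matplotlib

# Non-interactive backend so the sketch runs without a display.
# Inside a Jupyter notebook this line can be dropped.
matplotlib.use("Agg")
import matplotlib.pyplot as plt


def plot_ops_by_season(seasons, ops_values, player_name):
    """Draw a line chart of OPS per season and return the Axes object."""
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(seasons, ops_values, marker="o")
    ax.set_xlabel("Season")
    ax.set_ylabel("OPS")
    ax.set_title(f"{player_name}: OPS by season")
    ax.set_xticks(list(seasons))  # one tick per season, no fractional years
    ax.grid(True, alpha=0.3)
    fig.tight_layout()
    return ax


# Placeholder values for illustration only, not real statistics.
ax = plot_ops_by_season([2001, 2002, 2003], [0.700, 0.750, 0.725], "Example Player")
```

To write the chart to disk, call `ax.figure.savefig(...)` with a file name of your choice.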


Notable Examples


Buster Posey, former catcher and first baseman for the San Francisco Giants (2009-2019, 2021), appears in the data visualization examples. Posey is a three-time World Series champion, seven-time All-Star, and five-time Silver Slugger, and some of his statistics have been and will be used as a small sample from one of the greatest catchers of all time.


Brandon Crawford, shortstop for the San Francisco Giants, serves as an example for both the slugging statistics and the data visualizations used in Sabermetrics. He is a two-time World Series champion with Gold Glove and All-Star selections. Below is data showing his on-base plus slugging (OPS) over his current twelve-year career with the Giants:


Hunter Pence, a 14-year MLB veteran right fielder and two-time World Series champion, has also been used as an example of the slugging statistics in individual Sabermetrics examples.


Inspired by the repository Basics of Sabermetrics by Ryan Berns.


A repository forked from Mr. Berns's work, which can be found here, inspired the creation of this repo.


Potential Future Projects:

Over time, this repository will include data science projects that could be useful for analyzing the progression of baseball statistics with Sabermetrics for years to come.

Predict players' stats for the 2020 season, had it been a full 162-game season, based on data from 2015-2019.

Determine whether higher OBP (not OPS) was a factor in the MVP finalists being chosen as they were.

Do defense and fielding matter in today's game when it comes to AL/NL MVP awards?
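To make the first project idea concrete, here is one naive baseline, offered as a sketch only: take a player's per-game rate over the prior seasons (optionally weighting recent seasons more heavily) and scale it to 162 games. This README does not describe a prediction model, so the function `project_full_season`, its parameters, and the weighting scheme are all assumptions made for illustration.

```python
def project_full_season(past_seasons, games=162, weights=None):
    """Project a counting stat over a full season from past per-game rates.

    past_seasons: list of (games_played, stat_total) tuples, oldest first.
    weights: optional list of per-season weights; defaults to equal weights.
    """
    if weights is None:
        weights = [1.0] * len(past_seasons)
    if len(weights) != len(past_seasons):
        raise ValueError("weights must match past_seasons in length")

    # Weighted totals across seasons, then a single per-game rate.
    weighted_stat = sum(w * total for w, (_, total) in zip(weights, past_seasons))
    weighted_games = sum(w * played for w, (played, _) in zip(weights, past_seasons))
    if weighted_games == 0:
        return 0.0
    return games * weighted_stat / weighted_games


# Made-up inputs: (games played, home runs) for two earlier seasons,
# with the more recent season counted twice as heavily.
projected_hr = project_full_season([(100, 10), (100, 20)], weights=[1, 2])
print(round(projected_hr, 1))
```

A baseline like this ignores aging, injuries, and regression to the mean, so it is best treated as a reference point that a fitted model should beat.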


1/30/24 update:

There may be answers to the questions above. Stay tuned. More models will be integrated and run, incorporating statistics and other data through the end of the 2023 MLB season.


Resources:


Any 2021 statistics in this repository's files come from [Baseball Reference](https://www.baseball-reference.com/).

References to both current and former players in this README are taken from the official website of Major League Baseball.

Additionally, statistics from 1871 to 2020 used across this repository come from the Baseball Databank, a resource of historical baseball data provided by the Chadwick Baseball Bureau. Their repository of data can be found here.
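For anyone working with the Baseball Databank files directly, the sketch below shows one way to turn batting rows into a season slugging percentage using only the standard library. It assumes the batting table's usual column names (`playerID`, `yearID`, `AB`, `H`, `2B`, `3B`, `HR`) and that a player traded mid-season has one row per stint; check both against your copy of the data. The function `season_slugging` is hypothetical and not part of this repository.

```python
import csv
import io
from collections import defaultdict

# Columns needed for slugging percentage (assumed header names).
COUNTING_COLUMNS = ["AB", "H", "2B", "3B", "HR"]


def _to_int(value):
    # Treat blank or missing cells as zero; this is a defensive assumption.
    return int(value) if value not in ("", None) else 0


def season_slugging(batting_csv):
    """Return {(playerID, yearID): SLG}, summing stints within a season.

    batting_csv: any iterable of CSV lines, such as an open file object.
    """
    totals = defaultdict(lambda: dict.fromkeys(COUNTING_COLUMNS, 0))
    for row in csv.DictReader(batting_csv):
        key = (row["playerID"], int(row["yearID"]))
        for column in COUNTING_COLUMNS:
            totals[key][column] += _to_int(row.get(column))

    slugging = {}
    for key, t in totals.items():
        if t["AB"] == 0:
            continue  # no at-bats, so SLG is undefined
        singles = t["H"] - t["2B"] - t["3B"] - t["HR"]
        total_bases = singles + 2 * t["2B"] + 3 * t["3B"] + 4 * t["HR"]
        slugging[key] = total_bases / t["AB"]
    return slugging


# Made-up rows for illustration only; real data would come from the batting file.
sample = io.StringIO(
    "playerID,yearID,stint,AB,H,2B,3B,HR\n"
    "example01,2019,1,100,30,5,1,4\n"
    "example01,2019,2,100,20,5,0,1\n"
)
print(season_slugging(sample))
```

With a real file, pass the open file object instead of the `io.StringIO` sample, for example `with open(path, newline="") as f: season_slugging(f)`.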
