The objective of this project is to build a model that predicts the players selected for the All-NBA Team and the All-Rookie Team based on their statistical data.
All-NBA Team | Player 1 | Player 2 | Player 3 | Player 4 | Player 5 |
---|---|---|---|---|---|
First Team | Giannis Antetokounmpo | Luka Doncic | Jayson Tatum | Anthony Davis | Shai Gilgeous-Alexander |
Second Team | Anthony Edwards | Kevin Durant | LeBron James | Nikola Jokic | Paolo Banchero |
Third Team | Jalen Brunson | De'Aaron Fox | DeMar DeRozan | Domantas Sabonis | Devin Booker |
All-Rookie Team | Player 1 | Player 2 | Player 3 | Player 4 | Player 5 |
---|---|---|---|---|---|
First Team | Victor Wembanyama | Chet Holmgren | Brandon Miller | Keyonte George | Scoot Henderson |
Second Team | Jaime Jaquez Jr. | Amen Thompson | Brandin Podziemski | Cason Wallace | Ausar Thompson |
- Loading data from a CSV file
- Data preprocessing including handling missing values, standardization, and feature selection
- Modeling using Random Forest Classifier for player classification
- Evaluation of different models including Random Forest, Support Vector Regressor (SVR), and XGBoost
all_nba.ipynb: Jupyter Notebook containing the data analysis process, including data loading, preprocessing, modeling, and evaluation.
- Number of All-NBA Nominations for Top 20 Players
- Feature Correlation Matrix
- Features Most Correlated with All-NBA Nomination: presents the features (player statistics) most correlated with All-NBA nomination, helping identify key predictors.
- Average Age of All-NBA Nominated Players in Each Season
- Teams with the Most Players in All-NBA: displays teams with the highest number of players nominated for All-NBA, highlighting teams with significant impact.
- Teams with the Most Players in All-NBA in Each Season: shows teams with the highest number of players nominated for All-NBA in each season, indicating changes in dominant teams over time.
- The check_files_exist function checks whether the required files are present in the specified directory and returns a list of any missing files.
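A minimal sketch of such a check (the argument names are illustrative):

```python
import os

def check_files_exist(directory, filenames):
    """Return the required files that are missing from `directory`."""
    return [name for name in filenames
            if not os.path.isfile(os.path.join(directory, name))]
```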
- The load_data function reads the necessary CSV files into pandas DataFrames for further processing.
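A sketch of the loading step; the CSV file names here are placeholders, not necessarily the ones used in the project:

```python
import os
import pandas as pd

def load_data(directory):
    """Read the input CSV files into pandas DataFrames (file names are illustrative)."""
    stats = pd.read_csv(os.path.join(directory, "player_stats.csv"))
    awards = pd.read_csv(os.path.join(directory, "all_nba_awards.csv"))
    return stats, awards
```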
- The preprocess_data function prepares the player statistics data by performing the following steps (see the sketch after this list):
- Dropping Unnecessary Columns: Columns that are not needed for analysis are removed.
- Converting Data Types: The "GP" (games played) column is converted to integers to ensure proper numerical operations.
- Filtering Players: Only players who played more than 40 games in a season are retained.
- Filtering Seasons: The data is filtered to include only the specified seasons.
- Adding All-NBA Nominations: A new column "ALL_NBA_NOMINATION" is added to indicate whether a player received an All-NBA nomination. This column is initially set to 0 for all players.
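A condensed sketch of these steps. The SEASON and PLAYER_ID column names, the dropped columns, and the way nominations are passed in are assumptions for illustration:

```python
import pandas as pd

def preprocess_data(df, seasons, nominated_ids, drop_cols=("PLAYER_NAME",)):
    """Filter and annotate the raw player statistics (several names here are illustrative)."""
    df = df.drop(columns=list(drop_cols), errors="ignore")
    df["GP"] = df["GP"].astype(int)             # games played as integers
    df = df[df["GP"] > 40]                      # keep players with more than 40 games
    df = df[df["SEASON"].isin(seasons)].copy()  # keep only the specified seasons
    df["ALL_NBA_NOMINATION"] = 0                # default: no nomination
    df.loc[df["PLAYER_ID"].isin(nominated_ids), "ALL_NBA_NOMINATION"] = 1
    return df
```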
- The target variable, "ALL_NBA_NOMINATION", is separated from the features.
- Columns that are not needed for the model are removed from the features DataFrame.
- The "DRAFT_YEAR" and "DRAFT_NUMBER" columns are converted to integer type, with undrafted players assigned a value of -1.
- Categorical data in the "TEAM_ABBREVIATION" column is converted to dummy variables.
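These feature-preparation steps might look roughly like the following; the prepare_features name and the default list of dropped columns are illustrative:

```python
import pandas as pd

def prepare_features(df, drop_cols=("PLAYER_NAME", "PLAYER_ID", "SEASON")):
    """Split the target from the features and encode the remaining columns."""
    y = df["ALL_NBA_NOMINATION"] if "ALL_NBA_NOMINATION" in df else None
    X = df.drop(columns=["ALL_NBA_NOMINATION", *drop_cols], errors="ignore")

    # Undrafted players have non-numeric draft entries; mark them with -1.
    for col in ("DRAFT_YEAR", "DRAFT_NUMBER"):
        X[col] = pd.to_numeric(X[col], errors="coerce").fillna(-1).astype(int)

    # One-hot encode the team abbreviation.
    X = pd.get_dummies(X, columns=["TEAM_ABBREVIATION"])
    return X, y
```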
- The standardize_features function standardizes the features using a StandardScaler. This ensures that all features have a mean of 0 and a standard deviation of 1.
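A minimal sketch using scikit-learn's StandardScaler; returning the fitted scaler makes it reusable for new data:

```python
from sklearn.preprocessing import StandardScaler

def standardize_features(X):
    """Scale features to zero mean and unit variance; return the scaler for later reuse."""
    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)
    return X_scaled, scaler
```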
- The train_random_forest function trains a Random Forest model. The dataset is split into training and testing sets, the model is trained on the training set, and predictions are made on the testing set.
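A sketch of the training step; the split ratio and hyperparameters are assumptions, not necessarily those used in the notebook:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

def train_random_forest(X, y, test_size=0.2, random_state=42):
    """Split the data, fit a Random Forest, and report performance on the held-out set."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y)
    model = RandomForestClassifier(n_estimators=300, random_state=random_state)
    model.fit(X_train, y_train)
    print(classification_report(y_test, model.predict(X_test)))
    return model
```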
- The predict_new_season function uses the trained Random Forest model to predict All-NBA nominations for a new season. It preprocesses the new season's data similarly to the training data, standardizes it, and makes predictions.
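A sketch of the prediction step, reusing the hypothetical prepare_features helper from above; the feature_columns argument aligns the new season's dummy columns with those seen during training:

```python
def predict_new_season(model, scaler, new_season_df, feature_columns):
    """Preprocess the new season like the training data and return nomination probabilities."""
    X_new, _ = prepare_features(new_season_df)                    # hypothetical helper from above
    X_new = X_new.reindex(columns=feature_columns, fill_value=0)  # align with training features
    X_new_scaled = scaler.transform(X_new)                        # reuse the fitted scaler
    return model.predict_proba(X_new_scaled)[:, 1]                # probability of a nomination
```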
- The generate_award_predictions function assigns predicted players to different All-NBA teams based on their predicted probabilities.
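One possible assignment rule, assuming the fifteen players with the highest predicted probabilities fill the three five-man teams in order (this ranking rule is an assumption, not necessarily the notebook's exact logic):

```python
def generate_award_predictions(player_names, probabilities):
    """Rank players by predicted probability and split the top 15 into three teams (assumed rule)."""
    ranked = sorted(zip(player_names, probabilities), key=lambda pair: pair[1], reverse=True)
    top_15 = [name for name, _ in ranked[:15]]
    return {
        "first all-nba team": top_15[0:5],
        "second all-nba team": top_15[5:10],
        "third all-nba team": top_15[10:15],
    }
```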
- The save_results function saves the predicted results to a JSON file.
- The save_model function saves the trained model and the scaler to a file using pickle.
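Minimal sketches of both persistence helpers, assuming the output paths are configurable:

```python
import json
import pickle

def save_results(results, path="results.json"):
    """Write the predicted teams to a JSON file."""
    with open(path, "w") as f:
        json.dump(results, f, indent=2)

def save_model(model, scaler, path="model.pkl"):
    """Persist the trained model together with its fitted scaler."""
    with open(path, "wb") as f:
        pickle.dump({"model": model, "scaler": scaler}, f)
```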
- The main function checks for missing files, loads the data, preprocesses it, trains the model, makes predictions, and saves the results.
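Putting the sketches above together, a hypothetical main flow could look like this (file names, seasons, and column names are illustrative):

```python
import os
import pandas as pd

def main(directory="data"):
    missing = check_files_exist(directory, ["player_stats.csv", "all_nba_awards.csv"])
    if missing:
        raise FileNotFoundError(f"Missing input files: {missing}")

    stats, awards = load_data(directory)
    df = preprocess_data(stats, seasons=["2020-21", "2021-22", "2022-23"],
                         nominated_ids=awards["PLAYER_ID"])
    X, y = prepare_features(df)
    X_scaled, scaler = standardize_features(X)
    model = train_random_forest(X_scaled, y)

    new_season = pd.read_csv(os.path.join(directory, "player_stats_new_season.csv"))
    probabilities = predict_new_season(model, scaler, new_season, feature_columns=X.columns)
    results = generate_award_predictions(new_season["PLAYER_NAME"], probabilities)

    save_results(results)
    save_model(model, scaler)

if __name__ == "__main__":
    main()
```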