This Streamlit app provides a toolkit for data exploration, visualization, cleaning, and basic machine learning. It covers the main stages of a data science project, from loading a dataset to downloading the processed result, and is aimed at data scientists and analysts.
# Features
- Exploratory Data Analysis (EDA)
  - Upload Datasets: Supports CSV, TXT, and XLSX files.
  - View Data: Display the first few rows of the dataset.
  - Data Summary: Show the shape and descriptive statistics of the dataset.
  - Column Information: List all columns and allow the selection of specific columns for detailed analysis.
  - Correlation Matrix: Visualize correlations between features using heatmaps.
  - Scatter Plot: Create scatter plots for any two selected features.
- Data Visualization
  - Plot Types: Generate various plots including area, bar, line, histogram, box, KDE, pair plots, and scatter matrix.
  - Interactive Plots: Utilize Plotly to create interactive scatter matrix plots.
- Data Cleaning
  - Handle Missing Values: Fill missing values with column means.
  - Drop Columns: Remove unwanted columns from the dataset.
- Machine Learning
  - Model Training: Train a Random Forest classifier on selected features.
  - Model Evaluation: Display classification reports and confusion matrices.
- Download Processed Data
  - Download CSV: Allow users to download the processed DataFrame as a CSV file.
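
The cleaning, correlation, and modeling features listed above can be sketched outside Streamlit with pandas and scikit-learn. This is a minimal illustration, not the app's actual code: the helper names `fill_numeric_means` and `train_and_report`, the synthetic data, the 75/25 train/test split, and the hyperparameters are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split


def fill_numeric_means(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with NaNs in numeric columns replaced by the column mean."""
    out = df.copy()
    numeric_cols = out.select_dtypes(include="number").columns
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].mean())
    return out


def train_and_report(df, feature_cols, target_col, seed=0):
    """Fit a Random Forest and return the model, text report, and confusion matrix."""
    # Assumption: a stratified 75/25 split; the app may split differently.
    X_train, X_test, y_train, y_test = train_test_split(
        df[feature_cols],
        df[target_col],
        test_size=0.25,
        random_state=seed,
        stratify=df[target_col],
    )
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    report = classification_report(y_test, y_pred)
    matrix = confusion_matrix(y_test, y_pred)
    return model, report, matrix


# Synthetic data standing in for an uploaded dataset (hypothetical columns).
rng = np.random.default_rng(0)
raw = pd.DataFrame(
    {
        "x1": rng.normal(size=200),
        "x2": rng.normal(size=200),
        "unused": "drop me",
    }
)
raw["label"] = (raw["x1"] + raw["x2"] > 0).astype(int)
raw.loc[::10, "x1"] = np.nan  # inject missing values into one feature

# Data Cleaning: fill missing values with column means, then drop a column.
clean = fill_numeric_means(raw).drop(columns=["unused"])

# EDA: the correlation matrix that a heatmap would visualize.
corr = clean.corr()

# Machine Learning: train on the selected features and evaluate.
model, report, matrix = train_and_report(clean, ["x1", "x2"], "label")
print(report)
print(matrix)
```

Mean imputation here touches only numeric columns, since a mean is undefined for text columns; whether the app applies the same restriction is not stated in this README.
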
# How to Use
1. Upload your dataset: Choose from CSV, TXT, or XLSX file formats.
2. Select an activity: Choose from EDA, Plots, Data Cleaning, Machine Learning, or Download.
3. Perform analysis and visualization: Use the various tools and options provided to explore and analyze your data.
4. Download the processed data: Save your cleaned and augmented data for further use.
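
The upload and download steps can be illustrated with a short pandas sketch. It is an assumption-laden example rather than the app's implementation: `load_table` and `to_csv_bytes` are hypothetical helper names, and TXT files are assumed to contain delimited text whose separator pandas can detect.

```python
import io
from pathlib import Path

import pandas as pd


def load_table(filename: str, data: bytes) -> pd.DataFrame:
    """Parse uploaded bytes into a DataFrame, dispatching on the file extension."""
    suffix = Path(filename).suffix.lower()
    buffer = io.BytesIO(data)
    if suffix == ".csv":
        return pd.read_csv(buffer)
    if suffix == ".txt":
        # Assumption: TXT means delimited text; let pandas sniff the separator.
        return pd.read_csv(buffer, sep=None, engine="python")
    if suffix == ".xlsx":
        # Needs an Excel engine such as openpyxl to be installed.
        return pd.read_excel(buffer)
    raise ValueError(f"Unsupported file type: {suffix or filename}")


def to_csv_bytes(df: pd.DataFrame) -> bytes:
    """Serialize a DataFrame to UTF-8 CSV bytes without the index column."""
    return df.to_csv(index=False).encode("utf-8")


# Stand-in for the bytes of an uploaded file.
uploaded = b"a,b\n1,2\n3,4\n"
df = load_table("example.csv", uploaded)
payload = to_csv_bytes(df)
```

In a Streamlit script, the object returned by `st.file_uploader` exposes `.name` and `.getvalue()`, which could be passed to `load_table`, and `payload` could be handed to `st.download_button` as its `data` argument with `mime="text/csv"`.
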