Graded Assignment-5 (Jan Term 2024):- Data Visualization Tools #26

Open
Jimmi-Kr opened this issue Mar 15, 2024 · 45 comments

Comments

@Jimmi-Kr
Collaborator

With a plethora of both commercial & free visualization tools & libraries available, it can often be confusing to pick the right tool for your requirement. Also, from the learning point of view, one doesn't know which tool or set of tools to invest time & effort in learning.

In her 2016 article "What I Learned Recreating One Chart Using 24 Tools", Lisa Charlotte Rost tried out 12 data vis applications and 12 data vis libraries and programming languages and reported a comparative evaluation.

In this assignment, you will recreate the exercise with at least 5 charting tools or libraries (5 in total, not 5 each) for the given dataset (auto-mpg.csv). You may create any chart type, but it must use at least 2 variables from the dataset. Having decided on the chart type & variables, repeat the same chart using the 5 chart tools or libraries. Paste your charts as a comment on this issue. Add text to each chart identifying the tool/library you used for the chart.

Note: You can only use one from Matplotlib, seaborn, and Excel.

@hanani8

hanani8 commented Mar 17, 2024

Hanani Bathina

Roll No: 21f1006169


image
Amongst the 24 tools explored by Lisa Charlotte Rost, I have decided to try these 5:

  1. Matplotlib
  2. D3.js
  3. Plotly
  4. Google Sheets
  5. Lyra.

I have tried to ensure that the 5 chosen tools span as many of the quadrants depicted in the graph as possible.

Chosen Variables

  1. Weight
  2. No. of Cylinders
  3. Horsepower
  4. Acceleration

Visual Encoding

I used weight on the x-axis, the number of cylinders on the y-axis, bubble size to represent horsepower, and color gradient for acceleration to create a focused analysis on the interplay between the physical attributes of the car (weight and engine size) and its performance characteristics (horsepower and acceleration).

This configuration allows you to directly assess:

How weight affects acceleration and horsepower.
How the number of cylinders relates to horsepower (which often increases with more cylinders) and acceleration.
Whether heavier cars have more horsepower or if they tend to accelerate more slowly due to the increased mass.

Horsepower for Bubble Size:

Horsepower as the size of the bubble could provide a very intuitive visual cue because we naturally associate "bigger" with "more powerful".

Acceleration as Fill Color:

A color gradient for acceleration could work well since it could range from cooler colors for slower acceleration to warmer colors for faster acceleration.

Matplotlib

image

Description

I have used Matplotlib in a Jupyter Notebook for this chart. I used NumPy for numerical pre-processing, such as scaling and adding jitter to avoid overlapping points, Matplotlib's scatter plot for plotting, and the Reds sequential colormap for color coding.
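
The pre-processing described above could look roughly like the sketch below. It is not the notebook code from this comment: the helper names `jitter`, `scale_sizes` and `plot_bubbles`, the jitter width, and the marker size range are all assumptions, the sample values are made up, and the helpers use the standard library rather than NumPy so they run on their own.

```python
import random


def jitter(values, width=0.15, seed=0):
    """Add small uniform noise so points sharing a value do not overlap."""
    rng = random.Random(seed)
    return [v + rng.uniform(-width, width) for v in values]


def scale_sizes(values, min_size=20.0, max_size=400.0):
    """Linearly rescale values to marker areas in [min_size, max_size]."""
    low, high = min(values), max(values)
    if high == low:
        return [(min_size + max_size) / 2 for _ in values]
    return [min_size + (v - low) / (high - low) * (max_size - min_size) for v in values]


def plot_bubbles(weight, cylinders, horsepower, acceleration):
    """Draw the bubble chart; Matplotlib is imported here so the helpers work without it."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    points = ax.scatter(
        weight,
        jitter(cylinders),          # jitter the discrete axis only
        s=scale_sizes(horsepower),  # bubble area encodes horsepower
        c=acceleration,             # color encodes acceleration
        cmap="Reds",
        alpha=0.7,
    )
    fig.colorbar(points, ax=ax, label="acceleration")
    ax.set_xlabel("weight")
    ax.set_ylabel("cylinders")
    return fig


# Hypothetical sample values, for illustration only.
cylinders = [4, 4, 6, 8, 8]
horsepower = [70, 90, 110, 150, 200]
jittered = jitter(cylinders)
sizes = scale_sizes(horsepower)
```

Jittering only the cylinder axis keeps the continuous weight values exact while spreading out the rows that share a cylinder count.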

D3.js

image

Description

Used scaleLinear() of D3.js for X-axis (weight) and Y-axis (no. of cylinders), scaleSequential() for colors (acceleration), and scaleSqrt() for size of the bubble. Then, I've simply used svg circles to build the bubble chart with 0.7 opacity.
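
For readers who do not use D3.js, the idea behind the linear and square-root scales mentioned above can be sketched in Python. This is only an illustration of the mapping, not D3's implementation: the function names `scale_linear` and `scale_sqrt` and the domains and pixel ranges are assumptions.

```python
import math


def scale_linear(domain, out_range):
    """Return a function mapping domain linearly onto out_range."""
    d0, d1 = domain
    r0, r1 = out_range

    def scale(value):
        t = (value - d0) / (d1 - d0)
        return r0 + t * (r1 - r0)

    return scale


def scale_sqrt(domain, out_range):
    """Return a function that interpolates on the square root of the input."""
    d0, d1 = domain
    r0, r1 = out_range
    s0, s1 = math.sqrt(d0), math.sqrt(d1)

    def scale(value):
        t = (math.sqrt(value) - s0) / (s1 - s0)
        return r0 + t * (r1 - r0)

    return scale


# Hypothetical domains and pixel ranges, chosen only for illustration.
x = scale_linear((1500, 5500), (0, 800))
radius = scale_sqrt((0, 240), (0, 30))
```

When the domain and range both start at 0, a square-root scale on the radius makes the circle's area grow in proportion to the value, which is why it is a common choice for bubble sizes.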

Plotly

image

Description

The code and description are pretty similar to the Matplotlib graph. I have used Scatter from plotly.graph_objects.

Google Sheets

Car Attributes Bubble Chart

Description

I have used the bubble chart type in Google Sheets. Since Sheets does not support color gradients natively, I have binned the acceleration values and manually assigned colors to simulate a color gradient.
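
The binning step can be sketched as follows. The comment does not say how many bins or which colors were used, so the equal-width binning, the four-color palette, and the helper names `bin_index` and `assign_colors` are all assumptions, and the acceleration values are made up.

```python
def bin_index(value, low, high, n_bins):
    """Return the 0-based equal-width bin that value falls into."""
    if high == low:
        return 0
    position = (value - low) / (high - low) * n_bins
    # The maximum value lands exactly on the upper edge, so clamp it.
    return min(int(position), n_bins - 1)


def assign_colors(values, palette):
    """Map each value to a palette color according to its equal-width bin."""
    low, high = min(values), max(values)
    return [palette[bin_index(v, low, high, len(palette))] for v in values]


# Hypothetical light-to-dark palette and sample acceleration values.
palette = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]
acceleration = [8.0, 12.0, 15.5, 16.0, 20.0, 24.0]
colors = assign_colors(acceleration, palette)
```

Each resulting color can then be applied by hand to the matching rows or series in the spreadsheet.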

Lyra

image

Description

It is a drag and drop interface.

@Saikat88

Saikat88 commented Mar 17, 2024

Name - Saikat Samanta
Roll - 21f1003501

Plotting Miles per gallon vs Weight for the graphs from the Given Dataset

Google Sheets:
SheetsGA5

RAWgraphs:
rawgraphsGA5

Tableau:
tableuGA5

PowerBI:
PowerBIGA5

Matplotlib:
MTBGA5

@shrikrishna97

shrikrishna97 commented Mar 17, 2024

Name - Shri Krishna Pandey
Roll No. - 21f1006966

Dataset Description

  • Given dataset had 9 columns:
    • mpg - Miles per gallon.
    • cylinders - Number of cylinders in the car
    • displacement - Displacement of the car
    • horsepower - Horsepower of the car
    • weight - Weight of the car.
    • acceleration - Acceleration of the car.
    • model year - Year, when Car model was released in the market.
    • origin - Place of manufacturing.
    • car name - Name of the car.

Purpose

To find the relation between Horsepower and mileage (mpg) based on number of Cylinders.

Visualization type : Scatterplot

  • Preprocessing:

    • Removed rows that didn't have horsepower data.
    • Added an extra color column for Datawrapper.
    • Coded the color encoding for each color in GG-Plot.
    • Wrote one query in Tableau to color all the dots.
  • Data Used:

    • 392 Rows
    • 3/4 Columns.
  • Parameters used in Chart

    • horsepower
    • mpg
    • cylinders
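
Dropping the rows without horsepower data, as listed under Preprocessing above, might look like the sketch below. It assumes a missing value shows up as an empty cell or a `?` marker, which may not match the course copy of auto-mpg.csv. The helper name `has_horsepower` and the sample rows are made up for illustration.

```python
import csv
import io

# Tiny made-up sample using a subset of the dataset's columns.
SAMPLE = """mpg,cylinders,horsepower
18.0,8,130
25.0,4,?
21.0,6,
32.0,4,65
"""


def has_horsepower(row):
    """True when the horsepower cell holds a usable number."""
    value = row["horsepower"].strip()
    if value in ("", "?"):
        return False
    try:
        float(value)
    except ValueError:
        return False
    return True


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
clean = [row for row in rows if has_horsepower(row)]
```

To run this against the real file, replace `io.StringIO(SAMPLE)` with an opened file handle.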

Visualizations

1. Microsoft Excel

Screenshot 2024-03-17 000406

Detail about the graph: The X-axis (MPG) ranges from 5 to 50, the Y-axis (Horsepower) ranges from 0 to 250, and dots are colored by number of cylinders (i.e. 3, 4, 5, 6, 8).

2. DataWrapper.de

YfIAS-relation-between-horsepower-and-mpg-based-on-number-of-cylinders

Detail about the graph: The X-axis (MPG) ranges from 0 to 50, the Y-axis (Horsepower) ranges from 0 to 250, and dots are colored by number of cylinders (i.e. 3, 4, 5, 6, 8).

3. R-Studio / GG-Plot

image

Detail about the graph: The X-axis (MPG) ranges from 0 to 49, the Y-axis (Horsepower) ranges from 0 to 240, and dots are colored by number of cylinders (i.e. 3, 4, 5, 6, 8).

4. Tableau desktop

tableau_ga5

Detail about the graph: The X-axis (MPG) ranges from 5 to 50, the Y-axis (Horsepower) ranges from 0 to 240, and dots are colored by number of cylinders (i.e. 3, 4, 5, 6, 8).

5. Power BI

image

Detail about the graph: The X-axis (MPG) ranges from 5 to 50, the Y-axis (Horsepower) ranges from 45 to 240, and dots are colored by number of cylinders (i.e. 3, 4, 5, 6, 8).

@DebapriyoSaha

DebapriyoSaha commented Mar 17, 2024

Name: Debapriyo Saha
Roll No.: 21f1004645

The five visualization tools that are being used for plotting are the following:

  1. Orange
  2. DataWrapper
  3. Plotly
  4. Seaborn & Matplotlib
  5. Flourish

Dataset used: The dataset contains details of different cars, along with their features and miles per gallon (mpg) values.

x-axis: Acceleration
y-axis: Weight

Color encodes Origin, and the size of each bubble is based on the acceleration value (the higher the acceleration, the larger the bubble).

1) Orange Plot

GA5_plot

2) DataWrapper Plot

5liJP-acceleration-vs-weight-by-origin

3) Plotly

newplot

4) Seaborn & Matplotlib

matplotlib

5) Flourish

DVD GA 5@2x

@rt1916-IITM

Name: ROYCE TOMY (21F1001916)

mpg vs weight

  1. MS Excel

image

  2. Power BI

power_bi

  3. Tableau

tableau

  4. Plotly (library)

plotly

  5. Altair (library)

altair

@soumyanamboo

One Chart Using 5 Tools

Name: Soumya V Namboodiripad
Roll Number: 21f1004752

Dataset: auto-mpg (Automobile Dataset)

Variables Used: mpg, horsepower

Type of Chart: Scatterplot

1. Excel

image

2. Datawrapper

Datawrapper-mpg-vs-horsepower

link to the visualization

3. Tableau Public

TableauPublic_AutoMPG

link to the visualization

4. Flourish

ScatterPlot_AutoMPG_Flourish

link to the visualization

5. Matplotlib

image

@Manaswita06

Manaswita06 commented Mar 25, 2024

Name: Manaswita Mandal
Roll no: 21f1004567

The five visualization tools that I have used for my analysis are:

  • Seaborn
  • Power Bi
  • Flourish
  • GGPlot using R
  • Plotly

Data: The dataset contains details of different cars, along with their features and miles per gallon (mpg) values. I looked for correlations between some of the variables to find out whether they affect the mpg of different cars.
Attributes used for analysing:

  • 'mpg'
  • 'cylinders'
  • 'displacement'
  • 'weight'
    These variables were chosen because they have a strong negative correlation with the mileage of cars, as shown by the following heatmap:

image

Following charts are plotted:

Bubble Chart using Seaborn:

seaborn_mpg_cars

Bubble Chart using Power Bi

power_bi_mpg_cars

Bubble Chart using Flourish

flourish_mpg_cars

Bubble Chart using GGPlot

GGplot_mpg_cars

Bubble Chart Using Plotly

plotly_mpg_cars

Charts explanation:

  1. The variable 'displacement' has been used in x - axis.
  2. The variable 'mpg' which is the target variable, has been used in y - axis.
  3. The variable 'weight' is used to determine the bubble size.
  4. Color hue has been shown based on 'cylinders', i.e., based on the number of cylinders used in cars.

Observation:

  1. As displacement increases, the mpg values of cars decrease.
  2. As the weight increases, the mpg values of cars decrease.
  3. If the number of cylinders is more, then the mpg values of cars are less.

Inference:

  1. All the variables are negatively correlated with 'mpg' (miles per gallon).
  2. Therefore for better mileage, the weight should be less, number of cylinders should be less and also the displacement should be less.
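
The negative correlations noted above can be checked numerically with a Pearson coefficient. A minimal sketch follows. The helper name `pearson` is an assumption, and the sample values are made up for illustration, not taken from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sample values, not taken from the dataset.
mpg = [15.0, 18.0, 22.0, 27.0, 33.0]
weight = [4300, 3500, 3000, 2500, 2000]
r = pearson(mpg, weight)
```

A value of `r` close to -1 means the two variables move in opposite directions almost linearly; it says nothing about causation on its own.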

@kaushikpatriot

Name: Kaushik V
Roll No: 21f1001083

Fuel efficiency vs Model Year - How has the distribution changed and has it improved over the years?

Observation from the chart: The median fuel efficiency has improved significantly over the years and the range band has become narrower, indicating that, irrespective of brand and specs, car manufacturers have tried to become more fuel efficient across categories.
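
The quantities a box plot summarizes per model year (median and range) can also be computed directly, which is a handy cross-check when comparing tools. This is a small sketch: the helper name `summarize_by_year` is an assumption and the records are made up, so the numbers do not reflect the actual dataset.

```python
from collections import defaultdict
from statistics import median


def summarize_by_year(records):
    """Return {year: (min, median, max)} of mpg for each model year."""
    by_year = defaultdict(list)
    for year, mpg in records:
        by_year[year].append(mpg)
    return {
        year: (min(values), median(values), max(values))
        for year, values in sorted(by_year.items())
    }


# Hypothetical (model year, mpg) pairs, for illustration only.
records = [(70, 14.0), (70, 18.0), (70, 25.0), (82, 27.0), (82, 31.0), (82, 36.0)]
summary = summarize_by_year(records)
```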

  1. Tool: Seaborn / Matplotlib
    image

  2. Tool: Plotly
    image

  3. Tool: Vega-lite / Altair
    image

  4. Tool: Tableau
    Sheet 2

  5. Tool: ggplot2 / R
    image

@viraj19r

Name : Viraj Sharma
Roll No: 21f1003723

For this assignment, I used the dataset auto-mpg.csv to make a bubble chart with five different tools. A bubble chart is like a scatter plot, but it uses the size of the circles to show more information. I picked the mpg (miles per gallon) and displacement (engine size) variables from the dataset, and the size of the bubbles was determined by the number of cylinders in the car engines.

The 5 tools that I have used are the following:

  • Plotly
  • Altair
  • Matplotlib & Seaborn
  • Google Sheets
  • Bokeh
  1. Plotly:
    image

  2. Altair:
    image

  3. Matplotlib and Seaborn:
    image

  4. Google Sheets:
    image

  5. Bokeh:
    image

I learned that different tools are better for different jobs. Some are good for making simple charts fast, others are better for when you need to make something very detailed, and some are best when you want to put a chart on a website. Which tool to use depends on what we need the chart to do.

@Puravasu-Jaideep-Sesha

Puravasu Jaideep Sesha
21f1000162

Dataset : Data

Variables used:

  • Acceleration
  • Model Year

Tools Used:

  • Seaborn
  • Plotly
  • Tableau
  • Raw
  • Altair(Vega)

Visualizations

  1. Seaborn
    Seaborn

  2. Plotly
    plotly

  3. Tableau
    Tabeau

  4. Raw
    RawGraph

  5. Altair(Vega)
    Altair(vega)

I wanted to try out Datawrapper, but it does not offer box plots, since box plots are not very well known to the general public. It offers alternatives like the range plot to make up for it. 😊

@Yalinisaravanan

Displacement vs Miles per gallon:

Variables used:

  • mpg
  • Displacement

Tools used:

  • Plotly
  • Matplotlib
  • Flourish
  • Datawrapper
  • Raw graph
  • Google sheet

Raw graph:
viz

Flourish:
snapshot-1711471732003

Datawrapper
Screenshot 2024-03-26 230030

Matplotlib:
Screenshot 2024-03-26 222945

Plotly:
Screenshot 2024-03-26 222810

Google sheet:
Screenshot 2024-03-26 221419

Yalini S
21f1004138

@Anion061

Anion061 commented Mar 26, 2024

Categorical Bubble Chart: Acceleration vs. Weight

Variable Used:

  • Y-Axis -> Acceleration
  • X-Axis -> Weight
  • Size -> MPG
  • Color -> Origin

Tools Used:

  • Seaborn
  • Tableau
  • Flourish
  • Plotly
  • PowerBI

SEABORN
Screenshot 2024-03-26 231142

TABLEAU
Sheet 1 (2)

FLOURISH
Flourish (1)

PLOTLY
Screenshot 2024-03-26 231219

POWER BI
Screenshot 2024-03-27 000929

Anushka Aggarwal
21f2000407

@Ak7210

Ak7210 commented Mar 27, 2024

Name: Ajeet Kumar,
Roll Number: 21f1006807

Scatter Plot: Relation between MPG (miles per gallon) and Horsepower by number of Cylinders

Variable Used:

x-axis: MPG
y-axis: Horsepower
categories color: Cylinder

Tools Used:

Excel
Tableau
Flourish
Plotly
PowerBI

  1. Excel
    Excel_image2

  2. Tableau
    Screenshot 2024-03-27 110736

  3. Flourish
    Flourish_image

  4. Plotly

Plotly

  5. PowerBI

PowerBI

@phanijallipalli

phanijallipalli commented Mar 27, 2024

Name : Jallipalli Phani Kumar
Roll No : 21f3002478

This is a comparative evaluation of scatter plots generated for the relationship between "Cylinders" and "Displacement" using various visualization tools. The purpose of this evaluation is to assess the strengths and weaknesses of each tool in creating effective visualizations.

Tools Used:
Excel
Flourish
Bokeh
Orange Tool
Plotnine

Dataset:
The dataset used for this evaluation is the "auto_data.xlsx", containing information about automobile specifications including cylinders and displacement.

Findings:

Excel:
Excel provides a user-friendly interface for creating scatter plots.
The scatter plot generated in Excel allows for basic customization but lacks advanced features.
It is suitable for quick visualization tasks but may not be ideal for complex visualizations.
image

Flourish:
Flourish is an online tool for creating data visualizations with interactive features.
The scatter plot created in Flourish may offer more interactivity compared to other tools but requires uploading the data to the platform.
image

Bokeh:
Bokeh is a Python library for creating interactive visualizations.
The scatter plot generated with Bokeh allows for customization and interactivity, such as zooming and panning.
Bokeh is suitable for creating professional-grade visualizations with programmable features.
image

Orange Tool:
Orange is a data visualization and machine learning tool with a graphical interface.
The scatter plot created in Orange offers basic visualization capabilities with limited customization options.
Orange is user-friendly and suitable for users with less programming experience.
image

Plotnine:
Plotnine is a Python implementation of ggplot2, a popular plotting system for the R programming language.
The scatter plot generated with Plotnine offers a high level of customization and follows the grammar of graphics principles.
Plotnine is ideal for users familiar with ggplot2 syntax and seeking advanced visualization capabilities.
image

@rrohnyy02

Name: Rohan Khandelwal
Roll Number: 21f1005976

Scatter Plot: Relation between Weight, acceleration and model year

Axes description:
x-axis: weight
y-axis: acceleration
categories colour: model year

Tools:
Flourish
R studio
Matplotlib & Seaborn
DataWrapper
Plotly

  1. Flourish
    flourish

  2. R studio

Screenshot 2024-03-27 at 5 09 20 PM
  3. Plotly
    plotly

  4. Matplotlib & Seaborn
    seaborn

  5. DataWrapper
    flourish

@HungryPanda0212

Name - Chandana Nisankara
Roll Number - 21f1005727

I have chosen mpg (miles per gallon) and weight from the data to understand how a car's weight impacts its fuel efficiency. Heavier cars tend to have lower mpg due to increased energy requirements.

The tools / applications that I have chosen to represent the data are:
1. Google Sheets (Application)
2. Orange (Application)
3. Plotly (Library)
4. Seaborn (Library)
5. Tableau (Application)

1. Google sheets :
Mpg vs Weight

2. Orange :
orange

3.Seaborn:
image

4.Plotly:
image

5.Tableau :
image

@Sharmaom24

Name: Om Sharma
Roll No.: 21f1004424

Variables used: MPG and Weight

Tools Used:
a) Matplotlib
image

b) Seaborn
image

c) Plotly
image

d) Excel
image

e) Datawrapper
image

@MajorHamol

MajorHamol commented Mar 27, 2024

Name: Amol HATWAR
Roll No.: 21f1000451

Visualisation using five tools

From the given dataset, three columns were chosen to be encoded. These were:

  1. Miles / Gallon (mpg) -- Y Axis
  2. Weight -- X Axis
  3. Horsepower -- Bubble Size

A bubble chart was chosen because it allows easy extension by encoding additional data: bubble size carries horsepower here, and colours could be used for region of origin, etc.

1. Flourish

flourish

2. Google Sheets

GoogleSheets

3. RAWGraphs

RawCharts

4. Seaborn

seaborn

5. DataWrapper

DataWrapper

While most tools were simple and easy to use, using seaborn offered the most flexibility. However, the tool requires some Python programming skills. At the same time, using Google Sheets was a bit unwieldy as it would not offer precise control of the bubble size. This caused overlapping bubbles and made the resulting visualisation messy.

@DEENA0503

DEENA0503 commented Mar 27, 2024

Name: DEENA GAUTAM
Roll No.: 21f1001012

A scatterplot was chosen to showcase the relationship between 3 variables.

Variables used:
X-axis : horsepower
Y-axis : acceleration
legends : cylinders

1. Power BI (tool)
powerbi_deena

2. Plotly (library)
plotly

3. Altair (library)
altair_Deena

4. Flourish (tool)
flourish_deena

5. Tableau Public (tool)
dvd_ga5_deena_tableu

@AlapeAniruddha

Variables used

  1. Weight
  2. MPG (miles per gallon)
  3. Number of Cylinders

Plots

I have done a scatter plot on the given dataset with the three features mentioned above.

  1. Matplotlib
    GA_5_mtpltlib

  2. Plotly
    GA_5_plotly

  3. ggplot2
    GA_5_gg_plot

  4. Flourish
    GA_5_flourish

  5. Google Sheets
    GA_5_g_sheets

The tools Matplotlib, Plotly and Flourish were very easy to use. In Google Sheets, it was not possible to color the dots by the number of cylinders, which prompted splitting the X-axis feature on the basis of the number of cylinders.
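
The comment does not show how the split was done. One common spreadsheet workaround is to reshape the data so that each cylinder count gets its own column, which the chart then treats as a separately colored series. The sketch below illustrates that reshaping under that assumption. The helper name `split_by_group`, the choice of shared and split columns, and the sample rows are all hypothetical.

```python
def split_by_group(rows, x_key, y_key, group_key):
    """Build a wide table: one shared x column, one y column per group value."""
    groups = sorted({row[group_key] for row in rows})
    header = [x_key] + [f"{y_key} ({group_key}={g})" for g in groups]
    table = [header]
    for row in rows:
        # Only the column of the row's own group gets a value.
        cells = [row[y_key] if row[group_key] == g else "" for g in groups]
        table.append([row[x_key]] + cells)
    return table


# Hypothetical rows, for illustration only.
rows = [
    {"weight": 2100, "mpg": 31.0, "cylinders": 4},
    {"weight": 3200, "mpg": 20.0, "cylinders": 6},
    {"weight": 4300, "mpg": 14.0, "cylinders": 8},
]
table = split_by_group(rows, "weight", "mpg", "cylinders")
```

The resulting table can be pasted into a sheet, where each non-empty column becomes one series in the scatter chart.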

@PriyaNathani

PriyaNathani commented Mar 27, 2024

Name: Priyanka Nathani
Roll No: 21f1005807

Dataset details:
The dataset consists of 398 rows of data. The data contains numerical values for miles per gallon, number of cylinders, horsepower, weight and acceleration of the car models with their model year and version (marked as origin).

My approach:
From the car names, I split the values to get names of car manufacturing companies. Thereafter I plotted ‘Cylinder-wise Cars produced by Companies’ with Car companies on one axis and number of car models with 3, 4, 5, 6 or 8 cylinders.
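
The approach above (split the car name to get the manufacturer, then count models per cylinder count) can be sketched as follows. It assumes the manufacturer is the first word of the car name, which is a simplification. The helper names `manufacturer` and `count_by_cylinders` and the sample pairs are illustrative, not an extract of the dataset.

```python
from collections import Counter


def manufacturer(car_name):
    """Take the first word of the car name as the manufacturer."""
    return car_name.split()[0].lower()


def count_by_cylinders(cars):
    """Count models per (manufacturer, cylinders) pair."""
    return Counter((manufacturer(name), cyl) for name, cyl in cars)


# Illustrative (car name, cylinders) pairs, for demonstration only.
cars = [
    ("ford torino", 8),
    ("ford pinto", 4),
    ("toyota corolla", 4),
    ("ford maverick", 6),
    ("ford fiesta", 4),
]
counts = count_by_cylinders(cars)
```

Each manufacturer then becomes one bar, with the counts per cylinder value stacked on top of each other.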

Choice of graph:
I have chosen a stacked bar chart for finding this relationship.

Tools used:

  1. Excel
  2. Tableau
  3. Power BI
  4. Datawrapper
  5. Matplotlib
  6. RawGraphs 2.0

1. Excel

Chart_Excel

image

2. Tableau

Chart_Tableau

image

3. Power BI

Chart_BI

image

4. Datawrapper

Chart_DW_1

image

5. Matplotlib

Chart_matplot

image

6. RawGraphs

image

image

Chart Explanation:

  1. It can be seen that 4-cylinder car models were the most frequently released from 1970 to 1982.
  2. Volvo, Ford, Toyota and Chevrolet released the largest number of 4-cylinder cars.
  3. Chevrolet and Ford released the most car models across the various categories.

image

Insights into tools:

  1. Excel
    a. Ease of data manipulation.
    b. A lot of flexibility with respect to the layout and orientation of the legend, axis data, etc.
    c. Easy to use.

  2. Tableau
    a. Easy-to-use tool. A lot of animation and different types of graphical representations are built in.
    b. Instead of a stacked chart, the built-in recommendation was multiple graphs.
    c. This type of graph gave excellent visualization of which manufacturer prefers to deal with cars of how many cylinders, which manufacturer has produced the most cars in any given category, etc.

  3. Power BI
    a. Also not very difficult to use, although some hands-on practice is required before the tool can be used to its fullest.
    b. I liked that the stacked chart here has normalized all the company data to 100%. Therefore, comparing which company releases more cars in which category is simple.

  4. Datawrapper
    a. The tool is very user-friendly. Beautiful visualizations can be created on the very first go.
    b. The chart attached here is easy to understand, as it is easy to play around with colours, making the separation between categories stark.
    c. Details like the source of the data add to the authenticity of the chart.

  5. Matplotlib
    a. Using this tool requires coding in Python.
    b. The tool has some limitations, e.g. regarding the placement of legends.
    c. If the data is very large and requires cleanup, it can be a very useful tool.

  6. RawGraphs
    a. The tool is very easy to use.
    b. Does not accept empty cells, while tools like Datawrapper accept empty cells as well.
    c. Limited possibilities, e.g., I could not change the direction of xticks.

P.S. I had uploaded the graphs well before the deadline. I had missed the sentence stating that only one of Excel or Matplotlib can be used, and I only learnt about it today. Therefore, today I uploaded the 6th graph using RawGraphs. I request you to kindly consider the submission of this graph as well. Thank you and regards, Priyanka Nathani

@Rajkishore2904

Name - Rajkishore Nandi
Roll No - 21f1006016

The five visualization tools used are :

  • GGPlot using R
  • Plotly
  • Power Bi
  • Flourish
  • Matplotlib

Variables Used :

  1. Weight
  2. MPG(Miles per gallon)
  3. Acceleration

Plot :

Plotted scatter plot with Weight on X-axis, Mpg(Miles per Gallon) on Y-axis and colour gradient for Acceleration to find out the correlation between the three variables.

GGPlot using R

Rplot02

Plotly

plotly2

Power Bi

powerbi

Flourish

Flourish

Matplotlib

Matplotlib chart

@maniesh1

Name: Manish Kumar
Roll No.: 21f1004259

About the data - The provided dataset auto-mpg.csv describes the various attributes of different car models. The dataset comprises information about multiple car models, with features including their fuel efficiency (mpg), engine specifications such as cylinder count, displacement, and horsepower, along with weight, acceleration, manufacturing year, origin, and car name. Each row represents a distinct car model, and the dataset provides a comprehensive overview of these vehicles' key characteristics.

The five tools/libraries that I explored and used to visualize some of the key observations and features of the dataset are listed below.

  • Ydata profiling
  • Tableau
  • Flourish
  • Chart-Studio plotly
  • Power BI

Here are few of the charts and visualization created using the above five tools/libraries:

  1. ydata-profiling (earlier called pandas-profiling) gives a very good initial overview of the data. Along with statistical information on the dataset, it provides some key charts to visualize the data.
    Heat Map
    image
    Word Cloud Image :- Word cloud image on the car name.
    download

  2. Correlation chart between the 'mpg' and 'weight' features, plotted using Flourish.

Screenshot 2024-03-27 222350
  3. Used Tableau to create the HorsePower vs Acceleration relationship chart.
Screenshot 2024-03-27 224604
  4. Used Power BI to generate a box plot.
Screenshot 2024-03-27 223842
  5. Created using Plotly. It shows that cars with lower MPG have higher displacement.
    image

@prakhar-20

Name: Prakhar Bansal
Roll No.: 21f1003810

Title- Relationship between Weight and MPG based on number of Cylinders

Tools Used-

  1. Microsoft Excel
  2. Flourish
  3. Datawrapper
  4. Plotly
  5. Tableau

Variables used-
Y-axis - Weight
X-axis - MPG (Miles Per Gallon)
Color - Number of Cylinders

Visualizations

1. Microsoft Excel

excel

2. Flourish

flourish

3. Datawrapper

datawrapper

4. Plotly

plotly

5. Tableau

tableau

@sarthak-iitm

Name: Sarthak Khandelwal

Roll No.: 21f1004405

Altair (Python):

This scatterplot visualizes the relationship between acceleration and mileage (mpg) based on the number of cylinders using Altair, a Python library for statistical visualization. Each point represents a car in the dataset, with the X-axis showing acceleration, the Y-axis showing mileage, and colors indicating the number of cylinders. The plot provides a clear overview of how the mileage varies with acceleration across different cylinder configurations.

import altair as alt
import pandas as pd

# Load the assignment dataset
df = pd.read_csv('auto-mpg.csv')

# Scatterplot of acceleration vs. mpg, colored by cylinder count (nominal)
alt.Chart(df).mark_point().encode(
    x='acceleration',
    y='mpg',
    color='cylinders:N'
).properties(
    title='Acceleration vs. Mileage by Number of Cylinders (Altair)'
)

IMG-20240327-WA0011

Flourish:

The Flourish scatterplot displays the correlation between acceleration and mileage (mpg) categorized by the number of cylinders in the car. Each data point represents a car, with acceleration on the X-axis, mileage on the Y-axis, and cylinder count indicated by color. The interactive nature of the plot allows users to hover over points for specific data values and explore how mileage relates to acceleration across different cylinder configurations.

IMG-20240327-WA0012

Power BI:

The Power BI scatterplot illustrates the relationship between acceleration and mileage (mpg) based on the number of cylinders in the car. Each point on the plot represents a car in the dataset, with acceleration plotted on the X-axis, mileage on the Y-axis, and cylinder count represented by color. Users can interact with the plot to filter data or drill down into specific details, making it a versatile tool for exploring the relationship between these variables.

IMG-20240327-WA0013

Seaborn (Python):

This Seaborn scatterplot visualizes the correlation between acceleration and mileage (mpg) categorized by the number of cylinders in the car. Each point represents a car, with acceleration on the X-axis, mileage on the Y-axis, and cylinder count indicated by color. The plot provides insights into how mileage changes with acceleration across different cylinder configurations, with Seaborn's built-in styling and aesthetics enhancing the presentation of the data.

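import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load the assignment dataset
df = pd.read_csv('auto-mpg.csv')

# Scatterplot of acceleration vs. mpg, colored by cylinder count
sns.scatterplot(data=df, x='acceleration', y='mpg', hue='cylinders')
plt.title('Acceleration vs. Mileage by Number of Cylinders (Seaborn)')
plt.show()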

IMG-20240327-WA0014

Tableau:

The Tableau scatterplot depicts the relationship between acceleration and mileage (mpg) based on the number of cylinders in the car. Each point on the plot represents a car in the dataset, with acceleration plotted on the X-axis, mileage on the Y-axis, and cylinder count differentiated by color. Tableau's intuitive interface allows users to explore the data dynamically, enabling interactive analysis and visualization of how mileage varies with acceleration across different cylinder configurations.

IMG-20240327-WA0015

@AaryaM2609

Name: Aarya Motiwala
Roll Number: 21f1003998

Scatter Plot: Relation between MPG and Displacement

Axes description:
x-axis: MPG
y-axis: Displacement

Tools:
Matplotlib
Seaborn
Excel
PowerBI
Bokeh

Excel- Excel is a widely accessible tool that offers intuitive interfaces for plotting data through its spreadsheet environment. Plotting MPG vs. Displacement in Excel allows users to quickly visualize trends and patterns, with easy-to-use formatting and styling options. However, Excel's capabilities are more limited for advanced statistical visualizations compared to Python libraries.
image

Power BI- When plotting MPG vs. Displacement, PowerBI enables users to create dynamic and interactive dashboards. Its strength lies in data manipulation, sharing capabilities, and integrating plots into comprehensive reports for decision-making.

image

Seaborn- Built on top of Matplotlib, Seaborn simplifies creating complex visualizations with more aesthetically pleasing defaults and a variety of plot types designed for statistical analysis. When plotting the same dataset, Seaborn automatically manages finer details like plot style and color palettes, making it easier to generate more informative and visually appealing plots.

image

Matplotlib- Matplotlib offers a highly customizable framework for creating a wide variety of plots in Python. When plotting MPG vs. Displacement, it provides a straightforward approach, focusing on clarity and simplicity. Users can directly manipulate plot aspects like size, labels, and color, but interactive capabilities are limited compared to tools like Bokeh.

image

Bokeh- When plotting MPG vs. Displacement with Bokeh, it provides users with interactive elements like hover tools, zooming, and panning, enhancing the exploratory analysis experience and making the data more accessible and interactive for end-users.
image

@21f1005359

21f1005359 commented Mar 27, 2024

Name : Trivikram Umanath

Rollno: 21f1005359

Scatter Plot : Relationship between mpg and Weight and Coloured by Model Year

Tools Used-

Matplotlib
Seaborn
Pandas
Flourish
Datawrapper

Variables used-
Y-axis - Weight
X-axis - MPG (Miles Per Gallon)
Color - Model Year

Dataset details:
The dataset comprises 398 data entries, each representing various attributes of car models. These attributes include numerical values for factors such as miles per gallon, number of cylinders, horsepower, weight, and acceleration. Additionally, the dataset includes information about the model year and origin of each car version.

My approach:
There are many interesting features and bivariate relationships in the table. The relationship between MPG (miles per gallon) and weight is a particularly interesting one to observe: as weight decreases, mpg should increase, since lighter cars should ideally be able to travel more miles per gallon. Colouring this trend by model year is also interesting as a hypothesis: over the years, cars become lighter, stronger and faster, due to advances in technology and changing tastes.

Choice of graph:
A scatter plot is the best graph for observing these trends. With the points plotted, we can clearly see a downward trend, coloured according to model year.

We observe that, as the years pass, the hypothesis is validated: cars become faster, lighter and stronger, and we can see it clearly here.

Here are the trends as per the Tools.

1)Matplotlib

Auto-Matplot

2)Seaborn

Auto-Seaborn

3)Pandas Plot

Auto-Pandas

4)Flourish

Auto-Flourish

5)DataWrapper

Auto-DataWarapper

@CoreManish

Name : Manish Kumar
Roll : 21F1006597
Dataset : link

Draw a scatter plot between horsepower and acceleration using these different tools

1 Google sheet

Horse power vs Acceleration

2 power BI

@Shabarish1403

Shabarish1403 commented Mar 27, 2024

Title: Scatter plot for vehicle Weight and MPG with the number of cylinders

P V Shabarish
21F1001346

Here we will see the plotting of the vehicle Weight and MPG (Miles per Gallon) with the number of cylinders in that vehicle by using scatter plot and observing the differences among different visualization tools for the same plot.

Data Used: auto-mpg.csv

Features Used:

  • X-axis - weight
  • Y-axis - mpg
  • Color Mapping - cylinders (3,4,5,6,8)

Tools Used:

  1. Matplotlib
  2. Seaborn
  3. Plotly
  4. Google Sheets
  5. Flourish

I have used dark backgrounds for all plots because this is posted on GitHub, whose entire background is dark. This makes the plots visually compelling and easier on the eyes. The colors of the data points are also chosen so that they are easy to notice.
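For anyone wanting to reproduce the dark look in Matplotlib, here is a small sketch using the built-in `dark_background` style together with the `spring` colormap. The data values are made up for illustration and this is not necessarily the exact code behind the plots in this comment.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Illustrative values only (not the real dataset).
weight = [3504, 2130, 2672, 4341, 1985]
mpg = [18.0, 26.0, 25.0, 15.0, 32.0]
cylinders = [8, 4, 4, 8, 4]

# The style context applies the dark theme only to figures created inside it.
with plt.style.context("dark_background"):
    fig, ax = plt.subplots()
    points = ax.scatter(weight, mpg, c=cylinders, cmap="spring", alpha=0.8)
    ax.set_xlabel("weight")
    ax.set_ylabel("mpg")
    # legend_elements() builds one legend entry per distinct colour value.
    legend = ax.legend(*points.legend_elements(), title="cylinders")
```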


1. Matplotlib

Matplotlib

In Matplotlib, the entire plot comes with very thick borders and labels. The data points are also a little blurry in nature. I have used the spring color palette for this plot.

2. Seaborn

Seaborn

In Seaborn, the quality of the plot is very good. The data points are an appropriate size, with a moderate opacity at their centers, which makes them easy to distinguish and notice. I tried increasing the dpi in Matplotlib, but the image quality looked the same, whereas in Seaborn it worked perfectly. I have used the spring color palette for Seaborn as well.

3. Plotly

Plotly

Plotly mainly offers interactive plots. Although the quality of the plot looks a little low, it provides interactive zooming, box select, and lasso select for the data points we are interested in.

All the tools mentioned above are visualization libraries. They all have built-in color palettes, but most of these palettes run from a light color at one end to a darker one at the other. Since I selected a dark background, light color palettes suit my visualizations well, so I chose the spring palette in Matplotlib and Seaborn. Unfortunately, there is no spring palette in Plotly. Because I want to show the differences among the tools, I wanted the same palette in every one. To solve this, I took the RGB values of the spring palette from Matplotlib and Seaborn and entered them manually in Plotly and in the other tools that follow.
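One possible way to pull those colour values out of Matplotlib is sketched below. The helper name `spring_hex_colors` and the `palette` dictionary are hypothetical names introduced for illustration. The sketch samples the `spring` colormap at evenly spaced points and converts each sample to a hex string that can be pasted into other tools.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import to_hex


def spring_hex_colors(n):
    """Sample n evenly spaced colours from the 'spring' colormap as hex strings."""
    cmap = plt.get_cmap("spring")
    return [to_hex(rgba) for rgba in cmap(np.linspace(0.0, 1.0, n))]


# One colour per cylinder count used in the plots above.
cylinder_values = [3, 4, 5, 6, 8]
palette = dict(zip(cylinder_values, spring_hex_colors(len(cylinder_values))))
print(palette)
```

The spring colormap runs from magenta to yellow, so the first and last entries are `#ff00ff` and `#ffff00`. The hex strings can be typed into a tool's colour picker or passed to a library that accepts explicit colours.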

4. Google Sheets

Cars weight vs mpg vs cylinders

Google Sheets is mainly useful for quickly generating simple charts. In Sheets, the scatter plot simply takes two variables and plots them in a single color. There is no direct option to color the points by a third feature. To achieve this, we have to split the mpg feature into multiple columns, one per cylinder count.
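The split can be done by hand in the sheet, but here is a hedged pandas sketch of the same reshaping in case it is easier to prepare the data before pasting it in. The sample rows and the output column names (`mpg_4_cyl` and so on) are made up for illustration.

```python
import pandas as pd

# Illustrative rows only (not the real dataset).
cars = pd.DataFrame({
    "weight": [3504, 2130, 2672, 4341, 2930],
    "mpg": [18.0, 26.0, 25.0, 15.0, 20.0],
    "cylinders": [8, 4, 4, 8, 6],
})

# Keep weight as the shared x column, then add one mpg column per cylinder count.
# Rows that do not belong to a given cylinder count are left empty (NaN), so a
# spreadsheet treats every column as a separate series with its own colour.
wide = cars[["weight"]].copy()
for n_cyl in sorted(cars["cylinders"].unique()):
    wide[f"mpg_{n_cyl}_cyl"] = cars["mpg"].where(cars["cylinders"] == n_cyl)

csv_text = wide.to_csv(index=False)  # text that can be pasted or imported into a sheet
print(csv_text)
```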

5. Flourish

DVD GA5@2x (1)

Flourish is very popular for visualizations and very easy to use: we paste the data into the desired visualization template and customize the chart as needed. However, it offers only a limited number of visualization types.


Conclusion: After plotting with these tools, I would recommend Seaborn among the visualization libraries, or Plotly if we want interactive visualizations. Among the visualization tools, Google Sheets is useful only for quick plots while analyzing or filtering datasets. We can produce rough visualizations in Google Sheets and then carry them over to bigger visualization tools like Flourish. For more unique visualizations that none of the tools above offers, we can turn to other tools such as Tableau or Power BI.

@Charan1152

Charan1152 commented Mar 28, 2024

Bar Plot of Average MPG per Cylinder Type of a Vehicle using 5+ Charting Libraries/ Languages

Details

Name: Kruthiventi M R S Sai Charan
Roll No: 21f1004450
Level: BSc. Level
Email: 21f1004450@ds.study.iitm.ac.in
Data: auto-mpg.csv

Approaches:

1. Matplotlib(Python):

  • Matplotlib is a widely used plotting library in Python, providing a MATLAB-like interface for creating static, interactive, and animated visualizations.
  • It offers a high level of customization but may require more code for complex plots.
    ga5mtp
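A minimal sketch of how the averages behind such a bar plot could be computed and drawn with pandas and Matplotlib. The rows are made-up sample values and this is not necessarily the code used for the chart above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative rows only (not the real dataset).
cars = pd.DataFrame({
    "cylinders": [4, 4, 6, 6, 8, 8],
    "mpg": [26.0, 30.0, 20.0, 18.0, 15.0, 13.0],
})

# Average mpg for each cylinder count.
avg_mpg = cars.groupby("cylinders")["mpg"].mean()

fig, ax = plt.subplots()
# Cast the cylinder counts to strings so they are drawn as evenly spaced categories.
bars = ax.bar(avg_mpg.index.astype(str), avg_mpg.values)
ax.set_xlabel("cylinders")
ax.set_ylabel("average mpg")
ax.set_title("Average MPG per cylinder type")
```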

2. Bokeh(Python):

  • Bokeh provides extensive support for creating interactive plots. You can easily add interactive tools such as pan, zoom, hover, and selection to your plots.
  • This allows users to explore the data interactively, zooming in on specific areas of interest, inspecting data points on hover, or selecting subsets of data for further analysis.
    image

3. Plotly(Python):

  • Plotly is a Python graphing library that makes interactive, publication-quality graphs online.
  • It provides an interface for creating interactive plots with features like hover tooltips, zooming, panning, and exporting as web-ready HTML.
    ga5pltly

4. Flourish(Web-Based Open Source):

  • Flourish is an online data visualization platform that allows you to create various types of visualizations, including bar plots, without requiring any coding.
    dvdflrsh

5. Altair(Python):

  • Altair is a declarative statistical visualization library for Python, based on the Vega and Vega-Lite visualization grammars.
  • It allows users to express visualizations in a concise and readable format and is designed for easy exploration and iteration.
    ga5al1
    ga5alt2

@gokulakrishnanbalaji

Name: Gokulakrishnan B
Roll No: 21f1006866

Idea behind visualization

My first thought was that heavier vehicles consume more fuel to move, so their mileage will be low. My intuition is that mpg is inversely related to weight: the higher the mpg, the lighter the vehicle. So I plot mpg against weight as a scatter plot to observe the pattern and validate my assumption.
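Besides looking at the scatter plot, the assumption can be checked numerically. Below is a small sketch using the Pearson correlation in pandas. The rows are made-up sample values, so the printed number says nothing about the real dataset.

```python
import pandas as pd

# Illustrative rows only (not the real dataset).
cars = pd.DataFrame({
    "weight": [1985, 2130, 2672, 2930, 3504, 4341],
    "mpg": [32.0, 26.0, 25.0, 20.0, 18.0, 15.0],
})

# Series.corr uses the Pearson coefficient by default; a value close to -1
# means mpg falls almost linearly as weight rises.
r = cars["weight"].corr(cars["mpg"])
print(f"Pearson r between weight and mpg: {r:.2f}")
```

A negative coefficient supports the "heavier means lower mpg" intuition, but it only measures linear association and does not prove strict inverse proportionality.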

G-sheets

gsheets

LibreOffice Calc

libreoffice-calc

Matplotlib

Matplotlib

Plotly

plotly

Altair

altair

@sanket21f1007096

sanket21f1007096 commented Mar 31, 2024

Name : Gaikwad Sanket Sanjay
Roll no: 21f1007096

Title- Relationship between Horsepower and MPG based on number of Cylinders

Features Used:
X-axis - mpg
Y-axis - horsepower
Legend - cylinders

1) Seaborn

image

2) Power BI

image

3) Flourish

image

4) Plotly

newplot (1)

5) Datawrapper

image

@krishnaditi

Name - Aditi Krishana
Roll No. - 21f1004270

Dataset Description

Dataset Used - Automobile Dataset

Given dataset had 9 columns:

  1. mpg (Miles per gallon): Represents the fuel efficiency of the car in terms of how many miles it can travel per gallon of fuel.
  2. cylinders: Indicates the number of cylinders in the car’s engine.
  3. displacement: Refers to the engine displacement, which is the total volume of all cylinders in the engine.
  4. horsepower: Represents the power output of the car’s engine.
  5. weight: Denotes the weight of the car.
  6. acceleration: Measures how quickly the car can accelerate from rest to a certain speed.
  7. model year: Indicates the year when the car model was released in the market.
  8. origin: Specifies the place of manufacturing (e.g., country or region).
  9. car name: Provides the name of the car.

Purpose:

The goal is to explore the relationship between horsepower and mileage (mpg) based on the number of cylinders in the car. We want to understand how these factors are interconnected and whether the number of cylinders affects the fuel efficiency of the vehicle.

Visualization type : Scatterplot

Preprocessing Steps:

  • Rows without data for horsepower were removed.
  • An additional color column was added for visualization purposes.
  • Color encoding was applied to the data points using GG-Plot.
  • A query in Tableau was written to color all the dots.
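The first preprocessing step can be sketched in pandas as follows. This assumes missing horsepower values appear either as empty cells or as a `?` marker, as in some copies of this dataset; the course file may differ. The inline CSV is a made-up sample.

```python
import io

import pandas as pd

# Made-up sample standing in for auto-mpg.csv; "?" and the empty cell
# both represent a missing horsepower value (an assumption about the file).
raw_csv = io.StringIO(
    "mpg,cylinders,horsepower\n"
    "18.0,8,130\n"
    "25.0,4,?\n"
    "26.0,4,46\n"
    "21.0,6,\n"
)
cars = pd.read_csv(raw_csv)

# Turn anything non-numeric into NaN, then drop those rows.
cars["horsepower"] = pd.to_numeric(cars["horsepower"], errors="coerce")
clean = cars.dropna(subset=["horsepower"]).reset_index(drop=True)
print(len(cars), "rows before,", len(clean), "rows after")
```

Converting the column to a numeric type also stops tools from treating horsepower as text.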

Data Summary:

  • 392 rows of data were used.
  • The dataset contains 3 to 4 columns (specifically, horsepower, mpg, cylinders, and potentially one more).
  • The scatter plot will visually depict the relationship between horsepower and mileage (mpg) based on the number of cylinders in the cars.

1. Scatter Plot: Horsepower vs. MPG using Microsoft Excel

mpg vs  horsepower (2)

This scatter plot shows the relationship between horsepower and mileage (mpg).
The points cluster around a downward trend, which suggests that cars with more horsepower tend to have lower mpg.

@mukeshmlb92

mukeshmlb92 commented Mar 31, 2024

Name: Mukesh K
Roll no: 21F1000478

The dataset contains automotive fuel economy (in miles per gallon or mpg) and associated vehicle characteristics such as cylinders, displacement, horsepower, weight, acceleration, model year, origin, and car name.

Features used:
X-axis: mpg (Miles per gallon)
Y-axis: engine displacement
Legend: no. of cylinders

The following tools were used to plot the data containing engine displacement vs mpg (miles per gallon)
• Matplotlib
• Seaborn
• MS Excel
• Tableau
• Matlab

1. Matplotlib
matplotlib

2. Seaborn
seaborn

3. MS Excel
excel

4. Tableau
tableau

5. Matlab
matlab

@Abhishekgu

Abhishekgu commented Mar 31, 2024

Name : Abhishek Gupta
Roll No: 21f1004820

For this task, I utilized the auto-mpg.csv dataset to generate a bubble chart across five distinct platforms. A bubble chart, akin to a scatter plot, employs circles to convey additional data. Specifically, I selected the variables of mpg (miles per gallon) and displacement (engine size) from the dataset. The size of the bubbles corresponded to the mpg values of the car engines.

I have used these variables because they have high correlation.
The 5 tools that I have used are the following:

Flourish
Datawrapper
Matplotlib & Seaborn
Excel
Datastudio

Data Wrapper:

image

DataStudio

image

Excel:

image

Flourish:

image

Matplotlib :

image

@TheMandalorian1

TheMandalorian1 commented Mar 31, 2024

Name : ADITYA DHAR DWIVEDI
Roll No: 21f1001069

For this assignment, I used the auto-mpg.csv dataset to make a bubble chart with five different tools. I picked the weight and displacement variables from the dataset, and the size of each bubble was determined by the number of cylinders in the car's engine.
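As an illustration of the bubble encoding, here is a minimal Matplotlib sketch where the marker area scales with the cylinder count. The rows are made-up sample values, and `SIZE_PER_CYLINDER` is a hypothetical scale factor chosen only so the bubbles are visible.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative rows only (not the real dataset).
cars = pd.DataFrame({
    "weight": [3504, 2130, 2672, 4341, 2930],
    "displacement": [307.0, 97.0, 140.0, 429.0, 250.0],
    "cylinders": [8, 4, 4, 8, 6],
})

SIZE_PER_CYLINDER = 30  # hypothetical scale factor; `s` is marker area in points^2
sizes = cars["cylinders"] * SIZE_PER_CYLINDER

fig, ax = plt.subplots()
# Semi-transparent bubbles keep overlapping points readable.
bubbles = ax.scatter(cars["weight"], cars["displacement"], s=sizes, alpha=0.5)
ax.set_xlabel("weight")
ax.set_ylabel("displacement")
```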

The 5 tools that I have used are the following:

  1. Matplotlib
  2. Flourish
  3. Excel
  4. DataWrapper
  5. DataStudio

  • Matplotlib

image

  • Flourish

image

  • Excel

image

  • DataWrapper

image

  • DataStudio

image

@harshOpensource

Name: HARSH BARDHAN

Roll No- 21F1004807

In this comparative evaluation, we explore the relationship between Cylinders, Displacement, Weight, and Acceleration using five different visualization tools: Excel, Flourish, Datawrapper, Seaborn, and Matplotlib.

1. Excel Scatter Plot:

Description: Excel offers a user-friendly interface for creating scatter plots. While it lacks advanced features, it's suitable for quick visualizations.

Screenshot 2024-03-31 222959

2. Flourish Scatter Plot:

Description: Flourish is an online tool for creating interactive visualizations. It allows for more interactivity compared to other tools but requires uploading data.

Screenshot 2024-03-31 222554

3. Datawrapper Scatter Plot:

Description: Datawrapper provides a simple yet effective way to create scatter plots. Its intuitive interface makes it easy to customize visualizations.

0sP4N-acceleration-vs-weight

4. Seaborn Scatter Plot:

Description: Seaborn, a Python library, offers a high-level interface for drawing attractive statistical graphics. It's particularly useful for exploring datasets with many variables.

Screenshot 2024-03-31 223638

5. Matplotlib Scatter Plot:

Description: Matplotlib is a versatile plotting library for Python. It allows for fine-grained control over plot elements, making it suitable for creating publication-quality visualizations.

download

These scatter plots provide insights into the relationship between the number of cylinders, acceleration, weight, and displacement of automobiles, showcasing the strengths and features of each visualization tool.

@kuine

kuine commented Mar 31, 2024

Name: Shine Priyan
Roll Number: 21f1003384

Plotting miles per gallon (mpg) by horsepower and color encoding by the number of cylinders wherever possible.

  1. Excel
    Excel

  2. Stata
    Stata

  3. Flourish
    Flourish

  4. Python
    Python

  5. R
    R

@Purva0808

Name : Purva Sharma
Roll No. : 21f1006847

1. About the Data:
The dataset used for this analysis contains information about various automobile models, including their miles per gallon (mpg), weight, cylinders, and other attributes. Each row represents a different car model.

2. Features Utilized:

For this analysis, the following features were used:

Weight: Represented on the x-axis.
Miles Per Gallon (MPG): Represented on the y-axis.
Cylinders: Represented by different colors to distinguish between different cylinder counts.

3. Tools Used:

5 visualization tools were employed to analyze and visualize the data:

Matplotlib:
matplotlib_1

Seaborn:
seaborn_2

Plotly:
plotly_3

Bokeh:
bokeh_4

Tableau:
tableau_5

@Tris-utkarsh

Tris-utkarsh commented Mar 31, 2024

Name: Utkarsh Gaurav
Roll No: 21F1001336

Task: Out of the 24 tools and libraries shared, we were supposed to use any 5 to represent relationships among the features in the data.

About the Data:
The data covers car models from the 1970s and 1980s, with features such as mpg, displacement, and horsepower recorded as new models were introduced.

Exploratory Analysis:
Using the pandas library, I performed exploratory analysis to find relationships among the features and plotted them with Matplotlib. Based on this analysis, I show the relationship between miles per gallon and engine displacement, grouped by cylinders.

exploratory_analysis

Tools Used:

  1. Seaborn: Python library for statistical visualization with attractive themes and color palettes.
    seaborn_plot1

  2. Lyra: Interactive visualization tool enabling custom visualizations through a drag-and-drop interface.
    Screenshot 2024-04-01 at 10 11 31 AM

  3. GGPlot: R package for creating complex, layered plots following a declarative approach.
    R_mp_vs_displacement

  4. Plotly: Open-source library offering interactive visualizations across multiple programming languages.
    Screenshot 2024-03-31 at 11 53 34 PM

  5. Tableau: Data visualization software facilitating interactive dashboards and reports creation from various data sources.
    Screenshot 2024-03-31 at 11 24 38 PM

These tools proved to be very powerful for various purposes, and it is up to the designer to decide what story they want to convey.

@nk-droid

nk-droid commented Mar 31, 2024

Name: Nidhish Kumar
Roll: 21F1003758

Data Used

  1. mpg
  2. weight
  3. cylinders

Tools Used

  1. MS EXCEL
    image

  2. GGPLOT
    image

  3. PLOTLY
    image

  4. TABLEAU
    image

  5. MATPLOTLIB
    image

@pkbhalla

Name: Pratham Bhalla
Roll Number: 21f1003052

Title: Miles per gallon vs Horsepower w.r.t. No. of cylinders

X-axis : Miles per gallon
Y-axis : Horsepower
Legends : No. of Cylinders

Creating the visualizations involved preprocessing the data, including removing rows with null values in horsepower. The chart chosen is a scatter plot, built with the following tools:

  1. Microsoft Excel
  2. Flourish
  3. Tableau Public
  4. Datawrapper
  5. Power BI

Here are the visualizations:

1. Microsoft Excel

Excel

2. Flourish

DataVizGA5
Published link

3. Tableau Public

Sheet 2

4. Datawrapper

v4rBP-miles-per-gallon-vs-horsepower-w-r-t-no-of-cylinders
Published link

5. Power BI

image

@DevanshGandhi-16

Name: Devansh Gandhi
Roll No: 21f1002115

Variables Used:

  • mpg (miles per gallon)
  • weight
  • no. of cylinders (cylinders)

Chart Type: Scatterplot

  • x-axis: weight of the car (weight)
  • y-axis: miles per gallon (mpg)
  • color scale is used by no. of cylinders (cylinders).

Purpose:

To identify the relationship between the weight of the car and its corresponding MPG while considering the no. of cylinders in the car.


Tools/Libraries used:

  • Flourish
  • Datawrapper
  • Tableau
  • Matplotlib
  • Seaborn

Visualizations:

1) Flourish

DVD GA5

2) Datawrapper

Uvgdm-mpg-vs-weight-of-the-car

3) Tableau

image

4) Matplotlib

image

5) Seaborn

image

@CoreManish

Name : Manish Kumar
Roll : 21F1006597
Dataset : link

Draw a scatter plot between horsepower and acceleration using these different tools

1 Google sheet

Horse power vs Acceleration

2 Flourish

Screenshot from 2024-04-01 17-44-05

3 Matplotlib

Screenshot from 2024-04-01 18-54-10

4 Tableau

image

5 Polestar

Screenshot from 2024-04-01 19-32-20
The chart is presented incorrectly because the tool treats horsepower as text.

6 Datawrapper

image

@lehani1

lehani1 commented Apr 1, 2024

Name: Lehani Raj Mohanta
Roll No: 21F1003574

Chart type: Scatter plot
Attributes: Horsepower and Acceleration

1. Excel

excel

2. Matplotlib

plt

3. Seaborn

seaborn

4. Altair

altair

5. Plotly express

newplot

@Jimmi-Kr Jimmi-Kr changed the title Graded Assignment-5 (May Term 2024):- Data Visualization Tools Graded Assignment-5 (Jan Term 2024):- Data Visualization Tools Jul 26, 2024