Scripts and results for data analysis
WerLaj committed Jan 30, 2024
1 parent 44110e8 commit f90597f
Showing 13 changed files with 560 additions and 1 deletion.
5 changes: 4 additions & 1 deletion requirements.txt
@@ -4,4 +4,7 @@ pytest
mypy
docformatter
pre-commit
pydocstyle==6.1.1
pydocstyle==6.1.1
statsmodels>=0.12.2
pandas==1.3.1
matplotlib
70 changes: 70 additions & 0 deletions results/quantitative_analysis/README.md
@@ -0,0 +1,70 @@
# Results of Quantitative Analysis

The following sections provide the results reported in the paper, along with the scripts used to generate them.

## One-way ANOVA

![One-way ANOVA table](oneway-anova-table.png)

The results are generated using [this script](../../scripts/data_analysis/anova.py) and the following command:

`` python -m scripts.data_analysis.anova --type one-way ``
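
The script itself is not reproduced here. As a rough illustration of what a one-way ANOVA computes, the sketch below derives the F statistic in plain Python. The function name `one_way_anova_f` and the sample ratings are made up for this example; it is not the code in `anova.py`, which may well rely on `statsmodels` instead.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of rating lists."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)

    # Between-group variation: how far each group mean is from the grand mean.
    ss_between = sum(
        len(group) * (mean(group) - grand_mean) ** 2 for group in groups
    )
    # Within-group variation: spread of ratings around their own group mean.
    ss_within = sum((v - mean(group)) ** 2 for group in groups for v in group)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical ratings for three experimental conditions.
ratings_by_condition = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(ratings_by_condition)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the condition means differ by more than the within-condition noise would suggest; the p-value is then read from the F distribution with the two degrees of freedom.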

## Two-way ANOVA

![Two-way ANOVA table](twoway-anova-table.png)

The results are generated using [this script](../../scripts/data_analysis/anova.py) and the following command:

`` python -m scripts.data_analysis.anova --type two-way ``
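
To make the two-way case concrete, here is a minimal plain-Python sketch for a balanced design (same number of ratings in every cell). The function name, the factor levels (`"accurate"`, `"noisy"`, `"text"`, `"visual"`) and the ratings are assumptions for illustration only; the repository's `anova.py` is not shown and may be implemented differently.

```python
from statistics import mean


def two_way_anova_balanced(cells):
    """Return F statistics for factor A, factor B and their interaction.

    cells maps (level_a, level_b) -> list of ratings. Every combination of
    levels must be present and all cells must have the same size.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    assert len(cells) == len(levels_a) * len(levels_b), "fully crossed design required"
    assert all(len(values) == n for values in cells.values()), "balanced design required"

    grand = mean([v for values in cells.values() for v in values])

    def marginal_mean(index, level):
        # Mean over all ratings whose cell key has `level` at position `index`.
        return mean(
            [v for key, values in cells.items() if key[index] == level for v in values]
        )

    ss_a = len(levels_b) * n * sum((marginal_mean(0, a) - grand) ** 2 for a in levels_a)
    ss_b = len(levels_a) * n * sum((marginal_mean(1, b) - grand) ** 2 for b in levels_b)
    ss_cells = n * sum((mean(values) - grand) ** 2 for values in cells.values())
    ss_ab = ss_cells - ss_a - ss_b  # interaction is what the main effects leave over
    ss_within = sum((v - mean(values)) ** 2 for values in cells.values() for v in values)

    df_a, df_b = len(levels_a) - 1, len(levels_b) - 1
    ms_within = ss_within / (len(levels_a) * len(levels_b) * (n - 1))
    return {
        "A": (ss_a / df_a) / ms_within,
        "B": (ss_b / df_b) / ms_within,
        "A:B": (ss_ab / (df_a * df_b)) / ms_within,
    }


# Hypothetical ratings: factor A = explanation quality, factor B = presentation mode.
cells = {
    ("accurate", "text"): [4, 5],
    ("accurate", "visual"): [5, 6],
    ("noisy", "text"): [2, 3],
    ("noisy", "visual"): [3, 4],
}
f_values = two_way_anova_balanced(cells)
print(f_values)
```

In this toy data the two factors act additively, so the interaction term is zero; unbalanced data needs a model-based approach rather than these closed-form sums.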

## Mean scores

Mean response usefulness scores and mean explanation ratings, broken down by explanation quality and by presentation mode, can be generated using [this script](../../scripts/data_analysis/mean_scores.py).
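
The idea behind a per-condition mean can be sketched in a few lines of plain Python. The helper name `mean_score_by_condition`, the `explanation_quality` column and the sample rows are assumptions for illustration; `mean_scores.py` itself is not shown here.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_condition(rows, condition_key, score_key):
    """Group rows by a condition column and average the score column."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[condition_key]].append(row[score_key])
    return {condition: mean(scores) for condition, scores in grouped.items()}


# Hypothetical ratings; real data would come from the user study output.
rows = [
    {"explanation_quality": "accurate", "usefulness": 4},
    {"explanation_quality": "accurate", "usefulness": 3},
    {"explanation_quality": "noisy", "usefulness": 2},
    {"explanation_quality": "noisy", "usefulness": 3},
]
means = mean_score_by_condition(rows, "explanation_quality", "usefulness")
print(means)
```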

### Mean response usefulness scores for explanations with different levels of accuracy

![](mean_scores/means_usefulness_explanation_quality.png)

### Mean response usefulness scores for explanations with different presentation modes

![](mean_scores/means_usefulness_explanation_presentation.png)

### Mean explanation ratings for explanations with different levels of accuracy

![](mean_scores/means_explanation_ratings_explanation_quality.png)

### Mean explanation ratings for explanations with different presentation modes

![](mean_scores/means_explanation_ratings_explanation_presentation.png)

Mean scores for the other response dimensions, broken down by explanation quality and by presentation mode, can be generated using ....

## Data distribution

The distribution of user-judged response dimensions per query for both user studies can be generated using [this script](../../scripts/data_analysis/data_distribution.py) and the following command:

`` python -m scripts.data_analysis.data_distribution ``

![](data_distribution.png)

## Demographic information

Demographic information about the crowd workers participating in the user study is presented below:

| Demographic Information | Option | Number of workers |
| --- | --- | --- |
| age | 18-30 | 39 |
| age | 31-45 | 76 |
| age | 46-60 | 41 |
| age | 60+ | 4 |
| age | Prefer not to say | 0 |
| education | High School | 12 |
| education | Bachelor's Degree | 111 |
| education | Master's Degree | 34 |
| education | Ph.D. or higher | 3 |
| education | Prefer not to say | 0 |
| gender | Male | 95 |
| gender | Female | 60 |
| gender | Other | 5 |
| gender | Prefer not to say | 0 |
Empty file added scripts/__init__.py
Empty file.
Empty file.
100 changes: 100 additions & 0 deletions scripts/data_analysis/data_distribution.py
@@ -0,0 +1,100 @@
import collections
from typing import Any, Dict, List

import matplotlib.pyplot as plt
import pandas as pd


def process_data_for_distribution_plot(
    response_dimensions: List[str], data_df: pd.DataFrame, main_feature: str
) -> Dict[Any, Dict[str, List]]:
    """Processes the data for the distribution plot.

    Args:
        response_dimensions: The response dimensions used in a user study.
        data_df: Dataframe containing the results from a user study.
        main_feature: Column of data_df whose distinct values define the
            groups (for example, the query ID).

    Returns:
        A dictionary mapping each value of main_feature to a dictionary that
        maps each response dimension to the list of its ratings.
    """
    values_per_query = collections.defaultdict(dict)

    for value in set(data_df[main_feature]):
        # Filter once per value rather than once per response dimension.
        query_sub_df = data_df[data_df[main_feature] == value]
        for response_dimension in response_dimensions:
            values_per_query[value][response_dimension] = list(
                query_sub_df[response_dimension]
            )

    return values_per_query


if __name__ == "__main__":
    aggregated_data = pd.read_csv(
        "results/user_study_output/output_processed_aggregated.csv"
    )

    # These columns store answers as "option_<n>"; keep only the integer.
    for feature in [
        "source_usefulness",
        "warning_usefulness",
        "confidence_usefulness",
        "conversational_frequency",
        "voice_frequency",
    ]:
        aggregated_data[feature] = [
            int(d.replace("option_", ""))
            for d in list(aggregated_data[feature])
        ]

    response_dimensions = [
        "familiarity",
        "interest",
        "search_prob",
        "relevance",
        "correctness",
        "completeness",
        "comprehensiveness",
        "conciseness",
        "serendipity",
        "coherence",
        "factuality",
        "fairness",
        "readability",
        "satisfaction",
        "usefulness",
    ]

    feature = "questions_ids"
    values_per_query = process_data_for_distribution_plot(
        response_dimensions, aggregated_data, feature
    )

    # 15 response dimensions laid out on a 3 x 5 grid of subplots.
    n_col = 5
    fig, axs = plt.subplots(3, n_col, figsize=(15, 9))

    for idx, response_dimension in enumerate(response_dimensions):
        # One box per query; the loop expects query IDs 1 through 10.
        boxplot_data = [
            values_per_query[value][response_dimension]
            for value in range(1, 11)
        ]

        ax = axs[idx // n_col][idx % n_col]
        ax.boxplot(boxplot_data)
        ax.set_xlabel("Query ID")
        ax.set_ylabel("Worker Self-Reported Rating")
        ax.set_title(
            response_dimension.replace(
                "search_prob", "search probability"
            ).title()
        )

    fig.tight_layout(pad=1.0)
    # plt.figure(dpi=2000) opens a new, empty figure and leaves `fig`
    # unchanged, so it is dropped here; pass dpi=... to savefig to change
    # the output resolution.
    fig.savefig("results/quantitative_analysis/data_distribution.png")
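
The nested mapping returned by `process_data_for_distribution_plot` is easier to picture with a small example. The sketch below mirrors the grouping on plain dictionaries instead of a DataFrame, so it runs without pandas; the helper name `group_ratings` and the sample rows are hypothetical.

```python
import collections


def group_ratings(rows, response_dimensions, main_feature):
    """Plain-Python stand-in: value of main_feature -> dimension -> ratings."""
    values_per_query = collections.defaultdict(dict)
    for value in {row[main_feature] for row in rows}:
        matching = [row for row in rows if row[main_feature] == value]
        for dimension in response_dimensions:
            values_per_query[value][dimension] = [
                row[dimension] for row in matching
            ]
    return values_per_query


# Hypothetical rows: two ratings for query 1, one rating for query 2.
rows = [
    {"questions_ids": 1, "relevance": 4, "usefulness": 3},
    {"questions_ids": 1, "relevance": 5, "usefulness": 4},
    {"questions_ids": 2, "relevance": 2, "usefulness": 2},
]
grouped = group_ratings(rows, ["relevance", "usefulness"], "questions_ids")
print(grouped[1]["relevance"])
```

Each inner list is what the plotting loop hands to `boxplot` as the data for one box.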