Commit

finalized the code for the interactive charts and added README for data
ZoeLeBlanc committed May 10, 2024
1 parent 67e9b56 commit 597c45f
Showing 21 changed files with 393,992 additions and 392,613 deletions.
740 changes: 569 additions & 171 deletions interactive_charts/InteractiveRecommendationsChart.ipynb

Large diffs are not rendered by default.

35 changes: 35 additions & 0 deletions interactive_charts/html_figures/figure14_chart.html

Large diffs are not rendered by default.

28 changes: 11 additions & 17 deletions speculative_reading/CollaborativeFilteringRecommendations.ipynb
@@ -816,21 +816,21 @@
},
{
"cell_type": "code",
-"execution_count": 11,
+"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
-"Deduping final scores: (1740, 20) -> (946, 21)\n",
-"Deduping final scores: (2601, 20) -> (1415, 21)\n",
-"Deduping final scores: (244, 20) -> (60, 21)\n",
-"Deduping final scores: (276, 20) -> (100, 21)\n",
-"Deduping final scores: (1988, 20) -> (1097, 21)\n",
-"Deduping final scores: (2990, 20) -> (1648, 21)\n",
-"Deduping final scores: (60, 20) -> (60, 21)\n",
-"Deduping final scores: (102, 20) -> (102, 21)\n"
+"Deduping final scores: (1740, 19) -> (946, 20)\n",
+"Deduping final scores: (2601, 19) -> (1415, 20)\n",
+"Deduping final scores: (244, 19) -> (60, 20)\n",
+"Deduping final scores: (276, 19) -> (100, 20)\n",
+"Deduping final scores: (1988, 19) -> (1097, 20)\n",
+"Deduping final scores: (2990, 19) -> (1648, 20)\n",
+"Deduping final scores: (60, 19) -> (60, 20)\n",
+"Deduping final scores: (102, 19) -> (102, 20)\n"
]
}
],
@@ -880,7 +880,8 @@
"\tfinal_scores = pd.merge(mode_scores[['member_id', 'period', 'item_uri', 'formatted_table_title', 'formatted_chart_title', 'mode_score', 'mode_zscore', 'subscription_start', 'subscription_end']], scores_df, on=['member_id', 'period', 'item_uri', 'formatted_table_title', 'formatted_chart_title', 'subscription_start', 'subscription_end'])\n",
"\n",
"\t# Drop duplicate rows from 'final_scores' to get a DataFrame of unique scores.\n",
-"\tfinal_scores_dedup = final_scores.drop_duplicates(final_scores.columns.difference(['metric']))\n",
+"\tfinal_scores = final_scores.drop(columns=['metric'])\n",
+"\tfinal_scores_dedup = final_scores.drop_duplicates()\n",
"\n",
"\t# Calculate the coefficient of variation (standard deviation divided by median) for each score in 'final_scores_dedup'.\n",
"\t# Add the coefficient of variation to 'final_scores_dedup' as a new column.\n",
@@ -900,13 +901,6 @@
"aggregated_formatted_predictions_top200_limit_circulation_with_periodicals = aggregate_predictions(formatted_predictions_top200_limit_circulation_with_periodicals, './data/collaborative_filtering_results/aggregated_top200_predictions_with_periodicals_circulation_limited.csv')\n",
"aggregated_formatted_predictions_top200_with_periodicals = aggregate_predictions(formatted_predictions_top200_with_periodicals, './data/collaborative_filtering_results/aggregated_top200_predictions_with_periodicals.csv')"
]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": []
}
],
"metadata": {
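
The deduplication change in the hunk above can be illustrated with a small sketch. The toy DataFrame and its values are invented for illustration; only the column name `metric` and the two `drop_duplicates` patterns come from the diff. Under these assumptions, both versions keep the same rows, but the newer one no longer carries the `metric` column, which is consistent with the logged shapes losing one column while the row counts stay the same.

```python
import pandas as pd

# Toy stand-in for `final_scores`; values are invented, not taken from the project data.
final_scores = pd.DataFrame({
    "member_id": ["a", "a", "b"],
    "item_uri": ["x", "x", "y"],
    "mode_score": [0.9, 0.9, 0.4],
    "metric": ["cosine", "euclidean", "cosine"],
})

# Old pattern: compare rows on every column except 'metric', but keep 'metric' in the result.
subset = final_scores.columns.difference(["metric"]).tolist()
old_dedup = final_scores.drop_duplicates(subset=subset)

# New pattern: drop 'metric' first, then deduplicate on all remaining columns.
new_dedup = final_scores.drop(columns=["metric"]).drop_duplicates()

print(old_dedup.shape, new_dedup.shape)
```

In the old pattern the surviving `metric` value is whichever row happened to come first, so dropping the column before deduplicating avoids keeping an arbitrary label.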
127 changes: 79 additions & 48 deletions speculative_reading/LenskitRecommendations.ipynb

Large diffs are not rendered by default.

23 changes: 22 additions & 1 deletion speculative_reading/data/README.md
@@ -2,4 +2,25 @@

This folder contains data files used in the analysis of the "Speculative Reading" section of the "Missing Data, Speculative Reading" article.

-## Contents
+## Contents
+
+### Lenskit Results
+
+The `lenskit_results` folder holds CSV files containing the results of the Lenskit analysis. File names combine the following experiment tags:
+
+- `with_periodicals` or `without_periodicals` indicates whether the model includes periodicals.
+- `comparison_model_runs` marks results from the comparison model runs used to validate the Lenskit model.
+- `sampled` indicates that only the top recommendations from the model were selected.
+- `full` indicates that all recommendations from the model were selected.
+- `aggregated` marks combined results from all runs of the model.
+- `popularity` marks the top popularity results.
+
+### Collaborative Filtering Results
+
+The `collaborative_filtering_results` folder holds CSV files containing the results of the collaborative filtering analysis. File names combine the following experiment tags:
+
+- `with_periodicals` or `without_periodicals` indicates whether the model includes periodicals.
+- `full` indicates that all recommendations from the model were selected.
+- `top200` indicates that the top 200 recommendations from the model were selected.
+- `aggregated` marks combined results from all runs of the model.
+- `circulation_limited` marks results restricted to the circulation window.
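
As a rough illustration of the naming scheme described in the README, the sketch below reads the experiment tags back out of a result file name. The helper `describe_result_file` and the `KNOWN_TAGS` list are hypothetical and not part of the repository; the sketch assumes tags appear as plain substrings of the file name, as they do in the paths shown in this commit.

```python
from pathlib import Path

# Tags drawn from the README's lists; this list and the helper below are illustrative assumptions.
KNOWN_TAGS = [
    "aggregated",
    "full",
    "top200",
    "sampled",
    "popularity",
    "comparison_model_runs",
    "circulation_limited",
]


def describe_result_file(path):
    """Return the experiment tags found in a result file name (hypothetical helper)."""
    stem = Path(path).stem
    # Check 'without_periodicals' first so the two periodicals tags are never confused.
    if "without_periodicals" in stem:
        periodicals = False
    elif "with_periodicals" in stem:
        periodicals = True
    else:
        periodicals = None  # the file name does not say
    tags = [tag for tag in KNOWN_TAGS if tag in stem]
    return {"periodicals": periodicals, "tags": tags}


info = describe_result_file(
    "./data/collaborative_filtering_results/"
    "aggregated_top200_predictions_with_periodicals_circulation_limited.csv"
)
print(info)
```

Substring matching is deliberately simple here; a stricter version could split the name on underscores, at the cost of handling multi-word tags such as `circulation_limited` separately.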

387,800 changes: 193,900 additions & 193,900 deletions speculative_reading/data/lenskit_results/full_predictions_model100_with_periodicals.csv

Large diffs are not rendered by default.
