Showing 6 changed files with 199 additions and 1,694 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,199 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"# CS 429: Information Retrieval\n", | ||
"<br>\n", | ||
"\n", | ||
"## Lecture 28: Conclusions\n", | ||
"\n", | ||
"<br>\n", | ||
"\n", | ||
"### Dr. Aron Culotta\n", | ||
"### Illinois Institute of Technology" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## What have we learned?\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"\n", | ||
"![system](files/system.png)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## What's left?" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"**Adaptability**\n", | ||
"\n", | ||
"~25% of search queries are new each day\n", | ||
"\n", | ||
"<br><br><br><br>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"**Ambiguity**\n", | ||
"\n", | ||
"- apple the company vs. apple the fruit\n", | ||
"- \"man walking a dog\" vs. \"a dog walking a man\"\n", | ||
"\n", | ||
"<br><br><br><br>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"**Personalized Search**\n", | ||
"- If I search for \"government\" in the US vs. in Canada\n", | ||
"- If I search for IIT\n", | ||
"\n", | ||
"\n", | ||
"<br><br><br><br>\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"**Multimedia Search**\n", | ||
"- Images of cats playing with yarn\n", | ||
"- Videos of dogs playing with cats\n", | ||
"\n", | ||
"<br><br><br><br>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"**Recommendation Systems**\n", | ||
"- If you liked X you may like Y...\n", | ||
"\n", | ||
"<br><br><br><br>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"**Scaling**\n", | ||
"- ~[300 hours of new YouTube videos uploaded per minute](http://www.marketingpilgrim.com/2014/12/in-the-next-60-seconds-300-hours-of-video-will-be-uploaded-to-youtube.html)\n", | ||
"- ~[55 million Facebook status updates per day](https://blog.kissmetrics.com/facebook-statistics/)\n", | ||
"\n", | ||
"<br><br><br><br>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"![survey.png](survey.png)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"**Final**\n", | ||
"\n", | ||
"- Tuesday, 5/3, 10:30am-12:30pm, SB104\n", | ||
"- Pencil/pen/eraser only\n", | ||
"- 10 questions (5 T/F, 5 short answer)\n", | ||
"\n", | ||
"**Topics include:**\n", | ||
"\n", | ||
"- Classification\n", | ||
" - $k$-nearest neighbor\n", | ||
" - Naive Bayes\n", | ||
" - multinomial / binomial\n", | ||
" - smoothing\n", | ||
" - logistic regression\n", | ||
" - gradient descent recipe\n", | ||
" - update rules\n", | ||
" - learning rate\n", | ||
" - Bias / variance tradeoff\n", | ||
" - more parameters -> higher variance, lower bias\n", | ||
" - overfitting\n", | ||
" - Learning to rank\n", | ||
"- Clustering\n", | ||
" - $k$-means\n", | ||
" - problems: shape of clusters; influence of outliers\n", | ||
" - Expectation Maximization\n", | ||
" - word clustering vs. document clustering\n", | ||
" - initialization\n", | ||
" - error functions\n", | ||
" - picking number of clusters\n", | ||
"- Web search\n", | ||
" - PageRank\n", | ||
" - recursive formulation\n", | ||
" - matrix formulation\n", | ||
" - random surfer idea\n", | ||
" - random teleports for spider traps, dead ends\n", | ||
" - HITS\n", | ||
" - recursive formulation\n", | ||
" - Link farms\n", | ||
" - Crawling\n", | ||
" - Mercator crawler\n", | ||
" - front/back queues for prioritization and politeness\n", | ||
" - MinHash for near-duplicate detection\n", | ||
"\n", | ||
"\n", | ||
"**Types of questions:**\n", | ||
"\n", | ||
"- True / false\n", | ||
"- Computation:\n", | ||
" - e.g., compute PageRank scores, $k$-means error, logistic regression updates, ...\n", | ||
"- New algorithms\n", | ||
" - How can you improve algorithm X to have desired property Y?\n", | ||
" - Requires understanding the motivations behind the methods we've discussed\n", | ||
" - Why did we do things the way we did?\n" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.5.0" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 0 | ||
} |
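
The computation questions named in the final cell (PageRank scores, $k$-means error, logistic regression updates) can be practiced with a small sketch like the one below. It is not taken from the course materials: the function names, the teleport probability of 0.15, the choice to spread dead-end rank uniformly, the squared-Euclidean error, and the {0, 1} label convention are all assumptions made for illustration, and the lectures may use different conventions.

```python
import math


def pagerank(links, teleport=0.15, iters=100):
    """Power iteration with random teleports (illustrative sketch).

    links maps each page to the list of pages it links to; every link
    target is assumed to also appear as a key.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Rank sitting on dead ends (pages with no out-links) is spread
        # uniformly, so the scores keep summing to 1.
        dead = sum(rank[p] for p in pages if not links[p])
        new = {p: teleport / n + (1 - teleport) * dead / n for p in pages}
        for p in pages:
            for q in links[p]:
                # Each page splits its rank evenly among its out-links.
                new[q] += (1 - teleport) * rank[p] / len(links[p])
        rank = new
    return rank


def kmeans_error(points, centroids):
    """Sum of squared Euclidean distances to each point's nearest centroid."""
    return sum(
        min(sum((x - c) ** 2 for x, c in zip(p, cen)) for cen in centroids)
        for p in points
    )


def logistic_update(w, x, y, lr=0.1):
    """One stochastic gradient step on log loss, with label y in {0, 1}."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of y = 1
    return [wi + lr * (y - p) * xi for wi, xi in zip(w, x)]


if __name__ == "__main__":
    print(pagerank({"a": ["b"], "b": ["a", "c"], "c": []}))
    print(kmeans_error([(0, 0), (2, 0), (10, 0)], [(1, 0), (10, 0)]))
    print(logistic_update([0.0, 0.0], [1.0, 2.0], 1))
```

Working one iteration of each function by hand and comparing against the printed values is a reasonable way to check the recursive PageRank formulation, the clustering error function, and the gradient descent update rule.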