Commit
Showing 5 changed files with 90 additions and 101 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,10 +1,67 @@ | ||
### CS 429: Information Retrieval | ||
**Spring 2015** | ||
**See the [Schedule](Schedule.md) for a detailed list of readings and due dates.** | ||
|
||
This repository contains class files for CS 429: Information Retrieval, taught at the [Illinois Institute of Technology](http://cs.iit.edu) by [Aron Culotta](http://cs.iit.edu/~culotta). | ||
### Overview | ||
|
||
The contents are organized as follows: | ||
- **Course:** CS 429: Information Retrieval | ||
- **Instructor:** [Dr. Aron Culotta](http://cs.iit.edu/~culotta) | ||
- **Meetings:** 3:15 - 4:30 pm T/R Room TBA | ||
- **E-mail:** culotta at cs.iit.edu | ||
- **Phone:** 312-567-5261 | ||
- **Office Hours:** T/R 10:00 a.m. - 11:00 a.m. | ||
- **Office:** Stuart Hall 229B | ||
- **TA:** TBA | ||
|
||
- [`admin`](admin): the syllabus and related resources | ||
- [`assignments`](assignments): instructions and code for homework assignments | ||
- [`lectures`](lectures): class notes | ||
**Description:** Overview of fundamental issues of information retrieval with theoretical foundations. The course covers information-retrieval techniques and theory, addressing both the effectiveness and the run-time performance of information-retrieval systems. The focus is on algorithms and heuristics used to find documents relevant to the user request and to find them fast. It also covers the architecture and components of a search engine, such as the parser, stemmer, index builder, and query processor. Students learn the material by building a prototype of such a search engine. Prerequisites: CS 331 or CS 401; requires strong programming knowledge. 3-0-3 (C) (T) | ||
|
||
**Textbook:** [*Introduction to Information Retrieval*](http://nlp.stanford.edu/IR-book/), Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Cambridge University Press. 2008. | ||
|
||
You can use the [electronic version](http://nlp.stanford.edu/IR-book/) of this book. | ||
|
||
### Grading | ||
|
||
- 250 points - [Assignments](../assignments) (5 @ 50 points each) | ||
- 100 points - Midterm | ||
- 100 points - Final | ||
- 32 points - 1 quiz <del>50 points - Quizzes / In-class assignments (5 @ 10 points each)</del> | ||
- **682 total points** <del>700 total points</del> | ||
|
||
| **Percent** | **Grade** | | ||
|-------------|-----------| | ||
| 100-90 | A | | ||
| 89-80 | B | | ||
| 79-70 | C | | ||
| 69-60 | D | | ||
| < 60 | E | | ||
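As a rough illustration of how the percent table maps points to a letter grade, here is a minimal Python sketch. The function name `letter_grade` is hypothetical, the default of 682 total points is taken from the grading list as stated, and treating each cutoff as an inclusive lower bound with no rounding is an assumption, since the table does not say how fractional percentages are handled.

```python
def letter_grade(points_earned, total_points=682):
    """Map earned points to a letter grade using the percent table.

    Assumption: cutoffs (90/80/70/60) are inclusive lower bounds and
    percentages are not rounded; the syllabus does not specify this.
    """
    if total_points <= 0:
        raise ValueError("total_points must be positive")
    percent = 100.0 * points_earned / total_points
    # Check cutoffs from highest to lowest; the first match wins.
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if percent >= cutoff:
            return letter
    return "E"
```

For example, `letter_grade(600)` corresponds to roughly 88% of 682 points and falls in the B range under these assumptions.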
|
||
**Academic Integrity** | ||
|
||
- Please read IIT's [Academic Honesty Policy](http://www.iit.edu/student_affairs/handbook/information_and_regulations/code_of_academic_honesty.shtml) | ||
- All work you turn in must be done by you alone. | ||
- All violations will be reported to `academichonesty@iit.edu`. | ||
- The first violation will result in a failing grade for that assignment/test. The second will result in a failing grade for the course. | ||
|
||
|
||
**Late Submission Policy** | ||
|
||
- Late assignments will **not** be accepted, unless: | ||
- There is an unavoidable medical, family, or other emergency, **and** | ||
- You notify me **prior** to the due date | ||
|
||
### Course Outcomes | ||
|
||
1. Explain the information retrieval storage methods (Inverted Index and Signature Files) | ||
2. Explain retrieval models, such as Boolean model, Vector Space model, Probabilistic model, Inference Networks, and Neural Networks. | ||
3. Explain retrieval utilities such as Stemming, Relevance Feedback, N-grams, Clustering, Thesauri, Parsing, and Token recognition. | ||
4. Design and implement a search engine prototype using the storage methods, retrieval models and utilities. | ||
5. Apply research ideas in experiments while building a search engine prototype. | ||
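To make the first two outcomes concrete, the following is a minimal Python sketch of an inverted index with Boolean AND retrieval. It is not the course's assignment code: the function names (`tokenize`, `build_index`, `intersect`, `boolean_and`) are hypothetical, tokenization is a simple lowercase alphanumeric split, and stemming and stop-word removal are deliberately omitted.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (no stemming)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> sorted list of document ids."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in tokenize(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def intersect(p1, p2):
    """Merge two sorted postings lists, keeping ids present in both."""
    result, i, j = [], 0, 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return result


def boolean_and(index, query):
    """Return ids of documents containing every query term."""
    # Process the shortest postings lists first to keep intermediate results small.
    postings = sorted((index.get(t, []) for t in tokenize(query)), key=len)
    if not postings:
        return []
    result = postings[0]
    for p in postings[1:]:
        result = intersect(result, p)
    return result
```

With `docs = ["new home sales", "home sales rise in july", "july new home prices"]` and `index = build_index(docs)`, the query `boolean_and(index, "home sales")` intersects the postings lists for the two terms and returns the ids of the first two documents.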
|
||
|
||
### Program Outcomes | ||
|
||
- a. An ability to apply knowledge of computing and mathematics appropriate to the discipline. | ||
- c. An ability to design, implement and evaluate a computer-based system, process, component, or program to meet desired needs. | ||
- d. An ability to function effectively on teams to accomplish a common goal. | ||
- f. An ability to communicate effectively with a range of audiences. | ||
- i. An ability to use current techniques, skills, and tools necessary for computing practices. | ||
- j. An ability to apply mathematical foundations, algorithmic principles, and computer science theory in the modeling and design of computer-based systems in a way that demonstrates comprehension of the tradeoffs involved in design choices. | ||
- k. An ability to apply design and development principles in the construction of software systems of varying complexity. |
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,41 +1,41 @@ | ||
| Date | Topic | Readings | Due | Lecture | | ||
| ----- |----------------------------------|--------------------------------------------------------|-----|---- | ||
||**Part I: Indexing**| | ||
| 1/13 | Boolean Search | [Ch1](http://nlp.stanford.edu/IR-book/pdf/01bool.pdf) | |[L01](../lectures/lec01) | ||
| 1/15 | Indexing I: stemming/stopping | [Ch2](http://nlp.stanford.edu/IR-book/pdf/02voc.pdf) | |[L02](../lectures/lec02) | ||
| 1/20 | Indexing II: phrases, skip lists, position | [Ch2](http://nlp.stanford.edu/IR-book/pdf/02voc.pdf) | | [L03](../lectures/lec03) | ||
| 1/22 | Dictionaries | [Ch3](http://nlp.stanford.edu/IR-book/pdf/03dict.pdf) | [A0](../assignments/assignment0) | [L04](../lectures/lec04) | ||
| 1/27 | Scalable indexing | [Ch4](http://nlp.stanford.edu/IR-book/pdf/04const.pdf) | | [L05](../lectures/lec05) | ||
| 1/29 | Index compression | [Ch5](http://nlp.stanford.edu/IR-book/pdf/05comp.pdf) | | [L06](../lectures/lec06) | ||
| 1/12 | Boolean Search | [Ch1](http://nlp.stanford.edu/IR-book/pdf/01bool.pdf) | |[L01](../lectures/lec01) | ||
| 1/14 | Indexing I: stemming/stopping | [Ch2](http://nlp.stanford.edu/IR-book/pdf/02voc.pdf) | |[L02](../lectures/lec02) | ||
| 1/19 | Indexing II: phrases, skip lists, position | [Ch2](http://nlp.stanford.edu/IR-book/pdf/02voc.pdf) | | [L03](../lectures/lec03) | ||
| 1/21 | Dictionaries | [Ch3](http://nlp.stanford.edu/IR-book/pdf/03dict.pdf) | [A0](../assignments/assignment0) | [L04](../lectures/lec04) | ||
| 1/26 | Scalable indexing | [Ch4](http://nlp.stanford.edu/IR-book/pdf/04const.pdf) | | [L05](../lectures/lec05) | ||
| 1/28 | Index compression | [Ch5](http://nlp.stanford.edu/IR-book/pdf/05comp.pdf) | | [L06](../lectures/lec06) | ||
|| **Part II: Ranking** | | ||
| 2/03 | Vector space model | [Ch6](http://nlp.stanford.edu/IR-book/pdf/06vect.pdf) | | [L07](../lectures/lec07) | ||
| 2/05 | Scoring for search |[Ch7](http://nlp.stanford.edu/IR-book/pdf/07system.pdf)| [A1](../assignments/assignment1) (now due 2/6) | [L08](../lectures/lec08) | [A1](../assignments/assignment1) | ||
| 2/10 | Evaluation | [Ch8](http://nlp.stanford.edu/IR-book/pdf/08eval.pdf) | | [L09](../lectures/lec09) | ||
| 2/12 | Query Expansion | [Ch9](http://nlp.stanford.edu/IR-book/pdf/09expand.pdf)| | [L10](../lectures/lec10) | ||
| 2/17 | Probabilistic IR | [Ch11](http://nlp.stanford.edu/IR-book/pdf/11prob.pdf) | | [L11](../lectures/lec11) | ||
| 2/19 | Probabilistic IR | [Ch11](http://nlp.stanford.edu/IR-book/pdf/11prob.pdf) | [A2](../assignments/assignment2) | [L12](../lectures/lec12) | ||
| 2/24 | Language Models | [Ch12](http://nlp.stanford.edu/IR-book/pdf/12lmodel.pdf) | | [L13](../lectures/lec13) | ||
| 2/26 | Language Models | [Ch12](http://nlp.stanford.edu/IR-book/pdf/12lmodel.pdf) | | [L14](../lectures/lec14) | ||
| 2/02 | Vector space model | [Ch6](http://nlp.stanford.edu/IR-book/pdf/06vect.pdf) | | [L07](../lectures/lec07) | ||
| 2/04 | Scoring for search |[Ch7](http://nlp.stanford.edu/IR-book/pdf/07system.pdf)| [A1](../assignments/assignment1) (now due 2/6) | [L08](../lectures/lec08) | [A1](../assignments/assignment1) | ||
| 2/09 | Evaluation **(Aron travels)** | [Ch8](http://nlp.stanford.edu/IR-book/pdf/08eval.pdf) | | [L09](../lectures/lec09) | ||
| 2/11 | Query Expansion **(Aron travels)** | [Ch9](http://nlp.stanford.edu/IR-book/pdf/09expand.pdf)| | [L10](../lectures/lec10) | ||
| 2/16 | Probabilistic IR **(Aron travels)** | [Ch11](http://nlp.stanford.edu/IR-book/pdf/11prob.pdf) | | [L11](../lectures/lec11) | ||
| 2/18 | Probabilistic IR | [Ch11](http://nlp.stanford.edu/IR-book/pdf/11prob.pdf) | [A2](../assignments/assignment2) | [L12](../lectures/lec12) | ||
| 2/23 | Language Models | [Ch12](http://nlp.stanford.edu/IR-book/pdf/12lmodel.pdf) | | [L13](../lectures/lec13) | ||
| 2/25 | Language Models | [Ch12](http://nlp.stanford.edu/IR-book/pdf/12lmodel.pdf) | | [L14](../lectures/lec14) | ||
|| **Part III: Classification**| | ||
| 3/03 | Naive Bayes | [Ch13](http://nlp.stanford.edu/IR-book/pdf/13bayes.pdf)| | [L15](../lectures/lec15) | ||
| 3/05 | Logistic Regression | [Ch14](http://nlp.stanford.edu/IR-book/pdf/14vcat.pdf) | [A3](../assignments/assignment3) | ||
| 3/10 | **Midterm** | | | ||
| 3/12 | KNN | [Ch14](http://nlp.stanford.edu/IR-book/pdf/14vcat.pdf) | | [L16](../lectures/lec16/bayes.pdf) | ||
| 3/05 | Logistic Regression | [Ch14](http://nlp.stanford.edu/IR-book/pdf/14vcat.pdf) | | [L17](../lectures/lec17) | ||
| 3/01 | Naive Bayes | [Ch13](http://nlp.stanford.edu/IR-book/pdf/13bayes.pdf)| | [L15](../lectures/lec15) | ||
| 3/03 | Logistic Regression | [Ch14](http://nlp.stanford.edu/IR-book/pdf/14vcat.pdf) | [A3](../assignments/assignment3) | ||
| 3/08 | **Midterm** | | | ||
| 3/10 | KNN | [Ch14](http://nlp.stanford.edu/IR-book/pdf/14vcat.pdf) | | [L16](../lectures/lec16/bayes.pdf) | ||
| 3/15 | **Spring Break** | | | ||
| 3/17 | **Spring Break** | | | ||
| 3/19 | **Spring Break** | | | ||
| 3/22 | Logistic Regression **(Aron travels)** | [Ch14](http://nlp.stanford.edu/IR-book/pdf/14vcat.pdf) | | [L17](../lectures/lec17) | ||
| 3/24 | Logistic Regression | [Ch15](http://nlp.stanford.edu/IR-book/pdf/15svm.pdf) | | ||
| 3/26 | Bias/Variance | Handouts | | ||
| 3/29 | Bias/Variance | Handouts | | ||
||**Part IV: Clustering**| | ||
| 3/31 | Learning to Rank | [Ch16](http://nlp.stanford.edu/IR-book/pdf/16flat.pdf) | | ||
| 4/02 | K-Means | [Ch16](http://nlp.stanford.edu/IR-book/pdf/16flat.pdf) | [A4](../assignments/assignment4) | | ||
| 4/05 | K-Means | [Ch16](http://nlp.stanford.edu/IR-book/pdf/16flat.pdf) | [A4](../assignments/assignment4) | | ||
| 4/07 | EM | [Ch18](http://nlp.stanford.edu/IR-book/pdf/18lsi.pdf) | | [L22](../lectures/lec22) | ||
| 4/09 | Word Clustering | Handouts | | [L23](../lectures/lec23) | ||
| 4/12 | Word Clustering | Handouts | | [L23](../lectures/lec23) | ||
||**Part V: Web Search**| | ||
| 4/14 | Web search overview | [Ch19](http://nlp.stanford.edu/IR-book/pdf/19web.pdf) | | ||
| 4/16 | PageRank | [Ch21](http://nlp.stanford.edu/IR-book/pdf/21link.pdf) | [A5](../assignments/assignment5) | ||
| 4/19 | PageRank | [Ch21](http://nlp.stanford.edu/IR-book/pdf/21link.pdf) | [A5](../assignments/assignment5) | ||
| 4/21 | PageRank | [Ch21](http://nlp.stanford.edu/IR-book/pdf/21link.pdf) | | [L26](../lectures/lec26) | ||
| 4/23 | Web Crawling | [Ch20](http://nlp.stanford.edu/IR-book/pdf/20crawl.pdf)| | ||
| 4/26 | Web Crawling | [Ch20](http://nlp.stanford.edu/IR-book/pdf/20crawl.pdf)| | ||
| 4/28 | Review | | [A6](../assignments/assignment6) | ||
| 4/30 | **Final Exam** | | | ||
| TBA | **Final Exam** | | | ||
|
This file was deleted.
This file was deleted.