A repository with example code, exercises and solutions from the Natural Language Toolkit Book.
Content is continuously being added to this project. For any inquiries, email me at rehensle@calpoly.edu. For specific issues or suggestions regarding the Jupyter Notebooks, create a new issue in the Issues tab of this repository.
- Quick Feedback Form
  - This link is for leaving a quick comment about the repository; anyone is welcome to share feedback here!
- Detailed Feedback Survey
  - This survey is designed for people who have completed the setup process and have tried one of the Chapter Notes / Exercises.
  - There are multiple questions, but the survey is short and each question is optional.
This repository is intended to be a starting point for students who want to learn both Natural Language Processing concepts and the Python programming language using the NLTK book. NLTK is short for "Natural Language Toolkit", a set of linguistic tools used to analyze text for educational and research purposes. The NLTK book aims to teach common Natural Language Processing concepts and Python 3 simultaneously. The book also introduces other concepts important to general programming and data science, including:
- data collection
- data manipulation
- data structures
- machine learning
- readable and structured program writing
There are no prerequisites for this program, and it assumes no programming experience. If you already have some programming experience in Python, you can still use this resource to learn NLTK. Refer to the NLTK Book Preface for more information about which chapters to use.
The contents of this repository teach students how to set up an environment for running Python code to learn NLTK. Each chapter of the NLTK book has a set of Jupyter Notebooks where students can write notes as well as run code. The types of Jupyter Notebooks for each chapter in this repository are described in this table:
Type | File Name | Description |
---|---|---|
Notes and Examples | ##_notes.ipynb | notebook with runnable code from examples provided in the book as well as tips and resources to accompany the text |
Exercises | ##_exercises.ipynb | notebook to work on chapter exercises |
Exercise Solutions | ##_solutions.ipynb | selected solutions to review challenging material |
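The naming pattern in the table can be sketched in a few lines of Python. This is only an illustration: it assumes that `##` stands for a zero-padded, two-digit chapter number (for example `01`), which the table does not state explicitly, and the helper name `notebook_filenames` is hypothetical rather than part of this repository.

```python
# Assumption: "##" in the table is a zero-padded two-digit chapter number.
NOTEBOOK_SUFFIXES = {
    "Notes and Examples": "notes",
    "Exercises": "exercises",
    "Exercise Solutions": "solutions",
}


def notebook_filenames(chapter: int) -> dict:
    """Return the expected notebook file name for each notebook type."""
    prefix = f"{chapter:02d}"  # e.g. 1 -> "01" (assumed padding)
    return {
        kind: f"{prefix}_{suffix}.ipynb"
        for kind, suffix in NOTEBOOK_SUFFIXES.items()
    }


# Print the three notebook names expected for chapter 1.
for kind, name in notebook_filenames(1).items():
    print(f"{kind}: {name}")
```

If the repository uses a different prefix format, only the `prefix` line needs to change.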
The Initial Setup Notebook explains how to run and save these notebooks. Students can choose to use either an Anaconda installation of NLTK and Jupyter Notebooks on their computer, or use Google Colab in a web browser (with some limitations). Instructions to install and use Anaconda or Google Colab are provided in the Initial Setup Notebook.
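Whichever option you choose, a quick check in a notebook cell can confirm what the environment provides. The sketch below uses only the Python standard library and is not taken from the Initial Setup Notebook. The function name `describe_environment` is hypothetical, and testing for `google.colab` in `sys.modules` is a common heuristic for detecting Colab, not an official guarantee.

```python
import importlib.util
import sys


def describe_environment() -> dict:
    """Collect a few facts about the current Python environment."""
    return {
        # Version of the running interpreter, e.g. "3.11.4".
        "python_version": sys.version.split()[0],
        # Heuristic (assumption): Colab preloads the google.colab module.
        "in_colab": "google.colab" in sys.modules,
        # True if an importable package named "nltk" can be located.
        "nltk_installed": importlib.util.find_spec("nltk") is not None,
    }


# Print each fact on its own line.
for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

If `nltk_installed` is reported as `False`, follow the installation steps in the Initial Setup Notebook before opening the chapter notebooks.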
The Notes and Examples notebook can be used to run and edit example code from the book to gain a better conceptual understanding of how the code works. Tips and references to modern resources such as datasets, Python libraries, APIs and tutorials are given throughout these notes as well. Students are also encouraged to try the chapter exercises in the Exercises notebook, and to use the Exercise Solutions notebook to understand more challenging problems.
The links in the table below open in Google Colab. To learn how to use Google Colab (or Jupyter Notebooks), click on the Initial Setup Notebook.
Chapter | Contents |
---|---|
0. Preface | Preface Text, Initial Setup Notebook |
1. Language Processing and Python | Chapter 1 Text, Chapter 1 Notes and Examples, Chapter 1 Exercises, Chapter 1 Exercise Solutions |
2. Accessing Text Corpora and Lexical Resources | Chapter 2 Text, Chapter 2 Notes and Examples, Chapter 2 Exercises, Chapter 2 Exercise Solutions |
3. Processing Raw Text | Chapter 3 Text, Chapter 3 Notes and Examples, Chapter 3 Exercises, Chapter 3 Exercise Solutions |
4. Writing Structured Programs | Chapter 4 Text, Chapter 4 Notes and Examples, Chapter 4 Exercises, Chapter 4 Exercise Solutions |
5. Categorizing and Tagging Words | Chapter 5 Text |
6. Learning to Classify Text | Chapter 6 Text |
7. Extracting Information from Text | Chapter 7 Text |
8. Analyzing Sentence Structure | Chapter 8 Text |
9. Building Feature Based Grammars | Chapter 9 Text |
10. Analyzing the Meaning of Sentences | Chapter 10 Text |
11. Managing Linguistic Data | Chapter 11 Text |
12. Afterword: Facing the Language Challenge | Chapter 12 Text |