CS 5761 - Introduction to Natural Language Processing - Spring 2002
Instructor: Dr. Ted Pedersen
Office: 309 Heller Hall
Office Hours: Tue, Thu 3:30 - 5:00 pm
Email: tpederse@d.umn.edu
TA:
Siddharth Patwardhan ("Sid")
Office Hours (HH314) : Wed 3:00-5:00 pm, Fri 3:00-4:00 pm
Email: patw0006@d.umn.edu
Class Web Page:
http://www.d.umn.edu/~tpederse/Courses/CS5761/class.html
Course Objectives:
Explore techniques for creating computer programs that analyze, generate,
and understand natural human language. Topics include syntactic analysis,
semantic interpretation, and discourse processing. Applications selected
from speech recognition, conversational agents, machine translation, and
language generation. Substantial programming project required.
Required Text:
Speech and Language Processing: An Introduction to Natural Language
Processing, Computational Linguistics, and Speech Recognition by Daniel
Jurafsky and James H. Martin. Prentice-Hall. There is a supporting Web
site with quite a bit of information.
There is a copy of the text on 2-hour reserve in the library. This is not
meant to replace your own copy, but may prove useful if you find yourself
in the library without your text.
Reading assignments will be given in the lecture and posted
here. Our focus will be on written language,
and we will not consider the many very interesting issues involved in
processing spoken language. We may cover portions of Chapters 1, 2, 3, 6,
8, 9, 10, 12, 13, 14, 15, 16, 17, 19, and 21.
Suggested Perl Texts:
We will do our programming assignments in Perl. While we will discuss
Perl from time to time in the lecture, there will be a fair bit of
self-study required. As such, you are strongly advised to have at least
one of the following at your disposal:
Learning Perl by Randal Schwartz and Tom Phoenix. O'Reilly
Publishers. AKA The Llama Book.
You can get this book from
amazon.com or any other bookstore. You can also get an electronic
copy of this book from the UMD library! This book takes a tutorial
approach and is especially good if you have limited C or Unix experience.
Programming Perl by Larry Wall, Tom Christiansen and Jon Orwant.
O'Reilly Publishers. AKA The Camel Book.
You can get this book from amazon.com or any other bookstore. It is also
on 2-hour reserve at the UMD library. This book is more like a reference
manual, although it is still very readable. This is a good choice
if you have an extensive C and Unix background.
Mastering Regular Expressions: Powerful Techniques for Perl and Other
Tools by Jeffrey E. Friedl. O'Reilly Publishers. AKA The Owl Book.
You can get this book from
amazon.com or any other bookstore. You can also get an electronic
copy of this book from the UMD library! This book is a very in-depth
treatment of regular expressions in Perl and is really invaluable.
It is more advanced than either Learning Perl or Programming Perl.
Prerequisites:
You must have already taken and passed CS 2511 (Software Development)
and Math 3355 (Discrete Mathematics). If you have not already taken and
passed both of these classes, you must drop this class.
Grading Basis:
- Quizzes : 10%
- Exam 1 : 20% (Thu, Feb 21)
- Exam 2 : 20% (Date TBA)
- Programming Assignments + Project : 25%
- Final Exam : 25% (Wed, May 15, 4:00-6:00 pm)
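As a rough illustration of how the weights above combine, the following Python sketch computes an overall percentage from per-component percentages. The component names and the function `overall_score` are hypothetical and chosen only for this example; it also assumes each component has already been converted to a 0-100 percentage (for instance, after dropping the lowest quiz), which the syllabus does not spell out.

```python
# Weights copied from the Grading Basis list above.
# The dictionary keys are illustrative names, not official ones.
WEIGHTS = {
    "quizzes": 0.10,
    "exam1": 0.20,
    "exam2": 0.20,
    "programming": 0.25,  # programming assignments + project
    "final": 0.25,
}


def overall_score(percentages):
    """Combine per-component percentages (0-100) into one overall percentage.

    Assumes every key in WEIGHTS is present in `percentages`.
    """
    return sum(WEIGHTS[name] * percentages[name] for name in WEIGHTS)


if __name__ == "__main__":
    example = {
        "quizzes": 90,
        "exam1": 80,
        "exam2": 70,
        "programming": 100,
        "final": 60,
    }
    print(round(overall_score(example), 2))
```

Because the weights sum to 1, a student who earns the same percentage on every component receives that percentage overall.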
Programming Assignments and Project:
Programming assignments and your project are to be completed in Perl.
There will be 4 to 6 programming assignments and one project. The
assignments and project together account for 25% of your grade. The exact
grading breakdown between the assignments and project is yet to be
determined. Each programming assignment is worth 10 points.
All programming assignments and your project will be demonstrated during
designated lab sessions. You should also submit an electronic copy of
your source code to the TA prior to the designated demo session. (His
email address is patw0006@d.umn.edu.) There is no other way to submit
your programming assignments or project. Failure to submit AND demo on
time will result in a zero.
Any code you submit should be commented. I must be able to understand
what your code does simply by reading the comments. This understanding
should extend down to the details of your code. Do not simply
describe the input and output; also include comments that describe
your particular algorithm and coding techniques. Failure to comment
to this degree will result in a zero.
All assignments and the project are to be done individually. You are
required to write your own code. Unless otherwise specified, you must
turn in only code that you personally wrote. The only possible exception
to this is if I tell you to use a module that is available in a book
or online archive. However, I will clearly indicate when this is
permissible. Violations of this policy will result in severe grading
penalties and/or failure in the class.
Quizzes:
There will be unannounced quizzes in the lecture that may concern any
topics discussed up to and including the previous lecture session.
They may also include any readings assigned up to and including the
previous week. There will be scheduled quizzes during the lab that will
require you to write and demo a short program during the lab period. There
will be approximately 8-10 quizzes, and your lowest score will be dropped.
Quizzes may only be made up in the case of documented personal
emergencies. Each quiz is worth 10 points.
Exams:
All exams are closed-note, closed-book.
You must take exams at the scheduled time and place. Exams will
not be given early. Make-up exams will only be offered in the
event of documented personal emergencies.
Grading Scale:
- 93 - 100 = A, 90 - 92 = A-
- 87 - 89 = B+, 83 - 86 = B, 80 - 82 = B-
- 77 - 79 = C+, 73 - 76 = C, 70 - 72 = C-
- 67 - 69 = D+, 63 - 66 = D, 60 - 62 = D-
- 0 - 59 = F
Grades will be curved only if the overall median grade at the end of the
semester is less than 75. In this case grades will be curved so that the
median class grade is 75.
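The scale and curving rule above can be sketched in Python as follows. This is only an illustration under several assumptions that the syllabus does not state: the curve is modeled as a uniform additive shift, curved scores are capped at 100, and the lower end of each range is treated as a cutoff so that fractional scores (for example 92.5) fall into the lower band. The names `CUTOFFS`, `letter_grade`, and `curve` are hypothetical.

```python
from statistics import median

# Lower bound of each band in the Grading Scale, highest first.
CUTOFFS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def letter_grade(score):
    """Map a 0-100 score to a letter using the cutoffs above."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


def curve(scores, target=75):
    """Shift all scores up so the median equals `target`.

    Assumption: an additive shift capped at 100. Scores are returned
    unchanged when the median is already at or above the target.
    """
    m = median(scores)
    if m >= target:
        return list(scores)
    shift = target - m
    return [min(100, s + shift) for s in scores]


if __name__ == "__main__":
    raw = [60, 70, 80]
    for score in curve(raw):
        print(score, letter_grade(score))
```

Under these assumptions, a class whose median is below 75 has every score raised by the same amount, while a class whose median is 75 or higher is left alone.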
Lecture Notes:
After a few years of experimentation, I have concluded that posting
lecture notes online isn't especially helpful, and can in some cases
act as a deterrent to attending lecture and/or taking careful notes in
class. Since attendance and note taking are important life skills in
general, and will help you in this class in particular, I will not be
posting or distributing my lecture notes.
However, if you have some temporary or permanent disability that affects
your ability to take notes, then please let me know at the start of the
semester and I will make alternate arrangements with you.
Equal Access:
If you have any disability (either permanent or temporary) that might
affect your ability to perform in this class, please inform me at the
start of the semester. I may adapt methods, materials, or testing so that
you can participate equitably. To learn about the services that UMD
provides to students with disabilities, contact the Access Center, 138
Kplz, phone 8217, or visit their web page.
By: Ted Pedersen - tpederse@d.umn.edu
Last update: 01/20/2002