CS 5761 - Introduction to Natural Language Processing - Spring 2004
Instructor:
Dr. Ted Pedersen
Office: 309 Heller Hall
Office Hours: Mon, Wed 3:00 - 4:30 pm
Email: tpederse@d.umn.edu
TA:
Bridget McInnes
Office Hours (314 Heller Hall): Wed 9 - 10 am, Thu 12 - 1 pm, Fri 8 - 10 am
Email: bthomson@d.umn.edu
Please consider using email for smaller or fairly precise questions. We
can usually respond quite quickly and at odd hours via email. Send any
email questions to both tpederse and bthomson.
Class Web Page:
http://www.d.umn.edu/~tpederse/Courses/CS5761-SPR04/class.html
Course Objectives:
Explore techniques for creating computer programs that analyze, generate,
and understand natural human language. Topics include syntactic analysis,
semantic interpretation, and discourse processing. Applications selected
from speech recognition, conversational agents, machine translation, and
language generation. Substantial programming project required.
Required Text:
Speech and Language Processing: An Introduction to Natural Language
Processing, Computational Linguistics, and Speech Recognition by Daniel
Jurafsky and James H. Martin. Prentice-Hall. There is a supporting Web
Site with quite a bit of information.
There is a copy of the text on 2-hour reserve in the library. This is not
meant to replace your own copy, but may prove useful if you find yourself
in the library without your text.
Reading assignments will be given in the lecture and posted
here. Our focus will be on written language,
and we will not consider the many very interesting issues involved in
processing spoken language. We will also augment readings from the text
with outside material that I will provide.
Suggested Perl Texts:
We will do our programming assignments in Perl. While we will discuss
Perl from time to time in the lab and lecture, there will be a fair bit of
self-study required. As such you are strongly advised to have at least
one of the following at your disposal:
Learning Perl by Randal Schwartz and Tom Phoenix. O'Reilly
Publishers. AKA the Llama book.
You can get this book from amazon.com, or any other bookstore. You can
also get an electronic copy of this book from the UMD library via Safari!
This book takes a tutorial approach and is especially good if you have
limited C or Unix experience.
Programming Perl by Larry Wall, Tom Christiansen and Jon Orwant.
O'Reilly Publishers. AKA The Camel Book.
You can get this book from amazon.com, or any other bookstore. You can
also get an electronic copy of this book from the UMD library via Safari!
This book is more like a reference manual, although it is still very
readable. This is a good choice if you have some C and Unix background.
Mastering Regular Expressions: Powerful Techniques for Perl and Other
Tools by Jeffrey E. Friedl. O'Reilly Publishers. AKA The Owl Book.
You can get this book from amazon.com, or any other bookstore. You can
also get an electronic copy of this book from the UMD library via Netlib!
This is an in-depth treatment of regular expressions in Perl and is
invaluable. It is more advanced than either Learning Perl or Programming
Perl.
Prerequisites:
You must have already taken and passed CS 2511 (Software Development)
and Math 3355 (Discrete Mathematics). If you have not already taken and
passed both of these classes you must drop this class.
Grading Basis:
- Quizzes : 10%
- Programming Assignments : 15%
- Project : 20%
- Exam 1 : 20% (Wed, Feb 25)
- Exam 2 : 20% (Wed, Mar 31)
- Final Exam : 20% (Thu, May 13, 10:00-12:00)
- Note that your lowest exam score (of the 3) will be weighted at 15%
  rather than 20%, so that the weights total 100%.
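As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes every component has already been converted to a percentage (0 to 100); the function name and argument names are hypothetical and are not part of any official grading tool for this class.

```python
def weighted_course_score(quizzes, programs, project, exams):
    """Combine component percentages (0-100) using the syllabus weights.

    `exams` holds the three exam percentages in any order. The lowest of
    the three counts for 15%, the other two count for 20% each.
    """
    if len(exams) != 3:
        raise ValueError("expected exactly three exam scores")
    ordered = sorted(exams)
    low, others = ordered[0], ordered[1:]
    # Integer weights (out of 100) keep the arithmetic easy to check by hand.
    total = (10 * quizzes + 15 * programs + 20 * project
             + 15 * low + 20 * sum(others))
    return total / 100


# Quizzes 80, programs 90, project 85, exams 70 / 88 / 92:
# (800 + 1350 + 1700 + 1050 + 3600) / 100
print(weighted_course_score(80, 90, 85, [70, 88, 92]))  # 85.0
```

The sketch does not model dropping the lowest quiz; it assumes the quiz percentage passed in already reflects that.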
Programming Assignments and Project:
Programming assignments and your project are to be completed in Perl.
There will be 4 to 6 programming assignments and one project. Each
programming assignment is worth 10 points.
All programming assignments and your project will be demonstrated during
designated lab sessions. You should also submit an electronic copy of
your source code to the
webdrop prior to the designated demo session. If you aren't familiar
with webdrop, here are the
instructions. There is no other way to submit your programming
assignments or project. Failure to submit AND demo on time will result in
a zero.
Please comment your code. I must be able to understand what your code
does simply by reading the comments. This understanding should extend
down to the details of your code. Do not simply describe the input and
output; also include comments that describe your particular algorithm and
coding techniques. I reserve the right to deduct some or all of the
points from an assignment if you don't comment your code to this degree.
Unless otherwise indicated, all assignments and the project are to be done
individually. You are required to write your own code. Unless otherwise
specified, you must only turn in code that you personally wrote. The only
possible exception to this is if I tell you to use a module that is
available in a book or online archive. However, I will clearly indicate
when this is permissible. If you submit code that is not your own, I
reserve the right to give you a zero on that assignment.
Quizzes:
There will be unannounced quizzes in the lecture that may concern any
topics discussed up to and including the previous lecture session. They
may also include any readings assigned up to and including the previous
week. There will be scheduled quizzes during the lab that will require
you to write and demo a short program during the lab period. There will
be approximately 8-10 quizzes, and your lowest score will be dropped.
Quizzes may only be made up in the case of documented personal
emergencies. Each quiz is worth 10 points.
Exams:
All exams are closed-note, closed-book.
You must take exams at the scheduled time and place. Exams will
not be given early. Make-up exams will only be offered in the
event of documented personal emergencies.
Grading Scale:
- 93 - 100 = A, 90 - 92 = A-
- 87 - 89 = B+, 83 - 86 = B, 80 - 82 = B-
- 77 - 79 = C+, 73 - 76 = C, 70 - 72 = C-
- 67 - 69 = D+, 63 - 66 = D, 60 - 62 = D-
- 0 - 59 = F
Grades will be curved only if the overall median grade at the end of the
semester is less than 75. In this case grades will be curved so that the
median class grade is 75.
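The scale and the curve rule can be sketched in Python as follows. Two things here are assumptions rather than stated policy: the syllabus does not say how the curve is applied, so the sketch adds the same number of points to every score, and it treats the lower bound of each range as the cutoff for fractional scores. The names `CUTOFFS`, `letter_grade`, and `curve` are hypothetical.

```python
from statistics import median

# Lower bound of each range in the grading scale, highest first.
CUTOFFS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def letter_grade(score):
    """Return the letter for a final score (assumes lower-bound cutoffs)."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


def curve(scores, target=75):
    """Shift all scores up so the median reaches `target`.

    Assumption: an additive shift. Scores are left alone when the
    median is already at or above the target.
    """
    shift = max(0, target - median(scores))
    return [s + shift for s in scores]


print(curve([60, 70, 80]))   # [65, 75, 85]
print(curve([70, 80, 90]))   # [70, 80, 90]
print(letter_grade(92))      # A-
```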
Equal Access:
If you have any disability (either permanent or temporary) that might
affect your ability to perform in this class, please inform me at the
start of the semester. I may adapt methods, materials, or testing so that
you can participate equitably. To learn about the services that UMD
provides to students with disabilities, contact the Access Center, 138
Kplz, phone 8217 or visit their
web page.
By:
Ted Pedersen
- tpederse@d.umn.edu