CS 8761 - Natural Language Processing - Fall 2004
Instructor: Dr. Ted Pedersen
Office: 309 Heller Hall
Office Hours: Mon & Wed 4:45pm-6:00pm, Thu 1:00pm-2:00pm
Email: tpederse@umn.edu
Course Web Page:
http://www.d.umn.edu/~tpederse/Courses/CS8761-FALL04/class.html
Course Mailing List:
http://groups.yahoo.com/group/duluth-cs8761-fall2004/
Please make sure you are signed up for this mailing list, and that you
check it regularly. There are online archives available, and you are also
free to post your own questions and comments.
Course Objectives:
Natural Language Processing seeks to analyze, generate, and understand
human language via computational techniques. This course focuses on
empirical approaches to lexical and syntactic analysis, semantic
interpretation, and discourse processing. Specific applications include
part-of-speech tagging, machine translation, and authorship attribution.
Required Text:
Foundations of Statistical Natural Language Processing by
Christopher Manning and Hinrich Schutze. MIT Press.
There is a supporting Web site with quite a bit of information.
Required reading assignments from Manning and Schutze will be given in the
lecture and also posted here.
There is a copy of the Manning and Schutze text on 2-hour reserve in the
UMD library.
Other Required Reading:
The following document (see html or pdf)
describes what plagiarism is, why it's a bad thing, and how you can
easily avoid it. You are required to read and understand this document
before submitting any assignment or project work.
Supporting Texts (optional):
There are two textbooks on 2-hour reserve in the UMD library that may prove
useful. The first is Speech and Language Processing by Daniel Jurafsky and
James Martin, Prentice-Hall. The second is Statistical Language Learning by
Eugene Charniak, MIT Press. The Charniak book focuses on empirical methods,
while the Jurafsky and Martin book is more general in nature and includes
some discussion of speech processing. Both are excellent complements to our
Manning and Schutze text.
Suggested Perl Texts:
We will do our programming assignments and projects in Perl on computers
running the Solaris Unix or Linux operating systems.
While we will discuss Perl from time to time in the lecture, there will
be a fair bit of self-study required. As such you are strongly advised
to have at least one of the following at your disposal:
- Learning Perl (3rd Edition) by Randal Schwartz and Tom Phoenix. O'Reilly
Publishers. You can get this book from amazon.com or most any bookstore.
This takes a tutorial approach and is especially good if you have limited
C or Unix experience.
- Programming Perl (3rd Edition) by Larry Wall, Tom Christiansen, and Jon
Orwant. O'Reilly Publishers. You can get this book from amazon.com or most
any bookstore. This book is more like a reference manual than Learning
Perl, although it is still very readable. This is a good choice if you
have an extensive C and Unix background.
- Mastering Regular Expressions: Powerful Techniques for Perl and Other
Tools (2nd Edition) by Jeffrey E. Friedl. O'Reilly Publishers. You can get
this book from amazon.com or most any bookstore. This is an in-depth
treatment of regular expressions in Perl and is the best available
reference on this topic.
You are strongly encouraged to view the books available via the online
Safari service from O'Reilly Publishers. You will find the complete text
of many Perl and Linux books available there, including all three of the
Perl books mentioned above!
Prerequisites:
This class is only open to currently enrolled CS graduate students.
Grading Basis:
- Quizzes: 10%
- Programming Assignments: 15%
- Project: 30%
- Midterm Exam: 20% (TBA)
- Final Exam: 25% (Saturday, December 18, 4:00pm - 5:55pm)
Grading Scale:
- 93 - 100 = A, 90 - 92 = A-
- 87 - 89 = B+, 83 - 86 = B, 80 - 82 = B-
- 77 - 79 = C+, 73 - 76 = C, 70 - 72 = C-
- 67 - 69 = D+, 63 - 66 = D, 60 - 62 = D-
- 0 - 59 = F
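As a rough illustration of how the Grading Basis weights and the Grading Scale combine, here is a minimal Python sketch. The function names and the sample scores are hypothetical, and it assumes that each component is expressed as a percentage (0-100) and that no rounding is applied before the letter grade is looked up. The syllabus does not say how fractional totals are handled, so treat that part as a guess rather than course policy.

```python
# Weights from the Grading Basis list (percent of the final grade).
WEIGHTS = {
    "quizzes": 10,
    "programming": 15,
    "project": 30,
    "midterm": 20,
    "final": 25,
}

# Lower bound of each letter grade, taken from the Grading Scale list.
SCALE = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
    (0, "F"),
]


def course_percentage(scores):
    """Weighted total, given each component's score as a percentage (0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percentage):
    """Letter grade for a total; ASSUMPTION: no rounding of fractional totals."""
    for lower_bound, letter in SCALE:
        if percentage >= lower_bound:
            return letter
    return "F"


# Hypothetical student, for illustration only.
example = {"quizzes": 90, "programming": 80, "project": 85,
           "midterm": 75, "final": 88}
total = course_percentage(example)
print(total, letter_grade(total))
```

For the hypothetical scores above, the weighted total is 83.5, which falls in the B range.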
Programming Assignments:
Programming assignments are to be completed in Perl and must be
submitted on time. Late work is not accepted and will result in a score
of zero for that assignment. You must use the web drop link on the class
web page to turn in assignments.
Each programming assignment is worth 10 points. There will be 3-5
programming assignments.
All programming assignments are individual. You are required to write your
own code. If you turn in code that is not your own (e.g., taken from a
book or online archive, written by a friend, etc.) the best you can hope
for is a zero on that assignment. I reserve the right to deal more
harshly with such cases if I deem it necessary.
Final Project:
You will be assigned to a team and given a challenging problem in natural
language processing to tackle. You must deliver a software solution and a
final report that includes a discussion of your team's solution, an
evaluation of its effectiveness, and a survey of related work. All teams
will work on the same problem and we will have a comparative evaluation
to see how well each team fares relative to the others.
You are expected to collaborate and work closely with your teammates. You
may not collaborate with anyone outside of your team for any reason. All
members of a team will receive the same grade. Your team must produce
original work. Any type of plagiarism, whether it is deliberate or
accidental, will be dealt with harshly.
Exams:
All exams are closed-note, closed-book. You must take exams at the
scheduled time and place. Exams will not be given early. Make-up exams
will only be offered in the event of documented personal
emergencies. The final exam must be taken at the date and time
determined by the official university schedule (Saturday, December 18).
Quizzes:
All quizzes are closed-note, closed-book. They may cover any topic
discussed in the lecture or included in the assigned readings. They will
not be announced ahead of time. If you are not in the lecture at the time
the quiz is given, you will receive a 0. Your lowest quiz score will be
dropped. We will have at least 8 but no more than 12 quizzes. Each quiz
will be worth 10 points.
Equal Access:
If you have any disability (either permanent or temporary) that might
affect your ability to perform in this class, please inform me at the
start of the semester or as soon as you learn of such a condition. I may
adapt methods, materials, or testing so that you can participate
equitably. To learn about the services that UMD provides to students with
disabilities, contact the Access Center, 138 Kplz, phone 8217, or visit
their web page.
By: Ted Pedersen - tpederse@umn.edu