CS 8761 - Natural Language Processing - Fall 2004
Class Information:
Syllabus
Required Reading and Suggested Activities
This class took place in Fall 2004 and is now over.
Project Downloads and Demos!
The final projects have been submitted! You can download any of these
automated essay grading systems, and also try them out via the web.
Project:
Automated Essay Grading
project overview. Project Teams. More about your team names
here.
Programming Assignments:
- Assignment 1 Poor Man's LSA (Due Weds Sept 29, noon)
- Associated Press data for your
experiments. You may consider each line to be a context. (gz file)
- Assignment 2 Play the Shannon Game (Due Fri Oct 15, noon)
- Assignment 3 Retrieving Collocations with Mean-Variance (Due Mon Nov 1, noon)
- New York Times data for your
experiments. (.gz file)
- A stoplist that will work with
NSP, if you want to use it. Note that to make a fair comparison, your own
method should also eliminate stop words if you use this!
- Assignment 4 Collect Essay Data (Due Mon Nov 8, noon)
- The corpus you collected! This
consists of 568 five-paragraph essays with prompts!
- The River Plate Essay Corpus,
which consists of 250 essays (prompts and questions) collected and kindly
provided by the River Plate team.
Project Resources:
Finding Collocations
WordNet and Related Tools
Corpora
Machine Readable Dictionaries
General Language Processing Tools
Web Tools
Lots of Links to Resources
Sources of Essay Questions (These are reserved for the instructor, so don't use these for assignment 4)
Web Drop to Make Submissions:
click here
Perl Resources:
Sources of Text:
Google Resources:
Other Resources:
- Manning and Schutze
(Foundations of Statistical NLP resource pages - many useful links)
- Safari Tech Books Online
(O'Reilly Publishers, many books about Perl and Linux online)
- WordNet
(a free online dictionary/lexical database/semantic network)
- UMD Library
(don't overlook this obvious source of information!)
By:
Ted Pedersen
- tpederse@umn.edu