Enron Email Corpus by Topic
We are creating a topic-annotated version of the Enron email corpus. We
will then compare the results of automated clustering of these email
messages with our manual annotations using SenseClusters.
Annotated Data
EnronData-v0.03 (approximately 3,000 annotated messages, released March 7, 2006)
Software
Coder2Sval-v0.03 (released November 18, 2005). These programs convert the
annotated data into a form suitable for use with SenseClusters.
Acknowledgments
The annotation of this data and the development of supporting software
were carried out by Apurva Padye. Please visit her Enron page for
additional background on Enron the company and the email corpus.