SenseClusters Publications
These publications describe the development and use of the SenseClusters
package. Papers prior to 2003 trace the origins of the methodology.
2019
2015
2013
-
Duluth: Word Sense Induction Applied to Web Page Clustering (Pedersen) -
Appears in the Proceedings of the 7th International Workshop on Semantic
Evaluation (SemEval 2013), in conjunction with the Second Joint Conference
on Lexical and Computational Semantics (*SEM-2013), June 13-15, 2013,
pp. 202-206, Atlanta, Georgia.
2010
-
The Effect of Different Context Representations on Word Sense
Discrimination in Biomedical Texts (Pedersen) - Appears in the
Proceedings of the 1st ACM International Health Informatics
Symposium, November 11-12, 2010, pp. 56-65, Arlington, VA.
[acceptance rate 17%]
-
Computational Approaches to Measuring the Similarity of Short Contexts :
A Review of Applications and Methods (Pedersen),
University of Minnesota Supercomputing Institute Research Report UMSI
2010/118, October 2010. (Also available from CMP-LG E-Print Archive as
0806.3787)
-
Duluth-WSI: SenseClusters Applied to the Sense Induction Task of SemEval-2
(Pedersen) - Appears in the Proceedings of the SemEval 2010
Workshop : the 5th International Workshop on Semantic Evaluations, July
15-16, 2010, pp. 363-366, Uppsala, Sweden.
2009
2008
2007
-
UMND2 : SenseClusters Applied to the Sense Induction Task of
Senseval-4 (Pedersen) - Appears in the Proceedings of SemEval-2007:
4th International Workshop on Semantic Evaluations, June 23-24, 2007,
pp. 394-397, Prague, Czech Republic.
-
Unsupervised Discrimination of Person Names in Web Contexts
(Pedersen and Kulkarni) - Appears in the Proceedings of the Eighth
International Conference on Intelligent Text Processing and
Computational Linguistics, pp. 299-310, February 18-24, 2007, Mexico
City. [acceptance rate 29%]
Download the
data used in this paper (Kulkarni name corpus).
-
Discovering Identities in Web Contexts with Unsupervised Clustering
(Pedersen and Kulkarni) - Appears in the Proceedings of the
IJCAI-2007 Workshop on Analytics for Noisy Unstructured Text Data,
pp. 23-30, January 8, 2007, Hyderabad, India.
Download the
data used in this paper (Kulkarni name corpus).
2006
-
Determining Smoker Status using Supervised and Unsupervised Learning with
Lexical Features (Pedersen) - Appears in the Working Notes of the
i2b2 Workshop on Challenges in Natural Language Processing for Clinical
Data, November 10-11, 2006, Washington, DC.
-
Unsupervised Context Discrimination and Automatic Cluster Stopping
(Kulkarni and Pedersen),
University of Minnesota Supercomputing Institute
Research Report UMSI 2006/90, August 2006. [Note: This is Anagha's MS
thesis, from July 2006.]
-
How many different "John Smiths", and who are they?
(Kulkarni and Pedersen) - Appears in the Proceedings
of the Twenty-First National Conference on Artificial Intelligence,
pp. 1885-1886, July 19, 2006, Boston, MA. (Student Poster)
-
Unsupervised Corpus Based Methods for WSD (Pedersen), in Agirre, E.
and Edmonds, P. (Editors), Word Sense
Disambiguation : Algorithms and Applications, June 2006, pp.
133-166, Springer.
-
Automatic Cluster Stopping with Criterion Functions and the Gap Statistic
(Pedersen and Kulkarni) - Appears in the Proceedings of the
Demonstration Session of the Human Language Technology Conference and the
Sixth Annual Meeting of the North American Chapter of
the Association for Computational Linguistics, pp. 276-279, June 6,
2006, New York City.
-
Selecting the "Right" Number of Senses Based on Clustering Criterion Functions
(Pedersen and Kulkarni) - Appears in the Proceedings of the Posters
and Demo Program of the Eleventh Conference of the European Chapter of
the Association for Computational Linguistics, pp. 111-114, April 5-7,
2006, Trento, Italy. [acceptance rate 40%]
-
Improving Name Discrimination : A Language Salad Approach (Pedersen,
Kulkarni, Angheluta, Kozareva, and Solorio) - Appears in the Proceedings
of the EACL 2006 Workshop on Cross-Language Knowledge Induction,
pp. 25-32, April 3, 2006, Trento, Italy.
Download the Bulgarian, English, Spanish, and Romanian
data used in this paper!
-
An Unsupervised Language Independent Method of Name Discrimination Using
Second Order Co-occurrence Features (Pedersen, Kulkarni, Angheluta,
Kozareva, and Solorio) - Appears in the Proceedings of the Seventh
International Conference on Intelligent Text Processing and
Computational Linguistics,
pp. 208-222,
February 19-25, 2006, Mexico City. [acceptance rate 30%]
Download the Bulgarian, English, Spanish, and Romanian
data and
stoplists used in this paper.
2005
-
Name Discrimination and Email Clustering using Unsupervised Clustering
and Labeling of Similar Contexts (Kulkarni
and Pedersen) - Appears in the Proceedings of the Second Indian
International Conference on Artificial Intelligence,
pp. 703-722, December 20-22, 2005,
Pune, India. [acceptance rate 35%] Download the data
used in this paper.
-
Identifying Similar Words and Contexts in Natural Language with
SenseClusters (Pedersen and Kulkarni) - Appears in the Proceedings
of the Twentieth National Conference on Artificial Intelligence,
pp. 1694-1695,
July 12, 2005, Pittsburgh, PA. (Intelligent Systems Demonstration)
Download the data
used in this demo.
-
Unsupervised Discrimination and Labeling of Ambiguous Names
(Kulkarni) - Appears in the Proceedings of the Student Research Workshop
of the 43rd Annual Meeting of the Association for Computational
Linguistics, pp. 145-150, June 27, 2005, Ann Arbor, MI. [acceptance rate
28%] Download the data
used in this paper.
-
SenseClusters: Unsupervised Clustering and Labeling of Similar Contexts
(Kulkarni and Pedersen) - Appears in the Proceedings of the Demonstration
and Interactive Poster Session of the 43rd Annual Meeting of the
Association for Computational Linguistics, pp. 105-108, June 26, 2005,
Ann Arbor, MI. [acceptance rate 55%]
Download the
data
used in this paper.
-
Resolving Ambiguities in Biomedical Text with Unsupervised Clustering
Approaches
(Savova, Pedersen, Purandare and Kulkarni) - University of Minnesota
Supercomputing Institute Research Report UMSI 2005/80 and CB Number
2005/21, May 2005.
-
Name Discrimination by Clustering Similar Contexts (Pedersen, Purandare,
and Kulkarni) - Appears in the Proceedings of the Sixth International
Conference on Intelligent Text Processing and Computational
Linguistics, pp. 220-231, February 13-19, 2005, Mexico City. [acceptance
rate 37%]
Download the data
used in this paper.
2004
-
Improving Word Sense Discrimination with Gloss Augmented Feature Vectors
(Purandare and Pedersen) - Appears in the Proceedings of the Workshop on
Lexical Resources for the Web and Word Sense Disambiguation, pp. 123-130,
November 22, 2004, Puebla, Mexico.
-
Word Sense Discrimination by Clustering Similar Contexts
(Purandare and Pedersen), University of Minnesota Supercomputing
Institute Research Report UMSI 2004/146, September 2004. [Note: This is
Amruta's MS thesis, from August 2004.]
-
Discriminating Among Word Meanings by Identifying Similar Contexts
(Purandare and Pedersen) - Appears in the Proceedings of the Nineteenth
National Conference on Artificial Intelligence (AAAI-04), pp. 964-965,
July 25-29, 2004, San Jose, CA (Student Abstract)
-
SenseClusters - Finding Clusters that Represent Word Senses
(Purandare and Pedersen) - Appears in the Proceedings of the Nineteenth
National Conference on Artificial Intelligence (AAAI-04), pp. 1030-1031,
July 25-29, 2004, San Jose, CA (Intelligent Systems Demonstration)
-
Word Sense Discrimination by Clustering Contexts in Vector and Similarity
Spaces (Purandare and Pedersen) - Appears in the Proceedings of the
Conference on Computational Natural Language Learning (CoNLL),
pp. 41-48, May 6-7, 2004, Boston, MA. [acceptance rate 48%]
-
SenseClusters - Finding Clusters that Represent Word Senses
(Purandare and Pedersen) - Appears in the Proceedings
of the Fifth Annual Meeting of the North American Chapter of the
Association for Computational Linguistics (NAACL-04),
pp. 26-29, May 3-5, 2004, Boston, MA. (Demonstration System)
2003
1998
-
Knowledge Lean Word Sense Disambiguation
(Pedersen & Bruce) - Appears in the Proceedings of the Fifteenth
National Conference on Artificial Intelligence (AAAI-98), pp. 800-805,
July 28-30, 1998, Madison, WI [acceptance rate 30%]
-
Raw Corpus Word Sense Disambiguation
(Pedersen) - Appears in the Proceedings of the Fifteenth
National Conference on Artificial Intelligence (AAAI-98),
p. 1198, July 28-30, 1998, Madison, WI (Student Poster)
1997
-
Distinguishing Word Senses in Untagged Text
(Pedersen & Bruce) - Appears in the Proceedings of the Second
Conference on Empirical Methods in Natural Language Processing
(EMNLP-2),
pp. 197-207,
August 1-2, 1997, Providence, RI. [acceptance rate 35%]
(Also available from CMP-LG E-Print Archive as #9706008)
-
Knowledge Lean Word Sense Disambiguation
(Pedersen) - Appears in the Proceedings of the Fourteenth
National Conference on Artificial Intelligence (AAAI-97),
p. 814, July 27-31, 1997, Providence, RI (Doctoral Consortium)
By: Ted Pedersen - tpederse AT d umn edu