This course has been replaced by IN4080 – Natural Language Processing.

Syllabus/achievement requirements

Introductory Textbook (Also Suitable for Self-Study)

Jurafsky, Daniel, and James H. Martin: Speech and Language Processing, 2008. Prentice Hall. (Second, Revised Edition; In Preparation). Draft On-Line Version.

Research Articles (Obligatory Readings)

Grefenstette, Gregory, and Pasi Tapanainen: "What is a Word, What is a Sentence? Problems of Tokenization" in Proceedings of The 3rd International Conference on Computational Lexicography, 1994. pp. 79–87. On-Line Copy.

Karlsson, Fred: "Constraint Grammar as a Framework for Parsing Running Text" in Proceedings of the 13th International Conference on Computational Linguistics, 1990. pp. 168–173. On-Line Copy.

Ratnaparkhi, Adwait: "A Maximum Entropy Model for Part-Of-Speech Tagging" in Proceedings of the Conference on Empirical Methods in Natural Language Processing, 1996. pp. 133–142. On-Line Copy.

Samuelsson, Christer, and Atro Voutilainen: "Comparing a Linguistic and a Stochastic Tagger" in Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics, 1997. pp. 246–253. On-Line Copy.

Reynar, Jeffrey C., and Adwait Ratnaparkhi: "A Maximum Entropy Approach to Identifying Sentence Boundaries" in Proceedings of the 5th Conference on Applied Natural Language Processing, 1997. pp. 16–19. On-Line Copy.

Nivre, Joakim: "Two Strategies for Text Parsing" in A Man of Measure: Festschrift in Honour of Fred Karlsson on his 60th Birthday, 2006. pp. 440–448. On-Line Copy.

Charniak, Eugene: "Statistical Techniques for Natural Language Parsing" in AI Magazine, 1997. (On-Line Copy).

Collins, Michael: "Head-Driven Statistical Models for Natural Language Parsing" in Computational Linguistics, 2003. (On-Line Copy).

Klein, Dan, and Christopher D. Manning: "Accurate Unlexicalized Parsing" in Proceedings of the 41st Meeting of the Association for Computational Linguistics, 2003. (On-Line Copy).

Charniak, Eugene: "A Maximum-Entropy-Inspired Parser" in Proceedings of the 1st Annual Meeting of the North American Chapter of the Association for Computational Linguistics, 2000. (On-Line Copy).

Charniak, Eugene, and Mark Johnson: "Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking" in Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, 2005. (On-Line Copy).

Briscoe, Edward, and John A. Carroll: "Robust Accurate Statistical Annotation of General Text" in Proceedings of the 3rd International Conference on Language Resources and Evaluation, 2002. pp. 1499–1504. (On-Line Copy).

Carroll, John A., Edward Briscoe, and Antonio Sanfilippo: "Parser Evaluation. A Survey and a New Proposal" in Proceedings of the 1st International Conference on Language Resources and Evaluation, 1998. pp. 447–454. (On-Line Copy).

Briscoe, Edward, and John A. Carroll: "Evaluating the Accuracy of an Unlexicalized Statistical Parser on the PARC DepBank" in Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, 2006. (On-Line Copy).

Nivre, Joakim, Johan Hall, Jens Nilsson, A. Chanev, G. Eryigit, S. Kübler, S. Marinov, and E. Marsi: "MaltParser: A Language-Independent System for Data-Driven Dependency Parsing" in Natural Language Engineering, 2007. 13(2): pp. 95–135. On-Line Copy.

Gildea, Daniel: "Corpus Variation and Parser Performance" in Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing, 2001. (On-Line Copy).

Kaplan, Ron, Stefan Riezler, Tracy King, John Maxwell, Alexander Vasserman, and Richard Crouch: "Speed and Accuracy in Shallow and Deep Stochastic Parsing" in Proceedings of the Human Language Technology Conference and the 4th Annual Meeting of the North American Chapter of the Association for Computational Linguistics, 2004. (On-Line Copy).

Toutanova, Kristina, Christopher D. Manning, Dan Flickinger, and Stephan Oepen: "Stochastic HPSG Parse Disambiguation using the Redwoods Corpus" in Research on Language and Computation, 2005. 3(1): pp. 83–105. (On-Line Copy).

Palmer, Martha, Daniel Gildea, and Paul Kingsbury: "The Proposition Bank: An Annotated Corpus of Semantic Roles" in Computational Linguistics, 2005. 31(1): pp. 71–106. (On-Line Copy).

Miyao, Yusuke, and Jun'ichi Tsujii: "Deep Linguistic Analysis for the Accurate Identification of Predicate-Argument Relations" in Proceedings of the 20th International Conference on Computational Linguistics, 2004. pp. 1392–1397. (On-Line Copy).

Background Reading (Optional or Project-Related)

Manning, Christopher D., and Hinrich Schuetze: Foundations of Statistical Natural Language Processing, 1999. The MIT Press.

Hagen, Kristin, Janne Bondi Johannessen, and Anders Nøklestad: "A Constraint-Based Tagger for Norwegian" in Proceedings of the 17th Scandinavian Conference of Linguistics 1998, 1998. (On-Line Copy).

Brants, Thorsten: "TnT. A Statistical Part-of-Speech Tagger" in Proceedings of the 6th Conference on Applied Natural Language Processing, 2000. (On-Line Copy).

Berger, Adam, Stephen Della Pietra, and Vincent Della Pietra: "A Maximum Entropy Approach to Natural Language Processing" in Computational Linguistics, 1996. (On-Line Copy).

Charniak, Eugene: "Statistical Parsing with a Context-Free Grammar and Word Statistics" in Proceedings of the Fourteenth National Conference on Artificial Intelligence, 1997. (On-Line Copy).

Marcus, Mitch, Beatrice Santorini, and Mary Ann Marcinkiewicz: "Building a Large Annotated Corpus of English. The Penn Treebank" in Computational Linguistics, 1993. (On-Line Copy).

Petrov, Slav, Leon Barrett, Romain Thibaux, and Dan Klein: "Learning Accurate, Compact, and Interpretable Tree Annotation" in Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, 2006. (On-Line Copy).

Riezler, Stefan, Tracy H. King, Ronald M. Kaplan, Richard Crouch, John T. Maxwell, and Mark Johnson: "Parsing the Wall Street Journal using a Lexical-Functional Grammar and Discriminative Estimation Techniques" in Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002. (On-Line Copy).

Published 30 Aug. 2007 17:34 - Last modified 12 Nov. 2007 12:28