Colloquia: Linguistics talk reminder
Sean Andrew Fulop
sfulop at uchicago.edu
Wed Mar 6 15:08:41 CST 2002
The Dept. of Linguistics and the Chicago Linguistic Society
present
Syntactic Constituents and Lexical Categories
(What your syntax textbook took for granted)
by SEAN ANDREW FULOP
University of Chicago
Thursday, March 7, 2002
Harper 103, 3:00-4:30 pm
Modern syntactic theory of many different stripes (e.g. X-bar
theoretic, HPSG) is burdened by poorly justified basic constructs. In
particular, the concepts "lexical category" and "syntactic constituent,"
which underlie so many theoretical frameworks, are intuitively (and thus
loosely) understood by linguists and have no rigorous foundation.
Bloomfield (1933) and the American structuralists who followed (Wells
1947; Harris 1951) did formulate a particular concept that in a sense
replaces both notions and that, while still slightly unclear, is amenable
to formalization and ultimate use as part of the foundation for
syntactic theory. To begin this talk, I will survey what has been said
in the past three decades of expository syntax literature about
syntactic constituents and lexical categories in order to show that
there is no formal foundation there.
I will then explain two closely related distributional concepts that can
serve as a rigorous foundation for something like the usual notions of
constituent and category, although part of the point will be to argue
that the usual notions are fatally flawed and cannot lead to adequate
grammars that are maximally linguistically satisfying (i.e.
explanatory).
First, I discuss the structuralist construct which Bloomfield called a
"form class," which is in essence any class of word sequences having at
least one common environment in grammatical sentences. What we know
today as NPs, for instance, may constitute a (very large) form class.
However, "John cooked" and "Mary ate" are also part of a single form
class, since they share the legal environment "____ beans."
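
As a rough illustration of the definition just given, the following Python
sketch collects, for a toy corpus, every environment (left context, right
context) together with the word sequences that occur in it. Two sequences
fall in a common form class if some environment contains both. The corpus,
the function names, and the decision to skip whole sentences are assumptions
made for this example only; this is not the discovery procedure presented in
the talk.

```python
from collections import defaultdict


def environments(corpus):
    """Map each environment (left context, right context) to the set of
    word sequences that fill it in some sentence of the corpus."""
    env_to_seqs = defaultdict(set)
    for sentence in corpus:
        words = tuple(sentence.split())
        n = len(words)
        for i in range(n):
            for j in range(i + 1, n + 1):
                if i == 0 and j == n:
                    # Assumption: ignore the whole sentence, whose
                    # environment is empty and shared by every sentence.
                    continue
                env = (words[:i], words[j:])
                env_to_seqs[env].add(words[i:j])
    return env_to_seqs


def share_form_class(seq_a, seq_b, corpus):
    """True if the two word sequences have at least one common environment."""
    a, b = tuple(seq_a.split()), tuple(seq_b.split())
    return any(a in seqs and b in seqs
               for seqs in environments(corpus).values())


# Hypothetical toy corpus of grammatical sentences.
corpus = ["John cooked beans", "Mary ate beans", "John ate rice"]
print(share_form_class("John cooked", "Mary ate", corpus))
print(share_form_class("beans", "rice", corpus))
```

In this toy corpus "John cooked" and "Mary ate" share the environment
"____ beans", while "beans" and "rice" never occur in the same environment.
Enumerating every subsequence of every sentence grows quickly with corpus
size, which gives some sense of why such analysis is hard to do by inspection.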
I will show how the most important form classes in any sentence can be
discovered using procedures that are amenable to automated corpus analysis
but impossible for someone to perform by inspection. The
discovery of these form classes provides the sentence with a
hierarchical structure that Wells called its immediate constituent
analysis. Automation of this analysis is what the structuralists would
have needed to make their ideas catch on; Chomsky's (1957)
simplification of the form class
construct allowed the analysis to proceed by inspection so that
linguists could do something with it, but such analysis doesn't quite
work because it is no longer based on a concept that functions properly
in natural languages.
Second, I discuss the prospect of automatically learning lexical
categories expressed in a categorial grammar framework. Such categories
can be derived from a corpus of sentences which have skeletal semantic
structures assigned, using a procedure (Fulop forthcoming) which relies
on a specified theory of the syntax-semantics interface that is
compositional but which allows semantic and syntactic constituents to
differ.
The categories, similar to form classes, are learned by a subtle
distributional analysis called optimal unification (Buszkowski and Penn
1990), and unlike traditional lexical categories they exactly determine
the syntactic behaviors of the words in the lexicon.
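
To illustrate what it means for categories to determine syntactic behavior,
here is a minimal Python sketch of a basic (AB-style) categorial grammar
recognizer that uses only forward and backward application. The tuple
encoding of categories, the toy lexicon, and the function names are
assumptions made for the example. It shows how a fixed lexicon licenses or
rejects word orders; it does not implement the learning procedure or the
optimal unification step mentioned above.

```python
# Categories: an atom is a string ("np", "s"); a complex category is a tuple.
#   ("/", X, Y)  stands for X/Y : seeks a Y to its right, yields X
#   ("\\", Y, X) stands for Y\X : seeks a Y to its left, yields X


def combine(left, right):
    """Return the categories obtained by applying one neighbor to the other."""
    results = set()
    if isinstance(left, tuple) and left[0] == "/" and left[2] == right:
        results.add(left[1])    # forward application: X/Y, Y => X
    if isinstance(right, tuple) and right[0] == "\\" and right[1] == left:
        results.add(right[2])   # backward application: Y, Y\X => X
    return results


def derivable(words, lexicon, goal="s"):
    """Chart-based check that the word string reduces to the goal category."""
    n = len(words)
    if n == 0:
        return False
    chart = {(i, i + 1): set(lexicon.get(w, ())) for i, w in enumerate(words)}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            k = i + span
            cell = set()
            for j in range(i + 1, k):
                for a in chart[(i, j)]:
                    for b in chart[(j, k)]:
                        cell |= combine(a, b)
            chart[(i, k)] = cell
    return goal in chart[(0, n)]


# Hypothetical lexicon: a transitive verb has category (np\s)/np.
TV = ("/", ("\\", "np", "s"), "np")
lexicon = {"John": {"np"}, "Mary": {"np"}, "beans": {"np"},
           "ate": {TV}, "cooked": {TV}}

print(derivable("John ate beans".split(), lexicon))
print(derivable("ate John beans".split(), lexicon))
```

With this lexicon the first string reduces to s and the second does not,
since the category assigned to "ate" fixes which neighbors it may combine with.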
--
Sean A. Fulop
Visiting Assistant Professor
Depts. of Linguistics (and CS)
The University of Chicago
1010 E. 59th Street
Chicago, IL 60637
http://people.cs.uchicago.edu/~sfulop/index.htm