[Colloquia] Linguistics talk reminder

Sean Andrew Fulop sfulop at uchicago.edu
Wed Mar 6 15:08:41 CST 2002





            The Dept. of Linguistics and the Chicago Linguistic Society
            present 

            Syntactic Constituents and Lexical Categories 
            (What your syntax textbook took for granted) 



            by SEAN ANDREW FULOP 
            University of Chicago 
            Thursday, March 7, 2002 
            Harper 103, 3:00-4:30 pm 


Modern syntactic theory of many different stripes (e.g. X-bar-theoretic,
HPSG) is burdened by poorly justified basic constructs.  In particular,
the concepts "lexical category" and "syntactic constituent," which
underlie so many theoretical frameworks, are understood by linguists
only intuitively (and thus loosely) and have no rigorous foundation.
Bloomfield (1933) and the American structuralists who followed (Wells
1947; Harris 1951) did formulate a concept that in a sense replaces
both notions.  While still slightly unclear, it is amenable to
formalization and ultimately to use as part of the foundation for
syntactic theory.  To begin this talk, I will survey what the
expository syntax literature of the past three decades has said about
syntactic constituents and lexical categories, in order to show that
it provides no formal foundation.

I will then explain two closely related distributional concepts that can
serve as a rigorous foundation for something like the usual notions of
constituent and category, although part of the point will be to argue
that the usual notions are fatally flawed and cannot lead to adequate
grammars that are also maximally satisfying linguistically (i.e.
explanatory).
First, I discuss the structuralist construct that Bloomfield called a
"form class": in essence, any class of word sequences that share at
least one environment in grammatical sentences.  What we know
today as NPs, for instance, may constitute a (very large) form class.
However, "John cooked" and "Mary ate" are also part of a single form
class, since they share the legal environment "____ beans."
I will show how the most important form classes in any sentence can be
discovered using procedures that are amenable to automated corpus
analysis but impossible for someone to perform by inspection.  The
discovery of these form classes provides the sentence with a
hierarchical structure that Wells called its immediate constituent
analysis.  Automating this analysis is what the structuralists would
have needed to make their ideas catch on.  Chomsky's (1957)
simplification of the form class construct allowed the analysis to
proceed by inspection, so that linguists could do something with it,
but such analysis doesn't quite work because it is no longer based on a
concept that functions properly in natural languages.
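
To make the definition of a form class concrete, here is a minimal
sketch in Python.  It is only a brute-force illustration on a toy
corpus, not the discovery procedure presented in the talk: the function
name form_classes, the three-sentence corpus, and the representation of
an environment as a (left context, right context) pair are all
assumptions made for the example.

```python
from collections import defaultdict


def form_classes(corpus):
    """Group contiguous word sequences by the environments they occur in.

    An environment is modeled (as an assumption of this sketch) as the pair
    (words to the left, words to the right) within one sentence.  Any
    environment shared by two or more distinct sequences yields a form class.
    """
    by_env = defaultdict(set)
    for sentence in corpus:
        words = sentence.split()
        n = len(words)
        for i in range(n):
            for j in range(i + 1, n + 1):
                if j - i == n:
                    continue  # skip the whole sentence: its environment is empty
                env = (" ".join(words[:i]), " ".join(words[j:]))
                by_env[env].add(" ".join(words[i:j]))
    # Keep only environments that are actually shared by several sequences.
    return {env: seqs for env, seqs in by_env.items() if len(seqs) > 1}


# Hypothetical toy corpus, assumed to contain only grammatical sentences.
corpus = ["John cooked beans", "Mary ate beans", "Mary ate rice"]
classes = form_classes(corpus)

for (left, right), seqs in sorted(classes.items()):
    print(f"{left} ____ {right}".strip(), "->", sorted(seqs))
```

On this corpus, "John cooked" and "Mary ate" fall into one class because
both occur in the environment "____ beans".  Enumerating every
subsequence of every sentence grows quickly with corpus size, which
gives some feel for why such an analysis is hard to carry out by
inspection.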

Second, I discuss the prospect of automatically learning lexical
categories expressed in a categorial grammar framework.  Such categories
can be derived from a corpus of sentences that have been assigned
skeletal semantic structures, using a procedure (Fulop forthcoming)
that relies on a specified theory of the syntax-semantics interface.
That theory is compositional but allows semantic and syntactic
constituents to differ.
The categories, similar to form classes, are learned by a subtle
distributional analysis called optimal unification (Buszkowski and Penn
1990), and unlike traditional lexical categories they exactly determine
the syntactic behaviors of the words in the lexicon. 
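
As a rough illustration of how unification can turn structured
sentences into lexical categories, the following Python sketch handles
only the simplest case, in which every word ends up with exactly one
category.  It is not the optimal unification procedure named above, nor
the procedure of Fulop (forthcoming).  The tree encoding ("FA" when the
functor is on the left, "BA" when it is on the right), the tuple
encoding of categories, and all function names are assumptions made for
the example, and the functor-argument trees merely stand in for the
skeletal semantic structures.

```python
import itertools


def is_var(t):
    return isinstance(t, str) and t.startswith("x")


def assign(tree, cat, fresh, lexicon):
    """Label a functor-argument tree top-down, recording a category per leaf."""
    if isinstance(tree, str):  # a leaf is a word
        lexicon.setdefault(tree, []).append(cat)
        return
    op, left, right = tree
    arg = f"x{next(fresh)}"  # fresh variable for the argument daughter
    if op == "FA":  # functor on the left: category cat/arg
        assign(left, ("/", cat, arg), fresh, lexicon)
        assign(right, arg, fresh, lexicon)
    else:  # "BA", functor on the right: category arg\cat
        assign(left, arg, fresh, lexicon)
        assign(right, ("\\", arg, cat), fresh, lexicon)


def walk(t, subst):
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def occurs(v, t, subst):
    t = walk(t, subst)
    if t == v:
        return True
    return isinstance(t, tuple) and (occurs(v, t[1], subst) or occurs(v, t[2], subst))


def unify(a, b, subst):
    """Standard first-order unification over category terms."""
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return True
    if is_var(b) and not is_var(a):
        a, b = b, a
    if is_var(a):
        if occurs(a, b, subst):
            return False
        subst[a] = b
        return True
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0]:
        return unify(a[1], b[1], subst) and unify(a[2], b[2], subst)
    return False


def resolve(t, subst):
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0], resolve(t[1], subst), resolve(t[2], subst))
    return t


def learn(structures):
    """Assign variable categories, then unify all categories of each word."""
    fresh, lexicon, subst = itertools.count(), {}, {}
    for tree in structures:
        assign(tree, "s", fresh, lexicon)  # every sentence has category s
    for word, cats in lexicon.items():
        for other in cats[1:]:
            if not unify(cats[0], other, subst):
                raise ValueError(f"no single category fits {word!r}")
    return {word: resolve(cats[0], subst) for word, cats in lexicon.items()}


# Hypothetical structured corpus: John sleeps / Mary sleeps / John likes Mary.
structures = [
    ("BA", "John", "sleeps"),
    ("BA", "Mary", "sleeps"),
    ("BA", "John", ("FA", "likes", "Mary")),
]
lexicon = learn(structures)
for word, cat in lexicon.items():
    print(word, cat)
```

In this toy run "John" and "Mary" receive the same category variable,
"sleeps" becomes a function from that category to s, and "likes" a
function taking one such argument on each side, so each learned
category directly encodes how its word combines with its neighbors.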

-- 

Sean A. Fulop
Visiting Assistant Professor
Depts. of Linguistics (and CS)
The University of Chicago
1010 E. 59th Street
Chicago, IL 60637
http://people.cs.uchicago.edu/~sfulop/index.htm


