RESEARCH TOPICS: QUESTIONS AND DIRECTIONS
Machine Learning and Information Extraction
My research in this area is motivated by the following basic questions. What are the fundamental limits of learning machines? Does an understanding of these limits direct us towards the construction of tractable learning paradigms? Can one use these learning paradigms to explain human learning or to develop learning techniques for real-world applications? Do these learning paradigms allow us to usefully extract information from large amounts of partially organized data collected from the real world? I list below some of the specific research directions that I think are important and in which I have made progress:
With F. Girosi, I have obtained theoretical results on the fundamental limits of neural network learning. The results are quite general and illustrate the two fundamental sources of error in learning: the limited representational capacity of the hypothesis class, and the limited amount of data. Characterizing the tension between these two sources of error gives us a way to choose neural networks of the right complexity for any learning problem. More generally, these results allow us to trade off the complexity of one's model against the amount of data available, leading to learning paradigms such as the support vector machines that I have investigated with V. Vapnik.
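The two sources of error can be written schematically as follows. The notation is illustrative and is not taken from the results themselves: $f_0$ is the target function, $f_n$ is the best approximation to it within a hypothesis class of complexity $n$, and $\hat{f}_{n,l}$ is the function actually learned from $l$ examples. The triangle inequality then separates the total error into a term that depends only on the hypothesis class and a term that depends on the data:

```latex
\[
\underbrace{\| f_0 - \hat{f}_{n,l} \|}_{\text{total error}}
\;\le\;
\underbrace{\| f_0 - f_n \|}_{\text{approximation error (limited capacity)}}
\;+\;
\underbrace{\| f_n - \hat{f}_{n,l} \|}_{\text{estimation error (limited data)}}
\]
```

Increasing $n$ typically shrinks the first term while, for a fixed number of examples $l$, it tends to enlarge the second. Choosing a network of the right complexity amounts to balancing the two.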
Novel Algorithms and Paradigms
One way to reduce the informational complexity of learning is by active learning---a mechanism of learning by choosing information selectively. In a series of papers (some jointly with K. K. Sung), I have considered various formulations of the problem, theoretically derived the conditions under which such techniques are likely to work for function learning and pattern classification, and developed applications to object detection and image retrieval. This direction is of crucial importance in the intelligent retrieval of information from large knowledge repositories, where one has to derive intelligent ways of sampling the target space. It is also closely related to the general theme of incorporating prior knowledge usefully in machine learning tasks. Recently, in joint work with N. K. Karmarkar, I have developed a framework for unsupervised learning that is tractable in the sense that all algorithms within the framework provably converge to the globally optimal solution in polynomial time---a property that is rare, since most frameworks use gradient-descent-type learning schemes (backprop or EM) that converge only to local solutions. A provably correct algorithm for clustering has been derived within the framework, and extensions to various other kinds of learning problems are being considered. Applications in speech, vision, and data mining are being developed.
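The benefit of choosing examples selectively can be seen in a standard textbook case, sketched below. It is not one of the formulations from the papers mentioned above. The task is to learn an unknown threshold on the interval [0, 1], where a point is labeled 1 if it lies at or above the threshold. The function names `passive_learn` and `active_learn` are hypothetical. A passive learner receives uniformly random points, while an active learner always queries the midpoint of the region that is still uncertain, which halves that region with every label.

```python
import random


def passive_learn(threshold, n_labels, rng):
    """Estimate the threshold from n_labels uniformly random labeled points."""
    lo, hi = 0.0, 1.0  # the threshold is known to lie in (lo, hi]
    for _ in range(n_labels):
        x = rng.random()
        if x >= threshold:  # label 1: the threshold is at or below x
            hi = min(hi, x)
        else:               # label 0: the threshold is above x
            lo = max(lo, x)
    return (lo + hi) / 2


def active_learn(threshold, n_labels):
    """Estimate the threshold by always querying the midpoint of the uncertain interval."""
    lo, hi = 0.0, 1.0
    for _ in range(n_labels):
        x = (lo + hi) / 2   # the learner chooses where to ask for a label
        if x >= threshold:
            hi = x
        else:
            lo = x
    return (lo + hi) / 2


if __name__ == "__main__":
    true_threshold = 0.37  # illustrative value
    rng = random.Random(0)
    for n in (5, 10, 20):
        passive_err = abs(passive_learn(true_threshold, n, rng) - true_threshold)
        active_err = abs(active_learn(true_threshold, n) - true_threshold)
        print(f"labels={n:2d}  passive error={passive_err:.6f}  active error={active_err:.6f}")
```

After n queries the active learner's uncertain interval has width 2^-n, so its error decreases exponentially in the number of labels, whereas the passive learner's error decreases only roughly in proportion to 1/n.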
The Human Language System
There are two aspects of the human language system that fascinate me --- (1) that it is learnt, and (2) that it has two very different manifestations, in the physical world as speech and in the mental world as language. How the child might move from the highly variable, continuous, acoustic stream that it receives to the structured, discrete, symbolic representations of language poses some of the deepest unsolved scientific questions of our time. How we might get a computer to do the same presents some of the greatest technological challenges that we face. I list below some of the research directions I have chosen to concentrate on---each topic below has some connection to learning and/or recognition that I think is interesting.
Speech Recognition and Perception
Work with Victor Zue attempted to characterize speaker variability and incorporate articulatory constraints in speech recognition. More recently, with a variety of people at Bell Laboratories, I have been exploring alternative techniques for speech recognition. This is motivated by the fact that there seems to be good reason to believe that the lexicon is organized in terms of distinctive features, and that acoustic cues for these features are distributed in a non-uniform manner in the time-frequency plane. The research program has several sub-components, including frameworks for the robust and accurate detection of distinctive features and for integrating the asynchronous outputs of such feature detectors to form phonetic hypotheses. We are proceeding on these issues in parallel, using techniques from machine learning, linguistic representations, and signal processing to construct a perceptually motivated approach towards speech recognition that seems promising at this point.
Language acquisition is the classic learning problem that humans solve---they learn their native language. I have developed (jointly with R. C. Berwick) algorithms for the acquisition of syntax and analyzed the informational complexity of learning syntactic problems. However, syntax is only a small part of the language acquisition story---the child receives continuous speech inputs. From these it has to uncover the phonetic inventory, the phonological rules, the lexicon, and so on. I am currently examining ways in which this sort of information can be extracted, with a focus on acquiring phonetic and phonological knowledge. Progress would lead to computers that can automatically learn language directly from the speech signal---much as humans do.
A twist to the whole language acquisition story is provided by the fact that if children truly attained the language of the parental generation perfectly, then languages would be transmitted perfectly from generation to generation with no change. This, however, is not the case: we know that languages change with time. By considering a population of language learners and taking ensemble averages over the population, one can derive models of language change. Such models are the evolutionary consequences of language learning. This has developed into an extremely promising direction of research and suggests a computational framework within which various aspects of historical linguistics and language evolution can be studied---something that was not possible before. In addition to the obvious applications to historical linguistics, there are strong algorithmic connections to genetic algorithms, artificial life, populations of interacting agents, computational economic agents, and the like, which I would like to explore further to shed light on the general theme of the interaction of learning with evolution.
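The step from individual learning to population-level change can be illustrated with a deliberately simple toy model. Everything in it is an assumption made for illustration, and it is not the model developed in this research. There are two competing languages, L1 and L2. Each child hears `k` sentences drawn from the adult population; a sentence from an L1 speaker is an unambiguous cue for L1 with probability `a`; and the child acquires L1 only if it hears at least one such cue. Averaging over the population of learners turns this individual learning rule into a deterministic update for the fraction of L1 speakers from one generation to the next.

```python
def next_generation(alpha, a, k):
    """Fraction of L1 speakers in the next generation (toy cue-based learner).

    alpha: current fraction of adults speaking L1
    a:     probability that an L1 speaker's sentence is an unambiguous L1 cue (assumed)
    k:     number of sentences each child hears (assumed)
    """
    # A single sentence fails to be an L1 cue with probability 1 - a * alpha.
    # The child acquires L1 unless all k sentences fail to be cues.
    return 1.0 - (1.0 - a * alpha) ** k


def simulate(alpha0, a, k, generations):
    """Return the trajectory of the L1 fraction over the given number of generations."""
    trajectory = [alpha0]
    for _ in range(generations):
        trajectory.append(next_generation(trajectory[-1], a, k))
    return trajectory


if __name__ == "__main__":
    # Few cues per child (a * k < 1): L1 dies out even from a large majority.
    dying = simulate(alpha0=0.9, a=0.1, k=5, generations=200)
    # Many cues per child (a * k > 1): L1 spreads even from a small minority.
    spreading = simulate(alpha0=0.01, a=0.5, k=5, generations=100)
    print(f"a*k = 0.5: L1 fraction goes from {dying[0]:.2f} to {dying[-1]:.6f}")
    print(f"a*k = 2.5: L1 fraction goes from {spreading[0]:.2f} to {spreading[-1]:.6f}")
```

Even in this toy setting, a small change in the learning conditions (here the product of `a` and `k`) decides whether a language disappears or takes over, which is the kind of question a population-level model of learning makes it possible to ask.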
My research portfolio, spanning as it does the fields of learning, language, and vision, will result in many applications in the following two areas of technology:
Multimodal, Intelligent, Adaptive, Human-Computer Interaction
Ultimately, we want to be able to build multimodal computer systems that interact with humans and learn from such interactions. My research on learning, vision, and language is aimed at this eventual goal. In the process of understanding the fundamental principles that would underlie the construction of such human-computer interaction systems, various shorter-term applications can be conceived, e.g. audio-visual speech recognition; combining speech recognition with natural language processing, leading to spoken language systems; learning to recognize salient audio-visual cues that are correlated with end-user objectives; and combining handwriting recognition with language modeling.
Analysis and Retrieval of Large Knowledge Repositories
Increasingly, we are being forced to deal with huge amounts of data---large databases arising from linguistic corpora, image databases, internet browsing, neural data from fMRI and multiple-electrode studies, and so on. We will need to understand the structure of such data sets, store them, and retrieve them intelligently. There is therefore a natural nexus with techniques that lie on the boundary of computer science and statistics---precisely where modern computational learning resides---and I expect applications to emerge from my work in this area.