Experience with traditional forms of metadata has shown that it is expensive and time-consuming to produce, that people (e.g., authors) often resist creating it when there is no immediate or direct benefit, and that information-seekers often find it difficult to relate their requests to pre-specified ontologies or controlled vocabularies. Generating a common ontology for a domain also tends to be controversial. New standards for communicating metadata, such as XML, do nothing to address the underlying issue of where the metadata originates. Controlled vocabularies and relatively static ontologies are not solid foundations for information systems that must cover a wide range of subjects, support rapid integration of new information, be easy for the general population to use, and be maintainable at moderate expense. Large-scale use of metadata requires new answers to these fundamental questions. Recently, the Digital Government program of the National Science Foundation has funded a number of projects to address the challenge of integrating large, heterogeneous, widely distributed, and disparate government data collections. This paper describes two complementary approaches: ontology-based planning of access to large data collections using small, semi-automatically acquired domain models, and dynamic metadata creation from language models.
National Science Foundation, Digital Government Program: http://www.digitalgovernment.org/