ES3 Project
Description
Large data sets derived from environmental models and global satellite imagery are managed and published by a relatively small number of Earth science data centers. Thousands of Earth system scientists use these data sets to develop algorithms for generating higher-level data products. Rather than submitting these new data products back to a centralized repository, a new approach to managing Earth science data involves scientists disseminating their products directly to their research communities [1]. The Earth System Science Server (ES3), formerly the Earth System Science Workbench (ESSW), is a nonintrusive data management infrastructure for these researchers who must also be data producers.
A directed acyclic graph (DAG) serves as a framework for defining the processing workflow of an environmental model. The DAG specifies a model's inputs, outputs, and processes as "science objects". Metadata templates, defined with XML Document Type Definitions (DTDs), specify the set of metadata elements for each science object. Each execution of the model, or processing activity, is called an "experiment". ES3 records metadata about science objects during each experiment, thereby providing a processing history, or data "lineage", of each science object. ES3 is nonintrusive in that it relies on scripting to track science objects. These scripts, or "wrappers", log metadata about science objects with minimal alteration to existing processing methods.
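The wrapper idea can be illustrated with a minimal Python sketch. It is not ES3's implementation: the names `ScienceObject`, `Experiment`, and `run_wrapped` are hypothetical, and metadata is kept in plain dictionaries rather than in XML documents validated against DTDs. It shows only how a wrapper might call an unmodified processing function while recording science objects and the DAG edges needed to answer a lineage query.

```python
from dataclasses import dataclass, field


@dataclass
class ScienceObject:
    """Hypothetical record for one input, process, or output."""
    name: str
    kind: str  # "input", "process", or "output"
    metadata: dict = field(default_factory=dict)


class Experiment:
    """One execution of a model: science objects plus the DAG edges between them."""

    def __init__(self, experiment_id):
        self.experiment_id = experiment_id
        self.objects = {}  # name -> ScienceObject
        self.parents = {}  # name -> set of names it was directly derived from

    def record(self, name, kind, **metadata):
        self.objects[name] = ScienceObject(name, kind, metadata)
        self.parents.setdefault(name, set())

    def link(self, source, target):
        self.parents[target].add(source)

    def lineage(self, name):
        """Return the names of all ancestors of `name` (its processing history)."""
        seen = set()
        stack = [name]
        while stack:
            for parent in self.parents[stack.pop()]:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


def run_wrapped(experiment, process_name, func, inputs, output_name):
    """Wrapper: call `func` unchanged while logging inputs, process, and output."""
    experiment.record(process_name, "process", function=func.__name__)
    for input_name, value in inputs.items():
        # An output of an earlier step keeps its original record when reused.
        if input_name not in experiment.objects:
            experiment.record(input_name, "input", type=type(value).__name__)
        experiment.link(input_name, process_name)
    result = func(*inputs.values())
    experiment.record(output_name, "output", type=type(result).__name__)
    experiment.link(process_name, output_name)
    return result


def scale(values):
    """Stand-in for an existing processing step; it knows nothing about logging."""
    return [v * 2 for v in values]


def total(values):
    """Second stand-in processing step."""
    return sum(values)


exp = Experiment("exp-001")
scaled = run_wrapped(exp, "scale", scale, {"raw": [1, 2, 3]}, "scaled")
result = run_wrapped(exp, "total", total, {"scaled": scaled}, "result")
print(sorted(exp.lineage("result")))
```

The processing functions `scale` and `total` are untouched by the logging code, which is the sense in which such a wrapper is nonintrusive.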
By using the components of ES3, researchers can:
· store metadata for their experiments with little effort,
· reveal the lineage of their data products,
· manage and control product storage, and
· efficiently use distributed computer resources.
These concepts are implemented in ES3's Lab Notebook, No Duplicate-Write Once Read Many (ND-WORM) services, and products and users services. The Lab Notebook, a Java client/server application, logs metadata and lineage for experiments and their constituent science objects to XML documents stored in a relational database. The ND-WORM provides a managed storage archive for the Lab Notebook by keeping unique file digests and namespace metadata in a relational database. A conceptual diagram of ES3 is below. More detailed descriptions of ES3 components can be found in "An Overview of the Earth System Science Workbench" and in other documents.
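The no-duplicate, write-once behavior of a store like ND-WORM can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the ES3 service: the class `NDWormStore`, its table layout, the choice of SHA-256 as the digest, and the use of SQLite are all hypothetical, and file contents are kept inside the database only to make the example self-contained.

```python
import hashlib
import sqlite3


class NDWormStore:
    """Hypothetical no-duplicate, write-once store backed by SQLite."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        # One row per unique content digest (the "no duplicate" part).
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS blobs ("
            "digest TEXT PRIMARY KEY, content BLOB NOT NULL)"
        )
        # Namespace metadata: each name maps to exactly one digest.
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS names ("
            "name TEXT PRIMARY KEY, "
            "digest TEXT NOT NULL REFERENCES blobs(digest))"
        )

    def put(self, name, content):
        """Store `content` (bytes) under `name`; a name can be written only once."""
        taken = self.db.execute(
            "SELECT 1 FROM names WHERE name = ?", (name,)
        ).fetchone()
        if taken:
            raise FileExistsError(f"{name!r} has already been written")
        digest = hashlib.sha256(content).hexdigest()  # assumed digest algorithm
        # Identical content is stored once, however many names refer to it.
        self.db.execute(
            "INSERT OR IGNORE INTO blobs VALUES (?, ?)", (digest, content)
        )
        self.db.execute("INSERT INTO names VALUES (?, ?)", (name, digest))
        self.db.commit()
        return digest

    def get(self, name):
        """Read back the content stored under `name`."""
        row = self.db.execute(
            "SELECT content FROM blobs JOIN names USING (digest) WHERE name = ?",
            (name,),
        ).fetchone()
        if row is None:
            raise KeyError(name)
        return row[0]

    def blob_count(self):
        """Number of unique contents held, regardless of how many names exist."""
        return self.db.execute("SELECT COUNT(*) FROM blobs").fetchone()[0]
```

Writing the same bytes under two names with `put` yields the same digest and a single stored copy, while a second `put` to an existing name raises `FileExistsError`.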
Acknowledgements
ES3 is part of the NASA REASoN project titled "Multi-resolution Snow Products for the Hydrologic Sciences", which is a component of the ESIP Federation.
References
[1] National Research Council, Global Environmental Change: Research Pathways for the Next Decade. Washington, DC: National Academy Press, 1999.


