Case Study - 2001 Global Solar Radiation Model
2001 was one of the first models incorporated into ESSW and was used as a test case for the usefulness of ESSW.
The first step in using ESSW was to lay out the specific steps needed to run 2001 in the form of a Directed Acyclic Graph (DAG). In the case of 2001 this resulted in 7 items. For each of these a Document Type Definition (DTD) was created, namely,
1) ftp_data.dtd
2) idl_preprocess.dtd
3) input_2001_D1.dtd
4) run_2001_D1.dtd
5) inst_D1_output.dtd
6) calc_aveday_D1.dtd
7) D1_2001product.dtd
Each dtd is designed to identify and save useful information about its processing step: the who, what, where, when, and why. It is useful to think backwards, from the kinds of questions you are likely to ask later about the processing steps or data used, to the sort of metadata that is worth capturing now.
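As a rough illustration of the DAG idea, the seven items can be written down as nodes with dependencies and sorted into a valid processing order. This is only a sketch: the source does not spell out the edges, so the simple chain below (each item depending only on the one before it) is an assumption, and the names STEPS, DAG and processing_order are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# The seven dtds, in the order they are listed above.
# (Step 6 is described later in this document as calc_daily_ave_D1.dtd.)
STEPS = [
    "ftp_data.dtd",
    "idl_preprocess.dtd",
    "input_2001_D1.dtd",
    "run_2001_D1.dtd",
    "inst_D1_output.dtd",
    "calc_aveday_D1.dtd",
    "D1_2001product.dtd",
]

# Assumption: a simple chain, each item depending only on the previous one.
# The dictionary maps each node to the set of nodes it depends on.
DAG = {step: set() for step in STEPS}
for previous, current in zip(STEPS, STEPS[1:]):
    DAG[current].add(previous)


def processing_order(dag):
    """Return the nodes in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


order = processing_order(DAG)
for number, step in enumerate(order, start=1):
    print(number, step)
```

A real DAG may branch (for example, several inputs feeding one run); the same dictionary form handles that by adding more than one dependency to a node.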
======================
Description of each dtd.
The following is a brief description of each of the seven dtds. The associated tags (metadata) for each dtd are listed in parentheses after its name.
1) ftp_data.dtd (data_format, data_resolution, date_processed?, ftp_date,
ftpsource, local_data_file, platform, provider, sensor,
user_name-psswd?, what_is_it)
This dtd records information associated with the acquisition of data via ftp. This is the first step in the 2001 processing stream. The source for the data is outside our local domain, in this case ISCCP D1 data from the Langley Research Center.
It is important to include the date of acquisition, since there may be different versions of data from the same provider for the same day. Ideally this information would allow the reacquisition of the source data.
Caveats: Websites may decay with age. Sensors/Providers may change over time.
2) idl_preprocess.dtd (account_used, host_name, idl_version, local_script_file, operating_system, who?)
This dtd records metadata associated with the running of an IDL script in general and the preparation of 2001 input data in particular, including the platform, the IDL version, who ran it, and the name of the file where the actual IDL commands can be found.
This particular IDL script reads in the ISCCP D1 data acquired via ftp in HDF format and writes out the files required as inputs to the 2001 model. Some simple calculations are made, such as totaling the precipitable water from the individual layers provided. There are also conversions into the units expected by 2001. It is at this point that the spatial domain of the available ISCCP data is cropped from the entire globe to all longitudes between 60 south and 60 north latitude.
To minimize calculations performed by 2001, these input data are saved as a one dimensional vector which is later (calc_daily_ave_D1.dtd) transformed to fit a regular grid.
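The two preprocessing operations described above (totaling precipitable water over layers, and cropping to the 60 south to 60 north band while keeping a flat vector) might look roughly like the following. The actual IDL script is not shown in this document, so this is a Python sketch under stated assumptions: the function names, the (latitude, longitude, value) tuple layout, the inclusive bounds, and the sample numbers are all hypothetical.

```python
def total_precipitable_water(layers):
    """Sum the per-layer precipitable water amounts for one cell."""
    return sum(layers)


def crop_to_latitude_band(cells, south=-60.0, north=60.0):
    """Keep only cells whose latitude lies within [south, north].

    Each cell is a (latitude, longitude, value) tuple, so the result
    is still a flat, one dimensional list.
    """
    return [cell for cell in cells if south <= cell[0] <= north]


# Made-up sample values, for illustration only.
cells = [
    (-75.0, 10.0, 1.2),
    (-60.0, 10.0, 2.5),
    (0.0, 120.0, 4.1),
    (62.5, 300.0, 0.9),
]
cropped = crop_to_latitude_band(cells)
pw_total = total_precipitable_water([0.5, 1.0, 1.5])

print(cropped)
print(pw_total)
```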
3) input_2001_D1.dtd (data_2001_format, input_2001_date, input_param*,
local_2001_file)
Describes the input files used in a single run of the 2001 model. The ISCCP D1 data is available every 3 hours. Each run of 2001 uses 14 input files.
Example (for 1200 GMT, January 15th, 1993):
93011512.c_sat 93011512.pw 93011512.snow
93011512.c_sun 93011512.rel_az 93011512.surf_alb
93011512.cf 93011512.sat_type latitude
93011512.ozone 93011512.scrad_cld longitude
surf_type visibility
Caveats: these are binary files, so their representation will be platform dependent.
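The file list in the example follows a visible pattern: ten files carry the time stamp as a prefix and four do not. A minimal sketch of building the 14 names for one run, and the stamps for one day at 3-hour spacing, could look like this. The function names are hypothetical, and the assumption that the day's slots start at hour 00 is inferred from the 3-hourly spacing and the hour-12 example, not stated in the source.

```python
# Suffixes of the files that carry a time stamp in the example above.
STAMPED_SUFFIXES = [
    "c_sat", "c_sun", "cf", "ozone", "pw", "rel_az",
    "sat_type", "scrad_cld", "snow", "surf_alb",
]

# Files in the example that have no time stamp.
FIXED_FILES = ["latitude", "longitude", "surf_type", "visibility"]


def input_files(stamp):
    """Return the 14 input file names for one run, given its time stamp."""
    return [f"{stamp}.{suffix}" for suffix in STAMPED_SUFFIXES] + FIXED_FILES


def daily_stamps(day_prefix, step_hours=3):
    """Return the stamps for one day, assuming slots start at hour 00."""
    return [f"{day_prefix}{hour:02d}" for hour in range(0, 24, step_hours)]


files = input_files("93011512")
stamps = daily_stamps("930115")

print(len(files), files)
print(stamps)
```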
4) run_2001_D1.dtd (account_used, command_line, control_file,
host_name, operating_system, who?)
This dtd keeps track of the metadata associated with the actual running of the 2001 model (i.e. the who, when, where, platform, etc.).
5) inst_D1_output.dtd (dimension, format, inst_data_date, latrange,
longrange, out*, resolution)
This dtd contains metadata about the attributes of the 2001 model output. It keeps track of what parameter is produced (i.e. SW, PAR, UVA, UVB), its dimensions (rows/columns, latrange/longrange), its format, and the output filenames (and pathname).
For each parameter there are 3 associated files. For example, SW, SW clear and CL_SW, where
SW = instantaneous SW calculated from all given inputs
SW.clear = instantaneous SW calculated for clear sky conditions
CL.SW = A model related parameter which relates SW to SW.clear
The CL parameter is independent of solar zenith angle (time of day) and is used to calculate daily averages of a given parameter.
6) calc_daily_ave_D1.dtd (account_used, host_name, idl_version,
local_script_file, operating_system, what_is_it, who?)
This dtd keeps track of metadata associated with the IDL process used to determine daily averages from the 8 instantaneous values calculated by 2001. In addition, this step converts the data from a one dimensional vector (representing the equal area data given by ISCCP) onto a 2 dimensional regular grid (an equal angle projection = 2.5 x 2.5 degrees).
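To make the target grid concrete, the arithmetic for a 2.5 x 2.5 degree equal angle grid can be sketched as below. This covers only the grid layout, not the daily averaging or a proper equal area to equal angle resampling (which would need area weighting). The latitude band of 60 south to 60 north is assumed to match the cropped input, the 0 to 360 longitude convention is an assumption, and grid_index and to_grid are hypothetical names.

```python
CELL_DEG = 2.5
LAT_MIN, LAT_MAX = -60.0, 60.0   # assumed: same band as the cropped input
LON_MIN, LON_MAX = 0.0, 360.0    # assumed longitude convention

N_ROWS = int((LAT_MAX - LAT_MIN) / CELL_DEG)  # 120 / 2.5 = 48
N_COLS = int((LON_MAX - LON_MIN) / CELL_DEG)  # 360 / 2.5 = 144


def grid_index(lat, lon):
    """Row and column of the 2.5 degree cell containing (lat, lon)."""
    # The northern edge (lat == LAT_MAX) is folded into the last row.
    row = min(int((lat - LAT_MIN) / CELL_DEG), N_ROWS - 1)
    # Wrap longitude into [0, 360) before computing the column.
    col = int(((lon - LON_MIN) % 360.0) / CELL_DEG)
    return row, col


def to_grid(points, fill=None):
    """Place (lat, lon, value) points from a flat vector onto the 2-D grid.

    If two points fall in the same cell, the later one overwrites the
    earlier one; a real conversion would combine them instead.
    """
    grid = [[fill] * N_COLS for _ in range(N_ROWS)]
    for lat, lon, value in points:
        row, col = grid_index(lat, lon)
        grid[row][col] = value
    return grid


grid = to_grid([(0.0, 180.0, 5.0)])
print(N_ROWS, N_COLS, grid[24][72])
```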
7) D1_2001product.dtd (consumer?, creation_date, filename, format,
lower_left_lat, lower_left_lon, product_date, product_type,
resolution, start_date, temporal_span, upper_right_lat,
upper_right_lon)
This dtd saves the information about the finished product that is most important to the end user, such as the spatial/temporal extent and resolution of the product, who made it and when, its filename and format, and who might use it in the future.
=====================================================
Have dtd's will slurp.....
There are two basic strategies for having your model runs inform the database. In the first case, the calls to the database are coded into the "wrapper scripts" that run your model. These access the database daemon and send it the information for the tags in your dtds, which is then available to the database. The second method, used by the 2001 model, is to have the individual processing steps write out simple flat files as they go. These have the format:
dtd_name | dtd_tag_name | value
These flat files are later ingested into the database in a separate operation. The benefit of this latter method is that glitches in the operation of the database do not interrupt the running of your model.
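A minimal sketch of writing and reading records in this three-field format is shown below. It is not the ESSW ingest code: the function names are hypothetical, the sample values are invented, and whether dtd_name is written with or without the ".dtd" extension is an assumption (the bare name is used here).

```python
def format_record(dtd_name, tag_name, value):
    """Build one flat-file line in the dtd_name | dtd_tag_name | value form."""
    return f"{dtd_name} | {tag_name} | {value}"


def parse_record(line):
    """Split one flat-file line back into its three fields."""
    # Split on the first two separators only, so a value may contain "|".
    dtd_name, tag_name, value = (part.strip() for part in line.split("|", 2))
    return dtd_name, tag_name, value


def group_records(lines):
    """Collect parsed lines into {dtd_name: {tag_name: value}}."""
    grouped = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        dtd_name, tag_name, value = parse_record(line)
        grouped.setdefault(dtd_name, {})[tag_name] = value
    return grouped


# Invented values, for illustration only.
lines = [
    format_record("ftp_data", "provider", "ISCCP"),
    format_record("ftp_data", "data_format", "hdf"),
    format_record("run_2001_D1", "host_name", "example-host"),
]
records = group_records(lines)

for line in lines:
    print(line)
```

Because each processing step only appends text lines, a step can record its metadata without the database being reachable; grouping by dtd name happens later, at ingest time.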
======================================================
The final step of ESSW-izing your model is to build some kind of product interface. The 2001 model has a product ordering tool at
http://essw.bren.ucsb.edu/cgi-bin/2001.cgi where you can select one of the 4 products, a date range, and a format, and look at browse images before you order and receive products.

