NAME
db2stationxml - export station metadata to FDSN StationXML format.
SYNOPSIS
#include "stationxml.h"
int db2stationxml(Dbptr db, Pf *pf, Arr *subset_rule, ostream &out, int flags)
DESCRIPTION
db2stationxml exports the station metadata from an open Datascope database into the International Federation of Digital Seismic Networks (FDSN) StationXML (v1.0) format. This format is designed to closely represent several of the major Standard for the Exchange of Earthquake Data (SEED) 2.4 metadata structures. A detailed description of the StationXML format can be found at the FDSN website:
http://www.fdsn.org/xml/station/
The output XML file contains the metadata tree, nested according to the network-station-channel-location paradigm used in SEED.
Root
  - Network(s)
    - Station(s)
      - Channel/Location combinations
        - Response
Each network may contain multiple stations, which may in turn have multiple channels. Each channel may have a response that is described by multiple filter stages.
Because there is no one-to-one correspondence between the metadata contained in css3.0 and SEED 2.4, not every StationXML element will be represented. This is documented in BUGS AND CAVEATS.
The station metadata is gathered from several tables located in the dbmaster directory.
- network: network information
- snetsta: SEED network and station mappings
- site: station details, such as location
- sitechan: channel details, such as instrument orientation
- instrument: instrument name and sample rate
- calibration: detector details
- stage: detailed response information
Flags
Verbosity is a number from 0 to 3; higher numbers produce more verbose output.
The following flags control the exported level of detail:
-
STAXML_LEVEL_NETWORK
Export network-level details, including only the network names and code.
-
STAXML_LEVEL_STATION
Export both network and station details. The station details include the location and description of the station.
-
STAXML_LEVEL_CHANNEL (default)
Export network, station, and channel details, including the instrument orientation and a simple response summary.
-
STAXML_LEVEL_RESPONSE
Export all details including the filter stage responses.
For verbose output at the station level:
int flags = 1 | STAXML_LEVEL_STATION;
FILES
The output StationXML roughly follows the format:
<FDSNStationXML>
  <Network code="IU"... >
    network details here, such as network name
    <Station code="ANMO" startDate="2011-08-12T00:00:00"
             endDate="2599-12-31T23:59:59">
      station details here, such as latitude, longitude, depth, etc.
      <Channel code="BHZ" locationCode="00" startDate="2011-08-12T00:00:00"
               endDate="2599-12-31T23:59:59">
        channel level details, such as depth and instrument orientation
        <Response>
          response level details, including stage info
        </Response>
      </Channel>
    </Station>
  </Network>
</FDSNStationXML>
The above example would have been created from a hypothetical CSS 3.0 database where station = "ANMO_IU" and channel = "BHZ_00".
When the station or channel setup has been through multiple iterations (e.g., when a sensor has changed or some station detail has changed), then the information is repeated using the appropriate start and end dates for each setup.
Full details describing the StationXML format can be found at the FDSN website
http://www.fdsn.org/xml/station/
PARAMETER FILE
The parameter file db2stationxml.pf primarily contains a series of group structures that provide the blueprints used to translate the Datascope database into FDSN StationXML. Each group represents the information for one level of the XML tree, such as the element name, attributes, and child elements. Child elements may be either a simple element-value combination or may refer to additional groups.
The creation of an XML file begins with db2stationxml parsing the group named by the root_node parameter. The program will recursively parse any group that is referenced. The following examples illustrate these relationships between groups and elements.
Groups
A group is an array containing a name, an attribute table, and an element table. The following represents a simplified parameter file containing only the root_node and the outline of a group.
# simplified pf example
root_node name of first group to parse
sample_group &Arr{
    name element name, a.k.a. tag name used to enclose this block of XML
    attributes &Tbl{
        list of attributes that appear with the opening tag
    }
    elements &Tbl{
        list describing the child elements of this group
    }
}
The attributes and elements tables each consist of a series of lines of the format:
xml_name flag type value
for example:
color r text green
Where:
-
xml_name
Name of the element or attribute as it appears in XML or as another top-level pf key. If the name is - then no surrounding tag will be created. This allows for more flexible grouping.
-
flag
A code stating whether this item must appear in the XML even when the database holds no appropriate data for it. Values include:
-
o - optional: If the value is empty, then it should not appear in the output
-
r - required: Include in output, even if value is empty
-
x - ignore this field. The field is not parsed by db2stationxml
-
type
Provides instructions on how to interpret this line. Values may be:
-
text - insert value into the xml exactly as it appears
-
lookup - insert value from the specified table.field
-
epoch - db contains a dbTIME, which will be output as YYYY-MM-DDTHH:MM:SS.FFFFF
-
julian - db contains a dbYEAR_DAY, which will be output as YYYY-MM-DDTHH:MM:SS
-
float - value from db is a dbREAL, convert to string for xml
-
integer - value from db is dbINTEGER, convert to string for xml
-
other_pf:site - the value is a field that comes from a different pf (in this case, site)
-
GROUP - xml_name refers to another top-level key in the pf that describes a child element
-
CATEGORIZED_GROUP - as GROUP, except the value contains an expression or field name used to create multiple children
-
CATEGORIZED_BY_NUMBER - as CATEGORIZED_GROUP except each category is assumed to be numeric, and is returned in numeric sorted order.
-
value
Expression, text, or table.field name that represents the value to be output to XML.
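The four-field line format described above can be split with ordinary whitespace parsing. The following is a minimal sketch, not the parser used by db2stationxml; the PfEntry struct and parse_entry function are hypothetical names introduced only for illustration. It assumes the first three fields contain no spaces and that the value is everything that follows.

```cpp
#include <sstream>
#include <string>

// Hypothetical container for one line of an attributes or elements table.
struct PfEntry {
    std::string xml_name;  // element or attribute name, or "-" for no tag
    char flag;             // 'o' (optional), 'r' (required), or 'x' (ignore)
    std::string type;      // e.g. "text", "lookup", "GROUP"
    std::string value;     // remainder of the line; may contain spaces
};

// Split "xml_name flag type value" on whitespace. The value keeps
// everything after the third field. Returns false for malformed lines.
bool parse_entry(const std::string &line, PfEntry &entry)
{
    std::istringstream in(line);
    std::string flag;
    if (!(in >> entry.xml_name >> flag >> entry.type)) return false;
    if (flag.size() != 1 || std::string("orx").find(flag[0]) == std::string::npos)
        return false;
    entry.flag = flag[0];
    entry.value.clear();
    std::getline(in >> std::ws, entry.value);  // skip leading blanks, keep the rest
    return true;
}
```

Calling parse_entry on the line "color r text green" fills the entry with the name color, the flag r, the type text, and the value green.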
Group example
Here is an example of how a group structure would translate to XML.
# simplified pf example
root_node dog_group
dog_group &Arr{
    name Dog
    attributes &Tbl{
        color r text black
        age r integer 5
    }
    elements &Tbl{
        description r text Friendly and loyal
    }
}
When parsed, this parameter file would generate the following XML:
<Dog color="black" age="5">
  <description>Friendly and loyal</description>
</Dog>
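The translation from a group to XML can be sketched in a few lines of C++. This is only an illustration of the idea, under the assumption that all values have already been resolved to strings; the Group struct, xml_escape, and render_group are hypothetical names and do not come from the db2stationxml source.

```cpp
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<std::string, std::string> > NameValueList;

// Hypothetical in-memory form of a group whose values are already resolved.
struct Group {
    std::string name;          // tag name, e.g. "Dog"
    NameValueList attributes;  // rendered inside the opening tag
    NameValueList elements;    // rendered as simple child elements
};

// Escape the characters that are special in XML text and attribute values.
std::string xml_escape(const std::string &text)
{
    std::string out;
    for (char c : text) {
        switch (c) {
            case '&': out += "&amp;";  break;
            case '<': out += "&lt;";   break;
            case '>': out += "&gt;";   break;
            case '"': out += "&quot;"; break;
            default:  out += c;
        }
    }
    return out;
}

// Build the opening tag with its attributes, then one line per child element.
std::string render_group(const Group &group)
{
    std::string xml = "<" + group.name;
    for (const auto &attr : group.attributes)
        xml += " " + attr.first + "=\"" + xml_escape(attr.second) + "\"";
    xml += ">\n";
    for (const auto &elem : group.elements)
        xml += "  <" + elem.first + ">" + xml_escape(elem.second) +
               "</" + elem.first + ">\n";
    xml += "</" + group.name + ">\n";
    return xml;
}
```

Filling a Group with the name Dog, the attributes color=black and age=5, and the element description yields the same three-line structure as the Dog example.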
If the values were instead retrieved from a hypothetical pets database with a table called petdesc, then the dog_group might have been described as follows:
dog_group &Arr{
    name Dog
    attributes &Tbl{
        color r lookup petdesc.color
        age r integer petdesc.age
    }
    elements &Tbl{
        description r lookup petdesc.personality
    }
}
css 3.0 group example
The following example parameter file would query a css 3.0 database to create a simple XML structure that lists the stations that exist within a network.
root_node root_group
root_group &Arr{
    name ROOT
    attributes &Tbl{
    }
    elements &Tbl{
        network_group r CATEGORIZED_GROUP snetsta.snet
    }
}
network_group &Arr{
    name Network
    attributes &Tbl{
        Code r lookup snetsta.snet
        created r epoch now()
    }
    elements &Tbl{
        Description o lookup network.description
        simple_sta_group r CATEGORIZED_GROUP snetsta.fsta
    }
}
simple_sta_group &Arr{
    name Station
    attributes &Tbl{
        Code r lookup snetsta.fsta
    }
    elements &Tbl{
    }
}
Depending on what networks and stations exist in the database, the resulting XML might look something like:
<ROOT>
  <Network Code="AV" created="2016-03-21T00:00:00">
    <Description>Some description here</Description>
    <Station Code="OKCF"></Station>
    <Station Code="OKTU"></Station>
  </Network>
  <Network Code="IU" created="2016-03-21T00:00:00">
    <Description>Some description here</Description>
    <Station Code="ANMO"></Station>
    <Station Code="ANTO"></Station>
  </Network>
</ROOT>
An arbitrarily complex XML tree can be created through the combination of multiple groups.
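The effect of CATEGORIZED_GROUP, one child element per distinct value of the categorizing field, can be sketched as a simple binning step. The code below is an assumption-laden illustration rather than the library's implementation: render_networks is a hypothetical function, and the (snet, fsta) pairs stand in for rows of the snetsta table.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Each row is a (snet, fsta) pair, standing in for one row of snetsta.
typedef std::vector<std::pair<std::string, std::string> > SnetstaRows;

// Bin the rows by network code, then emit one <Network> per bin and
// one <Station> per row in that bin. std::map keeps the bins sorted.
std::string render_networks(const SnetstaRows &rows)
{
    std::map<std::string, std::vector<std::string> > by_network;
    for (const auto &row : rows)
        by_network[row.first].push_back(row.second);

    std::string xml = "<ROOT>\n";
    for (const auto &net : by_network) {
        xml += "  <Network Code=\"" + net.first + "\">\n";
        for (const auto &sta : net.second)
            xml += "    <Station Code=\"" + sta + "\"></Station>\n";
        xml += "  </Network>\n";
    }
    xml += "</ROOT>\n";
    return xml;
}
```

The sketch omits the created attribute and the Description element, and it does not remove duplicate station rows.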
Parameters
-
schema
Specifies which version of the CSS schema is described by the parameter file.
-
stationxml_version
-
root_node
This is the key name of the root node. The program starts creating the output XML based on the information contained within the root node.
-
subset_criteria
This array contains additional filters to apply to the database before exporting to XML.
Each line should be a table.field combination followed by a regular expression.
Valid tables are sitechan, snetsta, and schanloc.
For example: schanloc.fchan /LCE/
-
schema_location
URL of the primary schema document, required by FDSN Station XML
-
schema_xsd
URL of the XML Schema Definition (XSD) that describes FDSN StationXML
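To illustrate how a subset_criteria rule such as schanloc.fchan /LCE/ might narrow a set of field values, here is a small sketch using std::regex. It is not the filtering code used by db2stationxml: strip_slashes and apply_rule are hypothetical helpers, and the sketch assumes the expression must match the whole field value, which may differ from the matching rules the library actually applies.

```cpp
#include <regex>
#include <string>
#include <vector>

// Remove the surrounding slashes from a rule such as "/LCE/".
std::string strip_slashes(const std::string &rule)
{
    if (rule.size() >= 2 && rule.front() == '/' && rule.back() == '/')
        return rule.substr(1, rule.size() - 2);
    return rule;
}

// Keep only the values that the rule matches in full (an assumption).
std::vector<std::string> apply_rule(const std::vector<std::string> &values,
                                    const std::string &rule)
{
    const std::regex pattern(strip_slashes(rule));
    std::vector<std::string> kept;
    for (const auto &value : values)
        if (std::regex_match(value, pattern)) kept.push_back(value);
    return kept;
}
```

Under the whole-value assumption, the rule /LCE/ keeps only the value LCE, while /LCE.*/ also keeps values such as LCE_00.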
Formats
-
epochtime_xmlstring_format
Describes format used to output dates and times to XML
-
jdate_xmlstring_format
Describes format used to output dates to XML
-
unknown_text
Placeholder for unknown data in the resulting stationxml
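The two date conversions that these format parameters govern can be sketched with the standard C library. The functions epoch_to_xml and jdate_to_xml are hypothetical names, the output layout is hard-coded rather than read from the parameter file, and fractional seconds are dropped, so this is only an approximation of what the library produces.

```cpp
#include <cstdio>
#include <ctime>
#include <string>

// Format whole epoch seconds (UTC) as YYYY-MM-DDTHH:MM:SS.
std::string epoch_to_xml(std::time_t epoch)
{
    const std::tm *utc = std::gmtime(&epoch);
    if (utc == NULL) return "";
    char buf[32];
    std::strftime(buf, sizeof buf, "%Y-%m-%dT%H:%M:%S", utc);
    return buf;
}

// Convert a year-and-day-of-year value such as 2011224 to a date string.
std::string jdate_to_xml(int jdate)
{
    const int year = jdate / 1000;
    int day = jdate % 1000;  // 1-based day of year
    const bool leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    const int days[12] = {31, leap ? 29 : 28, 31, 30, 31, 30,
                          31, 31, 30, 31, 30, 31};
    int month = 0;
    while (month < 11 && day > days[month]) {
        day -= days[month];  // step past each full month
        ++month;
    }
    char buf[32];
    std::snprintf(buf, sizeof buf, "%04d-%02d-%02dT00:00:00", year, month + 1, day);
    return buf;
}
```

For instance, the epoch time 1313107200 and the jdate 2011224 both correspond to 2011-08-12T00:00:00.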
Namespace Declarations
The namespace provides each XML element with a unique identity that prevents confusion between elements with the same name. There are multiple namespaces used within a StationXML document. The default namespace is defined by the FDSN. When additional definitions are needed, they must be associated with a different namespace. These will appear in the document as namespace:element. Each namespace must be associated with a URI, though the URI need not refer to an existing document on the Internet.
-
namespace_stationxml
This is the namespace that StationXML belongs to.
-
namespace_schema_instance
Schema that describes XML in general.
-
namespace_css30
Namespace for CSS3.0-specific additions to the FDSNStationXML format.
Element groups
An element group describes the XML format for everything between the opening and closing tags for that node. For example, the network_group describes the database-XML relationship for everything between <Network> and </Network>, including the attributes (such as code) and child elements (such as description). Each group may contain additional groups, e.g., station_starttime_group.
The major groups are listed below:
-
root_fdsnstationxml_v1.0_css3.0
This group represents the root of the document.
-
unknown_group
This represents a generic grouping.
-
network_group
Describes <Network> element. This group contains the network details.
-
station_starttime_group
This group allows stations to be grouped by epoch. This is a logical grouping, and has no specific elements associated with it.
-
station_group
Describes <Station> element. This group contains the details for each station, such as station code, location, and station description.
-
channel_starttime_group
This is a logical grouping that allows channels to be grouped by epoch.
-
channel_group
Describes <Channel> element. This group contains channel information, such as channel and location codes and sensor orientations, and is equivalent to SEED blockette 52.
-
calib_units_group
Corresponds to SEED blockette 57.
-
stage
Describes <Stage> element. Represents channel response covering SEED blockettes 53-56. Most of the details for this group are retrieved from the response files.
EXAMPLE
% cat staxml_example.cpp
#include <stdlib.h>
#include <iostream>
#include <fstream>
#include "db.h"
#include "stock.h"
#include "dbstationxml.h"
using namespace std;

int main()
{
    Dbptr db;
    // writable arrays, in case the Datascope calls take non-const char *
    char database[] = "/opt/antelope/data/demo/gsn/dbmaster/gsn";
    char pfname[] = "db2stationxml.pf";
    char outputname[] = "out.xml";
    char perm[] = "r";
    Pf *pf = NULL;
    int verbose = 1;    // verbosity, from 0 to 3

    if ( pfread( pfname, &pf ) != 0 ) return -1;
    if ( dbopen( database, perm, &db ) < 0 ) return -1;

    // if desired, create rules used to subset the database
    // arr[table.field] = 'value';
    char rule_key[] = "snetsta.fsta";
    char rule_value[] = "/B.*";
    Arr *subset_rules = newarr(0);
    setarr( subset_rules, rule_key, rule_value );

    int result = -1;
    ofstream ofs;
    ofs.open( outputname );
    if ( ofs.good() )
    {
        // write the StationXML directly to the output stream
        result = db2stationxml( db, pf, subset_rules, ofs, verbose | STAXML_LEVEL_RESPONSE );
        ofs.close();
    }
    dbclose( db );
    return result;
}
RETURN VALUES
db2stationxml returns 0 upon success, nonzero upon error. Any errors encountered are registered with the elog(3) facility.
LIBRARY
-ldbstationxml $(DBLIBS)
SEE ALSO
db2stationxml(1)
BUGS AND CAVEATS
db2stationxml works with any Datascope database using the css3.0 schema.
CSS3.0 limitations to the datafile:
Latitudes, longitudes, and elevations belong solely to the site table, meaning that they are associated with a station. Channels will have the same Lat/Lon/Elev as the parent station.
Calibration values are retrieved from calibration.calib.
Response stages
db2stationxml is designed for metadata deployments that use full stage-table information, and therefore retrieves response data via the stage table (stage.dir, stage.dfile). Thus instrument.dir and instrument.dfile are ignored. Databases created with dbbuild have the stage directory, as do databases created with seed2db using the -stagedir option.
Fields not represented by stationxml
A few database fields appear to contain useful information but have no direct equivalent in the StationXML format. These were added as attributes in the appropriate XML sections and are denoted by the namespace css30.
Specifically:
Network has attribute css30:netType containing network.nettype
Channel has attribute css30:chanType containing sitechan.ctype
Sensor has attribute css30:responseFrequencyBand containing instrument.band
InstrumentSensitivity->InputUnits has attribute css30:segtype containing calibration.segtype
Stage has attribute css30:AntelopeStageType containing stage.gtype
Not every CSS3.0 field has been exported. Database-specific fields, such as keys and lddates, are not included in the StationXML output.
Units
Units are copied directly from the database without conversion. It has been the author's experience that an Antelope database may host a variety of units, or the same units written in a variety of inconsistent ways, subject to data entry errors. By not trying to interpret these, I avoid having to chase (and convert) related values and units (e.g., if nm/s is changed to m/s, then at minimum the calib will change too).
Miscellaneous fields
Fields that appear to be unused or whose purpose wasn't clear in the development database have been ignored. For example, stage.leadfac.
site.refsta, site.dnorth, and site.deast do not have functional equivalents in StationXML. Instead, each channel can have its own location. These fields have not been incorporated into the StationXML output.
It is possible to create a circular reference if a group refers to a parent group.
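A parameter file could be checked for such circular references before it is used. The following depth-first search is a sketch of one way to do that; it is not part of db2stationxml. GroupRefs, has_cycle_from, and has_circular_reference are hypothetical names, and the map is assumed to list, for each group, the groups its elements table refers to.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// For each group name, the names of the groups it references.
typedef std::map<std::string, std::vector<std::string> > GroupRefs;

// Depth-first walk that tracks the groups on the current path.
// Meeting a group that is already on the path means a cycle exists.
bool has_cycle_from(const GroupRefs &refs, const std::string &group,
                    std::set<std::string> &on_path)
{
    if (!on_path.insert(group).second) return true;
    GroupRefs::const_iterator it = refs.find(group);
    if (it != refs.end())
        for (const auto &child : it->second)
            if (has_cycle_from(refs, child, on_path)) return true;
    on_path.erase(group);  // leaving this branch of the tree
    return false;
}

// True if following references from the root ever returns to an ancestor.
bool has_circular_reference(const GroupRefs &refs, const std::string &root)
{
    std::set<std::string> on_path;
    return has_cycle_from(refs, root, on_path);
}
```

With root_group referring to network_group and network_group referring to simple_sta_group, the check reports no cycle; adding a reference from simple_sta_group back to network_group makes it report one.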
Known issues
Empty lines in the pf file can throw off the parsing routine.
The elements and attributes tables within the parameter file contain flags specifying optional o and required r, but both are treated as required.
Possible Design impact
When binning data into categories, a single category of -1 means that no category exists.
AUTHOR
Celso G. Reyes