NAME
rtexec_setup - setting up an rt processing node
DESCRIPTION
This page describes how to set up a directory to run as a real time system
node. Such a node is highly configurable, so this is not simply a matter
of rote, but the following gives a general outline of the procedure.
This presumes that the computer is already configured with the correct
operating system, with the Antelope real time software
installed under /opt/antelope, including perl and tcl/tk.
It's often useful to have top(1), xcpustate(1) and less(1)
installed also.
Initial Setup
A processing node is usually run in the home directory of a dummy
user rt. Multiple people may have access to the rt account,
to perform common maintenance operations. Begin, therefore, by
creating the rt user account.
Next, make a subdirectory of ~rt to hold the processing for
this network. We have found that a generalized name, such as rtsystems,
allows for flexibility in operations, particularly when more than a single
processing node will be run. Then, within this subdirectory, create all the
files and directories necessary to run the network.
The shortcut approach at this point is to run rtinit, which
creates a number of directories, and copies in a number of parameter
files. You must still at least create the master database, and customize
the parameter files, however.
-
Make the directories db, bin, dbmaster, logs, orb, pf, rtlogs, rtsys, and state.
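If you create these by hand rather than with rtinit, a single mkdir does it:
% mkdir db bin dbmaster logs orb pf rtlogs rtsys state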
-
Create the master database in the directory dbmaster -- this database
includes the generally static tables network, site, sitechan, sensor and instrument.
This step can be very labor intensive. Look at dbbuild(1) for information on building
your dbmaster from field notes, and seed2db(1) for building your dbmaster from a
pre-existing dataless SEED volume.
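For example, starting from a pre-existing dataless SEED volume, the dbmaster
might be built with something along these lines (the volume name is only a
placeholder, and the exact arguments should be checked against seed2db(1)):
% cd dbmaster
% seed2db mynet.dataless nw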
-
Create a database descriptor file in the db directory. In its simplest form,
it might look like this:
% cat nw
# Datascope Database Descriptor File
schema css3.0
dbpath ../dbmaster/{nw}
dblocks 1
This construction causes the active database in this directory
to reference the static tables in the directory dbmaster, as well
as the lastid table there. Run dbe against this database to be sure
that you see the lastid table.
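For example:
% cd db
% dbe nw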
In a more complex installation, you might wish to have a
central server for database ids, and use locks that allow
machines which nfs-mount the rt directory to have write access to
the database (e.g., for dbloc2(1)). To accomplish this,
you must:
-
1)
Set up dbids(1) on some machine where it runs
constantly, perhaps as a task under rtexec.
-
2)
Modify the descriptor file to specify the id server and nfs locks:
% cat nw
# Datascope Database Descriptor File
schema css3.0
dbpath ../dbmaster/{nw}
dblocks nfs
dbidserver rthost
-
If you choose not to use an id server, you should explicitly create a
lastid table in the dbmaster directory -- a command like dbnextid nw wfid
does the trick.
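For example, from the dbmaster directory:
% cd dbmaster
% dbnextid nw wfid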
This lastid table must be used as the master lastid
table by all programs which generate database ids, to ensure that
ids are compatible across all real time databases. Typical programs
which generate ids are orbassoc(1) (a real time location program), orb2db(1)
(which copies waveform data to a database), and dbloc2(1) (the program to
review locations).
-
Copy rtexec.pf from $ANTELOPE/data/pf to the local directory, and edit
it to configure the programs you want to run. In particular, you
need to edit the Processes list and the Run list. The Processes
list probably begins with the orbserver. Then there are the task(s)
which bring data into the ring buffer, for example q3302orb(1) and orb2orb(1).
Finally, there should be the processing programs
(orbdetect(1) and orbassoc(1)), and the programs
to save waveforms and processing results (orb2db(1) and orb2dbt(1)).
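Each entry in the Processes table pairs a task name with the command line
rtexec should run, and the Run array turns each task on or off by name. A
minimal sketch of the shape of these lists (the orbserver command line is
only illustrative; take the real command lines from the copied rtexec.pf and
the man pages of the programs you run):
Processes &Tbl{
# task name     command line
orbserver       orbserver -p $ORB orbserver
}
Run &Arr{
# task name     yes/no
orbserver       yes
}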
If more than one real time system is running on a single
machine, the orbservers must use different ports.
Specify a different port on the orbserver command line with the -p option,
and use a hostname:port notation for the clients. The easiest method
is to change the default definition of ORB in the Defines array from ":"
to :ddddd, where ddddd is a port number you choose. This number must be
greater than 2049 and less than 65536, and should be chosen so that it does
not conflict with other services. Most numbers are available, but it
may be useful to check /etc/services to make sure your choice is
free.
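For example, assuming you pick port 6510:
Defines &Arr{
ORB     :6510
}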
You can also make this port number a name by editing the
file $ANTELOPE/data/pf/orbserver_names.pf; this is more mnemonic,
but the orbserver_names.pf file must be copied around to any machines
which have separate installations of Antelope.
-
Copy $ANTELOPE/data/pf/orbrtd.pf to the directory pf, and edit the
traces table if you wish to customize the display of stations
when running the real time waveform display. See orbrtd(1) for
more information.
Ring Buffer
At least one ring buffer is needed per node. This ring buffer contains
both the incoming data, and various processing results: detections,
arrivals, and locations. Some systems may have additional orbservers
for special purposes: a ring buffer for communicating with Quanterra
dataloggers, or a separate ring buffer to separate and save low sample
rate data for longer periods, for example.
-
Copy $ANTELOPE/data/pf/orbserver.pf to the directory pf, and edit the
ringsize parameter, and the valid_ip_addresses table. The ringsize
determines how large your ring buffer is, and thus how many minutes
of data are buffered there. The size chosen should be generous (usually a day or more),
but also be commensurate
with physical memory; larger sizes may occasionally cause undue paging.
With the 64-bit compliation of orbserver, orbs with a ringsize
greater than 2Gb are now possible.
The table valid_ip_addresses may just contain the localhost entry and
a second entry for the ip address of the current host. To allow
users from a host at address 128.243.23.22 readonly access to this orb,
you add a new line like this:
128.243.23.22 255.255.255.255 readonly #
To add every host on the network 128.243.23, change the mask to
255.255.255.0.
A connection is allowed when the masked remote ip address and the
masked ip address from the table match.
Additionally, you can limit the types of packets a single host is
allowed to collect:
128.243.23.22 255.255.255.255 XX_.*(M100|M40|M1) #
This table
can be edited after the system is running to add new hosts (or remove
old ones).
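Putting the pieces together, an edited table might look roughly like this
(the addresses other than localhost are placeholders):
valid_ip_addresses &Tbl{
127.0.0.1       255.255.255.255                 # localhost
192.168.1.10    255.255.255.255                 # this host (placeholder address)
128.243.23.22   255.255.255.0   readonly        # read-only access for the 128.243.23 network
}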
Data input
Data may arrive in the ring buffer from a variety of sources, either
directly from data loggers, from other ring buffers, or (in replay mode)
from an existing database. Depending on the sources for this node,
you may need to copy and edit the parameter files for datalogger import
modules, like
q3302orb(1) for Quanterra Q330 dataloggers.
Other modules like
orb2orb(1) (for copying from another orbserver) and
dbreplay(1) (for
copying from a database) don't require a parameter file.
Event Detection, Location and Magnitudes
-
Copy $ANTELOPE/data/pf/orbdetect.pf to the directory pf. You must
indicate the channels on which the detector runs, and you may want
to customize the detection parameters for each channel, depending on the site.
Refer to orbdetect(1) for more detail on the parameters.
-
Copy $ANTELOPE/data/pf/orbassoc.pf to the directory pf. Edit it to indicate
the grid to be used for a grid search for the location.
Run ttgrid(1) to generate a data file for this grid (usually named ttgrid.pf);
place the resulting file in $ANTELOPE/data/pf.
-
Copy $ANTELOPE/data/pf/orbevproc.pf to the directory pf. Edit it to indicate
what magnitude processing modules should be run. Note that this is a highly
customizable interface where your own magnitude modules can be used. See
orbevproc(1) for additional information.
Saving waveforms and locations
-
Copy $ANTELOPE/data/pf/orb2db.pf to the directory pf. orb2db copies waveforms
to an output database. Its parameter file
specifies the output format; Steim-2 compressed miniSEED is the default
and most compact form, provided the data is continuous. If the incoming
packet data may be out of order, you can specify the maximum number of
packets by which a packet may be late.
If your data is only event data, you should specify an uncompressed
datatype (like s4) and set flush_wf_writes to yes. Otherwise, old data
is held internally by orb2db until new data appears.
Refer to the man page orb2db(1) for more detail.
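For an event-only system, the relevant lines in pf/orb2db.pf would then be
along these lines (the datatype parameter name is an assumption here; check
the parameter names against orb2db(1)):
datatype        s4      # uncompressed, suitable for gappy event data
flush_wf_writes yes     # do not hold old data waiting for following packets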
-
orb2dbt doesn't require a parameter file; it copies processing data in
the form of database records -- e.g., arrival picks and event locations --
to the output database.
Try the Configuration
At this point, you are probably ready to try things out. Begin
by running
rtm(1). rtm complains if the directory structure is
not what it expects.
After fixing any initial configuration problems,
you should have a graphical window with two large buttons, Stop and
Start. Press the Start button, and also press the Processing
Tasks button below the top panel to open a panel which shows
the individual tasks which make up this node.
When the Start button is pressed, rtm runs rtexec, which in turn
starts up the Processes which were specified in the rtexec parameter file.
Watching the Processing Tasks panel, you should see tasks starting
up in the order they were specified in the parameter file.
Here you may discover additional problems if a program starts and
immediately dies. Use the menu button under the task name to look
at the log for the program to see why it failed, and fix the problem.
If the command line is the problem, you can edit it using another of
the menu options for that task.
Once all the programs are running properly, you
may take some time to get the detection and
location parameters properly configured.
Typically, you should run dbloc2 to review the automatic
locations.
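For example, assuming the database descriptor nw set up above:
% cd db
% dbloc2 nw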
Monitoring Operations
Once a processing node is properly set up, it should run independently and
without too much trouble. However, it's important to monitor some
aspects.
The current condition is indicated by the real time monitor
rtm(1) display;
operators typically leave rtm running continuously on the
computer running rtexec, along with an orbmonrtd displaying network
waveforms.
-
Disk Space
It's particularly important to ensure that no disk used by the processing
system fills up, as this is guaranteed to cause a mess that is painful
to deal with. Primarily, this means that older waveforms must be cleaned
off the filesystem at regular intervals. The Disk Usage bars on the
top panel of rtm are intended to warn of problems.
-
Cpu and Memory Usage
The computer should be chosen so that the processing system is a
fairly small load on it. The various load bars on the rtm window
should make it fairly easy to monitor both cpu and memory usage.
If either becomes too high, the cause should be investigated.
(Note that there is often a transient of heavy activity when the system is
started; this is usually not a matter of great concern, and it typically
settles down in 5 to 10 minutes.)
-
Database Size
If the input waveform data is fragmentary (has lots of gaps), this causes
some tables in the database to become large quickly: wfdisc, gap,
and perhaps retransmit and ratechange. If this happens, you may
want to truncate the wfdisc table more often.
-
Program failures/restarts
When a program dies for any reason, rtexec restarts it. Normally programs
should not die; however, if they do, it's easy not to notice. The log
files in the directory logs should be inspected on a regular basis for error
messages. If a program is repeatedly dying, this should be investigated.
It's also useful to run the daily report programs rtsys(1) and rtreport(1),
which send email summarizing the previous day's performance.
rtsys searches for problems and error messages in the log files, and reports
on disk availability. rtreport summarizes the data returned, and
the number of arrivals and events. These programs should be run from
the rtexec crontab, typically sometime after midnight UTC, e.g. 2:00.
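In rtexec.pf this corresponds to crontab entries along the following lines
(the times are examples and any options to rtsys and rtreport are omitted;
the fields are a task name, UTC or LOCAL, then minute, hour, day, month,
weekday, and the command):
crontab &Arr{
rtreport        UTC     0 2 * * *       rtreport
rtsys           UTC     30 2 * * *      rtsys
}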
SEE ALSO
rtinit(1)
install_boot_scripts(1)
rt(5)
rtsetup(8)
rtm(1)
rtexec(1)
orbserver(1)
q3302orb(1)
orbdetect(1)
orbassoc(1)
orbevproc(1)
dborigin2orb(1)
orb2db(1)
orb2dbt(1)
AUTHOR
Daniel Quinlan