If you want to simply test the system and quickly get going, please see the Quickstart section.
The installation bundle includes an install script, install_scripts/single_machine_install.sh, that accommodates installations with a "standard" set of parameters and installs the EPICS archiver appliance on one machine.
In addition to the System requirements, the install_scripts/single_machine_install.sh script will ask for
- The location of the MySQL client jar, typically a file with a name like mysql-connector-java-5.1.21-bin.jar.
- A MySQL connection string, for example --user=archappl --password=archappl --database=archappl, that can be used with the MySQL client like so: mysql ${MYSQL_CONNECTION_STRING} -e "SHOW DATABASES".
This implies that the MySQL schema has already been created using something like
mysql --user=root --password=*****
CREATE DATABASE archappl;
GRANT ALL ON archappl.* TO 'archappl' identified by 'archappl';
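As a quick sanity check, the same connection string that you give the install script should let the MySQL client see the new schema; for example, using the example credentials above:
# Uses the example credentials from above; adjust user/password/database for your site.
export MYSQL_CONNECTION_STRING="--user=archappl --password=archappl --database=archappl"
mysql ${MYSQL_CONNECTION_STRING} -e "SHOW DATABASES"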
The install_scripts/single_machine_install.sh install script also creates a couple of scripts in the deployment folder that can be customized for your site.
sampleStartup.sh - This is a script in the fashion of scripts in /etc/init.d that can be used to start and stop the four Tomcat processes of your archiver appliance.
deployRelease.sh - This can be used to upgrade your installation to a new release of the EPICS archiver appliance. The deployRelease.sh script also includes some post install hooks to deploy your site specific content as outlined here.
appliances.xml
The appliances.xml is a file that lists all the appliances in a cluster of archiver appliances. While it is not necessary to point to the same physical file, the contents are expected to be identical across all appliances in the cluster. The details of the file are outlined in the ConfigService javadoc. A sample appliances.xml with two appliances looks like
<appliances>
  <appliance>
    <identity>appliance0</identity>
    <cluster_inetport>archappl0.slac.stanford.edu:16670</cluster_inetport>
    <mgmt_url>http://archappl0.slac.stanford.edu:17665/mgmt/bpl</mgmt_url>
    <engine_url>http://archappl0.slac.stanford.edu:17666/engine/bpl</engine_url>
    <etl_url>http://archappl0.slac.stanford.edu:17667/etl/bpl</etl_url>
    <retrieval_url>http://archappl0.slac.stanford.edu:17668/retrieval/bpl</retrieval_url>
    <data_retrieval_url>http://archproxy.slac.stanford.edu/archiver/retrieval</data_retrieval_url>
  </appliance>
  <appliance>
    <identity>appliance1</identity>
    <cluster_inetport>archappl1.slac.stanford.edu:16670</cluster_inetport>
    <mgmt_url>http://archappl1.slac.stanford.edu:17665/mgmt/bpl</mgmt_url>
    <engine_url>http://archappl1.slac.stanford.edu:17666/engine/bpl</engine_url>
    <etl_url>http://archappl1.slac.stanford.edu:17667/etl/bpl</etl_url>
    <retrieval_url>http://archappl1.slac.stanford.edu:17668/retrieval/bpl</retrieval_url>
    <data_retrieval_url>http://archproxy.slac.stanford.edu/archiver/retrieval</data_retrieval_url>
  </appliance>
</appliances>
The archiver appliance looks for an environment variable called ARCHAPPL_APPLIANCES for the location of the appliances.xml file. Use an export statement like so
export ARCHAPPL_APPLIANCES=/nfs/epics/archiver/production_appliances.xml
to set the location of the appliances.xml file.
The appliances.xml has one <appliance> section per appliance.
You can have more entries than you have appliances; that is, if you plan to eventually deploy a cluster of 10 machines but only have a budget for 2, you can go ahead and add entries for the other machines.
The cluster should start up even if one or more appliances are missing.
The identity for each appliance is unique to that appliance. For example, the string appliance0 serves to uniquely identify the archiver appliance on the machine archappl0.slac.stanford.edu.
The cluster_inetport is the TCP/IP address:port combination that is used for inter-appliance communication. There is a check made to ensure that the hostname portion of the cluster_inetport is either localhost or the same as that obtained from a call to InetAddress.getLocalHost().getCanonicalHostName(), which typically returns the fully qualified domain name (FQDN). The intent here is to prevent multiple appliances starting up with the same appliance identity (a situation that could potentially lead to data loss).
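A quick, approximate way to check this on each machine is to compare its FQDN against the entry in appliances.xml; this is only a sanity check, since the appliance itself uses InetAddress.getLocalHost().getCanonicalHostName(), which usually, but not always, agrees with the command below.
# Should match the hostname portion of this appliance's cluster_inetport in appliances.xml
hostname --fqdn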
Any member A of a cluster should be able to communicate with any member B of the cluster using B's cluster_inetport as defined in the appliances.xml. localhost should be used for the cluster_inetport only if you have a cluster with only one appliance; even in this case, it's probably more future-proof to use the FQDN.
Note that the port in the cluster_inetport is the same on all machines; this is the port on which the appliances talk to each other.
The mgmt_url has the smallest port number amongst all the web apps.
The retrieval_url is the URL used by the mgmt webapp to talk to the retrieval webapp; the data_retrieval_url is used by archive data retrieval clients to talk to the cluster. In this case, we are pointing all clients to a single load-balancer on archproxy.slac.stanford.edu on port 80.
One can use Apache's mod_proxy_balancer to load-balance among any of the appliances in the cluster. It is common to use AJP for load-balancing between Apache and Tomcat; for this software, we should use simple HTTP, as this workflow does not entail the additional complexity of the AJP protocol.
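For example, a minimal httpd configuration sketch for the archproxy host above might look like the following. This is only illustrative: the balancer name and the /archiver/retrieval path are assumptions based on the sample data_retrieval_url, and mod_proxy, mod_proxy_http and mod_proxy_balancer must be loaded.
# Illustrative only - fan requests out over plain HTTP to the retrieval webapps of the two appliances.
<Proxy "balancer://archiverretrieval">
    BalancerMember "http://archappl0.slac.stanford.edu:17668/retrieval"
    BalancerMember "http://archappl1.slac.stanford.edu:17668/retrieval"
</Proxy>
ProxyPass        "/archiver/retrieval" "balancer://archiverretrieval"
ProxyPassReverse "/archiver/retrieval" "balancer://archiverretrieval"
With something like this in place, clients use the data_retrieval_url http://archproxy.slac.stanford.edu/archiver/retrieval and Apache distributes the requests among the retrieval webapps in the cluster.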
The standard build includes a policies.py (from the tests site) that creates a three stage storage environment. These are
ARCHAPPL_SHORT_TERM_FOLDER - data here is stored at the granularity of an hour.
ARCHAPPL_MEDIUM_TERM_FOLDER - data here is stored at the granularity of a day.
ARCHAPPL_LONG_TERM_FOLDER - data here is stored at the granularity of a year.
To point the appliance at a site specific policies.py file, you can use the ARCHAPPL_POLICIES environment variable, like so.
export ARCHAPPL_POLICIES=/nfs/epics/archiver/production_policies.py
On the other hand, if you are using a site specific build, you can bundle your site-specific policies.py as part of the mgmt WAR during the site specific build. Just add your policies.py to the source code repository under src/sitespecific/YOUR_SITE/classpathfiles and build the WAR by setting the ARCHAPPL_SITEID during the build using something like export ARCHAPPL_SITEID=YOUR_SITE. In this case, you do not need to specify the ARCHAPPL_POLICIES environment variable.
Set TOMCAT_HOME to the location where the Tomcat distribution is expanded. Many of the following steps require a TOMCAT_HOME to be set.
Edit the conf/server.xml file to change the ports to better suit your installation. Set the port of the HTTP connector to the port of the mgmt webapp for this appliance, in this example, 17665.
<Connector connectionTimeout="20000" port="17665" protocol="HTTP/1.1" redirectPort="8443"/>
There are two ports to change in the conf/server.xml file, one for the HTTP connector and the other for the SHUTDOWN command.
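The SHUTDOWN command port sits on the top-level <Server> element in conf/server.xml; a sketch with an illustrative value (the number here is just an example and only needs to be unique among the Tomcat instances on the machine):
<Server port="16000" shutdown="SHUTDOWN"> <!-- example value; pick any unused port -->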
Configure logging by adding a lib/log4j.properties. Here's a sample that logs exceptions and errors, with one exception: log messages logged to the config namespace are logged at the INFO level.
# Set root logger level and its only appender to A1.
log4j.rootLogger=ERROR, A1
log4j.logger.config.org.epics.archiverappliance=INFO
log4j.logger.org.apache.http=ERROR
# A1 is set to be a DailyRollingFileAppender
log4j.appender.A1=org.apache.log4j.DailyRollingFileAppender
log4j.appender.A1.File=arch.log
log4j.appender.A1.DatePattern='.'yyyy-MM-dd
# A1 uses PatternLayout.
log4j.appender.A1.layout=org.apache.log4j.PatternLayout
log4j.appender.A1.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n
To build jsvc (the Apache Commons Daemon), untar ${TOMCAT_HOME}/bin/commons-daemon-native.tar.gz and follow the instructions. Once you have built it, copy the jsvc binary to the Tomcat bin folder for convenience. Note that it is not required that you use Apache Commons Daemon, especially if you are already using system monitoring and management tools like Nagios or Hyperic.
[ bin ]$ tar zxf commons-daemon-native.tar.gz
[ bin ]$ cd commons-daemon-1.1.0-native-src
[ commons-daemon-1.1.0-native-src ]$ cd unix/
[ unix ]$ ./configure
*** Current host ***
checking build system type... x86_64-pc-linux-gnu
...
[ unix ]$ make
(cd native; make all)
...
[ unix ]$ cp jsvc ../../../bin/
Setting innodb_flush_log_at_trx_commit=0 (assuming you are ok with this) will go a long way in improving performance (especially when importing Channel Archiver configuration files etc.).
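This setting goes in the MySQL server configuration; a minimal sketch, assuming the usual my.cnf layout (the exact file and its location vary by distribution):
# In the [mysqld] section of your MySQL server configuration file
[mysqld]
innodb_flush_log_at_trx_commit=0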
Each appliance has its own installation of MySQL. In each appliance,
- Start the MySQL server and configure it to start automatically on reboot (for example, using chkconfig).
- Create a schema/database called archappl and grant a user (in this example, also called archappl) permissions for this schema.
CREATE DATABASE archappl;
GRANT ALL ON archappl.* TO 'archappl'@localhost IDENTIFIED BY '<password>';
The table definitions are in a script called archappl_mysql.sql that is included as part of the mgmt WAR file. Execute this script in your newly created schema. Confirm that the tables have been created using a SHOW TABLES command. There should be at least these tables
PVTypeInfo - This table stores the archiving parameters for the PVs.
PVAliases - This table stores EPICS alias mappings.
ExternalDataServers - This table stores information about external data servers.
ArchivePVRequests - This table stores archive requests that are still pending.
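A sketch of loading the schema and then verifying it, using the example connection parameters from earlier (where you find archappl_mysql.sql depends on where you expanded the mgmt WAR):
# Load the table definitions and then list the tables; adjust credentials and paths for your site.
mysql --user=archappl --password=archappl --database=archappl < archappl_mysql.sql
mysql --user=archappl --password=archappl --database=archappl -e "SHOW TABLES"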
Copy the MySQL client jar into the Tomcat lib folder. In addition to the log4j.properties file, you should have a mysql-connector-java-XXX.jar as shown here.
[ lib ]$ ls -ltra
...
-rw-r--r-- 1 mshankar cd 505 Nov 13 10:29 log4j.properties
-rw-r--r-- 1 mshankar cd 1007505 Nov 13 10:29 mysql-connector-java-5.1.47-bin.jar
The webapps look up the MySQL connection pool using JNDI under the name jdbc/archappl. You can use the Tomcat management UI or directly add an entry in conf/context.xml like so
<Resource name="jdbc/archappl"
auth="Container"
type="javax.sql.DataSource"
factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
username="archappl"
password="XXXXXXX"
testWhileIdle="true"
testOnBorrow="true"
testOnReturn="false"
validationQuery="SELECT 1"
validationInterval="30000"
timeBetweenEvictionRunsMillis="30000"
maxActive="10"
minIdle="2"
maxWait="10000"
initialSize="2"
removeAbandonedTimeout="60"
removeAbandoned="true"
logAbandoned="true"
minEvictableIdleTimeMillis="30000"
jmxEnabled="true"
driverClassName="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost:3306/archappl"
/>
Of course, please do make changes appropriate to your installation.
The only parameter that is fixed is the name of the pool and this needs to be jdbc/archappl
.
All other parameters are left to your discretion.
If your Tomcat installation does not already provide it, you may need to copy the tomcat-jdbc.jar file into /usr/share/tomcat7/lib.
The number of storage stages and their locations are determined by your policies.py. However, if you are using the default policies.py that ships with the box, or a variant thereof, you'll need to set up three stages of storage. A useful way to do this is to create a folder called /arch and then create soft links in this folder to the actual physical locations. For example,
[ arch ]$ ls -ltra
total 32
lrwxrwxrwx 1 archappl archappl 8 Jun 21 2013 sts -> /dev/shm
lrwxrwxrwx 1 archappl archappl 4 Jun 21 2013 mts -> data
lrwxrwxrwx 1 archappl archappl 40 Feb 12 2014 lts -> /nfs/site/archappl/archappl01
drwxr-xr-x 195 archappl archappl 4096 Oct 15 15:05 data
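A sketch of creating such a layout; the link targets here are simply the examples from the listing above, so substitute your site's actual volumes:
# Create the /arch layout with soft links to the real storage locations (targets are examples)
mkdir -p /arch /arch/data
cd /arch
ln -s /dev/shm sts
ln -s data mts
ln -s /nfs/site/archappl/archappl01 lts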
We then set environment variables in the startup script that point to the locations within /arch. For example,
export ARCHAPPL_SHORT_TERM_FOLDER=/arch/sts/ArchiverStore
export ARCHAPPL_MEDIUM_TERM_FOLDER=/arch/mts/ArchiverStore
export ARCHAPPL_LONG_TERM_FOLDER=/arch/lts/ArchiverStore
The mgmt.war file contains a script deployMultipleTomcats.py in the install folder that will use the information in the appliances.xml file and the identity of this appliance to generate individual Tomcat containers from a single Tomcat install (identified by the environment variable TOMCAT_HOME). To run this script, set the following environment variables
TOMCAT_HOME - This is the Tomcat installation that you prepared in the previous steps.
ARCHAPPL_APPLIANCES - This points to the appliances.xml that you created in the previous steps.
ARCHAPPL_MYIDENTITY - This is the identity of the current appliance, for example appliance0. If this is not set, the system will default to using the machine's hostname as determined by making a call to InetAddress.getLocalHost().getCanonicalHostName(). However, this makes ARCHAPPL_MYIDENTITY a physical entity and not a logical entity; so, if you can, use a logical name for this entry. Note, this must match the identity element of this appliance as it is defined in the appliances.xml.
Then run the deployMultipleTomcats.py script, passing in one argument that identifies the parent folder of the individual Tomcat containers.
[ single_machine_install ]$ export TOMCAT_HOME=/arch/single_machine_install/tomcats/apache-tomcat-9.0.20
[ single_machine_install ]$ export ARCHAPPL_APPLIANCES=/arch/single_machine_install/sample_appliances.xml
[ single_machine_install ]$ export ARCHAPPL_MYIDENTITY=appliance0
[ single_machine_install ]$ ./install_scripts/deployMultipleTomcats.py /arch/single_machine_install/tomcats
Using
tomcat installation at /arch/single_machine_install/tomcats/apache-tomcat-9.0.20
to generate deployments for appliance appliance0
using configuration info from /arch/single_machine_install/sample_appliances.xml
into folder /arch/single_machine_install/tomcats
The start/stop port is the standard Tomcat start/stop port. Changing it to something else random - 16000
The stop/start ports for the new instance will being at 16001
Generating tomcat folder for mgmt in location /arch/single_machine_install/tomcats/mgmt
Commenting connector with protocol AJP/1.3 . If you do need this connector, you should un-comment this.
Generating tomcat folder for engine in location /arch/single_machine_install/tomcats/engine
Commenting connector with protocol AJP/1.3 . If you do need this connector, you should un-comment this.
Generating tomcat folder for etl in location /arch/single_machine_install/tomcats/etl
Commenting connector with protocol AJP/1.3 . If you do need this connector, you should un-comment this.
Generating tomcat folder for retrieval in location /arch/single_machine_install/tomcats/retrieval
Commenting connector with protocol AJP/1.3 . If you do need this connector, you should un-comment this.
[ single_machine_install ]$
This is the last of the steps that are install specific; that is, you'll execute these only on installation of a new appliance.
The remaining steps are those that will be executed on deployment of new release, start/stop etc.
Tomcat deploys web applications from the webapps folder; all we have to do is to copy the (newer) WAR into this folder and Tomcat should expand and deploy the WAR file on startup.
The deployment/upgrade steps are
1. Stop the Tomcat processes (if running).
2. Delete the older WAR files and their expanded folders from the webapps folder (if present).
3. Copy the newer WAR files into the webapps folder and expand them there.
4. Start the Tomcat processes again.
If DEPLOY_DIR is the parent folder of the individual Tomcat containers and WARSRC_DIR is the location where the WAR files are present, then the deploy steps (steps 2 and 3 in the list above) look something like
pushd ${DEPLOY_DIR}/mgmt/webapps && rm -rf mgmt*; cp ${WARSRC_DIR}/mgmt.war .; mkdir mgmt; cd mgmt; jar xf ../mgmt.war; popd;
pushd ${DEPLOY_DIR}/engine/webapps && rm -rf engine*; cp ${WARSRC_DIR}/engine.war .; mkdir engine; cd engine; jar xf ../engine.war; popd;
pushd ${DEPLOY_DIR}/etl/webapps && rm -rf etl*; cp ${WARSRC_DIR}/etl.war .; mkdir etl; cd etl; jar xf ../etl.war; popd;
pushd ${DEPLOY_DIR}/retrieval/webapps && rm -rf retrieval*; cp ${WARSRC_DIR}/retrieval.war .; mkdir retrieval; cd retrieval; jar xf ../retrieval.war; popd;
Starting and stopping the individual Tomcat instances relies on two Tomcat environment variables.
CATALINA_HOME - This is the install folder for Tomcat that is common to all Tomcat instances; in our case this is $TOMCAT_HOME.
CATALINA_BASE - This is the deploy folder for Tomcat that is specific to each Tomcat instance; in our case this is one of
${DEPLOY_DIR}/mgmt
${DEPLOY_DIR}/etl
${DEPLOY_DIR}/engine
${DEPLOY_DIR}/retrieval
For example, the following functions use jsvc to start and stop a Tomcat instance at a given CATALINA_BASE.
function startTomcatAtLocation() {
if [ -z "$1" ]; then echo "startTomcatAtLocation called without any arguments"; exit 1; fi
export CATALINA_HOME=${TOMCAT_HOME}
export CATALINA_BASE=$1
echo "Starting tomcat at location ${CATALINA_BASE}"
pushd ${CATALINA_BASE}/logs
${CATALINA_HOME}/bin/jsvc \
-server \
-cp ${CATALINA_HOME}/bin/bootstrap.jar:${CATALINA_HOME}/bin/tomcat-juli.jar \
${JAVA_OPTS} \
-Dcatalina.base=${CATALINA_BASE} \
-Dcatalina.home=${CATALINA_HOME} \
-cwd ${CATALINA_BASE}/logs \
-outfile ${CATALINA_BASE}/logs/catalina.out \
-errfile ${CATALINA_BASE}/logs/catalina.err \
-pidfile ${CATALINA_BASE}/pid \
org.apache.catalina.startup.Bootstrap start
popd
}
function stopTomcatAtLocation() {
if [ -z "$1" ]; then echo "stopTomcatAtLocation called without any arguments"; exit 1; fi
export CATALINA_HOME=${TOMCAT_HOME}
export CATALINA_BASE=$1
echo "Stopping tomcat at location ${CATALINA_BASE}"
pushd ${CATALINA_BASE}/logs
${CATALINA_HOME}/bin/jsvc \
-server \
-cp ${CATALINA_HOME}/bin/bootstrap.jar:${CATALINA_HOME}/bin/tomcat-juli.jar \
${JAVA_OPTS} \
-Dcatalina.base=${CATALINA_BASE} \
-Dcatalina.home=${CATALINA_HOME} \
-cwd ${CATALINA_BASE}/logs \
-outfile ${CATALINA_BASE}/logs/catalina.out \
-errfile ${CATALINA_BASE}/logs/catalina.err \
-pidfile ${CATALINA_BASE}/pid \
-stop \
org.apache.catalina.startup.Bootstrap
popd
}
and you'd invoke these using something like
stopTomcatAtLocation ${DEPLOY_DIR}/engine
stopTomcatAtLocation ${DEPLOY_DIR}/retrieval
stopTomcatAtLocation ${DEPLOY_DIR}/etl
stopTomcatAtLocation ${DEPLOY_DIR}/mgmt
and
startTomcatAtLocation ${DEPLOY_DIR}/mgmt
startTomcatAtLocation ${DEPLOY_DIR}/engine
startTomcatAtLocation ${DEPLOY_DIR}/etl
startTomcatAtLocation ${DEPLOY_DIR}/retrieval
Remember to set all the appropriate environment variables from the previous steps
JAVA_HOME
TOMCAT_HOME
ARCHAPPL_APPLIANCES
ARCHAPPL_MYIDENTITY
ARCHAPPL_SHORT_TERM_FOLDER or equivalent
ARCHAPPL_MEDIUM_TERM_FOLDER or equivalent
ARCHAPPL_LONG_TERM_FOLDER or equivalent
JAVA_OPTS
- This is the environment variable typically used by Tomcat to pass arguments to the VM. You can pass in appropriate arguments like so
export JAVA_OPTS="-XX:+UseG1GC -Xmx4G -Xms4G -ea"
LD_LIBRARY_PATH
- If you are using JCA, please make sure your LD_LIBRARY_PATH includes the paths to the JCA and EPICS base .so's.
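For example (the paths below are purely illustrative; substitute the locations of your site's EPICS base and JCA shared libraries):
# Illustrative paths only - point these at your actual EPICS base and JCA library folders
export LD_LIBRARY_PATH=/path/to/epics/base/lib/linux-x86_64:/path/to/jca/lib/linux-x86_64:${LD_LIBRARY_PATH}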