Manually Failing Over to the Passive Node
To force a passive node to assume the active role, disable ACM on the active node and stop all Faspex services on that node.
-
Determine the active node with the acmctl command.
# /opt/aspera/acm/bin/acmctl -i
Checking current ACM status...
Aspera Cluster Manager status
-----------------------------
Local hostname: faspex-ha2
Active node: faspex-ha2 (me)
Status of this node: active
...
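If this check needs to be scripted, the fields can be read out of the status text. The following is a minimal sketch, assuming the output arrives with one "Field: value" pair per line; `acm_field` and `is_local_active` are hypothetical helper names and are not part of ACM.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with ACM).

# Print the value of one "Field: value" line from acmctl -i output on stdin.
# $1 = field label, for example "Active node".
acm_field() {
    awk -F': *' -v key="$1" '
        { sub(/^[ \t]+/, "") }          # tolerate leading indentation
        $1 == key { print $2; exit }
    '
}

# Succeed only when the status text on stdin reports this node as active.
is_local_active() {
    [ "$(acm_field 'Status of this node')" = "active" ]
}
```

For example, `/opt/aspera/acm/bin/acmctl -i | acm_field 'Active node'` would print the name of the active node as reported by ACM, including the "(me)" suffix when the command runs on that node.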
-
Disable ACM locally.
# /opt/aspera/acm/bin/acmctl -d
ACM is disabled locally
-
Check to confirm that no ACM instances are running.
# ps aux | grep acm
root 1248 0.0 0.0 103252 824 pts/0 S+ 17:18 0:00 grep acm
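In the sample output, the only match is the grep command itself, which means no ACM instance is running. A scripted version of the same check has to discard that self-match. The sketch below assumes `ps aux`-style text on stdin; the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for checking ps output.

# Print lines that mention "acm", ignoring the grep command
# that the check itself spawns.
filter_acm_processes() {
    grep 'acm' | grep -v 'grep acm'
}

# Succeed only when no ACM-related process line is left.
no_acm_running() {
    [ -z "$(filter_acm_processes)" ]
}
```

A possible use is `ps aux | no_acm_running && echo "no ACM instances running"`.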
-
Stop the Faspex services.
# asctl all:stop
Faspex Mongrels: Stop... done
Faspex Background: Stop... done
Faspex DS Background: Stop... done
Faspex DB Background: Stop... done
Faspex NP Background: Stop... done
MySQL: Stop... done
Apache: Stop... done
-
Run the acmctl -i command to verify that all Faspex services have been stopped.
# /opt/aspera/acm/bin/acmctl -i
Checking current ACM status...
...
Faspex active/active services status
------------------------------------
Apache: stopped
Faspex Mongrels: stopped
Faspex active/passive services status
-------------------------------------
MySQL: stopped
Faspex Background: stopped
Faspex NP Background: stopped
Faspex DS Background: stopped
Faspex DB Background: stopped
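Rather than reading the list by eye, the status text can be scanned for any service still reported as running. This is a sketch only, assuming one "Service: state" pair per line as in the output above; `running_services` and `all_services_stopped` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ACM or asctl).

# Print every status line on stdin whose state is "running".
running_services() {
    grep -E ':[[:space:]]*running[[:space:]]*$'
}

# Succeed only when no service is reported as running.
all_services_stopped() {
    [ -z "$(running_services)" ]
}
```

For example, `/opt/aspera/acm/bin/acmctl -i | all_services_stopped || echo "some services are still running"`.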
-
Check the ACM logs to observe the other node taking over (this can take several
minutes):
2013-07-11 18:16:01 (-0700) acm faspex-ha2 (28736): ACM is disabled locally on this host: aborting
2013-07-11 18:16:01 (-0700) acm faspex-ha1 (24404): ACM START (1.97)
2013-07-11 18:16:06 (-0700) acm faspex-ha1 (24404): Lock acquired
2013-07-11 18:16:06 (-0700) acm faspex-ha1 (24404): Checking if this node is active or passive
2013-07-11 18:16:06 (-0700) acm faspex-ha1 (24404): Status file found
2013-07-11 18:16:06 (-0700) acm faspex-ha1 (24404): From status file: active host is faspex-ha2
2013-07-11 18:16:06 (-0700) acm faspex-ha1 (24404): This node is passive
2013-07-11 18:16:06 (-0700) acm faspex-ha1 (24404): Checking if the status file is current
2013-07-11 18:16:06 (-0700) acm faspex-ha1 (24404): Status file acm.status is too old (diff: 123)
2013-07-11 18:16:07 (-0700) acm faspex-ha1 (24404): Status file acm.status is too old (diff: 124)
2013-07-11 18:16:08 (-0700) acm faspex-ha1 (24404): Status file acm.status is too old (diff: 125)
2013-07-11 18:16:08 (-0700) acm faspex-ha1 (24404): Failover scenario
2013-07-11 18:16:08 (-0700) acm faspex-ha1 (24404): Stopping MySQL on this node (if running)
2013-07-11 18:16:08 (-0700) acm faspex-ha1 (24404): Checking if MySQL is still active on the active node
2013-07-11 18:16:08 (-0700) acm faspex-ha1 (24404): Trying to establish a connection to MySQL on host 10.0.115.102
2013-07-11 18:16:08 (-0700) acm faspex-ha1 (24404): The connection to mysql failed, testing TCP port 4406 on host 10.0.115.102
2013-07-11 18:16:08 (-0700) acm faspex-ha1 (24404): Connection to 10.0.115.102 on port TCP/4406 failed, MySQL is likely to be down
2013-07-11 18:16:08 (-0700) acm faspex-ha1 (24404): Becoming the active node
2013-07-11 18:16:08 (-0700) acm faspex-ha1 (24404): ACM RESET
2013-07-11 18:16:08 (-0700) acm faspex-ha1 (24404): Deleting status file...
2013-07-11 18:16:08 (-0700) acm faspex-ha1 (24404): Status file deleted
2013-07-11 18:16:08 (-0700) acm faspex-ha1 (24404): Stopping Faspex services
2013-07-11 18:16:12 (-0700) acm faspex-ha1 (24404): Updating file /opt/aspera/acm/config/database.yml
2013-07-11 18:16:12 (-0700) acm faspex-ha1 (24404): Active processing BEGIN
2013-07-11 18:16:24 (-0700) acm faspex-ha1 (24404): Active processing END (12 seconds)
2013-07-11 18:16:29 (-0700) acm faspex-ha1 (24404): Updating status files (hostname=faspex-ha1)
2013-07-11 18:16:29 (-0700) acm faspex-ha1 (24404): ACM STOP
-
Check that the previously active node is now passive.
# /opt/aspera/acm/bin/acmctl -i
Checking current ACM status...
Aspera Cluster Manager status
-----------------------------
Local hostname: faspex-ha1
Active node: faspex-ha2
Status of this node: passive
Status file: current
Disabled globally: no
Disabled on this node: yes
...
And check that the other node is now the active one:
# /opt/aspera/acm/bin/acmctl -i
Checking current ACM status...
Aspera Cluster Manager status
-----------------------------
Local hostname: faspex-ha1
Active node: faspex-ha1 (me)
Status of this node: active
Status file: current
Disabled globally: no
Disabled on this node: no
Database host: 10.0.143.6
IBM Aspera Faspex active/passive services status
--------------------------------------
Apache: running
MySQL: running
IBM Aspera Faspex: running
...
Note: If the node does not become active, copy keystore.jks (/opt/aspera/faspex/lib/daemons/np/etc/keystore.jks) from one node to the other so that both copies are identical.
-
Re-enable ACM on the node that recently became passive to let it start the active/active Faspex services.
# /opt/aspera/acm/bin/acmctl -e
ACM is enabled locally
-
After several minutes, you can verify that the active/active services have started on the passive node:
# /opt/aspera/acm/bin/acmctl -i
Checking current ACM status...
Aspera Cluster Manager status
-----------------------------
Local hostname: faspex-ha2
Active node: faspex-ha1
Status of this node: passive
Status file: current
Disabled globally: no
Disabled on this node: no
Database configuration file
---------------------------
Database host: 10.0.115.101
Faspex active/active services status
------------------------------------
Apache: running
Faspex Mongrels: running
...
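Because the services can take several minutes to come up, it may be more convenient to poll than to rerun the command by hand. The following is a rough sketch of a generic retry loop; `wait_until` is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary illustrative values, not recommendations from the product documentation.

```shell
#!/bin/sh
# Hypothetical helper: run a check command repeatedly until it succeeds.
# $1 = maximum number of attempts
# $2 = seconds to wait between attempts
# remaining arguments = the command to run
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Example use (assumes the "Apache: running" line shown in the output above):
#   apache_running() { /opt/aspera/acm/bin/acmctl -i | grep -q 'Apache: *running'; }
#   wait_until 20 30 apache_running && echo "Apache is running"
```

The loop returns as soon as the check succeeds, so a short delay only affects how quickly the change is noticed, not how long the services take to start.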