Ongoing cluster membership
Once the cluster is up and running, a system remains an active member of the cluster as long as peer systems receive a heartbeat signal from that system over the cluster interconnect. A change in cluster membership is determined as follows:
When LLT on a system receives no heartbeat messages from a peer system on any of the configured LLT interfaces for a predefined time (peerinact), LLT informs GAB of the heartbeat loss from that specific peer.
This predefined time is 16 seconds by default, but can be configured.
You can change this value with the set-timer peerinact directive in the llttab file. See the llttab manual page.
If you enable faster link failure detection, LLT detects link failures immediately.
When LLT informs GAB of a heartbeat loss, the systems that remain in the cluster coordinate to agree on which systems are still actively participating in the cluster and which are not. This coordination happens during a time period known as the GAB Stable Timeout (5 seconds).
VCS has specific error handling that takes effect in the case where the systems do not agree.
GAB marks the system as DOWN, excludes the system from the cluster membership, and delivers the membership change to the fencing module.
The fencing module performs membership arbitration to ensure that no split-brain situation exists and that only one functional, cohesive cluster continues to run.
The fencing module is turned on by default.
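The heartbeat-loss rule described above can be illustrated with a small model. The following Python sketch is not VCS code. The HeartbeatTracker class, its method names, and the system and link names are hypothetical, chosen only for illustration. It assumes that a peer is reported as lost only when no heartbeat has arrived on any configured link for the peerinact period, and it uses the 16-second default stated above.

```python
from dataclasses import dataclass, field

PEERINACT_SECONDS = 16.0  # default from the text; configurable in a real cluster


@dataclass
class HeartbeatTracker:
    """Hypothetical model of per-peer heartbeat bookkeeping (not an LLT API)."""

    links: list                      # names of the configured links
    peerinact: float = PEERINACT_SECONDS
    last_seen: dict = field(default_factory=dict)  # (peer, link) -> timestamp

    def record_heartbeat(self, peer, link, now):
        """Remember when a heartbeat from `peer` last arrived on `link`."""
        if link not in self.links:
            raise ValueError(f"unknown link: {link}")
        self.last_seen[(peer, link)] = now

    def lost_peers(self, peers, now):
        """Return peers silent on every link for at least `peerinact` seconds."""
        lost = []
        for peer in peers:
            # The most recent heartbeat across all links decides the outcome:
            # a single healthy link is enough to keep the peer in membership.
            latest = max(
                (self.last_seen.get((peer, link), float("-inf"))
                 for link in self.links),
                default=float("-inf"),
            )
            if now - latest >= self.peerinact:
                lost.append(peer)
        return lost


tracker = HeartbeatTracker(links=["link1", "link2"])
tracker.record_heartbeat("sysB", "link1", now=0.0)
tracker.record_heartbeat("sysC", "link1", now=0.0)
tracker.record_heartbeat("sysB", "link2", now=10.0)  # sysB still alive on link2

print(tracker.lost_peers(["sysB", "sysC"], now=17.0))
```

In this model, sysC is reported at the 17-second mark because its last heartbeat on any link is more than 16 seconds old, while sysB stays in membership because link2 still carried a heartbeat 7 seconds earlier. The later steps (agreement among the remaining systems, the GAB Stable Timeout, and fencing arbitration) are outside the scope of this sketch.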
Review the details on actions that occur if the fencing module has been deactivated: