- Mnesia schema
  - db_nodes - nodes of the schema: either disc nodes or nodes that tables are replicated to.
  - extra_db_nodes - configuration telling mnesia which nodes to connect to on startup.
  - running_db_nodes - nodes which mnesia is currently connected to. [1]
  - table nodes - nodes on which tables are replicated. Each table keeps a list of "all nodes" and a list of "active" nodes: "all nodes" is a subset of db_nodes, "active nodes" is a subset of running_db_nodes. In a way, db_nodes and running_db_nodes are the "all nodes" and "active nodes" of the schema table.
- nodes_running_at_shutdown - a list of nodes which are currently running. It is similar to running_db_nodes, but is maintained by the node_monitor: it's modified when a node starts, joins/leaves the cluster, or when the rabbit process stops on a node.
- cluster_nodes.config - two lists, one containing all clustered nodes, the other containing the disc nodes. Modified when a node joins/leaves the cluster.
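For illustration: as far as I can tell both of these are Erlang term files in the node's mnesia directory. Assuming a three-node cluster rabbit@a/rabbit@b/rabbit@c with rabbit@a as the only disc node (the node names and exact layout here are my assumptions, not taken from a real broker), cluster_nodes.config would contain something roughly like:

```erlang
%% Hypothetical contents of cluster_nodes.config:
%% {AllClusteredNodes, DiscNodes}.
{['rabbit@a','rabbit@b','rabbit@c'],
 ['rabbit@a']}.
```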
- mnesia_monitor - a process linked to the other monitors on all db_nodes.
- rabbit_node_monitor - monitors nodes (net_kernel:monitor_nodes/2) and rabbit processes on remote nodes.
- All the queues/channels/gm can monitor state across nodes.
- nodedown - a message from the Erlang internal node monitor. Handled by mnesia_monitor to keep track of down nodes (it does not directly remove them from running_db_nodes), and by rabbit_node_monitor to track how many nodes are running for pause_minority and pause_if_all_down. Triggers check_partial_partition.
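The pause_minority decision above is essentially a strict-majority check over the known cluster membership. A minimal sketch (a toy model with made-up names, not the actual rabbit_node_monitor code):

```python
def in_majority(alive_nodes, cluster_nodes):
    """True if the nodes we can see (including ourselves) form a strict
    majority of the cluster; a pause_minority node that loses the
    majority pauses itself rather than risk a split brain."""
    return 2 * len(alive_nodes) > len(cluster_nodes)

cluster = ["a", "b", "c"]
print(in_majority(["a", "b"], cluster))  # True  - sees 2 of 3, keeps running
print(in_majority(["a"], cluster))       # False - sees 1 of 3, pauses
```

pause_if_all_down is similar, but a node pauses only when every node in a configured list is unreachable.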
- nodeup - counterpart of nodedown. Handled by mnesia_monitor to check the cluster status; this handler may send an inconsistent_database event. rabbit_node_monitor logs the event and does nothing else.
- Link EXIT signal from mnesia_monitor - updates running_db_nodes and the active nodes for all tables.
- notify_node_up - notifies all nodes from running_db_nodes (except self) by sending node_up to them.
- DOWN from the rabbit process - updates the cluster status (removes the stopped node), cleans up transient queues, listeners and alarms, and updates partition tracking (handle_dead_rabbit).
- node_up (not to be confused with nodeup) - sent by the node monitor on a started remote node to notify the cluster (in a boot step). Updates the cluster status, updates alarms, and removes the started node from the recoverable slaves of mirrored queues (handle_live_rabbit).
- joined_cluster/left_cluster - update the cluster status.
- {mnesia_system_event, {inconsistent_database, running_partitioned_network, Node}} - this message is treated as a reconnect after a partial partition: it updates alarms, removes the started node from the recoverable slaves of mirrored queues (handle_live_rabbit), and records the partitioned state. I'm not sure this is the right message to report a reconnect: it may be emitted multiple times and does not necessarily mean that a node has rejoined.
- check_partial_partition - this message is sent by a node handling a nodedown message to all running nodes except the sender and the node which is "down". The message contains the GUIDs of these two nodes. A node which receives this message will check whether the "down" node is actually down, both by checking its status (in the node_monitor data) and by sending an RPC request calling rabbit:is_running/0. If the "down" node is running, the "checker" node responds to the "reporter" node with a partial_partition message containing the "checker" node and the "down" node. The RPC request is sent in a one-off process. This feels dangerous intuitively and is not that easy to reason about.
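The checker-side logic described above can be sketched as follows (a toy model; the function and parameter names are invented, and the real code performs the RPC from a one-off process):

```python
def handle_check_partial_partition(reporter, down_node, checker,
                                   node_status, rpc_is_running):
    """Toy model of the "checker" side: the reporter claims down_node
    is down; we only contradict it if our own records say the node is
    up AND an RPC (rabbit:is_running/0 in the real code) confirms it
    is actually running."""
    locally_up = node_status.get(down_node) == "up"
    if locally_up and rpc_is_running(down_node):
        # Tell the reporter it is only partially partitioned.
        return ("partial_partition", checker, down_node)
    return None  # we agree the node is down; do nothing

status = {"a": "up", "b": "up", "c": "up"}
# "a" reported "b" down, but we ("c") can still reach "b":
msg = handle_check_partial_partition("a", "b", "c", status, lambda n: True)
print(msg)  # ('partial_partition', 'c', 'b')
```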
- partial_partition - this message tells a node that there is a partial partition. It contains the "checker" node and the "not_really_down" node. On this message the node monitor will force a disconnect from the "checker" node and send it a partial_partition_disconnect message. The node may instead pause if it's in pause_minority or pause_if_all_down mode.
- partial_partition_disconnect - this message tells a node to disconnect from another node. The assumption here is that a partial partition should be promoted to a full partition, disconnecting from the "checker" node and leaving the "checker" and the "down" nodes in a partition together. But because DOWN messages are symmetric and there is no additional coordination, this process may leave the entire cluster disconnected or keep disconnecting nodes for some time.
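The failure mode above can be shown with a tiny simulation (a toy model under the stated assumptions, not RabbitMQ code): the a-b link flaps, both sides report each other as down, and both end up force-disconnecting from the healthy checker c, so the whole cluster fragments even though only one link was bad.

```python
def simulate_flapping_link(nodes, flapped):
    """Toy model: `flapped` is a pair of nodes whose link briefly
    dropped. Each side reports the other as down; every other node can
    still reach the "down" node, so it replies partial_partition, and
    the reporter force-disconnects from that checker. Returns the set
    of severed links."""
    severed = set()
    x, y = flapped
    for reporter, down in ((x, y), (y, x)):  # DOWN is symmetric
        for checker in nodes:
            if checker in (reporter, down):
                continue
            # checker can still reach `down` -> partial_partition reply,
            # so the reporter disconnects from the checker.
            severed.add(frozenset((reporter, checker)))
    return severed

links = simulate_flapping_link(["a", "b", "c"], ("a", "b"))
# Both a-c and b-c get severed: c is now cut off from everyone.
print(sorted(sorted(l) for l in links))  # [['a', 'c'], ['b', 'c']]
```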
- a note on disconnect: when disconnecting, the nodes will disable reconnection for 1 second. While some nodes are down, the node monitor will ping the entire cluster every second. It will also send a cast keepalive message to all running nodes every 10 seconds.
[1] running_db_nodes:
This value is maintained by internal mnesia monitors.
A node is removed from this list when the mnesia_monitor process detects that another mnesia_monitor is "down".
When the node is rediscovered it will not be automatically re-added unless the schema is merged.
The merge can be triggered explicitly with mnesia:change_config(extra_db_nodes, [Node]), or happens when the node restarts.
You may need to set the same extra_db_nodes configuration which is already there to reconnect the cluster.
When nodes are discovered, mnesia sends a message like this:
{mnesia_system_event, {inconsistent_database, running_partitioned_network, Node}}
to all processes subscribed to such events.
This may happen every time mnesia checks schema consistency, both when the node
is discovered to be up (e.g. a message is sent between nodes) and when connecting
with mnesia:change_config(extra_db_nodes, ...).