cman_tool man page on YellowDog

CMAN_TOOL(8)							  CMAN_TOOL(8)

NAME
       cman_tool - Cluster Management Tool

SYNOPSIS
       cman_tool  join	|  leave  | kill | expected | votes | version | wait |
       status | nodes | services | debug [options]

DESCRIPTION
       cman_tool is a program that manages the	cluster	 management  subsystem
       CMAN.  cman_tool	 can  be used to join the node to a cluster, leave the
       cluster, kill another cluster node or  change  the  value  of  expected
       votes of a cluster.
       Be  careful that you understand the consequences of the commands issued
       via cman_tool as they can affect all nodes in your cluster. Most of the
       time  the cman_tool will only be invoked from your startup and shutdown
       scripts.

SUBCOMMANDS
       join   This is the main use of cman_tool. It instructs the cluster man‐
	      ager  to	attempt to join an existing cluster or (if no existing
	      cluster exists) then to form a new one on its own.
	      If no options are given to this command then it  will  take  the
	      cluster  configuration information from CCS. However, it is pos‐
	      sible to provide all the information on the command-line	or  to
	      override CCS values by using the command line.

       leave  Tells CMAN to leave the cluster. You cannot do this if there are
	      subsystems (e.g. DLM, GFS) active. You should unmount all GFS
	      filesystems and shut down CLVM, fenced and anything else using
	      the cluster manager before using cman_tool leave. Look at
	      'cman_tool status' and group_tool to see how many (and which)
	      subsystems are active.
	      When a node leaves the cluster, the remaining nodes  recalculate
	      quorum  and this may block cluster activity if the required num‐
	      ber of votes is not present.  If this node is to be down for  an
	      extended	period	of  time and you need to keep the cluster run‐
	      ning, add the remove option, and the remaining nodes will recal‐
	      culate quorum such that activity can continue.

       kill   Tells  CMAN to kill another node in the cluster. This will cause
	      the local node to send a "KILL" message to that node and it will
	      shut down.  Recovery will occur for the killed node as if it had
	      failed. This is a sort of remote version of "leave force" so
	      only use it if you really know what you are doing.

       expected
	      Tells  CMAN  a  new  value of expected votes and instructs it to
	      recalculate quorum based on this value.
	      Use this option if your cluster has lost	quorum	due  to	 nodes
	      failing and you need to get it running again in a hurry.

       version
	      Used  alone  this will report the major, minor, patch and config
	      versions used by CMAN (also displayed in 'cman_tool status'). It
	      can  also	 be  used  with	 -r to set a new config version on all
	      cluster members.

       wait   Waits until the node  is	a  member  of  the  cluster  and  then
	      returns.

       status Displays the local view of the cluster status.

       nodes  Displays the local view of the cluster nodes.

       services
	      Displays	the  local  view of subsystems using cman (deprecated,
	      group_tool should be used instead).

       debug  Sets the debug level of the running cman daemon. Debug output
	      will be sent to syslog level LOG_DEBUG. The -d switch specifies
	      the new logging level. This is the same bitmask used for
	      cman_tool join -d.

LEAVE OPTIONS
       -w     Normally, "cman_tool leave" will fail if the cluster is in
	      transition (i.e. another node is joining or leaving the
	      cluster). By adding the -w flag, cman_tool will wait and retry
	      the leave operation repeatedly until it succeeds or a more
	      serious error occurs.

       -t <seconds>
	      If  -w  is also specified then -t dictates the maximum amount of
	      time cman_tool is prepared to wait. If the operation  times  out
	      then a status of 2 is returned.

       force  Shuts  down the cluster manager without first telling any of the
	      subsystems to close down. Use this option with extreme  care  as
	      it could easily cause data loss.

       remove Tells  the  rest	of the cluster to recalculate quorum such that
	      activity can continue without this node.

EXPECTED OPTIONS
       -e <expected-votes>
	      The new value of expected votes to use.  This  will  usually  be
	      enough  to  bring	 the  cluster  back to life. Values that would
	      cause incorrect quorum will be rejected.
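
       The arithmetic behind expected votes can be sketched in Python. This
       is an illustration only, not CMAN's code: the function names are made
       up for this example, and the threshold formula (integer half of the
       expected votes, plus one) is an assumption derived from the "more than
       half" rule stated under JOIN OPTIONS.

```python
def effective_expected_votes(reported_values):
    """When nodes report different expected votes, the highest is used."""
    return max(reported_values)


def quorum_threshold(expected_votes):
    """Smallest vote count that is strictly more than half of expected_votes.

    Assumption: modelled as integer division by two, plus one.
    """
    return expected_votes // 2 + 1


def is_quorate(present_votes, expected_votes):
    """True when the votes currently present reach the quorum threshold."""
    return present_votes >= quorum_threshold(expected_votes)


# Hypothetical five-node cluster, one vote per node, two nodes lost.
expected = effective_expected_votes([5, 5, 4])
print(expected, quorum_threshold(expected), is_quorate(3, expected))
```

       Under this model two expected votes give a threshold of 2, so an
       ordinary two-node cluster loses quorum as soon as either node fails.
       That is the situation the -2 join option is meant to address.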

KILL OPTIONS
       -n <nodename>
	      The node name of the node to  be	killed.	 This  should  be  the
	      unqualified node name as it appears in 'cman_tool nodes'.

VERSION OPTIONS
       -r <config_version>
	      The new config version. You don't need to use this when adding a
	      new node: the new cman node will tell the rest of the cluster to
	      get the latest version of the config file from CCS
	      automatically.

WAIT OPTIONS
       -q     Waits until the cluster is quorate before returning.

       -t <seconds>
	      Dictates the maximum amount of time cman_tool is prepared to
	      wait. If the operation times out then a status of 2 is
	      returned.

JOIN OPTIONS
       -c <clustername>
	      Provides a text name for the cluster. You can have several clus‐
	      ters on one LAN and they are distinguished by  this  name.  Note
	      that the name is hashed to provide a unique number which is what
	      actually distinguishes the cluster, so it is possible  that  two
	      different names can clash. If this happens, the node will not be
	      allowed into the existing cluster and you will have to pick
	      another name or use a different port number for cluster
	      communication.

       -p <port>
	      UDP port number used for cluster communication. This defaults to
	      5405.

       -v <votes>
	      Number of votes this node has in the cluster. Defaults to 1.

       -e <expected votes>
	      Number  of  expected  votes  for the whole cluster. If different
	      nodes provide different values then the  highest	is  used.  The
	      cluster will only operate when quorum is reached, that is, when
	      more than half of the expected votes are present in the cluster.
	      There  is	 no  default for this value. If you are using CCS then
	      ccs_tool will use the total number of votes for all nodes in the
	      configuration file.

       -2     Sets  the cluster up for a special "two node only" mode. Because
	      of the quorum requirements mentioned above, a  two-node  cluster
	      cannot  be  valid.   This	 option tells the cluster manager that
	      there will only ever be two nodes in the cluster and  relies  on
	      fencing  to  ensure  cluster integrity.  If you specify this you
	      cannot add more nodes without taking down the  existing  cluster
	      and  reconfiguring  it.  Expected votes should be set to 1 for a
	      two-node cluster.

       -n <nodename>
	      Overrides the node name. By default the unqualified hostname  is
	      used.  This  option  is  also used to specify which interface is
	      used for cluster communication.

       -N <nodeid>
	      Overrides the  node  ID  for  this  node.	 Normally,  nodes  are
	      assigned	a  node id in CCS. If you specify an incorrect node ID
	      here, the node might not be allowed to join the cluster. Setting
	      node  IDs in CCS is a far better way to do this.	 Note that the
	      node's application to join the cluster may be  rejected  if  you
	      try  to  set the nodeid to one that has already been used, or if
	      the node was previously a member of the cluster but with a  dif‐
	      ferent nodeid.

       -o <nodename>
	      Override	the name this node will have in the cluster. This will
	      normally be the hostname or the  first  name  specified  by  -n.
	      Note  how	 this  differs from -n: -n tells cman_tool how to find
	      the host address and/or the entry in CCS. -o simply changes  the
	      name the node will have in the cluster and has no bearing on the
	      actual name of the machine. Use this option with extreme
	      caution.

       -m <multicast-address>
	      Specifies	 a multicast address to use for cluster communication.
	      This is required for IPv6 operation. You should also specify  an
	      ethernet	interface  to bind to this multicast address using the
	      -i option.

       -w     Join and wait until the node is a cluster member.

       -q     Join and wait until the cluster is quorate.  If the cluster join
	      fails and -w (or -q) is specified, then it will be retried. Note
	      that cman_tool cannot tell whether the cluster join was rejected
	      by  another node for a good reason or that it timed out for some
	      benign reason; so it is strongly recommended that a  timeout  is
	      also given with the wait options to join. If you don't want join
	      to retry on failure but do want to wait, use the cman_tool  join
	      command without -w followed by cman_tool wait.

       -k <keyfile>
	      All  traffic  sent  out by cman/openais is encrypted. By default
	      the security key used is simply the cluster name.	 If  you  need
	      more  security  you can specify a key file that contains the key
	      used to encrypt cluster communications.  Of course, the contents
	      of the key file must be the same on all nodes in the cluster. It
	      is up to you to securely copy the file to the nodes.

       -t <seconds>
	      If -w or -q is also  specified  then  -t	dictates  the  maximum
	      amount  of  time cman_tool is prepared to wait. If the operation
	      times out then a status of 2 is returned. Note that cman_tool
	      giving up does not mean that cman itself has stopped trying to
	      join a cluster.
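
       The recommended combination of a wait option with a timeout can be
       scripted as in the Python sketch below. The helper names are
       hypothetical. The options -w, -q and -t and the timeout status 2 come
       from this page; treating status 0 as success is an assumption based on
       the usual Unix convention, and any other status is reported only as a
       generic failure.

```python
import subprocess


def join_command(wait_for_quorum=False, timeout=None):
    """Build the argument list for a waiting join.

    -w waits for cluster membership, -q waits for quorum, and -t caps
    the wait in seconds.
    """
    argv = ["cman_tool", "join", "-q" if wait_for_quorum else "-w"]
    if timeout is not None:
        argv += ["-t", str(int(timeout))]
    return argv


def explain_status(returncode):
    """Map an exit status to a short description."""
    if returncode == 0:
        return "joined"      # assumption: 0 means success
    if returncode == 2:
        return "timed out"   # documented: status 2 on timeout
    return "failed"


def run_join(wait_for_quorum=False, timeout=60):
    """Run the join and describe the result (needs cman_tool installed)."""
    result = subprocess.run(join_command(wait_for_quorum, timeout))
    return explain_status(result.returncode)


print(" ".join(join_command(timeout=60)))
```

       A "timed out" result only means that cman_tool stopped waiting; as
       noted above, cman itself may still be trying to join.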

NODES OPTIONS
       -f     Shows the date/time the node was last fenced (if it has been
	      fenced), and also the fence system that was used.

       -a     Shows the IP address(es) the nodes are communicating on.

       -n <nodename>
	      Shows  node  information for a specific node. This should be the
	      unqualified node name as it appears in 'cman_tool nodes'.

       -F <format>
	      Specify the format of the output. The format string may  contain
	      one or more format options, each separated by a comma. Valid
	      format options include: id, name, type, and addr.

DEBUG OPTIONS
       -d <value>
	      The value is a bitmask of
	      2 Barriers
	      4 Membership messages
	      8 Daemon operation, including command-line interaction
	      16 Interaction with OpenAIS

NOTES
       The nodes subcommand shows a list of nodes known to cman. The state is
       one of the following:
       M    The node is a member of the cluster
       X    The node is not a member of the cluster
       d    The node is known to the cluster but disallowed access to it.
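
       A script that consumes this listing might decode the state column as
       in the Python sketch below. It assumes the default six-field row
       layout shown in the example later in this page (node ID, state,
       incarnation, join date, join time, name); rows with any other layout
       are skipped rather than guessed at. The function and key names are
       hypothetical.

```python
STATE_MEANING = {
    "M": "member of the cluster",
    "X": "not a member of the cluster",
    "d": "known to the cluster but disallowed",
}


def parse_nodes(text):
    """Parse 'cman_tool nodes' style output into a list of dicts."""
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, blank lines and rows with an unexpected layout.
        if len(fields) != 6 or not fields[0].isdigit():
            continue
        node_id, state, inc, day, clock, name = fields
        nodes.append({
            "id": int(node_id),
            "state": state,
            "meaning": STATE_MEANING.get(state, "unknown state"),
            "inc": int(inc),
            "joined": day + " " + clock,
            "name": name,
        })
    return nodes


sample = """\
Node  Sts   Inc   Joined               Name
   1   M   2372   2007-11-05 02:58:55  node-01.example.com
   2   d   2376   2007-11-05 02:58:56  node-02.example.com
"""
for node in parse_nodes(sample):
    print(node["name"], "-", node["meaning"])
```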

DISALLOWED NODES
       Occasionally (but very infrequently I hope) you may see nodes marked as
       "Disallowed" in cman_tool status or "d" in cman_tool nodes.  This is  a
       bit of a nasty hack to get around a mismatch between what the upper
       layers expect of the cluster manager and OpenAIS.

       If a node experiences a momentary lack of connectivity, but one that is
       long enough to trigger the token timeouts, then it will be removed from
       the cluster. When connectivity is restored OpenAIS will happily let  it
       rejoin the cluster with no fuss. Sadly the upper layers don't like this
       very much. They may (indeed probably will) have changed their
       internal	 state while the other node was away and there is no straight‐
       forward way to bring the rejoined node up-to-date with that state. When
       this  happens  the  node is marked "Disallowed" and is not permitted to
       take part in cman operations.

       If the remainder of the cluster is quorate then the node will be sent a
       kill  message and it will be forced to leave the cluster that way. Note
       that fencing should kick in to remove the node permanently anyway,  but
       it may take longer than the network outage for this to complete.

       If  the	remainder  of the cluster is inquorate then we have a problem.
       The likelihood is that we will have two (or more) partitioned  clusters
       and  we cannot decide which is the "right" one. In this case we need to
       defer to the system administrator to kill an appropriate	 selection  of
       nodes to restore the cluster to sensible operation.

       The  latter  scenario  should be very rare and may indicate a bug some‐
       where in the code. If the local network is very flaky or busy it may be
       necessary to increase some of the protocol timeouts for OpenAIS. We are
       trying to think of better solutions to this problem.

       Recovering from this state can, unfortunately, be  complicated.	Fortu‐
       nately,	in the majority of cases, fencing will do the job for you, and
       the disallowed state will only be temporary. If it persists, the
       recommended approach is to run cman_tool nodes on all systems in the
       cluster and determine the largest common subset of nodes that are valid
       members to each other. Then reboot the others and let them rejoin
       correctly. In the case of a single-node disconnection this should be
       straightforward; with a large cluster that has experienced a network
       partition it could get very complicated!

       Example:

       In this example we have a five node cluster that has experienced a net‐
       work partition. Here is the output of cman_tool nodes from all systems:
       Node  Sts   Inc	 Joined		      Name
	  1   M	  2372	 2007-11-05 02:58:55  node-01.example.com
	  2   d	  2376	 2007-11-05 02:58:56  node-02.example.com
	  3   d	  2376	 2007-11-05 02:58:56  node-03.example.com
	  4   M	  2376	 2007-11-05 02:58:56  node-04.example.com
	  5   M	  2376	 2007-11-05 02:58:56  node-05.example.com

       Node  Sts   Inc	 Joined		      Name
	  1   d	  2372	 2007-11-05 02:58:55  node-01.example.com
	  2   M	  2376	 2007-11-05 02:58:56  node-02.example.com
	  3   M	  2376	 2007-11-05 02:58:56  node-03.example.com
	  4   d	  2376	 2007-11-05 02:58:56  node-04.example.com
	  5   d	  2376	 2007-11-05 02:58:56  node-05.example.com

       Node  Sts   Inc	 Joined		      Name
	  1   d	  2372	 2007-11-05 02:58:55  node-01.example.com
	  2   M	  2376	 2007-11-05 02:58:56  node-02.example.com
	  3   M	  2376	 2007-11-05 02:58:56  node-03.example.com
	  4   d	  2376	 2007-11-05 02:58:56  node-04.example.com
	  5   d	  2376	 2007-11-05 02:58:56  node-05.example.com

       Node  Sts   Inc	 Joined		      Name
	  1   M	  2372	 2007-11-05 02:58:55  node-01.example.com
	  2   d	  2376	 2007-11-05 02:58:56  node-02.example.com
	  3   d	  2376	 2007-11-05 02:58:56  node-03.example.com
	  4   M	  2376	 2007-11-05 02:58:56  node-04.example.com
	  5   M	  2376	 2007-11-05 02:58:56  node-05.example.com

       Node  Sts   Inc	 Joined		      Name
	  1   M	  2372	 2007-11-05 02:58:55  node-01.example.com
	  2   d	  2376	 2007-11-05 02:58:56  node-02.example.com
	  3   d	  2376	 2007-11-05 02:58:56  node-03.example.com
	  4   M	  2376	 2007-11-05 02:58:56  node-04.example.com
	  5   M	  2376	 2007-11-05 02:58:56  node-05.example.com

       In this scenario we should kill nodes node-02 and node-03. Of
       course, the 3 node cluster of node-01, node-04 & node-05 should remain
       quorate and be able to fence the two rejoined nodes anyway, but it is
       possible that the cluster has a qdisk setup that precludes this.
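
       The comparison done by hand in this example can be sketched in Python.
       The sketch is a simple heuristic, not part of cman: it groups nodes by
       the member set each one reports and keeps only groups whose nodes all
       report exactly that group. It assumes the five listings above come
       from node-01 to node-05 in that order; the function names and the
       short node names are illustrative.

```python
from collections import defaultdict


def mutual_groups(views):
    """Return sets of nodes that all see exactly each other as members.

    views maps an observing node to its own {node: state} listing.
    """
    observers_by_members = defaultdict(set)
    for observer, view in views.items():
        members = frozenset(n for n, state in view.items() if state == "M")
        observers_by_members[members].add(observer)
    # A group is consistent only if its observers are exactly its members.
    return [set(m) for m, obs in observers_by_members.items() if obs == set(m)]


def nodes_to_reboot(views):
    """Keep the largest consistent group; every other node is rebooted."""
    groups = mutual_groups(views)
    keep = max(groups, key=len) if groups else set()
    return keep, set(views) - keep


names = ["node-01", "node-02", "node-03", "node-04", "node-05"]
side_a = dict(zip(names, "MddMM"))  # as listed by node-01, node-04, node-05
side_b = dict(zip(names, "dMMdd"))  # as listed by node-02, node-03
views = {
    "node-01": side_a, "node-04": side_a, "node-05": side_a,
    "node-02": side_b, "node-03": side_b,
}
keep, reboot = nodes_to_reboot(views)
print(sorted(keep), sorted(reboot))
```

       For these listings the sketch keeps node-01, node-04 and node-05 and
       selects node-02 and node-03 for reboot, matching the conclusion above.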

Cluster utilities		  Nov 8 2007			  CMAN_TOOL(8)