mhd(7i) Ioctl Requests mhd(7i)
NAME
mhd - multihost disk control operations
SYNOPSIS
#include <sys/mhd.h>
DESCRIPTION
The mhd ioctl(2) requests control the access rights of a multihost
disk, using disk reservations on the disk device.
The stability level of this interface (see attributes(5)) is evolving.
As a result, the interface is subject to change and you should limit
your use of it.
The mhd ioctls fall into two major categories: (1) ioctls for non-
shared multihost disks and (2) ioctls for shared multihost disks.
One ioctl, MHIOCENFAILFAST, is applicable to both non-shared and shared
multihost disks. It is described after the first two categories.
All the ioctls require root privilege.
For all of the ioctls, the caller should obtain the file descriptor for
the device by calling open(2) with the O_NDELAY flag; without the
O_NDELAY flag, the open may fail due to another host already having a
conflicting reservation on the device. Some of the ioctls below permit
the caller to forcibly clear a conflicting reservation held by another
host; however, to call the ioctl, the caller must first obtain the open
file descriptor.
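The open step described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper name open_multihost_disk is hypothetical, and the O_RDONLY access mode is an assumption, since the text above specifies only the O_NDELAY flag.

```c
#include <fcntl.h>

/*
 * Open a multihost disk for use with the mhd ioctls.
 * O_NDELAY is passed so that the open does not fail merely because
 * another host holds a conflicting reservation.
 * The access mode (O_RDONLY) is an assumption of this sketch.
 * Returns a file descriptor, or -1 with errno set by open(2).
 */
int
open_multihost_disk(const char *path)
{
	return (open(path, O_RDONLY | O_NDELAY));
}
```

The caller passes the path of the disk device and checks the result for -1 before issuing any of the ioctls below.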
Non-shared multihost disks
Non-shared multihost disk ioctls consist of MHIOCTKOWN, MHIOCRELEASE,
MHIOCSTATUS, and MHIOCQRESERVE. These ioctl requests control the access
rights of non-shared multihost disks. A non-shared multihost disk is
one that supports serialized, mutually exclusive I/O mastery by the
connected hosts. This is in contrast to the shared-disk model, in which
concurrent access is allowed from more than one host (see below).
A non-shared multihost disk can be in one of two states:
· Exclusive access state, where only one connected host has I/O
access
· Non-exclusive access state, where all connected hosts have I/O
access. An external hardware reset can cause the disk to enter the
non-exclusive access state.
Each multihost disk driver views the machine on which it's running as
the "local host"; each views all other machines as "remote hosts". For
each I/O or ioctl request, the requesting host is the local host.
Note that the non-shared ioctls are designed to work with SCSI-2 disks.
The SCSI-2 RESERVE/RELEASE command set is the underlying hardware
facility in the device that supports the non-shared ioctls.
The function prototypes for the non-shared ioctls are:
ioctl(fd, MHIOCTKOWN);
ioctl(fd, MHIOCRELEASE);
ioctl(fd, MHIOCSTATUS);
ioctl(fd, MHIOCQRESERVE);
MHIOCTKOWN Forcefully acquires exclusive access rights to the mul‐
tihost disk for the local host. Revokes all access
rights to the multihost disk from remote hosts. Causes
the disk to enter the exclusive access state.
Implementation Note: Reservations (exclusive access
rights) broken via random resets should be reinstated
by the driver upon their detection, for example, in the
automatic probe function described below.
MHIOCRELEASE Relinquishes exclusive access rights to the multihost
disk for the local host. On success, causes the disk
to enter the non-exclusive access state.
MHIOCSTATUS Probes a multihost disk to determine whether the local
host has access rights to the disk. Returns 0 if the
local host has access to the disk, 1 if it doesn't, and
-1 with errno set to EIO if the probe failed for some
other reason.
MHIOCQRESERVE Issues, simply and only, a SCSI-2 Reserve command. If
the attempt to reserve fails due to the SCSI error
Reservation Conflict (which implies that some other
host has the device reserved), then the ioctl will
return -1 with errno set to EACCES. The MHIOCQRESERVE
ioctl does NOT issue a bus device reset or bus reset
prior to attempting the SCSI-2 reserve command. It
also does not take care of re-instating reservations
that disappear due to bus resets or bus device resets;
if that behavior is desired, then the caller can call
MHIOCTKOWN after the MHIOCQRESERVE has returned suc‐
cess. If the device does not support the SCSI-2
Reserve command, then the ioctl returns -1 with errno
set to ENOTSUP. The MHIOCQRESERVE ioctl is intended to
be used by high-availability or clustering software for
a "quorum" disk; hence the "Q" in the name of the
ioctl.
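Unlike the other non-shared ioctls, MHIOCSTATUS has three distinct return values. The following sketch shows one way a caller might classify them. The enum and function names are hypothetical; the argument rc is assumed to be the value returned by ioctl(fd, MHIOCSTATUS).

```c
/* Possible outcomes of ioctl(fd, MHIOCSTATUS), as described above. */
enum mh_access {
	MH_ACCESS_HELD,		/* returned 0: local host has access */
	MH_ACCESS_LOST,		/* returned 1: local host has no access */
	MH_ACCESS_UNKNOWN	/* returned -1: probe failed, check errno */
};

/* Map the ioctl return value onto one of the three outcomes. */
enum mh_access
mh_classify_status(int rc)
{
	if (rc == 0)
		return (MH_ACCESS_HELD);
	if (rc == 1)
		return (MH_ACCESS_LOST);
	return (MH_ACCESS_UNKNOWN);
}
```

Testing the result only for nonzero would treat "no access" and "probe failed" as the same condition, which is why the sketch keeps them apart.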
Shared Multihost Disks
Shared multihost disk ioctls control access to shared multihost disks.
The ioctls are merely a veneer on the SCSI-3 Persistent Reservation
facility. Therefore, the underlying semantic model is not described in
detail here; see the SCSI-3 standard instead. The SCSI-3 Persistent
Reservations support the concept of a group of hosts all sharing access
to a disk.
The function prototypes and descriptions for the shared multihost
ioctls are as follows:
ioctl(fd, MHIOCGRP_INKEYS, (mhioc_inkeys_t *)k);
Issues the SCSI-3 command Persistent Reserve In Read Keys to the
device. On input, the caller should initialize the field k->li, with
k->li.listsize set to the number of elements the caller has allocated
for the k->li.list array and with k->li.listlen == 0. On return, the
field k->li.listlen is updated to indicate the number of reservation
keys the device currently has. If this value is larger than
k->li.listsize, the caller should have passed a bigger k->li.list
array with a bigger k->li.listsize. The number of array elements
actually written by the callee into k->li.list is the minimum of
k->li.listlen and k->li.listsize. The field k->generation is updated
with the generation information returned by the SCSI-3 Read Keys
query.
If the device does not support SCSI-3 Persistent Reservations,
then this ioctl returns -1 with errno set to ENOTSUP.
ioctl(fd, MHIOCGRP_INRESVS, (mhioc_inresvs_t *)r);
Issues the SCSI-3 command Persistent Reserve In Read Reservations
to the device. Remarks similar to MHIOCGRP_INKEYS apply to the
array manipulation. If the device does not support SCSI-3 Persis‐
tent Reservations, then this ioctl returns -1 with errno set to
ENOTSUP.
ioctl(fd, MHIOCGRP_REGISTER, (mhioc_register_t *)r);
Issues the SCSI-3 command Persistent Reserve Out Register. The
fields of structure r are all inputs; none of the fields are modi‐
fied by the ioctl. The field r->aptpl should be set to true to
specify that registrations and reservations should persist across
device power failures, or to false to specify that registrations
and reservations should be cleared upon device power failure; true
is the recommended setting. The field r->oldkey is the key that the
caller believes the device may already have for this host initiator;
if the caller believes that this host initiator is not already
registered with this device, it should pass the special key of all
zeros. To achieve the effect of unregistering with the
device, the caller should pass its current key for the r->oldkey
field and an r->newkey field containing the special key of all
zeros. If the device returns the SCSI error code Reservation Con‐
flict, this ioctl returns -1 with errno set to EACCES.
ioctl(fd, MHIOCGRP_RESERVE, (mhioc_resv_desc_t *)r);
Issues the SCSI-3 command Persistent Reserve Out Reserve. The
fields of structure r are all inputs; none of the fields are modi‐
fied by the ioctl. If the device returns the SCSI error code Reser‐
vation Conflict, this ioctl returns -1 with errno set to EACCES.
ioctl(fd, MHIOCGRP_PREEMPTANDABORT, (mhioc_preemptandabort_t *)r);
Issues the SCSI-3 command Persistent Reserve Out Preempt-And-Abort.
The fields of structure r are all inputs; none of the fields are
modified by the ioctl. The key of the victim host is specified by
the field r->victim_key. The field r->resvdesc supplies the pre‐
empter's key and the reservation that it is requesting as part of
the SCSI-3 Preempt-And-Abort command. If the device returns the
SCSI error code Reservation Conflict, this ioctl returns -1 with
errno set to EACCES.
ioctl(fd, MHIOCGRP_PREEMPT, (mhioc_preemptandabort_t *)r);
Similar to MHIOCGRP_PREEMPTANDABORT, but instead issues the SCSI-3
command Persistent Reserve Out Preempt.
ioctl(fd, MHIOCGRP_CLEAR, (mhioc_resv_key_t *)r);
Issues the SCSI-3 command Persistent Reserve Out Clear. The input
parameter r is the reservation key of the caller, which should have
been already registered with the device, by an earlier call to
MHIOCGRP_REGISTER.
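The oldkey and newkey conventions described for MHIOCGRP_REGISTER can be illustrated with the sketch below. It is illustrative only: struct reg_keys is a hypothetical stand-in for the key fields of the real request structure declared in <sys/mhd.h>, and the helper names are not part of the interface. The 8-byte key length is the size of a SCSI-3 reservation key.

```c
#include <stdint.h>
#include <string.h>

#define	KEY_SIZE	8	/* a SCSI-3 reservation key is 8 bytes */

/* Hypothetical stand-in for the oldkey/newkey pair of a request. */
struct reg_keys {
	uint8_t oldkey[KEY_SIZE];
	uint8_t newkey[KEY_SIZE];
};

/* First registration: oldkey is all zeros, newkey is the host's key. */
void
keys_for_register(struct reg_keys *rk, const uint8_t mykey[KEY_SIZE])
{
	memset(rk->oldkey, 0, KEY_SIZE);
	memcpy(rk->newkey, mykey, KEY_SIZE);
}

/* Unregistration: oldkey is the current key, newkey is all zeros. */
void
keys_for_unregister(struct reg_keys *rk, const uint8_t mykey[KEY_SIZE])
{
	memcpy(rk->oldkey, mykey, KEY_SIZE);
	memset(rk->newkey, 0, KEY_SIZE);
}

/* Returns 1 if key is the special all-zeros key, 0 otherwise. */
int
key_is_zero(const uint8_t key[KEY_SIZE])
{
	static const uint8_t zero[KEY_SIZE];

	return (memcmp(key, zero, KEY_SIZE) == 0);
}
```

A caller would copy the resulting keys into the corresponding fields of the request structure before issuing the ioctl.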
For each device, the non-shared ioctls should not be mixed with the
Persistent Reserve Out shared ioctls, and vice versa. Otherwise, the
underlying device is likely to return errors, because SCSI does not
permit SCSI-2 reservations to be mixed with SCSI-3 reservations on a
single device. It is, however, legitimate to call the Persistent
Reserve In ioctls, because these are query only. Issuing the
MHIOCGRP_INKEYS ioctl is the recommended way for a caller to determine
whether the device supports SCSI-3 Persistent Reservations (the ioctl
returns -1 with errno set to ENOTSUP if the device does not).
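After a successful MHIOCGRP_INKEYS call, the caller has to compare listlen with listsize to learn how many keys were actually returned. The sketch below captures that bookkeeping. The function names are hypothetical, and the two arguments are assumed to be copies of the listsize and listlen values from the key list after the ioctl returns.

```c
#include <stdint.h>

/*
 * Number of keys the driver actually wrote into the caller's array:
 * the minimum of listlen and listsize.
 */
uint32_t
keys_written(uint32_t listsize, uint32_t listlen)
{
	return (listlen < listsize ? listlen : listsize);
}

/*
 * Nonzero if the device holds more keys than the array could take,
 * in which case the caller should retry with a larger array.
 */
int
key_list_truncated(uint32_t listsize, uint32_t listlen)
{
	return (listlen > listsize);
}
```

One possible retry strategy is to allocate an array of listlen elements and repeat the call, checking again in case more keys were registered in the meantime.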
MHIOCENFAILFAST Ioctl
The MHIOCENFAILFAST ioctl is applicable for both non-shared and shared
disks, and may be used with either the non-shared or shared ioctls.
ioctl(fd, MHIOCENFAILFAST, (unsigned int *)millisecs);
Enables or disables the failfast option in the multihost disk
driver and enables or disables automatic probing of a multihost
disk, described below. The argument is an unsigned integer speci‐
fying the number of milliseconds to wait between executions of the
automatic probe function. An argument of zero disables the fail‐
fast option and disables automatic probing. If the MHIOCENFAILFAST
ioctl is never called, the effect is defined to be that both the
failfast option and automatic probing are disabled.
Automatic Probing
The MHIOCENFAILFAST ioctl sets up a timeout in the driver to periodi‐
cally schedule automatic probes of the disk. The automatic probe
function works in this manner: The driver is scheduled to probe the
multihost disk every n milliseconds, rounded up to the next integral
multiple of the system clock's resolution. If
1. the local host no longer has access rights to the multihost
disk, and
2. access rights were expected to be held by the local host,
the driver immediately panics the machine to comply with the failfast
model.
If the driver makes this discovery outside the timeout function, espe‐
cially during a read or write operation, it is imperative that it panic
the system then as well.
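The probe interval rounding and the failfast condition described above can be expressed as two small functions. This is a sketch under stated assumptions, not the driver's code: the function names are hypothetical, the clock resolution is assumed to be supplied in milliseconds, and an interval that is already an exact multiple of the resolution is assumed to be left unchanged.

```c
/*
 * Round a requested probe interval up to an integral multiple of
 * the clock resolution. Both values are in milliseconds; tick_ms
 * must be nonzero. Written to avoid overflow for large intervals.
 */
unsigned int
round_up_to_tick(unsigned int millisecs, unsigned int tick_ms)
{
	unsigned int ticks = millisecs / tick_ms;

	if (millisecs % tick_ms != 0)
		ticks++;
	return (ticks * tick_ms);
}

/*
 * The failfast condition: the local host was expected to hold
 * access rights but no longer has them. Returns 1 if it holds.
 */
int
should_failfast(int has_access, int access_expected)
{
	return (!has_access && access_expected);
}
```

For example, with a 10-millisecond clock resolution, a request for 25 milliseconds results in a probe every 30 milliseconds.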
RETURN VALUES
Each request returns -1 on failure and sets errno to indicate the
error.
EPERM Caller is not root.
EACCES Access rights were denied.
EIO The multihost disk or controller was unable to success‐
fully complete the requested operation.
ENOTSUP The multihost disk does not support the operation. For
example, it does not support the SCSI-2 Reserve/Release
command set, or the SCSI-3 Persistent Reservation com‐
mand set.
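A caller might translate these errors into diagnostic messages along the following lines. The function name mhd_error_text is hypothetical, and the sketch uses ENOTSUP, the spelling used in the ioctl descriptions above, for the unsupported-operation case.

```c
#include <errno.h>

/* Return a short description of an errno value set by an mhd ioctl. */
const char *
mhd_error_text(int err)
{
	switch (err) {
	case EPERM:
		return ("caller is not root");
	case EACCES:
		return ("access rights were denied");
	case EIO:
		return ("disk or controller could not complete the operation");
	case ENOTSUP:
		return ("disk does not support the operation");
	default:
		return ("unexpected error");
	}
}
```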
ATTRIBUTES
See attributes(5) for a description of the following attributes:
┌─────────────────────────────┬─────────────────────────────┐
│ ATTRIBUTE TYPE │ ATTRIBUTE VALUE │
│Availability │SUNWhea │
│Stability │Evolving │
└─────────────────────────────┴─────────────────────────────┘
SEE ALSO
ioctl(2), open(2), attributes(5)
SunOS 5.10 9 Feb 2004 mhd(7i)