Instruction/ maintenance manual of the product SG24-5131-00 IBM
SG24-5131-00 International Technical Support Organization http://www.redbooks.ibm.com IBM Certification Study Guide AIX HACMP David Thiessen, Achim Rehor, Reinhard Zettler.
IBM Certification Study Guide AIX HACMP May 1999 SG24-5131-00 International Technical Support Organization.
© Copyright International Business Machines Corporation 1999. All rights reserved. Note to U.S. Government Users – Documentation related to restricted rights – Use, duplication or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.
© Copyright IBM Corp. 1999
Contents
Figures
Tables
Chapter 3. Cluster Hardware and Software Preparation
3.1 Cluster Node Setup
3.1.1 Adapter Slot Placement
5.1.3 Event Notification
5.1.4 Event Recovery and Retry
5.1.5 Notes on Customizing Event Processing
8.1.1 The clstat Command
8.1.2 Monitoring Clusters using HAView
8.1.3 Cluster Log Files
9.3 VSDs - RVSDs
9.3.1 Virtual Shared Disk (VSDs)
9.3.2 Recoverable Virtual Shared Disk
Figures
1. Basic SSA Configuration
2. Hot-Standby Configuration
Tables
1. AIX Version 4 HACMP Installation and Implementation
2. AIX Version 4 HACMP System Administration
3. Hardware Requirements for the Different HACMP Versions
Preface
The AIX and RS/6000 Certifications offered through the Professional Certification Program from IBM are designed to validate the skills required of technical professionals who work in the powerful and often complex environments of AIX and RS/6000.
• AIX parameters that are affected by an HACMP installation, and their correct settings
• The cluster and resource configuration process, including.
POWERparallel Systems area, known as the SP1 at that time. In 1997 he began working on HACMP as the Service Groups for HACMP and RS/6000 SP merged into one. He holds a diploma in Computer Science from the University of Frankfurt in Germany. This is his first redbook.
Chapter 1. Certification Overview
This chapter provides an overview of the skill requirements for obtaining an IBM Certified Specialist - AIX HACMP certification. The following chapters are designed to provide a comprehensive review of specific topics that are essential for obtaining the certification.
1.2 Certification Exam Objectives
The following objectives were used as a basis for what is required when the certification exam was developed. Some of these topics have been regrouped to provide better organization when discussed in this publication.
• Create an application server.
• Set up Event Notification.
• Set up event notification and pre/post event scripts.
• Set up error notification.
• Post Configuration Activities.
• Configure a client notification and ARP update.
1.3 Certification Education Courses
Courses and publications are offered to help you prepare for the certification tests. These courses are recommended, but not required, before taking a certification test.
The following table outlines information about the next course.
Table 2. AIX Version 4 HACMP System Administration
Course Number Q1150 (USA); AU50 (Worldwide) Course D.
Chapter 2. Cluster Planning
The area of cluster planning is a large one. Not only does it include planning for the types of hardware (CPUs, networks, disks) to be used in the cluster, but it also includes other aspects.
RISC System/6000 models as nodes in an HACMP 4.1 for AIX, HACMP 4.2 for AIX, or HACMP 4.3 for AIX cluster.
Much of the decision centers around the following areas:
• Processor capacity
• Application requirements
• Anticipated growth requirements
• I/O slot requirements
These paradigms are certainly not new ones, and are also important considerations when choosing a processor for a single-system environment.
Your slot configuration must also allow for the disk I/O adapters you need to support the cluster’s shared disk (volume group) configuration.
2.2 Cluster Networks
HACMP differentiates between two major types of networks: TCP/IP networks and non-TCP/IP networks. HACMP utilizes both of them for exchanging heartbeats. HACMP uses these heartbeats to diagnose failures in the cluster.
• FDDI
• SP Switch
• SLIP
• SOCC
• Token-Ring
As an independent, layered component of AIX, the HACMP for AIX software works with most TCP/IP-based networks. HACMP for AIX has been tested with standard Ethernet interfaces (en*) but not with IEEE 802.
Network types also differentiate themselves in the maximum distance they allow between adapters, and in the maximum number of adapters allowed on a physical network.
• Ethernet supports 10 and 100 Mbps currently, and supports hardware address swapping.
• SP Switch is a high-speed packet switching network, running on the RS/6000 SP system only. It runs bidirectionally up to 80 MBps, which adds up to 160 MBps of capacity per adapter. This is node-to-node communication and can be done in parallel between every pair of nodes inside an SP.
2.2.2.2 Special Considerations
As for TCP/IP networks, there are a number of restrictions on non-TCP/IP networks. These are explained for the three different types in more detail below.
Serial (RS232)
A serial (RS232) network needs at least one available serial port per cluster node.
2 a PCI Multiport Async Card is required in an S7X model, no native ports
3 only one serial port available for customer use, i.
SSA subsystems are built up from loops of adapters and disks. A simple example is shown in Figure 1.
Figure 1. Basic SSA Configuration
Here, a single adapter controls one SSA loop of eight disks. Data can be transferred around the loop, in either direction, at 20 MBps.
• 7133 Serial Storage Architecture (SSA) Disk Subsystem Models 010, 500, 020, 600, D40 and T40.
The 7133 models 010 and 500 were the first SSA products announced in 1995 with the revolutionary new Serial Storage Architecture.
2.3.1.1 Disk Capacities
Table 8 lists the different SSA disks, and provides an overview of their characteristics.
Table 8. SSA Disks
2.3.1.2 Supported and Non-Supported Adapters
Table 9 lists the different SSA adapters and presents an overview of their characteristics.
1 See 2.3.1.3, “Rules for SSA Loops” on page 20 for more information.
The following rules apply to SSA Adapters:
• You cannot have more than four adapters in a single system.
• A maximum of 48 devices can be connected in a particular SSA loop.
• Only one pair of adapter connectors can be connected in a particular SSA loop.
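The SSA adapter and loop limits listed above lend themselves to a simple planning check. The following is a minimal Python sketch, not an IBM tool: the function name, its arguments, and the adapter names are hypothetical, and it assumes the connector-pair rule is meant per adapter.

```python
MAX_ADAPTERS_PER_SYSTEM = 4   # "no more than four adapters in a single system"
MAX_DEVICES_PER_LOOP = 48     # "a maximum of 48 devices ... in a particular SSA loop"


def check_ssa_plan(adapters_in_system, devices_in_loop, connector_pairs_in_loop):
    """Return a list of rule violations for one system and one SSA loop.

    connector_pairs_in_loop maps a (hypothetical) adapter name to the number
    of its connector pairs cabled into this loop. Assumption: the
    "only one pair" rule applies to each adapter individually.
    """
    problems = []
    if adapters_in_system > MAX_ADAPTERS_PER_SYSTEM:
        problems.append("more than four adapters in a single system")
    if devices_in_loop > MAX_DEVICES_PER_LOOP:
        problems.append("more than 48 devices in the loop")
    for name, pairs in sorted(connector_pairs_in_loop.items()):
        if pairs > 1:
            problems.append(f"adapter {name} has {pairs} connector pairs in the loop")
    return problems


# A plan like Figure 1: one adapter, one loop of eight disks.
print(check_ssa_plan(1, 8, {"ssa0": 1}))
```

An empty list means the plan violates none of the three rules checked here; the remaining SSA loop rules are not modeled.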
2.3.1.4 RAID vs. Non-RAID
RAID Technology
RAID is an acronym for Redundant Array of Independent Disks. Disk arrays are groups of disk drives that work together to achieve higher data-transfer and I/O rates than those provided by single large drives.
RAID Levels 2 and 3
RAID 2 and RAID 3 are parallel process array mechanisms, where all drives in the array operate in unison. Similar to data striping, information to be written to disk is split into chunks (a fixed amount of data), and each chunk is written out to the same physical position on separate disks (in parallel).
As with RAID 3, in the event of disk failure, the information can be rebuilt from the remaining drives. A RAID level 5 array also uses parity information, though it is still important to make regular backups of the data in the array.
• Array member drives and spares must be on same loop (cannot span A and B loops) on the adapter.
• You cannot boot (ipl) from a RAID.
2.3.1.5 Advantages
Because SSA allows SCSI-2 mapping, all functions associated with initiators, targets, and logical units are translatable.
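To illustrate how parity allows a failed drive's contents to be rebuilt from the remaining drives, here is a small Python sketch using byte-wise XOR parity. The chunk contents and the three-data-drive layout are invented for the example; real arrays do this in the adapter or subsystem, not in application code.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Illustrative chunks held on three data drives (contents are made up).
data = [b"AIX!", b"HACM", b"P4.3"]
parity = xor_blocks(data)  # chunk stored as parity

# Suppose the second drive fails: rebuild its chunk from the
# surviving data chunks plus the parity chunk.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Because XOR is its own inverse, any single missing chunk can be recovered this way, which is also why a second simultaneous failure cannot be repaired from parity alone and backups remain necessary.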
2.3.2 SCSI Disks
After the announcement of the 7133 SSA Disk Subsystems, the SCSI Disk subsystems became less common in HACMP clusters. However, the 7135 RAIDiant Array (Model 110 and 210) and other SCSI Subsystems are still in use at many customer sites.
• Enhanced SCSI-2 Differential Fast/Wide Adapter/A (MCA, FC: 2412, Adapter Label: 4-C); not usable with 7135-110
• SCSI-2 Fast/Wide Differential Adapter (PCI, FC: 6209, Adapter Label: 4-B)
• DE Ultra SCSI Adapter (PCI, FC: 6207, Adapter Label: 4-L); not usable with 7135-110
2.
withdraw the 7135 RAIDiant Systems from marketing because it is equally possible to configure RAID on the SSA Subsystems.
• Cascading
• Rotating
• Concurrent
Each of these types describes a different set of relationships between nodes in the cluster, and a different set of behaviors upon nodes entering and leaving the cluster.
reintegration, a node remains as a standby and does not take back any of the resources that it had initially served.
Concurrent Resource Groups: A concurrent resource group may be shared simultaneously by multiple nodes.
Figure 2. Hot-Standby Configuration
In this configuration, there is one cascading resource group consisting of the four disks, hdisk1 to hdisk4, and their constituent volume groups and file systems. Node 1 has a priority of 1 for this resource group while node 2 has a priority of 2.
the cluster becomes a standby node. You must choose a rotating standby configuration if you do not want a break in service during reintegration. Since takeover nodes continue providing services until they have to leave the cluster, you should configure your cluster with nodes of equal power.
When a failed node reintegrates into the cluster, it takes back the resource group for which it has the highest priority. Therefore, even in this configuration, there is a break in service during reintegration.
Here the resource groups are the same as the ones in the mutual takeover configuration. Also, similar to the previous configuration, nodes 1 and 2 each have priorities of 1 for one of the resource groups, A or B.
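The takeover and reintegration behavior of a cascading resource group can be modeled in a few lines of Python. This is only an illustrative sketch of the priority rule described here, not how the Cluster Manager is implemented; the node names and the function are hypothetical.

```python
def cascading_owner(priorities, active_nodes):
    """Return the active node with the best (lowest-numbered) priority.

    priorities maps node name -> priority for one resource group
    (1 is the highest priority). Returns None if no listed node is active.
    """
    candidates = [node for node in priorities if node in active_nodes]
    return min(candidates, key=priorities.get) if candidates else None


priorities = {"node1": 1, "node2": 2}

print(cascading_owner(priorities, {"node1", "node2"}))  # normal operation
print(cascading_owner(priorities, {"node2"}))           # node1 has failed
print(cascading_owner(priorities, {"node1", "node2"}))  # node1 reintegrates
```

The third call returns the original owner again, which corresponds to the break in service during reintegration: the group has to move back from the takeover node.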
• Design the network topology
• Define a network mask for your site
• Define IP addresses (adapter identifiers) for each node’s service and standby adapters.
• Define a boot address for each service adapter that can be taken over, if you are using IP address takeover or rotating resources.
Dual Network
A dual-network setup has two separate networks for communication. Nodes are connected to two networks, and each node has two service adapters available to clients. If one network fails, the remaining network can still function, connecting nodes and providing resource access to clients.
The following diagram shows a cluster consisting of two nodes and a client. A single public network connects the nodes and the client, and the nodes are linked point-to-point by a private high-speed SOCC connection that provides an alternate path for cluster and lock traffic should the public network fail.
SLIP are considered public networks. Note that a SLIP line, however, does not provide client access.
Private
A private network provides communication between nodes only; it typically does not allow client access.
until it assumes the shared IP address. Consequently, Clinfo makes known the boot address for this adapter. In an HACMP for AIX environment on the RS/6000 SP, the SP Ethernet adapters can be configured as service adapters but should not be configured for IP address takeover.
service label (address) instead of the boot label. If the node should fail, a takeover node acquires the failed node’s service address on its standby adapter, thus making the failure transparent to clients using that specific service address.
If you do not use Hardware Address Takeover, the ARP cache of clients can be updated by adding the clients’ IP addresses to the PING_CLIENT_LIST variable in the /usr/sbin/cluster/etc/clinfo.
application on the takeover node when a fallover occurs. For more information about creating application server resources, see the HACMP for AIX, Version 4.
2.5.3 Licensing Methods
Some vendors require a unique license for each processor that runs an application, which means that you must license-protect the application by incorporating processor-specific information into the application when it is installed.
2.6 Customization Planning
The Cluster Manager’s ability to recognize a specific series of events and subevents permits a very flexible customization scheme. The HACMP for AIX software provides an event customization facility that allows you to tailor cluster event processing to your site.
event to inform system administrators that traffic may have to be rerouted. Afterwards, you can use a network_up notification event to inform system administrators that traffic can again be serviced through the restored network.
2.6.2.1 Single Point-of-Failure Hardware Component Recovery
As described in 2.2.1.2, “Special Network Considerations” on page 12, the HPS Switch network is one resource that has to be considered as a single point of failure.
The above example screen will add a Notification Method to the ODM, so that upon appearance of the HPS_FAULT9_ER entry in the error log, the error notification daemon will trigger the execution of the /usr/sbin/cluster/utilities/clstop -grsy command, which shuts HACMP down gracefully with takeover.
2.7 User ID Planning
The following sections describe various aspects of User ID Planning.
2.7.1 Cluster User and Group IDs
One of the basic tasks any system administrator must perform is setting up user accounts and groups.
2.7.2 Cluster Passwords
While user and group management is very much facilitated with C-SPOC, the password information still has to be distributed by some other means.
2.7.3.3 NFS-Mounted Home Directories on Shared Volumes
So, a combined approach is used in most cases. In order to make home directories a highly available resource, they have to be part of a resource group and placed on a shared volume.
Chapter 3. Cluster Hardware and Software Preparation
This chapter covers the steps that are required to prepare the RS/6000 hardware and AIX software for the installation of HACMP and the configuration of the cluster.
mirroring rootvg in order to avoid the impact of the failover time involved in a node failure. In terms of maximizing availability, this technique is just as valid for increasing the availability of a cluster as it is for increasing single-system availability.
mirrored. If the dump devices are NOT the paging device, that dump logical volume will not be mirrored.
3.1.2.1 Procedure
The following steps assume the user has rootvg contained on hdisk0 and is attempting to mirror the rootvg to a new disk: hdisk1.
“-m” option. You should consult documentation on the usage of the “-m” option for mklvcopy.
4.
3.1.2.2 Necessary APAR Fixes
Table 11. Necessary APAR Fixes
To determine if either fix is installed on a machine, execute the following:
3.1.3 AIX Prerequisite LPPs
In order to install HACMP and HACMP/ES the AIX setup must be in a proper state.
• nv6000.database.obj 4.1.0.0
• nv6000.Features.obj 4.1.2.0
• nv6000.client.obj 4.1.0.0
and for HAView 4.3
• xlC.rte 3.1.4.0
• nv6000.base.obj 4.1.2.0
• nv6000.database.obj 4.1.2.0
• nv6000.
and low-water marks. If a process tries to write to a file at the high-water mark, it must wait until enough I/O operations have finished to make the low-water mark. Use the smit chgsys fastpath to set high- and low-water marks on the Change/Show Characteristics of the Operating System screen.
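The high- and low-water mark behavior described above is a form of hysteresis: a writer stops at the high mark and resumes only once pending I/O has drained to the low mark. The following Python sketch models just that rule. The function is hypothetical, and the mark values used in the example are illustrative rather than recommended settings.

```python
def writer_blocked(pending_ios, was_blocked, high_water, low_water):
    """Decide whether a writing process must wait.

    pending_ios: I/O operations currently outstanding against the file.
    was_blocked: whether the process is already waiting.
    """
    if pending_ios >= high_water:
        return True          # reached the high-water mark: must wait
    if was_blocked and pending_ios > low_water:
        return True          # still draining toward the low-water mark
    return False             # free to write


HIGH, LOW = 33, 24           # illustrative values only

state = False
for pending in (10, 33, 30, 25, 24, 30):
    state = writer_blocked(pending, state, HIGH, LOW)
    print(pending, "blocked" if state else "running")
```

Note that 30 pending operations block the writer while it is draining from the high mark, but not once it has resumed at the low mark.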
3.1.4.3 Editing the /etc/hosts File and Nameserver Configuration
Make sure all nodes can resolve all cluster addresses. See the chapter on planning TCP/IP networks (the section Using HACMP with NIS and DNS) in the HACMP for AIX, Version 4.
3.1.4.5 Editing the /.rhosts File
Make sure that each node’s service adapters and boot addresses are listed in the /.rhosts file on each cluster node. Doing so allows the /usr/sbin/cluster/utilities/clruncmd command and the /usr/sbin/cluster/godm daemon to run.
3.2 Network Connection and Testing
The following sections describe important aspects of network connection and testing.
3.2.1 TCP/IP Networks
Since there are several types of TCP/IP Networks available within HACMP, there are several different characteristics and some restrictions on them.
Figure 9. Connecting Networks to a Hub
3.2.1.2 IP Addresses and Subnets
The design of the HACMP for AIX software specifies that:
• All client traff.
To comply with these rules, pay careful attention to the IP addresses you assign to standby adapters. Standby adapters must be on a separate subnet from the service adapters, even though they are on the same physical network.
• Scan the /tmp/hacmp.out file to confirm that the /etc/rc.net script has run successfully. Look for a zero exit status.
• If IP address takeover is enabled, confirm that the /etc/rc.net script has run and that the service adapter is on its service address and not on its boot address.
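The separate-subnet requirement for standby adapters can be checked before any addresses are configured. The sketch below uses Python's standard ipaddress module; the helper names and the example addresses are made up for illustration and are not part of HACMP.

```python
import ipaddress


def same_subnet(addr_a, addr_b, netmask):
    """True if both addresses fall into the same subnet under netmask."""
    net_a = ipaddress.ip_network(f"{addr_a}/{netmask}", strict=False)
    net_b = ipaddress.ip_network(f"{addr_b}/{netmask}", strict=False)
    return net_a == net_b


def standby_placement_ok(service_addr, standby_addr, netmask):
    """A standby adapter must NOT share a subnet with the service adapter."""
    return not same_subnet(service_addr, standby_addr, netmask)


# Hypothetical addresses on one physical network, netmask 255.255.255.0.
print(standby_placement_ok("192.168.10.1", "192.168.11.1", "255.255.255.0"))
print(standby_placement_ok("192.168.10.1", "192.168.10.2", "255.255.255.0"))
```

The first pair satisfies the rule; the second does not, because both addresses lie in 192.168.10.0 under that mask.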
TMSSA
Target-mode SSA is only supported with the SSA Multi-Initiator RAID Adapters (Feature #6215 and #6219), Microcode Level 1801 or later. You need at least HACMP Version 4.2.2 with APAR IX75718.
3.2.2.4 Configuring Target Mode SSA
The node number on each system needs to be changed from the default of zero to a number.
cat /etc/environment > /dev/tmssay.im
on the corresponding node for writing. x and y correspond to the appropriate opposite node number. You should see the first command hanging until the second command is issued, and then showing its output.
For more information regarding adapters and cabling rules see 2.3.1, “SSA Disks” on page 16 or the following documents:
• 7133 SSA Disk Subsystems:.
Adapter Definitions
By issuing the following command, you can check the correct adapter configuration. In order to work correctly, the adapter must be in the “Available” state:
The third column in the adapter device line shows the location of the adapter.
SSA physical disks:
• Are configured as pdisk0, pdisk1,...,pdiskN.
• Have errors logged against them in the system error log.
• Support a character special file (/dev/pdisk0, /dev/pdisk1,...,/dev/pdiskN).
Configuration Verification
This option enables you to display the relationships between physical (pdisk) and logical (hdisk) disks.
Format Disk
This option enables you to format SSA disk drives.
Certify Disk
This option enables you to test whether data on an SSA disk drive can be read correctly.
12. Run cfgmgr to install the microcode to adapters.
13. To complete the device driver upgrade, you must now reboot your system.
14. To confirm that the upgrade was a success, type lscfg -vl ssaX where X is 0,1.
18. To confirm that the upgrade was a success, type lscfg -vl pdiskX where X is 0,1... for all SSA disks. Check the ROS Level line to see that each disk has the appropriate microcode level (for the correct microcode level see the above mentioned web-site).
3.3.2.1 Cabling
The following sections describe important information about cabling.
SCSI Adapters
An overview of SCSI adapters that can be used on a shared SCSI bus is given in 2.3.2.3, “Supported SCSI Adapters” on page 26.
FC: 2902 or 9202 (2.4m), PN: 67G1260
- OR -
FC: 2905 or 9205 (4.5m), PN: 67G1261
- OR -
FC: 2912 or 9212 (12m), PN: 67G1262
- OR -
FC: 2914 or 9214 (14.
FC: 2426 (0.94m), PN: 52G4234
• 16-Bit SCSI-2 Differential System-to-System Cable
FC: 2424 (0.6m), PN: 52G4291
- OR -
FC: 2425 (2.5m), PN: 52G4233
This cable is used only if there are more than two nodes attached to the same shared bus.
[Figure: shared 16-bit SCSI bus cabling, showing terminators (T) and 16-bit cables #2416, #2424 and #2426. Maximum total cable length: 25m.]
Figure 11. 7135-110 RAIDiant Arrays Connected on Two Shared 16-Bit SCSI Buses
3.3.2.3 Adapter SCSI ID and Termination change
The SCSI-2 Differential Controller is used to connect to 8-bit disk devices on a shared bus.
SCSI-2 Differential Fast/Wide Adapter/A and Enhanced SCSI-2 Differential Fast/Wide Adapter/A) are shown in Figure 12 and Figure 13 respectively.
Figure 12. Termination on the SCSI-2 Differential Controller
Figure 13.
The ID of an SCSI adapter, by default, is 7. Since each device on an SCSI bus must have a unique ID, the ID of at least one of the adapters on a shared SCSI bus has to be changed. The procedure to change the ID of an SCSI-2 Differential Controller is:
1.
4. Reboot the machine to bring the change into effect.
The same task can be executed from the command line by entering:
Also with this method, a reboot is required to bring the change into effect.
The command line version of this is:
As in the case of the SCSI-2 Differential Controller, a system reboot is required to bring the change into effect.
3.4.1 Creating Shared VGs
The following sections contain information about creating non-concurrent VGs and VGs for concurrent access.
3.4.1.1 Creating Non-Concurrent VGs
This section covers how to create a shared volume group on the source node using the SMIT interface.
Creating a Concurrent Access Volume Group on Serial Disk Subsystems
To use a concurrent access volume group, defined on a serial disk subsystem such as an IBM 7133 disk subsystem, you must create it as a concurrent-capable volume group.
Use the smit mkvg fastpath to create a shared volume group. Use the default field values unless your site has other requirements, or unless you are specifically instructed otherwise.
Table 15. smit mkvg Options (Concurrent, RAID)
3.
the journaled file system log (jfslog) is a logical volume that requires a unique name in the cluster. To make sure that logical volumes have unique names, rename the logical volume associated with the file system and the corresponding jfslog logical volume.
That is, you enter this command for each disk. In the resulting display, locate the line for the logical volume for which you just added copies. For copies placed on separate disks, the numbers in the logical partitions column and the physical partitions column should be equal.
The TaskGuide uses a graphical interface to guide you through the steps of adding nodes to an existing volume group. For more information on the TaskGuide, see 3.4.6, “Alternate Method - TaskGuide” on page 90.
3.4.4.4 Varying Off the Volume Group on the Destination Nodes
Use the varyoffvg command to deactivate the shared volume group so that it can be imported onto another destination node or activated as appropriate by the cluster event scripts.
command succeeds. If exactly half the copies are available, as with two of four, quorum is not achieved and the varyonvg command fails.
Forcing a Varyon
A volume group with quorum disabled and one or more physical volumes unavailable can be “forced” to vary on by using the -f flag with the varyonvg command.
conflict with the cluster’s configuration. Online help panels give additional information to aid in each step.
3.4.6.1 TaskGuide Requirements
Before starting the TaskGuide, make sure:
• You have a configured HACMP cluster in place.
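The quorum rule stated above amounts to a strict-majority test: exactly half is not enough. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and it models only the counting rule, not how varyonvg locates the copies.

```python
def has_quorum(available_copies, total_copies):
    """Quorum requires MORE than half of the copies to be available."""
    return 2 * available_copies > total_copies


for available, total in ((2, 4), (3, 4), (2, 3), (1, 3)):
    result = "succeeds" if has_quorum(available, total) else "fails"
    print(f"{available} of {total} copies available: varyon {result}")
```

Comparing `2 * available` against `total` avoids fractional arithmetic and makes the "exactly half fails" case explicit.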
Chapter 4. HACMP Installation and Cluster Definition
This chapter describes issues concerning the actual installation of HACMP Version 4.3 and the definition of a cluster and its resources. It concentrates on the HACMP part of the installation, so we will assume AIX is already at the 4.
cluster.base.server.utils HACMP Base Server Utilities
• cluster.cspoc
This component includes all of the commands and environment for the C-SPOC utility, the Cluster-Single Point Of Control feature.
• cluster.vsm
The Visual Systems Management fileset contains icons and bitmaps for the graphical management of HACMP resources, as well as the xhacmpm command:
cluster.vsm HACMP X11 Dependent
• cluster.
This fileset contains the Application Heart Beat Daemon. Oracle Parallel Server is an application that makes use of it:
cluster.hc.rte Application Heart Beat Daemon
The installation of CRM requires the following software:
bos.
HACMP software to HACMP for AIX, Version 4.3. The comments on upgrading the Operating System are not included. If you are already running AIX 4.3, see the special note at the end of this section.
4.1.2.1 Upgrading from Version 4.
Install HACMP 4.3 for AIX on Node A
5. After upgrading AIX and verifying that the disks are correctly configured, install the HACMP 4.3 for AIX software on Node A. For a short description of the filesets, please refer to 4.
file on Node A using the following command:
/usr/sbin/cluster/utilities/cllsif -x >> /.rhosts
This command will append information to the /.rhosts file instead of overwriting it. Then, you can ftp this file to the other nodes as necessary.
2. If you wish to save your cluster configuration, see the chapter Saving and Restoring Cluster Configurations in the HACMP for AIX, Version 4.3: Administration Guide, SC23-4279.
3. Commit your current HACMP for AIX software on all nodes.
• The network modules
You define the cluster topology by entering information about each component into HACMP-specific ODM classes. You enter the HACMP ODM data by using the HACMP SMIT interface or the VSM utility xhacmpm.
Adding or Changing a Node Name after the Initial Configuration
If you want to add or change a node name after the initial configuration, use the Change/Show Cluster Node Name screen. See the chapter on changing the cluster topology of the HACMP for AIX, Version 4.
Network Name
Enter an ASCII text string that identifies the network. The network name can include alphabetic and numeric characters and underscores. Use no more than 31 characters. The network name is arbitrary, but must be used consistently for adapters on the same physical network.
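The naming rule for the Network Name field can be expressed as a single pattern. The following Python sketch is only a pre-check you might run on a planning worksheet; the function name and sample names are hypothetical, and SMIT itself remains the authority on what it accepts.

```python
import re

# ASCII letters, digits and underscores only; between 1 and 31 characters.
NETWORK_NAME = re.compile(r"[A-Za-z0-9_]{1,31}")


def valid_network_name(name):
    """True if name satisfies the stated character and length rules."""
    return NETWORK_NAME.fullmatch(name) is not None


for candidate in ("ether_1", "token ring", "x" * 32):
    print(repr(candidate[:12]), valid_network_name(candidate))
```

Using `fullmatch` rather than `match` ensures that a trailing invalid character or an over-long name is rejected instead of being silently truncated by the pattern.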
104 IBM Certific ation Stu dy Guid e AIX HAC MP Adapte r Iden tifier Enter the IP address in dotted decimal format or a device file name. IP address information is required for non-serial network adapters only if the node’ s address cannot be obtained from t he domain name server or the local /etc/hosts file (using the adapter IP label given).
HACMP Installation and Cluster Defini tion 10 5 Adding or Changing Ada pters after the Initi al Configuration If you want to change the information about an adapter after the initial configuration, use the Change/Show an Adapter screen. See the chapter on changing the cluster topology in the HACMP for AIX, V ersion 4.
106 IBM Certific ation Stu dy Guid e AIX HAC MP •S L I P • SP Switch •A T M It is highly unlikely that you will add or remove a network module. For information about changing a characteristic of a Network Module, such as the failure detection rate, see the chapter on changing the cluster topology in t he HACMP for AIX, V ersion 4.
configuration. If the cluster manager is active on some other cluster nodes but not on the local node, the synchronization operation is aborted.
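The way synchronization depends on where the Cluster Manager is running can be summarized as a small decision function. This is only a sketch. The function name sync_outcome and its yes/no arguments are hypothetical. The "aborted" case comes from the sentence above, and the "dynamic reconfiguration" case reflects the guide's description of synchronizing while the local Cluster Manager is running. The "static update" label, for the case where no Cluster Manager is active anywhere, is an assumption.

```shell
# Hypothetical summary of synchronization behavior (not an HACMP command).
# $1: "yes" if the Cluster Manager is active on the local node
# $2: "yes" if the Cluster Manager is active on any other node
sync_outcome() {
    local_active=$1
    remote_active=$2
    if [ "$local_active" = yes ]; then
        # Local Cluster Manager running: a dynamic reconfiguration is triggered.
        echo "dynamic reconfiguration"
    elif [ "$remote_active" = yes ]; then
        # Active elsewhere but not locally: the operation is aborted.
        echo "aborted"
    else
        # Assumption: with no Cluster Manager active, only the stored
        # configuration is updated.
        echo "static update"
    fi
}
```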
4.3 Defining Resources
The HACMP for AIX software provides a highly available environment by identifying a set of cluster-wide resources essential to uninterrupted processing, and then by defining relationships among nodes that ensure these resources are available to client processes.
4.3.1.1 Configuring Resources for Resource Groups
Once you have defined resource groups, you further configure them by assigning cluster resources to one resource group or another. You can configure resource groups even if a node is powered down.
These settings also have to be synchronized throughout the cluster. Therefore, Synchronize Cluster Resources has to be chosen from the corresponding SMIT menu. If the Cluster Manager is running on the local node, synchronizing cluster resources triggers a dynamic reconfiguration event (DARE, see 8.
as the path locations for start and stop scripts for the application. These scripts have to be in the same location on every service node. Just as for pre- and post-events, these scripts can be adapted to specific nodes.
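Because the start and stop scripts must exist at the same path on every service node, a quick pre-flight check on each node can catch a missing or non-executable script before a takeover does. The sketch below is an illustration only. The function name check_app_script is hypothetical, and the script paths you pass to it are whatever you defined for your application server.

```shell
# Hypothetical pre-flight check (not an HACMP command), to be run on each
# node that can host the application server.
# $1: full path of a start or stop script
check_app_script() {
    script=$1
    if [ ! -f "$script" ]; then
        echo "missing: $script"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: $script"
        return 2
    fi
    echo "ok: $script"
}
```

Running the same check with the same path on every service node gives a simple confirmation that the locations match.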
4.4.2 Initial Startup
At this point, the cluster is not yet started, so the cluster manager has to be started first.
For cascading resource groups, the failed node reacquires its resources once it is up and running again. So, you have to restart HACMP on it through smitty clstart and check the log file again, as well as the cluster's status.
Essentially, a snapshot saves all the ODM classes HACMP has generated during its configuration. It does not save user-customized scripts, such as start or stop scripts for an application server. However, the location and names of these scripts are in an HACMP ODM class, and are therefore saved.
Chapter 5. Cluster Customization
Within an HACMP for AIX cluster, there are several things that are customizable. The following paragraphs explain the customizing features for events, error notification, network modules and topology services.
acquire_service_addr
(If configured for IP address takeover.) Configures boot addresses to the corresponding service address, and starts TCP/IP servers and network daemons by running the telinit -a command.
event occurs only after a node_up_remote event has successfully completed.
Sequence of node_down Events
node_down
This event occurs when a node intentionally leaves the cluster or fails.
node_down_local_complete
Instructs the Cluster Manager to exit when the local node has left the cluster. This event occurs only after a node_down_local event has successfully completed.
node_down_remote_complete
Starts takeover application servers.
no actions since appropriate actions depend on the local network configuration.
5.1.1.3 Network Adapter Events
swap_adapter
This event occurs when the service adapter on a node fails.
reconfig_resource_complete
This event indicates that a cluster resource dynamic reconfiguration has completed.
For example, a file system cannot be unmounted because a process is running on it. You might then want to kill that process before unmounting the file system, so that the event script can complete.
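The recover-then-retry idea described above can be sketched as a small wrapper. This is an illustration of the pattern only, not how HACMP implements event recovery. The function name run_with_recovery is hypothetical, and the commands passed to it stand in for an event action (such as an unmount) and its recovery action (such as killing the process that holds the file system).

```shell
# Hypothetical wrapper illustrating recovery and retry (not HACMP code).
# $1: command to run (for example, a function that unmounts a file system)
# $2: recovery command to run after each failure
# $3: maximum number of retries
run_with_recovery() {
    cmd=$1
    recovery=$2
    retries=$3
    attempt=0
    while :; do
        # Try the action; stop as soon as it succeeds.
        if $cmd; then
            return 0
        fi
        # Give up once the retry budget is used.
        [ "$attempt" -lt "$retries" ] || return 1
        attempt=$((attempt + 1))
        # Run the recovery action, then loop to retry the command.
        $recovery
    done
}
```

With a retry count of 0, the command is attempted once and the recovery action is never run.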
Each time an error is logged in the system error log, the error notification daemon determines if the error log entry matches the selection criteria. If it does, an executable is run. This executable, called a notify method, can range from a simple command to a complex program.
The failure rate of networks varies, depending on their characteristics. For example, for an Ethernet, the normal failure detection rate is two keepalives per second; fast is about four per second; slow is about one per second.
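The keepalive rates quoted above translate directly into the interval between keepalives. The sketch below does that arithmetic for the Ethernet figures given in the text. The function name keepalive_interval_ms is hypothetical, and since the fast and slow rates are stated as approximate, the results are approximate too.

```shell
# Hypothetical helper: converts the Ethernet failure detection rate named in
# the text into the approximate interval between keepalives, in milliseconds.
keepalive_interval_ms() {
    case $1 in
        slow)   rate=1 ;;   # about one keepalive per second
        normal) rate=2 ;;   # two keepalives per second
        fast)   rate=4 ;;   # about four keepalives per second
        *)
            echo "unknown rate: $1" >&2
            return 1
            ;;
    esac
    echo $((1000 / rate))
}
```

A faster rate shortens the interval, which detects failures sooner at the cost of more keepalive traffic.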
To prevent problems with NFS file systems in an HACMP cluster, make sure that each shared volume group has the same major number on all nodes. The lvlstmajor command lists the free major numbers on a node.
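Choosing a major number that is free on every node amounts to intersecting the per-node lists of free numbers. The sketch below assumes the free numbers of two nodes have already been written to two files, one number per line. That file format is an assumption for illustration and is not the literal output of lvlstmajor. The function name lowest_common_major is hypothetical.

```shell
# Hypothetical helper: prints the lowest major number that appears in both
# files, or returns non-zero if the two lists have no number in common.
# $1, $2: files with one free major number per line (assumed format)
lowest_common_major() {
    awk 'NR == FNR { free[$1] = 1; next }
         ($1 in free) && (best == "" || $1 + 0 < best + 0) { best = $1 }
         END { if (best == "") exit 1; print best }' "$1" "$2"
}
```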
Figure 14. NFS Cross Mounts
When Node A fails, Node B uses the cl_nfskill utility to close open files in Node A:/afs, unmounts it, mounts it locally, and re-exports it to waiting clients.
• Ensure that node name and the service adapter label are the same on each node in the cluster or
• Alias the node name to the service adapter label in the /etc/hosts file.
######## Add for NFS Lock Removal (start) ########
######## Add for NFS Lock Removal (finish) ########
##############################################################
fi
/bin/rm -f /etc/sm.bak/$host
/bin/rm -f /etc/sm/$host
/bin/rm -f /etc/state
fi
######## Add for NFS Lock Removal (finish) ########
# Send a SIGKILL to all processes having open file
# descriptors within this logical volume to allow
# the unmount to succeed.
Chapter 6. Cluster Testing
Before you start to test the HACMP configuration, you need to guarantee that your cluster nodes are in a stable state.
6.1.2 System Parameters
• Type date on all nodes to check that all the nodes in the cluster are running with their clocks on the same time.
• Ensure that the number of user licenses has been correctly set (lslicense).
• Check that all interfaces communicate (ping <ip-address> or ping -R <ip-address>).
• List the arp table entries with arp -a.
• Check the status of the TCP/IP daemons (lssrc -g tcpip).
• Ensure that there are no bad entries in the /etc/hosts file, especially at the bottom of the file.
• Verify the cluster configuration by running /usr/sbin/cluster/diag/clconfig -v '-tr'.
• To show cluster configuration, run: /usr/sbin/cluster/utilities/cllscf.
• To show the clstrmgr version, type: snmpinfo -m dump -o /usr/sbin/cluster/hacmp.
• Use ifconfig to swap the service address back to the original service interface (ifconfig en1 down). This will cause the service IP address to fail over back to the service adapter on NodeF.
• Generate the switch error in the error log which is being monitored by HACMP Error Notification (for configuration see 2.
• Verify that all sharedvg file systems and paging spaces are accessible (df -k and lsps -a).
6.2.2 Node Failure / Reintegration
The following sections deal with issues of node failure and reintegration.
• Verify that failover has occurred (netstat -i and ping for networks, lsvg -o and vi of a test file for volume groups, and ps -U <appuid> for application processes).
• Power cycle NodeF.
• Monitor the cluster log files on NodeT.
• Disconnect the network cable from the appropriate service and all the standby interfaces at the same time (but not the Administrative SP Ethernet) on NodeF. This will cause HACMP to detect a network_down event.
• Reconnect hdisk0, close the casing, and turn the key to normal mode.
• Power on NodeF then verify that the rootvg logical volumes are no longer stale (lsvg -l rootvg).
• Monitor cluster logfiles on NodeT if HACMP has been customized to monitor 7133 disk failures.
• Since the 7133 disk is hot pluggable, remove a disk from drawer 1 associated with NodeF's shared volume group.
Chapter 7. Cluster Troubleshooting
Typically, a functioning HACMP cluster requires minimal intervention. If a problem occurs, however, diagnostic and recovery skills are essential.
For a more detailed description of the cluster log files consult Chapter 2 of the HACMP for AIX, Version 4.3: Troubleshooting Guide, SC23-4280.
7.2 config_too_long
If the cluster manager recognizes a state change in the cluster, it acts upon it by executing an event script.
hang. After a certain amount of time, by default 360 seconds, the cluster manager will issue a config_too_long message into the /tmp/hacmp.out file. The message issued looks like this:
The cluster has been in reconfiguration too long;Something may be wrong.
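A quick way to see whether this condition has occurred is to search the log for the message text. The sketch below counts matching lines. The function name count_config_too_long is hypothetical, the default path is the /tmp/hacmp.out file named above, and the search string is taken from the message as quoted here, so adjust it if the wording in your log differs.

```shell
# Hypothetical helper: prints how many lines of the log contain the
# config_too_long message text quoted in this guide.
# $1: log file to search (defaults to /tmp/hacmp.out)
count_config_too_long() {
    log=${1:-/tmp/hacmp.out}
    # grep -c prints 0 and returns non-zero when nothing matches,
    # so the exit status is deliberately ignored here.
    grep -c -F "reconfiguration too long" "$log" || true
}
```

A non-zero count is a prompt to look at the surrounding entries in the log for the event script that did not finish.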
7.3.1 Tuning the System Using I/O Pacing
Use I/O pacing to tune the system so that system resources are distributed more equitably during large disk writes.
7.3.4 Changing the Failure Detection Rate
Use the SMIT Change/Show a Cluster Network Module screen to change the failure detection rate for your network module only if enabling I/O pacing or extending the syncd frequency did not resolve deadman problems in your cluster.
and control messages so that the Cluster Manager has accurate information about the status of its partner.
7.6 User ID Problems
Within an HACMP cluster, you always have more than one node potentially offering the same service to a specific user or a specific user ID.
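Since the same user may be served by more than one node, it helps to confirm that each user name maps to the same numeric user ID everywhere. The sketch below compares two passwd-format files, for example copies of /etc/passwd gathered from two nodes. The function name uid_mismatches and the way the copies are collected are assumptions for illustration.

```shell
# Hypothetical helper: lists users that exist in both passwd-format files
# but have different numeric user IDs (field 3).
# $1, $2: passwd-format files taken from two cluster nodes
uid_mismatches() {
    awk -F: 'NR == FNR { uid[$1] = $3; next }
             ($1 in uid) && uid[$1] != $3 { print $1 ": " uid[$1] " vs " $3 }' "$1" "$2"
}
```

No output means every user present on both nodes has a matching user ID.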
• Go from the simple to the complex. Make the simple tests first. Do not try anything complex and complicated until you have ruled out the simple and obvious.
• Do not make more than one change at a time.
Chapter 8. Cluster Management and Administration
This chapter covers all aspects of monitoring and managing an existing HACMP cluster.
Consult the HACMP for AIX, Version 4.3: Troubleshooting Guide, SC23-4280, for help if you detect a problem with an HACMP cluster.
8.1.1 The clstat Command
HACMP for AIX provides the /usr/sbin/cluster/clstat command for monitoring a cluster and its components.
More details on how to configure HAView and on how to monitor your cluster with HAView can be found in Chapter 3, "Monitoring an HACMP cluster" in HACMP for AIX, Version 4.3: Administration Guide, SC23-4279.
8.1.3.5 /tmp/cm.log
Contains timestamped, formatted messages generated by HACMP for AIX clstrmgr activity. This file is typically used by IBM support personnel.
8.1.3.6 /tmp/cspoc.log
Contains timestamped, formatted messages generated by HACMP for AIX C-SPOC commands.
(C-SPOC) utility can be used to start and stop cluster services on all nodes in cluster environments. Starting cluster services refers to the process of starting the HACMP for AIX daemons that enable the coordination required between nodes in a cluster.
8.2.1.4 Cluster Information Program daemon (clinfo)
This daemon provides status information about the cluster to cluster nodes and clients and invokes the /usr/sbin/cluster/etc/clinfo.rc script in response to a cluster event.
are started in sequential order - not in parallel. The output of the command run on the remote node is returned to the originating node. Because the command is executed remotely, there can be a delay before the command output is returned.
node. Because the command is executed remotely, there can be a delay before the command output is returned.
prevents unpredictable behavior from corrupting the data on the shared disks. See the clexit.rc man page for additional information.
8.2.4 Starting and Stopping Cluster Services on Clients
Use the /usr/sbin/cluster/etc/rc.
8.3 Replacing Failed Components
From time to time, it will be necessary to perform hardware maintenance or upgrades on cluster components. Some replacements or upgrades can be performed while the cluster is operative, while others require planned downtime.
• The new adapter must be of the same type or a compatible type as the replaced adapter.
• When replacing or adding a SCSI adapter, remove the resistors for shared buses. Furthermore, set the SCSI ID of the adapter to a value different from 7.
4. Logically remove the disk from the system (rmdev -l hdiskX -d; rmdev -l pdiskY -d if an SSA disk) on all nodes.
5. Physically remove the failed disk and replace it with a new disk.
6. Add the disk to the ODM (mkdev or cfgmgr) on all nodes.
8.4 Changing Shared LVM Components
Changes to VG constructs are probably the most frequent kind of changes to be performed in a cluster.
When changing shared LVM components manually, you will usually need to run through the following procedure:
1. Stop HACMP on the node owning the shared volume group (sometimes a stop of the applications using the shared volume group may be sufficient).
Lazy Update has some limitations, which you need to consider when you rely on Lazy Update in general:
• If the first disk in a sharedvg has been replaced, the importvg command will fail as Lazy Update expects to be able to match the hdisk number for the first disk to a valid PVID in the ODM.
• Shared volume groups
  • List all volume groups in the cluster.
  • Import a volume group (with HACMP 4.3 only).
  • Extend a volume group (with HACMP 4.3 only).
  • Reduce a volume group (with HACMP 4.
To use the SMIT shortcuts to C-SPOC, type smit cl_lvm or smit cl_conlvm for concurrent volume groups. Concurrent volume groups must be varied on in concurrent mode to perform tasks.
To change the nodes associated with a given resource group, or to change the priorities assigned to the nodes in a resource group chain, you must redefine the resource group. You must also redefine the resource group if you add or change a resource assigned to the group.
• If the Cluster Manager is active on the local node, synchronization triggers a cluster-wide, dynamic reconfiguration event.
8.5.3.1 Resource Migration Types
Before performing a resource migration, decide if you will declare the migration sticky or non-sticky.
Sticky Resource Migration
A sticky migration permanently attaches a resource group to a specified node.
INACTIVE_TAKEOVER flag set to false and has not yet started because its primary node is down. In general, however, only rotating resource groups should be migrated in a non-sticky manner. Such migrations are one-time events and occur similar to normal rotating resource group flavors.
If you do not include a location specifier in the location field, the DARE Resource Migration utility performs a default migration, again making the resources available for reacquisition.
Note that you cannot add nodes to the resource group list with the DARE Resource Migration utility. This task is performed through SMIT.
Be aware that persistent sticky location markers are saved and restored in cluster snapshots. You can use the clfindres command to find out if sticky markers are present in a resource group.
5. Restart the HACMP for AIX software on the node using the smit clstart fastpath and verify that the node successfully joined the cluster.
6. Repeat Steps 1 through 5 on the remaining cluster nodes.
Figure 15 below shows the procedure:
Figure 15.
• Cluster nodes should be running the same HACMP maintenance levels. There might be incompatibilities between various maintenance levels of HACMP, so you must ensure that consistent levels are maintained across all cluster nodes.
8.7.1.1 How to do a split-mirror backup
This same procedure can be used with just one mirrored copy of a logical volume.
9. After the backup is complete and verified, unmount and delete the new file system and the logical volume you used for it.
10. Use the mklvcopy command to add back the logical volume copy you previously split off to the fslv logical volume.
they don't match, the user won't get anything done after a failover happened. So, the administrator has to keep definitions equal throughout the cluster. Fortunately, the C-SPOC utility, as of HACMP Version 4.
To add a user on one or more nodes in a cluster, you can either use the AIX mkuser command in a rsh to one cluster node after the other, or use the C-SPOC cl_mkuser command or the Add a User to the Cluster SMIT screen.
To remove a user account from one or more cluster nodes, you can either use the AIX rmuser command on one cluster node after the other, or use the C-SPOC cl_rmuser command or the C-SPOC Remove a User from the Cluster SMIT screen.
Chapter 9. Special RS/6000 SP Topics
This chapter will introduce you to some special topics that only apply if you are running HACMP on the SP system.
need to have the frame supervisors support dual tty lines in order to get both control workstations connected at the same time. Contact your IBM representative for the necessary hardware (see Figure 16).
The backup cws has to be installed with the same level of AIX and PSSP. Depending on the Kerberos configuration of the primary cws, the backup cws has to be configured.
ordinary HACMP cluster, as it is described in Chapter 7 of the HACMP for AIX, Version 4.3: Installation Guide, SC23-4278. Now the cluster environment has to be configured. Define a cluster ID and name for your HACWS cluster and define the two nodes to HACMP.
After that, identify the HACWS event scripts to HACMP by executing the /usr/sbin/hacws/spcw_addevents command, and verify the configuration with the /usr/sbin/hacws/hacws_verify command. You should also check the cabling from the backup cws with the /usr/sbin/hacws/spcw_verify_cabling command.
The following is simply a shortened description of how Kerberos works. For more details, see the redbook Inside the RS/6000 SP, SG24-5145.
allow the clients to get service tickets to be used with other servers without the need to give them the password every time they request services. So, if a user who has a ticket-granting ticket requests a kerberized service, he has to get a service ticket for it.
After setting the cluster's security settings to enhanced for all these nodes, you can verify that it is working as expected, for example, by running clverify, which goes out to the nodes and checks the consistency of files.
With reference to Figure 17 above, imagine two nodes, Node X and Node Y, running the same application. The nodes are connected by the switch and have locally-attached disks. On Node X's disk resides a volume group containing the raw logical volume lv_X.
The VSDs in this scenario are mapped to the raw logical volumes lv_X and lv_Y. Node X is a client of Node Y's VSD, and vice versa. Node X is also a direct client of its own VSD (lv_X), and Node Y is a direct client of VSD lv_Y.
impact of servicing a local I/O request through VSD relative to the normal VMM/LVM pathway is very small. IBM supports any IP network for VSD, but we recommend the switch for performance. VSD provides distributed data access, but not a locking mechanism to preserve data integrity.
operation that was in progress, as well as new I/O operations against rvsd_X, are suspended until failover is complete. When Node X is repaired and rebooted, RVSD switches the rvsd_X back to its primary, Node X.
9.4 SP Switch as an HACMP Network
One of the fascinating things with an RS/6000 SP is the switch network. It has developed over time; so, currently there are two types of switches at customer sites.
9.4.2 Eprimary Management
The SP switch has an internal primary backup concept, where the primary node, known as the Eprimary, is backed up automatically by a backup node.
In case this node was the Eprimary node on the switch network, and it is an SP switch, then the RS/6000 SP software would have chosen a new Eprimary independently from the HACMP software as well.
Chapter 10. HACMP Classic vs. HACMP/ES vs. HANFS
So, why would you prefer to install one version of HACMP instead of another? This chapter summarizes the differences between them, to give you an idea in which situation one or the other best matches your needs.
handling membership and event management by using heartbeats. On the SP, the original High Availability infrastructure was built on this technology, and HACMP/ES Version 4.3 is now another instance relying on it.
See Part 4 of HACMP for AIX, Version 4.3: Enhanced Scalability Installation and Administration Guide, SC23-4284, for more information on these services.
10.2.2 Enhanced Cluster Security
With HACMP Version 4.
10.4 Similarities and Differences
All three products have the basic structure in common. They all use the same concepts and structures. So, a cluster or a network, in the HACMP context, is the same, no matter what product is being used.
For switchless RS/6000 SP systems or SPs with the newer SP Switch, the decision will be based on a more functional level.
Appendix A. Special Notices
This publication is intended to help System Administrators, System Engineers and other System Professionals to pass the IBM HACMP Certification Exam.
been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.
Java and HotJava are trademarks of Sun Microsystems, Incorporated. Microsoft, Windows, Windows NT, and the Windows 95 logo are trademarks or registered trademarks of Microsoft Corporation. PC Direct is a trademark of Ziff Communications Company and is used by IBM Corporation under license.
Appendix B. Related Publications
The publications listed in this section are considered particularly suitable for a more detailed discussion of the topics covered in this redbook.
B.3 Other Publications
These publications are also relevant as additional sources of information:
• IBM RS/6000 SP: Planning, Volume 2, Control Work.
How to Get ITSO Redbooks
This section explains how both customers and IBM employees can find out about ITSO redbooks, CD-ROMs, workshops, and residencies. A form for ordering books and CD-ROMs is also provided.
How Customers Can Get ITSO Redbooks
Customers may request ITSO deliverables (redbooks, BookManager BOOKs, and CD-ROMs) and information about red.
IBM Redbook Order Form
Please send me the following:
We accept American Express, Diners, Eurocard, Master Card, and Visa. Payment by credit card not available in all countries.
List of Abbreviations
AIX: Advanced Interactive Executive
APA: All Points Addressable
APAR: Authorized Program Analysis Report. The description of a problem to be fixed by IBM defect support. This fix is delivered in a PTF (see below).
NETBIOS: Network Basic Input/Output System
NFS: Network File System
NIM: Network Interface Module (This is the definition of NIM in the HACMP context. NIM in the AIX 4.1 context stands for Network Installation Manager).