IBM SG24-5131-00 Laptop manual

Page 1

SG24-5131-00 International Technical Support Organization http://www.redbooks.ibm.com IBM Certification Study Guide AIX HACMP David Thiessen, Achim Rehor, Reinhard Zettler.

Page 3

IBM Certification Study Guide AIX HACMP May 1999 SG24-5131-00 International Technical Support Organization.

© Copyright International Business Machines Corporation 1999. All rights reserved. Note to U.S Government Users – Documentation related to restricted rights – Use, duplication or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.

Page 5

© Copyright IBM Corp. 1999
Contents
Figures . . . . . ix
Tables

Page 6

Chapter 3. Cluster Hardware and Software Preparation . . . . . 51
3.1 Cluster Node Setup . . . . . 51
3.1.1 Adapter Slot Placement

Page 7

5.1.3 Event Notification . . . . . 122
5.1.4 Event Recovery and Retry . . . . . 122
5.1.5 Notes on Customizing Event Processing

Page 8

8.1.1 The clstat Command . . . . . 152
8.1.2 Monitoring Clusters using HAView . . . . . 152
8.1.3 Cluster Log Files

Page 9

9.3 VSDs - RVSDs . . . . . 190
9.3.1 Virtual Shared Disk (VSDs) . . . . . 190
9.3.2 Recoverable Virtual Shared Disk

Page 11

Figures
1. Basic SSA Configuration . . . . . 17
2. Hot-Standby Configuration . . . . . 31
3.

Page 13

Tables
1. AIX Version 4 HACMP Installation and Implementation . . . . . 4
2. AIX Version 4 HACMP System Administration . . . . . 5
3. Hardware Requirements for the Different HACMP Versions

Page 15

Preface
The AIX and RS/6000 Certifications offered through the Professional Certification Program from IBM are designed to validate the skills required of technical professionals who work in the powerful and often complex environments of AIX and RS/6000.

Page 16

• AIX parameters that are affected by an HACMP installation, and their correct settings
• The cluster and resource configuration process, including.

Page 17

POWERparallel Systems area, known as the SP1 at that time. In 1997 he began working on HACMP as the Service Groups for HACMP and RS/6000 SP merged into one. He holds a diploma in Computer Science from the University of Frankfurt in Germany. This is his first redbook.

Page 19

Chapter 1. Certification Overview
This chapter provides an overview of the skill requirements for obtaining an IBM Certified Specialist - AIX HACMP certification. The following chapters are designed to provide a comprehensive review of specific topics that are essential for obtaining the certification.

Page 20

1.2 Certification Exam Objectives
The following objectives were used as a basis for what is required when the certification exam was developed. Some of these topics have been regrouped to provide better organization when discussed in this publication.

Page 21

• Create an application server.
• Set up Event Notification.
• Set up event notification and pre/post event scripts.
• Set up error notification.
• Post Configuration Activities.
• Configure a client notification and ARP update.

Page 22

1.3 Certification Education Courses
Courses and publications are offered to help you prepare for the certification tests. These courses are recommended, but not required, before taking a certification test.

Page 23

The following table outlines information about the next course.
Table 2. AIX Version 4 HACMP System Administration
Course Number: Q1150 (USA); AU50 (Worldwide)
Course D.

Page 25

Chapter 2. Cluster Planning
The area of cluster planning is a large one. Not only does it include planning for the types of hardware (CPUs, networks, disks) to be used in the cluster, but it also includes other aspects.

Page 26

RISC System/6000 models as nodes in an HACMP 4.1 for AIX, HACMP 4.2 for AIX, or HACMP 4.3 for AIX cluster.

Page 27

Much of the decision centers around the following areas:
• Processor capacity
• Application requirements
• Anticipated growth requirements
• I/O slot requirements
These paradigms are certainly not new ones, and are also important considerations when choosing a processor for a single-system environment.

Page 28

Your slot configuration must also allow for the disk I/O adapters you need to support the cluster’s shared disk (volume group) configuration.

Page 29

2.2 Cluster Networks
HACMP differentiates between two major types of networks: TCP/IP networks and non-TCP/IP networks. HACMP utilizes both of them for exchanging heartbeats. HACMP uses these heartbeats to diagnose failures in the cluster.

Page 30

• FDDI
• SP Switch
• SLIP
• SOCC
• Token-Ring
As an independent, layered component of AIX, the HACMP for AIX software works with most TCP/IP-based networks. HACMP for AIX has been tested with standard Ethernet interfaces (en*) but not with IEEE 802.

Page 31

Network types also differentiate themselves in the maximum distance they allow between adapters, and in the maximum number of adapters allowed on a physical network.
• Ethernet supports 10 and 100 Mbps currently, and supports hardware address swapping.

Page 32

• SP Switch is a high-speed packet switching network, running on the RS/6000 SP system only. It runs bidirectionally up to 80 MBps, which adds up to 160 MBps of capacity per adapter. This is node-to-node communication and can be done in parallel between every pair of nodes inside an SP.

Page 33

2.2.2.2 Special Considerations
As for TCP/IP networks, there are a number of restrictions on non-TCP/IP networks. These are explained for the three different types in more detail below.
Serial (RS232)
A serial (RS232) network needs at least one available serial port per cluster node.

Page 34

2 a PCI Multiport Async Card is required in an S7X model, no native ports
3 only one serial port available for customer use, i.

Page 35

SSA subsystems are built up from loops of adapters and disks. A simple example is shown in Figure 1.
Figure 1. Basic SSA Configuration
Here, a single adapter controls one SSA loop of eight disks. Data can be transferred around the loop, in either direction, at 20 MBps.

Page 36

• 7133 Serial Storage Architecture (SSA) Disk Subsystem Models 010, 500, 020, 600, D40 and T40. The 7133 models 010 and 500 were the first SSA products announced in 1995 with the revolutionary new Serial Storage Architecture.

Page 37

2.3.1.1 Disk Capacities
Table 8 lists the different SSA disks, and provides an overview of their characteristics.
Table 8. SSA Disks
2.3.1.2 Supported and Non-Supported Adapters
Table 9 lists the different SSA adapters and presents an overview of their characteristics.

Page 38

1 See 2.3.1.3, “Rules for SSA Loops” on page 20 for more information.
The following rules apply to SSA Adapters:
• You cannot have more than four adapters in a single system.

Page 39

• A maximum of 48 devices can be connected in a particular SSA loop.
• Only one pair of adapter connectors can be connected in a particular SSA loop.
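
The two loop rules above lend themselves to a mechanical check. The following Python sketch is illustrative only: the function name, the adapter names, and the way a loop is described are assumptions made for this example, and it reads the connector rule as applying per adapter. It is not an IBM tool.

```python
# Illustrative check of the two SSA loop rules quoted on this page.
MAX_DEVICES_PER_LOOP = 48  # "A maximum of 48 devices ... in a particular SSA loop"


def check_ssa_loop(device_count, connector_pairs_per_adapter):
    """Return a list of rule violations for one SSA loop.

    device_count is the number of devices cabled into the loop.
    connector_pairs_per_adapter maps a hypothetical adapter name to the
    number of that adapter's connector pairs cabled into this loop
    (assumption: the "one pair" rule is counted per adapter).
    """
    problems = []
    if device_count > MAX_DEVICES_PER_LOOP:
        problems.append(
            f"{device_count} devices exceed the limit of {MAX_DEVICES_PER_LOOP}"
        )
    for adapter, pairs in connector_pairs_per_adapter.items():
        if pairs > 1:
            problems.append(
                f"{adapter} has {pairs} connector pairs in the same loop"
            )
    return problems


# Example: a loop of eight disks on one adapter passes both checks.
print(check_ssa_loop(8, {"ssa0": 1}))
```

An empty list means neither rule was violated; the other adapter rules in this section are not modelled.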

Page 40

2.3.1.4 RAID vs. Non-RAID
RAID Technology
RAID is an acronym for Redundant Array of Independent Disks. Disk arrays are groups of disk drives that work together to achieve higher data-transfer and I/O rates than those provided by single large drives.

Page 41

RAID Levels 2 and 3
RAID 2 and RAID 3 are parallel process array mechanisms, where all drives in the array operate in unison. Similar to data striping, information to be written to disk is split into chunks (a fixed amount of data), and each chunk is written out to the same physical position on separate disks (in parallel).

Page 42

As with RAID 3, in the event of disk failure, the information can be rebuilt from the remaining drives. RAID level 5 array also uses parity information, though it is still important to make regular backups of the data in the array.
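
How parity allows a rebuild can be shown with a few bytes. The sketch below uses XOR parity, the usual scheme for parity-based RAID levels. The chunk contents and the helper name are made up for this example, and it ignores how a real array distributes parity across drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three made-up data chunks, one per (hypothetical) data drive.
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
parity = xor_blocks(data)

# Pretend the second drive failed: XOR the survivors with the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

The rebuild works only while a single drive is missing, which is one reason regular backups remain necessary.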

Page 43

• Array member drives and spares must be on same loop (cannot span A and B loops) on the adapter.
• You cannot boot (ipl) from a RAID.
2.3.1.5 Advantages
Because SSA allows SCSI-2 mapping, all functions associated with initiators, targets, and logical units are translatable.

Page 44

2.3.2 SCSI Disks
After the announcement of the 7133 SSA Disk Subsystems, the SCSI Disk subsystems became less common in HACMP clusters. However, the 7135 RAIDiant Array (Model 110 and 210) and other SCSI Subsystems are still in use at many customer sites.

Page 45

• Enhanced SCSI-2 Differential Fast/Wide Adapter/A (MCA, FC: 2412, Adapter Label: 4-C); not usable with 7135-110
• SCSI-2 Fast/Wide Differential Adapter (PCI, FC: 6209, Adapter Label: 4-B)
• DE Ultra SCSI Adapter (PCI, FC: 6207, Adapter Label: 4-L); not usable with 7135-110
2.

Page 46

withdraw the 7135 RAIDiant Systems from marketing because it is equally possible to configure RAID on the SSA Subsystems.

Page 47

• Cascading
• Rotating
• Concurrent
Each of these types describes a different set of relationships between nodes in the cluster, and a different set of behaviors upon nodes entering and leaving the cluster.

Page 48

reintegration, a node remains as a standby and does not take back any of the resources that it had initially served.
Concurrent Resource Groups: A concurrent resource group may be shared simultaneously by multiple nodes.

Page 49

Figure 2. Hot-Standby Configuration
In this configuration, there is one cascading resource group consisting of the four disks, hdisk1 to hdisk4, and their constituent volume groups and file systems. Node 1 has a priority of 1 for this resource group while node 2 has a priority of 2.

Page 50

the cluster becomes a standby node. You must choose a rotating standby configuration if you do not want a break in service during reintegration. Since takeover nodes continue providing services until they have to leave the cluster, you should configure your cluster with nodes of equal power.

Page 51

When a failed node reintegrates into the cluster, it takes back the resource group for which it has the highest priority. Therefore, even in this configuration, there is a break in service during reintegration.
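
The take-back behavior can be pictured as a simple selection rule: the resource group goes to the active node with the highest priority. The Python sketch below is a simplified model, not HACMP code. The function name and node names are hypothetical, and it assumes priority 1 is the highest.

```python
def cascading_owner(priorities, active_nodes):
    """Pick the active node with the highest priority (lowest number).

    priorities maps a node name to its priority for one resource group,
    where 1 is assumed to be the highest. Returns None when no node that
    participates in the group is active.
    """
    candidates = [node for node in priorities if node in active_nodes]
    if not candidates:
        return None
    return min(candidates, key=lambda node: priorities[node])


# Hypothetical two-node cluster.
priorities = {"node1": 1, "node2": 2}

print(cascading_owner(priorities, {"node1", "node2"}))  # normal operation
print(cascading_owner(priorities, {"node2"}))           # node1 has failed
print(cascading_owner(priorities, {"node1", "node2"}))  # node1 reintegrates
```

The third call hands the group back to the higher-priority node, which corresponds to the break in service during reintegration described above.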

Page 52

Here the resource groups are the same as the ones in the mutual takeover configuration. Also, similar to the previous configuration, nodes 1 and 2 each have priorities of 1 for one of the resource groups, A or B.

Page 53

• Design the network topology
• Define a network mask for your site
• Define IP addresses (adapter identifiers) for each node’s service and standby adapters.
• Define a boot address for each service adapter that can be taken over, if you are using IP address takeover or rotating resources.

Page 54

Dual Network
A dual-network setup has two separate networks for communication. Nodes are connected to two networks, and each node has two service adapters available to clients. If one network fails, the remaining network can still function, connecting nodes and providing resource access to clients.

Page 55

The following diagram shows a cluster consisting of two nodes and a client. A single public network connects the nodes and the client, and the nodes are linked point-to-point by a private high-speed SOCC connection that provides an alternate path for cluster and lock traffic should the public network fail.

Page 56

SLIP are considered public networks. Note that a SLIP line, however, does not provide client access.
Private
A private network provides communication between nodes only; it typically does not allow client access.

Page 57

until it assumes the shared IP address. Consequently, Clinfo makes known the boot address for this adapter. In an HACMP for AIX environment on the RS/6000 SP, the SP Ethernet adapters can be configured as service adapters but should not be configured for IP address takeover.

Page 58

service label (address) instead of the boot label. If the node should fail, a takeover node acquires the failed node’s service address on its standby adapter, thus making the failure transparent to clients using that specific service address.

Page 59

If you do not use Hardware Address Takeover, the ARP cache of clients can be updated by adding the clients’ IP addresses to the PING_CLIENT_LIST variable in the /usr/sbin/cluster/etc/clinfo.

Page 60

application on the takeover node when a fallover occurs. For more information about creating application server resources, see the HACMP for AIX, Version 4.

Page 61

2.5.3 Licensing Methods
Some vendors require a unique license for each processor that runs an application, which means that you must license-protect the application by incorporating processor-specific information into the application when it is installed.

Page 62

2.6 Customization Planning
The Cluster Manager’s ability to recognize a specific series of events and subevents permits a very flexible customization scheme. The HACMP for AIX software provides an event customization facility that allows you to tailor cluster event processing to your site.

Page 63

event to inform system administrators that traffic may have to be rerouted. Afterwards, you can use a network_up notification event to inform system administrators that traffic can again be serviced through the restored network.

Page 64

2.6.2.1 Single Point-of-Failure Hardware Component Recovery
As described in 2.2.1.2, “Special Network Considerations” on page 12, the HPS Switch network is one resource that has to be considered as a single point of failure.

Page 65

The above example screen will add a Notification Method to the ODM, so that upon appearance of the HPS_FAULT9_ER entry in the error log, the error notification daemon will trigger the execution of the /usr/sbin/cluster/utilities/clstop -grsy command, which shuts HACMP down gracefully with takeover.

Page 66

2.7 User ID Planning
The following sections describe various aspects of User ID Planning.
2.7.1 Cluster User and Group IDs
One of the basic tasks any system administrator must perform is setting up user accounts and groups.

Page 67

2.7.2 Cluster Passwords
While user and group management is very much facilitated with C-SPOC, the password information still has to be distributed by some other means.

Page 68

2.7.3.3 NFS-Mounted Home Directories on Shared Volumes
So, a combined approach is used in most cases. In order to make home directories a highly available resource, they have to be part of a resource group and placed on a shared volume.

Page 69

Chapter 3. Cluster Hardware and Software Preparation
This chapter covers the steps that are required to prepare the RS/6000 hardware and AIX software for the installation of HACMP and the configuration of the cluster.

Page 70

mirroring rootvg in order to avoid the impact of the failover time involved in a node failure. In terms of maximizing availability, this technique is just as valid for increasing the availability of a cluster as it is for increasing single-system availability.

Page 71

mirrored. If the dump devices are NOT the paging device, that dump logical volume will not be mirrored.
3.1.2.1 Procedure
The following steps assume the user has rootvg contained on hdisk0 and is attempting to mirror the rootvg to a new disk: hdisk1.

Page 72

“-m” option. You should consult documentation on the usage of the “-m” option for mklvcopy.
4.

Page 73

3.1.2.2 Necessary APAR Fixes
Table 11. Necessary APAR Fixes
To determine if either fix is installed on a machine, execute the following:
3.1.3 AIX Prerequisite LPPs
In order to install HACMP and HACMP/ES the AIX setup must be in a proper state.

Page 74

• nv6000.database.obj 4.1.0.0
• nv6000.Features.obj 4.1.2.0
• nv6000.client.obj 4.1.0.0
and for HAView 4.3
• xlC.rte 3.1.4.0
• nv6000.base.obj 4.1.2.0
• nv6000.database.obj 4.1.2.0
• nv6000.

Page 75

and low-water marks. If a process tries to write to a file at the high-water mark, it must wait until enough I/O operations have finished to make the low-water mark. Use the smit chgsys fastpath to set high- and low-water marks on the Change/Show Characteristics of the Operating System screen.
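
The interaction of the two marks is a form of hysteresis: a writer that hits the high-water mark stays blocked until pending I/O falls all the way to the low-water mark, not just below the high one. The toy model below illustrates that behavior. The class name, method names, and mark values are invented for this sketch and say nothing about how AIX implements pacing or which values to choose.

```python
class PacedFile:
    """Toy model of high-/low-water-mark write pacing for one file."""

    def __init__(self, high_water, low_water):
        if low_water >= high_water:
            raise ValueError("low-water mark must be below high-water mark")
        self.high_water = high_water
        self.low_water = low_water
        self.pending = 0       # writes queued but not yet completed
        self.blocked = False   # True while the writer must wait

    def can_write(self):
        """Return True if the writer may queue another write right now."""
        if self.pending >= self.high_water:
            self.blocked = True
        elif self.pending <= self.low_water:
            self.blocked = False
        return not self.blocked

    def write(self):
        """Queue one write; return False if the writer has to wait."""
        if not self.can_write():
            return False
        self.pending += 1
        return True

    def complete(self, count=1):
        """Mark `count` pending writes as finished."""
        self.pending = max(0, self.pending - count)


f = PacedFile(high_water=4, low_water=2)    # made-up values
results = [f.write() for _ in range(5)]     # the fifth write must wait
f.complete()                                # 3 pending: still above low-water
still_waiting = not f.write()
f.complete()                                # 2 pending: low-water reached
resumed = f.write()
print(results, still_waiting, resumed)
```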

Page 76

3.1.4.3 Editing the /etc/hosts File and Nameserver Configuration
Make sure all nodes can resolve all cluster addresses. See the chapter on planning TCP/IP networks (the section Using HACMP with NIS and DNS) in the HACMP for AIX, Version 4.

Page 77

3.1.4.5 Editing the /.rhosts File
Make sure that each node’s service adapters and boot addresses are listed in the /.rhosts file on each cluster node. Doing so allows the /usr/sbin/cluster/utilities/clruncmd command and the /usr/sbin/cluster/godm daemon to run.

Page 78

3.2 Network Connection and Testing
The following sections describe important aspects of network connection and testing.
3.2.1 TCP/IP Networks
Since there are several types of TCP/IP Networks available within HACMP, there are several different characteristics and some restrictions on them.

Page 79

Figure 9. Connecting Networks to a Hub
3.2.1.2 IP Addresses and Subnets
The design of the HACMP for AIX software specifies that:
• All client traff.

Page 80

To comply with these rules, pay careful attention to the IP addresses you assign to standby adapters. Standby adapters must be on a separate subnet from the service adapters, even though they are on the same physical network.
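
Whether two addresses fall into the same subnet depends on the netmask, so the rule is easy to check before the adapters are defined. The following Python sketch uses the standard `ipaddress` module. The helper names, the addresses, and the netmask are assumptions chosen for illustration.

```python
import ipaddress


def subnet_of(address, netmask):
    """Return the network that an address belongs to under a netmask."""
    return ipaddress.ip_network(f"{address}/{netmask}", strict=False)


def standby_on_separate_subnet(service_addr, standby_addr, netmask):
    """True when the standby address lies outside the service subnet."""
    return subnet_of(service_addr, netmask) != subnet_of(standby_addr, netmask)


# Hypothetical addresses for one node; the netmask is also an assumption.
NETMASK = "255.255.255.0"
print(standby_on_separate_subnet("192.168.10.1", "192.168.11.1", NETMASK))
print(standby_on_separate_subnet("192.168.10.1", "192.168.10.2", NETMASK))
```

The first pair satisfies the rule; the second pair shares a subnet and would need a different standby address.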

Page 81

• Scan the /tmp/hacmp.out file to confirm that the /etc/rc.net script has run successfully. Look for a zero exit status.
• If IP address takeover is enabled, confirm that the /etc/rc.net script has run and that the service adapter is on its service address and not on its boot address.

Page 82

TMSSA
Target-mode SSA is only supported with the SSA Multi-Initiator RAID Adapters (Feature #6215 and #6219), Microcode Level 1801 or later. You need at least HACMP Version 4.2.2 with APAR IX75718.

Page 83

3.2.2.4 Configuring Target Mode SSA
The node number on each system needs to be changed from the default of zero to a number.

Page 84

cat /etc/environment > /dev/tmssay.im on the corresponding node for writing. x and y correspond to the appropriate opposite nodenumber. You should see the first command hanging until the second command is issued, and then showing its output.

Page 85

For more information regarding adapters and cabling rules see 2.3.1, “SSA Disks” on page 16 or the following documents:
• 7133 SSA Disk Subsystems:.

Page 86

Adapter Definitions
By issuing the following command, you can check the correct adapter configuration. In order to work correctly, the adapter must be in the “Available” state:
The third column in the adapter device line shows the location of the adapter.

Page 87

SSA physical disks:
• Are configured as pdisk0, pdisk1,...,pdiskN.
• Have errors logged against them in the system error log.
• Support a character special file (/dev/pdisk0, /dev/pdisk1,...,/dev/pdiskN).

Page 88

Configuration Verification
This option enables you to display the relationships between physical (pdisk) and logical (hdisk) disks.
Format Disk
This option enables you to format SSA disk drives.
Certify Disk
This option enables you to test whether data on an SSA disk drive can be read correctly.

Page 89

12. Run cfgmgr to install the microcode to adapters.
13. To complete the device driver upgrade, you must now reboot your system.
14. To confirm that the upgrade was a success, type lscfg -vl ssaX where X is 0,1.

Page 90

18. To confirm that the upgrade was a success, type lscfg -vl pdiskX where X is 0,1... for all SSA disks. Check the ROS Level line to see that each disk has the appropriate microcode level (for the correct microcode level see the above mentioned web-site).

Page 91

3.3.2.1 Cabling
The following sections describe important information about cabling.
SCSI Adapters
An overview of SCSI adapters that can be used on a shared SCSI bus is given in 2.3.2.3, “Supported SCSI Adapters” on page 26.

Page 92

FC: 2902 or 9202 (2.4m), PN: 67G1260
- OR -
FC: 2905 or 9205 (4.5m), PN: 67G1261
- OR -
FC: 2912 or 9212 (12m), PN: 67G1262
- OR -
FC: 2914 or 9214 (14.

Page 93

FC: 2426 (0.94m), PN: 52G4234
• 16-Bit SCSI-2 Differential System-to-System Cable
FC: 2424 (0.6m), PN: 52G4291
- OR -
FC: 2425 (2.5m), PN: 52G4233
This cable is used only if there are more than two nodes attached to the same shared bus.

Page 94

(Figure: shared SCSI bus cabling diagram showing terminators (T) and cables #2416 (16-bit), #2424, and #2426.) Maximum total cable length: 25m.

Page 95

Figure 11. 7135-110 RAIDiant Arrays Connected on Two Shared 16-Bit SCSI Buses
3.3.2.3 Adapter SCSI ID and Termination change
The SCSI-2 Differential Controller is used to connect to 8-bit disk devices on a shared bus.

Page 96

SCSI-2 Differential Fast/Wide Adapter/A and Enhanced SCSI-2 Differential Fast/Wide Adapter/A) are shown in Figure 12 and Figure 13 respectively.
Figure 12. Termination on the SCSI-2 Differential Controller
Figure 13.

Page 97

The ID of an SCSI adapter, by default, is 7. Since each device on an SCSI bus must have a unique ID, the ID of at least one of the adapters on a shared SCSI bus has to be changed. The procedure to change the ID of an SCSI-2 Differential Controller is:
1.
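
Before touching any adapter settings, it can help to list the planned IDs for everything on the shared bus and look for clashes. The Python sketch below does only that bookkeeping. The device names and ID assignments are hypothetical, and the function is an illustration, not part of AIX or HACMP.

```python
from collections import Counter

DEFAULT_ADAPTER_ID = 7  # default adapter ID stated in the text


def duplicate_scsi_ids(ids_by_device):
    """Return the SCSI IDs that are assigned to more than one device."""
    counts = Counter(ids_by_device.values())
    return sorted(scsi_id for scsi_id, n in counts.items() if n > 1)


# Two nodes sharing one bus, both adapters still at the default ID.
planned = {
    "nodeA_adapter": DEFAULT_ADAPTER_ID,
    "nodeB_adapter": DEFAULT_ADAPTER_ID,
    "disk0": 0,
    "disk1": 1,
}
print(duplicate_scsi_ids(planned))

# Changing one adapter to an unused ID (6 is an arbitrary choice here)
# removes the conflict.
planned["nodeB_adapter"] = 6
print(duplicate_scsi_ids(planned))
```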

Page 98

4. Reboot the machine to bring the change into effect.
The same task can be executed from the command line by entering:
Also with this method, a reboot is required to bring the change into effect.

Page 99

The command line version of this is:
As in the case of the SCSI-2 Differential Controller, a system reboot is required to bring the change into effect.

Page 100

3.4.1 Creating Shared VGs
The following sections contain information about creating non-concurrent VGs and VGs for concurrent access.
3.4.1.1 Creating Non-Concurrent VGs
This section covers how to create a shared volume group on the source node using the SMIT interface.

Page 101

Creating a Concurrent Access Volume Group on Serial Disk Subsystems
To use a concurrent access volume group, defined on a serial disk subsystem such as an IBM 7133 disk subsystem, you must create it as a concurrent-capable volume group.

Page 102

Use the smit mkvg fastpath to create a shared volume group. Use the default field values unless your site has other requirements, or unless you are specifically instructed otherwise.
Table 15. smit mkvg Options (Concurrent, RAID)
3.

Page 103

the journaled file system log (jfslog) is a logical volume that requires a unique name in the cluster. To make sure that logical volumes have unique names, rename the logical volume associated with the file system and the corresponding jfslog logical volume.

Page 104

That is, you enter this command for each disk. In the resulting display, locate the line for the logical volume for which you just added copies. For copies placed on separate disks, the numbers in the logical partitions column and the physical partitions column should be equal.

Page 105

The TaskGuide uses a graphical interface to guide you through the steps of adding nodes to an existing volume group. For more information on the TaskGuide, see 3.4.6, “Alternate Method - TaskGuide” on page 90.

Page 106

3.4.4.4 Varying Off the Volume Group on the Destination Nodes
Use the varyoffvg command to deactivate the shared volume group so that it can be imported onto another destination node or activated as appropriate by the cluster event scripts.

Page 107

command succeeds. If exactly half the copies are available, as with two of four, quorum is not achieved and the varyonvg command fails.
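
The quorum rule amounts to a strict-majority test: more than half of the copies must be available, so exactly half is not enough. The short Python sketch below states that arithmetic. The function name is made up for this example, and it models only the counting, not the varyonvg command itself.

```python
def has_quorum(available_copies, total_copies):
    """Quorum requires strictly more than half of the copies."""
    return 2 * available_copies > total_copies


print(has_quorum(2, 4))  # exactly half, as in the two-of-four case above
print(has_quorum(3, 4))  # a strict majority
```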

Page 108

Forcing a Varyon
A volume group with quorum disabled and one or more physical volumes unavailable can be “forced” to vary on by using the -f flag with the varyonvg command.

Page 109

conflict with the cluster’s configuration. Online help panels give additional information to aid in each step.
3.4.6.1 TaskGuide Requirements
Before starting the TaskGuide, make sure:
• You have a configured HACMP cluster in place.

Page 111

Chapter 4. HACMP Installation and Cluster Definition
This chapter describes issues concerning the actual installation of HACMP Version 4.3 and the definition of a cluster and its resources. It concentrates on the HACMP part of the installation, so, we will assume AIX is already at the 4.

Page 112

cluster.base.server.utils HACMP Base Server Utilities
• cluster.cspoc
This component includes all of the commands and environment for the C-SPOC utility, the Cluster-Single Point Of Control feature.

Page 113

• cluster.vsm
The Visual Systems Management Fileset contains Icons and bitmaps for the graphical Management of HACMP Resources, as well as the xhacmpm command:
cluster.vsm HACMP X11 Dependent
• cluster.

Page 114

This fileset contains the Application Heart Beat Daemon. Oracle Parallel Server is an application that makes use of it:
cluster.hc.rte Application Heart Beat Daemon
The installation of CRM requires the following software:
bos.

Page 115

HACMP software to HACMP for AIX, Version 4.3. The comments on upgrading the Operating System are not included. If you are already running AIX 4.3, see the special note at the end of this section.
4.1.2.1 Upgrading from Version 4.

Page 116

Install HACMP 4.3 for AIX on Node A
5. After upgrading AIX and verifying that the disks are correctly configured, install the HACMP 4.3 for AIX software on Node A. For a short description of the filesets, please refer to 4.

Page 117

file on Node A using the following command:
/usr/sbin/cluster/utilities/cllsif -x >> /.rhosts
This command will append information to the /.rhosts file instead of overwriting it. Then, you can ftp this file to the other nodes as necessary.

Page 118

2. If you wish to save your cluster configuration, see the chapter Saving and Restoring Cluster Configurations in the HACMP for AIX, Version 4.3: Administration Guide, SC23-4279.
3. Commit your current HACMP for AIX software on all nodes.

Page 119

• The network modules
You define the cluster topology by entering information about each component into HACMP-specific ODM classes. You enter the HACMP ODM data by using the HACMP SMIT interface or the VSM utility xhacmpm.

Page 120

Adding or Changing a Node Name after the Initial Configuration
If you want to add or change a node name after the initial configuration, use the Change/Show Cluster Node Name screen. See the chapter on changing the cluster topology of the HACMP for AIX, Version 4.

Page 121

Network Name
Enter an ASCII text string that identifies the network. The network name can include alphabetic and numeric characters and underscores. Use no more than 31 characters. The network name is arbitrary, but must be used consistently for adapters on the same physical network.
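
The naming rule above can be captured in a single regular expression, which is handy when network names are generated by a script. The Python sketch below is an illustration only. The function name is invented, it treats "alphabetic" as the ASCII letters A-Z and a-z, and rejecting an empty name is an assumption the text does not state.

```python
import re

# Letters, digits, and underscores; between 1 and 31 characters.
_NETWORK_NAME = re.compile(r"[A-Za-z0-9_]{1,31}")


def is_valid_network_name(name):
    """Check a candidate network name against the stated naming rule."""
    return _NETWORK_NAME.fullmatch(name) is not None


for candidate in ("ether1", "token_ring_A", "bad-name", "x" * 32):
    print(candidate, is_valid_network_name(candidate))
```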

Page 122

Adapter Identifier

Enter the IP address in dotted decimal format or a device file name. IP address information is required for non-serial network adapters only if the node’s address cannot be obtained from the domain name server or the local /etc/hosts file (using the adapter IP label given).
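As an illustration of the two accepted forms of an adapter identifier, the sketch below classifies a string as a dotted decimal IP address or a device file name. It is not HACMP code: the function name is hypothetical, and treating any path under /dev/ as a device file name is an assumption made for the example.

```python
import ipaddress


def classify_adapter_identifier(value: str) -> str:
    """Classify an identifier as 'ip', 'device', or 'unknown'."""
    try:
        # Accepts only a well-formed dotted decimal IPv4 address.
        ipaddress.IPv4Address(value)
        return "ip"
    except ValueError:
        pass
    # Assumption: a device file name is a path below /dev/.
    if value.startswith("/dev/") and len(value) > len("/dev/"):
        return "device"
    return "unknown"
```

Anything classified as unknown, such as an IP label, would need to be resolved through the name server or /etc/hosts as described above.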

Page 123

Adding or Changing Adapters after the Initial Configuration

If you want to change the information about an adapter after the initial configuration, use the Change/Show an Adapter screen. See the chapter on changing the cluster topology in the HACMP for AIX, Version 4.

Page 124

• SLIP
• SP Switch
• ATM

It is highly unlikely that you will add or remove a network module. For information about changing a characteristic of a Network Module, such as the failure detection rate, see the chapter on changing the cluster topology in the HACMP for AIX, Version 4.

Page 125

configuration. If the cluster manager is active on some other cluster nodes but not on the local node, the synchronization operation is aborted.

Page 126

4.3 Defining Resources

The HACMP for AIX software provides a highly available environment by identifying a set of cluster-wide resources essential to uninterrupted processing, and then by defining relationships among nodes that ensure these resources are available to client processes.

Page 127

4.3.1.1 Configuring Resources for Resource Groups

Once you have defined resource groups, you further configure them by assigning cluster resources to one resource group or another. You can configure resource groups even if a node is powered down.

Page 128

These settings also have to be synchronized throughout the cluster. Therefore, Synchronize Cluster Resources has to be chosen from the corresponding SMIT menu. If the Cluster Manager is running on the local node, synchronizing cluster resources triggers a dynamic reconfiguration event (DARE, see 8.

Page 129

as the path locations for start and stop scripts for the application. These scripts have to be in the same location on every service node. Just as for pre- and post-events, these scripts can be adapted to specific nodes.

Page 130

4.4.2 Initial Startup

At this point, the cluster is not yet started, so the cluster manager has to be started first.

Page 131

For cascading resource groups, the failed node reacquires its resources once it is up and running again. So, you have to restart HACMP on it through smitty clstart and check the log file again, as well as the cluster's status.

Page 132

Essentially, a snapshot saves all the ODM classes HACMP has generated during its configuration. It does not save user-customized scripts, such as start or stop scripts for an application server. However, the location and names of these scripts are in an HACMP ODM class, and are therefore saved.

Page 133


Page 134


Page 135

Chapter 5. Cluster Customization

Within an HACMP for AIX cluster, several things can be customized. The following paragraphs explain the customizing features for events, error notification, network modules and topology services.

Page 136

acquire_service_addr (If configured for IP address takeover.) Configures boot addresses to the corresponding service address, and starts TCP/IP servers and network daemons by running the telinit -a command.

Page 137

event occurs only after a node_up_remote event has successfully completed.

Sequence of node_down Events

node_down This event occurs when a node intentionally leaves the cluster or fails.

Page 138

node_down_local_complete Instructs the Cluster Manager to exit when the local node has left the cluster. This event occurs only after a node_down_local event has successfully completed.

node_down_remote_complete Starts takeover application servers.

Page 139

no actions since appropriate actions depend on the local network configuration.

5.1.1.3 Network Adapter Events

swap_adapter This event occurs when the service adapter on a node fails.

Page 140

reconfig_resource_complete This event indicates that a cluster resource dynamic reconfiguration has completed.

Page 141

For example, a file system cannot be unmounted because a process is running on it. You might then want to kill that process before unmounting the file system, so that the event script can complete.

Page 142

Each time an error is logged in the system error log, the error notification daemon determines if the error log entry matches the selection criteria. If it does, an executable is run. This executable, called a notify method, can range from a simple command to a complex program.
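The match-then-run flow described above can be modeled in a few lines. This is only a conceptual sketch under stated assumptions: the NotifyRule class, its label and resource fields, and the dictionary form of a log entry are all hypothetical stand-ins for the real selection criteria, and the notify method is an ordinary Python callable rather than an executable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class NotifyRule:
    """Hypothetical selection criteria plus the method to run on a match."""
    label: Optional[str]      # None means "match any label"
    resource: Optional[str]   # None means "match any resource"
    method: Callable[[Dict[str, str]], str]


def rule_matches(rule: NotifyRule, entry: Dict[str, str]) -> bool:
    """Return True if the log entry satisfies every criterion that is set."""
    if rule.label is not None and entry.get("label") != rule.label:
        return False
    if rule.resource is not None and entry.get("resource") != rule.resource:
        return False
    return True


def run_notify_methods(rules: List[NotifyRule], entry: Dict[str, str]) -> List[str]:
    """Run the notify method of each matching rule and collect the results."""
    return [rule.method(entry) for rule in rules if rule_matches(rule, entry)]
```

A rule that leaves a criterion unset matches every entry for that field, which mirrors the idea that a notify method can be as broad or as narrow as its selection criteria.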

Page 143

The failure rate of networks varies, depending on their characteristics. For example, for an Ethernet, the normal failure detection rate is two keepalives per second; fast is about four per second; slow is about one per second.
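To make the Ethernet figures above concrete, the sketch below converts each keepalive rate into the interval between keepalives. The rates come from the text (where fast and slow are approximate); the dictionary and function names are hypothetical.

```python
# Keepalives per second for an Ethernet, as quoted in the text.
# "fast" and "slow" are approximate values there.
ETHERNET_KEEPALIVES_PER_SECOND = {"slow": 1.0, "normal": 2.0, "fast": 4.0}


def keepalive_interval(rate: str) -> float:
    """Return the time in seconds between two keepalives for a rate setting."""
    return 1.0 / ETHERNET_KEEPALIVES_PER_SECOND[rate]
```

At the normal setting a keepalive is sent every half second, so a faster setting shortens the time before a missing heartbeat can be noticed, at the cost of more network traffic.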

Page 144

To prevent problems with NFS file systems in an HACMP cluster, make sure that each shared volume group has the same major number on all nodes. The lvlstmajor command lists the free major numbers on a node.
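Choosing a major number that is free on every node amounts to intersecting the per-node lists. The sketch below assumes the free major numbers have already been collected from each node (for example, by reading the lvlstmajor output by hand) and are supplied as sets of integers; the function name and node names are hypothetical.

```python
from typing import Dict, List, Set


def common_free_majors(free_by_node: Dict[str, Set[int]]) -> List[int]:
    """Return, in ascending order, the major numbers free on every node."""
    if not free_by_node:
        return []
    # Intersect the free sets of all nodes.
    common = set.intersection(*free_by_node.values())
    return sorted(common)
```

Any value in the result could be used as the common major number for a shared volume group; an empty result means no single number is free on all nodes.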

Page 145

Figure 14. NFS Cross Mounts

When Node A fails, Node B uses the cl_nfskill utility to close open files in Node A:/afs, unmounts it, mounts it locally, and re-exports it to waiting clients.

Page 146

• Ensure that node name and the service adapter label are the same on each node in the cluster, or
• Alias the node name to the service adapter label in the /etc/hosts file.

Page 147

######## Add for NFS Lock Removal (start) ########
######## Add for NFS Lock Removal (finish) ########
################################################################

Page 148

fi
/bin/rm -f /etc/sm.bak/$host
/bin/rm -f /etc/sm/$host
/bin/rm -f /etc/state
fi
######## Add for NFS Lock Removal (finish) ########
# Send a SIGKILL to all processes having open file
# descriptors within this logical volume to allow
# the unmount to succeed.

Page 149

Page 150

6.1.2 System Parameters

• Type date on all nodes to check that all the nodes in the cluster are running with their clocks on the same time.
• Ensure that the number of user licenses has been correctly set (lslicense).
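The clock check in the first bullet can be expressed as a comparison of timestamps gathered from each node. The following sketch assumes the times have already been collected as seconds since the epoch; the function names and the 60-second default tolerance are assumptions made for illustration, not values from the text.

```python
from typing import Dict


def max_clock_skew(epoch_by_node: Dict[str, float]) -> float:
    """Return the largest difference in seconds between any two node clocks."""
    times = list(epoch_by_node.values())
    return max(times) - min(times)


def clocks_agree(epoch_by_node: Dict[str, float], tolerance_seconds: float = 60.0) -> bool:
    """Return True if all clocks lie within the (assumed) tolerance."""
    return max_clock_skew(epoch_by_node) <= tolerance_seconds
```

Because the readings are taken one node after another, a small skew is expected even when the clocks are synchronized, which is why a tolerance is used rather than an exact match.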

Page 151

• Check that all interfaces communicate (ping <ip-address> or ping -R <ip-address>).
• List the arp table entries with arp -a.
• Check the status of the TCP/IP daemons (lssrc -g tcpip).
• Ensure that there are no bad entries in the /etc/hosts file, especially at the bottom of the file.

Page 152

• Verify the cluster configuration by running /usr/sbin/cluster/diag/clconfig -v '-tr'.
• To show cluster configuration, run: /usr/sbin/cluster/utilities/cllscf.
• To show the clstrmgr version, type: snmpinfo -m dump -o /usr/sbin/cluster/hacmp.

Page 153

• Use ifconfig to swap the service address back to the original service interface (ifconfig en1 down). This will cause the service IP address to fail over back to the service adapter on NodeF.

Page 154

• Generate the switch error in the error log which is being monitored by HACMP Error Notification (for configuration see 2.

Page 155

• Verify that all sharedvg file systems and paging spaces are accessible (df -k and lsps -a).

6.2.2 Node Failure / Reintegration

The following sections deal with issues of node failure and reintegration.

Page 156

• Verify that failover has occurred (netstat -i and ping for networks, lsvg -o and vi of a test file for volume groups, and ps -U <appuid> for application processes).
• Power cycle NodeF.

Page 157

• Monitor the cluster log files on NodeT.
• Disconnect the network cable from the appropriate service and all the standby interfaces at the same time (but not the Administrative SP Ethernet) on NodeF. This will cause HACMP to detect a network_down event.

Page 158

• Reconnect hdisk0, close the casing, and turn the key to normal mode.
• Power on NodeF then verify that the rootvg logical volumes are no longer stale (lsvg -l rootvg).

Page 159

• Monitor cluster logfiles on NodeT if HACMP has been customized to monitor 7133 disk failures.
• Since the 7133 disk is hot pluggable, remove a disk from drawer 1 associated with NodeF's shared volume group.

Page 160


Page 161

Chapter 7. Cluster Troubleshooting

Typically, a functioning HACMP cluster requires minimal intervention. If a problem occurs, however, diagnostic and recovery skills are essential.

Page 162

For a more detailed description of the cluster log files, consult Chapter 2 of the HACMP for AIX, Version 4.3: Troubleshooting Guide, SC23-4280.

7.2 config_too_long

If the cluster manager recognizes a state change in the cluster, it acts upon it by executing an event script.

Page 163

hang. After a certain amount of time, by default 360 seconds, the cluster manager will issue a config_too_long message into the /tmp/hacmp.out file. The message issued looks like this:

The cluster has been in reconfiguration too long;Something may be wrong.
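Spotting this message in a long log is easier with a small filter. The sketch below searches log text for the phrase quoted above and reports the line numbers where it occurs. The function name is hypothetical, and the log content is passed in as a string so that the example does not depend on a real /tmp/hacmp.out file.

```python
from typing import List

# Phrase taken from the message quoted in the text.
CONFIG_TOO_LONG_MARKER = "has been in reconfiguration too long"


def find_config_too_long(log_text: str) -> List[int]:
    """Return the 1-based line numbers that contain the config_too_long message."""
    return [
        number
        for number, line in enumerate(log_text.splitlines(), start=1)
        if CONFIG_TOO_LONG_MARKER in line
    ]
```

On a cluster node, the text could be obtained by reading /tmp/hacmp.out; the lines just before each hit are usually the place to look for the event script that did not finish.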

Page 164

7.3.1 Tuning the System Using I/O Pacing

Use I/O pacing to tune the system so that system resources are distributed more equitably during large disk writes.

Page 165

7.3.4 Changing the Failure Detection Rate

Use the SMIT Change/Show a Cluster Network Module screen to change the failure detection rate for your network module only if enabling I/O pacing or extending the syncd frequency did not resolve deadman problems in your cluster.

Page 166

and control messages so that the Cluster Manager has accurate information about the status of its partner.

Page 167

7.6 User ID Problems

Within an HACMP cluster, you always have more than one node potentially offering the same service to a specific user or a specific user ID.
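Since several nodes may serve the same user, a quick consistency check of user IDs across nodes is useful. The sketch below compares passwd-style text (name:password:uid:...) from each node and reports users whose numeric IDs differ. The function names are hypothetical, and the per-node text is assumed to have been collected beforehand.

```python
from typing import Dict


def parse_uids(passwd_text: str) -> Dict[str, int]:
    """Map user name to numeric UID from passwd-style lines."""
    uids: Dict[str, int] = {}
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3:
            continue  # skip malformed lines
        uids[fields[0]] = int(fields[2])
    return uids


def uid_mismatches(passwd_by_node: Dict[str, str]) -> Dict[str, Dict[str, int]]:
    """Return users whose UID differs between nodes, with the UID seen on each node."""
    seen: Dict[str, Dict[str, int]] = {}
    for node, text in passwd_by_node.items():
        for user, uid in parse_uids(text).items():
            seen.setdefault(user, {})[node] = uid
    return {
        user: by_node
        for user, by_node in seen.items()
        if len(set(by_node.values())) > 1
    }
```

Note that this sketch only flags differing IDs; a user defined on one node but missing on another is not reported and would need a separate check.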

Page 168

• Go from the simple to the complex. Make the simple tests first. Do not try anything complex until you have ruled out the simple and obvious.
• Do not make more than one change at a time.

Page 169

Page 170

Consult the HACMP for AIX, Version 4.3: Troubleshooting Guide, SC23-4280, for help if you detect a problem with an HACMP cluster.

8.1.1 The clstat Command

HACMP for AIX provides the /usr/sbin/cluster/clstat command for monitoring a cluster and its components.

Page 171

More details on how to configure HAView and on how to monitor your cluster with HAView can be found in Chapter 3, “Monitoring an HACMP cluster” in HACMP for AIX, Version 4.3: Administration Guide, SC23-4279.

Page 172

8.1.3.5 /tmp/cm.log

Contains timestamped, formatted messages generated by HACMP for AIX clstrmgr activity. This file is typically used by IBM support personnel.

8.1.3.6 /tmp/cspoc.log

Contains timestamped, formatted messages generated by HACMP for AIX C-SPOC commands.

Page 173

(C-SPOC) utility can be used to start and stop cluster services on all nodes in cluster environments. Starting cluster services refers to the process of starting the HACMP for AIX daemons that enable the coordination required between nodes in a cluster.

Page 174

8.2.1.4 Cluster Information Program daemon (clinfo)

This daemon provides status information about the cluster to cluster nodes and clients and invokes the /usr/sbin/cluster/etc/clinfo.rc script in response to a cluster event.

Page 175

are started in sequential order, not in parallel. The output of the command run on the remote node is returned to the originating node. Because the command is executed remotely, there can be a delay before the command output is returned.

Page 176

node. Because the command is executed remotely, there can be a delay before the command output is returned.

Page 177

prevents unpredictable behavior from corrupting the data on the shared disks. See the clexit.rc man page for additional information.

8.2.4 Starting and Stopping Cluster Services on Clients

Use the /usr/sbin/cluster/etc/rc.

Page 178

8.3 Replacing Failed Components

From time to time, it will be necessary to perform hardware maintenance or upgrades on cluster components. Some replacements or upgrades can be performed while the cluster is operative, while others require planned downtime.

Page 179

• The new adapter must be of the same type or a compatible type as the replaced adapter.
• When replacing or adding a SCSI adapter, remove the resistors for shared buses. Furthermore, set the SCSI ID of the adapter to a value other than 7.

Page 180

4. Logically remove the disk from the system (rmdev -l hdiskX -d; rmdev -l pdiskY -d if an SSA disk) on all nodes.
5. Physically remove the failed disk and replace it with a new disk.
6. Add the disk to the ODM (mkdev or cfgmgr) on all nodes.

Page 181

8.4 Changing Shared LVM Components

Changes to VG constructs are probably the most frequent kind of changes to be performed in a cluster.

Page 182

When changing shared LVM components manually, you will usually need to run through the following procedure:

1. Stop HACMP on the node owning the shared volume group (sometimes a stop of the applications using the shared volume group may be sufficient).

Page 183

Lazy Update has some limitations, which you need to consider when you rely on Lazy Update in general:

• If the first disk in a sharedvg has been replaced, the importvg command will fail as Lazy Update expects to be able to match the hdisk number for the first disk to a valid PVID in the ODM.

Page 184

• Shared volume groups
  • List all volume groups in the cluster.
  • Import a volume group (with HACMP 4.3 only).
  • Extend a volume group (with HACMP 4.3 only).
  • Reduce a volume group (with HACMP 4.

Page 185

To use the SMIT shortcuts to C-SPOC, type smit cl_lvm or smit cl_conlvm for concurrent volume groups. Concurrent volume groups must be varied on in concurrent mode to perform tasks.

Page 186

To change the nodes associated with a given resource group, or to change the priorities assigned to the nodes in a resource group chain, you must redefine the resource group. You must also redefine the resource group if you add or change a resource assigned to the group.

Page 187

• If the Cluster Manager is active on the local node, synchronization triggers a cluster-wide, dynamic reconfiguration event.

Page 188

8.5.3.1 Resource Migration Types

Before performing a resource migration, decide if you will declare the migration sticky or non-sticky.

Sticky Resource Migration

A sticky migration permanently attaches a resource group to a specified node.

Page 189

INACTIVE_TAKEOVER flag set to false and has not yet started because its primary node is down. In general, however, only rotating resource groups should be migrated in a non-sticky manner. Such migrations are one-time events and occur similar to normal rotating resource group flavors.

Page 190

If you do not include a location specifier in the location field, the DARE Resource Migration utility performs a default migration, again making the resources available for reacquisition.

Page 191

Note that you cannot add nodes to the resource group list with the DARE Resource Migration utility. This task is performed through SMIT.

Page 192

Be aware that persistent sticky location markers are saved and restored in cluster snapshots. You can use the clfindres command to find out if sticky markers are present in a resource group.

Page 193

5. Restart the HACMP for AIX software on the node using the smit clstart fastpath and verify that the node successfully joined the cluster.
6. Repeat Steps 1 through 5 on the remaining cluster nodes.

Figure 15 below shows the procedure:

Figure 15.

Page 194

• Cluster nodes should be running the same HACMP maintenance levels. There might be incompatibilities between various maintenance levels of HACMP, so you must ensure that consistent levels are maintained across all cluster nodes.

Page 195

8.7.1.1 How to do a split-mirror backup

This same procedure can be used with just one mirrored copy of a logical volume.

Page 196

9. After the backup is complete and verified, unmount and delete the new file system and the logical volume you used for it.
10. Use the mklvcopy command to add back the logical volume copy you previously split off to the fslv logical volume.

Page 197

they don’t match, the user won’t get anything done after a failover has happened. So, the administrator has to keep definitions equal throughout the cluster. Fortunately, the C-SPOC utility, as of HACMP Version 4.

Page 198

To add a user on one or more nodes in a cluster, you can either use the AIX mkuser command in a rsh to one cluster node after the other, or use the C-SPOC cl_mkuser command or the Add a User to the Cluster SMIT screen.

Page 199

To remove a user account from one or more cluster nodes, you can either use the AIX rmuser command on one cluster node after the other, or use the C-SPOC cl_rmuser command or the C-SPOC Remove a User from the Cluster SMIT screen.

Page 200


Page 201

Page 202

need to have the frame supervisors support dual tty lines in order to get both control workstations connected at the same time. Contact your IBM representative for the necessary hardware (see Figure 16 on page 184).

Page 203

The backup cws has to be installed with the same level of AIX and PSSP. Depending on the Kerberos configuration of the primary cws, the backup cws has to be configured.

Page 204

ordinary HACMP cluster, as it is described in Chapter 7 of the HACMP for AIX, Version 4.3: Installation Guide, SC23-4278. Now the cluster environment has to be configured. Define a cluster ID and name for your HACWS cluster and define the two nodes to HACMP.

Page 205

After that, identify the HACWS event scripts to HACMP by executing the /usr/sbin/hacws/spcw_addevents command, and verify the configuration with the /usr/sbin/hacws/hacws_verify command. You should also check the cabling from the backup cws with the /usr/sbin/hacws/spcw_verify_cabling command.

Page 206

The following is simply a shortened description of how Kerberos works. For more details, see the redbook Inside the RS/6000 SP, SG24-5145.

Page 207

allow the clients to get service tickets to be used with other servers without the need to give them the password every time they request services. So, given that a user has a ticket-granting ticket, if the user requests a kerberized service, he has to get a service ticket for it.

Page 208

After setting the cluster’s security settings to enhanced for all these nodes, you can verify that it is working as expected, for example, by running clverify, which goes out to the nodes and checks the consistency of files.

Page 209

With reference to Figure 17 above, imagine two nodes, Node X and Node Y, running the same application. The nodes are connected by the switch and have locally-attached disks. On Node X’s disk resides a volume group containing the raw logical volume lv_X.

Page 210

The VSDs in this scenario are mapped to the raw logical volumes lv_X and lv_Y. Node X is a client of Node Y’s VSD, and vice versa. Node X is also a direct client of its own VSD (lv_X), and Node Y is a direct client of VSD lv_Y.

Page 211

impact of servicing a local I/O request through VSD relative to the normal VMM/LVM pathway is very small. IBM supports any IP network for VSD, but we recommend the switch for performance. VSD provides distributed data access, but not a locking mechanism to preserve data integrity.

Page 212

operation that was in progress, as well as new I/O operations against rvsd_X, are suspended until failover is complete. When Node X is repaired and rebooted, RVSD switches the rvsd_X back to its primary, Node X.

Page 213

9.4 SP Switch as an HACMP Network

One of the fascinating things with an RS/6000 SP is the switch network. It has developed over time; so, currently there are two types of switches at customer sites.

Page 214

9.4.2 Eprimary Management

The SP switch has an internal primary backup concept, where the primary node, known as the Eprimary, is backed up automatically by a backup node.

Page 215

In case this node was the Eprimary node on the switch network, and it is an SP switch, then the RS/6000 SP software would have chosen a new Eprimary independently from the HACMP software as well.

Page 216


Page 217

Chapter 10. HACMP Classic vs. HACMP/ES vs. HANFS

So, why would you prefer to install one version of HACMP instead of another? This chapter summarizes the differences between them, to give you an idea of which one best matches your needs in a given situation.

Page 218

handling membership and event management by using heartbeats. On the SP, the original High Availability infrastructure was built on this technology, and HACMP/ES Version 4.3 is now another instance relying on it.

Page 219

See Part 4 of HACMP for AIX, Version 4.3: Enhanced Scalability Installation and Administration Guide, SC23-4284, for more information on these services.

10.2.2 Enhanced Cluster Security

With HACMP Version 4.

Page 220

10.4 Similarities and Differences

All three products have the basic structure in common. They all use the same concepts and structures. So, a cluster or a network, in the HACMP context, is the same, no matter what product is being used.

Page 221

For switchless RS/6000 SP systems or SPs with the newer SP Switch, the decision will be based on a more functional level.

Page 222


Page 223

Appendix A. Special Notices

This publication is intended to help System Administrators, System Engineers and other System Professionals to pass the IBM HACMP Certification Exam.

Page 224

been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.

Page 225

Java and HotJava are trademarks of Sun Microsystems, Incorporated. Microsoft, Windows, Windows NT, and the Windows 95 logo are trademarks or registered trademarks of Microsoft Corporation. PC Direct is a trademark of Ziff Communications Company and is used by IBM Corporation under license.

Page 226


Page 227

Appendix B. Related Publications

The publications listed in this section are considered particularly suitable for a more detailed discussion of the topics covered in this redbook.

Page 228

B.3 Other Publications

These publications are also relevant as additional sources of information:

• IBM RS/6000 SP: Planning, Volume 2, Control Work.

Page 229

How to Get ITSO Redbooks

This section explains how both customers and IBM employees can find out about ITSO redbooks, CD-ROMs, workshops, and residencies. A form for ordering books and CD-ROMs is also provided.

Page 230

How Customers Can Get ITSO Redbooks

Customers may request ITSO deliverables (redbooks, BookManager BOOKs, and CD-ROMs) and information about red.

Page 231

IBM Redbook Order Form

Please send me the following: We accept American Express, Diners, Eurocard, MasterCard, and Visa. Payment by credit card not available in all countries.

Page 232


Page 233

List of Abbreviations

AIX Advanced Interactive Executive
APA All Points Addressable
APAR Authorized Program Analysis Report. The description of a problem to be fixed by IBM defect support. This fix is delivered in a PTF (see below).

Page 234

NETBIOS Network Basic Input/Output System
NFS Network File System
NIM Network Interface Module (This is the definition of NIM in the HACMP context. NIM in the AIX 4.1 context stands for Network Installation Manager).
