Instruction/maintenance manual for the IBM RS/6000 SP
RS/6000 SP: SP Switch2 Service Guide GA22-7444-03 IBM.
Note! Before using this information and the product it supports, read the information in “Safety and environmental notices” on page ix and “Notices” on page A-1.
Contents
Figures ... v
Tables ... vii
Safety and environmental notices ... ix
Safety notices (in English) ... ix
Danger notices ...
Removing and restoring switch resources ... 3-9
Removing an SP Switch2 from the active configuration ... 3-9
Restoring an SP Switch2 to the active configuration ... 3-9
Using Perspectives to fence and unfence nodes attached to the switch ...
Figures
1-1. SP Switch2 Assembly High-Level Diagram ... 1-3
1-2. SP Switch2 Wrap Plugs ... 1-5
1-3. SP Switch2 Chassis Assembly ... 1-11
2-1. Front view of frame locations ...
vi RS/6000 SP: SP Switch2 Service Guide.
Tables
1-1. Switch Problem Diagnostics ... 1-5
1-2. SP Switch2 supervisor LED indications ... 1-6
1-3. Perspectives SP Switch2 status indicators ... 1-9
1-4. Fan Failure Diagnostics ...
Safety and environmental notices
Safety notices (in English)
For general information concerning safety, refer to Electrical Safety for IBM Customer Engineers (S229-8124). For a copy of this publication, contact your IBM marketing representative or the IBM branch office serving your locality.
Before you connect the power cable of this product to ac power, verify that the power receptacle is correctly grounded and has the correct voltage. (SPSFD004)
DANGER: During an electrical storm, do not connect or disconnect any cable that has a conductive outer surface or a conductive connector.
DANGER: The frame main circuit breaker and the controller must not be switched on again now. Before disconnecting the power cables from the power receptacles, ensure that the customer’s branch distribution circuit breakers (customer power source circuit breakers) are Off and tagged with DO NOT OPERATE tags, S229-0237.
CAUTION: Due to weight of each thin node (under 18 Kg [40 lbs]), use care when removing and replacing thin nodes above shoulder height. (SPSFC005)
CAUTION: The wide node weight may exceed 32 Kg (70.5 lbs). (SPSFC006)
CAUTION: Do not open more than one wide node or switch assembly drawer at a time.
CAUTION: When using step ladder or step stool, be sure that the work surface is level and the step ladder or step stool is in good working order. (SPSFC016)
CAUTION: Portable ladders present a serious safety hazard if not used properly.
About this book
This book covers the SP Switch2 only. Refer to RS/6000 SP: SP Switch Service Guide, GA22-7443, for information related to the SP Switch.
User’s responsibilities
Before calling IBM® for service, the system administrator should use the problem determination section of the Parallel System Support Programs for AIX®: Diagnosis Guide (GA22-7350), for initial problem determination.
Chapter 1. Maintenance Analysis Procedures (MAPs) This chapter provides information for identifying problems and guides you to the most likely failed Field Replaceable Unit (FRU). The MAPs then refer you to the FRU Removal/Replacement procedures for the corrective action.
Switch Connection Types
Standard Node: Processor nodes in 9076 SP frames are attached to the switches with switch cables.
Switch-to-Switch: Connections between switches.
Switch Assembly Description
SP Switch2: Each switch chip has its own clock and all clocks communicate through the switch data cables.
There are two LEDs on the front of each switch assembly. For quick reference, their definitions are as follows:
Figure 1-1. SP Switch2 Assembly High-Level Diagram
Switch Description and Problem Determination (MAP 0590)
Yellow (Environment) LED
Off: No environmental problems detected by switch supervisor card.
On: Warning of environmental condition out of nominal range. Preventative Maintenance should be scheduled for this switch.
Flashing: Serious environmental condition detected; power shut off.
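The yellow LED definitions above can be captured as a small lookup table, for example when recording observations in service notes. This is only a minimal sketch: the names `YELLOW_LED` and `describe_yellow_led` are hypothetical and are not part of any IBM tool.

```python
# Hypothetical lookup of the yellow (Environment) LED states listed above.
# The wording of each meaning is taken from the LED definitions in this guide.
YELLOW_LED = {
    "off": "No environmental problems detected by switch supervisor card.",
    "on": "Warning of environmental condition out of nominal range; "
          "schedule preventative maintenance for this switch.",
    "flashing": "Serious environmental condition detected; power shut off.",
}


def describe_yellow_led(state: str) -> str:
    """Return the documented meaning of a yellow LED state (case-insensitive)."""
    key = state.strip().lower()
    if key not in YELLOW_LED:
        raise ValueError(f"unknown LED state: {state!r}")
    return YELLOW_LED[key]
```

Raising an error for an unrecognized state, rather than returning a default, keeps a mistyped observation from being read as "no problem".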
Note: In a frame with processor nodes, entries for the switch will refer to “node17” or “slot17”. In a multi-switch frame, switches will be listed as even slot addresses. Notes: 1. SP Switch2 advanced diagnostics use the 8.75 meter data cable provided by the SPS feature bill of material.
Table 1-1. Switch Problem Diagnostics (continued)
Priority | Message or condition | Action
2 (2 of 4) | Environmental problems
- Control workstation errpt file reports a switch failure, shutdown, or warnin.
Table 1-2. SP Switch2 supervisor LED indications (continued)
Green LED | Yellow LED | Indication
Flashing | Flashing | Defective supervisor card (replace card)
Note: For this indication, the green and yellow LEDs are flashing alternately.
Step 0595-006
The supervisor self-test failed because the yellow LED flashed the wrong address.
1. Make certain that the correct supervisor harness connector is plugged into the supervisor.
2. Is the correct harness plugged in?
- If yes, go to the Start MAP (0100) in the RS/6000 SP: System Service Guide and troubleshoot the supervisory bus system.
Switch environment (MAP 0600) Purpose of this MAP This MAP provides diagnostic information for switch problems that are related to the operating environment. Note: Refer to “Service position procedures” on page 3-10 for placing a switch into the service position or for removing the switch from the service position.
Step 0600-004
Perspective display shows “Fan X: Failure” on a red background.
1. Use Table 1-4 to service components.
2. Refer to Chapter 4, “FRU removals and replacements” on page 4-1 for instructions about the component being serviced.
Table 1-4.
Step 0600-005
You performed the recommended action in Table 1-4 on page 1-10.
1. Component replaced or reseated.
2. Check the yellow switch supervisor LED for an On or flashing condition.
Step 0600-007
You are here for one of the following reasons:
- An over temperature condition exists
- You fixed an obvious airflow blockage or removed a high temperature source near the air intakes
- A problem with the switch supervisor card sensors may exist
For any of the reasons listed above:
1.
3. Is the yellow switch supervisor LED On or flashing?
- If yes, return to “Step 0600-010” on page 1-12 and continue service with the next highest priority.
- If no, go to “Step 0620-021” on page 1-27.
Switch power (MAP 0610)
Purpose of this MAP
This MAP provides diagnostic steps for resolving problems related to SP Switch2 power.
Step 0610-004
The green switch supervisor LED is Off.
1. Make certain that the switch power cable is properly connected to jack J1 on the switch and on the SEPBU.
2. Place the inline switch on the switch power cable into the On (‘1’) position if it is not already in that position.
Step 0610-009
The switch circuit breaker no longer trips to the Off (‘0’) position when a fan/power supply pair is removed.
1. Reinstall the power supply into the switch.
2. Check circuit breaker.
3. Does the circuit breaker still trip?
- If yes, go to “Step 0610-010”.
3. If all interposers have been tested, go to “Step 0610-004” on page 1-14.
Step 0610-017
You tested all interposers and the circuit breaker still trips.
1. Replace the circuit breaker assembly.
2. Check the circuit breaker.
3. Does the circuit breaker still trip?
- If yes, go to “Step 0610-018”.
Step 0610-022
The switch’s front panel power LED remained Off after replacing the LED power extension cable.
1. Replace the switch planar.
- Refer to “Removing the switch planar” on page 4-7 and “Replacing the switch planar” on page 4-8.
2.
You should receive a message indicating successful initialization. If you receive any other message, consult the “Diagnosing SP Switch2 Problems” section of Parallel System Support Programs for AIX: Diagnosis Guide (GA22-7350).
3. Determine the primary node number.
b. Return to “Step 0620-001” on page 1-17.
Table 1-6. SP Switch2 error conditions
Error # | Device Message | Link Message | Description and Action
2 | Initialized | N/A | Description: Initialization detected a wrapped port where a processor node was expected (this may result from isolation procedures), or else a disconnected cable.
Table 1-6. SP Switch2 error conditions (continued)
Error # | Device Message | Link Message | Description and Action
0 | Uninitialized | Uninitialized | Description: Switch adapter has not been initialized. Processor node may not recognize adapter due to hardware failure or bad software configuration.
Table 1-6. SP Switch2 error conditions (continued)
Error # | Device Message | Link Message | Description and Action
−4 | Device has been removed from network, faulty | Link has been removed from network or miswire, faulty | Description: Switch network not wired as specified in switch topology or problem with connection between switch and device.
Table 1-6. SP Switch2 error conditions (continued)
Error # | Device Message | Link Message | Description and Action
−9 | Destination not reachable | Link has been removed from network, not connected | Description: Possible hardware problem. Action: Go to “Step 0620-004”.
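The rows of Table 1-6 shown in this guide can be held in a lookup keyed by error number, which is convenient when scanning many entries by hand. The sketch below is an illustration only: it covers just the four rows visible here (2, 0, −4, −9), the split between device message and link message is a reading of the flattened table, and the names `ERROR_CONDITIONS` and `lookup_error` are hypothetical.

```python
# Hypothetical lookup built from the Table 1-6 rows visible in this guide.
# Each value is (device message, link message). Only four rows are known here;
# the full table contains more error numbers than this sketch covers.
ERROR_CONDITIONS = {
    2: ("Initialized", "N/A"),
    0: ("Uninitialized", "Uninitialized"),
    -4: ("Device has been removed from network, faulty",
         "Link has been removed from network or miswire, faulty"),
    -9: ("Destination not reachable",
         "Link has been removed from network, not connected"),
}


def lookup_error(number_text: str):
    """Return (device message, link message) for an error number, or None.

    Accepts the typographic minus sign (U+2212) that appears in the printed
    table as well as the ASCII hyphen.
    """
    number = int(number_text.strip().replace("\u2212", "-"))
    return ERROR_CONDITIONS.get(number)
```

Returning `None` for an unlisted number signals that the printed table, not this partial copy, is the place to look.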
Step 0620-005 The /var/adm/SPlogs/css0/p0/out.top file indicates a problem with a “Primary node” or “Secondary node” connection. 1. Open frame rear cover and check the cable connection from the indicated switch assembly jack to the processor node.
b. Go to “Processor node diagnostics and descriptions (MAP 0130)” in RS/6000 SP: System Service Guide to fix problem.
Step 0620-011
The Power (green) LED is lit, indicating the processor node is powered on.
1. Run advanced diagnostics in service mode on device “cssX” (where X=0 or 1) on this processor node and its associated switch port.
Table 1-8. Service Request Numbers (SRNs) for SP Switch2 adapters (continued)
Service Request Number SRN Source Failing Component Description Notes
Notes:
1. If y=1 in these SRNs, you must troubleshoot the SP Switch2 adapter software (SRN 765-x1xx) before you follow procedures related to a hardware item.
1. From the node front panel on the control workstation, put the node in the SERVICE mode. 2. Power-on this processor node. 3. Go to “Step 0620-014”. Step 0620-014 At this point, you must run the advanced diagnostics in service mode on the device “cssX” (where X=0 or 1) and its associated switch port.
- If no, go to “Step 0620-019”.
Step 0620-019
Some but not all switch data cables appear to be having problems.
1. Depending on whether the problem is a wrapped port or a switch-to-switch connection, perform one of the following steps:
- Wrapped port (wrap plug installed): Remove the existing wrap plug.
dsh -a /usr/lpp/ssp/css/rc.switch
Note: When working with a two-plane SP Switch2 system, add the adapter name to the command, as follows:
dsh -a /usr/lpp/ssp/css/rc.switch -a <adapter_name>
Example:
dsh -w fr2n03,fr2n04,fr3n01 /usr/lpp/ssp/css/rc.
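The command forms above differ only in how nodes are selected and in whether an adapter name is appended. The sketch below assembles the argument list for each case without running anything. It rests on assumptions: the helper name `build_rc_switch_command` is hypothetical, the `-w` form is assumed to invoke the same `rc.switch` script as the `-a` form (the example in this guide is cut off), and `css0` is used only as a plausible adapter name.

```python
# Hypothetical helper that assembles the dsh invocations described above.
# It only builds the argument list; it does not execute dsh.
RC_SWITCH = "/usr/lpp/ssp/css/rc.switch"


def build_rc_switch_command(nodes=None, adapter=None):
    """Return the dsh argument list for running rc.switch.

    nodes:   optional list of node names; if omitted, 'dsh -a' (all nodes) is used.
    adapter: optional adapter name for two-plane systems (assumed, e.g. 'css0').
    """
    argv = ["dsh"]
    if nodes:
        argv += ["-w", ",".join(nodes)]   # selected nodes, comma-separated
    else:
        argv.append("-a")                  # all nodes
    argv.append(RC_SWITCH)
    if adapter:
        argv += ["-a", adapter]            # this -a belongs to rc.switch, not dsh
    return argv


if __name__ == "__main__":
    print(" ".join(build_rc_switch_command(adapter="css0")))
```

Note that `-a` appears twice in the two-plane form with different meanings: the first selects all nodes for `dsh`, the second passes the adapter name to `rc.switch`.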
Chapter 2. Locations
Naming standard for RS/6000 SP components ... 2-1
Format structure ... 2-1
Example of format structure ... 2-1
Frame (WWW) ... 2-1
Major assembly (XXX) ...
– 01-99 for frames 1-99 (specific to that frame)
Notes:
1. E01 designates RS/6000 SP physical frame 1
2. L00 designates any/all RS/6000 SP logical frames
3.
Front and rear views of RS/6000 SP frame Figure 2-1 shows a front view of the RS/6000 SP frame locations. “Frame (FRA)” on page 2-5 describes the assembly designations for the RS/6000 SP frame. Figure notes: 1. Frames equipped with the SP Redundant Power Supply must have four power modules (books) installed in the SEPBU.
Figure 2-2 shows a front view of the RS/6000 SP multi-switch frame. Figure 2-3 on page 2-5 shows a rear view of the RS/6000 SP frame locations.
[Figure labels: Main Power Switch with LED; Left Skirt; Right Skirt; Switch.]
Note: See notes under Figure 2-1 on page 2-3 for processor node/switch assembly numbering. Frame locations Figure 2-1 on page 2-3 shows a front view of the RS/6000 SP frame locations, with numbered processor nodes, and the three phase SEPBU.
G6: Front door ground
G7: Rear door ground
G8: Ground
SW: Power-on switch
LD: LED card
FC: Front cover
RC: Rear cover
Example: E01-FRA-G1
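Location codes such as E01-FRA-G1 are hyphen-separated, with the frame first and the major assembly second. A minimal sketch of splitting such a code follows; the names `Location` and `parse_location` are hypothetical, and grouping everything after the second field under `detail` is an assumption, since codes in this guide carry either one further field (E01-FRA-G1) or two (E00-S00-BH-J15).

```python
from typing import NamedTuple, Tuple


class Location(NamedTuple):
    """Hypothetical container for a hyphen-separated location code."""
    frame: str                 # e.g. "E01" (physical frame 1)
    assembly: str              # e.g. "FRA" (frame) or "S00"
    detail: Tuple[str, ...]    # remaining fields, e.g. ("G1",) or ("BH", "J15")


def parse_location(code: str) -> Location:
    """Split a location code into frame, major assembly, and remaining fields."""
    parts = code.strip().split("-")
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"not a location code: {code!r}")
    return Location(parts[0], parts[1], tuple(parts[2:]))
```

For example, `parse_location("E01-FRA-G1")` yields frame `E01`, assembly `FRA`, and a single detail field `G1`.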
Switch assembly locations Connector details Figure 2-5 on page 2-8 shows RS/6000 SP component connector details. Figure 2-4. SP Switch2 high level planar view Chapter 2.
Cable routing
Figure 2-6 on page 2-9 and Figure 2-7 on page 2-9 show back views of the RS/6000 SP frame, showing the horizontal and vertical paths of cable routing from connector-to-connector, with the depth amplified on the drawing.
Note: For a multi-switch frame (F/C 2032), refer to Figure 2-6. Table 2-1 on page 2-10 shows external cable routing in a RS/6000 SP frame populated with 16 processor nodes. (Refer to “Cable routing” on page 2-8 to see the routing paths.)
Figure 2-6.
Table 2-1. External cable routing
Slot Number (Node) Cable Budget millimeters (inches) Frame Entrance (New Style) Frame Entrance (Old Style) Vertical Routing (Old Style) Horizontal Routing (Old Styl.
Switch data cables
SP Switch2 data cables
Table 2-2 describes the attachment locations and routing for the internal SP Switch2 data cables.
Table 2-2.
Table 2-2. SP Switch2 data cable chart (continued)
Cable Part Number | Plug from Location | Plug to Location
05N6351 | E00-S00-BH-J15 | E00-N16-BH-PA
Notes:
1. “PA” refers to connector on SP Switch2 adapter.
2. Only one cable type is used for all switch locations.
Chapter 3. Service procedures
Personal ESD requirements ... 3-1
Tools and files overview ... 3-1
Using the css.snap script ... 3-3
css.snap file structure ...
Table 3-1. Service procedure tools
Utility | Runs on | Description | Directory
fault_service_Worm_RTG_CS | All nodes | Monitors the switch for faults. It restarts the switch if a fault is detected.
Table 3-2. Setup output files (continued)
File | Location | Description | Directory
css.snap.log | All nodes | Log files created by the switch support code | /var/adm/SPlogs/css
Table 3-3. Tuning output files
File | Location | Description | Directory
daemon.stdout | All nodes | Keeps a detailed account of the tuning process initiated by the Estart command.
-n  Assumes that the device driver or daemon has flushed the cache.
-s  Takes a soft snap, which does not dump the adapter state. This excludes the col_dump.
Table 3-4. SP Switch2 log files (continued)
Log File | Information Level | File Location | File Contents
core | node | nodes | Fault service daemon core dump file.
css.snap.log | node | nodes | css.snap snapshot command log information. Contains a list of all files gathered in the last snapshot.
Table 3-4. SP Switch2 log files (continued)
Log File | Information Level | File Location | File Contents
topology.data | port | primary node | System error messages from the distribution of the topology file to the secondary nodes.
Self-test Conditions
Pass sequence
1. Both LEDs light for about 10 seconds
2. Both LEDs go off
3. Green LED stays off, while the yellow LED flashes the switch address
4. Yellow LED goes off for about two seconds (green LED stays off)
5. Both LEDs light for about one second
6.
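The first five steps of the pass sequence can be written down as (green, yellow) LED states and compared against what was observed during a self-test. This is a minimal sketch under stated assumptions: the names `PASS_SEQUENCE` and `first_mismatch` are hypothetical, timing is ignored, and the sequence is incomplete because step 6 and anything after it are not available in this guide.

```python
# Hypothetical encoding of the first five documented pass-sequence steps as
# (green LED, yellow LED) states. Durations are noted in comments only.
PASS_SEQUENCE = [
    ("on", "on"),         # 1. both LEDs light for about 10 seconds
    ("off", "off"),       # 2. both LEDs go off
    ("off", "flashing"),  # 3. green stays off, yellow flashes the switch address
    ("off", "off"),       # 4. yellow goes off for about two seconds
    ("on", "on"),         # 5. both LEDs light for about one second
]


def first_mismatch(observed):
    """Return the 1-based step number where observed states first differ.

    Only the steps present in both lists are compared; returns None when
    every compared step matches the documented sequence.
    """
    for step, (expected, seen) in enumerate(zip(PASS_SEQUENCE, observed), start=1):
        if expected != tuple(seen):
            return step
    return None
```

A result of `3`, for instance, would mean the yellow LED did not flash an address when expected, which is the point at which the self-test MAP steps become relevant.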
Base code verification Note: This is not a Perspectives function. Perform the following procedure to check for supervisor conditions that require action.
Removing and restoring switch resources This procedure can be performed to allow customer to use a switch feature while extended service actions are performed on an individual frame of a multi-frame system with the switch feature.
Using Perspectives to fence and unfence nodes attached to the switch
Fencing nodes
1. Bring up Hardware Perspectives for the system
2. Select the node to be fenced by either:
- Double clicking the node
- Opening the node’s notebook
3.
Replacing an SP Switch2 from service position Note: Make certain the switch has been returned to the active configuration after replacing the switch from the service position. 1. Install the switch by sliding it into the front of the frame. 2. Reinstall the screws holding the switch to the front of the frame.
9. On the control workstation, set the node to boot from disk. For example:
spbootins -r disk 2 12 1
10. On the control workstation, use Perspectives to power off the node and then power it back on. The node will now boot from the device that you specified in step 7 with the correct time.
Chapter 4. FRU removals and replacements
Handling static-sensitive devices ... 4-1
SP Switch2 service procedures ... 4-2
Removing a fan assembly ... 4-2
Replacing a fan assembly ...
SP Switch2 service procedures
CAUTION: The unit weight exceeds 18 Kg (40 lbs) and requires two service personnel to lift. (SPSFC002)
Note: This chapter describes removal and replacement procedures fo.
3. Return to the procedure that directed you here.
Removing a power supply
Note: The power supply is a card-mounted assembly and is hot-pluggable.
Note: Refer to “Handling static-sensitive devices” on page 4-1.
1. Remove the fan assembly blocking access to the power supply assembly.
Removing the LED bracket assembly
Note: The LED bracket assembly is hot-pluggable.
Note: Refer to “Handling static-sensitive devices” on page 4-1.
1. Remove the fan assembly housing the LED bracket assembly.
- Refer to “Removing a fan assembly” on page 4-2.
Removing the switch supervisor card Note: The switch supervisor card is hot-pluggable. Note: Refer to “Handling static-sensitive devices” on page 4-1. 1. Remove the switch supervisor cable from the supervisor card located in slot J2. 2. Loosen the captive screw holding the card retention bracket to the switch chassis.
Replacing the switch supervisor card
1. Position and hold the switch supervisor card at the far end of the hot-plug actuator.
2. Insert the far end of the actuator into the actuator guides of switch slot J2.
3. Push the actuator to the end of the guide track.
Note: Refer to “Handling static-sensitive devices” on page 4-1.
Note: If multiple cards are to be removed, label and record the position of each card and its associated I/O cable.
1. If the interposer card is a switch interposer, remove the I/O cable.
Replacing the switch planar
Attention: Replacement of the switch planar involves the replacement of the removed switch’s externally accessible plug-ins into a new, partially populated switch planar-in-chassis assembly.
1. Remove all fan assemblies from the new switch planar assembly.
Replacing the 48 V dc circuit breaker assembly
1. Insert the 48 V dc circuit breaker assembly into the switch chassis.
2. Secure the assembly to the chassis with the mounting screws.
3. Plug the circuit breaker assembly power cable into J48V on the switch planar.
8. Remove the front top-cover from the chassis as follows: a. Remove the cover mounting screws. b. Slide the front top-cover toward the center of the chassis. c. Remove the cover when it disengages from the chassis framework. 9. Cut the tie-wrap securing the LED power extension cable to the cover mounted power tray .
Chapter 5. Parts catalog
SP Switch2 assembly (view 1) ... 5-2
SP Switch2 assembly (view 2) ... 5-4
Switch cables ... 5-6
SP Switch2 Frame (F/C 2032) ...
SP Switch2 assembly (view 1)
Table 5-1. SP Switch2 assembly (view 1)
Assembly index | Part number | Units | Description
SP Switch2 Assembly (reference only)
1 | 31L7106 | 4 | Fan assembly
2 | 11P1636 | 4 | Power Supply, 3.
SP Switch2 assembly (view 2)
Table 5-2. SP Switch2 assembly (view 2)
Assembly index | Part number | Units | Description
SP Switch2 Assembly (reference only)
1 | 05N6603 | 1 | Replacement Assembly, Switch Planar
2 | 31L7112 | 1 | Cable, LED Powe.
Switch cables
This page intentionally left blank.
Table 5-3. Switch cables Assembly index Part number Units Description SP Switch2 Data Cable 1 1P0006 AR Cable, Switch Data - (2615 mm) ------------------------------------ SP Switch2 External Cables .
SP Switch2 Frame (F/C 2032)
[Figure: FRONT view, numbered callouts 1 through 6]
Table 5-4. SP Switch2 Frame (F/C 2032)
Assembly index | Part number | Units | Description
1 | 31L8515 | AR | Rail, left
  | 77G0599 | AR | Screw
  | 74F1823 | AR | Nutclip
2 | 11P0097 | AR | Bracket, mounting, SP Switch2
  | 77G0559 | AR | .
F/C 2032 frame extender
[Figure: numbered callouts 1 through 6]
Table 5-5. F/C 2032 frame extender
Assembly index | Part number | Units | Description
1 | 44P1029 | 1 | Top cover, frame extender
2 | 4491031 | 1 | Right side, frame extender
3 |  |  | Bracket, lower frame
4 | 54G2943 | 2 | Leveling pad
5 | 44P1032 | 16 | Cable bracket, frame extender
6 | 44P1030 | 1 | Left side, frame extender
  | 54G2882 | 36 | Screw, hex head, M5
SP Switch2 Frame Model 556 and F/C 2034
[Figure: REAR and FRONT views, numbered callouts 1 through 6]
Table 5-6. SP Switch2 Frame Model 556 and F/C 2034
Assembly index | Part number | Units | Description
1 | 31L8515 | AR | Rail, left
  | 77G0599 | AR | Screw
  | 74F1823 | AR | Nutclip
2 | 11P0097 | AR | Bracket, mounting, SP Switch2.
Model 556 and F/C 2034 frame extender
[Figure: numbered callouts 1 through 6]
Table 5-7. Model 556 and F/C 2034 frame extender
Assembly index | Part number | Units | Description
1 | 21L3091 | 1 | Top cover, frame extender
2 | 21L3088 | 1 | Right side, frame extender
3 |  |  | Bracket, lower frame
4 | 54G2943 | 2 | Leveling pad
5 | 21L3090 | 16 | Cable bracket, frame extender
6 | 21L3089 | 1 | Left side, frame extender
  | 54G2882 | 36 | Screw, hex head, M5
Notices This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area.
Electronic emissions notices Federal Communications Commission (FCC) statement This equipment has been tested and found to comply with the limits for a Class A digital device, pursuant to Part 15 of the FCC Rules.
For installations in Japan: The following is a summary of the VCCI Japanese statement in the box above. This is a Class A product based on the standard of the Voluntary Control Council for Interference by Information Technology Equipment (VCCI). If this equipment is used in a domestic environment, radio disturbance may arise.
Index
Numerics
0375867 5-9, 5-13
05N6603 5-3, 5-5
05N6647 5-3
08J5557 5-9
11J4774 5-9
11J5155 5-9, 5-13
11J5189 5-9
11J5191 5-9
11J5193 5-9
11J5195 5-9
11P0006 5-7
11P0097 5-9, 5-13
11P0492 5.
files created by css.snap 3-4 package names, css.snap 3-6 structure, css.snap 3-4 files overview, service 3-1 format structure 2-1 frame multi-switch 5-8, 5-12 rear view 2-4 frame cable routing path .
removing (continued) LED bracket 4-4 LED power extension cable 4-9 power supply 4-3 RS/6000 components 4-1 switch interposer 4-6 switch planar 4-7 switch supervisor card 4-5 wrap interposer 4-6 removi.
verification, service procedures 3-1 W who should use book xv wrap interposer removing 4-6 replacing 4-7 wrap plugs 1-5
Reader’s comments – We’d like to hear from you RS/6000 SP SP Switch2 Service Guide Publication No. GA22-7444-03 Overall, how satisfied are you with the information in this book? Very Satisfied.
IBM® Printed in the United States of America on recycled paper containing 10% recovered post-consumer fiber. GA22-7444-03.