Sun Microsystems, Inc. 4150 Network Circle Santa Clara, CA 95054 U.S.A. 650-960-1300 Send comments about this document to: docfeedback@sun.com Sun StorEdge™ 3900 and 6900 Series Troubleshooting Guide Part No.
Copyright 2002 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA 95054 U.S.A. All rights reserved. This product or document is distributed under licenses restricting its use, copying, distribution, and decompilation.
Contents For Internal Use Only 1. Introduction 1 Predictive Failure Analysis Capabilities 2 2. General Troubleshooting Procedures 3 Troubleshooting Overview Tasks 3 Multipathing Optio.
Command Line Test Examples 19 qlctest(1M) 19 switchtest(1M) 20 Storage Automated Diagnostic Environment Event Grid 21 ▼ To Customize an Event Report 21 3.
▼ To Verify Configuration Settings 47 ▼ To Clear the Lock File 50 5. Troubleshooting Host Devices 53 Host Event Grid 53 ▼ Using the Host Event Grid 53 Replac.
vi Sun StorEdge 3900 and 6900 Series Troubleshooting Guide • March 2002 Virtualization Engine LEDs 72 Power LED Codes 73 Interpreting LED Service and Diagnostic Codes 73 Back Panel Features 74 Ethe.
Troubleshooting the T1/T2 Data Path 102 Notes 102 T1/T2 Notification Events 103 Sun StorEdge T3+ Array Storage Service Processor Verification 106 T1/T2 FRU Tests.
List of Figures FIGURE 2-1 Sun StorEdge 3900 Series Fibre Channel Link Diagram 16 FIGURE 2-2 Sun StorEdge 6900 Series Fibre Channel Link Diagram 17 FIGURE 3-1 Data Host Notificatio.
FIGURE 7-6 Path Failure—I/O Routed through Both HBAs 94 FIGURE 7-7 Virtualization Engine Event Grid 95 FIGURE 8-1 Storage Service Processor Event 103 FIGURE 8-2 Virtualization Eng.
Preface The Sun StorEdge 3900 and 6900 Series Troubleshooting Guide provides guidelines for isolating problems in supported configurations of the Sun StorEdge™ 3900 and 6900 series. For detailed configuration information, refer to the Sun StorEdge 3900 and 6900 Series Reference Manual.
Chapter 7 provides detailed information for troubleshooting the virtualization engines. Chapter 8 describes how to troubleshoot the Sun StorEdge T3+ array devices. Also included in this chapter is information about the Explorer Data Collection Utility.
Typographic Conventions Shell Prompts Typeface Meaning Examples AaBbCc123 The names of commands, files, and directories; on-screen computer output Edit your .
Related Documentation Product Title Part Number Late-breaking News • Sun StorEdge 3900 and 6900 Series Release Notes 816-32.
Accessing Sun Documentation Online A broad selection of Sun system documentation is located at: http://www.sun.com/products-n-solutions/hardware/docs A complete set of Solaris documentation and many other titles are located at: http://docs.
CHAPTER 1 Introduction The Sun StorEdge 3900 and 6900 series storage subsystems are complete preconfigured storage solutions. The configurations for each of the storage subsystems are shown in TABLE 1-1.
Predictive Failure Analysis Capabilities The Storage Automated Diagnostic Environment software provides the health and monitoring functions for the Sun StorEdge 3900 and 6900 series systems.
CHAPTER 2 General Troubleshooting Procedures This chapter contains the following sections: ■ “Troubleshooting Overview Tasks” on page 3 ■ “Multipathing Options in the Sun StorEdge 6.
1. Discover the error by checking one or more of the following messages or files: ■ Storage Automated Diagnostic Environme.
4. Check the status of the Sun StorEdge FC network switch-8 and switch-16 switches using the following tools: ■ Storage Automat.
8. Verify the fix using the following tools: ■ Storage Automated Diagnostic Environment GUI Topology View and Diagnostic Tests ■ /var/adm/messages on the data host 9.
Multipathing Options in the Sun StorEdge 6900 Series Using the virtualization engines presents several challenges in how multipathing is handled in the Sun StorEdge 6900 series.
Note that in the Class and State fields, the virtualization engines are presented as two primary/ONLINE devices.
2. Using the Storage Automated Diagnostic Environment Topology GUI, determine which virtualization engine is in the path you need to disable. 3. Use the world wide name (WWN) of that virtualization engine in the unconfigure command, as follows: 4.
▼ To Suspend the I/O Use one of the following methods to suspend the I/O while the failover occurs: 1. Stop all customer applications that are accessing the Sun StorEdge T3+ array.
▼ To View the VxDisk Properties 1. Type the following: From the VxDisk output, notice that there are two physical paths to the LUN: ■ c20t2B000060220041F4d0s2 ■ c23t2B000060220041F9d0s2 Both of these paths are currently enabled with VxDMP.
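The two path names above follow the Solaris c#t#d#s# pattern, with a hexadecimal WWN in the target field. As a minimal sketch, not part of the original guide, the Python below splits such a name into its parts so the two paths can be compared. The function name parse_device and the returned dictionary keys are illustrative assumptions.

```python
import re

# Solaris-style logical device name: c<controller>t<target>d<disk>s<slice>.
# On these paths the target field is a hexadecimal WWN, not a small integer.
DEVICE_RE = re.compile(
    r"^c(?P<controller>\d+)"
    r"t(?P<target>[0-9A-Fa-f]+)"
    r"d(?P<disk>\d+)"
    r"s(?P<slice>\d+)$"
)


def parse_device(name):
    """Split a c#t#d#s# device name into its fields (hypothetical helper)."""
    match = DEVICE_RE.match(name)
    if match is None:
        raise ValueError(f"not a c#t#d#s# device name: {name!r}")
    return {
        "controller": int(match["controller"]),
        "target": match["target"].upper(),
        "disk": int(match["disk"]),
        "slice": int(match["slice"]),
    }


# The two physical paths quoted in the text above.
paths = ["c20t2B000060220041F4d0s2", "c23t2B000060220041F9d0s2"]
parsed = [parse_device(p) for p in paths]
controllers = sorted(p["controller"] for p in parsed)
```

Here the two paths resolve to controllers 20 and 23 with different target WWNs, which matches the statement that the LUN is reached over two physical paths.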
2. Use the luxadm(1M) command to display further information about the underlying LUN. # luxadm display /dev/rdsk/c20t2B000060220041F4d0s2 DEVICE PROPERTIES for disk: /dev/rdsk/c20t2B000060220041F4d0s2 Status(Port A): O.
▼ To Quiesce the I/O on the A3/B3 Link 1. Determine the path you want to disable. 2. Disable the path by typing the following: 3. Verify that the path is disabled: Steps 1 and 2 halt I/O only up to the A3/B3 link.
▼ To Return the Path to Production 1. Type: 2. Verify that the path has been re-enabled by typing: # vxdmpadm enable c.
Fibre Channel Links The following sections provide troubleshooting information for the basic components and Fibre Channel links, listed in TABLE 2-1.
Fibre Channel Link Diagrams FIGURE 2-1 shows the basic components and the Fibre Channel links for a Sun StorEdge 3900 serie.
FIGURE 2-2 shows the basic components and the Fibre Channel links for a Sun StorEdge 6900 series system: ■ A1 to B1— HBA to .
Host Side Troubleshooting Host-side troubleshooting refers to the messages and errors the data host detects.
Command Line Test Examples To run a single Sun StorEdge diagnostic test from the command line rather than through the Storage Automated Diagnostic Environment interface, you must log in to the appropriate Host or Slave to test the components.
switchtest(1M) switchtest(1M) is used to diagnose the Sun StorEdge network FC switch-8 and switch-16 switch devices. The switchtest process also provides command line access to switch diagnostics.
Storage Automated Diagnostic Environment Event Grid The Storage Automated Diagnostic Environment generates component-specific event grids that give, for each event, its severity, whether action is required, a description of the event, and the recommended action.
CHAPTER 3 Troubleshooting the Fibre Channel Links A1/B1 Fibre Channel (FC) Link If a problem occurs with the A1/B1 FC link: ■ In a Sun StorEdge 3900 series system, the Sun StorEdge T3+ array will fail over.
FIGURE 3-2 Data Host Notification of Severe Link Error FIGURE 3-3 Storage Service Processor Notification Note – An A1/B1 FC link error can cause a port in sw1a or sw1b to change state.
▼ To Verify the Data Host An error in the A1/B1 FC link can cause a path to go offline in the multipathing software.
An error in the A1/B1 FC link can also cause a device to enter the “unusable” state in cfgadm. In this case, the output for luxadm -e port will show that a device that was “connected” changed to an “unconnected” state.
CODE EXAMPLE 3-3 switchtest(1M) called with options Note – The Storage Automated Diagnostic Environment automatically resets the transfer size if it detects that it is about to test a switch-to-HBA connection.
▼ To Isolate the A1/B1 FC Link 1. Quiesce the I/O on the A1/B1 FC link path. 2. Run switchtest or qlctest to test the entire link. 3. Break the connection by uncabling the link.
A2/B2 Fibre Channel (FC) Link If a problem occurs with the A2/B2 FC link: ■ In a Sun StorEdge 3900 series system, the Sun StorEdge T3+ array will fail over.
FIGURE 3-5 A2/B2 FC Link Storage Service Processor Side Event
Site : FSDE LAB Broomfield CO
Source : diag.xxxxx.xxx.com
Severity : Normal
Category : Switch
Key: switch:100000c0dd0061bb
EventType: StateChangeEvent.
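Event notifications like the one in FIGURE 3-5 are a series of "name : value" fields. The following Python sketch, which is an illustration and not a tool shipped with the product, turns such a message into a dictionary so that fields such as Severity or Key can be filtered. It assumes one field per line; the function name parse_event is hypothetical.

```python
def parse_event(text):
    """Parse 'Name : value' lines into a dict (assumes one field per line)."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines and separators
        # Split on the first colon only, so values such as
        # 'switch:100000c0dd0061bb' keep their own colon.
        name, value = line.split(":", 1)
        fields[name.strip()] = value.strip()
    return fields


# Field values copied from the FIGURE 3-5 event text.
EVENT = """\
Site : FSDE LAB Broomfield CO
Source : diag.xxxxx.xxx.com
Severity : Normal
Category : Switch
Key: switch:100000c0dd0061bb
EventType: StateChangeEvent
"""

event = parse_event(EVENT)
device_type, wwn = event["Key"].split(":", 1)
```

Splitting the Key field a second time separates the device type (switch) from its WWN, which is the value to match against the switch port listings when tracing the affected link.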
▼ To Verify the Host Side An error in the A2/B2 FC link can result in a device being listed as in an “unusable” state in cfgadm, but no HBAs are listed as in the “unconnected” state in luxadm output.
CODE EXAMPLE 3-4 cfgadm -al # cfgadm -al Ap_Id Type Receptacle Occupant Condition c0 scsi-bus connected configured unknown &.
Note – You can find procedures for restoring virtualization engine settings in the Sun StorEdge 3900 and 6900 Series Reference Manual.
5. If the switch or the GBIC show no errors, replace the remaining components in the following order: a. Replace the virtualization engine-side GBIC, recable the link, and monitor the link for errors.
A3/B3 Fibre Channel (FC) Link If a problem occurs with the A3/B3 FC link: ■ In a Sun StorEdge 3900 series system, the Sun StorEdge T3+ array will fail over.
FIGURE 3-7 A3/B3 FC Link Storage Service Processor-Side Event FIGURE 3-8 A3/B3 FC Link Storage Service Processor-Side Event Site : FSDE LAB Broomfield CO Source : diag.xxxxx.
▼ To Verify the Host Side An error in the A3/B3 FC link results in a device being listed as in an “unusable” state in cfgadm, but no HBAs are listed as in the “unconnected” state in luxadm output.
CODE EXAMPLE 3-6 VxDMP Error Message ▼ To Verify the Storage Service Processor You can check the A3/B3 FC link using the Storage Automated Diagnostic Environment, Diagnose—Test from Topology functionality.
▼ To Isolate the A3/B3 FC Link 1. Quiesce the I/O on the A3/B3 FC link path. 2. Break the connection by uncabling the link. 3. Insert the loopback connector into the switch port.
A4/B4 Fibre Channel (FC) Link If a problem occurs with the A4/B4 FC link: ■ In a Sun StorEdge 3900 series system, the Sun StorEdge T3+ array will fail over.
FIGURE 3-10 Storage Service Processor Notification
Site : FSDE LAB Broomfield CO
Source : diag
Severity : Warning
Category : Switch
DeviceId : switch:100000c0dd0061bb
EventType: LogEvent.
▼ To Verify the Data Host A problem in the A4/B4 FC link appears differently on the data host, depending on whether the array is a Sun StorEdge 3900 series or a Sun StorEdge 6900 series device.
To verify the failover, use luxadm display. The failed path is marked OFFLINE, as shown in CODE EXAMPLE 3-7.
CODE EXAMPLE 3-8 Failed Path marked “unusable” FRU tests available for the A4/B4 FC Link Segment ■ The switchtest can only be run from the Storage Service Processor ■ The linktest will be able to isolate the switch and the GBIC on the switch.
5. Rerun switchtest. a. If switchtest fails, replace the GBIC and rerun switchtest. b. If the test fails again, replace the switch. 6. If switchtest passes, assume that the suspect components are the cable and the Sun StorEdge T3+ array controller.
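Steps 5 and 6 amount to a small decision sequence driven by switchtest results. The Python sketch below is one possible reading of those steps, not an official procedure. The function name next_action is hypothetical. The text does not say what to conclude when switchtest passes only after a replacement, so that outcome is returned separately and labeled as an assumption.

```python
def next_action(results):
    """Suggest the next step from successive switchtest outcomes.

    results is a list of booleans, one per switchtest rerun in step 5,
    where True means the test passed. Hypothetical helper.
    """
    if not results:
        return "rerun switchtest"
    if results[-1]:
        if len(results) == 1:
            # Step 6: the switch and GBIC passed, so look further down the link.
            return "suspect the cable and the T3+ array controller"
        # Assumption: a pass after a replacement is not covered by the text.
        return "last replacement appears to have cleared the fault (assumption)"
    if len(results) == 1:
        # Step 5a: first failure.
        return "replace the GBIC and rerun switchtest"
    # Step 5b: failed again after the GBIC was replaced.
    return "replace the switch"


first_failure = next_action([False])
second_failure = next_action([False, False])
clean_pass = next_action([True])
```

Passing the full history of results, rather than only the latest one, is what lets the sketch tell step 5a (first failure) from step 5b (failure after the GBIC swap).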
CHAPTER 4 Configuration Settings This chapter contains the following sections: ■ “Verifying Configuration Settings” on page 47 ■ “To Clear the Lock File” on page 50 For a complete listing of SUNWsecfg Error Messages and recommended action, refer to Appendix B.
Note – For cluster configurations and systems that are attached to Windows NT, the default configurations may not match the current installed configuration. Be aware of this when running the verification scripts.
10. If anything is marked FAIL, check the /var/adm/log/SEcfglog file for the details of the failure. In this example, the mirror setting in the Sun StorEdge T3+ array system settings is “off.
11. Fix the FAIL condition, and then verify the settings again. If you interrupt any of the SUNWsecfg scripts (by typing Control-C, for example), a lock file might remain in the /opt/SUNWsecfg/etc directory, causing subsequent commands to fail.
CODE EXAMPLE 4-2 savevemap output When savevemap: <ve-pair> EXIT is displayed, the savevemap process has successfully exited. Tue Jan 29 16:12:34 MST 2002 savevemap: v1 ENTER. Tue Jan 29 16:12:34 MST 2002 checkslicd: v1 ENTER.
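The check described above, waiting for "savevemap: <ve-pair> EXIT", can be automated by scanning captured output. The Python sketch below is an illustration only. The function name savevemap_exited is hypothetical, and the EXIT line in the usage example is constructed from the pattern the text describes rather than copied from real output.

```python
import re

# Matches entries such as 'savevemap: v1 ENTER' or 'savevemap: v1 EXIT'.
LOG_ENTRY = re.compile(r"(?P<script>\w+): (?P<pair>\w+) (?P<state>ENTER|EXIT)\b")


def savevemap_exited(log_text, ve_pair):
    """Return True if the log shows savevemap exiting for ve_pair."""
    for entry in LOG_ENTRY.finditer(log_text):
        if (
            entry["script"] == "savevemap"
            and entry["pair"] == ve_pair
            and entry["state"] == "EXIT"
        ):
            return True
    return False


# The two log lines quoted in CODE EXAMPLE 4-2.
in_progress = (
    "Tue Jan 29 16:12:34 MST 2002 savevemap: v1 ENTER.\n"
    "Tue Jan 29 16:12:34 MST 2002 checkslicd: v1 ENTER.\n"
)
# Hypothetical completion line, built from the pattern the text describes.
finished = in_progress + "Tue Jan 29 16:12:40 MST 2002 savevemap: v1 EXIT.\n"

still_running = not savevemap_exited(in_progress, "v1")
done = savevemap_exited(finished, "v1")
```

Matching on the script name as well as the state avoids mistaking an EXIT from a helper such as checkslicd for completion of savevemap itself.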
CHAPTER 5 Troubleshooting Host Devices This chapter describes how to troubleshoot components associated with a Sun StorEdge 3900 or 6900 series Host.
FIGURE 5-1 Host Event Grid.
TABLE 5-1 lists all the host events in the Storage Automated Diagnostic Environment.
host lun.VE Alarm- Red Y [Info] The state of lun.VE.c14t50020F2300003EE5d0s2.statusA on diag.xxxxx.xxx.com changed from OK to ERROR (target=ve:diag244-ve0/90.0.0.40) luxadm display reported a change in the port status of one of its paths.
Replacing the Master, Alternate Master, and Slave Monitoring Host The following procedures are a high-level overview of the procedures that are detailed in the Storage Automated Diagnostic Environment User’s Guide.
5. Choose Utilities -> System -> Recover Config. Refer to Chapter 7 of the Storage Automated Diagnostic Environment User’s Guide for detailed instructions.
7. Choose Maintenance -> General Maintenance -> Maintain Hosts. Refer to Chapter 3, “Maintenance,” of the Storage Automated Diagnostic Environment User’s Guide for detailed instructions.
CHAPTER 6 Troubleshooting Sun StorEdge FC Switch-8 and Switch-16 Devices This chapter describes how to troubleshoot the switch components associated with a Sun StorEdge 3900 or 6900 series system.
These switches can be monitored through the SANSurfer GUI, which is available on the Storage Service Processor. You configure and modify the switches using the Configuration Utilities.
FIGURE 6-1 Switch Event Grid.
TABLE 6-1 lists the switch events. TABLE 6-1 Storage Automated Diagnostic Environment Event Grid for Switches Cat Compone.
switch enclosure Audit Auditing a new switch called rasd2-swb1 (ip=xxx.0.0.41) 10002000007a609 switch oob Comm_Established Communication regained with sw1a (ip=xxx .
switch enclosure Discovery [Info] Discovered a new switch called rasd2-swb1 (ip=xxx.0.0.41) 10002000007a609 Discovery events occur the very first time the agent probes a storage device.
switch port StateChange+ [Info/Action] port.1 in SWITCH diag185 (ip=xxx.20.67.185) is now Available (status-state changed from OFFLINE to ONLINE) Port on switch is now available.
Replacing the Master Midplane Follow this procedure when replacing the master midplane in a Sun StorEdge network FC switch-8 or switch-16 switch or a Brocade Silkworm switch.
CHAPTER 7 Troubleshooting Virtualization Engine Devices This chapter describes how to troubleshoot the virtualization engine component of a Sun StorEdge 6900 series system.
Virtualization Engine Diagnostics The virtualization engine monitors the following components: ■ Virtualization engine r.
▼ To Display Log Files and Retrieve SRNs Use the /opt/svengine/sduc/sreadlog command to display log files and retrieve the Service Request Numbers (SRNs) for errors that need action.
▼ To Clear the Log ● Use the /opt/svengine/sduc/sclrlog command. Virtualization Engine LEDs TABLE 7-1 describes the LEDs on the back of the virtualization engine.
Power LED Codes The virtualization engine LEDs are shown in FIGURE 7-1. FIGURE 7-1 Virtualization Engine Front Panel LEDs Interpreting LED Service and Diagnostic Codes The Status LED communicates the status of the virtualization engine in decimal numbers.
Back Panel Features The back panel of the virtualization engine contains the Sun StorEdge network FC switch-8 or switch-16 switches and a socket for the AC power input, and various data ports and LEDs.
Fibre Channel Link Error Status Report The virtualization engine’s host-side and device-side interfaces provide statistical data for the counts listed in TABLE 7-4.
▼ To Check Fibre Channel Link Error Status Manually The Storage Automated Diagnostic Environment, which runs on the Storage Service Processor, monitors the Fibre Channel link status of the virtualization engine.
CODE EXAMPLE 7-1 Fibre Channel Link Error Status Example Note – v1 represents the first virtualization engine pair. Note – The SLIC daemon must be running for the /opt/svengine/sduc/svstat -d v1 command to work.
Translating Host Device Names You can translate host device names to VLUN, disk pool, and physical Sun StorEdge T3+ array LUNs.
▼ To Display the VLUN Serial Number Devices That Are Not Sun StorEdge Traffic Manager-Enabled 1. Use the format -e command. 2. Type the disk on which you are working at the format prompt.
Sun StorEdge Traffic Manager-Enabled Devices 1. If the devices support the Sun StorEdge Traffic Manager software, you can use this shortcut. 2. Type: The /dev/rdsk/c#t# represents the Global Unique Identifier of the device.
▼ To View the Virtualization Engine Map The virtualization engine map is stored on the Storage Service Processor. 1. To view the virtualization engine map, type: Note – This example uses the virtualization engine map file, which could include old information.
2. You can optionally establish a telnet connection to the virtualization engine and run the runsecfg utility to poll a live snapshot of the virtualization engine map. Refer to “To Replace a Failed Virtualization Engine” on page 84 for telnet instructions.
▼ To Failback the Virtualization Engine In the event of a Sun StorEdge T3+ array LUN failover, use the following procedure to fail the LUN back to its original controller.
For detailed information about the SUNWsecfg scripts, refer to the Sun StorEdge 3900 and 6900 Series Reference Manual. ▼ To Replace a Failed Virtualization Engine 1. Replace the old (failed) virtualization engine unit with a new unit.
11. Enable the switch port: 12. Reset the virtualization engine: 13. Find the initiator number for the new and old number: The new unit will not have any zones defined.
▼ To Manually Clear the SAN Database It is occasionally necessary to manually clear the SAN database on the virtualization engine routers.
Stopping and Restarting the SLIC Daemon Follow this procedure to restart the SLIC daemon if the SLIC daemon becomes unresponsive, or if messages such as the following are displayed: connect: Connection refused or Socket error encountered.
3. Remove the segments by typing the following: Check the ipcrm(1m) man page for details. 4. Restart the SLIC daemon. 5. Confirm that the SLIC daemon is running: The message queues, shared memory, and semaphores have been removed.
Sun StorEdge 6900 Series Multipathing Example One Sun StorEdge T3+ array partner pair with one 500GB RAID 5 LUN per brick (2 LUNs total). Currently, there is one 10GB VLUN created from each physical LUN, for a total of two VLUNs.
In the event of a path failure after the second tier of Sun StorEdge network FC switch-8 and switch-16 switches (or in the e.
FIGURE 7-3 Primary Data Paths to the Alternate Master.
FIGURE 7-4 Primary Data Paths to the Master Sun StorEdge T3+ Array.
FIGURE 7-5 Path Failure—Before the Second Tier of Switches.
FIGURE 7-6 Path Failure—I/O Routed through Both HBAs.
Virtualization Engine Event Grid The Storage Automated Diagnostic Environment Event Grid enables you to sort virtualization engine events by component, category, or event type.
TABLE 7-5 lists the Virtualization Engine Events. TABLE 7-5 Storage Automated Diagnostic Environment Event Grid for Vir.
virtualization engine ve_diag Diagnostic Test- Red ve_diag (diag240) on ve-1 (ip=xxx.20.67.213) failed virtualization engine veluntest Diagnostic Test- Red veluntest (diag240) on ve-1 (ip=xxx.
CHAPTER 8 Troubleshooting the Sun StorEdge T3+ Array Devices This chapter contains the following sections: ■ “Explorer Data Collection Utility” on page 99 ■ “Sun StorEdge T3+ Array Even.
Do not accept automatic emailing of the Explorer Data Collection Utility output unless the Storage Service Processor is properly set up to handle mail.
CODE EXAMPLE 8-2 Editing Sun StorEdge T3+ array information using vi Note – xxxx represents Sun StorEdge T3+ array passwords.
Troubleshooting the T1/T2 Data Path Notes ■ There are two T Port links for redundancy. ■ If one of the two links is lost, no Sun StorEdge T3+ array LUN failover will occur, and no pathing failures will be noted.
T1/T2 Notification Events The example below shows a typical port failure event. FIGURE 8-1 Storage Service Processor Event Site : Lab 3286 - DSQA1 Broomfield Source : diag.
If both T Ports go offline, you might see messages like the following. Note the virtualization engine Event alerting the LUN failover. Site : Lab 3286 - DSQA1 Broomfield Source : diag.
FIGURE 8-2 Virtualization Engine Alert ---------------------------------------------------------------------- Site : Lab 3286 - DSQA1 Broomfield Source : diag.
Sun StorEdge T3+ Array Storage Service Processor Verification 1. Run port listmap on the Sun StorEdge T3+ array to see the failover event.
T1/T2 FRU Tests Available ■ Switch - switchtest ■ Link - linktest Running linktest from the Storage Automated Diagnostic Environment GUI will guide the Service Engineer to discover the failed FRU.
Notes ■ When inserting a loopback connector into the T Port, there will be NO green light indicating a proper insertion. However, the test will run and be valid. There is currently an RFE to address this issue.
Sun StorEdge T3+ Array Event Grid The Storage Automated Diagnostic Environment Event Grid enables you to sort Sun StorEdge T3+ array events by component, category, or event type.
The following table lists all of the events for the Sun StorEdge T3+ array. Category Component EventType Sev Action Description Information t3 power.temp Alarm+ The state of power .
t3 power.battery Alarm- Red Y [Info/Action] The state of power.u1pcu1.BatState on diag213 (ip=xxx.20.67.213) is Fault Possible causes are: 1. Voltage level on power supply and battery have moved out of acceptable thresholds.
t3 power.output Alarm- Red Y [Info/Action] The state of power.u1pcu1.PowOutput on diag213 (ip=xxx.20.67.213) is Fault Information: The state of the power in the Sun StorEdge T3+ array power cooling unit is not optimal.
t3 enclosure Alarm. time Discrepancy Yellow [Action] Time of T3 diag213 (ip=xxx.
t3 ib Comm_Lost Down Y [Info/Action] Lost communication (InBand) with diag213 (ip=xxx.20.67.213) (last reboot was 2001-09-27 15:22:00) Information: InBand. This event is established using luxadm.
Chapter 8 T roubleshooting the Sun StorEdge T3+ Arr ay De vices 115 For Internal Use Only t3 t3ofdg Diagnostic T est- Red t3ofdg (diag240) on diag213 ( ip= xxx. 20.67.213 ) failed t3 t3test Diagnostic T est- Red t3test ( diag240 )o n diag213 (ip= xxx.
116 Sun StorEdge 3900 and 6900 Series T roubleshooting Guide • March 2002 t3 power Insert Component [ Info ] ’ power.u1pcu2’(TE CTROL-CAN.300- 1454- 01(50).008275 ) was added to T3 diag213 (ip=xxx.20.67.21 3 ) t3 enclosur e Location Change Location of t3 rasd2-t3b0 (ip=xxx.
Chapter 8 T roubleshooting the Sun StorEdge T3+ Arr ay De vices 117 For Internal Use Only t3 disk Remove Component Red Y [ Info/ Action ] disk.u2d3(SEAGAT E.ST318203FSUN18 G.LRG07139 ) was removed fr om diag158 (ip=xxx. 20.67.158 ) Information: The Sun StorEdge T3+ array has reported a disk has been removed from the chassis.
118 Sun StorEdge 3900 and 6900 Series T roubleshooting Guide • March 2002 t3 disk State Change+ disk.u1d5 in Sun StorEdge T3+ array rasd3-t3b1 (ip=xxx. 0.0.41 )i s now A vailable (status-state changed from fault- disabled to ready- enabled ) t3 interface.
t3 controller State Change- Red Y [Info/Action] controller.u1ctr in T3 diag213 (ip=xxx.20.67.213) is now Not-Available (status-state changed from unknown to ready-disabled) Information: The Sun StorEdge T3+ array controller has been disabled.
t3 interface.loopcard StateChange- Red Y [Info/Action] Information: The Sun StorEdge T3+ array has indicated that the loopcard is no longer in an optimal state. Recommended action: 1.
t3 volume StateChange- Red Y [Info/Action] Information: The Sun StorEdge T3+ array has reported that a power cooling unit has been disabled. Recommended action: 1.
Replacing the Master Midplane
Follow this procedure when replacing the master midplane in a Sun StorEdge T3+ array. This procedure is detailed in the Storage Automated Diagnostic Environment User's Guide.
CHAPTER 9 Troubleshooting Ethernet Hubs
The Sun StorEdge 3900 and 6900 series uses an Ethernet hub as the backbone for the internal service network.
APPENDIX A Virtualization Engine References
This Appendix contains the following Tables:
■ Table A-1 "SRN and SNMP Reference"
■ Table A-2 "SRN/SNMP Single Poi.
70005 Write error is detected by master. If the initiator is master, then it has detected a write error on a member within a mirror drive. If a spare drive is available, it will be brought in and used to replace the failed drive.
7009A Read degrade recorded. A mirror drive was written to, causing it to enter the degrade state. Reinsert the missing drive, or replace it with a drive of equal or greater capacity.
7009B Write degrade recorded.
72005 Failed to check for SAN changes.
72006 Failed to read SAN event log.
72007 SLIC daemon connection is down. Wait for 1-5 minutes for backup daemon to come up. If it doesn't, check the network connection for virtualization engine halt, or hardware failure.
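The "wait for 1-5 minutes" guidance for code 72007 can be read as a bounded polling loop. The following Python sketch is illustrative only and is not part of the Sun tooling: the `is_up` callable is a hypothetical, caller-supplied check for whether the backup daemon has come up, and the 15-second polling interval is an assumption. Only the 5-minute upper bound comes from the text above.

```python
import time


def wait_for_daemon(is_up, timeout_s=300, interval_s=15,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll is_up() until it returns True or timeout_s seconds have passed.

    is_up is a hypothetical caller-supplied check; this sketch does not
    know how to query the SLIC daemon itself.
    Returns True if the daemon came up in time, False otherwise.
    """
    deadline = clock() + timeout_s
    while True:
        if is_up():
            return True
        if clock() >= deadline:
            # Backup daemon did not come up within the window; the guide
            # says to move on to network and hardware checks at this point.
            return False
        sleep(interval_s)
```

A False result corresponds to the "If it doesn't" branch of the 72007 entry, where the network connection and hardware are checked next.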
TABLE A-4 provides service codes for the virtualization engine.
TABLE A-3 Port Communication Port Port Port Number Daemon Management Programs 200.
54 Unauthorized cabling configuration. • Check cabling.
57 Too many HBAs attempting to log in. • Check cabling.
60 Node mapping table cleared using SW2. • No action required.
APPENDIX B SUNWsecfg Error Messages
The Sun StorEdge 3900 and 6900 Series Reference Manual lists and defines the command utilities that configure the various components of the Sun StorEdge 3900 and 6900 series storage systems.
TABLE B-1 Virtualization Engine SUNWsecfg Error Messages
Message, Description and Cause of Error, Suggested Action
Common to virtualization engines: Invalid virtualization engine pair name $vepair, or virtualization engine is unavailable.
Common to virtualization engine:
1. Device-side operating mode is not set properly.
2. Device-side UID reporting scheme is not set properly.
3. Host-side operating mode is not set properly.
4. Host-side LUN mapping mode is not set properly.
createvezone: Invalid WWN $wwn on $vepair initiator $init, or virtualization engine is unavailable. WWN that has already been specified has a SLIC zone and/or an HBA alias assigned.
TABLE B-2 Sun StorEdge Network FC Switch-8 and Switch-16 Switch SUNWsecfg Error Messages
Message, Description and Cause of Error, Suggested Action
Common Switch: Sun StorEdge system type entered, ${cab_type}, does not match system type discovered, ${boxtype}.
setswitchflash: Invalid flash file $flashfile. Check the number of ports on switch $switch. You might be attempting to download a flash file for an 8-port switch to a 16-port switch.
TABLE B-3 Sun StorEdge T3+ Array SUNWsecfg Error Messages
Message, Description and Cause of Error, Suggested Action
Common to Sun StorEdge T3+ array: Present conf.
checkt3config: Snapshot configuration files are not present. Unable to check configuration. Make sure that the snapshot files are saved and have read permissions in the /opt/SUNWsecfg/etc/t3name/ directory.
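The suggested action for checkt3config (snapshot files saved and readable) can be pre-checked with a small script. The Python sketch below is an illustration, not a SUNWsecfg utility: the function name `snapshot_problems` is hypothetical, and it assumes that any regular file in the directory counts as a snapshot file, because the excerpt does not name the individual files.

```python
import os
from pathlib import Path


def snapshot_problems(snapshot_dir):
    """Return a list of problems found in snapshot_dir.

    An empty list means the directory exists, holds at least one regular
    file, and every file is readable by the current user. Treating every
    regular file as a snapshot file is an assumption of this sketch.
    """
    directory = Path(snapshot_dir)
    if not directory.is_dir():
        return [f"{directory}: directory not found"]
    files = [p for p in directory.iterdir() if p.is_file()]
    if not files:
        return [f"{directory}: no snapshot files present"]
    return [f"{p}: not readable" for p in files if not os.access(p, os.R_OK)]
```

For example, `snapshot_problems("/opt/SUNWsecfg/etc/t3name/")` would be called with the directory for the array in question substituted for the path shown in the table.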
restoret3config: Error while the block size compare command is executing. The $BRICK_IP{$IPADD} command is aborted. The Sun StorEdge T3+ array block size parameter is different from the snapshot file. The Sun StorEdge T3+ array may have been reconfigured.
TABLE B-4 Other SUNWsecfg Error Messages
Message, Description and Cause of Error, Suggested Action
Common to all components: If.
setupswitch Exit Values
TABLE 9-1 lists the setupswitch exit values. The associated messages are logged in the /var/adm/log/SEcfglog log file.
TABLE 9-1 setupswitch Exit Values
Severity Level, Message Type, Message Meaning
0, INFO, All switch settings are properly set.
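A wrapper script can translate an exit value into the table's wording. The Python sketch below is a hedged illustration: only the row for exit value 0 is visible in this excerpt, so the lookup table holds that single entry and points to the log file for every other value. The function names are hypothetical, and the command line is supplied by the caller because no setupswitch arguments are assumed here.

```python
import subprocess

# Only the row for value 0 is visible in this excerpt of TABLE 9-1.
KNOWN_EXIT_VALUES = {0: ("INFO", "All switch settings are properly set.")}
LOG_FILE = "/var/adm/log/SEcfglog"


def describe_exit_value(code):
    """Map an exit value to its message type and meaning, if known."""
    if code in KNOWN_EXIT_VALUES:
        message_type, meaning = KNOWN_EXIT_VALUES[code]
        return f"{message_type}: {meaning}"
    return f"exit value {code}: see TABLE 9-1 and {LOG_FILE}"


def run_and_describe(argv):
    """Run a caller-supplied command line and describe its exit value."""
    completed = subprocess.run(argv, check=False)
    return describe_exit_value(completed.returncode)
```

Keeping the table as data means further rows from TABLE 9-1 can be added to `KNOWN_EXIT_VALUES` without changing the functions.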
Index
A
accessing documentation online, xv
C
checkswitch used to diagnose and troubleshooting switch, 62
comments sending documentation comments, xv
configuration settings, 47 verification.
H
health functions for Sun StorEdge 3900 and 6900 series, 2
host device names translating, 78
host devices troubleshoo.
notification events, 103
T1/T2 data path troubleshooting, 102
test examples command line, 19 qlctest(1M), 19 switchtest(1M), 20
thresholds used in PFA, 2
troubleshoo.