Instruction/ maintenance manual of the product GC28-1982-02 IBM
IBM Parallel Environment for AIX IBM Messages Version 2 Release 4 GC28-1982-02.
Note: Before using this information and the product it supports, be sure to read the general information under “Notices” on page v. Third Edition (October 1998). This edition applies to Version.
Contents: Notices, Trademarks, About This Book, Who Should Use This Book.
iv IBM PE for AIX V2R4.0: Messages.
Notices References in this publication to IBM products, programs, or services do not imply that IBM intends to make these available in all countries in which IBM operates. Any reference to an IBM product, program, or service is not intended to state or imply that only IBM's product, program, or service may be used.
Trademarks The following terms are trademarks of the International Business Machines Corporation in the United States and/or other countries: AIX, ESCON, IBM, LoadLeveler, Micro Channel, RISC System/6000, RS/6000, SP. Adobe, Acrobat, Acrobat Reader, and PostScript are trademarks of Adobe Systems, Incorporated.
About This Book This book is designed to help any user of IBM Parallel Environment for AIX (PE) who needs to know what a message means and what should be done in response to that message. This book lists all of the error messages generated by the PE software and components and describes a likely solution.
In addition to the highlighting conventions, this manual uses the following conventions when describing how to perform tasks: User actions appear in uppercase boldface type.
Short Name   Full Name
STDIN        standard input
STDOUT       standard output
US           User Space
VT           Visualization Tool
Related Publications Parallel Environment (PE) Publications As an alternative to ordering the individual books, you can use SBOF-8588 to order the entire PE library.
IBM Parallel System Support Programs for AIX: Installation and Migration Guide , GA22-7347 IBM Parallel System Support Programs for AIX: Diagnosis Guide , GA22-7350 IBM Parallel System Sup.
Online Information Resources If you have a question about the SP, PSSP, or a related product, the following online information resources make it easy to find the information: Access the new SP Resource Center by issuing the command: /usr/lpp/ssp/bin/resource_center Note that the ssp.
Enhanced Job Management Function In earlier releases of PE, POE relied on the SP Resource Manager for performing job management functions. These functions included keeping track of which nodes were available or allocated and loading the switch tables for programs performing User Space communications.
Chapter 1. Understanding the Diagnostic Message Format The message identifiers for the PE messages in this manual are structured as follows:
0029-nnnn pdbx (the line-oriented debugger)
0030-nnnn ped.
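As a rough illustration of the identifier structure, the sketch below splits a message identifier into its component prefix and message number. The prefix-to-component table is taken from the chapter headings of this manual (0029 pdbx, 0030 pedb, 0031 POE, 0032 MPI). The function name classify_message_id and the accepted pattern (three or four alphanumeric characters after the dash, covering forms such as 0031-001 and 0031-A401) are assumptions made for this example, not part of PE.

```python
import re

# Component prefixes as listed in the chapter headings of this manual.
COMPONENTS = {
    "0029": "pdbx",
    "0030": "pedb",
    "0031": "POE",
    "0032": "MPI",
}

# Assumed shape: four-digit prefix, a dash, then a three- or four-character
# alphanumeric message number (for example 0029-2030, 0031-001, 0031-A401).
MESSAGE_ID = re.compile(r"^(\d{4})-([0-9A-Za-z]{3,4})$")


def classify_message_id(message_id):
    """Return (component, number) for a PE message identifier, or None."""
    match = MESSAGE_ID.match(message_id.strip())
    if match is None:
        return None
    prefix, number = match.groups()
    # Prefixes outside the table are reported as "unknown", not rejected.
    return COMPONENTS.get(prefix, "unknown"), number
```

A helper like this can be handy when filtering a log for the messages of one component before looking them up in the relevant chapter.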
0029-0101 0029-1002 Chapter 2. pdbx Messages 0029-0101 Your program has been loaded. Explanation: This message is issued when your program has been loaded into the tasks in the partition. This message indicates all the functions available in pdbx are available for you to use.
0029-1003 0029-2001 0029-1003 Missing or invalid argument following the -d flag. For information on the correct syntax to use when invoking pdbx, type: pdbx -h Explanation: The -d flag requires an integer argument that specifies the nesting depth limit of program blocks.
0029-2002 0029-2014 0029-2002 Could not add the groups events (breakpoints or tracepoints) to task: number , because this task is RUNNING. Explanation: Since the task was RUNNING and not available for debug commands, pdbx could not add the group events (breakpoints or tracepoints) for this task.
0029-2015 0029-2021 0029-2015 Could not open socket for debugger to communicate with poe. Explanation: The socket() call failed when the debugger tried to set up communications with POE. User Response: Debugging can continue except that the information about synchronized exit will not be passed back to the debugger from the POE job.
0029-2022 0029-2029 0029-2022 Task: number has already been loaded with a program. Explanation: The task number that you specified has already been loaded. User Response: Specify another task that has not been loaded. Issue the group list or tasks command to check the state of the tasks.
0029-2030 0029-2036 0029-2030 The correct syntax is: 'group add group_name member_list'. A member list can contain space or comma-separated task numbers, or ranges of task numbers separated by colons or dashes. Specify the group name as a string of alphanumeric characters that starts with an alphabetic character.
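The member list syntax described in message 0029-2030 can be sketched as a small parser. This is only an illustration of the documented syntax, not pdbx's own implementation: the function name parse_member_list is hypothetical, and treating ranges as inclusive and rejecting reversed ranges are assumptions.

```python
import re


def parse_member_list(text):
    """Expand a member list such as '0 2,4:6 8-9' into sorted task numbers."""
    tasks = set()
    # Entries are separated by commas and/or whitespace.
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue
        # A single task number, or a range written with a colon or a dash.
        match = re.fullmatch(r"(\d+)(?:[:-](\d+))?", token)
        if match is None:
            raise ValueError(f"invalid member list entry: {token!r}")
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) is not None else first
        # Assumption: ranges are inclusive and must be in ascending order.
        if last < first:
            raise ValueError(f"range is reversed: {token!r}")
        tasks.update(range(first, last + 1))
    return sorted(tasks)
```

For example, parse_member_list("0 2,4:6 8-9") yields the task numbers 0, 2, 4, 5, 6, 8, and 9.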
0029-2037 0029-2044 0029-2037 Cannot add task: number , because it is already in group string . Explanation: The task specified on the group add command is already included in the group specified. User Response: Retry the command specifying only task(s) that are not already included within the specified group.
0029-2045 0029-2051 0029-2045 Group string has been renamed to string . Explanation: You have given a new group name to a previously existing group. User Response: Note that the old group name no longer exists. 0029-2046 The correct syntax is: 'group delete group_name [member_list]'.
0029-2052 0029-2059 0029-2052 Internal error in string : number - No action was taken because the group has no members. Explanation: You issued the group list command on an empty group.
0029-2060 0029-2068 0029-2060 The correct syntax is: 'source filename'. Explanation: The source command cannot be issued with no arguments or with more than one argument. User Response: Re-issue the source command with only one argument. 0029-2061 Cannot open the command file that was specified on the source command.
0029-2069 0029-2075 0029-2069 Reading command file string . Explanation: The debugger has started reading the command file specified by the -c command line flag, the source command or as a result of having a .pdbxinit file in the current working directory or your home directory.
0029-2076 0029-2084 0029-2076 There are no tasks in DEBUG READY state (active). Explanation: The response to the active command is that there are no tasks that are ready to be debugged. This is to say that there are no tasks that are active with respect to the debugger.
0029-2085 0029-2101 0029-2085 The dbx prompt modifier is too long; the maximum length is number . Explanation: The dbx prompt modifier string that you specified using the command line -dbxpromptmod flag or the MP_DBXPROMPTMOD environment variable was too long.
0029-2102 0029-2108 0029-2102 The sh command with no arguments is not allowed. Explanation: You issued the sh command with no arguments, which is not allowed. User Response: Issue the sh command with a specific executable name supplied. For example: sh ls .
0029-2109 0029-2116 0029-2109 No action taken on task(s): string , because they have either been stopped by the debugger, finished executing, or have been unhooked. Explanation: The tasks listed were not RUNNING. These tasks may already be under the control of the debugger because of a breakpoint or step command.
0029-2117 0029-2122 0029-2117 Group string has been deleted. Explanation: You issued the group delete command and the group has been successfully deleted.
0029-2123 0029-2129 0029-2123 This event cannot be set because some task(s) in the group are unhooked. Explanation: You issued a trace or stop command against a group which contains some task(s) that are unhooked. User Response: The hook command can be used to regain debugger control of previously unhooked tasks.
0029-2130 0029-9036 0029-2130 No action was taken because the group name specified was null. Explanation: You issued one of the group commands, but no group name was provided. User Response: Choose a group name of at most 32 characters that starts with an alphabetic character and is followed by alphanumeric characters.
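The group name rule given in the User Response for 0029-2130 (at most 32 characters, starting with an alphabetic character and followed by alphanumeric characters) can be expressed as a single pattern. The sketch below is illustrative only: the name is_valid_group_name is hypothetical, and limiting the check to ASCII letters and digits is an assumption.

```python
import re

# One leading letter plus up to 31 further letters or digits (32 in total).
# Assumption: only ASCII letters and digits count as alphanumeric.
GROUP_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]{0,31}")


def is_valid_group_name(name):
    """Return True if name satisfies the documented group name rule."""
    return GROUP_NAME.fullmatch(name) is not None
```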
0029-9039 0029-9040 -F This flag can be used to turn off lazy reading mode. Turning lazy reading mode off forces the remote dbx sessions to read all symbol table information at startup time. By default, lazy reading mode is on. Lazy reading mode is useful when debugging large executable files, or when paging space is low.
0029-9041 0029-9048 0029-9041 Cannot locate attach configuration file " string ". Explanation: pdbx was unable to locate the attach configuration file. User Response: 1. Make sure that the correct POE process id was used when invoking the debugger.
0029-9049 0029-9999 0029-9049 The following environment variables have been ignored since they are not valid when starting the debugger in attach mode - string Explanation: Some of the environment variables the user has set are not valid when starting pdbx in attach mode.
0030-0002 0030-0033 Chapter 3. pedb Messages 0030-0002 string < number >: Data Display data is not attached to data window [ number ][ number ]. Explanation: Cannot access information to update the data window. User Response: Further data viewing will be limited.
0030-0034 0030-0044 0030-0034 No source file is available to edit. Explanation: pedb could not locate a source file to edit. Pressing the edit button causes an edit window to be displayed containing the file that is currently displayed in the pedb source window.
0030-0050 0030-0056 0030-0050 An invalid value: string was specified for the Play Delay. Please enter non-negative integer value. If you click on Cancel, the new delay field will be reset to the previous value of number . Explanation: An invalid value for the play delay has been entered.
0030-0057 0030-0064 0030-0057 Task number has been detached. Explanation: A reply was received from the debug engine ( dbe ) that indicated the specified task has been detached. User Response: None. This is an informational message. 0030-0058 Attached to task number .
0030-0065 0030-0071 0030-0065 Could not open socket for debugger to communicate with poe. Explanation: The socket() call failed when the debugger tried to set up communications with POE. User Response: Debugging can continue except that the information about synchronized exit will not be passed back to the debugger from the POE job.
0030-0072 0030-0107 0030-0072 All tasks have exited. Select Ok to detach. Explanation: All the tasks in the partition have completed program execution. Selecting Ok causes pedb to detach from the program and exit. An alternative would be to click on Cancel and then select the Quit option from the File pulldown menu.
0030-0109 0030-0113 0030-0109 string searched to the top/end of the file and did not find string Explanation: This message is formatted dynamically from the string you are searching for, and the direction of the search. Message format is: Searched to the limit of the file and did not find string .
0030-0114 0030-0120 0030-0114 Array string on task string , thread string has a different number of dimensions. It is excluded from the export. Explanation: The array with a matching array name on the specified task and thread does not meet the match criteria and is excluded from the export.
0030-0121 0030-0129 0030-0121 The MPI application has not been run in debug mode; therefore, there will be no data on blocking calls and no timestamp information. Explanation: Some MPI debugging data is only collected when MPI is run in DEBUG mode.
0030-0130 0030-2208 0030-0130 Could not get message group information. Explanation: An error occurred while attempting to retrieve group information for an MPI message record. User Response: If the error persists, cancel and restart the message queue debugging feature.
0030-2209 0030-2218 0030-2209 Task number has requested exit. Explanation: The indicated task has attempted to exit. The program terminates when all tasks have requested exit.
0030-2219 0030-2227 0030-2219 No members were chosen. Explanation: When attempting to add a new group, you didn't choose any tasks as its members. User Response: Select members for the new group. 0030-2220 Too many members were specified.
0030-2230 0030-2240 0030-2230 No Items were selected. Explanation: The user selected Apply or Ok on the Variable Selection window without choosing any variables to be displayed. User Response: None. This is an informational message. 0030-2232 Could not locate source file: string for task: number .
0030-2241 0030-2257 0030-2241 Task number loaded with string string . Explanation: Describes what executable and arguments were loaded for a particular task. User Response: None. This is an informational message. 0030-2242 Unable to send command to task ' number ' .
0030-2259 0030-2266 0030-2259 Unable to write to the directory string . Explanation: pedb was not able to write to the directory specified. This is the directory that is used to write the temporary files used in visualization. User Response: Check the permissions of the directory.
0030-2267 0030-2275 0030-2267 HDF Failure: Unable to write array slice. Explanation: An error has occurred while trying to write the array slice to the HDF file. You may encounter this error when the HDF file is corrupted or when your file system is full.
0030-2276 0030-2284 0030-2276 A non-integer value has been entered for the stride. Explanation: A non-integer value was entered in the text field that specifies the stride value. User Response: Enter an integer value. 0030-2277 Zero has been entered for the stride.
0030-2285 0030-2291 0030-2285 Task number is not in DEBUG state. It is excluded from the export. Explanation: A task must be in DEBUG state to be able to participate in an export. User Response: If the user does not care that the task was excluded from the multi array export, the message can be ignored.
0030-2292 0030-2296 0030-2292 You cannot Export at this time because the program stack has changed since you created this window. The chosen array is out of scope. Explanation: The array that was chosen in the Export window is no longer within scope.
0030-2297 0030-3008 0030-2297 Please specify a filename in the Export Filename field. Explanation: No file name has been specified in the Export Filename field of the Export window. It may be that the field is empty or that the field contains only a directory path.
0030-3014 0030-3020 0030-3014 Task number : ReplyExpression(): Internal error returned from unknown callee. Explanation: Received an error code from a routine that ReplyExpression() called but there was no additional information to pass on.
0030-3021 0030-3027 0030-3021 Play mode has been stopped. Explanation: Play mode has been terminated by the halt or stop button. User Response: None. This is an informational message. 0030-3022 Play mode has been started. Explanation: Play mode has been initiated by the play button.
0030-3028 0030-3034 0030-3028 Task number : Remote debug engine was unable to set the initial breakpoint. Explanation: The remote debug engine was unable to set the initial breakpoint. User Response: Check that the file containing the main routine or the program statement has been compiled with the -g option.
0030-3035 0030-3042 0030-3035 Task number : The breakpoint request failed. An invalid source line or invalid condition was specified. Explanation: A source line in the source code window has been selected, and a breakpoint request has been made for that line.
0030-3043 0030-9022 0030-3043 Task number : The executable chosen for debugging did not have execute permission. Explanation: The remote debugger attempted to find the program to execute on a task. User Response: Update the permissions on the program file on the remote node.
0030-9051 0030-9999 Flags: -a Attaches to a running POE job by specifying its process id. The debugger must be executed from the node from which the POE job was initiated. Note that when using the debugger in attach mode there are some debugger command line arguments that should not be used.
0031-001 0031-018 Chapter 4. POE Messages 0031-001 No man page available for poe Explanation: User has requested that the poe man page be displayed (via -h option), but the /usr/man/cat1/poe.1 file does not exist, or some directory in the path leading to the file is not searchable.
0031-019 0031-028 0031-019 pm_contact: connect failed Explanation: The Partition Manager is unable to connect to a remote node. Message 0031-020 follows. The Partition Manager terminates. User Response: Probable PE system error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-029 0031-040 0031-029 Caught signal number ( string ), sending to tasks... Explanation: The indicated signal is not used specifically by Partition Manager, and is being passed on to each remote task. User Response: Verify that the signal was intended.
0031-041 0031-051 0031-041 sigaction(SIGIOT) Explanation: An explanatory sentence follows. The Partition Manager terminates. Cause: The return from sigaction for the indicated signal is negative. 0031-042 sigaction(SIGEMT) Explanation: An explanatory sentence follows.
0031-052 0031-062 0031-052 sigaction(SIGCONT) Explanation: An explanatory sentence follows. The Partition Manager terminates. Cause: The return from sigaction for the indicated signal is negative. 0031-053 sigaction(SIGCHLD) Explanation: An explanatory sentence follows.
0031-063 0031-077 0031-063 sigaction(SIGDANGER) Explanation: An explanatory sentence follows. The Partition Manager terminates. Cause: The return from sigaction for the indicated signal is negative. 0031-064 sigaction(SIGVTALRM) Explanation: An explanatory sentence follows.
0031-078 0031-089 0031-078 invalid retrytime Explanation: The -retrytime option was neither a 0 nor a positive number. User Response: Correct the flag. 0031-079 invalid pmlights Explanation: The -pmlights option was neither a 0 nor a positive number.
0031-092 0031-102 0031-092 MP_PROCS not set correctly Explanation: The MP_PROCS environment variable is not a positive number. User Response: Correct the variable. 0031-093 MP_INFOLEVEL not set correctly Explanation: The MP_INFOLEVEL environment variable is neither 0 nor a positive number less than 32768.
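The two conditions described for 0031-092 and 0031-093 can be checked before a job is launched. The sketch below only mirrors the explanations given here; it is not how POE itself validates its environment. The function name check_environment is hypothetical, and skipping variables that are not set is an assumption.

```python
import re


def _parse_decimal(value):
    """Return value as an int if it is a plain decimal string, else None."""
    if re.fullmatch(r"[0-9]+", value) is None:
        return None
    return int(value)


def check_environment(env):
    """Return message texts for settings that fail the documented checks."""
    problems = []
    # Assumption: a variable that is not set at all is not reported.
    procs = env.get("MP_PROCS")
    if procs is not None:
        count = _parse_decimal(procs)
        # 0031-092: MP_PROCS must be a positive number.
        if count is None or count <= 0:
            problems.append("0031-092 MP_PROCS not set correctly")
    level = env.get("MP_INFOLEVEL")
    if level is not None:
        number = _parse_decimal(level)
        # 0031-093: MP_INFOLEVEL must be 0 or a positive number below 32768.
        if number is None or number >= 32768:
            problems.append("0031-093 MP_INFOLEVEL not set correctly")
    return problems
```

Calling check_environment(dict(os.environ)) (after importing os) would apply the same checks to the current shell environment.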
0031-103 0031-116 0031-103 Invalid MP_TTEMPSIZE Explanation: The MP_TTEMPSIZE environment variable specifies too large a trace file (or an invalid number). User Response: Reduce or correct the size. 0031-104 Incorrect MP_TTEMPSIZE unit Explanation: The MP_TTEMPSIZE environment variable is not of the form number M or number G.
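Message 0031-104 describes MP_TTEMPSIZE as a number followed by M or G. A minimal sketch of that form check follows. The name parse_ttempsize is hypothetical, and requiring an uppercase unit with no space before it is an assumption. The size limit behind 0031-103 is not stated here, so the sketch does not check it.

```python
import re

# Assumption: a decimal count followed directly by an uppercase M or G.
TTEMPSIZE = re.compile(r"([0-9]+)([MG])")


def parse_ttempsize(value):
    """Split a value such as '10M' into (10, 'M'); raise ValueError otherwise."""
    match = TTEMPSIZE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not of the form <number>M or <number>G: {value!r}")
    return int(match.group(1)), match.group(2)
```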
0031-117 0031-124 0031-117 Unable to contact Resource Manager Explanation: The Partition Manager was unable to contact the Resource Manager to allocate nodes of the SP.
0031-125 0031-134 0031-125 Fewer nodes ( number ) specified in string than tasks ( number ). Explanation: There was a larger number of nodes specified than what is defined in the host.
0031-135 0031-142 0031-135 Invalid labelio option, should be YES or NO Explanation: A labelio other than YES or NO was entered. User Response: Re-specify labelio with either YES or NO. 0031-136 Invalid MP_NOARGLIST option, should be YES or NO Explanation: The Partition Manager terminates.
0031-143 0031-149 0031-143 Could not read message from debug socket. Explanation: The call to read() failed when attempting to read a message from the debug socket. User Response: None. 0031-144 error creating directory for core files, reason: < string > Explanation: A corefile directory could not be created for the given reason.
0031-150 0031-155 0031-150 Unable to load shared objects required for Resource Manager Explanation: The execution environment specified use of the Resource Manager, but one or more of the following shared objects did not exist in /usr/lpp/ssp/lib or /usr/lib : jm_client.
0031-156 0031-171 0031-156 Unexpected return code number from ll_get_data ( number ). Explanation: An internal error has occurred. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-172 0031-203 0031-172 I/O buffer overflow Explanation: The stdout or stderr string overflows the output buffer (8K). The excess is discarded. User Response: Probable internal error. Normally, the output is automatically flushed if it exceeds the buffer length.
0031-207 0031-216 0031-207 pmd: sigaction < string > Explanation: Error when setting up to handle a signal. User Response: Probable system error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-217 0031-251 0031-217 POE ( number ), pmd ( number ), and dbe ( number ) versions are incompatible. Explanation: The versions of POE, pmd, and the debug engine (dbe) are incompatible. User Response: You should check that POE, pmd, and dbe are at compatible PE version levels.
0031-252 0031-259 0031-252 task number stopped: string Explanation: The indicated task has been stopped. The second variable in this message indicates the signal that stopped the task. User Response: Probable system error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-260 0031-306 0031-260 Invalid entry in /etc/poe.priority file for user string , class string ; priority adjustment function not started Explanation: In attempting to start the dispatching priority adjustment function, there was no entry for the user and class found in the /etc/poe.
0031-307 0031-311 0031-307 remote child: error restoring stdin. Explanation: The previously closed stdin cannot be restored. User Response: Probable system error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-312 0031-319 0031-312 The checkpoint file string already exists in the working directory. Explanation: While attempting to checkpoint the program, an existing version of the checkpoint file was found in the working directory. Execution is terminated.
0031-320 0031-326 0031-320 Error occurred saving file information during checkpointing. Return code is number . Explanation: An error occurred attempting to save the file information for the data segment while checkpointing the program. User Response: Probable system error.
0031-327 0031-334 0031-327 Error occurred saving the stack data during checkpointing. Return code is number . Explanation: An error occurred saving the stack data while checkpointing the program.
0031-335 0031-342 0031-335 SSM subtype not what was expected Explanation: An internal error was detected where an unexpected message type was returned. The remote node terminates. User Response: Probable PE error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-343 0031-349 0031-343 Error occurred opening the checkpoint directory. Explanation: An error occurred during opening the checkpoint file directory while checkpointing the program.
0031-350 0031-356 0031-350 Error occurred comparing environment variables during restart. Return code is number . Explanation: The original POE and MPI environment variables do not match those contained in the program to be restarted. As a result, the program cannot be restarted.
0031-357 0031-363 0031-357 Error occurred opening the checkpoint directory during restore. Explanation: An error occurred during opening the checkpoint file directory while restoring the program.
0031-364 0031-370 0031-364 Contacting LoadLeveler to string information for string job. Explanation: Informational message to user to indicate that LoadLeveler is being used for the interactive or batch job.
0031-371 0031-376 0031-371 Conflicting specification for -msg_api, using string . Explanation: A batch job using POE was submitted to LoadLeveler with a network statement in the Job Command .
0031-377 0031-402 0031-377 Using string for euidevice. Explanation: Informational message to indicate the message passing device being used for the batch POE job submitted to LoadLeveler.
0031-403 0031-409 0031-403 Forcing dedicated adapter for User Space job Explanation: User explicitly specified User Space job using -euilib us or MP_EUILIB=us and poe is making sure the adapter usage requested from the Resource Manager is dedicated.
0031-410 0031-415
3 -- Could not get hostname
4 -- Nameserver could not resolve host
5 -- Socket error
6 -- Could not connect to host
7 -- Could not send command to remote startd
User Response: Check pathname and permissions for /etc/pmdv2.
0031-416 0031-604 0031-416 string : no response; rc = number Explanation: An error occurred on reading data from remote node to home node. User Response: This is an IP communication error between home and remote node. No acknowledgement of startup was received from the pmd daemon running on the indicated node.
0031-605 0031-612 0031-605 Unexpected EOF on allocation file for task number Explanation: There were not enough entries in the hostfile for the number of processes specified. User Response: Lower the number of processes or add more entries to the hostfile.
0031-613 0031-619 0031-613 Unable to send command to task number Explanation: An error occurred in sending the poe command to the indicated task. Probably the remote node is no longer accessible. POE terminates. User Response: Verify that the remote node in the partition can be contacted by other means.
0031-620 0031-626 0031-620 pm_SSM_write failed in sending the user/environment for taskid number Explanation: The internal pm_SSM_write function failed. The system terminates. User Response: Probable PE error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-627 0031-631 0031-627 Task number connection blocked. Task will be abandoned. Explanation: While shutting down the partition, POE was unable to write to the indicated task, because the socket was blocked. The socket and task are subsequently ignored and the shutdown continues.
0031-632 0031-638 0031-632 Can't connect to PM Array. errno = number Explanation: POE tried to connect to the Program Marker Array tool, but was unsuccessful. The system error number is returned. Most likely, the Program Marker Array has not been started.
0031-639 0031-645 0031-639 Exit status from pm_respond = number Explanation: The pm_respond function exited with the indicated status. User Response: If other error messages occurred, perform corrective action indicated for the message(s); otherwise, no action is required.
0031-646 0031-652 0031-646 PM Array is trying to tell us something ... Explanation: A message from PM Array is directed to the Home Node. At present there are no Home Node functions responding to the PM Array, so the message text is just printed out.
0031-653 0031-659 0031-653 Couldn't route data from STDIN to task number Explanation: An error occurred routing STDIN to the indicated task. The partition is terminated and POE exits.
0031-660 0031-667 0031-660 Partition Manager stopped ... Explanation: The Home Node (POE) has stopped in response to a SIGTSTP (<Ctrl>Z) signal. The remote nodes have been stopped. User Response: To resume the job, issue SIGCONT, or use the shell job control commands fg or bg .
0031-668 0031-675 0031-668 pm_io_command: error in pm_SSM_write, rc = number Explanation: An error occurred while responding to a STDIO MODE QUERY message. The response is abandoned. User Response: Probable POE internal error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-676 0031-687 0031-676 Invalid value string for mp_euidevice Explanation: The mp_euidevice specified on the command line with -euidevice or in the environment with MP_EUIDEVICE is not valid. User Response: Refer to IBM AIX Parallel Environment Operation and Use for valid choices and rerun.
0031-688 0031-702 0031-688 Incorrect subtype number received in structured socket message Explanation: Internal error has occurred. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-703 0031-711 0031-703 invalid nprocs argument Explanation: The nprocs argument received by the pm daemon is invalid. User Response: Probable system error. Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-712 0031-722 0031-712 parent error reading STDIN, rc = number Explanation: pm daemon parent was unable to read STDIN. User Response: Probable system error. 0031-713 pmd parent: error w/ack for sig req to home Explanation: pm daemon parent had error sending ack for sig request.
0031-723 0031-730 0031-723 userid = < number > Explanation: userid is set to the given userid. User Response: No response needed. 0031-724 Executing program: < string > Explanation: The child is executing the given program. User Response: No response needed.
0031-731 0031-803 0031-731 Error getting and setting DFS credentials. Explanation: The PMD called the poe_dce_set function to get and set the current context for establishing the DFS/DCE credentials when it encountered an error. poe_dce_set should have issued additional messages describing the errors.
0031-804 0031-902 0031-804 -pgmmodel string ignored in remote child Explanation: -pgmmodel interpreted only in parent code. User Response: No response needed. 0031-805 Invalid programming model specified: string Explanation: -pgmmodel should be either SPMD or MPMD.
0031-903 0031-A400 0031-903 Can't confirm profiling for task number Explanation: A communication failure has occurred. User Response: Retry; if problem persists, gather information about the problem and follow local site procedures for reporting hardware and software problems.
0031-A401 0031-A409 0031-A401 Error in binding socket Explanation: The program pmarray terminates. An explanatory sentence is appended. User Response: Probable system error. Check the condition(s) given in the explanatory sentence. 0031-A402 Error in listen Explanation: The program pmarray terminates.
0032-001 0032-010 Chapter 5. MPI Messages 0032-001 Invalid source task ( number ) in string , task number Explanation: The value of src (source task ID) is out of range. User Response: Make sure that the source task id is within the range 0 to N-1, where N is the number of tasks in the partition.
0032-011 0032-019 0032-011 Invalid qtype value ( number ) in string , task number Explanation: The value specified for qtype is invalid. User Response: Make sure that qtype is either 1, 2, or 3. 0032-012 Invalid nelem value ( number ) in string , task number Explanation: The value specified for nelem is invalid.
0032-020 0032-029 0032-020 Invalid task id ( number ) in string , task number Explanation: The value specified for taskid is out of range. User Response: Make sure that taskid is within the range 0 to N-1, where N is the number of tasks in the partition.
0032-030 0032-050 0032-030 Inconsistent flag value in string , task number Explanation: The same value of flag was not specified by each task in the group.
0032-051 0032-057 0032-051 Invalid count argument Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred. User Response: See the entry for the specific error code returned by the MPI function.
0032-058 0032-064 0032-058 Invalid group Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred. User Response: See the entry for the specific error code returned by the MPI function.
0032-065 0032-071 0032-065 Known error not in this list Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred. User Response: See the entry for the specific error code returned by the MPI function.
0032-072 0032-078 0032-072 Invalid info. Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred. User Response: See the entry for the specific error code returned by the MPI function.
0032-079 0032-085 0032-079 File exists. Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred. User Response: See the entry for the specific error code returned by the MPI function.
0032-086 0032-104 0032-086 Data representation already defined. Explanation: This is an MPI error class, returned by MPI_Error_class. It provides a broad description of the type of error that occurred. User Response: See the entry for the specific error code returned by the MPI function.
0032-105 0032-112 0032-105 Invalid group handle ( number ) in string , task number Explanation: The specified group handle is undefined or NULL. User Response: Make sure that the group handle is either predefined or was returned by an MPI function.
0032-113 0032-119 0032-113 Out of memory in string , task number Explanation: There is insufficient memory available to continue. User Response: Reduce the size of user storage required per task. Error Class: MPI_ERR_INTERN 0032-114 Internal error: string in string , task number Explanation: An internal software error occurred during execution.
0032-120 0032-127 0032-120 Declaration has upper bound < lower bound ( number ) in string , task number Explanation: No datatype can be defined with negative extent (upper bound less than lower bound). User Response: Make sure any MPI_LB or MPI_UB argument to MPI_Type_struct is consistent with the layout being defined.
0032-128 0032-135 0032-128 Inconsistent root node ( number ) in string , task number Explanation: The participants in a collective operation did not all specify the same value for root . User Response: Make sure that root is identical for all tasks making the call.
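The condition behind 0032-128 is that every task in a collective operation supplies the same root. As a rough illustration only (this is not how PE detects the error), the rule can be expressed over a list of the root values each task would pass; `roots_consistent` is a hypothetical helper name.

```python
def roots_consistent(roots):
    """Return True when every task specified the same root value.

    roots is a list with one entry per participating task
    (a hypothetical representation used only for this sketch).
    """
    return len(set(roots)) <= 1


if __name__ == "__main__":
    print(roots_consistent([0, 0, 0, 0]))  # every task agrees on root 0
    print(roots_consistent([0, 0, 1, 0]))  # task 2 passed a different root
```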
0032-136 0032-142 0032-136 Invalid communicator ( number ) in string , task number Explanation: The value used for communicator is not a valid communicator handle. User Response: Make sure that the communicator is valid (predefined or created by an MPI function) and has not been freed by MPI_Comm_free.
0032-143 0032-149 0032-143 Invalid dimension count ( number ) in string , task number Explanation: The value specified for ndims is invalid. User Response: Make sure that the number of dimensions is greater than zero.
0032-150 0032-156 0032-150 MPI is not initialized in string , task number Explanation: A call to an MPI function other than MPI_Init or MPI_Initialized was made before MPI was initialized. User Response: Call MPI_Init before any other MPI function other than MPI_Initialized.
0032-157 0032-163 0032-157 Invalid request handle ( number ) in string , task number Explanation: The value specified is not a valid request handle.
0032-164 0032-170 0032-164 Delete callback failed in string , task number Explanation: A non-zero return code was returned by the delete callback function associated with an attribute keyval. The specific value returned by the delete callback function is not available via MPI.
0032-172 0032-177 0032-172 Invalid color ( number ) in string , task number Explanation: A negative value was used for color. User Response: Make sure that color is greater than or equal to zero, or is MPI_UNDEFINED.
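The rule for color in 0032-172 (non-negative, or MPI_UNDEFINED) can be sketched as below. The numeric value of MPI_UNDEFINED depends on the MPI implementation, so `UNDEFINED_SENTINEL` is a placeholder assumption, and `is_valid_color` is a hypothetical name.

```python
# Placeholder only: the real value of MPI_UNDEFINED is defined by the
# MPI implementation's headers, not by this sketch.
UNDEFINED_SENTINEL = -32766


def is_valid_color(color, undefined=UNDEFINED_SENTINEL):
    """Return True when color is >= 0 or equals the undefined sentinel."""
    return color >= 0 or color == undefined


if __name__ == "__main__":
    for candidate in (0, 7, -1, UNDEFINED_SENTINEL):
        print(candidate, is_valid_color(candidate))
```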
0032-178 0032-183 0032-178 A negative number of triplets was specified ( number ) in string , task number Explanation: The number of range triplets specified must be positive. A zero is accepted as a valid number though calling the range include or exclude function with zero ranges is probably not useful.
0032-184 0032-188 0032-184 MPI was not finalized in string , task number Explanation: An MPI program exited without calling MPI_Finalize. The parallel job is terminated with an error exit code.
0032-189 0032-253 0032-189 Datatype extent cannot be expressed as an integer or MPI_Aint in string , task number Explanation: A call to create a user-defined datatype would create a type with an extent or true extent set by MPI_LB or MPI_UB whose magnitude is too great to be expressed by an integer or MPI_AINT.
0032-254 0032-281 0032-254 MP_SINGLE_THREAD is set in a multi-threaded program, detected in string , task number Explanation: The MP_SINGLE_THREAD environment variable is set, but multiple threads are executing. User Response: Unset the MP_SINGLE_THREAD environment variable and rerun the program.
0032-282 0032-305 0032-282 Invalid info key number ( number ) in string , task number . Explanation: The info key number specified must be between 0 and N-1, where N is the number of keys currently defined in the info argument. User Response: Correct the info key number argument.
0032-306 0032-312 0032-306 Unclosed files when finalizing string , task number . Explanation: There are still open files when MPI_FINALIZE is called.
0032-313 0032-319 0032-313 Invalid grid size ( number ) in string , task number . Explanation: The cartesian grid of processes defined by arguments ndims and array_of_psizes to MPI_TYPE_CREATE_DARRAY() has a size different from argument size.
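Message 0032-313 describes a mismatch between the process grid and the size argument. Assuming the grid size is the product of the first ndims entries of array_of_psizes, the consistency check could look like the sketch below; `grid_matches_size` is a hypothetical helper, not part of MPI or PE.

```python
import math


def grid_matches_size(ndims, array_of_psizes, size):
    """Return True when the process grid holds exactly `size` processes.

    The grid size is taken to be the product of the first ndims
    entries of array_of_psizes (an assumption of this sketch).
    """
    return math.prod(array_of_psizes[:ndims]) == size


if __name__ == "__main__":
    print(grid_matches_size(2, [2, 3], 6))  # 2 x 3 grid for 6 processes
    print(grid_matches_size(2, [2, 3], 8))  # 2 x 3 grid cannot hold 8
```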
0032-320 0032-328 0032-320 Invalid displacement ( number ) in string , task number . Explanation: A negative displacement has been specified. User Response: Modify the value of the disp argument. Error Class: MPI_ERR_ARG 0032-321 Permission denied string , task number .
0032-329 0032-336 0032-329 Pending I/O operations when setting file size string , task number . Explanation: The file size is being set while there are still pending I/O operations on the file. User Response: Modify the program so that all I/O operations are complete prior to setting the file size.
0032-338 0032-404 0032-338 Inconsistent elementary datatypes string , task number Explanation: The file view is being set and the elementary datatypes specified by the participating processes do not have the same extent. User Response: Modify the elementary datatypes and make sure they have the same extent on all processes.
0032-405 0032-410 0032-405 Internal fsync failed ( number ) in string , task number . Explanation: An internal call to fsync() failed. User Response: Check error number and take appropriate action.
0033-1001 0033-1007 Chapter 6. VT Messages 0033-1001 Node is inactive Explanation: The node selected for monitoring is not active. Error Class: The selected square does not represent a node that is communicating with the performance monitor. User Response: Select a different square.
0033-1008 0033-1014 0033-1008 Accept failed for the PM data collector Explanation: A connection on the socket for communicating between the Performance Monitor and the dug program could not be accepted.
0033-1015 0033-1022 0033-1015 Error number reading node list file string Explanation: Internal Program error. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems. 0033-1016 Error writing socket during allocation Explanation: Internal Program error.
0033-1023 0033-1029 0033-1023 dug: socket() failed. Error is string Explanation: The socket for communicating between the Performance Monitor and the dug program could not be created by dug. Error Class: The socket() function failed for the indicated reason.
0033-1030 0033-1036 0033-1030 pm_connect_dug() : tmpnam() failed. Unable to get Unix stream socket pathname. Error is string Explanation: The Performance Monitor was not able to establish a communication channel with the performance statistics collection program, dug.
0033-1037 0033-1042 0033-1037 pm_connect_digq() Connection with digq timed out after number seconds. Explanation: The Performance Monitor did not receive a response from the dig query program, digq, after a reasonable amount of time. Error Class: The digq program may not have been started successfully or may have been terminated after starting.
0033-1043 0033-1048 0033-1043 digq():Error in getting the Internet address of the local host string . Error is string Explanation: The performance statistics query program, digq, was unable to get the Internet address of the local host. Error Class: The gethostbyname() function failed for the indicated reason.
0033-1049 0033-1054 0033-1049 digq() was unable to establish a communication channel to the Performance Monitor. Error is string Explanation: The performance statistics query program, digq, was unable to establish a communication channel to the Performance Monitor.
0033-1055 0033-1061 0033-1055 digq()::broadcast: Unable to get socket interface flags. Error is string Explanation: The performance statistics query program, digq, was unable to get the interface flags of the socket in order to locate broadcast devices.
0033-1062 0033-2003 0033-1062 dug: socket read for string failed Error is string Explanation: The dug statistics collection program encountered an error reading the socket connection to the monitor program. Error Class: The monitor program has probably terminated.
0033-2004 0033-2010 0033-2004 AddHostname() could not malloc number bytes for the first element of the history buffer Explanation: The Performance Monitor could not allocate sufficient storage for the internal representation of the first node to be monitored.
0033-2011 0033-2028 0033-2011 Cannot get color for string spectrum Explanation: A color for the indicated spectrum could not be obtained. Error Class: Either the X server where VT is running does not support a named color or no free color cells remain in the colormap.
0033-2029 0033-2037 0033-2029 Cannot close previous trace file because string Explanation: An error occurred while trying to close the previous trace file. Error Class: The fclose() function failed for the indicated reason. User Response: If the problem appears to be correctable, do so.
0033-2041 0033-2048 0033-2041 Internal Error make_menu_item: Invalid Item Type Explanation: An internal program error occurred. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0033-2049 0033-2055 0033-2049 Event widgets must accept multiple ports. Configuration cannot be created. Explanation: An event/display chain was being created but the specified event widget cannot handle multiple processes.
0033-2056 0033-2061 0033-2056 Request to open another display failed. Only 20 displays may be open at any time. Close some displays and try again. Explanation: Only 20 displays can be opened at a time. Error Class: An attempt was made to open another display while 20 were already opened.
0033-2062 0033-2067 0033-2062 Command line option string is not recognized or is missing a required parameter. It will be ignored Explanation: The indicated option is not recognized and the Visualization Tool does not know what to do with it other than ignore it.
0033-2068 0033-2073 0033-2068 Unable to map trace file " string " into memory. Explanation: During an attempt to load a tracefile, although the file existed, was a regular file, and was successfully opened, the file could not be mapped into memory.
0033-2074 0033-3003 0033-2074 Post processing of tracefile string . is complete. Details may be found in string . Explanation: During post processing of the named tracefile, information about the tracefile and post processing was logged to the file.
0033-3005 0033-3019 0033-3005 Could not obtain current time for timestamp file because string Explanation: The program could not determine the current time of day to write into the timestamp file. Error Class: The gettimeofday() function failed for the indicated reason.
0033-3022 0033-3029 0033-3022 Client: Cannot open stream socket for Dig Daemon, Err= string Explanation: The parallel application was attempting to create a unix socket with which to talk to the AIX statistics daemon but failed. Error Class: The socket() function failed.
0033-3068 0033-3073 0033-3068 VT_integrate() Could not open output file " string " Error is string Explanation: While integrating the intermediate trace files, the program could not open an intermediate output file. Error Class: The fopen() function failed for the indicated reason.
0033-3074 0033-3080 0033-3074 write_data_to_usd_file(), setgid() failed from root to user_gid= number , Can't create AIX trace file, Err= string Explanation: While writing the AIX statistics .
0033-3081 0033-3087 0033-3081 PMdig::write() on socket failed in sending version to string . errno= number Explanation: dig disconnected client because of version mismatch.
0033-3088 0033-3094 0033-3088 VT_trrtn::write_trc_data(), Tracing continued after reducing the max size for temp file from number to number Explanation: Tracing continued even though a write to the temp disk failed. Error Class: Temp disk full. User Response: Increase the space on the temp disk.
0033-3095 0033-3100 0033-3095 VT_trc_capture write_buffusd_data() Insufficient disk space to write, Tracing stopped. Space left is number , required is number Explanation: Not enough disk space left. Error Class: Insufficient disk space. User Response: Clean the disk.
0033-3101 0033-3108 0033-3101 VT_trc_set_params(): Setting Temp File size to threshold size. Set Size = number , Minimum size = number Explanation: This happens when the user tries to set the size of the temporary file to be less than the minimum threshold size.
0033-3109 0033-3115 0033-3109 connect_dig() Select Failed for the Unix socket. Error is string Explanation: The Tracing routine experienced an internal program error on its communication channel. User Response: Gather information about the problem and follow local site procedures for reporting hardware and software problems.
0033-3116 0033-3123 0033-3116 DIG(), Connection with Application program timed out after number seconds. Explanation: The DIG daemon did not receive a response from the Trace client within a reasonable amount of time. Error Class: The trace client may have died, or there may be a system delay.
0033-3124 0033-3129 0033-3124 Internal program error occurred during trace integration Explanation: During the trace integration portion of the poe job, a required structure was not initialized properly. Error Class: A program error occurred that prevents trace integration.
0033-3130 0033-3134 0033-3130 Unable to allocate space to store " string ", which is the name of the temporary kernel statistics trace file. Explanation: The dig program (which is spawned from the parallel application) could not save the name of the file that it was supposed to write kernel trace records into.
0033-3135 0033-4002 0033-3135 Unable to create temporary name for kernel statistics file. Error is string Explanation: VT trace generation could not create a temporary name for the file to record kernel statistics. The message gives the reason the system call failed.
0033-4003 0033-4007 0033-4003 string History buffer position should be 0 but is number History buffer will be reset Explanation: During visualization, the internal semaphores of the display have become unsynchronized. Error Class: Internal program error.
0033-4100 0033-4350 0033-4100 string Internal program error. The meter height of number is less than the minimum meter height of number . Explanation: During visualization, the program attempted to reallocate the pixmaps used to display the processor label numbers.
0033-4351 0033-4355 0033-4351 Pixmap not created for compressed rectangles. Internal Error in StripGraph::SetXhatchGC(). Explanation: During visualization, an internal error occurred in the StripGraph display. Error Class: Internal Program Error when calling XCreateGC.
0033-4356 0033-4575 0033-4356 Time index value incorrect. Internal Error in StripGraph::back_in_time_start_x_pos(). stripgraph->back_time_indx is greater than or equal to HIST_BUFF_LEN. HIST_BUFF_LEN = number . Explanation: During visualization, an internal error occurred in the StripGraph display.
2537-0002 2537-0006 Chapter 7. Xprofiler Messages 2537-0002 No file was specified in Binary Executable File dialog. Explanation: When you are trying to load one or more gmon.out files, you are required to also specify the name of the binary executable file that was executed to produce the gmon.
2537-0007 2537-0010 2537-0007 You must first select a function from the list. Explanation: Before using the Utility->Locate in Graph option in either the Flat Profile or Function Index report window, a function in the report must be selected first.
2537-0011 2537-0016 2537-0011 The selected function's source file name is not available. Explanation: Internal error. There is no source file name associated with the selected function, so no file can be opened.
2537-0017 2537-0023 2537-0017 There must be at least one space separating the runtime option string and its corresponding value. Explanation: At least one space must be typed between an Xprofiler command-line option and its associated value. For example, -e foo (with one or more spaces) is an acceptable format, but -efoo is not.
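The spacing rule in 2537-0017 can be illustrated with a small pre-check over command-line tokens. This is a sketch under assumptions, not Xprofiler's own parser: `fused_option_tokens` is a hypothetical helper, and the caller supplies the set of options that take a value (only -e is taken from the text above).

```python
def fused_option_tokens(argv, value_options):
    """Return tokens where an option and its value were typed together.

    argv is the list of command-line tokens; value_options is the set
    of option names that take a value (supplied by the caller).
    """
    fused = []
    for token in argv:
        for opt in value_options:
            # "-efoo" starts with "-e" but is not exactly "-e".
            if token.startswith(opt) and token != opt:
                fused.append(token)
                break
    return fused


if __name__ == "__main__":
    print(fused_option_tokens(["-e", "foo"], {"-e"}))  # acceptable form
    print(fused_option_tokens(["-efoo"], {"-e"}))      # value fused to option
```

A real parser would also need to distinguish longer option names that share a prefix, which this sketch does not attempt.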
2537-0024 2537-0030 2537-0024 Cannot open file string for reading. Check for valid path and file specification and permissions. Explanation: An attempt to read data from the file in the directory that you specified failed, due to the fact that the file cannot be opened for reading.
2537-0031 2537-0036 2537-0031 A severe error was detected, and file processing has stopped. Refer to the message window below this window for more details. Explanation: A function involving the symbol tables for your application failed to perform correctly while Xprofiler was trying to process your input files.
2537-0037 2537-0041 2537-0037 The gmon.out file count data in the Xprofiler internal table is incorrect. Explanation: Internal error. Xprofiler has an internal table that contains an entry for each gmon.out file that you specified to be loaded for the application you are analyzing.
2537-0042 2537-0048 2537-0042 Number of CPU sampling data records in the gmon.out file string is greater than the value in the associated header. Explanation: AIX error. The gmon.out file contains more records of CPU sampling data for one of the functions in your program than indicated by the value in the header data.
2537-0049 2537-0053 2537-0049 Failed to obtain file information about string . Explanation: Failed to obtain information about the specified directory. This is because either the specified path contains a non-existent directory, or one of the parent directories in the path does not have execute permission.
2537-0054 2537-0060 2537-0054 NarcCorrelate() had a negative return value. Explanation: Internal error. The Function Call Tree was unable to be reconstructed by the NARC library, because the node and arc data that this library uses has either been overwritten, or placed in the wrong shared memory location, by Xprofiler.
2537-0061 2537-0065 2537-0061 An attempt to access call count information in the gmon.out file, string , failed. Explanation: Internal error. Xprofiler was unable to access the call count data in this gmon.out file. This problem should have been detected during an earlier stage of file processing, instead of being encountered at this time.
2537-0066 2537-0072 2537-0066 The following string(s) in the specified configuration file do not follow configuration file syntax key = value : string Explanation: The string in the specifie.
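Message 2537-0066 reports lines that do not follow the key = value syntax. The sketch below shows one way to find such lines before loading a file. It assumes that blank lines are ignored and that a line is well formed when it has a non-empty key before the first equals sign; the full Xprofiler syntax rules are not given here, and `malformed_lines` is a hypothetical helper.

```python
def malformed_lines(text):
    """Return the non-blank lines that lack a `key = value` shape.

    Assumptions of this sketch: blank lines are skipped, and a line is
    accepted when a non-empty key precedes the first '=' character.
    """
    bad = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        key, sep, _value = stripped.partition("=")
        if not sep or not key.strip():
            bad.append(stripped)
    return bad


if __name__ == "__main__":
    sample = "a = 1\nbogus\n= 3\n\nb=2"
    print(malformed_lines(sample))
```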
2537-0073 2537-0073 2537-0073 The specified file string is in an un-recognized format, the file can not be processed further. Explanation: The specified file is not in a recognized format; the file can not be processed further. User Response: Make sure that all files used in Xprofiler are in a supported file format.
Glossary of Terms and Abbreviations This glossary includes terms and definitions from: The Dictionary of Computing , New York: McGraw-Hill, 1994. The American National Standard Dictionary for Information Systems , ANSI X3.172-1990, copyright 1990 by the American National Standards Institute (ANSI).
buffer . A portion of storage used to hold input or output data temporarily. C C . A general purpose programming language. It was formalized by ANSI standards committee for the C language in 1984 and by Uniforum in 1983.
distributed shell (dsh) . A Parallel System Support Programs command that lets you issue commands to a group of hosts in parallel. See IBM Parallel System Support Programs for AIX: Command and Technical Reference for details.
gprof . A UNIX command that produces an execution profile of C, Pascal, Fortran, or COBOL programs. The execution profile is in a textual and tabular format. It is useful for identifying which routines use the most CPU time. See the man page on gprof .
latency . The time interval between the instant at which an instruction control unit initiates a call for data transmission, and the instant at which the actual transfer of data (or receipt of data at the remote end) begins.
P package . A number of filesets that have been collected into a single installable image of program products, or LPPs. Multiple filesets can be bundled together for installing groups of software together. See also fileset and Licensed Program Product .
remote shell (rsh) . A command supplied with both AIX and the Parallel System Support Programs that lets you issue commands on a remote host. Report . In Xprofiler, a tabular listing of performance data that is derived from the gmon.out files of an application.
debugger to print information about the state of the program. trace record . In PE, a collection of information about a specific event that occurred during the execution of your program. For example, a trace record is created for each send and receive operation that occurs in your program (this is optional and may not be appropriate).
Communicating Your Comments to IBM IBM Parallel Environment for AIX Messages Version 2 Release 4 Publication No. GC28-1982-02 If you especially like or dislike anything about this book, please use one of the methods listed below to send your comments to IBM.
IBM Program Number: 5765-543 Printed in the United States of America on recycled paper containing 10% recovered post-consumer fiber. GC28-1982-02.