Errors such as the following may appear in the error report:
3D32B80D 0625121014 P S topsvcs NIM thread blocked
DCB47997 0625121014 T H hdisk19 DISK OPERATION ERROR
90D3329C 0625121014 P S topsvcs NIM read/write error
3D32B80D 0625121014 P S topsvcs NIM thread blocked
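The second column of each summary line is a timestamp in MMDDhhmmYY form, so 0625121014 is June 25, 2014 at 12:10, which matches the Date/Time field of the detailed entry. As a minimal sketch, the helper below (decode_errpt_ts is a hypothetical name, not an AIX command) decodes that column; it assumes the two-digit year belongs to the 2000s.

```shell
# Decode an errpt summary timestamp (MMDDhhmmYY) into "YYYY-MM-DD hh:mm".
# Assumption: the two-digit year belongs to the 2000s.
decode_errpt_ts() {
    printf '%s\n' "$1" | awk '{
        printf "20%s-%s-%s %s:%s\n",
            substr($0, 9, 2), substr($0, 1, 2), substr($0, 3, 2),
            substr($0, 5, 2), substr($0, 7, 2)
    }'
}

# Timestamp taken from the summary lines above
decode_errpt_ts 0625121014
```

This makes it easier to check that the topsvcs entries and the hdisk19 entry were logged within the same minute.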
Looking at the details, note that the disk 'hdisk19' is a PowerHA heartbeat disk.
To avoid this message, which has no impact on production, you can lower the failure detection rate in the settings of the network module tied to the heartbeat (usually 'diskhb').
To do so, go to the PowerHA SMIT menus:
# smit hacmp
Extended Configuration + Extended Topology Configuration + Configure HACMP Network Modules + Change a Network Module using Pre-defined Values
Then modify the Failure Detection Rate value, shown as Slow in the panel below.
Network Module Name       diskhb
Description               Disk Heartbeat Serial protocol
Failure Detection Rate    Slow
NOTE: Changes made in this panel must be
propagated to the other nodes by
Verifying and Synchronizing the cluster
ERROR DETAILS below:
---------------------------------------------------------------------------
LABEL: TS_NIM_ERROR_STUCK_
IDENTIFIER: 3D32B80D
Date/Time: Wed Jun 25 12:10:47 2014
Sequence Number: 338681
Machine Id: 00F60AE00
Node Id: hayamq11
Class: S
Type: PERM
WPAR: Global
Resource Name: topsvcs
Description
NIM thread blocked
Probable Causes
A thread in a Topology Services Network Interface Module (NIM) process
was blocked
Topology Services NIM process cannot get timely access to CPU
User Causes
Excessive memory consumption is causing high memory contention
Excessive disk I/O is causing high memory contention
Recommended Actions
Examine I/O and memory activity on the system
Reduce load on the system
Tune virtual memory parameters
Call IBM Service if problem persists
Failure Causes
Excessive virtual memory activity prevents NIM from making progress
Excessive disk I/O traffic is interfering with paging I/O
Recommended Actions
Examine I/O and memory activity on the system
Reduce load on the system
Tune virtual memory parameters
Call IBM Service if problem persists
Detail Data
DETECTING MODULE
rsct,nim_control.C,1.39.1.41,7916
ERROR ID
6BUfAx.b.eeH/5jA08oWi.1...................
REFERENCE CODE
Thread which was blocked
receive thread
Interval in seconds during which process was blocked
40
Interface name
rhdiskpower2
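
In the Detail Data section, each label is followed by its value on the next line. As a rough sketch, the helper below (errpt_detail_value is a hypothetical name) pulls out one value by its label; the sample text is copied from the entry above, and the label strings are assumed to match the report exactly.

```shell
# Sample Detail Data lines copied from the entry above
detail_sample='Thread which was blocked
receive thread
Interval in seconds during which process was blocked
40
Interface name
rhdiskpower2'

# Print the line that follows the given label, ignoring surrounding whitespace.
errpt_detail_value() {
    awk -v label="$1" '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        trim($0) == label { if ((getline line) > 0) print trim(line); exit }
    '
}

printf '%s\n' "$detail_sample" | errpt_detail_value 'Interface name'
printf '%s\n' "$detail_sample" | errpt_detail_value 'Interval in seconds during which process was blocked'
```

On a live AIX node, the same helper could presumably be fed from `errpt -a -j 3D32B80D` instead of the sample text, to confirm which heartbeat device was involved and how long the thread was blocked.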