Partitions crash in a loop on reboot when a RoCE adapter is installed.

Without the RoCE adapters, the system boots normally.

# lscfg -vl roce0
roce0 U2C4E.001.DBJR146-P2-C2-T1-L0 PCIe2 10GbE RoCE Converged Network Adapter

PCIe2 2-Port 10GbE RoCE SFP+ Adapter:
Part Number.................00E1601
EC Level....................D77301
FRU Number..................00E1599
Serial Number...............00E1601YA50AG3CM01Y
Product Specific.(Z0).......IBM0F30001010
Product Specific.(Z1).......EC30
Manufacture ID..............335X2557740966
Product Specific.(RV)......."
ROM Level.(alterable).......000200091326
ROM Level.(ext alterable)...000200091326

$ lsmcode -rd roce0
b315506714106104.000200091326
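
The lsmcode result above appears to join the adapter's device identifier and its ROM level with a dot: the first part matches the suffix of one of the devices.pciex file sets named in the IBM excerpt, and the second part matches the ROM Level shown by lscfg. A minimal sketch of splitting it, assuming that layout holds (the function name split_mcode is hypothetical):

```shell
# Split an "lsmcode -rd" result of the form <device-id>.<rom-level>.
# Assumption: the first dot separates the two fields, as in the sample above.
split_mcode() {
    mcode=$1
    dev_id=${mcode%%.*}      # text before the first dot
    rom_level=${mcode#*.}    # text after the first dot
    echo "fileset=devices.pciex.$dev_id rom=$rom_level"
}

split_mcode "b315506714106104.000200091326"
```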

A workaround is to reboot the partition without the RoCE adapters, then delete the hba device definitions with the command 'rmdev -Rdl hba0', and likewise for 'hba1'.
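
The workaround can be sketched as a small loop. This is only an illustration: the function name and the DRY_RUN switch are hypothetical, and by default the sketch prints the commands instead of running them, so it can be reviewed before use on the AIX partition.

```shell
# Dry-run sketch of the workaround: remove the hba definitions and their children.
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 runs them.
DRY_RUN=${DRY_RUN:-1}

remove_hba_definitions() {
    for dev in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "rmdev -Rdl $dev"
        else
            # -R: include child devices, -d: delete the definition, -l: device name
            rmdev -Rdl "$dev"
        fi
    done
}

remove_hba_definitions hba0 hba1
```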

 

Below is an excerpt from the IBM documentation on the subject:

AIX NIC + OFED RDMA

As of AIX® 7 with 7100-02 the PCIe2 10 GbE RoCE Adapter can be configured to run in the AIX NIC configuration. As of AIX 7 with 7100-03, the OFED RDMA functionality was also added to the AIX NIC configuration. If you do not have the network-intensive applications that benefit from RDMA, then you can use the adapter as a Network Interface Card (NIC) only.

To use the PCIe2 10 GbE RoCE Adapter in the AIX NIC + OFED RoCE configuration or AIX RoCE configuration, the following file sets are required and are available on the AIX 7 with 7100-03 base operating system CD.
devices.ethernet.mlx
    Converged Ethernet Adapter main device driver (mlxentdd) to support the AIX NIC + OFED RoCE configuration.
devices.pciex.b315506b3157265
    Packaging support for the NGP ITE Converge Ethernet Adapter ASIC2.
devices.pciex.b3155067b3157365
    Packaging support for the NGP ITE Converge Ethernet Adapter ASIC1.
devices.pciex.b315506714101604
    Packaging for Mellanox 2 Ports 10 GbE Converge Ethernet Adapter with the small form factor pluggable (SFP+) transceivers.
devices.pciex.b315506714106104
    Packaging for Mellanox 2 Ports 10 GbE Converge Ethernet Adapter that supports any SFP+ transceivers.
devices.common.IBM.ib
    ICM device driver that is required to use the AIX RoCE configuration.
devices.pciex.b3154a63
    Mellanox 10 GbE Converge Ethernet Adapter device driver that is required to use the AIX RoCE configuration.
ofed.core
    OFED Core Runtime Environment file set that is needed only if OFED RDMA is required.
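
To see which of the file sets listed above are present, something like the following could be run on the partition. It is a sketch under one assumption: that "lslpp -L <fileset>" exits with a non-zero status when the file set is not installed. The helper names is_installed and report_filesets are hypothetical.

```shell
# File sets named in the IBM excerpt above.
ROCE_FILESETS="devices.ethernet.mlx
devices.pciex.b315506b3157265
devices.pciex.b3155067b3157365
devices.pciex.b315506714101604
devices.pciex.b315506714106104
devices.common.IBM.ib
devices.pciex.b3154a63
ofed.core"

# Assumption: lslpp -L returns non-zero for a file set that is not installed.
is_installed() {
    lslpp -L "$1" >/dev/null 2>&1
}

# Print one "installed" or "missing" line per file set.
report_filesets() {
    for fs in $ROCE_FILESETS; do
        if is_installed "$fs"; then
            echo "installed $fs"
        else
            echo "missing $fs"
        fi
    done
}

report_filesets
```

A "missing ofed.core" line is expected when OFED RDMA is not needed, since the excerpt marks that file set as optional.
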
After the existing AIX RoCE file sets are updated with the new file sets, both the roce and the ent devices might appear to be configured. If both devices appear to be configured when you run the lsdev command on the adapters, complete the following steps:
  1. Delete the roceX instances that are related to the PCIe2 10 GbE RoCE Adapter by entering the following command:
    # rmdev -dl roce0[, roce1][, roce2,...]
  2. Delete the entX instances that are related to the PCIe2 10 GbE RoCE Adapter by entering the following command:
    # rmdev -dl ent1[,ent2][, ent3...]
  3. If there are one or more converged host bus adapters (hbaX) that are related to the PCIe2 10 GbE RoCE Adapter, delete them by entering the following command:
    # rmdev -dl hba0[, hba1][,hba2...]
  4. Run the configuration manager to incorporate the changes by entering the following command:
    # cfgmgr
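
The four cleanup steps above can be turned into a command plan derived from the lsdev listing. The sketch below only prints the commands, one rmdev per device, in the order roceX, entX, hbaX, then cfgmgr. It assumes every device belonging to the adapter carries "RoCE" in its description. The function name plan_roce_cleanup is hypothetical, and the sample input reuses lines from the lsdev example that appears later in this excerpt.

```shell
# Read "lsdev -C -c adapter" style lines on stdin and print the cleanup commands.
# Assumption: devices of the adapter are recognisable by "RoCE" in the description.
plan_roce_cleanup() {
    listing=$(cat)
    for prefix in roce ent hba; do
        printf '%s\n' "$listing" |
            awk -v p="$prefix" '$1 ~ ("^" p "[0-9]+$") && /RoCE/ { print "rmdev -dl " $1 }'
    done
    echo "cfgmgr"
}

# On AIX the input would come from: lsdev -C -c adapter | plan_roce_cleanup
plan_roce_cleanup <<'EOF'
ent1 Available 00-00-01 PCIe2 10GbE RoCE Converged Network Adapter
ent2 Available 00-00-02 PCIe2 10GbE RoCE Converged Network Adapter
hba0 Available 00-00 PCIe2 10GbE RoCE Converged Host Bus Adapter (b315506714101604)
EOF
```
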
Complete the following steps to switch over to the AIX NIC + OFED RoCE configuration from the AIX RoCE configuration:
  1. Stop all RDMA applications that are running on the PCIe2 10 GbE RoCE Adapter.
  2. Delete or redefine the roceX instances by entering one of the following commands:
    • rmdev -d -l roce0
    • rmdev -l roce0
    The rmdev -l roce0 command retains the definition of the roce0 configuration so you can use it the next time to create instances.
  3. Change the attribute of the hba stack_type setting from aix_ib (AIX RoCE) to ofed (AIX NIC + OFED RoCE) by entering the following command:
    # chdev -l hba0 -a stack_type=ofed 
  4. Run the configuration manager tool so that the host bus adapter can configure the PCIe2 10 GbE RoCE Adapter as a NIC adapter by entering the following command:
    # cfgmgr
  5. Verify that the adapter is now running in NIC configuration by entering the following command:
    # lsdev -C -c adapter
    The following example shows the results when you run the lsdev command on the adapter when it is configured in AIX NIC + OFED RoCE mode:
    Figure 1. Example output of lsdev command on an adapter with the AIX NIC + OFED RoCE configuration
    ent1 Available 00-00-01 PCIe2 10GbE RoCE Converged Network Adapter
    ent2 Available 00-00-02 PCIe2 10GbE RoCE Converged Network Adapter
    hba0 Available 00-00 PCIe2 10GbE RoCE Converged Host Bus Adapter (b315506714101604)
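
The verification in step 5 can be automated with a small filter over the lsdev listing. This sketch assumes that "NIC mode" can be recognised by at least one RoCE entX device in the Available state and no roceX device in that state. That criterion is an interpretation of the example output, not something the IBM text states, and the function name is_nic_mode is hypothetical.

```shell
# Exit 0 when the listing on stdin looks like the AIX NIC + OFED RoCE configuration.
# Assumption: NIC mode = some RoCE entX is Available and no roceX is Available.
is_nic_mode() {
    awk '
        /RoCE/ && $1 ~ /^roce[0-9]+$/ && $2 == "Available" { roce++ }
        /RoCE/ && $1 ~ /^ent[0-9]+$/  && $2 == "Available" { ent++ }
        END { exit !(ent > 0 && roce == 0) }
    '
}

# On AIX the input would come from: lsdev -C -c adapter | is_nic_mode
if is_nic_mode <<'EOF'
ent1 Available 00-00-01 PCIe2 10GbE RoCE Converged Network Adapter
ent2 Available 00-00-02 PCIe2 10GbE RoCE Converged Network Adapter
hba0 Available 00-00 PCIe2 10GbE RoCE Converged Host Bus Adapter (b315506714101604)
EOF
then
    echo "NIC mode"
else
    echo "not in NIC mode"
fi
```
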
As of AIX 7 with 7100-03, AIX also supports OFED RDMA in the AIX NIC mode. To enable OFED RDMA, complete the following additional steps:
  1. Install the package ofed.core.
  2. Set the RDMA mode in the ent1, ent2 devices by entering the following command:
    # chdev -l ent1 -a rdma=desired
    # chdev -l ent2 -a rdma=desired
    The RDMA mode is set before the en1 or en2 interfaces are configured.
  3. You can disable the RDMA mode by entering the following command:
    # chdev -l ent1 -a rdma=disabled
    # chdev -l ent2 -a rdma=disabled
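
Switching the rdma attribute on several entX devices can be wrapped in a helper that accepts only the two values shown above. This is a sketch: the function name set_rdma and the DRY_RUN switch are hypothetical, and with DRY_RUN unset the commands are only printed. Printing them with plain ASCII hyphens also avoids the typographic dashes that copy and paste from formatted pages can introduce.

```shell
# Print (or run) the chdev commands that set the rdma attribute on entX devices.
# Only the two values from the steps above are accepted.
set_rdma() {
    mode=$1
    shift
    case $mode in
        desired|disabled) ;;
        *) echo "unsupported rdma mode: $mode" >&2; return 1 ;;
    esac
    for dev in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "chdev -l $dev -a rdma=$mode"
        else
            chdev -l "$dev" -a rdma="$mode"
        fi
    done
}

set_rdma desired ent1 ent2
```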