@lzghzr
Last active August 10, 2024 11:14
544+FLR: unlocking 56G direct connect

License: CC BY-SA 4.0

Required tools

Firmware tools: NVIDIA Firmware Tools (MFT)
Firmware tools, version 4.3.0.25: NVIDIA Firmware Tools (MFT) 4.3.0.25
HPE firmware: fw-ConnectX3Pro-rel-2_42_5700-764285-B21_Ax-CLP-8025-UEFI-14.11.49-FlexBoot-3.4.754.tgz
Custom firmware: ConnectX3Pro-rel-2_40_5030.tgz
Firmware ini editing: mft-scripts

Preparation

Download all the required tools, then install both NVIDIA Firmware Tools (MFT) and NVIDIA Firmware Tools (MFT) 4.3.0.25.
Copy fw-ConnectX3Pro-rel.mlx and MCX354A-FCC_Ax.ini from ConnectX3Pro-rel-2_40_5030.tgz into the NVIDIA Firmware Tools (MFT) 4.3.0.25 installation directory.
I installed it in C:\Program Files\Mellanox\WinMFT_x64_4_3_0_25
Then make a copy of MCX354A-FCC_Ax.ini and name it MCX354A-FCC_Ax_56G.ini

Next, copy fw-ConnectX3Pro-rel-2_42_5700-764285-B21_Ax-CLP-8025-UEFI-14.11.49-FlexBoot-3.4.754.bin from fw-ConnectX3Pro-rel-2_42_5700-764285-B21_Ax-CLP-8025-UEFI-14.11.49-FlexBoot-3.4.754.tgz, and fs2_update_ini.py from mft-scripts, into the NVIDIA Firmware Tools (MFT) installation directory.
I installed it in C:\Program Files\Mellanox\WinMFT

Generating the firmware

In MCX354A-FCC_Ax_56G.ini, change the [IB] section from

port1_802_3ap_56kr4_ability = true
port2_802_3ap_56kr4_ability = true

port1_802_3ap_cr4_enable = true
port2_802_3ap_cr4_enable = true
port1_802_3ap_cr4_ability = true
port2_802_3ap_cr4_ability = true

port1_802_3ap_kr4_enable = true
port2_802_3ap_kr4_enable = true
port1_802_3ap_kr4_ability = true
port2_802_3ap_kr4_ability = true

to

port1_802_3ap_56kr4_enable = true
port2_802_3ap_56kr4_enable = true
port1_802_3ap_56kr4_ability = true
port2_802_3ap_56kr4_ability = true

port1_802_3ap_cr4_enable = true
port2_802_3ap_cr4_enable = true
port1_802_3ap_cr4_ability = true
port2_802_3ap_cr4_ability = true

port1_802_3ap_kr4_enable = true
port2_802_3ap_kr4_enable = true
port1_802_3ap_kr4_ability = true
port2_802_3ap_kr4_ability = true

Generate two firmware images, one with 56G enabled and one without

PS C:\Program Files\Mellanox\WinMFT_x64_4_3_0_25> mlxburn -fw fw-ConnectX3Pro-rel.mlx -c MCX354A-FCC_Ax_56G.ini -wrimage MCX354A-FCC_Ax_56G.bin
-W- Removing parameter defined outside a group: "prepMLX version".
-I- Generating image ...
-I- Image generation completed successfully.
PS C:\Program Files\Mellanox\WinMFT_x64_4_3_0_25> mlxburn -fw fw-ConnectX3Pro-rel.mlx -c MCX354A-FCC_Ax.ini -wrimage MCX354A-FCC_Ax.bin
-W- Removing parameter defined outside a group: "prepMLX version".
-I- Generating image ...
-I- Image generation completed successfully.

Analyzing the firmware

Comparing the two images with UltraCompare shows four changes in total

The first is in the header

MCX354A-FCC_Ax_56G.bin MCX354A-FCC_Ax.bin
00000020h: 00 00 68 E6 00 00 00 04 F5 00 00 0B FD 00 3B C8 ; 00000020h: 00 00 32 AE 00 00 00 04 F5 00 00 0B FD 00 3B C8 ;
00000030h: 00 0A 99 48 00 00 3B 84 00 10 00 40 00 00 01 ; 00000030h: 00 0A 99 44 00 00 3B 84 00 10 00 40 00 00 01 85 ;

The second is toward the end of the file

MCX354A-FCC_Ax_56G.bin MCX354A-FCC_Ax.bin
000a7bb0h: 1F 83 F9 00 7F 8F FF 20 00 01 F9 A0 00 8F F0 02 ; 000a7bb0h: 1F 03 F9 00 7F 8F FF 20 00 01 F9 A0 00 8F F0 02 ;
000a7bc0h: 03 8F F0 17 00 01 F9 A4 00 40 00 01 00 D3 01 FF ; 000a7bc0h: 03 8F F0 17 00 01 F9 A4 00 40 00 01 00 D3 01 FF ;
000a7bd0h: 00 01 F9 AC 1F 83 F9 00 7F 8F FF 20 00 01 F9 B0 ; 000a7bd0h: 00 01 F9 AC 1F 03 F9 00 7F 8F FF 20 00 01 F9 B0 ;

The third is just before the end of the file

MCX354A-FCC_Ax_56G.bin MCX354A-FCC_Ax.bin
000a8fe0h: 00 00 96 1F 00 00 00 03 00 00 00 18 00 00 00 00 ; 000a8fe0h: 00 00 22 2D 00 00 00 03 00 00 00 18 00 00 00 ;

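The fourth is the entire tail of the file

The fourth change is large; it is actually the compressed ini data. The second change touches two locations. From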

port1_802_3ap_56kr4_enable = true
port2_802_3ap_56kr4_enable = true

we can guess that these two locations are the settings for the two ports,
while the first and third changes are checksums
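The comparison does not require UltraCompare. As a minimal sketch (the function name `diff_ranges` and the merge distance of 16 bytes are illustrative choices, not part of the original workflow), the differing regions of two images can be listed like this:

```python
def diff_ranges(a: bytes, b: bytes, gap: int = 16):
    """Return (start, end) offset pairs where a and b differ.

    Differences at most `gap` bytes apart are merged into one range.
    Only the common length of the two inputs is compared.
    """
    ranges = []
    for i in range(min(len(a), len(b))):
        if a[i] != b[i]:
            if ranges and i - ranges[-1][1] <= gap:
                ranges[-1][1] = i          # extend the current range
            else:
                ranges.append([i, i])      # start a new range
    return [(start, end) for start, end in ranges]


if __name__ == "__main__":
    import sys
    # Usage: python diff.py MCX354A-FCC_Ax_56G.bin MCX354A-FCC_Ax.bin
    if len(sys.argv) == 3:
        with open(sys.argv[1], "rb") as f1, open(sys.argv[2], "rb") as f2:
            for start, end in diff_ranges(f1.read(), f2.read()):
                print(f"{start:#010x}-{end:#010x}")
```

If the two files differ in length, the extra tail of the longer one is not reported by this sketch.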

Continuing with fw-ConnectX3Pro-rel.mlx, find the options related to port1_802_3ap_56kr4_enable

scratchpad.eth.port[0].mode_40g_is_50g 0x1f99c.5 1
scratchpad.eth.port[0].b0_hw_eye_opener_cfg_measure_time 0x1f99c.8 4
scratchpad.eth.port[0].eth_802_3ap_56kr4_ability 0x1f99c.12 1
scratchpad.eth.port[0].eth_802_3ap_cr4_ability 0x1f99c.13 1
scratchpad.eth.port[0].eth_802_3ap_kr4_ability 0x1f99c.14 1
scratchpad.eth.port[0].eth_802_3ap_kr_ability 0x1f99c.15 1
scratchpad.eth.port[0].eth_802_3ap_kx_ability 0x1f99c.16 1
scratchpad.eth.port[0].eth_802_3ap_kx4_ability 0x1f99c.17 1
scratchpad.eth.port[0].eth_802_3ap_kr2_ability 0x1f99c.18 1
scratchpad.eth.port[0].eth_802_3ap_100M_ability 0x1f99c.19 1
scratchpad.eth.port[0].eth_802_3ap_56kr4_enable 0x1f99c.23 1
scratchpad.eth.port[0].eth_802_3ap_cr4_enable 0x1f99c.24 1
scratchpad.eth.port[0].eth_802_3ap_kr4_enable 0x1f99c.25 1
scratchpad.eth.port[0].eth_802_3ap_kr_enable 0x1f99c.26 1
scratchpad.eth.port[0].eth_802_3ap_kx_enable 0x1f99c.27 1
scratchpad.eth.port[0].eth_802_3ap_kx4_enable 0x1f99c.28 1
scratchpad.eth.port[0].eth_802_3ap_kr2_enable 0x1f99c.29 1
scratchpad.eth.port[0].eth_802_3ap_100M_enable 0x1f99c.30 1

This option is located at 0x1f99c.23
Convert the second change to binary and compare

1F83 F900

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 0 0 1 1 1 1 1 1 0 0 0 0 0 1 1 1 1 1 1 1 0 0 1 0 0 0 0 0 0 0 0

1F03 F900

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 0 0 1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 1 0 0 0 0 0 0 0 0

Clearly, bit 23 is port1_802_3ap_56kr4_enable, so flipping this one bit is enough to unlock 56G without going through mlxburn
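The same comparison can be done with an XOR instead of reading the bit table by eye. This is a small sketch; the `FIELDS` table only copies the bit positions relevant here from the fw-ConnectX3Pro-rel.mlx excerpt above, and the helper name `changed_bits` is illustrative.

```python
# Bit positions in the word at 0x1f99c, copied from the .mlx excerpt above.
FIELDS = {
    12: "eth_802_3ap_56kr4_ability",
    13: "eth_802_3ap_cr4_ability",
    14: "eth_802_3ap_kr4_ability",
    23: "eth_802_3ap_56kr4_enable",
    24: "eth_802_3ap_cr4_enable",
    25: "eth_802_3ap_kr4_enable",
}


def changed_bits(a: int, b: int) -> list:
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(32) if (diff >> bit) & 1]


bits = changed_bits(0x1F83F900, 0x1F03F900)
for bit in bits:
    print(bit, FIELDS.get(bit, "unknown"))  # prints: 23 eth_802_3ap_56kr4_enable
```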

Building the firmware

First, in the NVIDIA Firmware Tools (MFT) directory, extract the configuration file from the HPE image

PS C:\Program Files\Mellanox\WinMFT> flint -i fw-ConnectX3Pro-rel-2_42_5700-764285-B21_Ax-CLP-8025-UEFI-14.11.49-FlexBoot-3.4.754.bin dc HP_1380110017.ini

Go to [IB]

;;speed flags for port0
cx3_spec1_3_ib_support_port0 = 1
cx3_spec1_2_ib_support_port0 = 1
spec1_3_fdr10_ib_support_port0 = 1
spec1_3_fdr14_ib_support_port0 = 1
port1_802_3ap_56kr4_ability = 1
port1_802_3ap_cr4_ability = 1
port1_802_3ap_cr4_enable  = 1

The following three entries are missing

port1_802_3ap_56kr4_enable = true
port1_802_3ap_kr4_enable = true
port1_802_3ap_kr4_ability = true

Based on the analysis above, the missing fields are

scratchpad.eth.port[0].eth_802_3ap_kr4_ability 0x1f99c.14 1
scratchpad.eth.port[0].eth_802_3ap_56kr4_enable 0x1f99c.23 1
scratchpad.eth.port[0].eth_802_3ap_kr4_enable 0x1f99c.25 1

So the settings word in the HPE image should be

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 0 0 1 1 1 0 1 0 0 0 0 0 0 1 1 1 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0

In hexadecimal this is 1D03 B900
Search for 1D03 B900 with UltraEdit

fw-ConnectX3Pro-rel-2_42_5700-764285-B21_Ax-CLP-8025-UEFI-14.11.49-FlexBoot-3.4.754.bin
000ed760h: 00 01 F9 9C 1D 03 B9 00 7F 8F FF 20 00 01 F9 A0 ;
000ed770h: 00 8F F0 02 03 8F F0 17 00 01 F9 A4 00 40 00 01 ;
000ed780h: 00 D3 01 FF 00 01 F9 AC 1D 03 B9 00 7F 8F FF 20 ;
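The expected word can also be derived and searched for programmatically. The sketch below assumes, as the hex dump suggests, that the word is stored big-endian; `clear_bits` and `find_all` are illustrative helper names.

```python
import struct

FULL = 0x1F83F900            # word taken from MCX354A-FCC_Ax_56G.bin
MISSING_BITS = (14, 23, 25)  # kr4_ability, 56kr4_enable, kr4_enable


def clear_bits(word: int, bits) -> int:
    """Return word with the given bit positions set to 0."""
    for bit in bits:
        word &= ~(1 << bit)
    return word & 0xFFFFFFFF


def find_all(data: bytes, pattern: bytes) -> list:
    """Return every offset at which pattern occurs in data."""
    hits, pos = [], data.find(pattern)
    while pos != -1:
        hits.append(pos)
        pos = data.find(pattern, pos + 1)
    return hits


hp_word = clear_bits(FULL, MISSING_BITS)
pattern = struct.pack(">I", hp_word)   # big-endian, as in the hex dump
print(hex(hp_word), pattern.hex())     # prints: 0x1d03b900 1d03b900
```

Calling `find_all(image_bytes, pattern)` on the contents of the HPE .bin file would then list the candidate offsets; in the dump above, each match directly follows a 00 01 F9 9C or 00 01 F9 AC tag.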

This confirms the guess. Now modify the settings word

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 0 0 1 1 1 1 1 1 0 0 0 0 0 1 1 1 1 1 1 1 0 0 1 0 0 0 0 0 0 0 0

In hexadecimal this is 1F83 F900, which matches MCX354A-FCC_Ax_56G.bin
Use UltraEdit to edit fw-ConnectX3Pro-rel-2_42_5700-764285-B21_Ax-CLP-8025-UEFI-14.11.49-FlexBoot-3.4.754.bin and save the result as fw-ConnectX3Pro-rel-2_42_5700-764285-B21_Ax-CLP-8025-UEFI-14.11.49-FlexBoot-3.4.754_56G.bin

fw-ConnectX3Pro-rel-2_42_5700-764285-B21_Ax-CLP-8025-UEFI-14.11.49-FlexBoot-3.4.754_56G.bin
000ed760h: 00 01 F9 9C 1F 83 F9 00 7F 8F FF 20 00 01 F9 A0 ;
000ed770h: 00 8F F0 02 03 8F F0 17 00 01 F9 A4 00 40 00 01 ;
000ed780h: 00 D3 01 FF 00 01 F9 AC 1F 83 F9 00 7F 8F FF 20 ;
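For readers who prefer a script to a hex editor, here is a hedged sketch of the same edit. It assumes, based only on the dumps above, that each settings word directly follows a 4-byte big-endian tag holding its address (0x1f99c for port 1 and, by inference from the dump, 0x1f9ac for port 2). `patch_word` is an illustrative name.

```python
import struct


def patch_word(image: bytearray, reg: int, old: int, new: int) -> int:
    """Replace old with new where it directly follows the tag for reg.

    Returns the offset of the patched word. Raises ValueError if the
    tag/value pair is missing or occurs more than once.
    """
    needle = struct.pack(">II", reg, old)
    offset = image.find(needle)
    if offset == -1:
        raise ValueError(f"tag {reg:#x} with value {old:#010x} not found")
    if image.find(needle, offset + 1) != -1:
        raise ValueError("tag/value pair is not unique; refusing to patch")
    image[offset + 4:offset + 8] = struct.pack(">I", new)
    return offset + 4


# The three dump rows starting at 0x000ed760 in the HPE image.
rows = bytearray.fromhex(
    "0001F99C1D03B9007F8FFF200001F9A0"
    "008FF002038FF0170001F9A400400001"
    "00D301FF0001F9AC1D03B9007F8FFF20"
)
patch_word(rows, 0x1F99C, 0x1D03B900, 0x1F83F900)  # port 1
patch_word(rows, 0x1F9AC, 0x1D03B900, 0x1F83F900)  # port 2 (inferred address)
print(rows.hex(" "))
```

On a real image, load the file into a `bytearray`, apply both calls, and write the result to the new `_56G.bin` file name. The uniqueness check is there so the script stops rather than patching the wrong place.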

The image cannot be flashed yet because, as noted above, there is also a checksum

Verifying the firmware

PS C:\Program Files\Mellanox\WinMFT> flint -i fw-ConnectX3Pro-rel-2_42_5700-764285-B21_Ax-CLP-8025-UEFI-14.11.49-FlexBoot-3.4.754_56G.bin v

     FS2 failsafe image. Start address: 0x0. Chunk size 0x80000:

     NOTE: The addresses below are contiguous logical addresses. Physical addresses on
           flash may be different, based on the image start address and chunk size

     /0x00000038-0x0000065b (0x000624)/ (BOOT2) - OK
     /0x0000065c-0x0000297f (0x002324)/ (BOOT2) - OK
     /0x00002980-0x00003923 (0x000fa4)/ (Configuration) - OK
     /0x00003924-0x00047f5f (0x04463c)/ (ROM) - OK
     /0x00047f60-0x00047fa3 (0x000044)/ (GUID) - OK
     /0x00047fa4-0x0004812f (0x00018c)/ (Image Info) - OK
     /0x00048130-0x00055513 (0x00d3e4)/ (DDR) - OK
     /0x00055514-0x00056577 (0x001064)/ (DDR) - OK
     /0x00056578-0x00056967 (0x0003f0)/ (DDR) - OK
     /0x00056968-0x00094fab (0x03e644)/ (DDR) - OK
     /0x00094fac-0x00099e2f (0x004e84)/ (DDR) - OK
     /0x00099e30-0x0009e423 (0x0045f4)/ (DDR) - OK
     /0x0009e424-0x0009ef1b (0x000af8)/ (DDR) - OK
     /0x0009ef1c-0x000cf0ef (0x0301d4)/ (DDR) - OK
     /0x000cf0f0-0x000d2c9b (0x003bac)/ (DDR) - OK
     /0x000d2c9c-0x000e820f (0x015574)/ (DDR) - OK
     /0x000e8210-0x000e8317 (0x000108)/ (DDR) - OK
     /0x000e8318-0x000ed39b (0x005084)/ (DDR) - OK
     /0x000ed39c-0x000eeb97 (0x0017fc)/ (Configuration) /0x000ed39c/ - wrong CRC (exp:0x8e4c, act:0x9008)
-E- FW image verification failed: Bad CRC.. AN HCA DEVICE CAN NOT BOOT FROM THIS IMAGE.

Use UltraEdit to go to offset 0x000eeb96, the last two bytes of the failing section

fw-ConnectX3Pro-rel-2_42_5700-764285-B21_Ax-CLP-8025-UEFI-14.11.49-FlexBoot-3.4.754_56G.bin
000eeb90h: 00 00 00 7F 00 00 90 08 00 00 00 03 00 00 00 18 ;

Following the flint message, change the stored value (act: 0x9008) to the expected one (exp: 0x8e4c)

fw-ConnectX3Pro-rel-2_42_5700-764285-B21_Ax-CLP-8025-UEFI-14.11.49-FlexBoot-3.4.754_56G.bin
000eeb90h: 00 00 00 7F 00 00 8E 4C 00 00 00 03 00 00 00 18 ;
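The offset and the replacement value can both be read off the flint error line. The sketch below assumes, from this single example, that the CRC is a big-endian 16-bit value stored in the last two bytes of the section range flint prints; `crc_fix` and `apply_crc_fix` are illustrative names.

```python
import re
import struct

CRC_LINE = re.compile(
    r"/0x([0-9a-f]+)-0x([0-9a-f]+) .*"
    r"wrong CRC \(exp:0x([0-9a-f]+), act:0x([0-9a-f]+)\)"
)


def crc_fix(flint_line: str):
    """Return (offset, stored, wanted) parsed from a 'wrong CRC' line, or None."""
    m = CRC_LINE.search(flint_line)
    if m is None:
        return None
    end = int(m.group(2), 16)             # last byte of the section
    exp, act = int(m.group(3), 16), int(m.group(4), 16)
    return end - 1, act, exp              # CRC assumed to occupy end-1..end


def apply_crc_fix(image: bytearray, offset: int, stored: int, wanted: int) -> None:
    """Overwrite the 16-bit CRC at offset after checking the current value."""
    if struct.unpack_from(">H", image, offset)[0] != stored:
        raise ValueError("bytes at offset do not match the reported CRC")
    struct.pack_into(">H", image, offset, wanted)


line = ("/0x000ed39c-0x000eeb97 (0x0017fc)/ (Configuration) /0x000ed39c/ "
        "- wrong CRC (exp:0x8e4c, act:0x9008)")
offset, stored, wanted = crc_fix(line)
print(hex(offset), hex(stored), hex(wanted))  # prints: 0xeeb96 0x9008 0x8e4c
```

The check in `apply_crc_fix` guards against writing to the wrong place if the assumption about the CRC location does not hold for another image.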

Now the ini file embedded in the image can be updated as well
Copy HP_1380110017.ini to HP_1380110017_56G.ini and change the [IB] section from

;;speed flags for port0
cx3_spec1_3_ib_support_port0 = 1
cx3_spec1_2_ib_support_port0 = 1
spec1_3_fdr10_ib_support_port0 = 1
spec1_3_fdr14_ib_support_port0 = 1
port1_802_3ap_56kr4_ability = 1
port1_802_3ap_cr4_ability = 1
port1_802_3ap_cr4_enable  = 1

;;speed flags for port1
cx3_spec1_3_ib_support_port1 = 1
cx3_spec1_2_ib_support_port1 = 1
spec1_3_fdr10_ib_support_port1 = 1
spec1_3_fdr14_ib_support_port1 = 1
port2_802_3ap_56kr4_ability = 1
port2_802_3ap_cr4_ability = 1
port2_802_3ap_cr4_enable  = 1

to

;;speed flags for port0
cx3_spec1_3_ib_support_port0 = 1
cx3_spec1_2_ib_support_port0 = 1
spec1_3_fdr10_ib_support_port0 = 1
spec1_3_fdr14_ib_support_port0 = 1
port1_802_3ap_56kr4_ability = 1
port1_802_3ap_56kr4_enable = 1
port1_802_3ap_cr4_ability = 1
port1_802_3ap_cr4_enable  = 1
port1_802_3ap_kr4_ability = 1
port1_802_3ap_kr4_enable = 1

;;speed flags for port1
cx3_spec1_3_ib_support_port1 = 1
cx3_spec1_2_ib_support_port1 = 1
spec1_3_fdr10_ib_support_port1 = 1
spec1_3_fdr14_ib_support_port1 = 1
port2_802_3ap_56kr4_ability = 1
port2_802_3ap_56kr4_enable = 1
port2_802_3ap_cr4_ability = 1
port2_802_3ap_cr4_enable  = 1
port2_802_3ap_kr4_ability = 1
port2_802_3ap_kr4_enable = 1
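The ini edit can be scripted as well. This is only a sketch of the text transformation shown above: it inserts the three missing keys per port right after their neighbours, does not restrict itself to the [IB] section, and uses the illustrative names `ADD_AFTER` and `add_flags`.

```python
import re

# For each existing key suffix, the key suffixes to insert right after it.
ADD_AFTER = {
    "56kr4_ability": ["56kr4_enable"],
    "cr4_enable": ["kr4_ability", "kr4_enable"],
}
KEY = re.compile(r"^(port[12]_802_3ap_)(\w+?)\s*=")


def add_flags(ini_text: str) -> str:
    """Return ini_text with the missing 56kr4/kr4 flags added for each port."""
    lines = ini_text.splitlines()
    present = {m.group(1) + m.group(2) for m in map(KEY.match, lines) if m}
    out = []
    for line in lines:
        out.append(line)
        m = KEY.match(line)
        if not m:
            continue
        for suffix in ADD_AFTER.get(m.group(2), []):
            key = m.group(1) + suffix
            if key not in present:        # never add a key twice
                out.append(f"{key} = 1")
                present.add(key)
    return "\n".join(out) + "\n"


sample = (
    "port1_802_3ap_56kr4_ability = 1\n"
    "port1_802_3ap_cr4_ability = 1\n"
    "port1_802_3ap_cr4_enable  = 1\n"
)
print(add_flags(sample))
```

Running the function on its own output changes nothing, so it is safe to apply more than once.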

Replace the ini file inside the firmware image

PS C:\Program Files\Mellanox\WinMFT> python3 fs2_update_ini.py fw-ConnectX3Pro-rel-2_42_5700-764285-B21_Ax-CLP-8025-UEFI-14.11.49-FlexBoot-3.4.754_56G.bin HP_1380110017_56G.ini

Verify the firmware again

PS C:\Program Files\Mellanox\WinMFT> flint -i fw-ConnectX3Pro-rel-2_42_5700-764285-B21_Ax-CLP-8025-UEFI-14.11.49-FlexBoot-3.4.754_56G.bin v

     FS2 failsafe image. Start address: 0x0. Chunk size 0x80000:

     NOTE: The addresses below are contiguous logical addresses. Physical addresses on
           flash may be different, based on the image start address and chunk size

     /0x00000038-0x0000065b (0x000624)/ (BOOT2) - OK
     /0x0000065c-0x0000297f (0x002324)/ (BOOT2) - OK
     /0x00002980-0x00003923 (0x000fa4)/ (Configuration) - OK
     /0x00003924-0x00047f5f (0x04463c)/ (ROM) - OK
     /0x00047f60-0x00047fa3 (0x000044)/ (GUID) - OK
     /0x00047fa4-0x0004812f (0x00018c)/ (Image Info) - OK
     /0x00048130-0x00055513 (0x00d3e4)/ (DDR) - OK
     /0x00055514-0x00056577 (0x001064)/ (DDR) - OK
     /0x00056578-0x00056967 (0x0003f0)/ (DDR) - OK
     /0x00056968-0x00094fab (0x03e644)/ (DDR) - OK
     /0x00094fac-0x00099e2f (0x004e84)/ (DDR) - OK
     /0x00099e30-0x0009e423 (0x0045f4)/ (DDR) - OK
     /0x0009e424-0x0009ef1b (0x000af8)/ (DDR) - OK
     /0x0009ef1c-0x000cf0ef (0x0301d4)/ (DDR) - OK
     /0x000cf0f0-0x000d2c9b (0x003bac)/ (DDR) - OK
     /0x000d2c9c-0x000e820f (0x015574)/ (DDR) - OK
     /0x000e8210-0x000e8317 (0x000108)/ (DDR) - OK
     /0x000e8318-0x000ed39b (0x005084)/ (DDR) - OK
     /0x000ed39c-0x000eeb97 (0x0017fc)/ (Configuration) - OK
     /0x000eeb98-0x000eec0b (0x000074)/ (Jump addresses) - OK
     /0x000eec0c-0x000ef7d7 (0x000bcc)/ (FW Configuration) - OK
     /0x00000000-0x000ef7d7 (0x0ef7d8)/ (Full Image) - OK

-I- FW image verification succeeded. Image is bootable.

Everything is OK
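When the verification is run repeatedly, it can help to filter the report for failing sections. A minimal sketch, assuming the output format shown above (section lines start with `/0x` and end with `- OK` when they pass); `failed_sections` is an illustrative name.

```python
def failed_sections(flint_output: str) -> list:
    """Return the section lines of a flint verify report that are not OK."""
    bad = []
    for line in flint_output.splitlines():
        line = line.strip()
        if line.startswith("/0x") and not line.endswith("- OK"):
            bad.append(line)
    return bad


report = """
     /0x000e8318-0x000ed39b (0x005084)/ (DDR) - OK
     /0x000ed39c-0x000eeb97 (0x0017fc)/ (Configuration) /0x000ed39c/ - wrong CRC (exp:0x8e4c, act:0x9008)
"""
for section in failed_sections(report):
    print(section)
```

An empty result means every listed section passed; the final `-I-` or `-E-` line from flint remains the authoritative verdict.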

Flashing the firmware

Get the device name

PS C:\Program Files\Mellanox\WinMFT> mst status
MST devices:
------------
  mt4103_pci_cr0
  mt4103_pciconf0
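If the device name needs to be picked up by a script, the listing above can be parsed as follows. This sketch assumes the Windows `mst status` format shown here (a header line, a dashed line, then one device per line); other platforms print a different layout. `mst_devices` is an illustrative name.

```python
def mst_devices(status_output: str) -> list:
    """Return the device names listed after the 'MST devices' header."""
    names = []
    seen_header = False
    for line in status_output.splitlines():
        text = line.strip()
        if text.startswith("MST devices"):
            seen_header = True
        elif seen_header and text and not set(text) <= {"-"}:
            names.append(text.split()[0])   # skip blank and dashed lines
    return names


listing = """MST devices:
------------
  mt4103_pci_cr0
  mt4103_pciconf0
"""
print(mst_devices(listing))  # prints: ['mt4103_pci_cr0', 'mt4103_pciconf0']
```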

Flash the modified firmware

PS C:\Program Files\Mellanox\WinMFT> flint -d mt4103_pci_cr0 -i fw-ConnectX3Pro-rel-2_42_5700-764285-B21_Ax-CLP-8025-UEFI-14.11.49-FlexBoot-3.4.754_56G.bin b

    Current FW version on flash:  2.42.5700
    New FW version:               2.42.5700

    Note: The new FW version is the same as the current FW version on flash.

 Do you want to continue ? (y/n) [n] : y

Burning FS2 FW image without signatures - OK
Restoring signature                     - OK

(Optional) Remove FlexBoot

PS C:\Program Files\Mellanox\WinMFT> flint -d mt4103_pci_cr0 --allow_rom_change drom

-I- Preparing to remove ROM ...
Removing ROM image    - OK
Restoring signature  - OK

(Optional) Reset the card configuration

PS C:\Program Files\Mellanox\WinMFT> mlxconfig -d mt4103_pci_cr0 reset

 Reset configuration for device mt4103_pci_cr0? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.

Note: after the configuration reset, the card defaults to VPI mode

Reboot, and 56G is enabled

References

Mellanox-ConnectX3-MCX353A-QCBT: enabling 56GbE
HowTo Setup 56GbE Back-to-Back on two servers
Looking for a Connectx-3 custom firmware package


0llieW commented Aug 6, 2024

Thank you for this guide! I have the same network card (HP 544+FLR-QSFP / Mellanox ConnectX-3 Pro FlexibleLOM), but my connection is still 40Gb using the HP 674852-001 (HP 670759-B25 / Mellanox MC2207128-003) DAC back-to-back between the two ports. The DAC supports 56Gb (https://www.part-elec.com/datasheet/mellanox-technologies/MC2207130-001.pdf), and in InfiniBand mode I get 56Gb without any modifications in a back-to-back setup. There is an error with the Python script because of the duplicate "mac_enum" entries. After deleting one of them (I tried both "mac_enum 0" and "mac_enum 1"), modifying the configuration file with the missing entries from your guide, and updating the firmware with "fs2_update_ini.py", the new configuration file (flint -i fw.bin dc cfg.ini) has many more lines than before. Could this be the problem?


lzghzr commented Aug 9, 2024

I can share a bin
https://www.swisstransfer.com/d/50f2e162-651e-4907-aae0-ff389bf7a45e


0llieW commented Aug 9, 2024

Thank you very much! I will give it a try.. ;) Which setup do you use for 56G Ethernet? Are you using transceivers with fiber, DAC or AOC? Can you tell me the part numbers of the cables/transceivers in your working 56G Ethernet setup? I guess that the HP 670759-B25 only supports 56G InfiniBand but is limited to 40G Ethernet.. :(


lzghzr commented Aug 10, 2024

Finisar FTL414QB2N-E5
