Wednesday, 20 June 2012

Another BIOS setting for IBM X Series ESX Host

I keep getting these errors:

0x806F050C  Error  Memory device X (DIMM X Status) correctable ECC memory error logging limit reached [Note X = 1-12]

The suggested fixes aren't all helpful. These errors take a long time to occur, so moving the memory to another slot to confirm whether the problem is with the DIMM or the slot is impractical.
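Since swapping DIMMs around isn't practical, one thing that can help is tallying which DIMM numbers the events actually name over time. Below is a rough sketch of that, assuming the event log has been exported as plain text with one event per line, and that each line looks like the message quoted above. The function name `count_ecc_events` and the sample lines are made up for illustration; they are not real log output.

```python
import re
from collections import Counter

# Assumed message shape, based on the error text quoted in this post.
# \s* tolerates both "DIMM 3 Status" and "DIMM 3Status".
EVENT_RE = re.compile(
    r"Memory device (\d+) \(DIMM \d+\s*Status\) "
    r"correctable ECC memory error logging limit reached"
)


def count_ecc_events(lines):
    """Return a Counter mapping DIMM number to the number of matching events."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample_log = [
    "0x806F050C Error Memory device 3 (DIMM 3 Status) correctable ECC memory error logging limit reached",
    "0x806F050C Error Memory device 3 (DIMM 3Status) correctable ECC memory error logging limit reached",
    "0x806F050C Error Memory device 7 (DIMM 7 Status) correctable ECC memory error logging limit reached",
    "some unrelated event",
]

counts = count_ecc_events(sample_log)
for dimm, n in sorted(counts.items()):
    print(f"DIMM {dimm}: {n} event(s)")
```

If the same DIMM number keeps turning up, that points at one module or slot. If the events are spread across many DIMMs, something host-wide such as a BIOS setting looks more likely.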

A colleague helpfully remembered a problem on HP hosts that sounded similar. He got me looking and I found this BIOS/IMM setting:

Changing "Normal" mode to "Performance" mode affects the way that the DIMMs are refreshed. This results in a DIMM temperature message occurring at a 10 degree lower temperature.

This article is not about my X3650, but IBM has verbally confirmed it applies to my server:

Change Thermal Mode setting (preferred method)
  1. Boot the blade into the F1 "System Configuration and Boot Management" screen. Highlight "System Settings." Press Enter and select Memory. Select Thermal Mode and change the setting to "Performance."
  2. Press the Esc key twice to get to "System Configuration and Boot Management" and then select Save Settings and Exit Setup.
  3. Follow the instructions on the next screen to exit the "Setup Utility."
  4. Power the blade off for the changes to take effect and restart.
Changing "Normal" mode to "Performance" mode affects the way that the Dual In-Line Memory Modules (DIMMs) are refreshed. This results in a DIMM temperature warning message occurring at a 10 degree lower temperature. This causes no impact in most industry standard data centers.

Again, I don't have a blade, but I seem to have guessed correctly that they run the same code on the X Series. Odd that I haven't found much about this online. It should be in a best practices document for IBM servers, maybe even a vSphere document. Props to "VTSUkanov" for finding and posting about this on the VMware forums.
