[Beowulf] Problems with Dell M620 and CPU power throttling
Brice Goglin
brice.goglin at gmail.com
Tue Sep 3 07:40:37 PDT 2013
Hello,
I am seeing messages like this quite often on our R720s in dmesg:
CPU16: Core power limit notification (total events = 1)
Do you think that's related to your problem?
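
A minimal sketch for tallying such messages per CPU, assuming the kernel log lines look like the one quoted above (optionally preceded by a timestamp). The function name count_power_limit_events is hypothetical and reads log text from stdin.

```shell
# Hypothetical helper: count "power limit notification" lines per CPU.
# Reads kernel log text on stdin and prints "<CPU> <number of lines>".
count_power_limit_events() {
    awk '/power limit notification/ {
        # Find the "CPUnn:" token, which may follow a timestamp.
        for (i = 1; i <= NF; i++)
            if ($i ~ /^CPU[0-9]+:$/) { sub(/:$/, "", $i); n[$i]++; break }
    }
    END { for (c in n) print c, n[c] }' | sort
}
```

Typical use would be something like: dmesg | count_power_limit_events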
Brice
On 03/09/2013 14:44, Bill Wichser wrote:
> The solution appears to be BIOS configuration.
>
> We had:
> < SysProfile=perfoptimized
> < ;ProcPwrPerf=maxperf
> < ;ProcC1E=disable
> < ;ProcCStates=disable
>
> And changed to:
> ---
> > SysProfile=custom
> > ProcPwrPerf=osdbpm
> > ProcC1E=enable
> > ProcCStates=enable
>
> Then added
> modprobe acpi_cpufreq
>
> With that, we moved from BIOS-directed power control to OS-enabled power
> control. In the old mode we could set the processor states but
> were unable to see some of the hooks Don had suggested here.
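
One way to confirm that the OS now owns frequency control is to read the driver name that cpufreq exposes in sysfs. This is a sketch only: show_cpufreq_driver is a hypothetical helper, and the optional directory argument exists purely so it can be pointed somewhere other than the real /sys tree.

```shell
# Hypothetical helper: print the active cpufreq driver for cpu0,
# or report that none is loaded (the cpufreq directory is then absent).
show_cpufreq_driver() {
    cpu_root=${1:-/sys/devices/system/cpu}
    f="$cpu_root/cpu0/cpufreq/scaling_driver"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "no cpufreq driver loaded"
        return 1
    fi
}
```

After the BIOS change and the modprobe described above, this would be expected to print acpi-cpufreq or similar, though the exact string depends on the kernel.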
>
> Initial results look good. We have a much better view of what the cores
> are actually doing using the cpupower command, info we were unable to
> obtain completely without this BIOS change.
>
> I'm not sure about the C1E state being enabled though and will
> experiment further.
>
> Thanks to everyone who offered suggestions. An extra thanks to Don
> Holmgren who pointed us down this path.
>
> Bill
>
>
> On 08/30/2013 11:23 AM, Don Holmgren wrote:
>> It might be worth fooling a bit with the cpufreq settings, down in
>>
>> /sys/devices/system/cpu/cpuX/cpufreq
>>
>> (where X=cpu#, one per core). To prevent non-thermal throttling, you can do
>> the following for each core:
>>
>> echo userspace > scaling_governor
>> cat scaling_max_freq
>> echo 2400000 > scaling_setspeed
>>
>> (substituting the reported scaling_max_freq value for 2400000). For this to
>> work you need the specific cpufreq driver for your processor loaded.
>> For our (non-Dell) SB servers it's acpi_cpufreq. In RedHat, the
>> cpuspeed service loads the relevant drivers; I'm not sure if there is a
>> similar service in other distros.
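
The per-core steps can be wrapped in a loop. The following is a sketch under assumptions, not a tested production script: lock_cores_at_max is a hypothetical name, it needs root on a real system, and it assumes the loaded cpufreq driver offers the userspace governor. The optional argument only exists so the function can be run against a scratch directory.

```shell
# Hypothetical helper: pin every core to its reported scaling_max_freq
# using the userspace governor, one cpufreq directory per core.
lock_cores_at_max() {
    cpu_root=${1:-/sys/devices/system/cpu}
    for dir in "$cpu_root"/cpu[0-9]*/cpufreq; do
        [ -d "$dir" ] || continue                  # no cpufreq driver loaded
        max=$(cat "$dir/scaling_max_freq") || continue
        echo userspace > "$dir/scaling_governor"   # hand control to userspace
        echo "$max" > "$dir/scaling_setspeed"      # request the maximum speed
    done
}
```

Run as root with no argument to act on the real sysfs tree.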
>>
>> The above will lock the cores at the max_freq, although if they get
>> too hot they will still throttle down in speed. There are statistics
>> available on frequency changes from thermal throttling in
>>
>> /sys/devices/system/cpu/cpu0/thermal_throttle/
>>
>> although I haven't used them, so I'm not sure about their functionality.
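
A hedged sketch for reading those statistics: it prints every counter under thermal_throttle whose value is above zero. The helper name show_throttle_counts is hypothetical, and the assumption that the counter files end in _count (for example core_throttle_count) may not hold on every kernel version.

```shell
# Hypothetical helper: list non-zero thermal_throttle counters per core.
# Assumes counter files are named *_count; names vary by kernel version.
show_throttle_counts() {
    cpu_root=${1:-/sys/devices/system/cpu}
    for f in "$cpu_root"/cpu[0-9]*/thermal_throttle/*_count; do
        [ -r "$f" ] || continue
        count=$(cat "$f")
        if [ "$count" -gt 0 ]; then
            # Print the path relative to the cpu root, then the count.
            echo "${f#"$cpu_root"/} $count"
        fi
    done
    return 0
}
```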
>>
>> If you do a
>>
>> modprobe cpufreq_stats
>>
>> then a new directory
>>
>> /sys/devices/system/cpu/cpu0/cpufreq/stats
>>
>> will show up that has statistics about cpu speed changes. I'm not sure
>> whether thermal throttling changes will also show here or not. On one
>> of our large Opteron clusters, we had a handful of nodes with somewhat
>> similar slowdown problems as you are seeing on your SB's. We now lock
>> their frequencies, and we monitor
>> /sys/devices/system/cpu/cpu0/cpufreq/stats/total_trans (which gives the total
>> number of speed changes), alarming when total_trans is non-zero.
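
The alarm described here could look roughly like the sketch below. It is an illustration only: check_total_trans is a hypothetical name, it checks every core rather than just cpu0, and it assumes cpufreq_stats is loaded so that the stats directory exists.

```shell
# Hypothetical helper: return non-zero and print an alarm line for every
# core whose cpufreq total_trans counter is not zero.
check_total_trans() {
    cpu_root=${1:-/sys/devices/system/cpu}
    status=0
    for f in "$cpu_root"/cpu[0-9]*/cpufreq/stats/total_trans; do
        [ -r "$f" ] || continue        # cpufreq_stats not loaded for this core
        n=$(cat "$f")
        if [ "$n" -ne 0 ]; then
            echo "ALARM: $f reports $n frequency transitions"
            status=1
        fi
    done
    return "$status"
}
```

The exit status makes it easy to hook into a monitoring system that treats a non-zero return as a failed check.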
>>
>> Don Holmgren
>> Fermilab
>>
>>
>>
>>
>>
>> On Fri, 30 Aug 2013, Bill Wichser wrote:
>>
>>> Since January, when we installed an M620 Sandybridge cluster from Dell,
>>> we have had issues with power and performance to compute nodes. Dell
>>> apparently continues to look into the problem but the usual responses
>>> have provided no solution. Firmware, BIOS, OS updates all are fruitless.
>>>
>>> The problem is that the node/CPU is power capped. We first detected
>>> this with the STREAM benchmark, a quick run, which shows memory
>>> bandwidth around 2000 instead of the normal 13000 MB/s. When the CPU is
>>> in the C0 state, this drops to around 600.
>>>
>>> The effect appears randomly across the entire cluster with 5-10% of the
>>> nodes demonstrating some slower performance. We don't know what
>>> triggers this. Using "turbostat" we can see that the GHz of the cores
>>> is >= 1 in most cases, dropping to about 0.2 in some of the worst cases.
>>> Looking at the power consumption by either the chassis GUI or using
>>> "ipmitool sdr list" we see that there is only about 80 watts being used.
>>>
>>> We run the RH 6.x release and are up to date with kernel/OS patches.
>>> All firmware is up to date. Chassis power is configured as
>>> non-redundant. tuned is set for performance. Turbo mode is
>>> on/hyperthreading is off/performance mode is set in BIOS.
>>>
>>> A reboot does not change this problem. But a power cycle returns the
>>> compute node to normal again. Again, we do not know what triggers this
>>> event. We are not overheating the nodes. But while applications are
>>> running, something triggers an event where this power capping takes
>>> effect.
>>>
>>> At this point we remain clueless about what is causing this to happen.
>>> We can detect the condition now and have been power cycling the nodes in
>>> order to reset.
>>>
>>> If anyone has a clue, or better yet, solved the issue, we'd love to hear
>>> the solution!
>>>
>>> Thanks,
>>> Bill