Nov. 10th, 2021

dennisgorelik: 2020-06-13 in my home office (Default)
Yesterday we thought that the temperature shown by the "sensors" command was a "PCI adapter" temperature.
We also thought that CentOS 8 drivers did not yet support CPU temperature measurement on AMD EPYC.

Today we learned that "Tdie" is actually the CPU temperature.
We also discovered that AMD EPYC 7313 does not heat up much: 55C maximum (compared with an Intel Xeon CPU that reaches 70C).

More technical details

1) The "sensors" command shows CPU temperature even under regular user permissions.
But to see energy consumption per core/socket, "sensors" needs superuser permissions [on AMD EPYC CPU].
[centos@esovh ~]$ sudo -s sensors
k10temp-pci-00c3
Adapter: PCI adapter
Tdie: +25.0°C (high = +70.0°C)
Tctl: +25.0°C

amd_energy-isa-0000
Adapter: ISA adapter
Ecore000: 206.91 J
Ecore001: 61.11 J
Ecore002: 38.78 J
Ecore003: 40.49 J
Ecore004: 27.27 J
Ecore005: 24.25 J
Ecore006: 23.04 J
Ecore007: 23.77 J
Ecore008: 23.50 J
Ecore009: 22.67 J
Ecore010: 43.86 J
Ecore011: 22.50 J
Ecore012: 27.76 J
Ecore013: 24.55 J
Ecore014: 23.84 J
Ecore015: 24.82 J
Esocket0: 20.47 kJ
Esocket1: 20.47 kJ
Esocket2: 20.47 kJ
Esocket3: 20.47 kJ
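
To pull just the temperature lines out of output like the above, something along these lines may help. This is a minimal sketch, not part of our original test: the function name parse_temperatures and the SAMPLE text (an abbreviated copy of the output above) are our own illustration, and the regular expression assumes the "label: +NN.N°C" layout shown here.

```python
import re

# Matches lines such as "Tdie: +25.0°C (high = +70.0°C)".
# The label is everything before the first colon; the reading is the
# first signed number that is directly followed by "°C".
TEMP_LINE = re.compile(r"^(?P<label>[^:]+):\s*(?P<value>[+-]?\d+(?:\.\d+)?)\s*°C")


def parse_temperatures(sensors_text):
    """Return {label: degrees Celsius} for every temperature line."""
    readings = {}
    for line in sensors_text.splitlines():
        match = TEMP_LINE.match(line.strip())
        if match:
            readings[match.group("label").strip()] = float(match.group("value"))
    return readings


# Abbreviated copy of the idle output above (illustration only).
SAMPLE = """k10temp-pci-00c3
Adapter: PCI adapter
Tdie: +25.0°C (high = +70.0°C)
Tctl: +25.0°C

amd_energy-isa-0000
Adapter: ISA adapter
Ecore000: 206.91 J
Esocket0: 20.47 kJ
"""

print(parse_temperatures(SAMPLE))
```

Energy lines (J, kJ) are skipped because they do not end in °C. Labels that repeat across chips (for example "temp1") would overwrite each other in this simple dictionary, so a real script would probably want to key on the chip name as well.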


2) "Tdie" is CPU temperature:
https://forums.gentoo.org/viewtopic-t-1098716-start-0.html
-CPU (Tctl): This is the T_control temperature available on AMD CPUs only. On several generations before Zen (Ryzen), this is not a reliable representation of the temperature. On AMD Zen series this is the temperature used to control cooling and is a fixed offset from the real CPU temperature. Offset is used mostly on X-series and some Threadripper CPUs; in such case two values are shown: Tctl and Tdie. If no offset is used, then only a single value is shown as Tctl/Tdie, which equals the real temperature.

-CPU (Tdie): This value is shown in case the CPU uses an offset from Tctl and represents the real temperature (Tdie = Tctl - Tctl_offset).
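
The relation quoted above (Tdie = Tctl - Tctl_offset) can be written out as a tiny sketch. The function names are ours, and the 10°C offset in the second example is a made-up value for illustration only. In our own readings Tdie and Tctl are identical, which suggests that no offset is applied on this EPYC 7313.

```python
def tdie_from_tctl(tctl_c, tctl_offset_c=0.0):
    """Real die temperature, per the quoted formula Tdie = Tctl - Tctl_offset."""
    return tctl_c - tctl_offset_c


def tctl_offset(tctl_c, tdie_c):
    """Offset implied by a pair of readings taken at the same moment."""
    return tctl_c - tdie_c


# Readings from this post: Tdie +45.0°C, Tctl +45.0°C, so the implied offset is 0.
print(tctl_offset(45.0, 45.0))

# Hypothetical CPU that reports Tctl with a 10°C offset (illustration only).
print(tdie_from_tctl(55.0, 10.0))
```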

3) We ran a stress test on AMD EPYC 7313 (esovh):
stress --cpu 24 --timeout 20m

4) During the 20-minute stress test, CPU temperature reached a maximum of +45.0°C:
Every 2.0s: sensors esovh: Wed Nov 10 14:06:17 2021

k10temp-pci-00c3
Adapter: PCI adapter
Tdie: +45.0°C (high = +70.0°C)
Tctl: +45.0°C

amd_energy-isa-0000
Adapter: ISA adapter
Ecore000: 2.51 kJ
Ecore001: 2.44 kJ
Ecore002: 2.93 kJ
Ecore003: 2.73 kJ
Ecore004: 2.51 kJ
Ecore005: 2.78 kJ
Ecore006: 2.98 kJ
Ecore007: 2.32 kJ
Ecore008: 2.71 kJ
Ecore009: 2.23 kJ
Ecore010: 2.38 kJ
Ecore011: 2.86 kJ
Ecore012: 2.69 kJ
Ecore013: 2.36 kJ
Ecore014: 2.78 kJ
Ecore015: 2.74 kJ
Esocket0: 1.02 MJ
Esocket1: 1.02 MJ
Esocket2: 1.02 MJ
Esocket3: 1.02 MJ
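
We watched the output by eye, refreshing every 2 seconds. A script could record the peak instead. Below is a rough sketch under our own assumptions: read_tdie and track_peak are hypothetical helper names, and the demo feeds in canned lines copied from this post rather than calling "sensors" on a live machine.

```python
import re
import time

# Picks the Tdie reading out of a full "sensors" output.
TDIE = re.compile(r"Tdie:\s*([+-]?\d+(?:\.\d+)?)\s*°C")


def read_tdie(sensors_text):
    """Return Tdie in °C, or None if the text has no Tdie line."""
    match = TDIE.search(sensors_text)
    return float(match.group(1)) if match else None


def track_peak(read_output, samples, interval_s=2.0, sleep=time.sleep):
    """Poll read_output() `samples` times and return the highest Tdie seen."""
    peak = None
    for i in range(samples):
        tdie = read_tdie(read_output())
        if tdie is not None and (peak is None or tdie > peak):
            peak = tdie
        if i < samples - 1:
            sleep(interval_s)  # wait between polls, but not after the last one
    return peak


# Demo with canned readings from this post; sleeping is disabled.
canned = iter([
    "Tdie: +25.0°C (high = +70.0°C)",
    "Tdie: +45.0°C (high = +70.0°C)",
    "Tdie: +44.8°C (high = +70.0°C)",
])
print(track_peak(lambda: next(canned), samples=3, sleep=lambda s: None))
```

On a real machine, read_output could be something like lambda: subprocess.run(["sensors"], capture_output=True, text=True).stdout, with samples=600 and the default 2-second interval covering a 20-minute run.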

5) After ~10 minutes under stress, CPU temperature fell by 0.2°C (to +44.8°C):
Every 2.0s: sensors esovh: Wed Nov 10 14:15:49 2021

k10temp-pci-00c3
Adapter: PCI adapter
Tdie: +44.8°C (high = +70.0°C)
Tctl: +44.8°C

amd_energy-isa-0000
Adapter: ISA adapter
Ecore000: 4.83 kJ
Ecore001: 4.76 kJ
Ecore002: 5.99 kJ
Ecore003: 5.81 kJ
Ecore004: 4.87 kJ
Ecore005: 5.89 kJ
Ecore006: 6.11 kJ
Ecore007: 4.66 kJ
Ecore008: 5.61 kJ
Ecore009: 4.94 kJ
Ecore010: 4.75 kJ
Ecore011: 5.41 kJ
Ecore012: 5.13 kJ
Ecore013: 4.70 kJ
Ecore014: 5.63 kJ
Ecore015: 5.77 kJ
Esocket0: 1.09 MJ
Esocket1: 1.09 MJ
Esocket2: 1.09 MJ
Esocket3: 1.09 MJ
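
The energy counters in steps 4 and 5 can be turned into a rough average power figure. This is a back-of-the-envelope sketch, assuming that the Ecore/Esocket values are cumulative counters that did not wrap between the two readings; the helper names are ours. The readings and timestamps are the ones shown above (14:06:17 and 14:15:49).

```python
from datetime import datetime

UNIT_TO_JOULES = {"J": 1.0, "kJ": 1e3, "MJ": 1e6}


def to_joules(reading):
    """Convert a reading such as '2.51 kJ' to joules."""
    value, unit = reading.split()
    return float(value) * UNIT_TO_JOULES[unit]


def average_watts(start_reading, end_reading, start_time, end_time):
    """Average power between two counter readings taken on the same day."""
    fmt = "%H:%M:%S"
    elapsed = datetime.strptime(end_time, fmt) - datetime.strptime(start_time, fmt)
    return (to_joules(end_reading) - to_joules(start_reading)) / elapsed.total_seconds()


# Ecore000 and Esocket0 from steps 4 and 5.
core0 = average_watts("2.51 kJ", "4.83 kJ", "14:06:17", "14:15:49")
socket0 = average_watts("1.02 MJ", "1.09 MJ", "14:06:17", "14:15:49")
print(f"core 0: {core0:.1f} W, socket 0: {socket0:.0f} W")
```

This works out to roughly 4 W for core 0 and about 120 W for the socket under stress. The socket figure is coarse: "sensors" rounds MJ values to two decimals, so a 0.07 MJ difference could be off by around 0.01 MJ.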

6) After the stress test ended, CPU temperature fell [at approximately 1°C per second] down to +26.2°C:
Every 2.0s: sensors esovh: Wed Nov 10 14:29:50 2021

k10temp-pci-00c3
Adapter: PCI adapter
Tdie: +26.2°C (high = +70.0°C)
Tctl: +26.2°C

amd_energy-isa-0000
Adapter: ISA adapter
Ecore000: 5.92 kJ
Ecore001: 5.65 kJ
Ecore002: 6.97 kJ
Ecore003: 6.99 kJ
Ecore004: 5.93 kJ
Ecore005: 7.08 kJ
Ecore006: 7.15 kJ
Ecore007: 5.57 kJ
Ecore008: 6.49 kJ
Ecore009: 6.09 kJ
Ecore010: 5.89 kJ
Ecore011: 6.29 kJ
Ecore012: 6.03 kJ
Ecore013: 5.58 kJ
Ecore014: 6.78 kJ
Ecore015: 6.93 kJ
Esocket0: 1.15 MJ
Esocket1: 1.15 MJ
Esocket2: 1.15 MJ
Esocket3: 1.15 MJ

More temperature tests:
Intel Xeon E-2136 CPU Temperature
Ryzen 5900X CPU temperature
Earlier (2021-10-30) we also measured Intel Xeon CPU temperature.
Intel Xeon CPU reaches much higher temperatures than AMD EPYC.
A healthy Intel Xeon (Adv-2-LE) reached around +70C under stress.
An unhealthy Intel Xeon reached 97+C and started CPU throttling.

The temperature measurements below are for the "healthy" Intel Xeon CPU in "Idle", "Stress Starts" and "Stress Continues" modes.

1) Idle
[centos@esovh ~]$ sensors
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +30.0°C (high = +80.0°C, crit = +100.0°C)
Core 0: +27.0°C (high = +80.0°C, crit = +100.0°C)
Core 1: +28.0°C (high = +80.0°C, crit = +100.0°C)
Core 2: +28.0°C (high = +80.0°C, crit = +100.0°C)
Core 3: +27.0°C (high = +80.0°C, crit = +100.0°C)
Core 4: +30.0°C (high = +80.0°C, crit = +100.0°C)
Core 5: +27.0°C (high = +80.0°C, crit = +100.0°C)
acpitz-virtual-0
Adapter: Virtual device
temp1: +27.8°C (crit = +119.0°C)
power_meter-acpi-0
Adapter: ACPI interface
power1: 4.29 MW (interval = 1.00 s)
pch_cannonlake-virtual-0
Adapter: Virtual device
temp1: +39.0°C

2) Stress
stress --cpu 24 --timeout 20m
a) Stress Starts (measured temperature ~30 seconds after the stress start)
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +70.0°C (high = +80.0°C, crit = +100.0°C)
Core 0: +70.0°C (high = +80.0°C, crit = +100.0°C)
Core 1: +67.0°C (high = +80.0°C, crit = +100.0°C)
Core 2: +69.0°C (high = +80.0°C, crit = +100.0°C)
Core 3: +68.0°C (high = +80.0°C, crit = +100.0°C)
Core 4: +67.0°C (high = +80.0°C, crit = +100.0°C)
Core 5: +67.0°C (high = +80.0°C, crit = +100.0°C)
acpitz-virtual-0
Adapter: Virtual device
temp1: +27.8°C (crit = +119.0°C)
power_meter-acpi-0
Adapter: ACPI interface
power1: 4.29 MW (interval = 1.00 s)
pch_cannonlake-virtual-0
Adapter: Virtual device
temp1: +39.0°C
b) Stress Continues
[centos@esovh ~]$ sensors
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +66.0°C (high = +80.0°C, crit = +100.0°C)
Core 0: +66.0°C (high = +80.0°C, crit = +100.0°C)
Core 1: +65.0°C (high = +80.0°C, crit = +100.0°C)
Core 2: +66.0°C (high = +80.0°C, crit = +100.0°C)
Core 3: +64.0°C (high = +80.0°C, crit = +100.0°C)
Core 4: +64.0°C (high = +80.0°C, crit = +100.0°C)
Core 5: +63.0°C (high = +80.0°C, crit = +100.0°C)
acpitz-virtual-0
Adapter: Virtual device
temp1: +27.8°C (crit = +119.0°C)
power_meter-acpi-0
Adapter: ACPI interface
power1: 4.29 MW (interval = 1.00 s)
pch_cannonlake-virtual-0
Adapter: Virtual device
temp1: +40.0°C
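
To see how close each core gets to the "high" and "crit" thresholds, the coretemp lines can be parsed like this. It is a minimal sketch: the function name headroom and the SAMPLE text (three lines copied from the "Stress Starts" output above) are our own illustration, and the pattern assumes the exact "(high = ..., crit = ...)" layout shown in this post.

```python
import re

# Matches coretemp lines such as
# "Core 0: +70.0°C (high = +80.0°C, crit = +100.0°C)".
CORE_LINE = re.compile(
    r"^(?P<label>Package id \d+|Core \d+):\s*(?P<temp>[+-]?\d+(?:\.\d+)?)°C\s*"
    r"\(high = (?P<high>[+-]?\d+(?:\.\d+)?)°C, crit = (?P<crit>[+-]?\d+(?:\.\d+)?)°C\)"
)


def headroom(sensors_text):
    """Return {label: (°C below high, °C below crit)} for each coretemp line."""
    rows = {}
    for line in sensors_text.splitlines():
        m = CORE_LINE.match(line.strip())
        if m:
            temp, high, crit = (float(m.group(k)) for k in ("temp", "high", "crit"))
            rows[m.group("label")] = (high - temp, crit - temp)
    return rows


# Three lines from the "Stress Starts" measurement (illustration only).
SAMPLE = """coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +70.0°C (high = +80.0°C, crit = +100.0°C)
Core 0: +70.0°C (high = +80.0°C, crit = +100.0°C)
Core 1: +67.0°C (high = +80.0°C, crit = +100.0°C)
"""

for label, (to_high, to_crit) in headroom(SAMPLE).items():
    print(f"{label}: {to_high:.1f}°C below high, {to_crit:.1f}°C below crit")
```

For the healthy Xeon at +70°C this leaves 10°C before "high" and 30°C before "crit". The unhealthy Xeon at 97+°C was within about 3°C of "crit", assuming it reports the same thresholds.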
