"GPU0 reported as dead.." but it's running

I’ve gotten this alert a few times, starting yesterday, but when I check, everything looks fine.

The message log shows:
[2022-03-04 00:55:20] GPU 0: detected DEAD (0d:00.0), will execute restart script watchdog.sh
[2022-03-04 00:55:20] Watchdog script executor thread executing script 'watchdog.sh'
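
To tell a one-off glitch from a recurring problem, it can help to count how often each GPU gets flagged. Below is a minimal Python sketch that scans miner log text for the "detected DEAD" message quoted above. The regex is modeled only on that one message, and the function name `count_dead_events` is made up for illustration; other miner versions may word the line differently.

```python
import re
from collections import Counter

# Pattern modeled on the "detected DEAD" message quoted in this thread.
# Assumption: other miner versions may format this line differently.
DEAD_RE = re.compile(r"GPU (?P<gpu>\d+): detected DEAD \((?P<bus>[^)]+)\)")


def count_dead_events(log_text):
    """Return a Counter mapping GPU index -> number of 'detected DEAD' lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DEAD_RE.search(line)
        if match:
            counts[int(match.group("gpu"))] += 1
    return counts


sample_log = """\
[2022-03-04 00:55:20] GPU 0: detected DEAD (0d:00.0), will execute restart script watchdog.sh
[2022-03-04 00:55:20] Watchdog script executor thread executing script 'watchdog.sh'
"""

print(count_dead_events(sample_log))
```

A count that keeps climbing for the same GPU points toward settings or hardware rather than a random hiccup.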

Card seems fine when I check… Not running too hot or anything. Could it be a glitch or could the card be failing?

amd-info:
=== GPU 0, 0d:00.0 Radeon RX 5500 XT 8176 MB ===
Bios: 113-D3322003-O05, UUID: T3MW210012110303
Core: 1232 MHz 843mV, Mem: 990 MHz
PerfCtrl: manual, Load: 99%, MemLoad: 90%, Power: 69.0 W, Cap: 135 W
Core: 47°C, HotSpot: 54°C, Mem: 76°C, Fan: 56%, RPM: 1881
Core state: 2, clocks: 300 775 1250*
Mem state: 3, clocks: 100 500 625 990*
SOC state: 2, clocks: 304 785 1266*
DCEF state: 0, clocks: 304* 785 1266
F state: 2, clocks: 304 785 1266*
PCIE Link speed: GEN2 (5.0GT/s), PCIE Link width: x1
Memory total: 8176.00 MB, used: 8011.14 MB, free: 164.86 MB, type: Micron GDDR6
VDDGfx: 856mV, VDDCI: 850mV, VDDCR_SOC: 893mV, MVDD: 1350mV, MVDDQ: 1350mV
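
For tracking these numbers over time instead of eyeballing them, here is a rough Python sketch that pulls a few fields out of one GPU block of the `amd-info` output above. The field layout is assumed from this paste only, and `parse_amd_info` is a hypothetical helper, not part of Hive OS.

```python
import re


def parse_amd_info(block):
    """Extract a few fields from one GPU block of amd-info text.

    Assumption: the layout matches the paste in this thread; fields that
    are not found are simply left out of the result.
    """
    info = {}

    # e.g. "Core: 1232 MHz 843mV, Mem: 990 MHz"
    clocks = re.search(r"Core: (\d+) MHz.*?Mem: (\d+) MHz", block)
    if clocks:
        info["core_mhz"] = int(clocks.group(1))
        info["mem_mhz"] = int(clocks.group(2))

    # e.g. "Core: 47°C, HotSpot: 54°C, Mem: 76°C, ..."
    mem_temp = re.search(r"Mem: (\d+)°C", block)
    if mem_temp:
        info["mem_temp_c"] = int(mem_temp.group(1))

    # e.g. "Memory total: 8176.00 MB, used: 8011.14 MB, ..."
    vram = re.search(r"Memory total: ([\d.]+) MB, used: ([\d.]+) MB", block)
    if vram:
        info["vram_total_mb"] = float(vram.group(1))
        info["vram_used_mb"] = float(vram.group(2))

    return info


sample_block = """\
Core: 1232 MHz 843mV, Mem: 990 MHz
Core: 47°C, HotSpot: 54°C, Mem: 76°C, Fan: 56%, RPM: 1881
Memory total: 8176.00 MB, used: 8011.14 MB, free: 164.86 MB, type: Micron GDDR6
"""

print(parse_amd_info(sample_block))
```

Logging the memory clock and memory temperature next to each dead-GPU alert makes it easier to see whether the alerts line up with a particular setting.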

990 MHz is a high memory setting for 5500 XTs. I suspect you are on the edge. Been there, done that. Mine will run 28+ MH/s as well, but then they start to throw invalid shares, dead-GPU alerts, etc. I backed mine down a bit and they are much happier:

Man, thank you for sharing that screenshot! I have duplicated your settings on my 5500XTs… running a little cooler, less power… will monitor and hopefully no more hiccups… It has run for over a month at the other settings w/o a problem so I thought I was good to go.

The RX580 is still running smooth but at 28.33 MH/s. I had hoped for more out of that card. My old 1660Ti is still the best in my lot lol!

Thanks for all the advice!

I hope all of the Hive team keeps safe over there in Ukraine; we are praying for them all.


Post your settings, including amdmemtweak 🙂

Here it is


=== GPU 1, 0e:00.0 Radeon RX 580 8192 MB ===
Bios: 113-58085SHC1-W90, UUID: M74LQ0050415
Core: 1150 MHz 1106mV, Mem: 2150 MHz 950mV, VDDCI: 950mV, REF: 20
PerfCtrl: manual, Load: 100%, MemLoad: 98%, Power: 96.201 W, Cap: 152 W
Core: 60°C, Fan: 74%, RPM: 2438
Core state: 1, clocks: 300 1150*
Mem state: 2, clocks: 300 2050 2150*
PCIE Link speed: GEN2 (5.0GT/s), PCIE Link width: x1
Memory total: 8192.00 MB, used: 5031.88 MB, free: 3160.12 MB, type: Samsung K4G80325FB
VDDC: 1106mV, VDDCI: 950mV

Looks like the usual set of settings.

You might find that “less is more” as you tweak some variables.

Guessing you are running stock BIOS?

Yeah… I tried a few tweaks the other day and it went haywire… now I’m a little reluctant to mess with it.

You know, if it ain’t broke, don’t fix it… 😆 🤣

