Not only that: I can swap risers with working cards and those cards continue to work, while the same cards that fail with IB RING 12 mine just fine in a different system (it is only RING 12).
Both are XFX RX 480s without modded BIOS (didn’t get to mod them yet).
I’ve proven it isn’t a riser issue, and it isn’t the cards or the PCIe slot they are using either.
I can confirm what Rootless said. I’m facing the same problem on my 12-GPU rig, running RX 570 / RX 580 cards. I swapped risers and changed motherboards (H110/TB250), and still get the same random IB RING errors on boot (with different ring numbers). This affects Ethos, Smos and Hive as well, though I had more luck running Ethos (different kernel and firmware, I suppose, but this did not solve the problem completely). It’s so freaking annoying, because it requires a physical restart of the PSU.
Help, please…?
I can confirm it too. But after I flashed the stock ROM to the cards and did 20+ reboots, I no longer get this error. I think the kernel in HiveOS can’t work with a modded BIOS. Maybe ROCm is misbehaving. The RX 580 4GB Elpida cards have this problem, but the Radeon RX 580 8GB works well. Dima, we need help. What can we do?
The farm was not stable: it hung every couple of days, and a restart sometimes took up to half an hour before all the video cards started normally. The motherboard was a Biostar TB250-BTC. I changed it to an Asrock H110 Pro BTC+, and “amdgpu: ring” errors started to pour in. The cause turned out to be several faulty risers at once; replacing them solved the problem.
I am getting the following errors. Could these also be related to risers? I ordered some new risers, but they won’t arrive for a month or so.
Jan 10 12:25:23 hive5700XT kernel: [58483.988705] amdgpu: Failed to export SMU metrics table!
Jan 10 12:25:28 hive5700XT kernel: [58488.988954] amdgpu: Msg issuing pre-check failed and SMU may be not in the right state!
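When comparing notes on errors like these, it can help to count how often each amdgpu message appears and when it first showed up after boot. Below is a minimal sketch that does this for syslog-style lines like the two quoted above. The names `summarize_amdgpu`, `LINE_RE` and `SAMPLE` are made up for illustration, and the pattern assumes the exact `kernel: [uptime] amdgpu: message` layout seen in those two lines; other amdgpu messages may be formatted differently and would need their own pattern.

```python
import re
from collections import Counter

# Assumption: lines look like the two quoted above, i.e.
# "<date> <host> kernel: [<uptime seconds>] amdgpu: <message>".
LINE_RE = re.compile(r"kernel: \[\s*(?P<uptime>\d+\.\d+)\] amdgpu: (?P<msg>.+)$")


def summarize_amdgpu(lines):
    """Count each amdgpu message and record the uptime (seconds) of its first occurrence."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = LINE_RE.search(line.strip())
        if match is None:
            continue  # not an amdgpu line in the assumed layout
        msg = match.group("msg")
        counts[msg] += 1
        first_seen.setdefault(msg, float(match.group("uptime")))
    return counts, first_seen


# The two log lines from the post above, used as sample input.
SAMPLE = [
    "Jan 10 12:25:23 hive5700XT kernel: [58483.988705] amdgpu: Failed to export SMU metrics table!",
    "Jan 10 12:25:28 hive5700XT kernel: [58488.988954] amdgpu: Msg issuing pre-check failed and SMU may be not in the right state!",
]

if __name__ == "__main__":
    counts, first_seen = summarize_amdgpu(SAMPLE)
    for msg, n in counts.most_common():
        print(f"{n}x, first at {first_seen[msg]:.0f}s uptime: {msg}")
```

To run it on a real rig, replace `SAMPLE` with the lines of a saved system log file. A message that first appears many hours after boot, as in the sample, points at something that develops while mining rather than at a card that fails to initialize.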
I have the same problem.
So frustrating.
I tried many things:
connecting the card directly into the motherboard - same error
reinstalled BIOS for the mobo
changed to working risers
added RAM
even had a chat with the HiveOS assistant (he thought it was the BIOS)
If anyone can share more experience on how to get the GPU’s UVD working again, that would be great!
Thanks!