AMD 6800 Voltage is not set at startup. Need to apply new OC for it to be set

Hello. I’m setting up some RX 6800 rigs with ASRock Phantom Gaming cards.

I usually set 700 mV for the core, but I’ve noticed that amd-info shows the cores at around 750 mV.
This happens after the rig boots and it stays like that. If I then change the core voltage even slightly, the setting gets applied properly.

Is it possible that I’m doing something wrong, or could it be a HiveOS-related issue?
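A quick way to confirm the mismatch right after boot is to compare the reported core voltage with the OC target. This is only a minimal sketch, assuming the amdgpu driver exposes the GPU core voltage in millivolts through its hwmon `in0_input` file. The card name, the 10 mV tolerance and both function names are illustrative assumptions, not part of HiveOS.

```python
from pathlib import Path
from typing import Optional


def read_core_mv(card: str = "card0") -> Optional[int]:
    """Return the core voltage in mV from amdgpu hwmon, or None if unavailable."""
    hwmon_root = Path("/sys/class/drm") / card / "device" / "hwmon"
    if not hwmon_root.is_dir():
        return None
    # Assumption: hwmon*/in0_input holds the core (vddgfx) voltage in mV.
    for sensor in sorted(hwmon_root.glob("hwmon*/in0_input")):
        try:
            return int(sensor.read_text().strip())
        except (OSError, ValueError):
            continue
    return None


def voltage_applied(reported_mv: int, target_mv: int, tolerance_mv: int = 10) -> bool:
    """True when the reported voltage is within tolerance of the OC target."""
    return abs(reported_mv - target_mv) <= tolerance_mv
```

With the values from this post, `voltage_applied(750, 700)` is `False`, which is the state seen after boot before the OC is touched again.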


Possibly related: I have another issue with one of these cards. About 50% of the time when I boot a rig with this card, the card crashes and all the other AMD cards fall back to default OC. But when the card does boot up, it can mine for hours and hours without problems. I swapped it with another one until I can test it further in Windows to see whether it is a bad card or not. (Risers and everything else have been tested, and the problem is 100% tied to this specific card.)

UPDATE: The Phantom Gaming 6800 rigs crash after a few days (high LA). At this point I think it is 100% related to HiveOS. I will try a different mining OS.

Hi,
I have the same problem. My rig also restarts every ~20 hours with a GPU error. In this topic some users also talked about it: RX6800 - Efficient Overclocking

Any ideas how to solve it?

no solution yet for me

Which error, specifically?

Sorry, I was wrong. I’ve checked all the logs (syslog, minerlog, kernlog), and I had logs-on enabled for a while.

In conclusion: I cannot find any error. The rig just freezes, and my ioBroker script power-cycles the power plug because it detects that the power consumption is too low. Today this happened after 22 hours and then again after 2 hours. It has now been running for 2 hours…

But every time after a reboot I have to change something in the OC (like the fan setting) to get the right core voltage. Some other people mention this as well: RX6800 - Efficient Overclocking - #687 by i77ac10
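The decision logic of a power-based watchdog like the one described in this post can be sketched roughly as follows. This is not the poster's actual ioBroker script. The threshold, the number of samples and the function name are hypothetical and would need to be tuned to the rig.

```python
from typing import Sequence


def should_power_cycle(samples_w: Sequence[float],
                       min_power_w: float = 300.0,
                       required_low_samples: int = 5) -> bool:
    """Decide whether the smart plug should be power-cycled.

    samples_w holds wall-power readings in watts, oldest first. The rig is
    treated as frozen only if the most recent `required_low_samples` readings
    are all below `min_power_w`, so a single short dip does not cause a reboot.
    """
    if len(samples_w) < required_low_samples:
        return False  # not enough data yet
    recent = samples_w[-required_low_samples:]
    return all(w < min_power_w for w in recent)
```

Requiring several consecutive low readings is a deliberate choice here: it trades a slightly slower reaction for fewer false reboots. While hunting for the cause of a freeze, the same function can be used to only raise an alert instead of cutting power.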

There is obviously a rig issue with hardware, compatibility, settings, etc.

Assuming for the sake of discussion that an extra 10+ watts per GPU is being drawn, any system in use should be designed with enough overhead to handle a 20% increase without failing.

I’d put a direct monitor and keyboard on the rig, disable the rebooting script and see if you can catch the hang condition.

Too many folks are running 6800 rigs weeks and months on end without what your rig is experiencing.

Thanks for your advice. I already did this yesterday: monitor, keyboard and an old digital camera pointed at the screen. The output just stopped at “New Job: xxx” and the rig froze. The logs also show nothing. I have two rigs with the same PSU, RAM, HDD, risers, cables… The one with the 3070 runs like a charm. Only the 6800 rig has this issue.

I think there could be more than one problem. I think the freezes are a separate issue from the core voltage OC not being applied at boot.

Maybe it is a BIOS setting, but I’ve already switched 4G decoding on, set Gen1, etc. I’m using this board: BTC B250C OEM (5.12 03/26/2021)

I hate to suggest “breaking” something that works, but have you swapped GPUs between the rigs to rule out that part of the system?

Edit: I am a big fan of standardizing on gear to support such troubleshooting/sparing. Kudos to you for your plan :slight_smile:

Thank you, and yes, this is good as long as you do not have the same faulty hardware in both systems :wink:

I have not swapped the GPUs yet. For now it has been running for some hours. The strange thing is that the longest run was almost 26 hours, while other times it only runs for 4 hours or just an hour. I’m not sure whether a kernel panic would be shown in the terminal while the miner is running. Now I’m only on the console without any program in the foreground.
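Whether a panic reaches the screen depends on the console setup, so it can help to also scan whatever logs survive the reboot. Below is a small sketch that filters log lines for suspicious messages. The patterns are assumed examples of what a kernel panic, a hung task or an amdgpu fault typically leaves behind, not an exhaustive or HiveOS-specific list.

```python
import re
from typing import Iterable, List

# Assumed example patterns; adjust them to what your kernel actually logs.
SUSPECT_PATTERNS = [
    re.compile(r"kernel panic", re.IGNORECASE),
    re.compile(r"amdgpu.*\*ERROR\*"),
    re.compile(r"gpu reset", re.IGNORECASE),
    re.compile(r"blocked for more than \d+ seconds"),
]


def find_suspect_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines that match any of the suspect patterns."""
    return [line.rstrip("\n") for line in lines
            if any(p.search(line) for p in SUSPECT_PATTERNS)]
```

The function accepts any iterable of strings, so an open log file can be passed directly. A hard freeze may leave nothing on disk at all, which would be consistent with the empty logs reported in this thread.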

So far I have only seen such random crashes in connection with thermal problems (but the GPUs are fine, and 8 GPUs on 2x 1000 W PSUs should also be OK) or with kernel panics when the driver crashes.
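The PSU claim can be checked with simple arithmetic. This is a rough sketch using the figures mentioned in this thread (8 GPUs, 2x 1000 W, a 20% margin). The 150 W per GPU and 100 W for the rest of the system in the usage note are placeholder assumptions, so measured values should be substituted.

```python
def has_headroom(gpu_count: int, watts_per_gpu: float, system_watts: float,
                 psu_capacity_w: float, margin: float = 0.20) -> bool:
    """True if the estimated load plus a safety margin fits the PSU capacity."""
    load_w = gpu_count * watts_per_gpu + system_watts
    return load_w * (1.0 + margin) <= psu_capacity_w
```

With the placeholder numbers, `has_headroom(8, 150, 100, 2000)` evaluates a load of 1300 W, which with the 20% margin comes to roughly 1560 W against 2000 W of combined capacity. Note that this treats both PSUs as one pool. In practice each PSU must have headroom for the cards actually wired to it.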


The best thing I could do is to separate the GPUs and run the rig with only two at a time, to isolate the fault if it is a GPU error.

Today at almost the same time (4:30 am) the rig froze and was rebooted again.

I’m currently updating the kernel. I saw that although I had reinstalled Hive, I was using a relatively old version.

Now I am on 5.10.0-hiveos #83

I noticed that the 3070 rig runs on kernel #72, even though I used the same installation image. I hope that this was the crucial point.
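When comparing rigs, it may be easier to parse the kernel string than to eyeball it. Here is a minimal sketch, assuming the version is written in the form quoted above (`5.10.0-hiveos #83`). The function name is illustrative.

```python
import re
from typing import Optional, Tuple

# Matches a release string followed by a "#<build>" number.
_KERNEL_RE = re.compile(r"^(?P<release>\S+)\s+#(?P<build>\d+)")


def parse_kernel(version: str) -> Optional[Tuple[str, int]]:
    """Split a string like '5.10.0-hiveos #83' into (release, build number)."""
    match = _KERNEL_RE.match(version.strip())
    if match is None:
        return None
    return match.group("release"), int(match.group("build"))
```

Comparing the build numbers of two rigs then becomes a plain integer comparison, which makes it easy to spot a rig that was left behind on an older build.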

I still have rigs on #72, which was the prior stable Kernel from November. With at least 0.6-212@211124 added, you still get updated nvtool:

I have the same issue with the core voltage being higher than my overclock setting after a reboot. Did you solve this other than manually resetting overclocks?

I also have a 6800 XT that keeps starting with a reduced hashrate, and I have to reboot up to 2 or 3 times to get it running normally.
