I’m using HiveOS and it’s been giving me a lot of problems. Sometimes a specific GPU stops mining and shows “err”. I lowered the OC from 2400 to 2100, but that hasn’t helped. My temps are pretty low as well. My main issue right now is that a few times a day the rig and the OS freeze completely and I have to power cycle it, and then it happens again a few hours later. I have set up Watchdog, but it doesn’t seem to help with this. New RAM made no difference either.
Specs:
Asus TUF Z590
8 GB DDR4 2666 MHz
120 GB Patriot Burst
2 x 1300 W Seagate PSUs (each powering 2 GPUs; one of them also powers the motherboard and CPU)
4 x RTX 3090 EVGA FTW3 (OC: 1200 core, 2400 mem)
HiveOS 0.6-212@211130
Ethermine, T-Rex miner
My risers are powered by PCIe as well, and it says I’m pulling 1.1 to 1.3 kW.
All I want is for it to run 24/7 with no issues. Any help is appreciated.
In the future you can just paste the screenshot into the reply box instead of using a third-party image hosting site. Your core clocks are a bit high; I would lower them to 1140 MHz and increase the fan speed to 100%. That should improve stability a lot.
Yes, I have a display hooked up to it, and the entire screen stays frozen until I power cycle. On the HiveOS website the rig just shows as offline, and the Ethermine website says I have stopped mining too.
20220301 21:08:29 GPU #3: using kernel #2
20220301 21:08:30 GPU #2: using kernel #5
20220301 21:08:31 GPU #0: using kernel #4
20220301 21:08:33 GPU #1: using kernel #2
20220301 21:08:35 [ OK ] 1/1 - 493.96 MH/s, 45ms ... GPU #3 | 4.31 G
20220301 21:08:41 TREX: Can't find nonce with device [ID=2, GPU #2], cuda exception: CUDA_ERROR_LAUNCH_FAILED, try to reduce overclock to stabilize GPU state
20220301 21:08:41 WARN: Miner is going to shutdown...
20220301 21:08:41 Main loop finished. Cleaning up resources...
20220301 21:08:41 ApiServer: stopped listening on 127.0.0.1:4059
20220301 21:08:43 T-Rex finished.
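When the crash repeats, it can help to check whether the same GPU is named every time, since that points at one card or its riser rather than the whole rig. Below is a minimal Python sketch that tallies the faults in a saved miner log. It assumes the fault lines look like the `Can't find nonce with device [ID=2, GPU #2], cuda exception: ...` line above. The function name `count_gpu_faults` is made up for illustration and is not part of T-Rex or HiveOS.

```python
import re
from collections import Counter

# Assumed line shape, taken from the log excerpt above:
# "... device [ID=2, GPU #2], cuda exception: CUDA_ERROR_LAUNCH_FAILED, ..."
FAULT_RE = re.compile(r"device \[ID=(\d+), GPU #(\d+)\], cuda exception: (\w+)")


def count_gpu_faults(log_text):
    """Return a Counter mapping (gpu_number, error_name) to occurrence count."""
    faults = Counter()
    for line in log_text.splitlines():
        match = FAULT_RE.search(line)
        if match:
            gpu_number = int(match.group(2))
            error_name = match.group(3)
            faults[(gpu_number, error_name)] += 1
    return faults


if __name__ == "__main__":
    sample = (
        "20220301 21:08:35 [ OK ] 1/1 - 493.96 MH/s, 45ms ... GPU #3 | 4.31 G\n"
        "20220301 21:08:41 TREX: Can't find nonce with device [ID=2, GPU #2], "
        "cuda exception: CUDA_ERROR_LAUNCH_FAILED, try to reduce overclock "
        "to stabilize GPU state\n"
    )
    # Print the most frequently failing GPU first.
    for (gpu, err), n in count_gpu_faults(sample).most_common():
        print(f"GPU #{gpu}: {err} x{n}")
```

If one GPU accounts for nearly all of the faults even after its overclock is reset, the card's riser, cable, or slot is worth swapping before lowering clocks any further.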
I just reset the OC on GPU 2 and I’ll let you know what happens. I will try to increase it from there. I had already dropped it from 2400 to 2100 and still had the same issue. What do you think would be an acceptable clock speed?
So I solved the issue by replacing the riser and trying a new USB port. I have been testing stability for about a week while gradually increasing the OC, and it is now stable at 2400. Thanks for all the help.