I’m hoping the forum can provide some guidance. My 8-GPU rig goes offline a few times a day, and the only error that shows is “nvtool error (3)”.
I’ve updated HiveOS to the latest version and am using the NVIDIA driver listed below. I’ve lowered my overclock settings, but the problem still happens: it mines for a few hours, then just stops and goes offline.
Could it be that I’m using the wrong NVIDIA driver? Or are my overclock settings too high?
Are there log files I can look at within HiveOS that might show more detail on why it’s going offline? I can’t seem to find where log files are stored.
Below are my specs.
HiveOS → 0.6-218@220615
NVIDIA driver → 510.60.02
Miner → lolMiner
Mining ETH on Hiveon Pool
Motherboard → TB360-BTC PRO 2.0 BIOSTAR Group (5.13 06/08/2021)
CPU → 6 × Intel(R) Core™ i5-9600K CPU @ 3.70GHz AES
GPUs
4 x 1660 Supers (flashed the BIOS to get higher MH/s)
1 x Non-LHR 3060ti
2 x LHR 3060ti
1 x LHR 3070
Are you on the latest stable image (kernel 110)? If not, start there.
Switch your 3060ti to a locked core clock and fine-tune the other 3060tis; 1600 core is more than you need. The goal is to find the lowest locked core clock that maintains full hashrate. You do not need a power limit with locked core clocks either.
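The search described above can be sketched roughly as follows. This is only an illustration: it assumes you have already written down the hashrate at a few locked core clocks by hand, and the function name `lowest_stable_clock`, the 1% tolerance, and the sample readings are all made up for the example rather than taken from this rig.

```python
def lowest_stable_clock(samples, tolerance=0.01):
    """Return the lowest core clock whose hashrate stays within
    `tolerance` (a fraction) of the best hashrate observed.

    `samples` maps locked core clock (MHz) -> measured hashrate (MH/s).
    """
    if not samples:
        raise ValueError("need at least one (clock, hashrate) sample")
    best = max(samples.values())
    threshold = best * (1.0 - tolerance)
    # Keep every clock that still delivers (nearly) full hashrate.
    good = [clock for clock, rate in samples.items() if rate >= threshold]
    return min(good)


if __name__ == "__main__":
    # Placeholder readings for illustration only, NOT real measurements.
    readings = {1600: 60.1, 1500: 60.1, 1400: 60.0, 1300: 58.2}
    print(lowest_stable_clock(readings))
```

With those placeholder readings the function picks 1400, since 1300 is the first step where the hashrate falls more than 1% below the best value.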
Thank you for the guidance. I updated the kernel to the latest stable version (#110) and set up locked core clocks on my GPUs. The MH/s is a bit lower, but so far so good in terms of stability.
It hasn’t gone offline since I made those changes last night. I’ll keep monitoring, but if it stays online for 24 to 48 hours I’ll start playing with the overclocks again to see if I can get a bit more MH/s.
I spoke too soon on the stability. The rig is still going offline, and it seems to happen more during the day when it’s hot. It’s been in the 90s here in southern California. In the evenings the mining rig seems to be more stable.
I’ve played with the overclocks, but it still goes offline. This is so strange, as the rig was pretty stable up to a few weeks ago. I don’t recall what changed since then.
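One way to check whether heat really lines up with the dropouts is to record per-GPU temperatures with timestamps and look at the last entries before the rig went quiet. Below is a minimal sketch, assuming `nvidia-smi` is on the PATH; the log file name `gpu_temps.log` and the 30-second interval are arbitrary choices for the example, not HiveOS settings.

```python
import csv
import io
import os
import subprocess
import time
from datetime import datetime

QUERY_FIELDS = "index,name,temperature.gpu,power.draw"
LOG_PATH = "gpu_temps.log"  # hypothetical file name, pick any writable path
INTERVAL_S = 30             # arbitrary sampling interval


def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts."""
    rows = []
    for rec in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(rec) != 4:
            continue  # skip blank or unexpected lines
        index, name, temp, power = (field.strip() for field in rec)
        rows.append({
            "index": int(index),
            "name": name,
            # nvidia-smi can report a non-numeric placeholder for a field
            "temp_c": int(temp) if temp.isdigit() else None,
            "power_w": power,
        })
    return rows


def read_gpus():
    """Query all GPUs once through nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_csv(out)


def log_forever(path=LOG_PATH, interval_s=INTERVAL_S):
    """Append one timestamped line per GPU until interrupted."""
    with open(path, "a") as fh:
        while True:
            stamp = datetime.now().isoformat(timespec="seconds")
            for gpu in read_gpus():
                fh.write(f"{stamp} gpu{gpu['index']} {gpu['name']} "
                         f"{gpu['temp_c']}C {gpu['power_w']}W\n")
            # Force the data to disk so the last samples survive a hard hang.
            fh.flush()
            os.fsync(fh.fileno())
            time.sleep(interval_s)
```

Calling `log_forever()` on the rig and checking the final lines after the next dropout should show whether one card was running noticeably hotter than the rest just before it happened.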
Would you happen to know how to look at the logs to see if that reveals anything? I might also try unplugging the GPUs and adding them back one at a time to see if that fixes it.
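On the logs question, the NVIDIA kernel driver reports GPU faults in the kernel log as "Xid" messages, so those are a reasonable first thing to search for. Here is a minimal sketch that pulls them out of a block of kernel log text. The helper name `find_xid_events` is made up for the example, and where the kernel log can be read from on a given HiveOS install is left as an assumption to check.

```python
import re

# Matches lines such as:
#   NVRM: Xid (PCI:0000:01:00): 79, pid=1234, GPU has fallen off the bus.
XID_RE = re.compile(r"NVRM: Xid \((?P<pci>[^)]*)\): (?P<code>\d+)")


def find_xid_events(log_text):
    """Return (pci_address, xid_code, full_line) for every Xid message."""
    events = []
    for line in log_text.splitlines():
        match = XID_RE.search(line)
        if match:
            events.append(
                (match.group("pci"), int(match.group("code")), line.strip())
            )
    return events
```

Feed it the text of a kernel log, for example the output of `dmesg`, or of `journalctl -k -b -1` for the previous boot if the system keeps its journal across reboots. The PCI address in each hit points at the card that raised the error. An empty result does not clear the GPUs, though: a power loss or hard freeze can take the rig down before anything is written.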
Unfortunately, no error messages. It simply goes offline anywhere from 10 to 20 minutes after start.
Last night, I reverted back to the original BIOS on the 1660S in hopes that would get it stable, but no luck.
My next troubleshooting step will be to unplug my 8 GPUs and add them back one at a time to see if that somehow fixes it. Otherwise, my next step is to get a new SSD and reinstall HiveOS on it. I’ve seen some other miners do this last step with good results.
I would start with more conservative memory clocks to test, maybe around 1000. Leave the core clocks where they’re at. If it’s stable, work your way up.
I changed my memory clocks to 1000 on my 3060tis and 3070, and that seemed to work, as it mined for about 12 to 13 hours straight, but around 10am this morning it went offline again. No error message or anything.
I’m going to turn on the logs to see if that captures anything in the background that could tell me what’s going on.