Hello,
I have two problems. First, after 15 days of normal operation, I now get a GPU driver error every half hour.
I tried changing the clocks, but it's the same.
Second, after that error the rig won't restart; it just drops off Hive. The cards keep working hard and temps go to 90-100 and more, but I can't see the rig in Hive.
For some reason I need to restart it twice before it shows up in Hive again.
Please help if someone knows what to do.
Cards: 3060 Ti
B450-A PRO MAX (MS-7B86) Micro-Star International Co., Ltd (M.E3 09/28/2021)
CPU
4 × AMD Ryzen 3 1200
Can you post a screenshot of your worker overview screen?
Don't use core offsets on modern cards: switch your -502 core offset to a locked core clock of 1500 and remove the power limits.
If any cards still crash after that, reduce the mem clocks from there, but locked core clocks and no power limit should help a lot with stability.
OK, will try.
But why, after the error and reboot, can't I see the rig in Hive, and why are the GPUs so hot?
I have to restart once or twice for it to show up in Hive.
If you crash the driver with poor OCs, anything can happen. Are the fans not spinning after it crashes?
When you get the GPU driver error messages, did you click on them to see which card was causing it?
The fans are spinning, but temps are still high.
Yes, I clicked, but I can't find exactly which GPU has the problem.
Here is the error:
6m 28s - GPU driver error, no temps, rebooting
GPU Health Data:
21:00.0 Temp: 62C Fan: 51% Power: 135W
25:00.0 Temp: 62C Fan: 69% Power: 135W
26:00.0 Temp: 62C Fan: 61% Power: 135W
27:00.0 Temp: 62C Fan: 83% Power: 135W
28:00.0 Temp: 62C Fan: 64% Power: 135W
29:00.0 Temp: 62C Fan: 90% Power: 135W
Latest GPU driver errors list:
Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000010
Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000011
Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000012
Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000013
Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000014
Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000015
Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000016
Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000017
Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000011
Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000012
Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000013
Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000014
Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000015
Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000016
Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000017
alczaty:
(PCI:0000:26:00)
Here's your problem GPU; reduce the OC on this one.
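If the error list is long, picking the bus ID out by eye gets tedious. Here is a rough Python sketch that counts Xid lines per PCI address, assuming the log lines look like the ones posted above. The function names count_xid_errors and noisiest_gpu are made up for this example and are not part of Hive. The address it returns (for example 26:00) is the prefix of the matching entry in the GPU Health Data list (26:00.0).

```python
import re
from collections import Counter

# Matches kernel lines shaped like the ones in this thread, e.g.
#   NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000010
# Group 1 is the bus:device part ("26:00"), group 2 is the Xid code ("45").
XID_RE = re.compile(
    r"Xid \(PCI:[0-9a-f]{4}:([0-9a-f]{2}:[0-9a-f]{2})\): (\d+)",
    re.IGNORECASE,
)


def count_xid_errors(log_text):
    """Count occurrences of each (pci_address, xid_code) pair in the log text."""
    return Counter(
        (addr.lower(), int(code)) for addr, code in XID_RE.findall(log_text)
    )


def noisiest_gpu(log_text):
    """Return the PCI address with the most Xid lines, or None if there are none."""
    per_gpu = Counter()
    for (addr, _code), n in count_xid_errors(log_text).items():
        per_gpu[addr] += n
    return per_gpu.most_common(1)[0][0] if per_gpu else None


if __name__ == "__main__":
    sample = (
        "Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000010\n"
        "Apr 18 14:19:52 Djuro2 kernel: NVRM: Xid (PCI:0000:26:00): 45, pid=31173, Ch 00000011\n"
    )
    print(noisiest_gpu(sample))  # prints 26:00
```

If the errors keep landing on the same address, that card (or its riser or power cable) is the one to look at first.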
How do I know which one that is?
Edit: found it. Thank you. Will try now and see.
You're using locked core clocks now, right?
And the problem still continues.
I changed the riser and it's the same. What should I do?
Reduce and reboot; repeat until stable.
Tried that, but I think I found it. I moved the riser between cards, and the error always shows up on a different card. So I will try a third riser.
Some older Hynix memory doesn't take a high OC very well and needs to run around 1750-1800. The newer versions can take a lot more OC; some people call them Hynix V2.
How can I best test two of the 3060 Ti cards?
I tried every clock and changed risers multiple times, but they generate DAG errors and crash after 10 minutes.
Still crashes at 0 memory?
Will try with 0 mem, but I tried 1500 and had the same problems.
What should I set for the core clock?
Find the highest stable mem clock
Then find the lowest locked core clock that maintains full hashrate
If it’s not stable, lower mem clock and reboot
Repeat until stable
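The steps above can be sketched as a small search loop in Python. This is only an illustration of the order of operations, not a tool that touches the cards: is_stable and hashrate are hypothetical callbacks you would have to supply yourself (in practice each "call" means applying the OC, rebooting, and watching the miner for a while), and the tolerance of 0.99 for "full hashrate" is an assumption.

```python
def tune(mem_steps, core_steps, is_stable, hashrate, tolerance=0.99):
    """Return (mem, core) following the procedure above, or None if nothing is stable.

    mem_steps  -- candidate memory clock settings
    core_steps -- candidate locked core clocks
    is_stable  -- hypothetical callback: is_stable(mem, core) -> bool
    hashrate   -- hypothetical callback: hashrate(mem, core) -> float
    tolerance  -- fraction of the reference hashrate that still counts as "full"
    """
    cores = sorted(core_steps)   # lowest core clock first
    top_core = cores[-1]         # generous core clock used while testing memory

    # Try the highest memory clock first; if it is not stable, lower it and retry.
    for mem in sorted(mem_steps, reverse=True):
        if not is_stable(mem, top_core):
            continue
        full = hashrate(mem, top_core)  # reference hashrate at this memory clock
        # Walk up from the lowest core clock until full hashrate is maintained.
        for core in cores:
            if is_stable(mem, core) and hashrate(mem, core) >= tolerance * full:
                return mem, core
    return None


if __name__ == "__main__":
    # Toy stand-ins with made-up numbers, only to show the flow.
    def fake_stable(mem, core):
        return mem <= 1800

    def fake_hashrate(mem, core):
        return 60.0 if core >= 1400 else 60.0 * core / 1400

    print(tune(range(1500, 2300, 100), [1200, 1300, 1400, 1500],
               fake_stable, fake_hashrate))  # prints (1800, 1400)
```

Memory is searched first because it is usually what crashes the driver; the core clock is then lowered only as far as the hashrate allows.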