Yes, all my AMD cards are PowerColors.
HaloGenius, could you ask the dev team to add a note about this specific problem to the 0.5-54 upgrade? Or at least mention it in the changelog.
@Bagster
I've already reported this nuance to the RU community along with the new version announcement (Telegram channel, forums, etc.) and also on the bitcointalk.org forum.
The Russian community pinned my announcement (in Russian, of course) in the RU channel. Yes, I agree it would be much better if this information were in the Announce channel as well.
The Bitcointalk announcement was as follows:
v0.5-54
- IMPORTANT! New AMD undervolting technique, more reliable core voltage states now. Though it's slower to apply.
- Invalid shares reported by Claymore for each GPU
Additional notice:
In version 0.5-54, a new mechanism is used to reduce power consumption. As a result, GPU core voltages that were set previously may need to be adjusted.
If this is OK, we can add this info to the Announce channel as well. Or you can propose a more informative version.
How about reverting it or offering the old version in addition to the new version given it has issues for at least those with PowerColors?
Maybe just a note like this:
WARNING: New power consumption mechanism seems to introduce fan settings problems on some PowerColor cards (models to be determined).
Is personally testing over 40 cards enough to validate that? I have 570s and 580s, Red Devils, Red Dragons, Golden and non. I have a whole DC full of these cards, so several hundred, and I can't in good faith upgrade to the latest HiveOS until this issue is resolved. I'm trying various BIOS changes, over/underclock changes, etc., and I keep getting GPU hangs or 0% fan speeds. Revert back to the previous amd-oc, no issues.
@seanjnkns, can you try moving the fan lines (lines 200-201 in the v0.5-54 amd-oc) above the core/voltage lines (lines 174-193)? If that doesn't work, can you delete the fan lines or comment them out with #?
Something is causing wolfamdctrl to break on your cards, but I don't know what.
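For anyone who wants to script that reordering instead of editing by hand, here is a rough sketch in Python. It only moves a block of lines upward by line number; the helper name `move_block_before` is made up for illustration, and the line numbers 200-201 and 174 are the ones quoted above, so check them against your own copy of amd-oc (and back it up) before relying on this.

```python
def move_block_before(lines, block_start, block_end, target):
    """Move lines block_start..block_end (1-indexed, inclusive) so that they
    sit immediately before the line currently numbered `target`.

    Only handles moving a block upward (target above the block).
    """
    if not (1 <= target < block_start <= block_end <= len(lines)):
        raise ValueError(
            "expected 1 <= target < block_start <= block_end <= len(lines)"
        )
    block = lines[block_start - 1:block_end]
    rest = lines[:block_start - 1] + lines[block_end:]
    # target is above the removed block, so its index in `rest` is unchanged
    return rest[:target - 1] + block + rest[target - 1:]


# Hypothetical usage (path and numbers are assumptions, not verified):
#   with open("amd-oc") as f:
#       lines = f.readlines()
#   lines = move_block_before(lines, 200, 201, 174)
#   with open("amd-oc.reordered", "w") as f:
#       f.writelines(lines)
```

Writing to a separate file keeps the original untouched, so reverting is just a matter of not swapping the files.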
Testing moving the lines now.
For information (in case others have a similar problem): rebooting was not sufficient for me to correct the problem. I had to shut down and remove power.
@brnfex, I tried your suggested modification and that worked across 4x10 RD570/580 mixes of Elpida, Hynix and Micron RAM.
@Bagster, simply powering off temporarily would fix the issue, but a subsequent reboot could, and typically did, undo it.
Glad to hear that, @seanjnkns. So you moved the fan lines before the core/voltage ones and it worked? Are you using a fixed fan speed, or did you leave it on auto?
Fixed fan speeds of 80%. However, if you re-apply the OC settings after boot, the system is prone to crashing with a null pointer dereference kernel stack trace, or the miner will get stuck in uninterruptible (D) or zombie (Z) state.
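If it helps anyone confirm the D/Z symptom on their own rig, a minimal sketch for listing processes in those states by reading `/proc/<pid>/stat` on Linux follows. The function names are my own, and this only reports the state; it says nothing about why the miner got stuck.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the LAST ')' to find where it ends.
    left, right = stat_text.rsplit(")", 1)
    pid_str, comm = left.split("(", 1)
    state = right.split()[0]
    return int(pid_str), comm, state


def stuck_processes(proc_root="/proc"):
    """List (pid, comm, state) for processes in D or Z state."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the entry was unreadable
        if state in ("D", "Z"):
            found.append((pid, comm, state))
    return found
```

Calling `stuck_processes()` on the rig and printing the result shows whether the miner process is among them. A process that is only briefly in D state during I/O is normal; one that stays there is the case described above.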
Nope, I spoke too soon. After further testing, the original problem of a 0% fan speed is back.
For now, reverting back to the old amd-oc. Had enough downtime guinea pigging my servers.
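For those who stay on 0.5-54 anyway, a small watchdog-style check for the 0% fan symptom might look like the sketch below. It assumes the amdgpu driver exposes the fan duty cycle as `pwm1` (range 0-255) under each card's hwmon directory in sysfs; the glob pattern and function names are assumptions for illustration, so verify the paths on your own rig first.

```python
import glob


def pwm_percent(pwm, pwm_max=255):
    """Convert a raw hwmon pwm value to a rounded percentage (0-100)."""
    if pwm_max <= 0:
        raise ValueError("pwm_max must be positive")
    return round(100 * pwm / pwm_max)


def read_int(path):
    """Read a single integer from a sysfs-style file."""
    with open(path) as f:
        return int(f.read().strip())


def fans_at_zero(pattern="/sys/class/drm/card*/device/hwmon/hwmon*/pwm1"):
    """Return the pwm1 paths whose current value rounds to 0%."""
    stopped = []
    for path in sorted(glob.glob(pattern)):
        try:
            if pwm_percent(read_int(path)) == 0:
                stopped.append(path)
        except (OSError, ValueError):
            continue  # card disappeared or file was unreadable
    return stopped
```

With a fixed fan speed configured, any path returned by `fans_at_zero()` would point at the problem described in this thread. On cards left in automatic mode a 0% reading at idle can be legitimate, so treat the result as a hint rather than proof.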
I think we should scratch 0.5-54. I don't think it is suitable for "general" users.
I don't even think it's a "general" user issue. The "updated" amd-oc should be left as an option for users to use at their own risk, as it's experimental, while keeping the stable one available. Just like what was done with dual Claymore, for example: you can have the latest, or you can have a prior version. Those who want to be guinea pigs can be, and those who opt for stability can use a prior version.
Can you try to increase the overclock to 950mV/1150MHz and check if the cards are stable?
I'll let someone else guinea pig at this rate. Changing the settings doesn't change the fact that you still get null pointer derefs with this amd-oc; even with the lines moved around in the code, you'll get 0% fan speeds, etc. If someone else wants to spend a few hours playing with this, more power to them. I just spent 3 hrs on this trying all sorts of settings, and the only thing that was stable, and I'm not just talking about a lack of GPU soft lockups, was the older amd-oc.
It could be caused by a card crashing after too low a voltage or too high an overclock. The cards ran at roughly 950mV before the patch and didn't go lower, regardless of the value in the web dashboard. The only thing that changed is that all voltage and core states are now overridden; it shouldn't affect the fan at all. So I'm trying to understand why your cards are crashing while everyone else's aren't.
Yep, this should really be left as an option for the meantime; then for v0.6, OC changes can be managed in the web UI as well...
I'll try to mess around with amd-oc this weekend... my 6-GPU test bench should be enough, heh. Hopefully it won't take too much time to get it right; time is $.