I am pretty confident that I do. I can run nvidia-smi with no issues, and now that I reran it I actually do see python listed, so I’m wondering if it is using the GPU and the fans just aren’t running because of the colder environment?
This is what I get from nvcc --version. I am just going to call it good unless things still seem to be running slowly.
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
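If you want something more explicit than eyeballing the nvidia-smi table, here is a minimal Python sketch that lists the compute processes on the GPU and picks out the python ones. It assumes nvidia-smi is on your PATH and that your driver supports the --query-compute-apps option; the helper names (parse_compute_apps, python_on_gpu, list_compute_apps) are made up for this example.

```python
import subprocess

# Fields are the ones listed by `nvidia-smi --help-query-compute-apps`.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader",
]


def parse_compute_apps(csv_text):
    """Turn the CSV text into a list of (pid, process_name, used_memory) tuples."""
    apps = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split the pid off the front and the memory off the back,
        # so a comma inside the process name does not break parsing.
        pid, rest = line.split(",", 1)
        name, memory = rest.rsplit(",", 1)
        apps.append((int(pid), name.strip(), memory.strip()))
    return apps


def python_on_gpu(apps):
    """Keep only the processes whose name contains 'python'."""
    return [app for app in apps if "python" in app[1].lower()]


def list_compute_apps():
    """Run nvidia-smi and return the parsed compute processes."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_compute_apps(result.stdout)
```

Calling python_on_gpu(list_compute_apps()) while training is running should return at least one entry if the job really is on the GPU; an empty list suggests it is running on the CPU.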
For a couple of months after purchasing my 1080Ti (which I had shipped from the US), I was unaware that the fan was not working out of the box. I too was under the impression that the fan would kick in only when the GPU needed it, so I never checked it.
When I tested it after a couple of months, I found that the fan would not start under any condition, and the GPU would trip if the temperature exceeded 90C. Getting it replaced would have cost me 1/6th to 1/5th the price of the GPU due to shipping and customs duty (I had to bear the cost of sending it to them and paying the octroi on return - this is EVGA Taiwan).
As a hack, I opened up the top acrylic cover to expose the internal heat sinks. Then set up an arrangement of 3 fans (2 blowers and 1 exhaust) for cooling. Looks badass now.
With my ambient temperature being around 15C.
I don’t know what model of GTX 1080Ti you have, but keep in mind that the higher the voltage you set on your card (either by default or manually), the higher your temps will inevitably be.
I don’t know how to retrieve the GPU core voltage on Linux, but here is the power draw in watts from the nvidia-smi -i 0 -q -d POWER command:
Timestamp : Wed Nov 8 16:38:18 2017
Driver Version : 384.59
Attached GPUs : 1
Power Management : Supported
Power Draw : 10.46 W
Power Limit : 250.00 W
Default Power Limit : 250.00 W
Enforced Power Limit : 250.00 W
Min Power Limit : 125.00 W
Max Power Limit : 300.00 W
Duration : 116.01 sec
Number of Samples : 119
Max : 11.23 W
Min : 9.20 W
Avg : 10.14 W
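In case you want to log these numbers over time instead of reading them by eye, here is a rough Python sketch that turns the "Key : Value" lines of such a report into a dict and computes how far the card is from its power limit. It assumes the report keeps the layout shown above; the function names are my own, not part of any NVIDIA tooling.

```python
import subprocess


def parse_report(text):
    """Collect the 'Key : Value' lines of an nvidia-smi -q report into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            # Keep the first occurrence if a key shows up more than once.
            fields.setdefault(key.strip(), value.strip())
    return fields


def watts(value):
    """Convert a string such as '10.46 W' to a float (raises on 'N/A')."""
    return float(value.split()[0])


def power_headroom(fields):
    """Watts left between the current draw and the active power limit."""
    return watts(fields["Power Limit"]) - watts(fields["Power Draw"])


def read_power_report(gpu_index=0):
    """Run the same command as above and return the parsed fields."""
    cmd = ["nvidia-smi", "-i", str(gpu_index), "-q", "-d", "POWER"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_report(result.stdout)
```

With the idle readings above (10.46 W drawn against a 250 W limit) the headroom is about 239.5 W; under a real training load the draw should climb well above idle, which is another hint that the GPU is actually being used.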
Also, when the card reaches a threshold temperature (which you can get with nvidia-smi -i 0 -q -d TEMPERATURE), performance will inevitably decrease because the GPU automatically downclocks itself to avoid reaching the “Shutdown temperature”. Here are my results:
~ ➜ nvidia-smi -i 0 -q -d TEMPERATURE
Timestamp : Wed Nov 8 16:41:47 2017
Driver Version : 384.59
Attached GPUs : 1
GPU Current Temp : 26 C
GPU Shutdown Temp : 96 C
GPU Slowdown Temp : 93 C
Memory Temp : N/A
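To make the slowdown/shutdown logic concrete, here is a small Python sketch that reads the three temperatures out of a report like the one above and says how close the card is to throttling. The 10 C warning margin and the function names are assumptions of mine, not anything nvidia-smi defines, and the sketch expects all three temperatures to be present.

```python
import re


def read_temp(report, label):
    """Return the integer temperature for a label, or None for 'N/A' or a missing line."""
    match = re.search(re.escape(label) + r"\s*:\s*(\d+)\s*C", report)
    return int(match.group(1)) if match else None


def thermal_status(current, slowdown, shutdown, margin=10):
    """Classify the current temperature against the card's thresholds.

    margin is an arbitrary warning band (in C) below the slowdown temperature.
    """
    if current >= shutdown:
        return "shutdown"
    if current >= slowdown:
        return "throttling"
    if current >= slowdown - margin:
        return "close to throttling"
    return "ok"


def status_from_report(report):
    """Combine the two helpers for a full TEMPERATURE report."""
    return thermal_status(
        read_temp(report, "GPU Current Temp"),
        read_temp(report, "GPU Slowdown Temp"),
        read_temp(report, "GPU Shutdown Temp"),
    )
```

With my numbers (26 C current, 93 C slowdown, 96 C shutdown) this returns "ok"; a card sitting at 85 C with the same thresholds would land in the "close to throttling" band.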
I already stress tested my card (which is the first thing I do when I receive a new one) and I never went above 80C even during summer when ambient temperatures were around 30C.
What you can do instead is replace the cooling block on your card with a watercooling one. Like this one. Finding this hacky solution was very clever btw, but I don’t think it will hold if you run your GPU at 100% for days (well, you tell me, haha).
The deterrent was that I saw videos of how these are fitted. It involves opening up the GPU (50 to 80 screws, big, small and tiny) and exposing the chip. Given that there are no local service centers in my country, this was a very risky proposition. Very high chance of making mistakes.
This solution holds to 85C most of the time, though I am making use of the AWS credits first.
Oh okay, sad to hear that. But 50 to 80 screws? That looks insane. Usually when you unmount a GPU you have 3 parts:
The “radiators” (the metal grid)
The plastic mount (the thing on top of the metal grid)
You only need to unmount the part between the board and the metal grid to put a new one in. See here (I’ve set the right timer). But well, if your solution works for you then kudos for that. I’ll keep it in mind for when my own card dies.
Btw just a random thought: maybe your fans aren’t spinning because the power connector between your GPU board and your plastic mount (where you have the fans) isn’t properly connected? (You can see what I’m talking about in the video link above.) Did you check that you used the right power cables from your power supply on your card? Having 1 fan not working is ok, but having both not working looks like a “power shortage” issue.
Yes connection can be a problem. Unfortunately the one I have (EVGA 1080 Ti FE) has a blower style fan, and the connection goes inside the GPU. I will have to open all the 50 screws to check this, unfortunately.
So I will probably take a chance once this gets a bit old.