I figured out a solution to a hardware troubleshooting problem I’ve had on and off for at least a couple of years. I feel like others might run across it and, not knowing anywhere better to note it, I suppose I might as well dump it on my blog and hope search engines can make it available to those who need it.
Anyways, the problem is that sometimes my computer wouldn’t boot, particularly after power outages (in retrospect, clue 1). Sometimes resetting the CMOS would work. Sometimes resetting the CMOS wouldn’t work, but pulling out various cards and memory modules and reseating them would.
Once when I encountered this issue, I discovered that plugging in my monitor using HDMI instead of DisplayPort worked (clue 2). This, however, led to a separate problem which took me months to figure out (HDMI only supports 30 fps at my monitor’s native resolution, an issue fixed in a later rev of my monitor’s hardware).
When the issue recently started to recur (after a weather-related power outage), I discovered that plugging in an alternate GPU (an incredibly cheap emergency backup discrete unit I have) also worked. This one only had HDMI, but it couldn’t drive native resolution anyways, so at least I didn’t have to worry about 30 fps.
The issue had gotten significantly worse when I upgraded my CPU and motherboard a year ago, so I spent weeks thinking that my GPU simply didn’t work well with the new CPU/motherboard and that it was time to find a different GPU.
The new GPU recently showed up and had the exact same problem as the previous one (oops). Everything finally dawned on me when I tried the HDMI output on the off chance that it wouldn’t also be affected by the 30 fps issue (it was, but at least the computer booted!).
It turned out that the problem was my monitor this whole time. I fully power cycled my computer probably dozens of times in the course of debugging this issue. But I didn’t power cycle my monitor once… doing so fixed my issue fully.
I guess I’ve been thinking that monitors are still “dumb”, that they just show pixels coming in over the wire. But that is no longer the case, hasn’t been for years, and so if you’re having issues that you think are GPU issues, don’t forget to do the “turn it off and then back on again” routine with your monitor as well!
In my case, I think the issue was that the monitor’s display connection state had been slightly corrupted (whether due to power cycling or something else), and that the GPU was unable to complete a valid connection to the monitor during boot. Without a GPU in a valid state to drive a display, the motherboard would apparently abort the boot process (it had a GPU debug LED lit up, and I couldn’t figure out the cause since the GPU always worked fine if I could get the computer to boot).
I could never find this problem described in any of my searches, but hopefully this will help someone else if they encounter something similar.