Earlier this week, I bought myself a new 2TB NVMe drive on sale, and had planned to install it today. The computer, built in Dec 2021, had an issue with PCIe at the time that prevented x4 drives from running at x4, but there was now a BIOS update that fixed that (and introduced support for two additional CPU generations), so I thought I should go ahead and flash the BIOS too.

Sparing unnecessary details, everything went fine. I unseated the 3060Ti to get at the M.2 ports and swapped out the old drive for the new one. Plugged in a spinner I had lying around while I was in there. Machine wouldn’t boot - so I reinstalled GRUB (yes, I’m a Linux user) and that was fixed. The disk had everything copied over from the old one, so that was fine too, other than some power management spam in the journal that was fixed with a kernel param. Seems fine, right?

So I spin up Minecraft to have a quick session. Frame rates are like, 3-20 fps for about 2 full minutes, then they get up to 60fps (Vsync’d). But every few minutes, they keep dropping and creep back up. Sometimes they bounce up.

There are at least a dozen potential problems that could cause something like this. So I went back into the newly flashed BIOS, knowing that all my settings had been reset, and started poking around and fixing settings. This was good either way, but it didn’t fix my sporadic fps dips. The game ALWAYS started at sub-10 fps, but it would get to 60 eventually. Sometimes in a handful of seconds, sometimes in minutes. One time, all the textures refused to load. Another time, the GPU fell off the bus.

This was not a problem at all prior to the work, so I started reviewing what I’d done so far. I’d added a disk with all copied data… but there’s an NVIDIA cache in there… cleared that. No dice. I looked at heat. No problem. Same drivers as before, but maybe I have to roll back drivers, or kernel, or BIOS. I dunno. Frustrated as fuck.

Could also be a power issue. Or a reseating the card issue. I hadn’t added any devices with more power consumption and I have ~100W of buffer on a good PSU, so it really shouldn’t be a problem, but I reseat the card and replug the power cables anyway.

The problem is worse.

So I spend another hour and a half poking around forums looking at every possible thing I could do on the software side to fix this, before I decide to reseat the card one more time. I mean, the first time changed SOMETHING, even if it was for the worse.

Popping out the card, I blew into the slot as I thought I saw a tiny speck of dust. I used canned air the first time but this was really an afterthought by this point…

The body of a tiny moth popped out of the slot.

I reseated the card, spun up Minecraft, and everything is fine. Perfect frame rates, just like before.

I’ll never get the hours back that I spent troubleshooting this today. At least several other items were improved along the way, such as RAM timings. But in the event this story helps anyone in the future, it’s worth the time it took to type it.
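
For anyone on Linux chasing something similar, here is a rough sketch of one check that might help narrow things down: comparing each PCIe device's negotiated link width against the maximum it reports, using the `current_link_width` and `max_link_width` attributes under `/sys/bus/pci/devices`. The function name `degraded_links` is just something made up for this example, and a narrower link is not always a fault (a card in a slot wired for fewer lanes will show up too), so treat the output as a hint and not a diagnosis. I can't say whether it would have caught the moth.

```python
from pathlib import Path

# Standard sysfs location for PCI devices on Linux.
SYSFS_PCI = Path("/sys/bus/pci/devices")


def read_attr(dev: Path, name: str):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        return (dev / name).read_text().strip()
    except OSError:
        return None


def degraded_links(root: Path = SYSFS_PCI):
    """List (address, current_width, max_width) for devices whose
    negotiated PCIe link is narrower than the maximum they report.

    Hypothetical helper for illustration only.
    """
    results = []
    if not root.is_dir():
        return results
    for dev in sorted(root.iterdir()):
        cur = read_attr(dev, "current_link_width")
        mx = read_attr(dev, "max_link_width")
        if cur is None or mx is None:
            continue  # not a PCIe device, or attribute missing
        try:
            cur_w, max_w = int(cur), int(mx)
        except ValueError:
            continue  # unexpected attribute format
        # A width of 0 usually means no active link; skip those.
        if 0 < cur_w < max_w:
            results.append((dev.name, cur_w, max_w))
    return results


if __name__ == "__main__":
    for addr, cur_w, max_w in degraded_links():
        print(f"{addr}: running x{cur_w}, capable of x{max_w}")
```

Run it as a normal user; the addresses it prints can be matched to devices with `lspci`.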

    • funkless@lemmy.world
      1 year ago

      literally the origin of the term - that bugs (literally) used to get into computers and mess them up

      • KevonLooney@lemm.ee
        1 year ago

        No, it isn’t the origin of the term.

        The term “bug” to describe defects has been a part of engineering jargon since the 1870s[7] and predates electronics and computers; it may have originally been used in hardware engineering to describe mechanical malfunctions. For instance, Thomas Edison wrote in a letter to an associate in 1878:[8]

        … difficulties arise—this thing gives out and [it is] then that “Bugs”—as such little faults and difficulties are called—show themselves[9]

        https://en.wikipedia.org/wiki/Software_bug