15 Jul

In need of rest

I recently had some odd issues with Windows 10 Pro (22H2) at home. For the longest while, I had working sleep. Then, all of a sudden, I was deprived of this essential talent. After a career in IT spanning three decades, I still sometimes have to stop and be amazed at how simple things manage to get messed up. Things like printing, connecting peripherals to computers, docking laptops and now sleep. How does one break sleep? Well, first of all we can talk about the virtues of S0 sleep vs. S3 sleep. The vogue is that computers don’t sleep anymore – not really. Like a lidless eye that never rests, they enter a limbo of sorts, where they make noises, connect to bluetooth devices or even establish VPN connections, but they are still ostensibly sleeping. Is the word gaslighting?

I hate it with a passion. A sleeping person does not interact with the world in any meaningful fashion, so why should a computer? Why would I want headphones to connect to a sleeping laptop, so that I then have to open the lid, rousing the black slate of misery from its half-hearted slumber only to disconnect them or turn off bluetooth, and then return it to that blessed state? Most of my bluetooth headphones and earbuds are incapable of connecting to two devices at once, so whoever snaps them up first holds on to them. I do get that you could sometimes want a peripheral to wake a computer up. Usually you do this by wiggling the mouse, or hitting a key on the keyboard. But bluetooth audio devices? Who does that?

Anyway, my Windows 10 desktop spends most of the day asleep. I wake it on occasion to play some WoW Classic, or perhaps to read emails that I don’t sync to my phone, because it’s already beeping every time a leaf falls somewhere. Two days ago it stopped doing that. I will attempt to describe the symptoms first, before delving into the more complex epidemiology.

  • Screens are in standby, so they are working as intended
  • Power LED is not blinking, as it does when the machine is sleeping. It’s burning solid, like when it’s on
  • Hitting a key on the keyboard does nothing
  • Wiggling the mouse does nothing
  • Connecting a USB peripheral does nothing
  • Pressing the power button shuts it down instantly.
  • Manually placing the machine into sleep (start menu -> sleep) worked! But waiting for the idle timeout caused the above issues!

What I did

I did five distinct things, one of which solved my problem. I realize this may not help you, dear reader, but maybe it points you in the right direction. The things I did were:

  • Replace the USB cable between my computer and my screen. My mouse and keyboard are hooked up to my screen to reduce cables going to my case. The reasoning here was that maybe it wasn’t sending the right wakeup signals to the PC so it could wake up. The flipside of this is that the computer is probably not going to sleep – as indicated by the solid power LED. So this is probably not it.
    • Edit: Changed back to the original USB cable. Unless it’s some funky thing like a twisted/bent cable somewhere I can’t see like inside my monitor arm, I don’t think this is it. It could be the port in the PC tho, so if this repeats I will try a different port
  • Uninstall Brother iPrint and associated software. In the technical ho-hum below, you will see that the Brother software is responsible for some odd Events in the Event Log. But: Since they have been present for months, this is probably not it either. Also, I’ve had this software for as long as Windows has been installed. So another unlikely.
  • I set each Universal Serial Bus Controller device to not go to sleep via Device Manager. I.e. as Administrator, open the properties of each USB controller, go to the Power Management tab, and make sure “Allow the computer to turn off this device to save power” is not checked.
  • Reset hibernation via an admin command prompt: powercfg -h off, reboot, powercfg -h on, reboot. This turns hibernation off and back on, which also recreates the hibernation file.
  • Ran sfc.exe /scannow, which found some issues and fixed them, but annoyingly, searching for the word corrupted or corruption in the associated CBS log didn’t conclusively say anything, and the number of fixed issues according to the log is zero?
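Since grepping the CBS log for "corrupt" got me nowhere, here is a minimal sketch of a slightly more targeted search. It assumes the log lives at C:\Windows\Logs\CBS\CBS.log and that the lines written by sfc carry the "[SR]" tag. The phrases it counts are an assumption about what sfc typically writes, not a complete list, and the function names are made up for this post.

```python
from collections import Counter

# Phrases that sfc commonly writes on "[SR]" lines. This list is an
# assumption for illustration and is probably not exhaustive.
PHRASES = ("Repairing", "Cannot repair", "Verify complete")


def summarize_sfc_lines(lines):
    """Count the "[SR]" lines and how many of them mention each phrase."""
    counts = Counter()
    for line in lines:
        if "[SR]" not in line:
            continue  # not written by the System File Checker
        counts["sr_lines"] += 1
        for phrase in PHRASES:
            if phrase in line:
                counts[phrase] += 1
    return counts


def summarize_cbs_log(path=r"C:\Windows\Logs\CBS\CBS.log"):
    """Run the summary over the CBS log (reading it may need an elevated prompt)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarize_sfc_lines(handle)
```

Calling summarize_cbs_log() from an elevated Python prompt should return the counts. A zero next to "Repairing" would at least match what the log seemed to be telling me.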

Out of all these, the last two or three seem the most probable. I’ll try to replicate this and come back with an update post later. The end result is that I have sleep functionality back – both when I click sleep manually, and when I let it idle for 30 minutes and go to sleep on its own.

Only the mad should proceed

This led me down such a rabbit hole, without a clear solution, that only the discerning individual should proceed beyond this point. Speculation and decreasing understanding follow. You’ve been warned.

Obviously all this is less than optimal behavior for a machine. Troubleshooting ensued. I started by looking at logs. In the system log, the following errors can be seen for every instance of failed sleep:

  • Event ID 6008 – The previous system shutdown at 20.13.28 on 14/07/2023 was unexpected.
  • Event ID 161 – Dump file creation failed due to error during dump creation.
  • Event ID 41 – The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.
  • Event ID 1001 – The computer has rebooted from a bugcheck. The bugcheck was: 0x000000a0 (0x00000000000000f0, 0x0000000000000004, 0x000000000000000e, 0xffffaa8748cc1040). A dump was saved in: C:\Windows\MEMORY.DMP. Report Id: nnnnnnnn-nnnn-nnnn-nnnn-nnnnnnnnnnnn.
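The Event ID 1001 message packs the bugcheck code and its four arguments into one string. As a small sketch (the function name and the regular expression are my own, not anything Windows provides), this is one way to pull them apart so they are easier to look up later:

```python
import re

# Matches "bugcheck was: 0x... (arg1, arg2, arg3, arg4)" in the event text.
BUGCHECK_RE = re.compile(r"bugcheck was:\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\)")


def parse_bugcheck(message):
    """Return (code, [args]) as integers, or None if the text does not match."""
    match = BUGCHECK_RE.search(message)
    if match is None:
        return None
    code = int(match.group(1), 16)
    args = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return code, args


# The message text from the event above.
message = (
    "The computer has rebooted from a bugcheck. The bugcheck was: "
    "0x000000a0 (0x00000000000000f0, 0x0000000000000004, "
    "0x000000000000000e, 0xffffaa8748cc1040)."
)
code, args = parse_bugcheck(message)
print(hex(code), [hex(a) for a in args])
```

This only strips the zero padding, but it makes it obvious that there are four separate arguments to look up.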

This doesn’t get me much, so I continued over to Power Options, but they were unchanged. Screen turns off in 15 mins, and the machine sleeps at 30 mins.

My next guess was that something in Device Manager had changed. I previously had some issues with the Intel Management Engine device, which was essentially restarting the computer instead of letting it go to sleep. But no, that setting was as I left it as well. For anyone stumbling on my page with this issue: open the Intel(R) Management Engine Interface under Device Manager, go to the Power Management tab, and leave “Allow the computer to turn off this device to save power” unchecked. If I check that, sleep does not work.

My gaze then turned to updates, as that’s the perennial enemy of Windows users the world over. Something had changed to cause sleep to now crash my computer hard. A handful of updates had been installed recently:

  • KB5028166
  • KB5028849
  • KB5028937
  • Some Windows Defender Updates

I mostly discounted the last one, but I also didn’t find any mentions of sleep issues with those KBs, or in the Reddit Patch Tuesday megathread. Is Reddit still a thing by the way? Not being a big reddit guy, I haven’t followed the recent kerfuffle except in passing.

Into the logs we go. Looking at the System and Application logs in Event Viewer, I had two initial suspects:

  • Event ID 7009 – A timeout was reached (45000 milliseconds) while waiting for the Intel(R) TPM Provisioning Service service to connect. (in the system log)
  • Event ID 65535 (in the application log) from either USBAppControl or WorkflowAppControl. All Critical, but with varying error messages such as:
    • Stop Server
    • Stop Broadcast Receiver Server
    • A blocking operation was interrupted by a call to WSACancelBlockingCall

There was also a third suspect in the Application log, Event ID 63 from the WMI source, with the warning: “A provider, IntelMEProv, has been registered in the Windows Management Instrumentation namespace root\Intel_ME to use the LocalSystem account. This account is privileged and the provider may cause a security violation if it does not correctly impersonate user requests.” A quick google showed that this should be safe to ignore. This left the other two errors.

The TPM one has shown up since the 12th of May this year, funnily also coinciding with the Patch Tuesday week, and the exact day that my computer installed KB5026361 and KB4023057. Probably just a big coincidence. But since it started happening earlier, when sleep was still working, I discounted that one as well. This left just the oddly familiar Event ID 65535. Ah yes, that’s the maximum value of a 16-bit unsigned integer. The highest possible Event ID. Probably just a coincidence.

By my awesome deductive skills we’ve arrived at that last error, which points at Brother’s printing stuff – their iPrint software thing. But this error has been in my logs since I installed that particular piece of software! What I’ve now come to regard as the distinctly oxymoronic normal errors were things like:

  • Host.AddressList.Length: 2
  • Wait Workflow Commands request from device.
  • Start Server… [also somehow an error level event…]
  • Value cannot be null – Parameter name: ipString
  • Start Broadcast Receiver Server… [again, how is this an error?]
  • Host.AddressList[1]: xxx.xxx.xxx.xxx [the IP of this PC]
  • Host.AddressList[0]: fe80::xxxx:xxxx:xxx:xxxx%12 [the link local ipv6 address of this PC]

That should be a complete list, at least for my PC. But, as I mentioned, scrolling back I see that these “errors” have been present since I installed Windows and this software/driver. So what the heck is going on! Seems this isn’t the culprit either.

I will caution the reader that we are entering terra incognita on my part here. I do not have a good grasp of Windows debugging except for recognizing words and patterns.

Onto the memory dump then… Open up WinDbg, which you can get from Microsoft. Browse to the C:\Windows\MEMORY.DMP file and hit the !analyze -v link to get more details. The salient bits were:

Start by looking at the bugcheck_code, which in my case is a0, or 0xa0. Looking it up on Microsoft’s site, we get: https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/bug-check-0xa0--internal-power-error

Not terribly useful as I already gathered we’re dealing with a power state error. The first argument is Arg1, which is 0xf0. On the same page we can see it means: “The system failed to complete(suspend) a power transition in a timely manner.” We can also see a similar text in Event Viewer, so not much in the way of new information. The faulting thread in my case was ffffaa8748cc1040, which is also the last of the four bugcheck arguments in the event log. After that we get “TAG_NOT_DEFINED_202b: *** Unknown TAG in analysis list 202b”, which doesn’t tell me anything straight away. Neither does the STACK_TEXT:

We can see that it contains some expected bits like Swap-related stuff, Transition to Sleep and so on. If we do !thread ffffaa8748cc1040 we get:

So this tells me uh.. Well. The Owning process looks interesting at least, as well as the call sites starting with nt!, and the mentions of “KernelMode” tell me this is fairly low-level stuff and not a driver at all. I tried !process ffffaa8740eea080 which was the Owning Process in this case. This gets me an ass-load of output, which looks to me like all the threads associated with that process, things like nt!KiSwapContext, nt!KiSwapThread and nt!PopTransitionToSleep.
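Stepping back to the four bugcheck arguments from the event log for a moment: as far as I can read the bug check page, when Arg1 is 0xf0 the remaining arguments are the system power state in transition, the most recent sleep checkpoint, and a pointer to the thread handling the request. Treat that mapping as my reading of the docs rather than gospel. The enum values below are the documented SYSTEM_POWER_STATE ones; the function and dictionary keys are just names I made up for this sketch.

```python
from enum import IntEnum


class SystemPowerState(IntEnum):
    """Values of the Windows SYSTEM_POWER_STATE enumeration."""
    UNSPECIFIED = 0
    WORKING = 1    # S0
    SLEEPING1 = 2  # S1
    SLEEPING2 = 3  # S2
    SLEEPING3 = 4  # S3
    HIBERNATE = 5  # S4
    SHUTDOWN = 6   # S5


def describe_0xa0_0xf0(args):
    """Label the arguments of bug check 0xA0 when Arg1 is 0xF0.

    Assumption: Arg2..Arg4 mean what I think the bug check page says.
    """
    if args[0] != 0xF0:
        raise ValueError("this sketch only covers Arg1 == 0xF0")
    try:
        state = SystemPowerState(args[1]).name
    except ValueError:
        state = "unknown"
    return {
        "failure": "power transition not completed in a timely manner",
        "power_state_in_transition": state,
        "last_sleep_checkpoint": hex(args[2]),
        "thread": hex(args[3]),
    }


print(describe_0xa0_0xf0([0xF0, 0x4, 0xE, 0xFFFFAA8748CC1040]))
```

If that reading is right, Arg2 = 4 maps to SLEEPING3, meaning the machine was on its way into S3 when it gave up. That would fit with the solid power LED.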

I’m starting to reach the point where I don’t know how to go on anymore. We have a trap frame in there somewhere, which was ffff8101`02598a00, and .trap gives us:

Here I can see stuff that looks like CPU registers, rax, rdx, rip and so on – things I have last seen in 2005 during a course on x86 Assembly. We’ve reached the terminus I think; my terminus anyway. The final bit was the !blackboxpnp command, which shows information regarding plug and play devices somehow related (again, we’re way at the deep end of the pool). That gave me the following:

Problem Code 24 translates to “This device is not present, is not working properly, or does not have all its drivers installed. (Code 24)”. The DeviceId is the last thing, and I can translate that to a device hooked to the USB ports on my Lenovo screen, which shows up as a USB Billboard device for some reason:

A vendor ID lookup for 0x0bda just ends up with Realtek Semiconductor, which makes a large selection of chips – things like network adapters, audio chips and so on. But, through the art of deduction, I can rule out my mouse, which is 0x1e7d, and 0x05ac, which could be my keyboard. I’m not sure if what remains is just a chip in my screen because I only have two USB devices hooked to the screen. My reasoning for this showing up in the dump is that I used the mouse and/or keyboard (hooked to the screen) to try and wake up the computer, and then ended up with a crash? But as I mentioned, I believe the machine failed to go to sleep (LED solid, not blinking) in the first place, so there could not be a waking up in that case.
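For what it’s worth, the vendor ID sits in the VID_xxxx part of a USB device ID, so the elimination above can be sketched in a few lines. The table only holds the three vendor IDs mentioned in this post, with my own labels. The device ID at the bottom is a placeholder (the PID and the instance suffix are invented), not the one from my dump.

```python
import re

# Vendor IDs mentioned in this post. The labels are my own notes,
# not an official vendor database.
KNOWN_VENDORS = {
    0x0BDA: "Realtek Semiconductor",
    0x1E7D: "my mouse",
    0x05AC: "possibly my keyboard",
}

VID_PID_RE = re.compile(r"VID_([0-9A-Fa-f]{4})&PID_([0-9A-Fa-f]{4})")


def vendor_of(device_id):
    """Return (vid, pid, label) for a USB device ID, or None if it has no VID/PID."""
    match = VID_PID_RE.search(device_id)
    if match is None:
        return None
    vid = int(match.group(1), 16)
    pid = int(match.group(2), 16)
    return vid, pid, KNOWN_VENDORS.get(vid, "unknown vendor")


# Hypothetical device ID: PID_FFFF and the suffix are placeholders.
print(vendor_of(r"USB\VID_0BDA&PID_FFFF\PLACEHOLDER"))
```

A proper lookup would use a full USB ID list, but for ruling out two known peripherals a three-entry table is enough.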

I did end up changing the USB cable going to my screen. It’s a USB A – B affair, 2.0 in generation. I suppose there could be something wrong with that but that just seems so far fetched.
