Monday, December 4, 2017

Excessive Modified Memory

I logged into my workstation today and started noticing "Out of memory" errors. Whaatt? This computer has 8GB of memory and doesn't run that much!

Resource monitor showed excessive "modified" pages like this (not my screenshot, but shows how it looked):

Well, that's odd. Let's fire up a Google search... 30 minutes later and I've hit the solution (one that many others have found before me):

https://what.thedailywtf.com/topic/17472/finally-nailed-my-windows-memory-leak-a-k-a-the-official-we-hate-karl-club?page=1

The cause is a bad Realtek application, "runsw.exe".

The final kicker was the 800,000 HANDLES the application had open. You can see them in Task Manager if you add the Handles column.

The service appears to be some kind of Realtek monitoring/watchdog program that talks to another service.

OK then, a handle leak. Let's find out why. First, let's fire up API Monitor, a wonderful program I've used before to solve issues.

Well well, lots of calls to Process32Next. Let's get it loaded up in IDA and have a look. Browse the import table, do a reverse cross reference, convert to pseudocode, and tada, we get a function:


The function seems to be looking for a process using CreateToolhelp32Snapshot, then Process32First and Process32Next, then exiting. At first glance there is no issue, until you consider the CreateToolhelp32Snapshot call. From the documentation:

If the function succeeds, it returns an open handle to the specified snapshot.
To destroy the snapshot, use the CloseHandle function.

Well, I'm not seeing any CloseHandle call, and the return value is only being stored in esi (the "HANDLE v0; // esi@1" line).
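For comparison, here is a minimal sketch of what the same kind of lookup looks like when the snapshot handle is released. It is written in Python with ctypes, not the C that runsw.exe was presumably built from, and the function name is_process_running is my own invention. The Win32 calls (CreateToolhelp32Snapshot, Process32FirstW, Process32NextW, CloseHandle) are the documented ones. The important part is the try/finally, which closes the handle on every exit path.

```python
import ctypes
import sys
from ctypes import wintypes

TH32CS_SNAPPROCESS = 0x00000002
INVALID_HANDLE_VALUE = ctypes.c_void_p(-1).value
MAX_PATH = 260


class PROCESSENTRY32W(ctypes.Structure):
    """Mirror of the Win32 PROCESSENTRY32W structure."""
    _fields_ = [
        ("dwSize", wintypes.DWORD),
        ("cntUsage", wintypes.DWORD),
        ("th32ProcessID", wintypes.DWORD),
        ("th32DefaultHeapID", ctypes.c_size_t),  # ULONG_PTR
        ("th32ModuleID", wintypes.DWORD),
        ("cntThreads", wintypes.DWORD),
        ("th32ParentProcessID", wintypes.DWORD),
        ("pcPriClassBase", wintypes.LONG),
        ("dwFlags", wintypes.DWORD),
        ("szExeFile", wintypes.WCHAR * MAX_PATH),
    ]


def is_process_running(exe_name):
    """Return True if a process with this executable name exists (Windows only)."""
    if sys.platform != "win32":
        raise OSError("Toolhelp snapshots are only available on Windows")

    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    k32.CreateToolhelp32Snapshot.argtypes = [wintypes.DWORD, wintypes.DWORD]
    k32.CreateToolhelp32Snapshot.restype = wintypes.HANDLE
    entry_ptr = ctypes.POINTER(PROCESSENTRY32W)
    k32.Process32FirstW.argtypes = [wintypes.HANDLE, entry_ptr]
    k32.Process32FirstW.restype = wintypes.BOOL
    k32.Process32NextW.argtypes = [wintypes.HANDLE, entry_ptr]
    k32.Process32NextW.restype = wintypes.BOOL
    k32.CloseHandle.argtypes = [wintypes.HANDLE]
    k32.CloseHandle.restype = wintypes.BOOL

    snapshot = k32.CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0)
    if snapshot == INVALID_HANDLE_VALUE:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        entry = PROCESSENTRY32W()
        entry.dwSize = ctypes.sizeof(PROCESSENTRY32W)
        found = k32.Process32FirstW(snapshot, ctypes.byref(entry))
        while found:
            if entry.szExeFile.lower() == exe_name.lower():
                return True
            found = k32.Process32NextW(snapshot, ctypes.byref(entry))
        return False
    finally:
        # This is the call the decompiled function never makes.
        k32.CloseHandle(snapshot)
```

On a Windows machine, is_process_running("runsw.exe") can be called once a second indefinitely without the handle count growing, because each snapshot is closed before the function returns.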

We have located the handle leak. Looking at what is calling this function, it appears to be called once a second, so it's leaking at least one handle a second. Over time, these handles build up and use up Windows memory until Windows is either restarted or dies with out of memory errors.

A poorly designed executable from Realtek. I've emailed them about it but I don't expect it to be fixed. The memory leak has been reported for years with no resolution.
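As a rough sanity check on those numbers, this short Python sketch estimates how long a leak of one handle per second takes to reach the 800,000 handles seen in Task Manager. The helper name days_to_leak is mine, and the rate is the "at least one a second" figure from above, so the real time could be shorter.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_leak(handle_count, leaks_per_second=1.0):
    """Days of uptime needed to accumulate handle_count leaked handles."""
    return handle_count / leaks_per_second / SECONDS_PER_DAY


print(f"{days_to_leak(800_000):.1f} days")  # prints "9.3 days"
```

So a bit over nine days of uptime is enough to get there, which fits a workstation that rarely gets rebooted.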
 


Monday, September 25, 2017

Skype for Business connection issue

We recently had a Skype/Lync connection issue where the user would enter their details and the Skype client would stay at "Connecting" forever.

Our setup is a local Active Directory (with a .local domain) and an Office 365 subscription. The local directory DOES NOT sync (for various reasons) with Office 365, so the accounts/passwords could be different between them.

We had also been preparing to sync the directories and had set up a UPN suffix for our .local domain using the Microsoft page here:
which we considered might have also been a contributing factor.

We tried a bunch of stuff to try to fix the Skype login issues:

  • Caches were cleared
  • DNS was flushed
  • DNS entries were checked
  • Temp files were deleted
  • Credentials were deleted 
  • Certificates were deleted
All of the standard troubleshooting techniques failed.

There were two interesting parts in the log files. The first was:

SIP/2.0 401 Unauthorized

after a SIP REGISTER. The second was a

No Certificate

error further down. No other information in the logs was helpful.


This seemed to be a very common issue, with many possible fixes popping up all over the web:
  • https://www.michev.info/Blog/Post/1235/lync-and-mandatory-profiles
  • http://ucken.blogspot.com.au/2011/10/lync-loses-connection-every-8min-28sec.html
  • https://social.technet.microsoft.com/Forums/lync/en-US/c3c7567a-ffd4-453b-a0d3-e79b06e92f23/client-cant-login
But no single fix worked. The account was able to log in on another domain-connected computer, which was also strange and pointed to some local problem.

We finally got a break by looking at the "SigninTelemetryLog.XML" file that Skype created. We noticed text like "GetBestManagedCredentialByType" and determined that Skype (for whatever reason) was trying to use the local domain NTLM authentication tokens to authenticate instead of the password entered by the user for Office 365. Since the two credentials were different, it was unable to authenticate properly.
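If you want to check your own logs for the same clues, a plain text search is enough. The sketch below is Python and treats the log as ordinary lines of text. The helper names are mine, and the file encoding (UTF-8 here) is an assumption that may need changing for your client.

```python
def find_markers(lines, markers):
    """Return {marker: [line numbers]} for every marker found in lines."""
    hits = {marker: [] for marker in markers}
    for lineno, line in enumerate(lines, start=1):
        for marker in markers:
            if marker in line:
                hits[marker].append(lineno)
    return hits


def find_markers_in_file(path, markers, encoding="utf-8"):
    """Same as find_markers, but reads the lines from a log file on disk."""
    with open(path, encoding=encoding, errors="replace") as handle:
        return find_markers(handle, markers)


# Strings quoted in this post; extend the list with whatever else looks odd.
SKYPE_MARKERS = [
    "GetBestManagedCredentialByType",
    "401 Unauthorized",
    "No Certificate",
]
```

Calling find_markers_in_file with the path to your SigninTelemetryLog.XML and SKYPE_MARKERS returns the line numbers to jump to in a text editor.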


We then enabled the registry key DisableNTCredentials, listed in the Microsoft article "Manage two-factor authentication in Skype for Business Server 2015", which let Skype log in without issue.
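For anyone wanting to script the change, here is a hedged Python sketch using the standard winreg module. The key path is an assumption: it is the policy location I recall from that Microsoft article (HKEY_LOCAL_MACHINE\Software\Policies\Microsoft\Office\<version>\Lync), and the version segment ("15.0" or "16.0") depends on which Office release your client belongs to, so check the article before using it. The function names are mine. Writing under HKEY_LOCAL_MACHINE needs an elevated prompt.

```python
import sys

VALUE_NAME = "DisableNTCredentials"


def lync_policy_key(office_version):
    """Build the assumed policy key path, e.g. office_version="16.0"."""
    return rf"Software\Policies\Microsoft\Office\{office_version}\Lync"


def disable_nt_credentials(office_version):
    """Set DisableNTCredentials to 1 under HKLM (Windows only, needs admin)."""
    if sys.platform != "win32":
        raise OSError("The registry is only available on Windows")
    import winreg  # imported here so the module still loads elsewhere

    path = lync_policy_key(office_version)
    with winreg.CreateKeyEx(
        winreg.HKEY_LOCAL_MACHINE, path, 0, winreg.KEY_SET_VALUE
    ) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_DWORD, 1)
```

After setting the value, restart the Skype client so it picks up the policy and prompts for the Office 365 password instead of reusing the Windows logon credentials.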

All in all, this was a very time-consuming and difficult issue to diagnose. We didn't feel that Skype logged the authentication process in enough detail to determine precisely what the issue was. I presume that with a local Lync server we would have had access to more debugging tools that might have made it easier.