Case of the high disk activity


A Blog by Ben Lavender

So it’s finally Friday here and I was driving to the office this morning when I thought about another idea for a tech-blog that I could share with our customers.

Now whenever I get stuck for ideas on where to begin with a technical issue I always remember Mark Russinovich quoting David Solomon in one of his Sysinternals TECHED sessions with “When in doubt, run Process Monitor!” By the way, I suggest watching some of them as you’ll learn a whole lot about Wininternals, I’ll link in some stuff from him at the end of this blog.

So what was the issue? I was asked by my MD to look at an issue with an Apache server running Windows Server 2008 R2, this was running Apache 2.2.22 and a certain process was hitting the disk quite heavily and also slowing the site rendering speed. We knew of the high disk write counts after a quick look using perfmon.exe.

As a starting tool on Windows I also use Performance Monitor (perfmon.exe) and Resource Monitor (perfmon.exe /res) and I particularly enjoy the Data Collector Sets (DCS) that you can use as filters, but I’m more of a Sysinternals guy when it comes to analysis.

*You can use the below perfmon.exe DCSs if you want to look deeper into Disk I/Os:

1

 

*Also generally looking at perfmon.exe /res “Resource Monitor” live screens will give a good idea of the activity:

2

Now to look deeper and I’ll show my way of using different tools to solve these issues at different stages, using mostly Sysinternals tools.

I started Process Explorer (procexp.exe) and added in the below fields:

select colums 3

 

 

 

 

 

 

 

 

 

 

 

 

 

Then sorted by Disk Writes and looked at the process associated, note this isn’t an actual screenshot from the server as I only saved the .pml file from the procmon.exe trace:

4

 

So at that time I knew that from the procexp.exe results that one of the httpd.exe processes, the child process of the primary httpd.exe process was writing a lot of new files to the directory below:

5

 

Now I moved to Process Monitor (procmon.exe) probably the most widely used Sysinternals tool. As I knew what process to look for, I set a trace going for 5 minutes then stopped and filtered by PID 544 and then added in filters to include Operation=Is=WriteFile and Operation=Is=ReadFile. Now this proved that this activity was writing .php files in batches to the \public directory but also there was a lot of information so I cleared those filters and set Path=Contains=\Apache2.2\htdocs\templates_c\public to see all results for that path, again popped up httpd.exe 544 again along with the usual process of explorer.exe.

You can see the quick succession of the file creations periodically, though the reason as to why it’s creating files in these directories is unknown. So to satisfy our suspicions that the httpd.exe PID:544 is creating the files we stopped the httpd.exe service (not recommended if you’ve got a popular website, obviously) to see that the file creations stopped, and they did.

So it was clearly this process that was creating the files but why exactly? I then decided to look at the call stack of one of the writefile events to see if there was any .dlls being called that either out of the ordinary or suspicious, funnily enough I found the php5ts.dll:

6

 

Notice I haven’t configured symbols but looking at this trace it provided the suspect straight away, also looking at the trace modules I can see the particular .dll and its location:

7

 

So now we knew that it was potentially a PHP script that was creating the files, maybe would have been more clear if I’d configured symbols so I could see the functions though I’d found enough info. We contacted the web developers and asked if they’d configured any PHP scripts that would create batches of these files and they advised there was such a script that should be disabled in the production environment that may have not been disabled. This particular scripts create log files in .php formats every time a user visits the site and of course this script was enabled. It was then disabled and the scripts stopped and the disk I/O returned to acceptable levels.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s