31 August 2008

Team System Web Access 2008 SP1 has shipped

Just wanted to drop a note to let everyone know that the Team System Web Access team released SP1 on Friday.

Some of the cool new features include:

  • Ability to run multi-language in a single TSWA instance.
  • 10 languages supported
  • Work Item only view for users without a CAL (excellent!!)
  • and more

Check out Ed Hintz's blog post linked below for the full list.

Download: http://www.microsoft.com/downloads/details.aspx?FamilyId=3ECD00BA-972B-4120-A8D5-3D38311893DE

Ed Hintz’s announcement to see what’s new in this release: http://blogs.msdn.com/edhintz/archive/2008/08/29/team-system-web-access-2008-sp1-power-tool.aspx

28 August 2008

Windows Server 2008: Svchost process is running constantly at 50% CPU

As you may remember, a while back I decided to repave my Notion laptop with Windows Server 2008 Standard x64 so that I could make full use of the 4 GB of RAM and start playing with Hyper-V.

Since Win2k8 is a server OS, it doesn't come through with a bunch of niceties like Vista x64 does.  In fact, the base installation has most services turned off to reduce the attack surface of the OS (very nice).  That is another reason that I installed it over Vista x64.  By using an opt-in model, my performance was better than that of Vista on the same hardware.  At least it was until I added the Hyper-V role...

As you may have read in my earlier post on Hyper-V, once you install this role your OS is fundamentally reconfigured.  Your OS originally sat directly on top of the hardware but after installation, your OS becomes the "Parent" partition (VM) and the Hyper-V layer sits between the hardware and all of the VMs.

HyperVArchitecture

So what's the problem?

After configuring the Hyper-V role I began creating VMs to play with.  I first ported the TFS "Rosario" CTP VM from Microsoft from Virtual Server 2007 to Hyper-V and then ported the VSTS/TFS 2008 All-in-one VPC image.  Once these two were installed I connected my copy of Visual Studio 2008 Team Suite to the TFS 2008 instance on the VM to see how it all worked.  I noticed there was a bit of a lag when working with the VM but it wasn't anything that I couldn't live with and I really didn't have the time to investigate further.

How was it investigated?

Then came Mike Azocar's post about setting up your Win2k8 server as a workstation.  I had initially configured my laptop using most of the same posts that Mike referenced.  The one I didn't use was the Win2k8 Workstation Converter which basically automates all of the tweaks and hacks you would do by hand (and possibly screw up).  I ran most of the utilities and was able to install the Vista x64 Sidebar and then added my favorite gadget, the CPU usage monitor.  The first thing I noticed on the monitor was that my overall CPU usage was always up around 50%, even just after booting up.

This led me to investigate further.  The first thing that I did was grab the Process Explorer tool from SysInternals Live.  After firing it up I noticed that one of the SvcHost processes was consistently running at 45% - 50%.  To determine if this was the culprit I did what anyone would do...I killed the process.  After blowing it away I saw the overall CPU usage drop to the single digits.  A-ha...I was right, that's the problem.

CPUAfter

Now to figure out what app is running under the SvcHost process.

I didn't have to wait long.  While checking email I noticed that the CPU usage had jumped back up again.  I reopened Process Explorer and did a quick scan of the CPU usage column until I found my SvcHost process gone wild.  By selecting this process I can hover the mouse over its entry to see what's running within it.  As you can see, the first thing there is the DNS Client service (DNSCache).  This is the service that keeps track of the IP addresses of sites you've visited so that you don't have to keep making round-trips to your name server for resolution.

When I killed the process and checked the DNS Client service in the Services applet I noticed that it was not running.  When I restarted it from the Services applet my CPU pegged out again.  I think I'm getting closer to the root problem.

whatisiit

The next thing to do is to look at the lower pane of Process Explorer to see what files, registry keys, directories, ports, etc. are being used.  Process Explorer also color-codes the entries to show opening (green) and closing (red).  As you can see from the screenshot below, the DNS Client service keeps opening and closing the same registry key:

HKEY_LOCAL_MACHINE\SYSTEM\ControlSet003\Services\Tcpip\Parameters\DNSRegisteredAdapters\{5F0F822F-7D54-4716-8809-65A83C851F8F}

ProcessGoneWild

What does {5F0F822F-7D54-4716-8809-65A83C851F8F} refer to?

I grabbed the GUID from the registry key, opened RegEdit.exe and did a search to see if I could figure out what the problem was.  The first hit (shown below) showed that the GUID belongs to the Microsoft Virtual Network Switch Adapter.  This device was created to allow the VMs to share the single network connection on my machine under Hyper-V.  So that's why I didn't see this behavior prior to installing the Hyper-V role.

FoundIt  

So why is it having problems accessing registry entries? 

To find this out I had to resort to another SysInternals tool: Process Monitor.  The first thing to do is to press CTRL+E to stop it from collecting events.  This improves performance greatly.  Next, clear out the event window (CTRL+X) so that when you apply the filter it doesn't have any work to do with the current buffer.

Since I can grab the PID of the SvcHost process I can use it to filter down all the information that Process Monitor collects to only those that belong to this process.

ProcMonFilter

The core issue

After the filter was applied I started it collecting events (CTRL+E) again.  I stopped it after a couple of seconds since I could clearly see the problem.  The DNS Client service was trying to read a specific registry entry that could not be found.  The problem is that the DNS Client service's way of handling the NAME NOT FOUND result is to retry the read.  Unfortunately it doesn't ever stop trying, hence the infinite loop.

CantFindKeys
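If you'd rather crunch a capture offline than eyeball it, here's a minimal sketch of the same idea: filter the events down to one PID and count the NAME NOT FOUND results per path.  It assumes you've saved the capture as CSV; the column names, the PID and the paths in the sample are all made up for illustration, so adjust them to whatever your export actually contains.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the shape of a CSV export. The column names, the
# PIDs and the paths are assumptions for illustration, not real capture data.
SAMPLE_EXPORT = r'''"Process Name","PID","Operation","Path","Result"
"svchost.exe","1234","RegQueryValue","HKLM\Example\Adapter\Flags","NAME NOT FOUND"
"svchost.exe","1234","RegQueryValue","HKLM\Example\Adapter\Flags","NAME NOT FOUND"
"svchost.exe","1234","RegOpenKey","HKLM\Example\Adapter","SUCCESS"
"explorer.exe","4321","RegQueryValue","HKLM\Example\Other","NAME NOT FOUND"
'''


def failed_lookups(csv_text, pid):
    """Count NAME NOT FOUND results per path for a single process ID."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["PID"] == str(pid) and row["Result"] == "NAME NOT FOUND":
            counts[row["Path"]] += 1
    return counts


if __name__ == "__main__":
    # Paths that fail over and over float to the top of the list.
    for path, hits in failed_lookups(SAMPLE_EXPORT, 1234).most_common():
        print(hits, path)
```

A path that shows up hundreds of times in a couple of seconds is a good hint that something is retrying a read that will never succeed.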

Let's go registry spelunking!

I opened up RegEdit again and navigated to the key in question.  Lo and behold, there is no Flags entry, just like Process Monitor said.

BadRegKey

To find out what else was missing I opened the key just above it.  There are a lot more values here.

GoodRegKey

So how do I fix this?? 

The easiest way is to add all of the missing values to the broken key.  The problem is that I don't know what the correct values should be.  So when all else fails, cheat!  I opened up 3 other keys above and below the broken one and compared values.  They all had the same entries.  That made me a bit more confident.  I exported one of the good keys to a text file (keys.reg) and then modified the path to point to the broken GUID.  Once I saved it back and ran it the broken one looked just like the rest.
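I did the path edit by hand in a text editor, but the same step can be sketched in a few lines of Python.  This is only an illustration under assumptions: the all-zero GUID stands in for whichever "good" adapter key you exported, and the "Flags" value shown is a placeholder, not the real data from my machine.  Only the [key] header lines are rewritten, so the values ride along untouched.

```python
import re

# Matches a registry-style GUID in braces, e.g. {5F0F822F-...-65A83C851F8F}.
GUID_IN_BRACES = re.compile(
    r"\{[0-9A-Fa-f]{8}-(?:[0-9A-Fa-f]{4}-){3}[0-9A-Fa-f]{12}\}"
)

# The broken adapter GUID from this post.
BROKEN_GUID = "{5F0F822F-7D54-4716-8809-65A83C851F8F}"

# Hypothetical export of a "good" key: the all-zero GUID and the Flags
# value are placeholders for illustration only.
GOOD_KEY_EXPORT = "\n".join([
    "Windows Registry Editor Version 5.00",
    "",
    r"[HKEY_LOCAL_MACHINE\SYSTEM\ControlSet003\Services\Tcpip\Parameters"
    r"\DNSRegisteredAdapters\{00000000-0000-0000-0000-000000000000}]",
    '"Flags"=dword:00000000',
    "",
])


def retarget_reg_export(reg_text, target_guid):
    """Point every [key] header in an exported .reg file at target_guid."""
    rewritten = []
    for line in reg_text.splitlines():
        stripped = line.strip()
        # Only touch key header lines; leave value lines exactly as exported.
        if stripped.startswith("[") and stripped.endswith("]"):
            line = GUID_IN_BRACES.sub(lambda _match: target_guid, line)
        rewritten.append(line)
    return "\n".join(rewritten) + "\n"


if __name__ == "__main__":
    print(retarget_reg_export(GOOD_KEY_EXPORT, BROKEN_GUID))
```

Files exported from RegEdit with the "Version 5.00" header are normally UTF-16, so open them with encoding="utf-16" when reading and writing.  And as always, export the broken key first so you have a way back.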

The real test is to turn the DNS Client service back on and check its behavior.  After turning it on I found its entry in Process Explorer and checked the lower pane.  There were no more tell-tale key open/close cycles and no excessive CPU usage.

PlayingNice 

The Root Cause

I have no idea why the DNSRegisteredAdapters entry for the Microsoft Virtual Network Switch Adapter was incomplete.  I had been having issues getting my networking up and running since I stupidly tried applying what I know about Virtual Server 2007 networking to Hyper-V.  They are as similar as VSS is to TFS Version Control so it's no wonder I screwed it up.

My pain, your gain!

And that's how Steve got burned today!

23 August 2008

Why aren't my TFS reports updating?

The Background

I was working with a client this week to install a fresh TFS 2008 dual-server instance and then add SP1 on top.  They had been having some issues getting the QA instance installed so they were a bit gun-shy when it came time to install the Prod instance.  As such they wanted to verify all of the services and settings before and after the SP1 installation.

One of the pain points they had encountered was around the Analysis Services Cube and Reporting.  We decided to verify the install by checking the reports, making changes to some work items and then checking the reports again to see the updated information.

To facilitate this process we changed the Data Warehouse's update interval to 1 minute by following the instructions here (How To: Set the Processing Interval for the Data Warehouse).

The Problem

After making this change we waited a couple of minutes and then ran our first Related Work Items report.  We then added a new Task work item and related it to an existing Scenario work item, waited a couple of minutes and then ran the Related Work Items report.  To our surprise, the new work item wasn't listed.  We reviewed the report and noticed that the Last Warehouse Update value hadn't changed between the two report runs.

LastWarehouseUpdate 

The Investigation

I started wondering about caching of the reports so I tested this theory by running the Exit Criteria report.  This one showed a current warehouse update timestamp.

LastWarehouseUpdate2

Ok so there's definitely some report caching going on here!  I can see that the warehouse is getting updated so I closed everything down and re-opened the Related Work Items report.  Voila!...no change. 

After a bit of head scratching and complaining under my breath I decided to put all my years of debugging experience toward solving this problem.  I said to myself; "Steve..." (I call myself Steve) I said "Steve, what could be caching these reports?"  Luckily I'm quite adept at single-person conversations so I immediately answered myself with "Maybe it's the SQL Server Reporting Services that's doing it, stupid!" (I have no patience for ignorance).

The Root Cause

I started up Internet Explorer and browsed to http://{yourTfsServerHere}/reports to view the Reporting Services settings.  At the Home page I clicked on the Site Settings link (upper right corner) to check whether there was any site-wide caching going on.  Looking over this page you see that there aren't any settings for caching at this level that don't force you to modify the database directly (this article does tell you how to manage Report Server level caching).

I then navigated back Home and then drilled down into my BuildProject reports.  I'm going to see if the reports have some kind of report-level caching going on.  I opened the Related Work Items report.  When it rendered to the page I noticed that it also had the same timestamp as the last run. 

To check the settings of the report I clicked on the Properties tab and then reviewed the options available to me.  The Execution section looked like a good place to start.

Lo and Behold!  The first entry in this section is "Always run this report with the most recent data" and it has 3 options.  The selected (default) option here is to "Cache a temporary copy of the report. Expire copy of report after a number of minutes" with a value of 30.  PAY DIRT!!!

ReportExecutionPropertiesDefault

It looks like the reports default to caching themselves for 30 minutes after the last rendering!

The Fix

To change this behavior I modified the setting to use the "Do not cache temporary copies of this report" entry instead.  Here are the steps:

  1. Open SSRS in your browser at http://yourTfsServer/reports
  2. Click on your Team Project name
  3. Click on the report you want to modify
  4. Click on the report's Properties tab
  5. Click on the Execution link
  6. Select the "Do not cache temporary copies of this report" radio button under the "Always run this report with the most recent data" section.

 ReportExecutionProperties

In Closing

After completing these steps we began enjoying a quick Change, Save, Review feedback cycle and completed our verification of the TFS installation and SP1 upgrade.  We subsequently changed all of the caching and warehouse update times back to their default values.

The default settings for the warehouse update and caching are Microsoft's best guesses at appropriate values for most organizations.  If your Analysis and Reporting servers can handle the load and you want "fresher" data for your reports feel free to modify these values.

Settings Recommendation

After reviewing this situation I've come to the realization that it is probably a good idea to set the report caching timeout to around 50% of the value you have set for the warehouse update interval.  This way there will be minimal lag between warehouse updates and expiration of the report cache.
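To put some rough numbers on that, here's a back-of-the-envelope sketch.  It assumes a deliberately simple model: a change waits at most one full warehouse interval to be picked up, and a report cached just before that update can hang around for one more cache timeout.  It ignores how long the warehouse and cube processing themselves take, and the 60-minute interval in the example is just an illustrative figure.

```python
def recommended_cache_minutes(warehouse_interval_minutes, ratio=0.5):
    """Cache timeout as a fraction (default 50%) of the warehouse interval."""
    return warehouse_interval_minutes * ratio


def worst_case_staleness_minutes(warehouse_interval_minutes, cache_minutes):
    """Simplified worst case: wait a full warehouse interval for the data to
    land, then a full cache timeout for the cached report copy to expire."""
    return warehouse_interval_minutes + cache_minutes


if __name__ == "__main__":
    interval = 60  # illustrative warehouse update interval, in minutes
    cache = recommended_cache_minutes(interval)
    print("cache timeout:", cache)
    print("worst-case staleness:", worst_case_staleness_minutes(interval, cache))
```

Under this model the 50% rule caps the worst case at one and a half warehouse intervals, while a cache timeout equal to the interval would let it stretch to two.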

Additional Information / Resources

Report Manager: Execution Properties Page -  http://msdn.microsoft.com/en-us/library/ms178821.aspx

Report Caching in Reporting Services - http://msdn.microsoft.com/en-us/library/ms155927.aspx

Database Journal: Black Belt Administration: Caching Options: Report Session Caching - http://www.databasejournal.com/features/mssql/article.php/3695721

Database Journal: Black Belt Administration: Report Execution Caching I: SQL Server Management Studio Perspective - http://www.databasejournal.com/features/mssql/article.php/3699546

Database Journal: Black Belt Administration: Report Execution Caching II: Report Manager Perspective - http://www.databasejournal.com/features/mssql/article.php/10894_3701041_1

22 August 2008

How many Vista VMs can you run on a laptop with Hyper-V?

Just came across a post on Keith Combs' blahg that shows him running up to 27 Hyper-V VMs on his Lenovo laptop with 8 GB RAM.  Pretty damn cool!

15 August 2008

TFS 2008 SP1 Install: Failed to call WMI on the RS server error

I was working with a client this week on a TFS 2008 upgrade from RTM to SP1.  During the upgrade we encountered an error (below) "Failed to call WMI on the RS server".  This error indicated that there was a problem accessing the SQL Server Reporting Services system.

08/13/08 11:13:18 DDSet_Error: *** ERROR: Failed to call WMI on the RS server. The most likely cause is that the setup user does not have the required permissions: Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED))
08/13/08 11:13:18 DDSet_Status: Process exited with exit code: 15

A bit of searching online brought up this blog post which seemed to indicate that SP1 couldn't resolve the entry found in TFS' TfsIntegration..tbl_service_interface table for the ReportsService record.  The solution provided in the article allowed us to finish the SP1 installation but it seemed like a risky endeavor, so I wanted to get a second opinion on what actually happened and a "safer" way to get around the issue.

In all fairness, I also need to state that this client had configured a "friendly" name for the server by creating a DNS HOST (A) record called TFS-Q (for their QA TFS instance) instead of using the machine name (which is much longer and not very user friendly).

I asked a colleague at Microsoft what the issue could be and was given this response:

"One possible explanation for this failure to access WMI data remotely is that a firewall is blocking that traffic.  If the friendly name were actually represented as TFS-Q.somedomain.com, as is the form in the referenced blog post, that could drive traffic that would otherwise be local through a proxy server.  An address with ‘.’s in it would not be considered a local address when interpreting the setting that determines whether to bypass a proxy server for local addresses.  The traffic would then leave the box, which would mean that you would need the DCOM port open (TCP 135)."

From what I know about this client's network configuration and this installation, I suspect that this is exactly what happened.  The entry was a fully-qualified domain name, which forced the traffic through a proxy server as a non-local address, and it was then blocked from coming back in by a firewall.  When we changed the database record (by hand) to the NetBIOS name of the machine it became a local address and avoided the proxy server and firewall, so the installation completed successfully.
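The "no dots means local" rule from the quote is easy to model.  The sketch below is a simplification of that one rule only, not the actual proxy-bypass logic in Windows, and the URLs are hypothetical examples built from the names mentioned above.

```python
from urllib.parse import urlsplit


def treated_as_local(url):
    """Simplified model of 'bypass proxy for local addresses':
    a host counts as local only if its name contains no dots."""
    host = urlsplit(url).hostname or ""
    return host != "" and "." not in host


if __name__ == "__main__":
    # Hypothetical URLs for illustration only.
    for url in ("http://TFS-Q/ReportServer",
                "http://TFS-Q.somedomain.com/ReportServer"):
        route = "direct (local)" if treated_as_local(url) else "via proxy"
        print(url, "->", route)
```

The short name goes direct while the dotted name is sent to the proxy, which is the difference between the WMI call staying on the box and leaving it.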

To my question about a "safer" workaround, he responded:

"You could have avoided the DB edit in this case by using the “tfsadminutil configureconnections” command to set the RS host name back to the NetBIOS name of the machine."

So there you have it: if you have configured a HOST record for access to your TFS instance that contains a dotted address, you need to run the tfsadminutil configureconnections command to update the ReportsService record (use the /ReportServerUri switch) before you install SP1.  After the upgrade, you can run tfsadminutil configureconnections a second time to set the value back.