24 May 2006

TFS Team Build can run forever

Problem: It is possible for the TFS server to lose track of a remote team build and believe that it is still running even when the build machine has not raised an event back to the TFS server in a very long time (more than 36 hours). It seems that, by default, the TFS server waits forever for a build to finish.

Scenario: I was testing a team build that went awry. My build script deleted a large portion of the C:\ folder and hosed the machine (stop laughing). To stop the build (before I learned that there is a TFSBUILD STOP command), I shut down the build server, which then needed to be re-imaged. About 36 hours later, I reviewed "All Builds" within Team Explorer and noticed that the TFS server thought the build was still running. So even after more than 36 hours, TFS hadn't failed the build on a timeout.

Fix: I had to use the "TFSBuild.exe Stop" command to inform the TFS server that the build should be aborted. You should also run tf workspaces /owner:* /server:[MYSERVER] on the build server and on the client machine that initiated the build. This updates the workspace cache, which cleans up any stray workspaces.

Comment: If a remote build machine fails during a build, you need to ensure that the build is cancelled on the TFS server. Otherwise you may get a workspace collision when a new build runs once the machine is back up, because the workspace's local path is still considered "in use". This "in use" status comes from the build machine's or client's workspace cache being out of sync with the Team Foundation Server's database. The tf workspaces command above updates the local cache from the server and cleans this up.

Mike Ruminer has posted a step-by-step listing of the entire event on his blog.
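
The two cleanup steps from the Fix can be sketched as a small Python helper that only assembles the command lines. This is a rough illustration, not a tested procedure: the positional argument order for TFSBuild.exe stop (server, team project, build number) is an assumption based on the TFS 2005 tooling and should be checked against the documentation for your version, and the server, project, and build names are hypothetical placeholders.

```python
import subprocess


def stop_build_command(server, project, build_number):
    """Build the argv for stopping a build the TFS server still thinks is running.

    ASSUMPTION: TFS 2005 style positional arguments
    (server, team project, build number). Verify for your TFS version.
    """
    return ["TFSBuild.exe", "stop", server, project, build_number]


def refresh_workspace_cache_command(server):
    """Build the argv for the 'tf workspaces' call described in the post.

    Listing workspaces for all owners refreshes the local workspace
    cache from the server.
    """
    return ["tf", "workspaces", "/owner:*", f"/server:{server}"]


def cleanup_commands(server, project, build_number):
    """Return both commands in the order the post applies them."""
    return [
        stop_build_command(server, project, build_number),
        refresh_workspace_cache_command(server),
    ]


def run_cleanup(server, project, build_number):
    """Run the commands; only meaningful on a machine with the TFS tools on PATH."""
    for command in cleanup_commands(server, project, build_number):
        subprocess.run(command, check=True)
```

Calling run_cleanup("MYSERVER", "MyProject", "MyBuild_1") would issue the stop first and then refresh the cache. As the post notes, the workspace refresh should be run on both the build server and the client machine that started the build.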

2 comments:

JOM said...

Hi Steve, I'm getting burned by this issue right now! I find that the TFSBuild Stop command produces the same result as hitting Stop Build in Team Explorer. Did you ever have to delete rows from the default collection database (not something I'm altogether very happy about doing)?
Thanks

Steven St Jean said...

Hi JOM,

This post was originally targeted at TFS 2005, which used a synchronous model that blocked the UI during builds. Using the command line allowed me to get around the blocking. TFS 2010 and higher use an asynchronous model, so they don't block the UI and you will see the same behavior regardless of the method used to stop a build.

I've never had to edit the TFS Collection database, and you should never do that unless you are told to by Microsoft Support. Editing the TFS Configuration or Collection databases puts your TFS installation into an unsupported state, so please don't ever go playing around in there.

- Steve
