16 August 2013

TFS 2012 – My warehouse jobs are supposed to be running but show QueuedScheduled

trucks

I just finished a very stressful couple of days.  I was working with a client to upgrade their TFS 2010 installation to TFS 2012.3.  During the process we had a couple of issues to resolve, but one of them was blog-worthy.

The stress came from the fact that if we couldn’t resolve these issues we would have to rollback the entire upgrade to TFS 2010. That would negate two weeks of work and probably have an very unhappy client.

Problem

After getting the customer’s system upgraded we noted that their warehouse wasn’t processing.  Strangely, when we launched the Warehouse Controller Web Service on the App Tier machine we saw that there were lots of jobs that had CurrentRun entries that showed a JobState of QueuedScheduled.

Notice that the Build Warehouse Sync job on Line 26 & 27 is showing a JobProcessingStatus of Idle, but the job’s CurrentRun element on lines 40&41 are showing that the job is running but in a QueuedScheduled JobState.  This makes entirely no sense.  The Job is running but it queued so it’s Idle?

   1:  <?xml version="1.0" encoding="utf-8" ?>
   2:  <WarehouseProcessingStatus 
   3:       xmlns:xsd="http://www.w3.org/2001/XMLSchema"
   4:       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   5:       xmlns="http://schemas.microsoft.com/TeamFoundatio...">
   6:    <RequestTimeUtc>2013-08-16T14:01:26.4000423Z</RequestTimeUtc>
   7:    <WarehouseProcessingOnlineStatus>Started</WarehouseProcessingOnlineStatus>
   8:    <AnalysisProcessingOnlineStatus>Started</AnalysisProcessingOnlineStatus>
   9:    <JobProcessingStatus>Idle</JobProcessingStatus>
  10:    <JobsRunning>0</JobsRunning>
  11:    <JobsQueued>6</JobsQueued>
  12:    <Instance Name="TEAM FOUNDATION"
  13:               JobProcessingStatus="Idle"
  14:               JobsRunning="0"
  15:               JobsQueued="1">
  16:      <Jobs>
  17:        ...
  18:      </Jobs>
  19:    </Instance>
  20:    <Collections>
  21:      <Collection Name="DefaultCollection"
  22:                   JobProcessingStatus="Idle"
  23:                   JobsRunning="0"
  24:                   JobsQueued="5">
  25:        <Jobs>
  26:          <Job Name="Build Warehouse Sync" 
  27:               JobProcessingStatus="Idle">
  28:            <LastRun QueueTimeUtc="2013-08-16T01:52:33.423Z"
  29:                      ExecutionStartTimeUtc="2013-08-16T02:15:11.017Z"
  30:                      EndTimeUtc="2013-08-16T02:15:12.123Z"
  31:                      Result="Blocked">
  32:              <ResultMessage>
  33:                [Build Warehouse Sync]: ---> TF221107: Reporting for
  34:                Team Foundation Server cannot execute job Build Warehouse Sync
  35:                for team project collection DefaultCollection because the 
  36:                warehouse is offline. Use the Team Foundation Administration 
  37:                Console to start reporting.
  38:              </ResultMessage>
  39:            </LastRun>
  40:            <CurrentRun QueueTimeUtc="2013-08-16T13:55:00.45Z" 
  41:                        JobState="QueuedScheduled" />
  42:          </Job>
  43:   
  44:          ...
  45:   
  46:          <Job Name="Work Item Tracking Warehouse Sync" 
  47:               JobProcessingStatus="Idle">
  48:            <LastRun QueueTimeUtc="2013-08-16T01:52:33.423Z" 
  49:                     ExecutionStartTimeUtc="2013-08-16T02:15:11.017Z" 
  50:                     EndTimeUtc="2013-08-16T02:15:12.123Z" 
  51:                     Result="Blocked">
  52:              <ResultMessage>
  53:                [Work Item Tracking Warehouse Sync]: ---> TF221107: Reporting 
  54:                for Team Foundation Server cannot execute job
  55:                Work Item Tracking Warehouse Sync for team project collection
  56:                DefaultCollection because the warehouse is offline. Use the
  57:                Team Foundation Administration Console to start reporting.
  58:              </ResultMessage>
  59:            </LastRun>
  60:            <CurrentRun QueueTimeUtc="2013-08-16T13:55:00.45Z" 
  61:                        JobState="QueuedScheduled" />
  62:          </Job>
  63:        </Jobs>
  64:      </Collection>
  65:    </Collections>
  66:  </WarehouseProcessingStatus>

Digging a little deeper, we jumped down to the Event Viewer on the App Tier machine. 

In case you are not aware, TFS has it’s own section to collect debugging records. These can be found in the Event Viewer | Applications and Services Logs | Microsoft Team Foundation Server | Debug.

SNAGHTMLfdb988a

One of the Information Debug Logs was showing the following error:

 

   1:  Log Name:      Microsoft-Team Foundation Server/Debug
   2:  Source:        Microsoft-Team Foundation Server
   3:  Date:          8/16/2013 10:09:10 AM
   4:  Event ID:      0
   5:  Task Category: This task has debugging events
   6:  Level:         Error
   7:  Keywords:      Info Messages
   8:  User:          DOMAIN\TFSSERVICE
   9:  Computer:      TFS.company.local
  10:  Description:
  11:  TFS Informational message
  12:  Event Xml:
  13:  <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  14:    <System>
  15:      <Provider Name="Microsoft-Team Foundation Server" 
  16:                Guid="{81761665-68A4-47D5-8106-F2ED2AA8687B}" />
  17:      <EventID>0</EventID>
  18:      <Version>0</Version>
  19:      <Level>2</Level>
  20:      <Task>1</Task>
  21:      <Opcode>10</Opcode>
  22:      <Keywords>0x8000000000000001</Keywords>
  23:      <TimeCreated SystemTime="2013-08-16T14:09:10.375773800Z" />
  24:      <EventRecordID>13163</EventRecordID>
  25:      <Correlation ActivityID="{018C9C58-F800-0003-99AD-5341209ACE01}" />
  26:      <Execution ProcessID="1472" ThreadID="4720" />
  27:      <Channel>Microsoft-Team Foundation Server/Debug</Channel>
  28:      <Computer>TFS.company.local</Computer>
  29:      <Security UserID="S-1-5-21-827956790-191126496-1124750213-8536" />
  30:    </System>
  31:    <UserData>
  32:      <Info TraceId="{00000001-0001-0001-0000-000000000000}  " 
  33:            xmlns="http://schemas.microsoft.com/TeamFoundation/2010/Framework">
  34:        <Tracepoint>1600</Tracepoint>
  35:        <ServiceHost>{00000000-0000-0000-0000-000000000000}</ServiceHost>
  36:        <ContextId>0</ContextId>
  37:        <ProcessName>TFSJobAgent</ProcessName>
  38:        <Username>
  39:        </Username>
  40:        <Service>
  41:        </Service>
  42:        <Method>
  43:        </Method>
  44:        <Area>JobAgent</Area>
  45:        <Layer>JobServiceUtil</Layer>
  46:        <UserAgent>
  47:        </UserAgent>
  48:        <Uri>
  49:        </Uri>
  50:        <Path>
  51:        </Path>
  52:        <UserDefined>
  53:        </UserDefined>
  54:        <Message>
  55:          Attempt #155: System.ArgumentException: An item with the same key 
  56:                                                  has already been added.
  57:          at System.Collections.Generic.Dictionary`2.Insert(TKey key, 
  58:                                                            TValue value, 
  59:                                                            Boolean add)
  60:          at Microsoft.TeamFoundation.Framework.Server
  61:              .TeamFoundationExtensionUtility.LoadExtensionTypeMap[T]
  62:                        (String pluginDirectory)
  63:          at Microsoft.TeamFoundation.Framework.Server.JobApplication
  64:              .SetupInternal()
  65:          at Microsoft.TeamFoundation.Framework.Server.JobServiceUtil
  66:              .RetryOperationsUntilSuccessful(RetryOperations operations, 
  67:                                              Int32 maxTries, 
  68:                                              Int32& delayOnExceptionSeconds)
  69:        </Message>
  70:      </Info>
  71:    </UserData>
  72:  </Event>

Notice on lines 37 & 55 – 56 we have the really important info.  The error is thrown by the TFSJobAgent and the problem is that it is trying to load an item with the same key into a Dictionary.

Solution

A quick online search turned up a blog post by Rory Street aptly entitled “TFSJobAgent Error – An item with the same key has already been added” in which he comments that the cause was that he had multiple copies of a custom warehouse adapter in the TFSJobAgent’s PlugIns folder. 

C:\Program Files\Microsoft Team Foundation Server 11.0\Application Tier\TFSJobAgent\plugins

When the TFSJobAgent service starts it recursively scans the Plugins folder looking for warehouse adapters (Microsoft’s or custom).  It uses reflection to find all of the assemblies that implement the ITeamFoundationJobExtension interface.  Since I have two copies of the same adapter (with different file names) in the folder, it tries to load them both and the names collide.  Rory had a slightly different configuration. He had copies of his custom warehouse adapter in a subfolder of PlugIns but the effect is the same sine the search is recursive

To clean up the multiple copies of my custom adapter in the PlugIns folder I needed to:

  • Start an elevated command prompt
  • Run: net stop TfsJobAgent
  • Delete the extra custom warehouse adapter DLLs from the Plugins folder (and its children)
  • Run: net start TFSJobAgent

Once that was done I rechecked the Warehouse Controller Web Service and all of my QueuedScheduled jobs were either Running or completed successfully.

Thanks Rory! :)

0 comments:

Post a Comment