Claims: Exception occurred issuing a security token

With a large customer going live on a fully multi-tenanted, claims-authenticated platform in the last month, I’ve had the chance to really see the limits of these two new features in SharePoint 2010. This issue was one of the high-impact problems, and I hope I have now found the solution to it, so it’s definitely worth sharing.

Problem:

After a few days of working normally, claims authentication stops working unexpectedly on any given server in the farm, and the errors below are logged.

Event Logs

Log Name: Application

Source: Microsoft-SharePoint Products-SharePoint Foundation

Event ID: 8306

Task Category: Claims Authentication

Level: Error

Description:

An exception occurred when trying to issue security token: The server was unable to process the request due to an internal error. For more information about the error, either turn on IncludeExceptionDetailInFaults (either from ServiceBehaviorAttribute or from the <serviceDebug> configuration behavior) on the server in order to send the exception information back to the client, or turn on tracing as per the Microsoft .NET Framework 3.0 SDK documentation and inspect the server trace logs..

Log Name: Application

Source: Microsoft-SharePoint Products-SharePoint Foundation

Event ID: 8306

Task Category: Claims Authentication

Level: Error

Description:

An exception occurred when trying to issue security token: The security token username and password could not be validated..

Log Name: Application

Source: Microsoft-Windows-User Profiles Service

Event ID: 1511

Task Category: None

Level: Error

Description:

Windows cannot find the local profile and is logging you on with a temporary profile. Changes you make to this profile will be lost when you log off.

ULS Logs:

01/04/2011 13:38:52.17        w3wp.exe (0x037C)        0x0660        SharePoint Server        Shared Services        olgq        Exception       System.Runtime.InteropServices.COMException (0x800703FA): Illegal operation attempted on a registry key that has been marked for deletion. at System.DirectoryServices.DirectoryEntry.Bind(…

01/04/2011 13:38:52.17        w3wp.exe (0x0554)        0x0F30        SharePoint Foundation        Claims Authentication        8306        Critical        An exception occurred when trying to issue security token: The security token username and password could not be validated..       

 

Cause:

I included the third event log error above because, although it is one you often see, it was the message that eventually led me to what looks like the source of this issue. Combined with the “registry key that has been marked for deletion” message in the ULS, it led me to the following DCOM blog post:

A COM+ application may stop working on Windows Server 2008 when the identity user logs off

Resolution:

It seems that the claims provider breaks when, for some reason or other, the App Pool account logs off unexpectedly. The solution (at least after two weeks with no recurrence) is as suggested in the above blog post:

As a workaround it may be necessary to disable this feature which is the default behavior. The policy setting ‘Do not forcefully unload the user registry at user logoff’ counters the default behavior of Windows 2008. When enabled, Windows 2008 does not forcefully unload the registry and waits until no other processes are using the user registry before it unloads it.

The policy can be found in the group policy editor (gpedit.msc)
Computer Configuration->Administrative Templates->System-> UserProfiles
Do not forcefully unload the user registry at user logoff

Change the setting from “Not Configured” to “Enabled”, which disables the new User Profile Service feature.

‘DisableForceUnload’ is the value added to the registry
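
If you want to confirm that the setting has actually landed on each farm server, a small script can read the value back. This is a minimal Python sketch, not a definitive check: the registry path (HKLM\SOFTWARE\Policies\Microsoft\Windows\System) is my assumption and is not stated in the quoted blog, so verify it on your own servers. The function names are illustrative only.

```python
# Sketch: check whether the "Do not forcefully unload the user registry at
# user logoff" policy value is present on this machine.
# ASSUMPTION: the key path below is where the policy value lives.
try:
    import winreg  # only available on Windows
except ImportError:
    winreg = None

POLICY_KEY = r"SOFTWARE\Policies\Microsoft\Windows\System"  # assumed location
POLICY_VALUE = "DisableForceUnload"


def read_disable_force_unload():
    """Return the stored value, or None if it is absent or we are not on Windows."""
    if winreg is None:
        return None
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, POLICY_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, POLICY_VALUE)
            return value
    except FileNotFoundError:
        # Key or value does not exist: policy is "Not Configured".
        return None


def policy_enabled(value):
    """Treat the policy as enabled only when the value is exactly 1."""
    return value == 1


if __name__ == "__main__":
    state = "enabled" if policy_enabled(read_disable_force_unload()) else "not set"
    print(f"{POLICY_VALUE}: {state}")
```

Run it on each server after a group policy refresh. Anything other than "enabled" suggests the policy has not applied there.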

 

I’ll update this blog entry if the problem comes back.


SQL Reporting Services Integration Errors

When trying to configure the SQL Reporting Services Add-in for SharePoint 2010 from Central Admin you might see the following error:

image

(Error Text: Failed to establish connection with report server. Verify the server URL is correct or review ULS logs for more information. …)

Most common cause: Kerberos

There are actually a few causes for this, and the ULS log will usually show the reason pretty clearly. Most commonly I see the following:

w3wp.exe (0x11E4)        0x1460 SQL Server Reporting Services SOAP Client Proxy  0000 High Exception encountered for SOAP method GetSystemProperties: System.Net.WebException: The request failed with HTTP status 401: Unauthorized

In that case the cause is likely to be related to Kerberos. If you’re using Windows Authentication mode and have multiple servers in the SharePoint farm, that is where you should look. A good way to confirm it is to check whether the integration works without error using ‘Trusted Authentication’.

 

Alternate cause: SSL

However, I recently found a new cause of this error. Here’s the ULS extract:

An operation failed because the following certificate has validation errors:\n\nSubject Name: CN=*.something.com, [… … snip … …]\n\nErrors:\n\n The root of the certificate chain is not a trusted root authority..

In this case the ULS log is very helpful indeed: it seems that SharePoint maintains its own certificate store, which does not contain any of the usual certificates.

So the solution to this one is quite easy: obtain and install the appropriate certificate.

With credit to the blog post “Calling an SSL Web Service from SharePoint 2010 (For example LinkedIn)”, you can download the certificates directly from Verisign:

https://www.verisign.com/support/roots.html

In my case it was the GeoTrust certificate. If you’re not sure, view the certificate by clicking the padlock icon in IE; under Certification Path you can see the certificate hierarchy and determine which root certificate you need.

Once you have your certificate, open Central Admin and, under Security – Manage Trust, add the certificate using New. It should then look something like this:

image

Once that’s done, go back to the SSRS integration and you should be able to complete it without further errors.

 

Final Thoughts

This SSRS integration has always had problems in my experience. If you are still having issues, all I can suggest is that you double-check:

  • SQL Reporting Services 2008 should be installed with Service Pack 2 (or SP1 with CU8)
  • SQL Reporting Services 2008 R2 in my experience works without any updates, however you might like to try the latest patches: both SSRS and the SharePoint add-in are updated in SQL 2008 R2 CU4 (http://support.microsoft.com/kb/2345451)

Hope that helps someone out there!


Error saving projects to Project Server 2010 from 2007 client

This is probably the first major-impact 2010 issue I’ve experienced. Not the first issue, let’s be clear, but the one causing significant impact to at least two of my major customers who are now well into production.

Problem:

For no apparent reason, when attempting to save a project from Microsoft Project Professional 2007 to Project Server 2010, the following error is displayed:

image

(Text for Google: The file cannot be opened. … Project files saved in a version earlier than Microsoft Project 98 can’t be opened. If your file is from an earlier version … save in MPX format …)

 

After selecting OK, any subsequent attempts to save report the following error:

image

(Text: An unexpected error occurred during command execution.)

 

This problem ONLY occurs with the Project 2007 client; opening and saving the file in Project 2010 works as expected. It’s also worth noting that once the error appears on one client, all clients will be unable to save the project.

Finally, it is definitely not related to legacy projects: I have seen it occur on a dozen-line test project based on a blank template.

 

Cause:

Unknown at this stage, although it is definitely some sort of corruption, likely in the Resource Sheet: deleting all resources and resaving works immediately! It seems that either the 2010 client has better error handling and correction, or the 2007 client is introducing the errors.

 

Resolution:

We (as in we at EPM Partners) have spent many hours on this issue with Microsoft, as it seriously impacted one of Microsoft’s largest Asia-Pac 2010 customers. In the end, with the support of a Microsoft PFE (onsite engineer), a resolution was found in the yet-to-be-released beta October Cumulative Update for Project Pro 2007.

So there is light at the end of the tunnel. However, the problem is not yet closed: after more than a week of testing post-CU the issue hasn’t recurred, but this is definitely one to watch out for!


Unable to delete Published version of Project in 2010

This is an issue I have now seen with two different Project Server 2010 clients:

Symptoms:

  • Certain Projects cannot be opened in either PWA or MS Project.
  • Oddly the same projects do not appear to exist in the Draft database, but DO appear in the Published database.
  • Projects cannot be deleted from Server Settings after selecting the Published version and hitting delete.

Errors:

The Delete queue job fails, and a review of the ULS log shows the following:

10/14/2010 11:30:40.33        Microsoft.Office.Project.Server (0x14A8)        0x172C        Project Server        Server-side Project Operations        7gvo        Critical        Standard Information:PSI Entry Point: Project User: i:0#.w|domain\administrator Correlation Id: e63eca62-f9fb-40d6-af52-db59bff5b7cd PWA Site URL: http://project2010:82/PWA SSP Name: Project Server Service PSError: ProjectDeleteFailure (23006) The request to delete a project encountered a problem – the relevant job failed in the Queue. Project UID: 01c7c325-1704-4269-8316-ab1b0bc85d07. Sub-job type where failure occurred: Microsoft.Office.Project.Server.BusinessLayer.QueueMsg.AdjustTimeSheetForDeletedProjectMessage. Sub-job ID where failure occurred: . Specific stage in the sub-job where failure occurred: . Does this failure block the correlated job group: Undefined. See the ‘Manage Queue’ page in PWA for more details.

Additionally, enabling Verbose ULS logging for all Project Server events turns up these entries:

10/14/2010 11:50:05.33        Microsoft.Office.Project.Server (0x14A8)        0x1434        Project Server Timesheet myzd Verbose PWA:http://project2010:82/PWA, ServiceApp:Project Server Service, User:i:0#.w|domain\administrator, PSI: Start ApproveProjectTimesheetLines(approvedTimesheetLines=’ ‘, rejectedTimesheetLines=… [GUID’s removed] …

10/14/2010 11:50:05.33        Microsoft.Office.Project.Server (0x14A8)        0x1434        Project Server Timesheet myzl Verbose Error is: GeneralItemDoesNotExist. Details: . Standard Information: PSI Entry Point: Project User: i:0#.w|domain\administrator Correlation Id: e63eca62-f9fb-40d6-af52-db59bff5b7cd PWA Site URL: http://project2010:82/PWA SSP Name: Project Server Service PSError: GeneralItemDoesNotExist (10000)

Resolution:

It took enabling Verbose logging to figure out the cause. The giveaway was the timesheet line delete job: it seems that for some reason the timesheet line GUIDs above were causing the failure.

So the solution was relatively simple: identify the timesheets (and their respective lines) and delete them manually.

To that end I ended up creating a couple of SQL scripts:

Script 1:

Identify all projects which exist in the Published Database but not in Draft.

-- Query to identify projects in Publish but not in Draft

USE ProjectServer_Published
SELECT PubProj.PROJ_UID, PubProj.PROJ_NAME AS Published, DrafProj.PROJ_NAME AS Draft
    FROM ProjectServer_Published.dbo.MSP_PROJECTS AS PubProj
    LEFT OUTER JOIN ProjectServer_Draft.dbo.MSP_PROJECTS AS DrafProj ON PubProj.PROJ_UID = DrafProj.PROJ_UID
    WHERE DrafProj.PROJ_NAME IS NULL

Script 2:

Identify all Timesheets with lines against projects not in the Draft database.

-- Query to identify timesheets with lines against projects not in draft

USE ProjectServer_Published
SELECT MSP_TIMESHEETS.TS_UID, MSP_TIMESHEETS.CREATED_DATE, MSP_TIMESHEETS.TS_CACHED_RES_NAME, MSP_TIMESHEETS.TS_COMMENTS
    ,RepTSP.PeriodName, RepTSP.StartDate, RepTSP.EndDate
    ,MSP_TIMESHEET_LINES.PROJ_UID, MSP_TIMESHEET_LINES.TS_LINE_CACHED_PROJ_NAME
    FROM MSP_TIMESHEETS
    INNER JOIN MSP_TIMESHEET_LINES ON MSP_TIMESHEETS.TS_UID = MSP_TIMESHEET_LINES.TS_UID
    INNER JOIN ProjectServer_Reporting.dbo.MSP_TimesheetPeriod AS RepTSP ON MSP_TIMESHEETS.WPRD_UID = RepTSP.PeriodUID
    WHERE
    MSP_TIMESHEET_LINES.PROJ_UID IN (SELECT PubProj.PROJ_UID
        FROM ProjectServer_Published.dbo.MSP_PROJECTS AS PubProj
        LEFT OUTER JOIN ProjectServer_Draft.dbo.MSP_PROJECTS AS DrafProj ON PubProj.PROJ_UID = DrafProj.PROJ_UID
        WHERE DrafProj.PROJ_NAME IS NULL)
    ORDER BY CREATED_DATE

Note: In both scripts, replace the ProjectServer_* database names with your own, then run the script against your PUBLISHED database.

Running the first script is not mandatory, although it can be useful: in my case it identified a few more projects with this issue. However, ignore the ‘eGlobal …’ project(s) and do not attempt by any means to remove them. You have been warned!

The second script identifies, line by line, the timesheets that need to be edited or deleted before you can delete the problematic project from Published. This needs to be done in the usual ways, probably by the resource themselves unless you are using delegation or something similar.
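
Both scripts rely on the same pattern: a LEFT OUTER JOIN from Published to Draft, keeping only the rows where the Draft side came back NULL. If you want to convince yourself of how that behaves before touching a real Project Server database, here is a minimal Python sketch using an in-memory SQLite database. The table names and sample rows are made up for illustration, and SQLite stands in for SQL Server only to show the join logic.

```python
import sqlite3


def projects_missing_from_draft(published, draft):
    """Return (uid, name) rows present in `published` but absent from `draft`."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-ins for the Published and Draft MSP_PROJECTS tables.
    conn.execute("CREATE TABLE pub_projects (PROJ_UID TEXT PRIMARY KEY, PROJ_NAME TEXT)")
    conn.execute("CREATE TABLE draft_projects (PROJ_UID TEXT PRIMARY KEY, PROJ_NAME TEXT)")
    conn.executemany("INSERT INTO pub_projects VALUES (?, ?)", published)
    conn.executemany("INSERT INTO draft_projects VALUES (?, ?)", draft)
    rows = conn.execute(
        """
        SELECT p.PROJ_UID, p.PROJ_NAME
        FROM pub_projects AS p
        LEFT OUTER JOIN draft_projects AS d ON p.PROJ_UID = d.PROJ_UID
        WHERE d.PROJ_NAME IS NULL      -- no matching Draft row
        ORDER BY p.PROJ_UID
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    published = [("uid-1", "Alpha"), ("uid-2", "Beta")]
    draft = [("uid-1", "Alpha")]
    print(projects_missing_from_draft(published, draft))  # [('uid-2', 'Beta')]
```

Only the project that has no Draft counterpart is returned, which is exactly the set the first script lists and the second script filters on.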

 

Cause:

Good question! I was not able to identify the root cause of this one, although I did see it happen during a training session that I was hosting. In fact, the issue occurred on the test project I was working on while testing the creation and use of timesheet approval rules and, of course, submitting, approving and rejecting test timesheets. Despite that clue I was never able to reliably reproduce the issue, so please leave a comment if you have any other ideas!

 

Hope that helps someone out there!


Issues in PerformancePoint when using OLAP and SQL datasources

I came across this annoying little issue while creating some linked PPS dashboards in which one scorecard item filters another report by the selected row (in this case a ‘Program Name’ custom field). I was stumped when I got the following message on selecting my program:

Error running data source query

After much head scratching I found the problem deep in the logs. For Google’s benefit, here are some server log snippets:

Log Name: Application

Source: Microsoft-SharePoint Products-PerformancePoint Service

Description:

An exception occurred while running a report. The following details may help you to diagnose the problem:

Error Message: Error running data source query.

<br>

<br>

Contact the administrator for more details.

Dashboard Name:

Dashboard Item name:

Parameters: Programme A

Exception Message: Error running data source query.

And

PerformancePoint Services error code 10116.

<Data Name="string1">Error running data source query.

Microsoft.AnalysisServices.AdomdClient.AdomdErrorResponseException: Query (10, 19) Parser: The syntax for ‘A’ is incorrect.

The problem turned out to be a space in my custom field value name!

Basically the hint was in the parameter and the syntax error: it seems that PerformancePoint fails to pass my parameter correctly, so ‘Programme A’ is sent as two separate invalid parameters, ‘Programme’ and ‘A’, hence the error “The syntax for ‘A’ is incorrect”.

This was quickly proved by updating the custom field lookup table values to remove any spaces, which after a cube rebuild resolved the error!

Definitely chalk that one up to a little defect hidden in the code somewhere!
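
The behaviour is consistent with how MDX treats names: a name containing a space only counts as a single identifier when it is delimited with square brackets. The Python sketch below illustrates that rule only. It is not PerformancePoint’s actual code, and the function names are hypothetical.

```python
def mdx_delimit(name):
    """Wrap a name in square brackets, doubling any closing bracket inside it."""
    return "[" + name.replace("]", "]]") + "]"


def naive_tokens(raw):
    """Show how an undelimited value falls apart when split on whitespace."""
    return raw.split()


if __name__ == "__main__":
    print(naive_tokens("Programme A"))  # ['Programme', 'A']
    print(mdx_delimit("Programme A"))   # [Programme A]
```

The first call shows the two fragments that match the error in the log. The second shows the delimited form that would keep the value in one piece.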


Database Restore from / to Claims Auth App Problem

When restoring SharePoint 2010 databases or sites I have recently come across the following problem:

Access Denied to all Sites for all users, including Site Collection Admins.

This appears similar to the old 2007 restore issues which were fixed using the STSADM MigrateUser command. In this case, however, it seems that neither the STSADM command nor the PowerShell Move-SPUser command fixes the issue. And I definitely made certain to change the Site Collection admin on the restored site collections using Central Admin.

A little investigation revealed that the reason is the use of Claims (or Forms) Authentication in either the source or destination web application. In my case, my Test environment used Claims (both NTLM and LDAP) but my Prod was only going to use NTLM authentication, so the restored sites were inaccessible (including PWA!).

Fortunately, in my case I found a workaround: set up the new Web App to use Claims but only enable NTLM authentication, effectively resulting in a pure NTLM setup. That won’t always work, however. I have another case where the source data (from an old 2007 portal) uses NTLM and I want to migrate it to 2010 using Claims; in that case another solution will be required.

I plan to investigate further using the Move-SPUser PowerShell command, as that seems to be the solution; it just seems that something is preventing it from migrating the users as expected. I’ll update this blog with my results.
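
One thing worth checking when users will not migrate is the format of the stored login names. Assuming the usual SharePoint 2010 encoding, a classic NTLM web application stores DOMAIN\user, while a claims web application using Windows authentication stores the same account with an i:0#.w| prefix (the form you see in ULS entries from claims-based web applications). I have not confirmed that this mismatch is the cause here, so treat the Python sketch below as an illustration of the two formats only. The function names are hypothetical.

```python
# ASSUMPTION: "i:0#.w|" is the prefix used for Windows accounts under claims.
WINDOWS_CLAIM_PREFIX = "i:0#.w|"


def to_windows_claim(login):
    """Return the claims-encoded form of a classic DOMAIN\\user login."""
    if login.startswith(WINDOWS_CLAIM_PREFIX):
        return login  # already encoded, leave untouched
    return WINDOWS_CLAIM_PREFIX + login


def from_windows_claim(login):
    """Return the classic DOMAIN\\user form of a claims-encoded login."""
    if login.startswith(WINDOWS_CLAIM_PREFIX):
        return login[len(WINDOWS_CLAIM_PREFIX):]
    return login  # not claims-encoded, leave untouched


if __name__ == "__main__":
    print(to_windows_claim(r"DOMAIN\user"))
    print(from_windows_claim(r"i:0#.w|DOMAIN\user"))
```

Comparing the login names in a restored site collection against the format the destination web application expects should at least show which direction the migration needs to go.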
