Tuesday 22 July 2014

Remote Event Receivers on host web gotcha – no context token/ClientContext


Update August 2015 - new information!

Vesa Juvonen (Microsoft) got in touch with me to say that he and the SharePoint engineering team had been doing some testing around this, and that whilst the "app-only" workaround I discuss below does indeed work, it's actually only necessary under certain conditions. It turns out that the problem I describe in this article does NOT occur if the event receiver is registered *using app authentication*, e.g. in an AppInstalled event. If the RER is registered using *user authentication*, e.g. with the SharePointOnlineCredentials class, then you WILL see the behavior detailed here - so your choices then are either to register the RER with app authentication instead, or to use the app-only approach detailed below in your code.

In other words, *how* the remote event receiver was originally registered has an impact when the code runs.

Thanks for the info Vesa :)
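
To illustrate what "registering with app authentication" could look like, here is a minimal sketch of an AppInstalled handler that attaches an ItemAdded receiver to a list on the host web. It assumes the Visual Studio-generated TokenHelper class is in the project, that the app has been granted enough permission to manage the list, and that the same WCF service handles the list event. The list title, receiver name and sequence number are hypothetical values chosen for illustration.

```csharp
using System.ServiceModel;
using Microsoft.SharePoint.Client;
using Microsoft.SharePoint.Client.EventReceivers;

public class AppEventReceiver : IRemoteEventService
{
    // Hypothetical names, used only for this illustration.
    private const string ListTitle = "Documents";
    private const string ReceiverName = "HostWebItemAddedReceiver";

    public SPRemoteEventResult ProcessEvent(SPRemoteEventProperties properties)
    {
        var result = new SPRemoteEventResult();

        if (properties.EventType == SPRemoteEventType.AppInstalled)
        {
            // useAppWeb: false gives a context for the host web,
            // obtained with app authentication rather than a username/password.
            using (ClientContext clientContext =
                TokenHelper.CreateAppEventClientContext(properties, false))
            {
                if (clientContext != null)
                {
                    List list = clientContext.Web.Lists.GetByTitle(ListTitle);

                    var receiver = new EventReceiverDefinitionCreationInformation
                    {
                        EventType = EventReceiverType.ItemAdded,
                        ReceiverName = ReceiverName,
                        // Assumption: this service's own address is reachable by SharePoint.
                        ReceiverUrl = OperationContext.Current.Channel.LocalAddress.Uri.ToString(),
                        SequenceNumber = 10000
                    };

                    list.EventReceivers.Add(receiver);
                    clientContext.ExecuteQuery();
                }
            }
        }

        return result;
    }

    public void ProcessOneWayEvent(SPRemoteEventProperties properties)
    {
        // App events such as AppInstalled arrive via ProcessEvent; nothing to do here.
    }
}
```

Depending on the hosting environment, the address reported by OperationContext may not be the public URL of the service, in which case the ReceiverUrl would need to be built another way.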

I’ve spent quite a bit of time working with Remote Event Receivers in SharePoint 2013/Office 365, and in the early days I hit an issue that initially made me think they didn’t work properly at all. Now with some help, I realize that mainly the code just has to be written a certain way - but it’s also true to say that there is a limitation which can affect some things you might want to do. In any case, this is a gotcha that I’m starting to see other people hit so I think it’s worth talking about it.
Specifically the issue occurs when:
  • The event receiver is on the host web (rather than the app web)
  • The event receiver uses CSOM code to talk to SharePoint (rather than just using NON-SharePoint code)

Issue symptoms

When you’re trying to use RERs on the host web, what you might see is that whilst your event receiver always fires (and your remote code does execute), your code fails to authenticate back to SharePoint. You’ll get a 401 error when trying to obtain or use a ClientContext object – in your code, this typically manifests itself as an exception such as:
System.NullReferenceException: Object reference not set to an instance of an object
...but if you look at the HTTP requests/responses, you’ll see the HTTP 401 (Unauthorized) response.
This means you effectively can’t do anything with SharePoint – and often, of course, the whole point of your event receiver is that it’s going to do something in SharePoint! Perhaps you need to create, update or delete an item somewhere, or perform some other action related to sites, users or data in SharePoint.
Other info:
  • No context token
    • If you do some debugging, you’ll find that the event receiver does NOT receive a valid context token from SharePoint/Office 365 - SPRemoteEventProperties.ContextToken is an empty string. This causes all the later code to fail. At this point, you probably think there’s a bug with SharePoint/Office 365 – even at the end of all this, I’m personally not sure whether there is a bug or it’s just a “by-design” limitation. Fortunately, as we will see there is a workaround which works for many (most?) scenarios of this type.
  • The same code works on the app web
    • If you do some wider testing, perhaps to see if there is a code issue – you’ll find the very same code works fine in other contexts. For example, if you use the same code in a RER on the app web (e.g. “ItemAdded” on a list in the app web), everything works fine. The issue is purely related to RERs on the host web.
  • No change if you use ProcessOneWayEvent (asynchronous events such as added, deleted etc.) or ProcessEvent (synchronous events such as adding, deleting etc.)
    • Remote Event Receivers are similar to traditional full-trust event receivers, in that both async and sync events can be used. An example here would be the “added” and “adding” events respectively – these correspond to the ProcessOneWayEvent and ProcessEvent methods in a Remote Event Receiver class. For reference, the issue we’re discussing here manifests itself in both cases.
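
If you want to confirm the "no context token" symptom in your own environment, a minimal diagnostic sketch along these lines may help. It only logs whether SPRemoteEventProperties.ContextToken is populated for each incoming event, and does not call back into SharePoint. The class name is hypothetical.

```csharp
using System.Diagnostics;
using Microsoft.SharePoint.Client.EventReceivers;

public class ContextTokenDiagnosticReceiver : IRemoteEventService
{
    // Synchronous events ("adding", "deleting" etc.).
    public SPRemoteEventResult ProcessEvent(SPRemoteEventProperties properties)
    {
        LogContextToken("ProcessEvent", properties);
        return new SPRemoteEventResult();
    }

    // Asynchronous events ("added", "deleted" etc.).
    public void ProcessOneWayEvent(SPRemoteEventProperties properties)
    {
        LogContextToken("ProcessOneWayEvent", properties);
    }

    private static void LogContextToken(string method, SPRemoteEventProperties properties)
    {
        // On the host web (with a receiver registered via user authentication),
        // the article's observation is that this token arrives as an empty string.
        bool hasToken = !string.IsNullOrEmpty(properties.ContextToken);

        Trace.WriteLine(string.Format(
            "{0}: event {1}, context token present: {2}",
            method, properties.EventType, hasToken));
    }
}
```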
So those are the symptoms. Before we dig into the “Why it happens” and “The workaround” sections, it’s worth reminding ourselves of some aspects of remote code and authentication in particular. Feel free to skip forward if you’re familiar with this territory.

A reminder on authentication of remote code

For those just starting out with Remote Event Receivers (or SharePoint apps/remote code in general), it’s worth noting that an app is required for any form of remote code. More correctly, we could expand this to say any remote code which runs on the server-side and uses “app authentication” - as opposed to “user authentication” where a username/password is specified. In any case, this can seem strange for things like Remote Event Receivers, where there isn’t really an “app” in the usual sense (e.g. one with some pages that the user will open and click around in). All we have for a RER is a remote WCF service with no front-end! But this is the way the trust model works for remote code – so the first thing to say is that if you don’t have an app registered and installed/trusted for your remote code, it will not work.
If we were to summarize common authentication options for remote SharePoint code, we could break it down like this:
  • User authentication
    • The username/password of a “named user” is passed – usually a very poor approach that brings security risks. Effectively, this is the problem OAuth was invented to solve.
  • App authentication – this breaks down into:
    • User + app authentication – the default. Permissions of both the user *and* the app are checked
    • App-only authentication – the permissions of just the app are checked (thus allowing operations that the user themselves does not have permission to do)

Why it happens

If you’re hitting the issue, your RER most likely has code which instantiates the ClientContext object in a certain way - specifically, you’ll be using “user + app” authentication. The vast majority of MSDN and blog samples use this approach. This is ultimately the problem -  “user + app” authentication does not currently work for events related to the host web. My testing shows this is the case for both Office 365 and SharePoint 2013 on-premises installations. Along the journey of realising this, I tried some different coding approaches – and I see other developers perhaps doing the same (e.g. in forum posts). Since there are various ways the CSOM ClientContext object can be obtained, you might try different code for this – but several cannot be used in a RER and will fail. The table below shows some of the “wrong” approaches.
Code approaches which will NOT work:

| Approach | Code sample | Why not |
| --- | --- | --- |
| Using the typical RER approach | ClientContext clientContext = TokenHelper.CreateRemoteEventReceiverClientContext(properties) | This is the approach that “should work” (and the approach that most samples use). However, it only works for events on the app web – NOT events on the host web. |
| Creating ClientContext using constructor | ClientContext clientContext = new ClientContext(properties.ItemEventProperties.WebUrl) | The context is not obtained in a way which deals with authentication. This is what the TokenHelper methods are for. (See the sidenote below.) |
| Using the “app event” approach | ClientContext clientContext = TokenHelper.CreateAppEventClientContext(properties, false) | The context is obtained in a way which is only appropriate for app events (e.g. AppInstalled/AppUpgraded/AppUninstalled etc.) Since we are using an event on the host web, authentication does not work. |

Sidenote on the constructor approach – it would be possible to authenticate with user authentication instead of app authentication, if the username/password of a named account was attached using the SharePointOnlineCredentials object. But that’s lame, and a big security risk! Instead, we want to use OAuth (via app authentication) of course, to avoid storing/passing passwords.

As the table says, the one that *should* work is the first one (TokenHelper.CreateRemoteEventReceiverClientContext()) – but this relies on “user + app” authentication which appears to be the problem. So to work around this we need a different approach. 

The workaround

Instead of “user + app” authentication we can use “app-only” authentication. This requires 2 things:
  1. The app permission policy to specify that app-only calls are allowed (something the person installing the app must agree to). This is enabled in the app manifest in Visual Studio (in AppManifest.xml, this corresponds to AllowAppOnlyPolicy="true" on the AppPermissionRequests element):

    [Screenshot: Allow app-only auth]
  2. The code in the app (i.e. in the Remote Event Receiver) to obtain ClientContext using an app-only access token. The expected way to do this in C# is to use the correct TokenHelper methods – big thanks to Kirk Evans for helping me with this. In outline, the code obtains an app-only access token with TokenHelper.GetAppOnlyAccessToken(), then passes that token to TokenHelper.GetClientContextWithAccessToken() to create the ClientContext.
If you ensure these two things are in place, your Remote Event Receiver will work fine.
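
As a rough sketch of the second point, the app-only pattern in a receiver could look something like the following. This is an illustration rather than the exact original sample: it assumes the Visual Studio-generated TokenHelper class, a low-trust (ACS-based) app such as one running against Office 365, and an ItemAdded event on a host web list. The class name and the trace message are hypothetical.

```csharp
using System;
using System.Diagnostics;
using Microsoft.SharePoint.Client;
using Microsoft.SharePoint.Client.EventReceivers;

public class HostWebListReceiver : IRemoteEventService
{
    public SPRemoteEventResult ProcessEvent(SPRemoteEventProperties properties)
    {
        // No synchronous events are handled in this sketch.
        return new SPRemoteEventResult();
    }

    public void ProcessOneWayEvent(SPRemoteEventProperties properties)
    {
        if (properties.EventType != SPRemoteEventType.ItemAdded)
        {
            return;
        }

        // The URL of the web where the event occurred (the host web in this scenario).
        string webUrl = properties.ItemEventProperties.WebUrl;
        Uri webUri = new Uri(webUrl);

        // Obtain an app-only access token; no context token is needed for this.
        string realm = TokenHelper.GetRealmFromTargetUrl(webUri);
        string accessToken = TokenHelper.GetAppOnlyAccessToken(
            TokenHelper.SharePointPrincipal, webUri.Authority, realm).AccessToken;

        using (ClientContext clientContext =
            TokenHelper.GetClientContextWithAccessToken(webUrl, accessToken))
        {
            // Simple round trip to prove the context authenticates.
            Web web = clientContext.Web;
            clientContext.Load(web, w => w.Title);
            clientContext.ExecuteQuery();

            Trace.WriteLine("Connected with app-only context to: " + web.Title);
        }
    }
}
```

Because the token is app-only, any changes made through this context are subject to the app's permissions alone, not the permissions of the user who triggered the event.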

The limitation – when the workaround is no good

So, that’s the answer then. However, you might hit occasions where the app-only approach isn’t really what you want. If your remote code writes any data to SharePoint (e.g. adds a list item), you’ll notice that the SharePoint user interface makes it clear that the list item was added/modified by an app. Specifically, the username will be in the following format:
Last modified at [time] by [app name] on behalf of [username]
This is a pretty useful feature for some business requirements, since it makes it clear that although the named user was involved, they didn't make the change in SharePoint "directly". Unfortunately, use of the app-only authentication route to solve our problem changes this – specifically, with app-only auth only the app name is recorded as the user who made the change. Effectively, we lose the information which tells us which actual person was involved in the change - in some circumstances, this could be a problem if some kind of audit trail is required. To make things clear, here are two screenshots which show the difference:
List item added by "app + user" authentication:
[Screenshot: list item added with App and User context]
List item added by "app-only" authentication:
[Screenshot: list item added with App Only context]
Related sidenote (for the curious!):
If you’re wondering how SharePoint/Office 365 deals with this under the covers, you’ll find that all lists now have 2 new hidden columns - “App Created By” and “App Modified By”. These store a reference to an entry in the User Information list in the current site (just like other Person/Group columns):
[Screenshots: the “App Created By” and “App Modified By” columns on a list]
So now you know ;)
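
If you want to inspect these columns from code, a sketch like the one below might work. It is based on the assumption that the internal names of “App Created By” and “App Modified By” are AppAuthor and AppEditor, and that CSOM returns them as lookup values. Verify both against your own environment before relying on them. The helper class and method names are hypothetical.

```csharp
using System;
using Microsoft.SharePoint.Client;

public static class AppAuditInfo
{
    // Prints the app-related "created by"/"modified by" values for one list item.
    public static void PrintAppColumns(ClientContext clientContext, string listTitle, int itemId)
    {
        List list = clientContext.Web.Lists.GetByTitle(listTitle);
        ListItem item = list.GetItemById(itemId);

        // Assumption: "AppAuthor" and "AppEditor" are the internal names of the hidden columns.
        // They are requested explicitly in case a default load does not include them.
        clientContext.Load(item, i => i["AppAuthor"], i => i["AppEditor"]);
        clientContext.ExecuteQuery();

        // FieldUserValue derives from FieldLookupValue, so this cast covers both cases.
        var createdByApp = item["AppAuthor"] as FieldLookupValue;
        var modifiedByApp = item["AppEditor"] as FieldLookupValue;

        Console.WriteLine("App Created By: " +
            (createdByApp != null ? createdByApp.LookupValue : "(none)"));
        Console.WriteLine("App Modified By: " +
            (modifiedByApp != null ? modifiedByApp.LookupValue : "(none)"));
    }
}
```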

Summary

It’s pretty easy to get things wrong with Remote Event Receivers, and the pitfall I’ve talked about concerning authentication and CSOM code is definitely a big gotcha – I certainly see other people hitting this. If you have remote code which modifies data in SharePoint somewhere, the way around the problem is to use app-only authentication - by ensuring the app’s permission policy allows this *and* also writing your CSOM code to authenticate in this way. This can be done by obtaining an access token using the TokenHelper.GetAppOnlyAccessToken() method, and then obtaining a ClientContext object using this token.
However, with this approach comes a trade-off - the loss of user information (in “pseudo-audit” terms) related to the user behind the change. Developers should bear this in mind when working with Remote Event Receivers.

8 comments:

Navya said...

Excellent article, Chris. However, when we deploy an app file with app-only authentication unchecked to a developer site using Visual Studio, it works fine. I tried it in our DEV environment and it had no issues. But when we upload the .app file to the app catalog site and try to install it on a team site, it errors out. We have opened a ticket with the Microsoft product support team to see if there is an alternate way.

Human Fireplace said...

Thanks Chris! This blog was very helpful! Especially the 2 points on app-only contexts!

libin said...

Chris, Thanks for helping me resolve the context issue. But now I am stuck with access denied error when I tried to load event receivers and run clientContext.ExecuteQuery(); which was working perfectly on Dev site while debugging or installed but not when I moved to Host web. Below is the code I am using and I have tried with full permissions on List and Web.
Sorry for hijacking this comment section, please let me know if there is any other way to ask you this question. Thanks much!!

private void HandleAppInstalled(SPRemoteEventProperties properties)
{
    try
    {
        System.Diagnostics.Trace.WriteLine("HandleAppInstalled " + ListName);
        // System.Diagnostics.Trace.WriteLine("HandleAppInstalled " + properties.ContextToken + " " + properties.ListEventProperties.WebUrl + " " + properties.WebEventProperties.FullUrl);

        string webUrl = properties.AppEventProperties.HostWebFullUrl.ToString();
        System.Diagnostics.Trace.WriteLine("HandleAppInstalled " + webUrl);
        Uri webUri = new Uri(webUrl);

        string realm = TokenHelper.GetRealmFromTargetUrl(webUri);
        string accessToken = TokenHelper.GetAppOnlyAccessToken(TokenHelper.SharePointPrincipal, webUri.Authority, realm).AccessToken;

        //using (ClientContext clientContext =
        //TokenHelper.CreateRemoteEventReceiverClientContext(properties))
        using (var clientContext = TokenHelper.GetClientContextWithAccessToken(webUrl, accessToken))
        {
            if (clientContext != null)
            {
                System.Diagnostics.Trace.WriteLine("Client context not null " + ListName);
                List myList = clientContext.Web.Lists.GetByTitle(ListName);
                clientContext.Load(myList, p => p.EventReceivers);
                System.Diagnostics.Trace.WriteLine("Reached load" + ListName);
                clientContext.ExecuteQuery(); // Fails here

Marit said...

Hi,
thanks for this post. I had a hard time to figure this out :)

Now I have another issue. I have a provider-hosted app that creates two list items in two separate lists. Both of these lists have remote event receivers attached to them, both ItemAdded.

What I experience is that when I add the first list item in list1 from the app the remote event receiver for list1 is triggered. But when the code in my app reaches the point where a list item in list2 is added, it seems like the first event receiver, the RER for list1, is cancelled and the event receiver for list2 is executed. Is it not possible to have two remote event receivers to run at the same time?

Have you any experience with this? I use 'ProcessOneWayEvent'. Any clue to a workaround?

Chris O'Brien said...

@Marit,

Sorry, I haven't seen that one. I've had RERs execute fine, in cases where the same RER is shared amongst multiple lists, and also where lists have their own RER in the same web. So I'm afraid I don't have any helpful info on what you're seeing.

Best of luck..

COB.

Unknown said...

Great article thanks!

With the latest CSOM library version they introduced the SystemUpdate method, which also solves the "App Modified By" issue.

Marius said...

Great article, many thanks