Wednesday 16 December 2009

ECM platform enhancements - Content Organizer, List Throttling, Enterprise Content Types etc.

This is part 2 of my write-up on my ‘ECM Enhancements in SharePoint 2010’ talk which I did recently. You can find previous parts here:

Part 1: Managed Metadata in SharePoint 2010 – a key ECM enhancement
Part 1.5: Managed Metadata in SharePoint 2010 - some notes on the "why"

Having spent some time on the taxonomy/metadata piece in the last articles, this post will delve into other ECM enhancements. ECM is quite a broad topic (Document Management/Collaboration/Records Management/Web Content Management), but it’s amazing just how much of the new stuff you can’t cover in such a talk – Sandboxed Solutions, BCS, Service Applications, API improvements etc. are all only tenuously linked to ECM, so didn’t get coverage.

Here’s what I think of as some key generic ‘ECM platform’ enhancements:

  • Scalability
  • Enterprise Content Types
  • User experience
  • Taxonomy/metadata (as covered in earlier posts)
    • Navigation by metadata
  • Content Organizer
  • Document Sets
  • Tagging

No doubt you might think of others as being a big deal too, but that’s a good list for starters. In this article we’ll look at some of these items in detail, but for others I’ll point you to other articles which have good coverage on those topics.

Scalability/list throttling

Although us architects/developers think that we know a lot about scaling SharePoint these days, consider that in the 2007 release a common cause of Out Of Memory exceptions in large farms was use of the Content Query Web Part on site home pages, left to default settings. Of course this usage is perfectly natural – CQWP is about rolling-up content after all, but what compounded the problem is that some site templates (e.g. publishing portal) added this to the home page for you, thus triggering the problem without explicit configuration. In large sites this would result in a significant query which put significant load on the servers, and the built-in caching didn’t fully mitigate it.

In SharePoint 2010, the query-throttling feature is designed to prevent this problem – this will cut off large queries once a threshold has been passed and the full results will not be returned, thus safeguarding stability. The limits are set at the web application level, and it’s possible to specify a window when the governor will not kick in (aka ‘happy hour’):

ListThrottleSettings

If the query originated from custom code, the query will not complete and an SPQueryThrottledException will be raised:

SPQueryThrottledException

This of course, is a good thing - developers just need to ensure their 2010 code catches this exception type and presents a pretty message on the page, rather than show a Yellow Screen of Death. Interestingly, when a user encounters a similar scenario in a regular SharePoint list view, the experience is somewhat more sophisticated as some results are returned (up to the point where the threshold is crossed), and an equivalent ‘signal’ is passed to the user in the form of a message:

ThrottledListResults

As an aside, I notice at this point I’ve been redirected to a URL which contains all the parameters needed to display the “permitted” subset of data:

http://cob.collab.dev/Lists/LargeList/ShowEverything.aspx#ServerFilter=FilterField1=ID-FilterValue1=3752-FilterOp1=Geq-OverrideScope=RecursiveAll-FallbackLimit=3752-ProcessQStringToCAML=1 

I haven’t yet worked out if it’s possible to use this “show partial results” facility in custom code – certainly it would be nice in some scenarios. It sounds feasible as in the list view request, the server has worked out what parameters constitute an acceptable query (ID >= 3752) and (it seems) has redirected the user to a new request which contains the filter parameters. I’d be interested to hear if anyone has noticed how this can be done with code – I haven’t dug too deep, but the most obvious place to communicate this info back (the SPQueryThrottledException) doesn’t have anything.

Enterprise Content Types

In 2007, maintaining content types and site columns across an enterprise deployment effectively became a technical problem – the business could not take care of this without help. Changes they made would only be local to a site collection and code/scripts were required to roll out updates more globally. With Enterprise Content Types in SharePoint 2010, this common pain point is eased – effectively the model is a hub and spoke-type arrangement where a ‘master site’ is nominated, and then a service application/set of timer jobs takes care of synchronizing the changes to other sites which are hooked up to the same Managed Metadata service app as the hub. I decided not to demo this in my talk in the end, partly because it doesn’t really make the world’s most scintillating demo. However, that’s not to downplay the significance of this feature – I think this goes a long way to making SharePoint more manageable in the enterprise. Others have covered this well, a couple of good write-ups are:

User experience

Put simply, the new wiki-style page editing experience will have a huge impact on collaboration sites. The ability for users to place rich text, images and even web parts exactly where they like on a page (as opposed to having to deal with web part zones and the Content Editor web part) will make a lot of end users happy:

PageEditing

Interestingly if you don’t want the new page editing experience in team sites, you’ll probably need to do some customization work. It is possible to create web part pages in the new team site template (and even store them in the ‘Site Pages’ library alongside wiki pages), but the user experience isn’t obvious and the menu options aren’t right there in front of the user – you have to pass by the ‘New Page’ menu item and head to ‘More Options’ > ‘Web Part Page’ and specify which library to store it in. So if you did want the ‘old-style’ page editing, you’d probably either use a custom site definition or at least add something to the Site Actions menu or ribbon if sticking to the 2010 team site template. In any case, the wiki page editing functionality is defined by the actual field control used rather than the field type – it seems there’s a fair amount of behind-the-scenes jiggery-pokery which goes on with hidden fields, update panels and a new EmbeddedFormField control (in team sites at least, publishing sites continue to use the RichHtmlField control), but the upshot is that anywhere the new Rich Text Editor is used will display the new editing experience.

I know for a fact that the site owners on the collaboration roll-out I’m currently working on would welcome this with open arms. Consequently this is a significant for advance for SharePoint as an ECM platform, and I can see it being a key driver for some organizations to upgrade to 2010 for their collab sites.

Finally, as alluded to just now, publishing sites also get the new RTE – which as an aside, means things like the wiki-syntax for linking pages can be used there too.

Content Organizer

You may have already seen AC talk about this – in essence the Content Organizer allows you to add certain types of content (e.g. documents, pages, rich media files but NOT list items) to one bucket (known as the ‘Drop-Off Library’), and then let some rules define where it should actually end up within your site. A timer job handles the actual processing of the rules. Typically you’ll define rules based on metadata, so continuing to use my electrical goods example, here I’m creating a rule which moves documents with a ‘Screen type’ value of ‘Plasma’ to a library called ‘Television specifications’:

ContentOrganizerRules

I’m also specifying that subfolders should be created within the document library, such that I’ll end up with a set of folders for each of the ‘Screen type’ values encountered when documents are uploaded. In general, the way rule definition works is you first specify the content type the rule applies to, then the UI allows you to select columns on that content type to use as metadata filters. The metadata bit is actually optional though – your rule could simply be based on content type, with no further sub-filter based on metadata values.

If you’re like me, you may have wondered how the Content Organizer copes with certain scenarios – so here are a few findings:

  • What happens if users don’t go to the ‘Drop-Off Library’ to add documents there? Surely the rules won’t run otherwise? To ensure all uploads do indeed use the routing rules, you can specify that users should be redirected to the Drop-Off Library when they upload – this is pretty seamless to the user, they just see a small message in the dialog indicating their file will be routed:

    ContentOrganizerMessage
  • What happens if the user doesn’t have permissions to write to the library specified in the rule? The user will get an Access Denied, and the document gets cleared from the Drop-Off Library. I’d be interested to know if the ‘rule managers’ are notified in such a scenario (I didn’t have SMTP set up at the time of testing) – this is the group who can be configured to be notified when a document is uploaded which doesn’t match any rules, but I’m not yet sure if they get notified for other cases like this.
  • When specifying a rule to route items to another site’s Content Organizer, what happens if the rules there redirect back to this one? Somewhat tongue in cheek this one, but hey, it’s good to know what happens! The answer is the document stays in the Drop-Off Library, and as with any such routing failure, if configured the rule managers are notified after the specified wait period is over.
  • What does the user see if no rules are matched? As well as the notification to the rule managers, the user sees a message to inform them their document won’t be being routed anywhere just yet:

    RuleNotMatched

Steve Peschka has a great series of 3 in-depth articles on Content Organizer, and also you can quickly understand a lot about defining rules from the screenshots in AC’s article.

Document Sets

Briefly, Document Sets allow multiple items to be treated as one in terms of workflow, approval, versioning and so on. A classic use case might be a proposal which consists of several documents – however they are bundled up and given to a client all together, meaning they need to be treated almost as a ‘release’ in coding terms. There are some useful features in here, like the fact you can set default values on columns, which individual documents will inherit when added to the set. Also, each Document Set gets a “home page”, meaning content (including web parts) can be added to provide an entrance on to the information.

Liam has a good write-up here - “SharePoint 2010 User Experience – Document Sets

Tagging

Tagging in SharePoint 2010 is a deep area so I can’t do it justice here, and it reaches into the “social” arena as well as the Managed Metadata framework I’ve previously discussed. Some highlights are the fact that the social bookmarking feature (“Tags and notes”) allows any item inside or outside SharePoint to have your tags and notes added, and that your recent tagging activity is summarized in your activity feed on your MySite (à la Facebook tagging).

Christian Glessner has a write-up here - Managing Metadata in SharePoint 2010

Summary

SharePoint 2010 contains a raft of ECM enhancements, and any one could be a killer feature for your organization. The “ECM platform” features discussed here are relevant whether you are using SharePoint for collaboration/document management, WCM, Records Management or all three. Many of the criticisms of the 2007 release have been addressed, and right now it looks like SharePoint 2010 will be an even bigger hit for ECM than it’s predecessor.

No comments: