Sunday, 31 May 2009

More on optimization, HTTP 304s etc. – a solution?

In my last post Optimization, BLOB caching and HTTP 304s, I did a fairly lengthy walk-through on an issue I’d experienced with SharePoint publishing sites. A few people commented, mainly saying they’d noticed the same thing, but there have been further developments and findings I wanted to share!

Quick recap

Under certain circumstances some files in SharePoint are always re-requested by the browser despite being present in the browser cache (“Temporary internet files”). Specifically this is observed for files stored in the Style Library and Master Page Gallery, for anonymous users. Although SharePoint responds with a HTTP 304 to say the cached file can indeed be used (as opposed to sending the file itself again), we effectively have an unnecessary round-trip to the server for each file – and there could be many such files when all the page’s images/CSS/JS files are considered. This extra network traffic can have a tangible impact on site performance, and this is magnified if the user is geographically far away from the server.

A solution?

Waldek and I have been tossing a few development matters around recently over e-mail, and he was curious enough to investigate this issue for himself. After reproducing it and playing around for some time, Waldek discovered that flushing the disk-based cache seems to cause a change in behaviour – or in layman’s terms, fixes everything. To be more specific, we’re assuming it’s a flush of the BLOB cache which is having the effect – in both Waldek’s test and my subsequent validation, the object cache was flushed as well:

FlushDiskCache

After the OK button is hit on this page, the problem seems to go away completely, so when the page is now accessed for the first time as an anonymous user, the correct ‘max-age’ header is added to the files (as per the BLOB cache declaration in web.config) – contrast the ‘max-age=86400’ header on the Style Library files with what I documented in my last post:

AnonymousCorrectHeadersAfterFlushCache

This means that on subsequent requests, the Style Library files are served directly from the browser cache with no 304 round-trip:

SecondRequestNo304s

This is great news, as it means the issue I described is essentially a non-issue, and there is therefore no performance penalty for storing files in the publishing Style Library.

So what gives?

I’m now wondering if this is just a ‘gotcha’ with BLOB caching and publishing sites. I know other people have run into the original issue due to the comments on my previous post, and interestingly enough one poster said they use reverse proxy techniques specifically to deal with this issue. Could it really be that everybody who sees this behaviour just didn’t flush the BLOB cache somewhere along the way, when it’s actually a required step? Or is the testing that Waldek and I did flawed in some way? Or indeed, was my initial investigation flawed despite the fact others reported the same issue?

I am interested to hear from you on this – if you can reproduce the problem I’ve described with a publishing site you’ve developed, does flushing the BLOB cache solve it for you as described here? Leave a comment and let us know!

Good work Waldek :-)

Sunday, 17 May 2009

Optimization, BLOB caching and HTTP 304s

There's been an interesting mini-debate going on recently in terms of where to store static assets used by your site - images, CSS, JS files and so on. Broadly the two approaches can be characterized as:

  • Developer-centric - store assets on the filesystem, perhaps in the 12 hive
  • Author-centric - store assets in the content database, perhaps in the Style Library which comes with publishing sites

Needless to say these options offer different pros and cons depending on your requirements - Servé Hermans offers a good analysis in To package or not to package: that is the question. However, I want to throw another point into the debate - performance, specifically for anonymous users. Frequently, this is an audience I care deeply about since some of the WCM sites I work on often have forecast ratios of 80% anonymous vs. 20% authenticated users. Recently I was asked to help optimize an under-performing airline site built on MOSS - as usual the problem was a combination of several things, but one of the high-impact items was this decision to store assets in one location over the other. In this post I'll explain what the effect on performance is and why you should consider this when building your site.

The problem

Once they've been loaded the first time, most of the static files a website uses should be served from the user's local browser cache ("Temporary internet files") - without this, the internet would be seriously slow. Consider how much slower a web page loads when you do a hard refresh (ctrl+F5) compared to normal - this is because all the images are forced to be re-downloaded rather than served from the browser cache. Unfortunately, for files stored in some common SharePoint libraries/galleries (i.e. the author-centric approach) SharePoint doesn't deal with this quite right in some scenarios - most of the gain is there, but despite having the image locally, the browser still makes a request for the image - the conversation goes like this (for EACH image on the page!):

Browser: I need this image please - I cached it last time I came at [date/time], but for all I know it's changed since then.
Server: No need dude, it's not changed so just use your local copy (in the form of a HTTP 304 - "Not modified")
Browser: Fair enough, cheers.

This essentially happens because the file was not served with a "cacheability" HTTP header to begin with. This adds significant time to the page load when you have 30+ images/CSS/JS files referenced on your page - potentially several seconds in my experience (under some circumstances), which of course is a huge deal. If, say, the user is in Europe but the servers are in the U.S., then suddenly this kind of network chatter is something we need to address. In the majority of cases we're happy to cache these files for a period since they don't change too often, and we get better performance as a result.
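To make the difference concrete, here is a minimal Python sketch of the decision the browser is making for each cached file. It is a deliberately simplified model rather than how any real browser is implemented: it only looks at the 'max-age' directive, it ignores everything else (no-cache, Expires, heuristic freshness and so on), and the function names are made up for illustration.

```python
import re
from typing import Optional


def max_age_seconds(cache_control: Optional[str]) -> Optional[int]:
    """Extract max-age (in seconds) from a Cache-Control header value, if present."""
    if not cache_control:
        return None
    match = re.search(r"\bmax-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def needs_round_trip(cache_control: Optional[str], age_seconds: int) -> bool:
    """Return True if the browser has to go back to the server for this file.

    Simplified assumption: with no max-age the browser always revalidates,
    and the server answers with a 304 if the file is unchanged.
    """
    max_age = max_age_seconds(cache_control)
    if max_age is None:
        return True
    # The cached copy can be used silently until it is older than max-age.
    return age_seconds >= max_age


# A file cached one hour ago, served without and with a cacheability header.
print(needs_round_trip(None, 3600))
print(needs_round_trip("public, max-age=86400", 3600))
```

The first call models the problem case (a round-trip and a 304 for every file), the second models a file served with a 24-hour max-age.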

The Solution (for some SharePoint libraries *)

Mike Hodnick points us to part of the solution in his highly-recommended article Eliminating "304" status codes with SharePoint web folder resources. Essentially, SharePoint's BLOB caching feature saves the day since it serves the image with a "max-age" value on the HTTP header, meaning the browser knows it can use its local copy of the file until that period has elapsed. This only happens when BLOB caching is enabled and has the max-age attribute like this (here set to 86400 seconds = 24 hours):

<BlobCache location="C:\blobCache" path="\.(gif|jpg|png|css|js|aspx)$" maxSize="10" enabled="true" max-age="86400" />

When we configure the BLOB cache like this we are, in effect, specifying that it's OK to cache static files for a certain period, so the "cacheable" header gets added. HOWEVER, what Mike doesn't cover is that this only happens for authenticated users - files served out of common content DB locations such as the Style Library and Master Page Gallery still do not get served correctly to anonymous users. Note this isn't all SharePoint libraries though - so we need to be clear on exactly when this problem occurs.
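As a quick sanity check of a declaration like the one above, the following Python sketch reads the element, converts max-age to hours and tests a few URLs against the 'path' expression. Treat it as an approximation under stated assumptions: SharePoint evaluates the expression with the .NET regex engine (Python's is used here, which behaves the same for a pattern this simple), the sample URLs are invented, and the helper name is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# The BlobCache element from web.config, copied from the example above.
BLOB_CACHE_XML = (
    r'<BlobCache location="C:\blobCache" '
    r'path="\.(gif|jpg|png|css|js|aspx)$" '
    r'maxSize="10" enabled="true" max-age="86400" />'
)

element = ET.fromstring(BLOB_CACHE_XML)

# max-age is expressed in seconds: 86400 / 3600 = 24 hours.
max_age_hours = int(element.get("max-age")) / 3600

# The 'path' attribute is a regular expression matched against the request URL.
path_pattern = re.compile(element.get("path"))


def is_blob_cache_candidate(url: str) -> bool:
    """Return True if the URL's extension is covered by the 'path' expression."""
    return path_pattern.search(url) is not None


print(max_age_hours)
for url in ("/Style Library/site.css", "/Documents/report.docx"):
    print(url, is_blob_cache_candidate(url))
```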

* Scope of this problem/solution

Before drilling down any deeper, let's stop for a moment and consider the scope of what we're discussing - a site with:

  • Anonymous users
  • Files stored in some libraries - I'm not 100% sure of the pattern but discuss it later - the Style Library and Master Page Gallery are known culprits however. Other OOTB libraries such as SiteCollectionImages do not have the problem.

If you don't have this combination of circumstances, you likely don't have the problem. For those who do, we're now going to look closer at what's going on, before concluding with how we can work around the issue at the end.

Drilling deeper

For a site which does have the above combination of circumstances, we can see the issue with Fiddler - as an anonymous user browsing to a page I've already visited, I see a stack of 304s meaning the browser is re-requesting all these files:

BlobCachingDisabled_304s

However, if I'm authenticated and I navigate to the same page, I only see the HTTP 200 for the actual page, no 304s:

BlobCachingEnabled_No304s

Hence we can conclude it works fine for authenticated users but not for anonymous users.

So what can we do for our poor anonymous users (who might be in the majority) if we're storing files in the problematic libraries? Well, here's where I draw a blank unfortunately. Optimizing Office SharePoint Server for WAN environments on TechNet has this to say on the matter:

Some lists don't work by default for anonymous users. If there are anonymous users accessing the site, permissions need to be manually configured for the following lists in order to have items within them cached:

  • Master Page Gallery
  • Style Library

Aha! So we need to change some permissions - fine. This seems to indicate that it is, in fact, possible to get the correct cache headers added to files served from these locations. Unfortunately, I simply cannot find what permissions need to be changed, and nobody on the internet (including the TechNet article) seems to detail what. The only logical setting is the Anonymous Access options for the list - these are all clear by default, but adding the 'View Items' permission (as shown below) does not change anything:

AnonPermissions

As a sidenote, the setting above is (I believe) effectively granting read permissions to the identity which is used for anonymous access to the associated IIS site. So in IIS 7.0, I'm fairly sure you'd achieve the same thing by doing this:

AddPermsIUsr

So the problem does not go away when anonymous users are granted the 'View Items' permission, and what I find interesting about this is that a closer look with Fiddler reveals some inconsistencies. The image below shows me browsing to a page anonymously for the first time, and to save you the hassle we can derive the following findings:

  • Files served from the 'SiteCollectionImages' library are given the correct max-age header (perhaps expected, since not one of the known 'problem libraries' e.g. Style Library)
  • Files served from the '_layouts' folder are given a different max-age header (expected, settings from the IIS site are used here)
  • Some files in the Style Library are in fact given the correct max-age header! (not expected)

MixedHeaders_Anonymous

So the 2 questions which strike me here are:

  • Why are some files being served from 'Style Library' with the correct header when most aren't?
  • Why can SharePoint add the 'max-age' header to files in the 'SiteCollectionImages' library but not the 'Style Library'?

The first one is a mystery to me - it's perhaps not too important, but I can't work it out. The second one might be down to how the libraries are provisioned - the 'Style Library' is provisioned by declarative XML in the 'PublishingResources' Feature, whereas the 'SiteCollectionImages' library is provisioned in code using the same Feature's activation receiver. Could this be the key factor? I don't know, but I'd certainly be interested if anyone can put me straight - either on this or the mystery "permissions change" required to make BLOB caching deal with libraries such as the 'Style Library'.

Conclusion

The key takeaway here is that for sites which want to take advantage of browser caching for static files (for performance reasons) and have anonymous users, we need to be careful where we put our images/CSS/JS files as per Mike Hodnick's general message. If we want to use the author-centric approach and store things in SharePoint libraries, we need to consider which libraries we use, and test whether we will have the 304 problem. Alternatively, we can choose to store these files on the filesystem (the developer-centric approach) and use a virtual directory with the appropriate cacheability settings to suit our needs. My suggestion would be to use a custom virtual directory for full control of this, since the default settings on the '_layouts' directory ("cache for 1 year") are unlikely to be appropriate.

Monday, 4 May 2009

Fix to my Language Store framework for multi-lingual sites

In my last post, I talked about a fix to my Config Store framework for an issue which manifested itself on certain SharePoint builds, with Windows 2008 and a recent cumulative update seeming to be the trigger. Some of you may know that I produced a sister project to this one called the 'Language Store', which is designed to help build multi-lingual SharePoint sites - since this framework is built off the same underlying XML and plumbing, this solution was also affected.  So this post is just a short one to say that the fix has now been applied to the Language Store framework, and the new version is now available on Codeplex at http://splanguagestore.codeplex.com.

Problem/solution

The problem was effectively that items in the SharePoint list could no longer be edited - well, in fact they could be updated using code, but the list form .aspx pages were not showing the fields correctly so items couldn't be edited in the UI. Since it kind of defeats the point of SharePoint to have to write code to update list items (!), this was a big issue on affected builds. Interestingly some users reported working around the issue by removing/re-adding the content type from the list in the browser, but happily this is no longer necessary since the root issue has now been resolved. The problem was traced to some incorrect XML in my FieldRef elements - see the last post Fix to my Config Store framework and list provisioning tips for the full info.

General recap - the Language Store

If you're still reading, I figure some folks would welcome a reminder/intro on what the Language Store actually does - it's not about replacing SharePoint's variations functionality which is commonly used on multi-lingual sites. I noticed Spence gave it a better name in an e-mail recently where he described it as a 'term store' for multi-lingual sites - this actually captures what it does far better than my name for it. Effectively the idea is to provide a framework for the many small strings of text which are not part of authored page content which need to be translated and displayed in the appropriate language. As an example, here is a page from the BBC site where I've highlighted all the strings which may need to be translated but which don't belong to a particular page:

BBCExample

There are many of these in a typical multi-lingual site, and to help deal with this requirement the Language Store framework provides the following:

  • SharePoint list/content type/site columns etc.
  • API to retrieve items with a single line of code
  • Granular caching for high-performance
  • Packaged as a .wsp for simple deployment
  • All source code/XML freely available

If you want to find out more, see Building multi-lingual SharePoint sites - introducing the Language Store. The solution can be downloaded from the Codeplex site at http://splanguagestore.codeplex.com.

Apologies to existing users who were affected by the issue.

Thursday, 23 April 2009

Fix to my Config Store framework and list provisioning tips

Had a couple of reports recently of an issue with my Config Store solution, which provides a framework for using a SharePoint list to store configuration values. If you're using the Config Store this article will definitely be of interest to you, but I've also picked up a couple of general tips on list provisioning which I want to pass on. I have to thank Richard Browne (no blog) of my old company cScape, as the fix and several of the tips have come from him - as well as alerting me to the problem, he also managed to fix it before I did, so many thanks and much kudos mate :-)

Config Store problem

Under some circumstances, fields in the Config Store list were not editable because they no longer appeared on the list edit form (EditForm.aspx). So instead of having 4 editable fields, only the 'Config name' field shows in the form:

ConfigStoreMissingFields

I've not fully worked out the pattern, but I think the problem may only appear if you provision the list on a server which has the October or December Cumulative Update installed - either that or it's a difference between Windows 2003 and Windows 2008 environments (which would be even more bizarre). Either way, it seems something changed in the way the provisioning XML was handled somewhere. This is why the problem was undetected in the earlier releases.

I had seen this problem before - but only when the list was moved using Content Deployment (e.g. using the Content Deployment Wizard) - the original 'source' list was always fine. We managed to work around this by writing some code which 're-added' the fields to the list from the content type, since they were always actually present on the content type and the data was still correctly stored. Having to run this code every time we deployed the list was an irritation rather than critical, but something I wanted to get to the bottom of - however, finding that some folks were running into this in 'normal' use meant that it became a bigger issue.

The cause

I always knew the problem would be down to a mistake in the provisioning XML, but since I'd looked for it on previous occasions I knew it was something I was looking at but not seeing. In my case, Richard spotted that I was using the wrong value in my FieldRef elements under the ContentType element - I was mistakenly thinking that the 'Name' attribute needed to match up with the 'StaticName' attribute given to the field; the documentation says this attribute contains the internal name of the field. So my FieldRefs looked like this:

<ContentType ID="0x0100E3438B2389F84cc3965600BC16BF32E7" Name="Config item" 
Group="Config Store content types" Description="Represents an item in the config store." Version="0">
<FieldRefs>
<FieldRef ID="{33F5C8B4-A6BB-41a4-AB24-69F2152974C5}" Name="ConfigCategory" Required="TRUE" />
<FieldRef ID="{BD413479-48AB-41f5-8040-918F32EBBCC5}" Name="ConfigValue" Required="TRUE" />
<FieldRef ID="{84D42C64-D0BD-4c76-8ED3-0A9E0D261111}" Name="ConfigItemDescription" />
</FieldRefs>
</ContentType>

..to match up with fields which looked like this:

<Field ID="{33F5C8B4-A6BB-41a4-AB24-69F2152974C5}"
Name="Config category"
DisplayName="Config category"

StaticName="ConfigCategory"
....
....
/>

The CORRECTED version looks like this (note the change in value for the Name attribute of FieldRefs):


<ContentType ID="0x0100E3438B2389F84cc3965600BC16BF32E7" Name="Config item"
Group="Config Store content types" Description="Represents an item in the config store." Version="0">
<FieldRefs>
<FieldRef ID="{33F5C8B4-A6BB-41a4-AB24-69F2152974C5}" Name="Config category" Required="TRUE" />
<FieldRef ID="{BD413479-48AB-41f5-8040-918F32EBBCC5}" Name="Config value" Required="TRUE" />
<FieldRef ID="{84D42C64-D0BD-4c76-8ED3-0A9E0D261111}" Name="Config item description" />
</FieldRefs>
</ContentType>

So, the main learning I got from this is to remember that the 'Name' attribute of the FieldRef element needs to match the 'Name' attribute of the Field element - that simple. Why did it work before? No idea unfortunately.
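Since this kind of mistake is easy to look straight past, a small script can do the comparison instead. The Python sketch below is only an illustration: it assumes the Field and ContentType declarations live in one XML document with no XML namespace (real element manifests declare the SharePoint namespace, which would need handling), and the function name and trimmed sample XML are my own. It pairs each FieldRef with its Field by ID and reports any Name that differs.

```python
import xml.etree.ElementTree as ET

# Trimmed sample reproducing the original mistake: the FieldRef uses the
# field's StaticName rather than its Name.
SAMPLE_ELEMENTS = """
<Elements>
  <Field ID="{33F5C8B4-A6BB-41a4-AB24-69F2152974C5}" Name="Config category"
         DisplayName="Config category" StaticName="ConfigCategory" />
  <ContentType ID="0x0100E3438B2389F84cc3965600BC16BF32E7" Name="Config item">
    <FieldRefs>
      <FieldRef ID="{33F5C8B4-A6BB-41a4-AB24-69F2152974C5}" Name="ConfigCategory" Required="TRUE" />
    </FieldRefs>
  </ContentType>
</Elements>
"""


def find_fieldref_name_mismatches(elements_xml):
    """Return (fieldref_name, field_name) pairs where the two Names differ."""
    root = ET.fromstring(elements_xml.strip())
    # Index Field names by ID; GUIDs are compared case-insensitively.
    field_names = {
        field.get("ID").lower(): field.get("Name")
        for field in root.iter("Field")
        if field.get("ID")
    }
    mismatches = []
    for ref in root.iter("FieldRef"):
        expected = field_names.get((ref.get("ID") or "").lower())
        if expected is not None and ref.get("Name") != expected:
            mismatches.append((ref.get("Name"), expected))
    return mismatches


print(find_fieldref_name_mismatches(SAMPLE_ELEMENTS))
```

Run against the corrected XML, the same function returns an empty list.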

However, I also picked up a few more things I didn't know about, partly from Richard (this guy needs a blog!) and partly from some other reading/experimenting..

Some handy things to know about list provisioning

  • To make a field mandatory on a list, the 'Required' attribute must be 'TRUE'. Not 'True' or 'true' - this is one of the cases where the provisioning framework is pernickety about that 6-choice boolean ;-)
  • FieldRefs need an ID and Name as a minimum (which must match the values in the 'Field' declaration), but you can override certain other things here like the DisplayName - this mirrors what is possible in the UI.
  • You don't have to include the list .aspx files (DispForm.aspx, EditForm.aspx and NewForm.aspx) in your Feature if you use the 'SetupPath' attribute in the 'Form' element in schema.xml (assuming you don't need to associate custom list forms).
  • You can use the 'ContentTypeRef' element to associate your content type with the list (specify just content type ID), rather than using the 'ContentType' element which needs to redeclare all the FieldRefs.
  • It's safe to remove all the default 'system' fields from the 'Fields' section of schema.xml
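
The first tip in this list can be checked mechanically too. This is a hedged Python sketch rather than an official validation rule: it simply flags any 'Required' attribute whose value is 'true' in some casing other than the exact upper-case 'TRUE'. The function name and the sample fragment (which deliberately contains one wrong value) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample fragment: the second FieldRef uses lower-case 'true' on purpose.
SAMPLE_FIELDREFS = """
<FieldRefs>
  <FieldRef ID="{33F5C8B4-A6BB-41a4-AB24-69F2152974C5}" Name="Config category" Required="TRUE" />
  <FieldRef ID="{BD413479-48AB-41f5-8040-918F32EBBCC5}" Name="Config value" Required="true" />
  <FieldRef ID="{84D42C64-D0BD-4c76-8ED3-0A9E0D261111}" Name="Config item description" />
</FieldRefs>
"""


def find_suspect_required_values(xml_text):
    """Return (Name, Required) pairs where Required is 'true' but not exactly 'TRUE'."""
    root = ET.fromstring(xml_text.strip())
    suspects = []
    for element in root.iter():
        value = element.get("Required")
        if value is not None and value.upper() == "TRUE" and value != "TRUE":
            suspects.append((element.get("Name"), value))
    return suspects


print(find_suspect_required_values(SAMPLE_FIELDREFS))
```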

Going further than these tips, the best thing I found on this is Oskar Austegard's MOSS: The dreaded schema.xml which shows how you can strip a ton of stuff out of schema.xml. I've not tried it yet, but I'm sure that will be my starting point for the next list I provision declaratively. If you're interested in the nuts and bolts of list provisioning, I highly recommend you read it.

Happy XML'ing..

Tuesday, 14 April 2009

Slide deck from my deployment talk at Best Practices Conference

Had a great time presenting at the European SharePoint Best Practices Conference last week. I've been trying to put my finger on what made it such a good conference and I'm actually not sure, but I notice that other speakers and attendees have also been full of praise, so it's not just me. The event itself was extremely well-organized with excellent content, and Steve Smith and his team did a great job of looking after us speakers.

Highlights for me on the dev track were sessions from AC, Todd Bleeker, Eric (or "Uncle Eric" as I like to think of him, with his wise words on high-performance coding :-)) and Andrew Woody, but whenever I did stray from developer content I seemed to run into a great session like Mike Watson's on SQL Server in relation to SharePoint. Similarly I heard good things about speakers like Dan McPherson doing innovative sessions on the Information Worker track which I was disappointed to miss. [UPDATE: Here's a gratuitous shot of me in my session:]

COB_BestPracticesTalk_2

Another highlight was being on the two dev panel sessions we did, and having an interesting debate in one of them with Todd on approaches for provisioning - declarative (Features) vs. programmatic (code/PowerShell etc.). This was probably a good lead-in to my talk the next day, and some folks came up to say they really liked this conversation and that we covered it from angles they hadn't considered, which was good to hear. [UPDATE: Photo below of the second session, chaired by AC and with (from left to right) Todd Bleeker, Stacy Draper, Maurice Prather, Andrew Woodward, Ben Robb, Brett Lonsdale, me (with the mic) and Eric Shupps:]

DevPanel2

So all in all, a top conference, and fantastic to catch up with so many friends. Here's the link for my deck:

Slide deck - Approaches and best practices for deploying SharePoint sites through multiple environments (dev, QA, UAT, production)

SBP

Thursday, 26 March 2009

Command-line support for Content Deployment Wizard now available

I'm pleased to announce I've now completed initial development on the next version of the Content Deployment Wizard - this is a beta release for the next few weeks so if you need it to "just work", you should continue to use the previous version (1.1), but I'm hoping some people out there are happy to test this beta. The tool has become fairly popular as a 'handy tool to have in the SharePoint toolbox', and hopefully this release extends its usefulness significantly for some scenarios. If you're not familiar with the tool, it provides a way to import/export site collections, webs, lists, and files or list items, either between farms or between different sites in the same farm - the Codeplex site has more details. As previously mentioned, the key new additional functionality in this release is:

  • Command-line support
  • Support for saving of import/export settings to a file (in the Windows Forms app) for later re-use
  • An installer

Having command-line support for the Wizard means that it can now be used in an automated way. Some key scenarios I think this might be useful in are:

  • Continuous integration/automated builds - if your site relies on SharePoint content, you can now move 'real' data as part of a build process, copying selected content from 'dev' to 'build' or 'test' for example. I often see static data (perhaps from an XML file or Excel spreadsheet) used in this way in nAnt/CruiseControl/MSBuild scripts, but for frequently changing data (config values, lookup lists etc.), this doesn't work so well as there is always a static file to maintain separately. 
  • Deployment scripts - if you have deployment scripts to 'bootstrap' a website on developer machines, again pulling real data from a central 'repository site' can help here.
  • As part of a production 'Content Deployment strategy' - since out-of-the-box Content Deployment is restricted to deploying a web as the smallest item, the Wizard could be used to deploy selected lists/list items/files

Obviously you might have your own ideas about where it could slot into your processes too.

How it works

  1. First, we select the content to move as we would normally using the Wizard..

    SelectExportItems
  2. ..and select the options we want to use for this export..

    SelectExportSettings 

  3. On the final screen, the new 'Save settings..' button should be used to save your selections to an XML file: 

    SaveSettingsButton  
  4. This will then give you an XML file which looks like this:

    <ExportSettings SiteUrl="http://cob.publish.dev" ExcludeDependencies="False" ExportMethod="ExportAll" 
                    IncludeVersions="LastMajor" IncludeSecurity="None" FileLocation="C:\Exports" 
                    BaseFileName="BlogSubwebAndPageTemplates.cmp">
      <ExportObjects>
        <DeploymentObject Id="b0fd667b-5b5e-41ba-827e-5d78b9a150ac" Title="Blog" Url="http://cob.publish.dev/Blog" Type="Web" IncludeDescendants="All" />
        <DeploymentObject Id="cfcc048e-c516-43b2-b5bf-3fb37cd561be" Title="http://cob.publish.dev/_catalogs/masterpage/COB.master" Url="_catalogs/masterpage/COB.master" Type="File" IncludeDescendants="None" />
        <DeploymentObject Id="670c1fb3-12f3-418b-b854-751ba80da917" Title="http://cob.publish.dev/_catalogs/masterpage/COBLayoutSimple.aspx" Url="_catalogs/masterpage/COBLayoutSimple.aspx" Type="File" IncludeDescendants="None" />
      </ExportObjects>
    </ExportSettings>

  5. So we now have an XML 'Wizard deployment settings file' which has the IDs of the objects we selected and the export options. We'll go ahead and show how this can be used at the command-line, but note these settings can also be loaded into the Wizard UI on future deployments to save having to make the selections again - the key is the 'Load settings..' button on the first page (which we didn't show earlier):

    LoadSettingsButton 

  6. For command-line use of the Wizard a custom STSADM command is used. We pass the settings file in using the -settingsFile switch. To run the export operation we showed above, our command would look like:
    stsadm -o RunWizardExport -settingsFile "C:\DeploymentSettings\ExportBlogSubwebAndTemplates.xml" -quiet
    The -quiet parameter is optional, and suppresses some of the progress messages which are returned during the operation.

  7. For an import operation, we follow the same process - go through the Wizard and select the settings for the import operation, then click 'Save settings..' at the end to get the file (N.B. the 'Import settings' screen has been simplified slightly from previous versions):

    SelectImportSettings
  8. The command to import looks like this:
    stsadm -o RunWizardImport -settingsFile "C:\DeploymentSettings\ImportBlogSubwebAndTemplates.xml" -quiet
    So that's both sides of it.
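
If you want to drive these commands from a build script, something like the following Python sketch could be a starting point. The operation names and switches are the ones shown in the steps above; the helper functions and the trimmed sample settings are hypothetical, and the sketch only builds the command line and summarises a settings file, it does not run STSADM.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the settings file shown in step 4 (raw string because of the backslashes).
SAMPLE_SETTINGS = r"""
<ExportSettings SiteUrl="http://cob.publish.dev" ExcludeDependencies="False" ExportMethod="ExportAll"
                IncludeVersions="LastMajor" IncludeSecurity="None" FileLocation="C:\Exports"
                BaseFileName="BlogSubwebAndPageTemplates.cmp">
  <ExportObjects>
    <DeploymentObject Id="b0fd667b-5b5e-41ba-827e-5d78b9a150ac" Title="Blog" Url="http://cob.publish.dev/Blog" Type="Web" IncludeDescendants="All" />
    <DeploymentObject Id="cfcc048e-c516-43b2-b5bf-3fb37cd561be" Title="http://cob.publish.dev/_catalogs/masterpage/COB.master" Url="_catalogs/masterpage/COB.master" Type="File" IncludeDescendants="None" />
  </ExportObjects>
</ExportSettings>
"""


def describe_settings(settings_xml):
    """Return (Type, Url) for each object selected in a Wizard settings file."""
    root = ET.fromstring(settings_xml.strip())
    return [(obj.get("Type"), obj.get("Url")) for obj in root.iter("DeploymentObject")]


def build_wizard_command(operation, settings_file, quiet=True):
    """Build the STSADM argument list for RunWizardExport or RunWizardImport."""
    if operation not in ("RunWizardExport", "RunWizardImport"):
        raise ValueError("unknown Wizard operation: " + operation)
    command = ["stsadm", "-o", operation, "-settingsFile", settings_file]
    if quiet:
        command.append("-quiet")  # optional: suppresses some progress messages
    return command


print(describe_settings(SAMPLE_SETTINGS))
print(build_wizard_command("RunWizardExport", r"C:\DeploymentSettings\ExportBlogSubwebAndTemplates.xml"))
```

The returned list can be passed straight to subprocess.run on a server where STSADM is on the path.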

Using it for real

In real use of course, you may be deploying from one SharePoint farm to another. In this case, you also need to copy the .cmp file from the source environment to the target - if you have network access between farms (e.g. you're using it internally for automated builds/CI), a simple XCOPY in your scripts is the recommended way to do this. For production Content Deployment scenarios with no network connectivity, what I'm providing here will need to be supplemented with something else which will deal with the file transport. Clearly something web service based could be the answer.
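
For the 'network access between farms' case, the source-side half of the process might be scripted roughly as below. This is a sketch under several assumptions: Python's shutil.copy2 stands in for the XCOPY mentioned above, the export is assumed to produce a single .cmp file at a path you already know, the function name is hypothetical, and the import still has to be run on the target farm using the RunWizardImport command.

```python
import shutil
import subprocess
from pathlib import Path


def export_and_ship(export_settings, exported_cmp, target_dir, run=subprocess.run):
    """Run the Wizard export, then copy the resulting .cmp file to the target.

    'run' is injectable so the function can be exercised without STSADM installed.
    """
    run(
        ["stsadm", "-o", "RunWizardExport", "-settingsFile", export_settings, "-quiet"],
        check=True,  # with subprocess.run, a non-zero exit code raises an error
    )
    # Assumption: target_dir is a folder (e.g. a network share) the target farm can read.
    copied_to = shutil.copy2(exported_cmp, target_dir)
    return Path(copied_to)
```

A call would pass the settings file from the export step, the .cmp path implied by its FileLocation and BaseFileName values, and the destination folder.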

Summary

Using the Wizard at the command-line may prove extremely useful if you need to move any SharePoint content regularly in an automated way. In contrast with other ways you might approach this, the XML definition file allows you to choose any number of webs/lists/list items/files to move in one operation, which may suit your needs better than shipping items around separately.

This is very much a beta release, but as a sidenote I'm expecting the initial issues to mainly be around the installer rather than core code - hence I'm providing a 'manual' install procedure which will get you past any such issues (see the readme). Needless to say, all the source code is also there for you on Codeplex if you're a developer happy to get your hands dirty. As I say, I'm hoping a couple of friendly testers will try it out and help me iron out the wrinkles - please submit any issues to the Codeplex site linked to below.

You can download the 2.0 beta release of the Wizard (and source code) from:

Monday, 9 March 2009

Update on next version of Content Deployment Wizard

Generally I only ever talk about SharePoint tools I'm working on once they're 100% complete and ready for use, but recently I had a conversation with someone at a user group which made me think about a policy change. Regular readers will know the main tool I'm associated with is the SharePoint Content Deployment Wizard which has become fairly popular (over 7000 downloads) - occasionally I've mentioned that one goal was to implement a command-line version, since this opens up all sorts of deployment possibilities. However I've not talked about this for a while, and just recently I've spoken to a couple of people who assumed I dropped this/didn't have the time to look at it, so here I am to tell you this is not the case!

For anybody that cares, the good news is I've actually been working on this since December interspersed with blogging, and am nearly done. The yicky refactoring work is complete, and I got chance to write the custom STSADM command on the front of it on the flight to the MVP summit last week. I need to do more testing first, but I'm hoping to release a beta to Codeplex over the next couple of weeks - if you're interested in the idea of scripted deployment of specific sites/webs/lists/list items between sites or farms (remember MOSS Content Deployment only does sites/webs and requires HTTP(S) connectivity), I'm hoping some friendly beta testers will help me screw the last bits down. The key aspects of this release are:

  • Command-line support
  • Support for saving of import/export settings to a file (in the Windows Forms app) for later re-use

Shortly after this release, I'm hoping to add support for incremental deployments (so only the content which has actually changed in the sites/webs/lists you select will be deployed), but that's not going to make it into this next cut unfortunately.

Keep tuned for further updates :-)

Other stuff

Whilst I'm at it, other things in the pipeline from me include:

Needless to say, there are plenty of other blog articles on my 'ideas list' too.

Sidenote - reflecting on 2 years of SharePoint blogging

Bizarrely, I'm into my 3rd year of SharePoint blogging now. I've no idea how this happened. Having done some interesting work with SharePoint's Feature framework, the initial idea was to write 4 or 5 articles I had material for - as a record for myself more than anything - and be done with it. Since then, although I do write the odd 'easy' post (like this one), generally my articles seem to take a long time to get completed, but I know they could be better. Occasionally I get reminded of this! So there's a long way to go for me to become a better blogger, but I'm fully hoping to still be at it in another 2 years' time - and I'll have plenty more to say when the next version of SharePoint approaches :-)

Tuesday, 24 February 2009

UK user group meeting in London this Thursday, with Q & A panel

Just a quick note to remind UK-based folks within reach that there is a UK SharePoint user group meeting in London this Thursday. There are two sessions, one of which is an open Q & A for you to bring your trickiest SharePoint questions - I'll be amongst those on the panel representing the developer side of the house, but the line-up will cover all the bases. Needless to say, if you don't get the chance to ask your question during the main session, there'll probably be ample opportunity in the pub afterwards. Michael Noel's session also looks extremely interesting, with a whole host of architecture/infrastructure knowledge condensed into one easily-digestible chunk.

Details from suguk.org - to sign up, use the link at the bottom of this post:

Session 1 - Building the Perfect SharePoint Farm: A Walkthrough of Best Practices from the Field - Michael Noel (see books written by Michael)

SharePoint 2007 has proven to be a technology that is remarkably easy to get running out of the box. On the flipside, however, some of the advanced configuration options with SharePoint are notoriously difficult to set up and configure, and a great deal of confusion exists regarding SharePoint best practice design, deployment, disaster recovery, and maintenance. This session covers best practices encompassing the most commonly asked questions regarding SharePoint infrastructure and design, and includes a broad range of critical but often overlooked items to consider when architecting a SharePoint environment. In short, all of the specifics required to build the 'perfect' SharePoint farm are presented through discussion of real-world SharePoint designs of all sizes.
• Learn from previous real world deployments and avoid common mistakes.
• Plan a checklist for architecture of SharePoint environments of any size.
• Build the 'perfect' SharePoint farm for your organization.

Session 2 - SharePoint Q & A Session

Following the session from last year we thought it would be a good idea to have a session where you can bring your SharePoint problems and hassles and we can debate them as a group. We'll have a whiteboard, a laptop, and lots of clever people to discuss your questions and issues - so bring along your best and toughest!

The meeting is hosted at Microsoft in Victoria - arrive 6pm for a 6:30pm start:

Microsoft London (Cardinal Place)
100 Victoria Street
London SW1E 5JL
Tel: 0870 60 10 100

To register, simply reply to this thread leaving your full name - http://suguk.org/forums/thread/16904.aspx

Look forward to your questions :-)

Tuesday, 10 February 2009

Extending the web part framework - part 2

In part 1, I showed how we implemented a 'toolbox' of page templates and functionality modules wrapped up in a governance framework, to fulfil our client's requirement of a flexible WCM platform for building 80-100 internet sites with varying requirements. In this post, I want to detail some of the issues we ran into and the resolutions we found, focusing primarily on the 'module framework' we developed, which is heavily oriented around SharePoint web parts.

Quick recap

The client is a large multi-national enterprise, and the idea is that content authoring teams in 80-100 countries will take what we've delivered on MOSS to create their country's internet presence e.g. .com, .co.uk, .fr, .es etc., replacing the existing mish-mash of sites on different technologies with inconsistent branding/look and feel.

In terms of the module framework, the cornerstones of our implementation were (see part 1 for more complete details on these):

  1. Module matrix - rules for which module can be used where, to guide authors away from building a user experience which doesn't 'make sense'
  2. SmartPart-like approach, but with web part properties - web parts wrapping user controls but also supporting web part properties exposed in custom tool parts
  3. Base web part/base tool part class - responsible for 'framework' behaviour such as checking if the current web part can be added (according to the module matrix)
  4. Combine interface of publishing field controls with web part storage - since publishing field controls (e.g. RichHtmlField) must be added in a 'static' manner at design-time but our authors can add controls dynamically at run-time, we developed custom controls which combine the rich functionality of the publishing HTML editor with web part storage
  5. Control adapter for WebPartZone for accessibility compliance - to get round the problem of all the HTML tables generated by SharePoint's web part framework, which will prevent a site validating for AA
  6. Present only our web parts in the web part picker - since standard SharePoint web parts are not used anywhere in these sites
  7. Remove unnecessary options when editing web part properties (tool parts) - to avoid confusing the authors

Issues and resolutions

I think that many of the challenges we faced are worth sharing as they came about through general web part development, rather than anything specific to what we did. Before I detail the actual gotchas, take note of some key development characteristics of our project:

  • Solutions and features used to deploy artifacts such as page layouts, content types etc.
  • Kivati Studio used for some other deployment aspects
  • Main functionality implemented in user controls - web parts were effectively thin wrappers around the .ascx files using LoadControl()
  • Web parts which are 'mandatory' are added to pages using the AllUsersWebPart element in a feature (though as the points probably illustrate, we looked at numerous ways of dealing with this)

Finding #1 - web parts outside of web part zones cannot be edited

The reason we wanted to have web parts outside of zones (perfectly possible by dragging a web part directly into page layout markup in SharePoint Designer) is for 'fixed' page modules which could not be removed by the content author. When we placed web parts outside of web part zones, we found the web parts would run fine in presentation mode but unfortunately could not be edited (e.g. to change web part properties) - the edit menu for the web part simply did not appear. I speculate this is because it is web part zones which are linked to web part storage, and thus web part properties cannot be persisted without a zone (the values in the markup will always be used). Hence, if you want editable web parts, you need web part zones.

Resolution - ensure all web parts (even ones which cannot be removed) live in a web part zone.

Finding #2 - embedding web parts into user control markup appears to be problematic

We tested various permutations of using web parts in/out of web part zones, and also with the HTML markup directly in the page layout .aspx or in a child .ascx file. After establishing that web part zones were required, we also found that whether the markup was in the .aspx or .ascx appeared to make a difference. This was unexpected, but the net effect seems to be that if you insert the web part markup into a web part zone which is in a user control rather than directly in the page layout .aspx (i.e. by refactoring the HTML markup for the web part zone and its contents into a user control), again the edit menu will not display. I'm not sure why this is, but it could be related to the page execution lifecycle.

Resolution - accept that if web part zones will have web parts added to them at design-time by markup, the web part zone declaration cannot be in a user control.

Finding #3 - when using AllUsersWebPart element, duplicate web parts appear if the feature containing your page layouts is reactivated

We decided our 'fixed' web parts would be added to pages using the AllUsersWebPart feature element. (N.B. Using this approach, 'default' web parts are associated with page layouts in the feature which deploys them. Web part zones are left empty on the page layout, and SharePoint provisions the web part into the zone at the time of creating a page from the layout.) The issue we had with this is that all the web parts in all the zones in existing pages would be duplicated if the page layout feature was reactivated - this is because this XML is used both when the feature is activated (in the same way as, say, provisioning for content types happens on activation) and when new pages are created from a page layout.

Resolution - write a script (a Kivati task in our case) to remove duplicate web parts across all sites

[UPDATE - Waldek has an elegant solution to this problem in 'Preventing provisioning duplicate Web Part instances on Feature reactivation', as well as sample code similar to what we wrote for our script. DOH!]
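For anyone wondering what the 'find the duplicates' part of such a script boils down to, here is a minimal sketch in Python. It models only the decision logic, not the SharePoint API - in a real script you would enumerate and delete the web parts through the SharePoint object model (e.g. SPLimitedWebPartManager). The WebPartInstance type is hypothetical, and treating 'same zone, same type, same title' as the definition of a duplicate is my assumption for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WebPartInstance:
    """Hypothetical stand-in for a web part on a page (not a SharePoint type)."""
    storage_key: str  # unique id of this particular instance
    zone_id: str
    type_name: str
    title: str


def find_duplicates(web_parts):
    """Return the instances to delete, keeping the first of each (zone, type, title) group."""
    seen = set()
    duplicates = []
    for part in web_parts:
        signature = (part.zone_id, part.type_name, part.title)
        if signature in seen:
            duplicates.append(part)
        else:
            seen.add(signature)
    return duplicates


# Example page where the feature reactivation has added a second 'Hero' web part.
page = [
    WebPartInstance("a1", "Zone 1", "HeroModule", "Hero"),
    WebPartInstance("b2", "Zone 1", "HeroModule", "Hero"),
    WebPartInstance("c3", "Zone 2", "PromoModule", "Promo"),
]
to_delete = find_duplicates(page)
print([part.storage_key for part in to_delete])  # ['b2']
```

Keeping the first instance matters if authors have already configured it - you would want to check which copy holds the edited properties before deleting anything on a live site.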

Finding #4 - duplicate web parts can also appear when the page layout is customized (unghosted)

I'm not exactly clear on the reasons why customized files would ever cause duplicate web parts to appear, but that's certainly what we seemed to find. What happened is that we would deploy our master pages/page layouts using a feature to our QA environment, but immediately these files would be provisioned in that site as customized (i.e. with the content stored in the content database), instead of being uncustomized and referenced on the filesystem. After further investigation, we traced the cause of this unexpected behaviour to the use of these attributes SPD adds to page layouts:

meta:progid="SharePoint.WebPartPage.Document" meta:webpartpageexpansion="full"

Resolution - ensure the version of the file does not contain these attributes. We actually switched to running uncustomized master page/page layouts even in our development farm. This means that we deployed the files using a feature and thereafter never opened them in SPD (editing only the source-controlled feature file instead).
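If you want to guard against these attributes creeping back into source control, a small clean-up step can help. The following is a rough Python sketch rather than anything we actually used - it assumes the attributes are written with straight double quotes in the file, and the sample page directive is made up for illustration:

```python
import re

# The two attributes SharePoint Designer adds (as named in this post).
SPD_ATTRIBUTES = ("meta:progid", "meta:webpartpageexpansion")

# Matches optional leading whitespace, the attribute name, and its double-quoted value.
_SPD_ATTRIBUTE_PATTERN = re.compile(
    r'\s*(?:' + "|".join(re.escape(name) for name in SPD_ATTRIBUTES) + r')="[^"]*"',
    re.IGNORECASE,
)


def strip_spd_attributes(markup: str) -> str:
    """Return the markup with the SPD-added attributes removed."""
    return _SPD_ATTRIBUTE_PATTERN.sub("", markup)


# Hypothetical page directive, for illustration only.
sample = (
    '<%@ Page language="C#" '
    'meta:progid="SharePoint.WebPartPage.Document" '
    'meta:webpartpageexpansion="full" %>'
)
cleaned = strip_spd_attributes(sample)
print(cleaned)  # <%@ Page language="C#" %>
```

Run over the page layout files before packaging, something like this would flag (or fix) any file that had been opened in SPD by mistake.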

Finding #5 - avoid setting default properties in the web part definition file (.webpart)

A final lesson we learnt is that, when working with web parts, it's often better to avoid using the .webpart definition file extensively for setting default property values. There's nothing wrong with the mechanism - effectively these values are read whenever the web part is provisioned on a page, and your instance will set its properties to these values. The problem, of course, is when you realize a property value you defined in the .webpart file needs to be updated because something changed. What happens to all the existing instances on pages around your site? As you might guess, the answer is nothing - unless you take steps to update those also, which generally means writing some kind of script to use SPLimitedWebPartManager. This can be pretty inconvenient when all you wanted to do was quickly change a default value.

Resolution - consider ensuring .webpart files are stripped to the bare minimum (assembly name etc.) and configuration comes from somewhere else. We typically rolled these config items into our use of the Config Store.
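The difference is easiest to see with a toy example. In this Python sketch (the ConfigStore and PromoModule classes are hypothetical and are not the real Config Store API), the module holds no default values of its own and looks the setting up each time it renders, so a single change in the central store reaches every existing instance:

```python
class ConfigStore:
    """Hypothetical central store of settings, keyed by (category, key)."""

    def __init__(self, values):
        self._values = dict(values)

    def get(self, category, key):
        return self._values[(category, key)]

    def set(self, category, key, value):
        self._values[(category, key)] = value


class PromoModule:
    """Hypothetical module that holds no default values of its own."""

    def __init__(self, store):
        self._store = store

    def render(self):
        # Looked up on every render, so a change in the store reaches every instance.
        max_items = self._store.get("PromoModule", "MaxItems")
        return f"Showing up to {max_items} items"


store = ConfigStore({("PromoModule", "MaxItems"): "5"})
instances = [PromoModule(store) for _ in range(3)]

# One central change, no script needed to touch each instance.
store.set("PromoModule", "MaxItems", "8")
print({module.render() for module in instances})  # {'Showing up to 8 items'}
```

Had the value been copied into each instance when it was provisioned (which is effectively what the .webpart file does), all three would still say 5.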


Summary

We ran into a few unexpected gotchas when building on the web part framework, but steps can be taken to minimise their impact. Hope you find these useful if you do web part development. Special thanks to Karoly Szalkary for helping to refresh my memory on some of these!


P.S. After 2 years writing about it, I've decided I no longer need to capitalize the 'f' in 'feature' - I think we're all on the same page on that one now ;-)

Saturday, 31 January 2009

Extending the web part framework - part 1

Today I want to show some of the interesting things we've been doing with web parts for one of our clients. There's quite a lot to talk about so it will be over two articles:

  • Part 1 - background and implementation
  • Part 2 - issues and resolutions

There are a couple of things in particular which I think are quite cool, as we've effectively combined classic WCM (publishing) site functionality with a customized implementation of the web part framework. The context is a fairly large roll-out to an enterprise client, but what we're rolling out is a centralized platform for 80-100 internet sites. The idea is that content authoring teams in 80-100 countries will take what we've delivered on MOSS to create their own sites - replacing the existing mish-mash of sites on different technologies with inconsistent branding/look and feel.

Clearly a key challenge here is satisfying the diverse needs of so many stakeholders. So a cornerstone of the platform is that sites can be tailored somewhat, so each country has some flexibility to communicate with their audience in the way they think is best. We effectively give the authors a set of page templates and building blocks, and a system which governs how the blocks can fit together so that the user experience will still 'make sense'. Needless to say, a lot of analysis and consideration has gone into this - both in terms of what functionality was needed but also user journeys and navigation through the site, and the experience architects on our side (LBi) played a vital role here. There are many aspects to the project I could zone in on, but since I want to focus on the implementation details here, I'll briefly list some of these building block requirements before showing how we did it.

Key requirements/challenges:

In order to create the different page types, we needed around 15 page layouts, including these:

  • Home page
  • Channel hub/Alternative hub/Sub home - these are different template options for '2nd and 3rd level' pages 
  • Content page
  • Product page
  • Media release
  • List - provides links to a series of related pages
  • Etc.

And whilst some aspects of page functionality were 'fixed' on the template, there were many other items which were optional - these were to be added to pages by the authors, either in a 'web part' kind of way or perhaps something else. Some examples of these optional 'page modules' were:

  • 'Hero' feature - used to highlight something on prominent pages with an image/flash/text
  • Right-hand promo
  • Content editor module - allows an author to enter arbitrary content, but for reasons which will become clearer we developed an interesting custom control which is kind of a cross between a publishing HtmlField and a Content Editor web part (covered later)
  • Generic content module - rolls-up formatted content/links to a selected page
  • List/tabbed list - provides links to a series of related pages
  • Dynamic share price - displays latest stock price based on web service call
  • Product selector - using AJAX cascading dropdowns to filter products
  • Etc.

Although there were lots of other challenges (such as multi-lingual content, packaging/documenting every deployment aspect so the hosting company could deploy etc.!), I felt that building the 'framework' could be more challenging than individual functionality bits. To help frame what you'll read next, some initial questions we had for the implementation were:

  • How do these optional bits of functionality get added to the page? As web parts, or something else?
  • How do we achieve accessibility compliance if we use web parts?
  • How do we provide configuration if we don't use web parts?
  • How do we restrict which modules can be used where (as per the specification)?
  • Since we're in a 'flexible' publishing site, how do we determine which fields are needed on the content types? Does each content type need to have all the possible fields the author might choose to add?
  • If we are working with publishing controls, how would we bind the dynamically added control to the 'back-end' publishing field on the content type?

The implementation

As well as the optional page modules, most of the templates had a classic set of publishing fields. After looking at custom approaches, we concluded the web part framework had a lot going for it for the optional stuff - clearly we could avoid building a user interface to pick the module from a list/add to page/allow configuration of properties specific to the module, and also get drag and drop (amongst other things) as an added bonus. The concept of web part zones - as a container where one or more modules could be placed - was also important to our page structure.

Another challenge for the optional modules was where to store the data. If they were publishing fields, we would need every possible module to have a corresponding field on every possible content type, and this was pretty impractical when looking at the spec. Web parts, of course, use a different model and the framework takes care of data storage regardless of how many controls are on the page.

On the downside, a key thing to remember with web parts in publishing pages is that web part data is (by definition) not stored in publishing fields, and therefore isn't versioned in the same way. After discussing with the client, in our case this proved to not have as big an impact as we initially thought, due to the split and nature of what content would be stored in publishing fields vs. what would be web parts. So, having the client's acceptance of this trade-off, we went with web parts and came up with these solution elements:

  1. Module matrix

    This comprised two SharePoint lists which contained the 'rules matrix', to enforce the design team's specification of what functionality could be used on which page type. Effectively the data provides the mapping of modules and page layouts. Being list data, it meant that it could be easily updated by the central team if a policy change was required. This data was consumed by our base web part (point 3).

  2. SmartPart-like approach, but with web part properties

    We wanted the actual functionality of our web parts to be implemented in user controls, for the typical reason of avoiding building HTML in C# code (wrong on so many levels!). This is obviously what the SmartPart does using LoadControl(), but we had the additional requirement of needing to pass web part property values to our user controls - this meant we could use the familiar 'tool part' interface (i.e. setting web part properties in the right-hand pane) for control configuration. 

    In our model, each user control has a corresponding 'wrapper' web part/tool part which understands which properties are required and how to build the properties UI. In the web part's OnInit() method, values are passed from the web part properties to the user control so that the latter is initialized ready to do its processing.

  3. Base web part/base tool part

    All our web parts/tool parts were derived from our custom classes which abstracted some responsibilities. Since we couldn't easily change the web part picker screen to only display appropriate web parts for the zone the author had selected, we built the check into the base web part - if an 'invalid' web part was added, the web part renders nothing in presentation mode but in edit mode we display a message to the author like this:

    ModuleNotValidMessage 

    Adding too many web parts to a zone (count determined in the module matrix data) would have a similar effect.

  4. Combine interface of publishing field controls with web part storage

    Having decided to use web parts for our control architecture, we had one requirement for something similar to the standard Content Editor web part (CEWP). However, this control is pretty lame compared to the MOSS publishing HtmlField, and we quickly established our client needed more than the basic CEWP. So we combined the bits we wanted from both - the front end control used by the publishing field type (the RichHtmlField control), but the backing store of web part storage rather than a publishing field. This meant authors could add multiple instances of this optional module to their page (and get the nice editing experience), but because it's a web part we didn't need to worry about having a corresponding set of fields on each possible content type. In code/integration terms it's the same approach, but in the end we actually swapped the standard MOSS control for the control which fronts Telerik's RADEditor field since the client wanted to move to this: 

    CustomContentEditorWebPart  

    Also note use of another control typically used with publishing fields here, the AssetUrlSelector - this provides the 'Browse...' button shown above, and can be used to provide a friendly way for an author to browse to a file.

  5. Control adapter for WebPartZone for accessibility compliance

    Since web parts normally render with a stack of nested HTML tables which won't validate against AA, action needs to be taken to remedy this if accessibility is a design goal. However this isn't necessarily a big deal - the approach is that you 'correct' the HTML for the WebPartZone control in presentation mode only, thus leaving the tables intact in edit mode for all the web part editing framework stuff which needs to happen. You do lose the client-side Web Part Services Component (WPSC) API doing this, but we had no requirement for it anyway (I rarely see it used). I initially assumed I'd have to write a control adapter to do this, but I found that David Schneider has already done the job - this works fine. It's also possible the latest version of the AKS has one, can't remember if I checked.

  6. Present only our web parts in the web part picker

    Since this is a highly bespoke WCM platform rather than a standard collaboration environment, we don't want to see any of the standard web parts in the picker for these sites. Two steps to this one:

    - delete all the .webpart files from the web part galleries in the sites (N.B. we used Kivati for rolling out such changes across all the site collections - more on this in the future). However, doing this will still leave you with ListView web parts for all the lists/libraries in your site, so you also need to..
    - ensure all your WebPartZone declarations have the little-documented 'QuickAdd-ShowListsAndLibraries' property set to false:
    <WebPartPages:WebPartZone id="g_AB07678E486C46bc962DFC8446A6CD13" runat="server" title="Zone 1" QuickAdd-ShowListsAndLibraries="false" />


    Authors are then not confused by any standard web parts which aren't appropriate for our scenario:

    StrippedWebPartPicker


  7. Remove unnecessary options when editing web part properties (tool parts)

    Finally, we do a bit of work with the accompanying tool parts (for properties editing) for our web parts to avoid confusing our authors with options which won't take effect. As an example, for a web part which looks like this in presentation mode:

    ProductSelectorModule 

    The tool part looks like this:

    ProductSelectorToolPart

    In case you're wondering what to look at, it's that we've removed the standard options SharePoint would normally provide for every web part (such as chrome style etc.), since we want to control these to ensure proper formatting. Normally we'd have these sections at the bottom of the tool part:

    RemovedToolPartOptions
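To make the module matrix (point 1) and the base web part check (point 3) a little more concrete, here is a rough sketch of the rule check in Python. It is an illustration of the logic only, not our actual code - the dictionary stands in for the two SharePoint lists, and the data shape, function names and message wording are all assumptions:

```python
# Hypothetical rules data standing in for the two SharePoint lists:
# (page layout, zone) -> modules allowed there and the maximum number of web parts.
MODULE_MATRIX = {
    ("Home page", "Zone 1"): {"modules": {"HeroModule", "RightHandPromo"}, "max_parts": 2},
    ("Content page", "Zone 1"): {"modules": {"ContentEditorModule"}, "max_parts": 3},
}


def validate(layout, zone, module, position):
    """Return None if the module is valid here, else a message for the author.

    position is the zero-based index of this web part within its zone.
    """
    rule = MODULE_MATRIX.get((layout, zone))
    if rule is None or module not in rule["modules"]:
        return f"'{module}' cannot be used in {zone} of the '{layout}' layout."
    if position >= rule["max_parts"]:
        return f"{zone} of the '{layout}' layout allows at most {rule['max_parts']} modules."
    return None


def render(layout, zone, module, position, edit_mode):
    """Mimic the base web part: normal output if valid, otherwise a message in edit mode only."""
    problem = validate(layout, zone, module, position)
    if problem is None:
        return f"[{module} output]"
    return problem if edit_mode else ""


# A valid module renders normally.
print(render("Home page", "Zone 1", "HeroModule", 0, edit_mode=False))
# An invalid module renders nothing for site visitors...
print(repr(render("Home page", "Zone 1", "ProductSelector", 0, edit_mode=False)))
# ...but the author sees why when editing the page.
print(render("Home page", "Zone 1", "ProductSelector", 0, edit_mode=True))
```

Putting the check in one base class means individual web parts never need to know about the rules, and because the rules are data, a policy change doesn't need a code deployment.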

Summary

There are many ways SharePoint's web part framework can be extended, and here I'm only showing the path we followed. For a requirement such as our client's, web parts provided a great starting point, perhaps showing there can sometimes be a place for web parts in an accessible publishing site so long as the trade-offs are understood and accepted.

In part 2 of this series we'll look at issues encountered and their resolutions.