Sunday, 31 August 2008

Developer lessons learnt - SharePoint WCM in the finance sector

So in my recent SharePoint WCM in the finance sector post, I talked about what we built and why I think the result is kind of interesting. What I want to do today is share some of the technical lessons learnt, and give a sense of what worked and what didn't. As I mentioned last time, UK-based folks will hopefully be able to gain more than I can provide here when the site gets presented at the SharePoint UK user group, meaning we'll answer any question you care to come up with, not just some of the developer stuff I want to discuss today.

Now to frame all this, it's important to consider the type of project this was - the terms mean slightly different things to different people, but to me the emphasis was on 'development', as opposed to 'implementation' or 'customization'. In code terms, we ended up with the following:

  • 17 Visual Studio projects in total
  • 4 Windows services
  • 5 nightly batch processes
  • 5 supplementary SQL tables (outside of the SharePoint db)

Not bad for 8 weeks' work. As an aside, although the first number seems surprisingly high, in the technical washup I did with the team nobody thought this was the wrong way to factor the code. This is partly explained by the fact that this single project is actually part of a bigger program of projects being done for the client, and partly by the complexity of the Endeca search implementation and batch processes we needed.

What worked well (in no particular order)

  • Using a 'development farm' amongst the developers - this means the content database is shared, and thus no effort is required for one developer to see the lists, site columns, content types, master pages/layouts etc. created by others. For me this is actually the only way to do team development in SharePoint, but it's worth mentioning as I know not all SharePoint shops do things this way.
  • Proper use of tracing - this is the idea of writing log statements throughout code to easily diagnose problems once the code has been deployed to other environments (e.g. QA/UAT/production). We used the standard .NET System.Diagnostics trace framework with levels of Verbose, Info, Warning and Error - this has been familiar to me for a long time but a couple of the devs were new to it and agreed it was invaluable. In particular, we had a lot of library code and it's often difficult to find logic bugs when you can't directly see the result of something on a screen. For me, tracing essentially gives you the power to find certain bugs in seconds or minutes which could otherwise take hours to resolve. Adding the tracing code can slow down coding, so to mitigate this we used...
  • ReSharper - at the start of the project I created several ReSharper templates to call our common code e.g. for tracing, and got all the team to download trial versions of ReSharper. This meant we could add trace statements in just a few keystrokes, meaning the 'I didn't have time to add trace!' excuse couldn't be used :-)
  • My Config Store framework based on a SharePoint list * - we stored over 130 'configuration items', from 'True'/'False' config switches such as 'enforce password change on users' first logon', to known URLs, to certain strings displayed throughout the site. We also found a couple of areas for improvement (e.g. field not big enough to store XML fragments!) which will hopefully make it into the next release.
  • Implementing logging/notifications for unhandled exceptions - I know the MS Enterprise Library has a component for this, but we developed our own using an HTTP handler which sits in front of SharePoint's SPRequest handler. This means that whenever something happens in the code which we're not expecting, we get to find out about it immediately and can see the stack trace and other debug info in the e-mail. This was invaluable when the testers got to work, as it meant we could proactively deal with bugs before they even got reported. As soon as we noticed a mail with a new exception, we would shout over to the particular test guy (identified by the user ID) "What exactly did you just do?" (which impressed them greatly!), so we could nail the exact set of circumstances/data which caused the bug right there and then.
  • My Content Deployment Wizard tool * - I also played the 'deployment/release manager' role on this project, so I was probably the guy who benefited from this the most, but I've actually used it somewhere on every single project I've done since building it. For releases when the team had updated x page layouts, x lookup lists and x Config Store items, the tool is invaluable for picking out just the changed items and deploying them to the other environments. It was particularly useful for Config Store items, as some config is different between environments (similar to web.config keys) so you don't want to overwrite the entire list. For early releases when the team had made lots of complex 'schema' updates (such as intricate changes to site columns/lookups/content types), time pressures led me to take the 'everything will definitely work this way' route: drop the site collection on the target and import the whole thing (since there was no valuable data to preserve). So there are some complex deployment scenarios I still haven't fully tested personally, but with 3 environments on top of the dev environment to deploy to, the Wizard was pretty much a lifesaver.
  • Cross-site lookup field by SharePoint Solutions - this solves the problem that a lookup column can only look up data in the current web. We use this for several key sets of data so we get to have one copy and one copy only. Damn useful.
  • LINQ to SQL - we use this for CRUD operations in our supplementary SQL tables, and the guys who used it agreed they saved significant time over the standard approach of writing ADO.NET code.

* hopefully it doesn't come across as shameless self-promotion to include these - the very reason I built them was to solve recurring problems I saw on SharePoint dev projects, and both utilities really did help us here.
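
To make the tracing and exception-notification ideas above a little more concrete, here is a minimal sketch. It is written in Python purely for illustration (the project itself used .NET's System.Diagnostics and an HTTP handler in front of SharePoint), and the names `handle_request`, `build_error_report` and `notify` are hypothetical, not taken from the project's code. Python's DEBUG/INFO/WARNING/ERROR levels stand in for Verbose/Info/Warning/Error.

```python
import logging
import traceback

# Hypothetical trace source for library code; levels are filtered by configuration.
trace = logging.getLogger("library_code")


def build_error_report(exc, user_id, url):
    """Format the details a developer needs: who, where, and the stack trace."""
    stack = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    return f"User: {user_id}\nURL: {url}\nError: {exc!r}\n\n{stack}"


def handle_request(handler, user_id, url, notify):
    """Run a request handler, tracing entry/exit and reporting unhandled exceptions.

    `notify` is any callable that delivers the report (for example, sends an e-mail).
    """
    trace.debug("Handling %s for user %s", url, user_id)
    try:
        result = handler()
        trace.info("Handled %s successfully", url)
        return result
    except Exception as exc:
        trace.error("Unhandled exception while handling %s: %s", url, exc)
        notify(build_error_report(exc, user_id, url))
        raise  # re-raise so the normal error page is still shown
```

The point of the design is that the wrapper sits in one place, in front of everything else, so no individual page or library has to remember to report its own failures.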

Project challenges (in no particular order)

  • Team Foundation Server weirdness - for reasons we still haven't established, we found the .csproj file for the web project (i.e. the most critical VS project!) would be checked out whenever a developer compiled the solution. With multiple checkout enabled, this means that pretty much every developer had the project file checked out all the time, regardless of whether he/she was making any project changes (e.g. adding new classes). This meant we had many more merge issues than normal - not fun.
  • VM issues - for a while we thought ReSharper was the culprit here, but a VS hotfix brought more stability. A hunch says at least some of the issues are 64-bit related (our dev environment was matched to production in this respect), since the problem would often manifest itself via Visual Studio (a 32-bit application, remember). Frequent VS crashes, "attempted to read or write protected memory" messages in the event log - oh joy.
  • Failure to identify shared code soon enough - often a concern on complex development projects when the team is working at high speed. We did daily standup meetings (similar to scrum) but I suspect we may have focused too much on issues rather than what was being 'successfully' developed. So we lost some time to refactoring to bring things back in line, but this is why I like to think of the approach as 'Dangerously Rapid Application Development' (for those who remember the term ;-))
  • Issues arising from sharing IP addresses on SSL - in several of our environments, we attempted to use the technique documented by Adrian Spear in 'To setup SSL on multiple Sharepoint 2007 web applications using host headers under IIS 6.0'. I've used this successfully in the past, but this time round it worked fine in our QA environment and failed elsewhere. After carefully analyzing the differences, I worked out that this technique will only work if the SSL certificate being used is a wildcard certificate or is matched on the machine name rather than the site URL. This might be obvious to other people but wasn't to me!
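
A likely explanation for the certificate requirement above is that sites sharing one IP address and SSL port end up presenting the same certificate, so that certificate's name has to cover every host header in use. The sketch below checks this for a set of host names. It is illustrative Python only, the host names are made up, and the wildcard rule is simplified to the common case where `*` stands for exactly one left-most label.

```python
def cert_name_matches(cert_name, host):
    """Return True if the certificate name covers the host name."""
    cert_labels = cert_name.lower().split(".")
    host_labels = host.lower().split(".")
    if len(cert_labels) != len(host_labels):
        return False  # a wildcard never spans more than one label
    if cert_labels[0] == "*":
        # Require at least two fixed labels so that "*.com" is rejected.
        return len(cert_labels) > 2 and cert_labels[1:] == host_labels[1:]
    return cert_labels == host_labels


def hosts_not_covered(cert_name, hosts):
    """List the host headers that would trigger a certificate name mismatch."""
    return [h for h in hosts if not cert_name_matches(cert_name, h)]


# Hypothetical host headers for two web applications sharing one IP address.
sites = ["www.example.com", "extranet.example.com"]
print(hosts_not_covered("*.example.com", sites))    # wildcard certificate
print(hosts_not_covered("www.example.com", sites))  # certificate for a single site URL
```

With the wildcard name every site is covered, whereas a certificate issued for a single site URL leaves the other site with a name mismatch.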

Hope someone finds this useful!

5 comments:

Alex Angas said...

Thanks Chris, there's some great techniques/processes here. I agree with you on the development farm approach by the way.

Could you please provide the KB number for the VS hotfix? I've been having the same issue thinking it was Resharper.

Cheers, Alex.

Andy Burns said...

Interesting. I like the idea of the HTTP module for catching exceptions and alerting. Neat!

And I agree about having a development farm - but did it belong to the client? I'm campaigning for virtual development farms here, but we seem to end up with too many VMs for the servers all the time, what with the number of clients. That'd be different if we were using the client's hardware, though...

And speaking of which - what type of virtual machines were you using? We've got VMware, and haven't had any problems (from the virtualisation, anyway!)

Chris O'Brien said...

@Alex,

I believe the KB number is KB 947841.

Another alternative available now (but not when we had the problem) would probably be to install SP1 for VS2008 though.

Cheers,

Chris.

Chris O'Brien said...

@Andy,

The dev farm and QA (testing) environment are on our side. The client also has a UAT (User Acceptance Testing) environment in addition to production - this is a fairly common arrangement in my experience.

We used VMware for virtualization because our dev farm was 64-bit (to match production). It's now possible to do this with MS technologies under Windows 2008 though.

HTH,

Chris.

Brian Pulliam said...

Thank you for posting this, very helpful.