Tuesday 18 November 2008

Simple data access pattern for SharePoint lists

So, you're in the early stages of your project and coding has started. It's already becoming apparent that some abstraction is needed for accessing data in a few key lists, but you're not sure what. Of course there's no 'one-size-fits-all' answer to this question, but let's run through some options:

In my mind the best answer to this question is probably one of the first two - particularly on projects which are building something like a traditional application (i.e. with a domain model and entitities which are tied to data) on top of SharePoint, as opposed to projects which are doing say, simple WCM. However, like me, maybe you've still not spent significant time looking at LINQ4SP (and maybe have memories of the original LinqToSharePoint incarnation not quite making it to maturity) and are either in the same position with the P&P samples or are thinking it looks a little too enterprise for your current needs. And on a small project with a short dev cycle, I wouldn't blame you.

So onto other options - in terms of building your own DAL, well of course it's possible, but if you're on a smallish project then presumably finding the time to hand-code an abstraction for all your data needs is going to be tricky. Additionally I'm not too fond of this approach - I've just seen it done poorly by others too many times, resulting in a layer which performs badly/doesn't scale because objects aren't disposed correctly/doesn't provide the convenience it was intended to. And finally, as far as the 'do nothing' option goes, although we might not be striving for the ultimate pattern we are trying to avoid the duplication, inconsistency and maintenance nightmare that could come if we allow each developer on the project to work with the data how they like.

My suggestion:-

What I've used a couple of times is an 'in between' approach - specifically, in between doing nothing and building a complex DAL. What I'm showing here isn't exactly what I've used (made a couple of 'improvements' as I was tapping it out!), but I think of it as a very quick, lightweight pattern which at least helps reduce the worst of the problems:

public static class Employee
{
public static readonly string ListName = "Employees";

public static class Fields
{
public static string PersonTitle = "Person_x0020_title";
public static string FirstName = "First_x0020_name";
public static string LastName = "Last_x0020_name";
public static string StartDate = "Start_x0020_date";
public static string Division = "Division";
public static string ID = "ID";
public static string Salary = "Salary";
}

/// <summary>
/// Fetches employee list item - notice that caller has to supply SPWeb object, but all other implementation
/// details (list name, fields names) are taken care of..
/// </summary>
public static SPListItem GetEmployeeListItem(int EmployeeID, SPWeb Web)
{
SPListItemCollection queryResult = executeEmployeeLookup(EmployeeID, Web);

return queryResult[0];
}

/// <summary>
/// Fetches employee DataRow for 'read-only' situations, or where we want to cache/serialize etc.
/// </summary>
public static DataRow GetEmployeeDataRow(int EmployeeID, SPWeb Web)
{
SPListItem employeeItem = GetEmployeeListItem(EmployeeID, Web);
DataRow drEmployee = SPListItemCollectionHelper.CreateDataRowFromListItem(employeeItem);

return drEmployee;
}

/// <summary>
/// Private method which does actual query.
/// </summary>
private static SPListItemCollection executeEmployeeLookup(int employeeID, SPWeb web)
{
SPList employeeList = web.Lists[Employee.ListName];

// query employee list..
SPQuery employeeQuery = new SPQuery();
employeeQuery.Query = string.Format("<Where><Eq><FieldRef Name=\"{0}\" /><Value Type=\"Text\">{1}</Value></Eq></Where>",
Employee.Fields.ID, employeeID);

SPListItemCollection employees = employeeList.GetItems(employeeQuery);

return employees;
}

/// <summary>
/// Example of another method related to employees.
/// </summary>
public static SPListItemCollection FetchAllHiresSince(DateTime StartDate, SPWeb web)
{
SPList employeeList = web.Lists[Employee.ListName];

// query employee list..
SPQuery employeeQuery = new SPQuery();
employeeQuery.Query = string.Format("<Where><Geq><FieldRef Name=\"{0}\" /><Value Type=\"Text\">{1}</Value></Geq></Where>",
Employee.Fields.StartDate, SPUtility.CreateISO8601DateTimeFromSystemDateTime(StartDate));

SPListItemCollection employees = employeeList.GetItems(employeeQuery);

return employees;
}
}


Points of note:

  • We've centralized the repository details of employee data such as list name and field names

  • Getting hold of an Employee list item is now one line for the caller, and they don't need to know how to find the data

  • An SPWeb object must be passed, meaning the caller has responsibility for obtaining and disposing - more on this later

  • The caller has a 'get as DataRow' method - you might feel this isn't needed, but I think it can be a useful API function. In contrast to an SPListItem, a DataRow is completely disconnected from SharePoint and therefore there are no unmanaged SPRequest objects hanging off it which need to be disposed. This means it can be cached/serialized etc., and additionally can be passed around a deep stack of methods without the calling code having to be responsible for disposals (which by the way, would need to be done in a Finally block so the disposals happen even if an exception occurs)

  • For the 'get as DataRow' method, a helper class is used to translate from SPListItemCollection to DataTable and SPListItem to DataRow - this is principally because the existing SPListItemCollection.ToDataTable() method has a bug. (N.B. This is intentionally not in the form of extension methods for clarity!)

  • The class is not an instance class, so we don't do anything like wrap each list field with a property. This might not be as convenient for the caller, but means we don't have to worry about whether the data item is 'dirty'

  • Update operations are left to the caller

This means the calling code (in a SharePoint web context) looks something like this:



public void GetReadOnlyEmployee()
{
DataRow drEmployee = Employee.GetEmployeeDataRow(56, SPContext.Current.Web);

// do something/pass happily around codebase..
}

public void GetEmployeeForUpdate()
{
SPListItem employee = Employee.GetEmployeeListItem(56, SPContext.Current.Web);
employee[Employee.Fields.Salary] = 50000;
employee.Update();
}

To me, this approach has a good "bang for buck" because it's extremely quick to implement but does make the situation drastically better in team development. 

Passing the SPWeb object is key:-

In my article Disposing SharePoint objects - what they don't tell you, I highlight the difficulties of keeping track of objects to dispose in a complex class library. However, I'm now starting to see the code I used as an example there as something of an anti-pattern.  The simplified code sample I used to demonstrate the problem was:



public void DoSomething()
{
bool bDisposalsRequired = false;

// get list from SPContext if we have one..
SPList list = getListFromContext();
if (list == null)
{
// otherwise get list from objects we create..
list = getInstantiatedList();
bDisposalsRequired = true;
}

// do something with list..
foreach (SPListItem item in list.Items)
{
processItem(item);
}

if (bDisposalsRequired)
{
list.ParentWeb.Dispose();
list.ParentWeb.Site.Dispose();
}
}

private SPList getInstantiatedList()
{
// can't dispose of these objects here if we're returning a list - we'll be attempting to use
// objects which have already been disposed of..
SPSite site = new SPSite("http://cob.blog.dev");
SPWeb web = site.OpenWeb("/MyWeb");
SPList list = web.Lists["MyList"];

return list;
}

private SPList getListFromContext()
{
SPContext currentContext = SPContext.Current;
SPList list = null;

if (currentContext != null)
{
list = currentContext.Site.AllWebs["MyWeb"].Lists["MyList"];
}

return list;
}


In this structure, internal API code (i.e. the DoSomething() method) is responsible for obtaining the SPWeb object needed to find the list, which generally means it is also responsible for it's disposal. And this is where the difficulties arise. However, if the caller has to provide the IDisposable objects, it can then be responsible for calling Dispose() because it knows when they are no longer needed. Most often the caller will simply be passing a reference to SPContext.Current.Web, but if the caller happens to be a Feature receiver then SPFeatureProperties.ListItem.Parent.Web would be passed, or if no existing SPWeb object was available to the caller (e.g. console app) a new SPWeb object would have to be instantiated and passed. Disposals then become much simpler because they can always happen in the same place as any instantiations.

Conclusion:-

I'm sure there are better patterns out there, but even a simple approach like this provides much more structure than leaving it all to the caller (particularly when the calling code is being written by different developers!). For me the key thing is to standardize how code in your codebase accesses list data - much better than having field and list names dotted here, there and everywhere. One thing the approach doesn't specifically consider is unit-testing - the sample projects in the Patterns & Practices stuff look useful here, and I for one will be getting better-acquainted!

P.S. Thanks to Rob Bogue for helping me crystallize these thoughts, and apologies to Sezai Komur for not getting round to mailing a draft through earlier as I promised!

5 comments:

Sezai Komur said...

>> and apologies to Sezai Komur for not getting round to mailing a draft through earlier as I promised!


No problemo mate - love your work!

... and I love this pattern, its nice and simple and definetly better than doing nothing. The most common dev task people seem to do is get data out of lists. A simple pattern like this provides nice abstraction without being overly complex to implement. By leaving disposal up to the caller we can code static classes that do the DAL work and not have to worry about memory leaks caused by them.

Jeremy Thake said...

Great post Chris, I think I'm going to write a full comment on my post on this one, but for now a few things off the top of my head.

This is leading towards the days of SQL, .NET, ASP.NET web projects where you had a SQL database with a schema and stored procs to do all the CRUD and then a .NET layer to structure this all then a ASP.NET page to present it all. The .NET layer could have a data layer and a business objects layer.
With what you are suggesting, basically SharePoint takes over from SQL, stored procs and .net data layer.

This could potentially be autogenerated, although the likes of MyGeneration/CodeSmith used to use foreign keys to understand relationships between tables which would now be lists...and SharePoint isn't that great at relational data. It handles data types and required fields well though.

In conclusion...would you really go this far with SharePoint Lists?

Chris O'Brien said...

@Jeremy,

Indeed - and the code-generation approach you mention is one of the key aspects which makes the Linq approach possible with SharePoint - both the LINQ4SP and LinqToSharePoint implementations use it. I guess the only issue with this is that you have to go back and regenerate your proxy class (and recompile/deploy) whenever you add/change anything on this list. But presumably you're changing the code anyway, and this is what prompted you to make the list change.

So yes, I think for implementations which are looking for a 'patterns' approach to accessing SharePoint list data there are some good 'ORM' options there. I'm keen to use LINQ4SP on a real project soon.

In the meantime, something like my suggestion should hopefully serve as a quick enhancement to the default. Will look out for your post.

Cheers,

Chris.

Anonymous said...

Hi Chris. I'd be interested to know how you'd structure the data access pattern you show if the list was unknown (unknown that is until the calling code calls).

Would you just swap out the SPWeb objects with SPList?

I appreciate your example is tailored to data access for one list but I'm coming from a content type wrapper point of view.

Thanks. Great article.

Chris O'Brien said...

@Anonymous,

Since SPList doesn't implement IDisposable it's not necessarily the same problem. I think you'd still need to pass in the SPWeb object so the caller can be responsible for it's disposal. However, off the top of my head I don't see any issues with the generic method using it's own logic to obtain the actual list since it doesn't require disposal.

HTH,

Chris.