Wednesday, 27 July 2022

Identifying Syntex use cases - how the SharePoint Syntex assessment tool can help

If you're a regular reader of this blog, you'll know I'm a big advocate of what Microsoft are doing with SharePoint Syntex. In short, Syntex brings intelligent automation to every organisation for processes which involve documents - and in most businesses today, that's a substantial proportion of processes. We're in the middle of a big shift in technology where cloud power is commoditising advanced AI and automation capabilities so that they are no longer restricted to expensive, specialist, and often industry-specific tools. Examples of this include the legal and engineering sectors, in which it has been common to invest in specialist proposal generation and contract automation software, but usually at significant cost. Instead, these AI and automation tools are now baked into core platforms such as Microsoft 365 and democratised so that 'ordinary' employees can tap into them - add-on licensing might be required, but these tools have never been so widely available.

As market awareness grows, Syntex is featuring in quite a few of my client conversations at the moment. Organisations are considering how this new tool could help them, and in common with other innovative technologies, one of the challenges is identifying use cases within the business where Syntex could have a big impact. I have a lot of thoughts on this in general, but one thing Microsoft have done to help is release the Microsoft 365 Assessment Tool (gradually replacing what was the PnP Modernization Scanner), which now has a specific 'Syntex mode' - this can be used to assess your tenant and usage for Syntex automation opportunities. The idea is that scanning your SharePoint landscape and IA for certain characteristics could uncover areas of the business using SharePoint in ways where Syntex could help. In reality, Syntex really shines where documents are part of a complex or time-consuming process - and a tool can only go so far in identifying that. But the idea has merit, so let's explore what the tool provides and how it's used.

Later, we'll also consider a more rounded approach to identifying document automation and Syntex opportunities.

What the Syntex Assessment Tool provides

Once you've done the work to install and configure the tool (covered below), an assessment is run to scan your Microsoft 365 tenant in 'Syntex adoption' mode. This launches a scanning process across your entire SharePoint Online estate which, depending on tenant size, will take some time. Once execution is complete, a Power BI report is created as the output - allowing you to slice and drill around your data in later analysis. The theme of the tool is to identify areas of 'SharePoint intensity' - examples include your largest document libraries or document libraries where custom columns and/or content types have been created. Other insights include your most heavily used content types and libraries with names matching common Syntex usages (e.g. invoices and contracts). The full list of report elements and descriptions from the tool is:

  • Libraries with custom columns - Identify libraries where Syntex can automatically populate columns, improving consistency
  • Column usage - Identify patterns of column usage, to target Syntex models where they will have the maximum benefit 
  • Libraries with custom content types - Identify libraries using custom content types, where Syntex models can be used to automatically categorize files. 
  • Content type usage - Identify patterns of content type usage, to target Syntex models where they will have the maximum benefit
  • Libraries with retention labels - Identify libraries where retention labels are used, where Syntex can be used to automate and improve consistency
  • Library size - Identify large libraries where classification and metadata can improve the content discovery experience
  • Library modernization status - Identify libraries which may need to be modernized to fully make use of Syntex
  • Prebuilt model candidates - Identify libraries where names or content types suggest a prebuilt model could be applied
  • Syntex model usage - Review the current use of Syntex models in your site
So that's an overview, but it's more helpful to look at the results of running the tool - the sections below dive into the output, and then towards the end we'll zoom out again to consider the role of the tool overall.

Looking at real-life results - the Power BI report from my tenant

The screenshots below show the report output from one of my tenants - this isn't a production tenant but does have 1000+ sites and several years of activity.

Assessment overview

Provides an overview of the assessment run you performed, covering how many sites were processed successfully vs. any failures. I had 15 failures out of 1169 site collections for example:
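To put those numbers in context, the failure rate is easy to work out by hand. Here's a minimal Python sketch using only the two figures above - the function name is my own and isn't something the tool provides:

```python
def failure_rate(failed: int, total: int) -> float:
    """Return the percentage of site collections that failed to scan."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * failed / total


# Figures from my assessment run: 15 failures out of 1169 site collections.
rate = failure_rate(15, 1169)
print(f"{rate:.2f}% of site collections failed")
```

That works out at roughly 1.3%, so the vast majority of the estate was still covered by the scan.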


Libraries with custom columns

A fairly useful indicator of 'SharePoint intensity', because if lots of columns have been created it shows that tagging/metadata is important here. This could indicate that having Syntex extractors automatically tag each document could be powerful. I have 574 such libraries in my tenant:


Column usage

Similar to the above, but focused on re-use of your custom columns and most common custom column types:


Libraries with custom content types

Again, a potential sign that files here are important because the library is using custom content types:


Content type usage

Gives you insight into your top content types - how many lists each one is applied to, how many items are assigned to the content type etc.


Libraries with retention labels

You might ask 'how are retention labels relevant to Syntex'? Remember that a key Syntex capability relates to information governance - the ability to automatically recognise potentially sensitive documents from their contents (e.g. contracts, CVs, NDAs, HR documents etc.) and ensure they are retained (or disposed of) with appropriate compliance. Since this is a non-production tenant I don't have too much of this, but you may do:


Library size

Again, knowing where your biggest libraries are can help you understand SharePoint hot spots, where many documents potentially relate to a process:
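If you wanted to combine several of these 'SharePoint intensity' signals into a single ranking, one rough approach would be a weighted score per library. The sketch below is purely illustrative: the weights, the `Library` structure and the sample data are all my own assumptions, and this is not something the assessment tool calculates for you.

```python
from dataclasses import dataclass


@dataclass
class Library:
    name: str
    item_count: int
    custom_columns: int
    custom_content_types: int


def intensity_score(lib: Library) -> float:
    # Arbitrary illustrative weights - tune these to what matters in your organisation.
    return lib.custom_columns * 2 + lib.custom_content_types * 3 + lib.item_count / 1000


def top_hot_spots(libraries, n=3):
    """Return the n libraries with the highest intensity score."""
    return sorted(libraries, key=intensity_score, reverse=True)[:n]


# Made-up sample data for illustration only.
libraries = [
    Library("Contracts", 12000, 8, 2),
    Library("Team Photos", 15000, 0, 0),
    Library("Invoices", 5000, 5, 1),
    Library("Misc", 300, 1, 0),
]

for lib in top_hot_spots(libraries, n=2):
    print(lib.name, round(intensity_score(lib), 1))
```

The point of weighting metadata more heavily than raw size is that a large library with no custom columns (a photo dump, say) is usually a weaker Syntex candidate than a smaller one with rich tagging.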


SharePoint modernisation status

This one is less directly connected to Syntex (relating as it does to the modern/classic status of the library in general), but relevant because Syntex can only be used on modern libraries. If you find important libraries still in classic status, you'll need to modernise them for the Syntex options to show up:

Prebuilt model candidates

Syntex ships with prebuilt AI models for receipts and invoices. This report element is simple but can be highly effective - essentially, 'find all the libraries in my tenant which have receipt or invoice in the name'. Most likely your production tenant *will* have this content somewhere, and Syntex could help provide insights or automate processes here:
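The 'receipt or invoice in the name' idea is simple enough to sketch in a few lines of Python. This is my own illustration of the name-matching part only, not the tool's actual logic (which also considers content types), and the library names are made up:

```python
import re

# Keywords taken from the two prebuilt models mentioned above.
PREBUILT_KEYWORDS = ("receipt", "invoice")


def prebuilt_candidates(library_names):
    """Return library names containing any prebuilt-model keyword (case-insensitive)."""
    pattern = re.compile("|".join(map(re.escape, PREBUILT_KEYWORDS)), re.IGNORECASE)
    return [name for name in library_names if pattern.search(name)]


# Hypothetical library names for illustration only.
library_names = ["Supplier Invoices 2022", "Team Photos", "Expense receipts", "Contracts"]
print(prebuilt_candidates(library_names))
```

A substring match like this will also catch plurals ('Invoices', 'receipts'), which is usually what you want for a first pass.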


Syntex model usage

This last page in the report gives insight into any existing Syntex usage in your tenant. In my case I have 8 models, and because none of them have executed recently, the number of items classified in the last 30 days shows as 0:

So that's the tool output. Now let's turn our attention to how to run it in your tenant. 

Running the Microsoft 365 assessment tool

The tool itself is command-line based, hosted on GitHub, and comes in Windows, macOS and Linux flavours. Here's what you'll need:
  • A machine to run the tool
  • The tool downloaded from GitHub - see Releases · pnp/pnpassessment · GitHub
  • An AAD app registered with certificate-based auth - you'll register this in your tenant to allow the tool read access to your sites and workflows
The tool itself has a mode to help create the self-signed cert and get the AAD app registered. The command to do this is detailed on the Authentication page in the documentation. Most likely you will want to register a dedicated app for this tool rather than piggy-back on something else, because any SPO throttling will then be first restricted to this app rather than a critical production solution you have. 

The permissions required may need some thought because the 'optimal' permissions (which are needed for a full assessment) for application scope are:
  • Graph: Sites.Read.All
  • SharePoint: Sites.FullControl.All
The tool can perform a less complete audit with more restrictive permissions, however - you'll get a less informative report with some sections missing, and whether that gives you enough information to make decisions is for you to judge. All of this is documented on the Permission Requirements page in the docs.

Once you're set up with authentication, it's a matter of running the tool in Syntex mode with the --syntexfull flag (if that's the assessment type you're able to run):

microsoft365-assessment.exe start --mode syntex --authmode application `
    --tenant [tenant name] --applicationid [AAD app ID] `
    --certpath "[cert path and thumbprint]" `
    --syntexfull
Your Power BI report will emerge once the tool has trawled through your SharePoint estate.
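If you'd rather script the run than type the command by hand, the arguments can be assembled programmatically. The Python sketch below only uses the flags shown in the command above; the function name and its parameters are my own, and the bracketed values are placeholders you'd need to replace with your own details:

```python
import shlex


def build_assessment_args(tenant, application_id, cert_path, full=True):
    """Assemble the argument list for a Syntex-mode assessment run."""
    args = [
        "microsoft365-assessment.exe", "start",
        "--mode", "syntex",
        "--authmode", "application",
        "--tenant", tenant,
        "--applicationid", application_id,
        "--certpath", cert_path,
    ]
    # Only request the full Syntex assessment if your permissions allow it.
    if full:
        args.append("--syntexfull")
    return args


# Placeholder values - substitute your own tenant, app ID and certificate details.
args = build_assessment_args("[tenant name]", "[AAD app ID]", "[cert path and thumbprint]")
print(shlex.join(args))
```

The resulting list could then be handed to something like subprocess.run(args, check=True) on the machine where the tool is installed. Building a list rather than a single string avoids quoting problems with the certificate path.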


So we've seen how the tool is used and what the report provides. But what value does it provide in the real world?

My recommendation - use the tool as ONE input
It was a great idea to extend the Microsoft 365 assessment tool for Syntex, and I fully agree that there are certain indicators of SharePoint use that align strongly with Syntex. However, the tool is no substitute for real process mining in your organisation - and I've no doubt the tool creators (the Microsoft PnP team) take the same view. My recommendation is to use the tool if you can, but perhaps think of it as background research before talking to the business. When working with a client to identify possible scenarios to automate with Syntex it's useful to talk to different teams and functions. I like to ask questions like these to uncover situations where Syntex automation could add high value:
  • Which document types are most important to the business? Why?
  • What types of documents do you have which are time or labour intensive (for people to read and process or create)?
  • What types of documents do you have which have a significant process around them?
  • What types of documents do you create in large volumes?
  • Which documents are part of a transmittal or submittal process? In other words, which documents are exchanged with other parties rather than spend their entire lifecycle within your organisation?
  • Which documents contain sensitive information, and should therefore potentially have information protection policies applied to them to support compliance?

Hopefully this analysis of the Syntex assessment tool has been useful. Syntex is a powerful tool to bring automation to an organisation's critical processes and we're going to see a lot more of it in the future.