Wednesday, 27 July 2022

Identifying Syntex use cases - how the SharePoint Syntex assessment tool can help

If you're a regular reader of this blog, you'll know I'm a big advocate of what Microsoft are doing with SharePoint Syntex. In short, Syntex brings intelligent automation to every organisation for processes which involve documents - and in most businesses today, that's a substantial proportion of processes. We're in the middle of a big shift in technology where cloud power is commoditising advanced AI and automation capabilities so that they are no longer restricted to expensive, specialist, and often industry-specific tools. Examples of this include the legal and engineering sectors, in which it has been common to invest in specialist proposal generation and contract automation software, but usually at significant cost. Instead, these AI and automation tools are now baked into core platforms such as Microsoft 365 and democratised so that 'ordinary' employees can tap into them - add-on licensing might be required, but these tools have never been so widely available.

As market awareness grows, Syntex is featuring in quite a few of my client conversations at the moment. Organisations are considering how this new tool could help them, and in common with other innovative technologies one of the challenges is identifying use cases within the business where Syntex could have a big impact. I have a lot of thoughts on this in general, but one thing Microsoft have done to help is release the Microsoft 365 Assessment Tool (gradually replacing what was the PnP Modernization Scanner), which now has a dedicated 'Syntex mode' - this can be used to assess your tenant and usage for Syntex automation opportunities. The idea is that scanning your SharePoint landscape and IA for certain characteristics could uncover areas of the business using SharePoint in ways where Syntex could help. In reality, Syntex really shines where documents are part of a complex or time-consuming process - and a tool can only go so far in identifying that. But the idea has merit, so let's explore what the tool provides and how it's used.

Later, we'll also consider a more rounded approach to identifying document automation and Syntex opportunities.

What the Syntex Assessment Tool provides

Once you've done the work to install and configure the tool (covered below), an assessment is run to scan your Microsoft 365 tenant in 'Syntex adoption' mode. This launches a scanning process across your entire SharePoint Online estate which, depending on tenant size, will take some time. Once execution is complete, a Power BI report is created as the output - allowing you to slice and drill around your data in later analysis. The theme of the tool is to identify areas of 'SharePoint intensity' - examples include your largest document libraries or document libraries where custom columns and/or content types have been created. Other insights include your most heavily used content types and libraries with names matching common Syntex usages (e.g. invoices and contracts). The full list of report elements and descriptions from the tool is:

  • Libraries with custom columns - Identify libraries where Syntex can automatically populate columns, improving consistency
  • Column usage - Identify patterns of column usage, to target Syntex models where they will have the maximum benefit 
  • Libraries with custom content types - Identify libraries using custom content types, where Syntex models can be used to automatically categorize files. 
  • Content type usage - Identify patterns of content type usage, to target Syntex models where they will have the maximum benefit
  • Libraries with retention labels - Identify libraries where retention labels are used, where Syntex can be used to automate and improve consistency
  • Library size - Identify large libraries where classification and metadata can improve the content discovery experience
  • Library modernization status - Identify libraries which may need to be modernized to fully make use of Syntex
  • Prebuilt model candidates - Identify libraries where names or content types suggest a prebuilt model could be applied
  • Syntex model usage - Review the current use of Syntex models in your site
So that's an overview, but it's more helpful to look at the results of running the tool - the sections below dive into the output, and then towards the end we'll zoom out again to consider the role of the tool overall.

Looking at real-life results - the Power BI report from my tenant

The screenshots below show the report output from one of my tenants - this isn't a production tenant but does have 1000+ sites and several years of activity.

Assessment overview

Provides an overview of the assessment run you performed, covering how many sites were processed successfully vs. any failures. I had 15 failures out of 1169 site collections for example:


Libraries with custom columns

A fairly useful indicator of 'SharePoint intensity', because if lots of columns have been created it shows that tagging/metadata is important here. This could indicate that having Syntex extractors automatically tag each document could be powerful. I have 574 such libraries in my tenant:


Column usage

Similar to the above, but focused on re-use of your custom columns and most common custom column types:


Libraries with custom content types

Again, a potential sign that files here are important because the library is using custom content types:


Content type usage

Gives you insight into your top content types - how many lists each one is applied to, how many items are assigned to the content type etc.


Libraries with retention labels

You might ask 'how are retention labels relevant to Syntex'? Remember that a key Syntex capability relates to information governance - the ability to automatically recognise potentially sensitive documents from their contents (e.g. contracts, CVs, NDAs, HR documents etc.) and ensure they are retained (or disposed of) with appropriate compliance. Since this is a non-production tenant I don't have too much of this, but you may do:


Library size

Again, knowing where your biggest libraries are can help you understand SharePoint hot spots, where many documents potentially relate to a process:


SharePoint modernisation status

This one is less directly connected to Syntex (relating as it does to the modern/classic status of the library in general), but relevant because Syntex can only be used on modern libraries. If you find important libraries still in classic status, you'll need to modernise them for the Syntex options to show up:

Prebuilt model candidates

Syntex ships with prebuilt AI models for receipts and invoices. This report element is simple but can be highly effective - essentially, 'find all the libraries in my tenant which have receipt or invoice in the name'. Most likely your production tenant *will* have this content somewhere, and Syntex could help provide insights or automate processes here:
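
To make the 'receipt or invoice in the name' idea concrete, here is a minimal Python sketch of that kind of name matching. It is not the assessment tool's actual logic - the function name, keyword list, and sample library names are all assumptions for illustration:

```python
import re

# Keywords based on the two prebuilt models mentioned above (an assumption,
# not the tool's real matching rules); extend the tuple as needed.
PREBUILT_KEYWORDS = ("receipt", "invoice")


def find_prebuilt_candidates(library_names):
    """Return (library name, matched keyword) pairs for names containing a keyword."""
    pattern = re.compile("|".join(PREBUILT_KEYWORDS), re.IGNORECASE)
    matches = []
    for name in library_names:
        hit = pattern.search(name)
        if hit:
            matches.append((name, hit.group(0).lower()))
    return matches


# Hypothetical library names, for illustration only.
sample = ["Supplier Invoices 2022", "Team Documents", "Expense receipts", "Contracts"]
candidates = find_prebuilt_candidates(sample)
```

A substring match like this will also catch plurals such as 'Invoices', which is probably what you want for a first pass over a large tenant.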


Syntex model usage

This last page in the report gives insight into any existing Syntex usage in your tenant. In my case I have 8 models, and because none of them have executed recently, the number of items classified in the last 30 days shows as 0:

So that's the tool output. Now let's turn our attention to how to run it in your tenant. 

Running the Microsoft 365 assessment tool

The tool itself is command-line based, hosted on GitHub, and comes in Windows, macOS and Linux flavours. Here's what you'll need:
  • A machine to run the tool
  • The tool downloaded from GitHub - see Releases · pnp/pnpassessment · GitHub
  • To register an AAD app with certificate-based auth - you'll register this in your tenant to allow the tool read access to your sites and workflows
The tool itself has a mode to help create the self-signed cert and get the AAD app registered. The command to do this is detailed on the Authentication page in the documentation. Most likely you will want to register a dedicated app for this tool rather than piggy-back on something else, because any SPO throttling will then be first restricted to this app rather than a critical production solution you have. 

The permissions required may need some thought because the 'optimal' permissions (which are needed for a full assessment) for application scope are:
  • Graph: Sites.Read.All
  • SharePoint: Sites.FullControl.All
The tool can perform a less complete audit with more restrictive permissions however - you'll get a less informative report with some sections missing, and whether that provides enough decision-making info to you is for you to decide. All of this is documented on the Permission Requirements page in the docs.

Once you're set up with authentication, it's a matter of running the tool in Syntex mode with the --syntexfull flag (if that's the assessment type you're able to run):

microsoft365-assessment.exe start --mode syntex --authmode application `
    --tenant --applicationid [AAD app ID] `
    --certpath "[cert path and thumbprint]" `
    --syntexfull
Your Power BI report will emerge once the tool has trawled through your SharePoint estate.
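
If you plan to run the assessment more than once, it can help to build the command from a small script rather than retyping it. The Python sketch below is one way to do that, using only the flags shown in the command above; the function name and the placeholder values are assumptions, and the documentation remains the place to check the exact argument formats:

```python
import shlex


def build_syntex_assessment_args(tenant, application_id, cert_path, full=True):
    """Build the argument list for a Syntex-mode assessment run.

    Only flags that appear in the command above are used here.
    """
    args = [
        "microsoft365-assessment.exe", "start",
        "--mode", "syntex",
        "--authmode", "application",
        "--tenant", tenant,
        "--applicationid", application_id,
        "--certpath", cert_path,
    ]
    if full:
        # Only add this if your app permissions allow the full assessment.
        args.append("--syntexfull")
    return args


# Placeholder values - substitute your own tenant, app ID, and certificate details.
args = build_syntex_assessment_args(
    tenant="<your tenant>",
    application_id="<AAD app ID>",
    cert_path="<cert path and thumbprint>",
)
print(shlex.join(args))
```

The list form can be passed straight to `subprocess.run(args)`, which avoids shell quoting issues with the certificate path.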


So we've seen how the tool is used and what the report provides. But what value does it provide in the real world?

My recommendation - use the tool as ONE input
It was a great idea to extend the Microsoft 365 assessment tool for Syntex, and I fully agree that there are certain indicators of SharePoint use that align strongly with Syntex. However, the tool is no substitute for real process mining in your organisation - and I've no doubt the tool creators (the Microsoft PnP team) take the same view. My recommendation is to use the tool if you can, but perhaps think of it as background research before talking to the business. When working with a client to identify possible scenarios to automate with Syntex it's useful to talk to different teams and functions. I like to ask questions like these to uncover situations where Syntex automation could add high value:
  • Which document types are most important to the business? Why?
  • What types of documents do you have which are time or labour intensive (for people to read and process or create)?
  • What types of documents do you have which have a significant process around them?
  • What types of documents do you create in large volumes?
  • Which documents are part of a transmittal or submittal process? In other words, which documents are exchanged with other parties rather than spend their entire lifecycle within your organisation?
  • Which documents contain sensitive information, and should therefore potentially have information protection policies applied to them to support compliance?

Hopefully this analysis of the Syntex assessment tool has been useful. Syntex is a powerful tool to bring automation to an organisation's critical processes and we're going to see a lot more of it in the future.

Thursday, 30 June 2022

Speaking at ESPC 2022 (Copenhagen) on SharePoint Syntex and Viva Topics

Process automation and AI will be big growth engines for technology in the next few years, so I'm really happy to have been selected as the speaker to cover these hot topics (and how the Microsoft stack solves for them) at one of the big Microsoft conferences this year. The European SharePoint, Microsoft 365 and Azure Conference will be in Copenhagen, Denmark from 28th November - 1st December 2022, meaning I've got plenty of time to practice my speaker face and bad jokes. I'll be delivering two sessions:

SharePoint Syntex - art of the possible and lessons learnt (session code T29)

Tuesday 29th November, 15:15

Session abstract:

Organisations using Microsoft 365 are waking up to the potential of SharePoint Syntex to have a dramatic impact on their business. Syntex provides AI capabilities to 'read documents for you', allowing your Microsoft tenant to recognise your individual document types, extract meaning, and automate processes - with these ingredients the possibilities are endless. We've implemented Syntex to read safety reports, risk assessments, project plans and more, learning a lot about how things work in practice and the pitfalls that will cost you time or lead to poor results.

In this session we'll discuss potential use cases, what you can expect from Syntex, key decision points, and down to important tips such as how to work with documents containing tables. Over the course of several demos we'll walk through the end-to-end of creating and tuning Syntex AI models, building automations, and even advanced scenarios such as adding Power BI dashboards to drive process compliance.

We'll complete the session with a discussion on licensing and roadmap, so you leave armed with everything you need to achieve more with SharePoint Syntex.

Viva Topics 18 months later - what did we get?  (session code W16)

Wednesday 30th November, 11:45

Session abstract:

Finding information and expertise is far too time-consuming in the vast majority of organisations. Poorly configured search, the sprawl of repositories and sites, unceasing content growth and difficulties recognising authoritative content all conspire against the information worker. No wonder McKinsey and IDC report that the average knowledge worker spends 20-30% of their time just looking for things.

Viva Topics is Microsoft's answer to this challenge. After being part of the Project Cortex private preview, we've had 18 months of Viva Topics being live in our business and have implemented the technology for several clients. This session covers the benefits we've seen (both the expected and unexpected), and shares best practice guidance on how to plan, implement, and build on Viva Topics. We'll demo the technology in our production environment so you can see the experience in action.

Implementing Viva Topics at scale can be a big investment and it's important to know what's coming in the future. We'll end with a discussion on Microsoft's future roadmap and capabilities, so you can plan ahead with confidence.

Conference details

The tagline for the event is "Europe’s premier Microsoft 365 & Azure Conference" and that's probably a fair statement - I always really enjoy speaking at this event and being immersed in the great conversations which happen there. As usual, there's extremely strong representation from Microsoft too - keynote speakers include: 
  • Jeff Teper - CVP, Microsoft 365 Collaboration with Teams, SharePoint, OneDrive, Microsoft
  • Scott Hanselman - Partner Program Manager, Microsoft
  • Karuana Gatimu - Principal Manager, Customer Advocacy Group, Microsoft Teams Engineering, Microsoft
  • Vesa Juvonen - Principal Program Manager, Microsoft
The overall conference programme and list of speakers make for a compelling event in my eyes. Over 2500 attendees are expected, and the conference should be a great mix of sessions and networking with many experts, partners and vendors in the Microsoft cloud space. 

Here's the link to the conference pricing page

Hopefully see you there!

Wednesday, 8 June 2022

SharePoint Content Assembly - hints and tips

In recent articles I've covered SharePoint Syntex from a few angles. Most recently in Automate creation of new documents with SharePoint Syntex Content Assembly we looked at exactly that, automated document creation using the scenario of role description documents, showing the end-to-end process of using Content Assembly. Here at Content+Cloud the scenario is a good fit for Syntex in our business because we have a large number of open roles, which equals a large number of documents, and they're all based on our common template for role descriptions. With Syntex Content Assembly we can simply create an item in a SharePoint list, run the process, and have a Word document created in our C+C branded format which combines the individual specifics for a new role with our standard content on benefits, pension, approach to hybrid work, office locations, and so on.

Having spent some time with Syntex Content Assembly, in this post I want to share some tips which might accelerate your understanding.

Syntex Content Assembly tips

Tip #1 - Understand the 1:1 relationship between your Syntex modern template and where it's created

Syntex Content Assembly is based on creation of a 'modern template', a new construct in Microsoft 365 and SharePoint which acts as the base template for the Word documents you are going to create. One aspect of Content Assembly which needs consideration is that today the template lives in the SharePoint document library you create it in - and nowhere else. There is a 1:1 mapping between your template and this document library: 

What this means is that if you'd like to generate documents from this template across your tenant (e.g. in different business units) you'll need to recreate the modern template in multiple locations. An alternative approach could be to centralise the creation process into one doc lib and then use Power Automate to 'send' the created document instances to where they need to be. Either way, this should be one of your first design decisions when working with Syntex.

Tip #2 - Syntex Content Assembly stores modern templates in the hidden 'Forms' area in the document library 

Building on the first tip, it's useful for more technical SharePoint people to understand where modern templates are stored in SharePoint's internals. The answer is they get stored in the hidden 'Forms' directory inside the document library you create the template in. You'll never see this in the SharePoint front-end, but using a tool like the SharePoint Client Browser allows you to see this - specifically a subfolder created within 'Forms' which has the name of your template with spaces removed. In here you'll see the .docx or .pptx file for your template:  

Armed with this knowledge, if you have access to SharePoint development skills it's certainly possible to create a single Syntex modern template and use it in multiple locations across your tenant. Use of SharePoint file/folder APIs is the key (perhaps with PnP or a SharePoint migration tool to help), but you could nominate one location as a master and ensure the template is synchronised to other locations. To put this in context, one example could be if your organisation has multiple countries and each should use the same role description template - by synchronising updates to the modern template with each of the locations, you can effectively use one shared template globally. 
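
As a rough illustration of the planning side of that synchronisation, the Python sketch below derives the template folder from the naming rule described above (a subfolder of 'Forms' named after the template with spaces removed) and pairs the master folder with each target. The site paths, library names, and function names are hypothetical, and the actual file copy (via SharePoint file/folder APIs or PnP) is deliberately left out:

```python
from posixpath import join


def template_folder(library_url, template_name):
    """Folder holding a modern template: <library>/Forms/<template name without spaces>."""
    return join(library_url, "Forms", template_name.replace(" ", ""))


def plan_sync(master_library, target_libraries, template_name):
    """Pair the master template folder with the matching folder in each target library."""
    source = template_folder(master_library, template_name)
    return [(source, template_folder(target, template_name)) for target in target_libraries]


# Hypothetical site and library paths, for illustration only.
plan = plan_sync(
    "/sites/hr-global/Role Descriptions",
    ["/sites/hr-uk/Role Descriptions", "/sites/hr-de/Role Descriptions"],
    "Role Description Template",
)
for source, destination in plan:
    print(f"{source} -> {destination}")
```

Each (source, destination) pair is then a unit of work for whatever copy mechanism you choose.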

Tip #3 - keep up with Syntex capability updates: things are changing! 

Content Assembly, like the rest of Syntex, is moving fast and it's worth monitoring the Microsoft 365 roadmap to see what's coming. As a great example, when I started writing this article (May 2022) one annoying limitation was that you couldn't have Content Assembly drop values into a table within a Word template. This was frustrating because it's common for a document template to have a table (or several) containing different values in each created document - our C+C role description document has core role details such as title, hours, department, reporting line etc. in a table for instance. When using Syntex Content Assembly a few weeks ago, we needed to reformat the document template to remove the tables because Syntex would give messages like this:

But no more!

Another great capability launched in the last few weeks is the ability to create PDF documents (not just Word) using Syntex Content Assembly - this came in late May/early June. So, stay on top of things by going to the Microsoft 365 roadmap and filtering on Syntex.

Tip #4 - deal with pre-requisites first: create the SharePoint list/taxonomy and have the document ready

Having been through the process a few times, one recommendation I have with Content Assembly is to ensure you have your prerequisites created and to hand before you go through the 'modern template' creation process. Not doing this is the equivalent of trying to book a holiday without your credit card or passport number to hand - you'll only get so far before realising you need to stop, gather up some things, and most likely abandon the process to restart later. 

With Syntex Content Assembly, you map placeholders in your document to the columns in the SharePoint list (or taxonomy terms) where the data will be pulled from - so it makes sense that in the mapping process the source data needs to exist for it to be selected. In the example below these are shown in the right-hand panel by Syntex:

There are actually a couple of variants here - consider that in Syntex Content Assembly, the value to be dropped into a particular placeholder in your document can come from:
  • A particular column value in a SharePoint list item (likely to be the most common case)
  • A term in a SharePoint taxonomy term set
  • A one-off value entered manually by the user 
So what we're really saying is that in the first two cases, make sure the SharePoint 'thing' exists first. When using list items for your document creation process, creating the underlying SharePoint list first is important - even if you only create the list and define the columns but don't yet add any data. To put this in context, my SharePoint list for C+C roles looks like the image below - the columns on the right are multi-line fields with lots of detail on the specific role so I've blurred those out, but hopefully the consideration is clear:

In summary, make sure the list and columns (or taxonomy term set) exist before creating the Syntex modern template so that you can map to them.
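
One way to sanity-check this before starting the template wizard is to write down your planned placeholder-to-column mapping and compare it with the columns that actually exist. The Python sketch below does that with plain dictionaries and lists; the placeholder and column names are hypothetical, and in practice you would read the column names from your list settings or an export:

```python
def missing_sources(placeholder_to_column, list_columns):
    """Return the placeholders whose mapped column does not exist in the list yet."""
    available = {column.lower() for column in list_columns}
    return sorted(
        placeholder
        for placeholder, column in placeholder_to_column.items()
        if column.lower() not in available
    )


# Hypothetical mapping and column names, for illustration only.
planned = {
    "Role title": "Title",
    "Department": "Department",
    "Reporting line": "ReportsTo",
}
existing_columns = ["Title", "Department", "Hours"]

gaps = missing_sources(planned, existing_columns)
print(gaps)
```

Anything that shows up in `gaps` is a column to create before you begin mapping placeholders.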

Tip #5 - avoid SharePoint rich HTML fields with Syntex Content Assembly

There are a few 'compatibility' considerations, but one in particular I'd like to call out is that SharePoint fields containing HTML (e.g. for rich formatting such as bullet lists, tables, font styles etc.) won't come over to your document well. Your formatting will be lost and you'll see raw HTML in your Word document like this:

To avoid this, ensure any multi-line SharePoint fields are plain text only:
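
If your list already holds rich text values, one option is to flatten them to plain text before they are used with Content Assembly. The Python sketch below shows the idea using only the standard library; it is an assumption about how you might clean the data yourself, not something Syntex does for you, and the class and function names are illustrative:

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text content, starting a new line at common block-level tags."""

    BLOCK_TAGS = {"p", "div", "br", "li", "tr"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_plain_text(html):
    """Strip tags, decode entities, and collapse whitespace line by line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


# Hypothetical rich text field value, for illustration only.
rich = "<div><p>Key <b>benefits</b>:</p><ul><li>Pension</li><li>Hybrid&nbsp;working</li></ul></div>"
plain = html_to_plain_text(rich)
print(plain)
```

Bullet and table structure is lost in the process, so this suits fields where the wording matters more than the layout.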
Other restrictions include:
  • Only Word/PDF are supported for now - no PowerPoint or Excel
  • Your Word template cannot have comments or Track Changes enabled
  • Content controls in Word (remember those?) are not supported
  • Images cannot be dropped into the document - only text
In addition to the Microsoft 365 roadmap for high level details, see the 'Current release limitations' section of the Microsoft documentation on Syntex Content Assembly to keep up with these constraints and new capabilities as Syntex evolves.


Despite being a relatively new technology, Syntex doesn't have too many foibles. Microsoft are putting advanced process automation capabilities into the hands of every (licensed) Microsoft 365 and SharePoint user here, and the Content Assembly feature ticks the box of generating new documents as the counterpart to the core Syntex ability of reading, understanding, classifying, and extracting key information from documents. On the document generation side, in addition to the lower-level constraints listed above, it's worth remembering of course that the document creation process still involves a couple of clicks - we still don't have end-to-end automation without human involvement. However, we can be sure that will appear on the roadmap soon, most likely through Power Automate integration. Syntex has a bright future as an automation enabler which is relevant to almost every sector and organisation - understanding how to approach Syntex and some of the implications of the model is important to get the most value. Hopefully this post has been useful.

Tuesday, 10 May 2022

Automate creation of new documents with SharePoint Syntex Content Assembly

With the recent Content Assembly feature, SharePoint Syntex can now create templated documents as part of an automated process - all native to Microsoft 365. This is an important part of intelligent automation since so many processes involve a document - in the consumer world your mortgage documents, an insurance quote, a letter confirming a policy change and so on. While financial and ERP systems often take care of invoices, sales orders and purchase orders, the business world revolves around proposals, contracts, safety reports, compliance certificates, checklists, affiliate agreements and many other types of standard document. I know what you're thinking - sure, there have always been ways to automate document creation, but custom code or a 3rd party product have typically been needed. Often these solutions have been very expensive indeed, such as some of those for intelligent proposal generation. 

Another significant thing here is that I don't remember ever seeing a solution 100% baked into the core Microsoft experience in such an intuitive way - that means potentially everyone in the organisation can make use of automated document creation, and without training on specialist tools. It's this combination of things that makes me believe that what Microsoft are doing with Syntex (and the journey they're on with Content Services in general) is extremely significant in the market.

Scenario used in my example

Before I dive into showing how Syntex Content Assembly works, some details on my scenario - I want to easily create documents for different open roles here at Content+Cloud. We have 100+ open roles at the moment and these documents are used in internal hiring and to brief external agencies, so simplifying the creation of these documents and reducing the copying and pasting that people currently do is a big win. The documents combine variable content (the role details) with standard content (our benefits and working environment etc.). Using Syntex Content Assembly, we can automate the document creation and Syntex ensures the respective details for the role are dropped into placeholders in the template.

Let's start with the fundamentals.

Syntex Content Assembly ingredients
Understanding how Content Assembly works starts with understanding the two main ingredients needed:
  • A SharePoint list - this will contain an item for each document instance to create, with the values to drop in
  • A 'modern template' - this is an Office document specially created with Syntex support, providing the outer template for the document

Creating a Syntex modern template

Step 1 - creating the list

Since you'll automate the creation of documents from items in a SharePoint list, the first thing to do is to define and create that list. In my case it's a list for our open roles and I've added SharePoint columns for the variable pieces of information which will be dropped into the documents:  

It's very important to have this list available and ready (at least its structure and columns) before going to create the modern template. This is because the template creation process involves you picking the columns and specifying the location in the document where this information should display.

Step 2 - creating the template itself

To do this you'll need a user with a Syntex license. A licensed user will see a new option in a SharePoint document library to create a modern template:

Have your Word document ready when you click the item above - but note that it can be a "filled" version of your document rather than a specially prepared template. You'll upload it from your PC rather than select it from an existing SharePoint location:

Once your document has been selected we go to the next phase.
Adding content placeholders to the template
At this point you'll see a panel on the right which will help you drop the placeholders into the document template. Simply highlight some text and then the 'add placeholder' control appears on the right:

The placeholder control asks you to name the placeholder and specify what should be dropped into it when new documents are created from the template. There are two options here:

Text shown | Use when
Enter text or select a date | The user should enter some new text as each document is created from the template
Select from choices in a column of a list or library | The text should be pulled from the list item as each document is created from the template

You'll probably use list item values for the vast majority of placeholders in your documents - adding text ad-hoc may be useful sometimes but the real power is in more automated creation from your structured data. Let's look at each in more detail:

Option 1 - entering an ad-hoc value

When creating the template, you'll specify what kind of data can be used for this value, but note this isn't adding any kind of column to the list containing your list items - it just constrains the data that can be entered:

When using the template, a control is presented to the user (matching the data type) to allow the ad-hoc value to be entered:

Option 2 - selecting from a list item

Using this approach, when creating the template you'll browse to the SharePoint list you're using and choose the column which holds values for this 'field' in the document:
First choose the list:
...and then the column:
Once selected you'll see confirmation of the selected column, including if you come back in later to edit:
When using the template, once the user has selected an item in the list for the document instance being created, Syntex will automatically fill in this placeholder from your list item - as it will do for ALL of your placeholders in the document. A placeholder which is bound in this way can't be overtyped, but that makes sense because you are creating the document from a 'record' in your list:
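
To summarise the two placeholder options as a mental model, here is a small Python sketch. It is purely conceptual - it is not how Syntex is implemented, and the class, function, and sample names are assumptions - but it captures the rule that a bound placeholder always takes its value from the selected list item, while an ad-hoc placeholder takes whatever the user enters:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Placeholder:
    name: str
    column: Optional[str] = None  # set when bound to a list column; None means ad-hoc entry


def fill_placeholders(placeholders, list_item, manual_values):
    """Resolve each placeholder from the list item if bound, else from user-entered values."""
    resolved = {}
    for placeholder in placeholders:
        if placeholder.column is not None:
            # Bound placeholders always use the list item value (no overtyping).
            resolved[placeholder.name] = list_item[placeholder.column]
        else:
            resolved[placeholder.name] = manual_values[placeholder.name]
    return resolved


# Hypothetical placeholders and data, for illustration only.
placeholders = [
    Placeholder("Role title", column="Title"),
    Placeholder("Hours", column="Hours"),
    Placeholder("Closing date"),  # ad-hoc value entered by the user
]
item = {"Title": "Data Engineer", "Hours": "Full time"}
manual = {"Closing date": "2022-06-30", "Role title": "ignored for bound placeholders"}

values = fill_placeholders(placeholders, item, manual)
print(values)
```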

Finalising the template
The template creation process from here is essentially repeating the above steps for each placeholder you need in your template. For each placeholder, either select that the value should be entered ad-hoc or (more likely) pulled from the list item. Once this is done you'll have lots of placeholders in your template, with each one listed on the right-hand side showing where the value will come from:
At this point the template can be published. Hit the publish button and see the confirmation:
So once we have the template in place, what does creating a document look like?

Creating documents using Syntex Content Assembly

At this point all of the one-off, upfront work to create the template is done - so creating new document instances from the template is quick and easy. No more copy and paste between documents! The critical point to note here is that currently document creation does involve a manual step. I think we can be sure that Power Automate actions will be here very soon to provide end-to-end automation, but for now here's how the process works - your Syntex modern template appears on the 'New' menu for your SharePoint document library:
Selecting the menu option takes you into creating a document instance from your template:
You can give the document a name (the default is the template name) using the editable name highlighted with the red arrow. On the right-hand side, all the placeholders are currently empty - the only reason the document in the image below has content is because in this case my template has content in it (as opposed to empty placeholder values). In any case, they will be overwritten in the next step. To create a document using one of my items in the SharePoint list, I simply choose a field (most likely the title field) and I see a picker of my list items:
Once an item is selected, all values are pulled from the list item and dropped into the corresponding placeholder in the document. You'll need to hit the 'Refresh to preview' button:
The document is now created!

As you'd expect, it lands in the document library where you created and applied the modern template:

So what do we take away from this?

The significance of Syntex Content Assembly
What we're seeing here is another big step in the democratization of automation solutions. Sure, there's some complexity as you get used to Syntex Content Assembly, but for the first time it's completely native and baked into Microsoft 365 (specifically the SharePoint document library interface) and doesn't require any coding, scripting, or complex configuration.

Every single organisation today has people copy/pasting into documents - statements of work, project plans, legal agreements, sales pitches, supplier/partner/employee contracts, role descriptions, and many more. In the best-case scenario, a lot of time is being wasted. In the worst case, human error can creep in and in some cases can have a serious impact on work and people. With Syntex Content Assembly, Microsoft is making a serious play in document automation for the masses - and a segment of specialised product vendors are likely to be worried.

Today there are quite a few limitations. A couple of clicks are needed to create the document (so arguably not full automation), there's a one-to-one relationship between the template and a document library, and for now it's only Word. Additionally, in my experience Syntex is a bit finicky in terms of what can be in your Word template. These are worth discussing in more detail, so I'll cover these and more in my next post:

Monday, 28 February 2022

I'm speaking at the European Power Platform Conference 2022 in Berlin - RPA and Power Automate

The rise of low-code business applications continues, with industry analysts such as Gartner predicting that by 2025 70% of new applications developed by enterprises will use low-code or no-code technologies, and Microsoft claiming that 92% of Fortune 500 clients are now using Power Apps to provide tools to their employees and customers. With this backdrop I'm excited to be speaking at the inaugural European Power Platform Conference this year, held in April in Berlin.

The conference is from the same team behind the European SharePoint, Office 365 and Azure Conference (ESPC), one of the biggest events in the Microsoft cloud space across Europe for the last few years and one which I've enjoyed speaking at for the last 6 or 7 years. I've no doubt that the Power Platform event has emerged to fulfill the huge demand for quality learning and guidance on best practices and lessons learnt which are specific to the Power Platform family. The surface area and capability of the platform have expanded massively in recent years, and in each release wave Microsoft are driving innovation hard in each of the areas depicted below:

The European Power Platform Conference event

The headlines of the event are:
  • When - April 6-8 2022
  • Where - Berlin, Germany
  • Language - English
  • Who - lots of community experts, Microsoft MVPs, and keynotes from Charles Lamanna (Microsoft Corporate Vice President, Business Applications & Platform) and Marios Stavropoulos (co-founder and former CEO of Softomotive, now at Microsoft as Partner, Microsoft Power Automate following the Softomotive acquisition)
  • Pricing - tickets from €695, see the conference pricing page 

My session - Automate the Impossible with RPA and Power Automate Desktop

My session focuses on RPA (Robotic Process Automation) and uses a fun scenario to demonstrate serious technology. The abstract is:

Automate the Impossible with RPA and Power Automate Desktop

An alternative title for this session could be “How I frustrate my kids' ambitions of unauthorised gaming through automation!” Robotic Process Automation (RPA) has been around for a while but is now part of Microsoft’s Power Platform and easier to use than ever. Power Automate Desktop allows “island” or legacy applications and systems without an API to be automated – even my home Wi-Fi settings through my ISP’s terrible portal in my case.

In this session you’ll learn how to get started with automating web applications using Power Automate Desktop. We’ll focus on how to interact with web sites like a human with a mouse and keyboard does – using navigation links, entering text into textboxes and clicking buttons. You’ll learn the difference between web automation and UI automation in Power Automate Desktop, and how to decide which approach is best. Finally, we’ll cover how to integrate with Alexa at home, how to schedule your processes and discuss licensing considerations. The automation wave is about to hit – make sure you’re part of it!

For more details see:

 Automate the Impossible with RPA and Power Automate Desktop

Conference link

Hope to see you there! Here's that link to the conference site again:

Wednesday, 2 February 2022

SharePoint Syntex AI - my top 5 real-world tips

SharePoint Syntex is the AI-powered ability for Microsoft 365 to learn and understand your documents, enabling knowledge to be extracted and automated processes to be implemented around your content. Building on earlier work I did, I've worked on a couple of Syntex implementations recently, one for a client and one internal to us at Content+Cloud. Putting Syntex into action in the real world is quite different to general experimentation, and in this post I want to share some learnings as I've worked with different scenarios. Before that, if you're looking for some fundamentals content on Syntex some of my earlier articles might be useful:
Syntex is an evolving technology and although Microsoft have worked hard to democratise the AI and put it in the hands of non-developers, there's definitely a learning curve and the constructs used in the training of AI models take some time and experience to get the most from. It starts with making the right choice between Syntex's two approaches (as discussed in choosing between document understanding and form processing) and then requires a good understanding of the constructs and some trial and error (including interpreting the feedback received) during the AI training process.

Background - real world scenarios these tips are based on

Before I go into the tips I think it's useful to understand the types of documents I trained Syntex to understand. The first is known as a 'method statement' and is used in construction and engineering in the UK - it's required by law in some scenarios but is also commonly used on practically any service task where risk has been identified. Examples I've seen include replacing a building's alarm system or even cleaning windows with a tall ladder. The method statement goes alongside a risk assessment, and the overall process is often known as RAMS (Risk Assessment - Method Statement). The method statement describes the 'safe system of work', detailing the control measures taken for the identified hazards and risks. Here are the first pages from a couple of examples - as you can see the formats are quite different, but that's OK for Syntex:

For these types of document, Syntex document understanding works best.

The other work I'm doing revolves around an Excel format we use commonly at Content+Cloud which looks like this:

As you can see this one is very tabular in nature, and unsurprisingly if you know Syntex I found that the forms processing model works best here.

My top 5 Syntex tips


1. To extract from Excel, convert to PDF first

Syntex forms processing actually doesn't directly support Excel files so a quick conversion to one of the supported formats (JPG, PNG or PDF) is needed - I recommend PDF to avoid the uncertainties of image processing. Power Automate offers a couple of simple ways to convert an Office file to PDF and because you call into Syntex forms processing through a Power Automate Flow you create, this is convenient all round. Some options are:
  • Move the file to OneDrive temporarily and use the 'Convert file' action with a target type of 'PDF', then copy or move back
  • Use a 3rd party conversion service which has a Power Automate connector. Encodian works great and is free for small volumes
To avoid a recursive loop you should add a check that the file being created is indeed an Excel file before converting. You'll need a check such as:

Content-Type is equal to 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'

Your file is now ready for Syntex processing.
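
To make the logic of that guard concrete, here's a minimal Python sketch. In a real flow this would be a Condition action rather than code, and the function name is my own invention for illustration - the only value taken from the flow is the Excel content type itself:

```python
# Hypothetical helper mirroring the flow condition described above.
# Only the content-type string comes from the flow; the names are illustrative.
XLSX_CONTENT_TYPE = (
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
)


def should_convert_to_pdf(content_type: str) -> bool:
    """Return True only for Excel workbooks.

    The converted PDF lands in the same library and fires the trigger again,
    so anything that is not an Excel file must be skipped to avoid a loop.
    """
    return content_type.strip().lower() == XLSX_CONTENT_TYPE


if __name__ == "__main__":
    print(should_convert_to_pdf(XLSX_CONTENT_TYPE))  # Excel file: convert it
    print(should_convert_to_pdf("application/pdf"))  # already a PDF: skip it
```

The key point is that the PDF produced by the conversion fails the check, so the flow stops after one pass.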

2. When working with tables, try forms processing first

Of the two Syntex approaches, document understanding often isn't well-suited to extracting content from tables. The main reason for this is that Syntex document understanding strips away all the structural and formatting elements of the document to leave just the raw content. Consider the following document:

What Syntex actually 'sees' for this document has all of the formatting removed - we see this when labelling documents in the training process:

As a result, extracting contents from a certain row or column in a table can be difficult in document understanding. You might be able to do something with before/after labels and a proximity explanation, but accuracy might vary. In these situations, it's worth trying a switch to forms processing. The model there has special support for processing tables and it's capable of detecting rows and columns in a table - this makes it possible to zero in on 'row 4, column 2' or whatever you need.

Of course, what comes with a change to forms processing is a completely different approach (with its Power Automate integration) and pricing. But, you might need to go this way if the content you're trying to extract is tabular.
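
As a rough illustration of why structure matters, here's a small Python sketch. It isn't how either Syntex model works internally - the table contents and function names are made up - but it shows the difference between addressing a cell in a structured table and hunting for the same value in flattened text:

```python
# Illustrative only: made-up table data, not real Syntex output.
table = [
    ["Task", "Owner", "Status"],
    ["Erect scaffold", "A. Smith", "Done"],
    ["Isolate power", "B. Jones", "Pending"],
    ["Replace panel", "C. Patel", "Pending"],
]


def flatten(rows: list[list[str]]) -> str:
    """Roughly what is left once structure is stripped: one run of text."""
    return " ".join(cell for row in rows for cell in row)


def cell_at(rows: list[list[str]], row: int, column: int) -> str:
    """Address a cell by 1-based row and column, as a table-aware model can."""
    return rows[row - 1][column - 1]


if __name__ == "__main__":
    print(flatten(table))        # rows and columns are no longer recoverable
    print(cell_at(table, 4, 2))  # 'row 4, column 2' is a direct lookup
```

In the flattened version there's nothing to say where one row ends and the next begins, which is why labels and proximity explanations end up doing the heavy lifting.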

3. Implement a 'proximity explanation' and train on more documents when extracting content in less describable formats

In Syntex document understanding models, when you create an extractor to recognise some content in a document and pull it out, it's worth understanding the following - unless specific steps are taken, Syntex is best at extracting content in known, describable formats. For example, an extractor you create may be accurate immediately if you're extracting a consistent format such as a postcode or a set of product IDs. However, training a model to recognise arbitrary data can be more difficult, even if it's in a consistent place in the document. This becomes an important aspect of working with Syntex and something to plan for.

Consider two scenarios: 
  • Extracting a product description (lots of arbitrary phrases)
  • Extracting a product ID in form XXX-123456-YY (distinct pattern)
In addition to the AI machine teaching you do when providing Syntex with some sample documents, the 'explanations' you create are vital here:

My tips for extracting more arbitrary formats are:
  • Implement a proximity explanation in conjunction with something just before or just after the content you're trying to extract 
  • Train on larger numbers of documents - don't just stop at five sample docs, provide Syntex with ten or more
  • Expect to spend a bit more time on these scenarios
With some tuning, you should be able to get the extractor accuracy to where you need it to be. If not, see tip #2 - setting up a quick test with Syntex forms processing to see if that's more effective is a good idea at this point.
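
To show what I mean by a 'describable' format, here's a short Python sketch using a regular expression. This is only an analogy rather than what Syntex does under the hood, and I'm assuming XXX-123456-YY means three capital letters, six digits and two capital letters:

```python
import re

# Assumed reading of XXX-123456-YY: 3 capital letters, 6 digits, 2 capital letters.
PRODUCT_ID = re.compile(r"\b[A-Z]{3}-\d{6}-[A-Z]{2}\b")


def find_product_ids(text: str) -> list[str]:
    """Return every product ID in the text, wherever it appears."""
    return PRODUCT_ID.findall(text)


if __name__ == "__main__":
    sample = "Supplied ABC-123456-XY, a compact unit suited to damp environments."
    print(find_product_ids(sample))
```

The product ID can be pinned down in one line, but there's no equivalent pattern for the product description that follows it - the only handle you have is what sits just before or after it, which is exactly where the proximity explanation comes in.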

4. Combine Syntex with formatting in SharePoint lists and libraries

Syntex and SharePoint JSON column formatting are a winning combination - I use them together frequently. Consider the case where you're extracting some content from documents with Syntex, and of course the values are extracted from the document into list item values - when some column formatting is applied we can really bring out the value of what Syntex has found.

From this:

To this:

We can now get an instant view across many documents at once:
  • Engagements with a higher value become obvious from the Excel 'data bar' style formatting on that column
  • The business manager is highlighted in bold, blue text
  • Any missing values from the project manager field are highlighted in red
In another example, I use very similar formatting but the emphasis is on highlighting the gaps across the documents (as shown by the red):

This provides an intuitive, instant picture of how complete the documents are according to what Syntex found in them. This can be a useful technique when Syntex is used with documents where it's important that certain information is present - which is commonly the case when you think about it.
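
For anyone who hasn't used column formatting, here's a minimal sketch of the kind of JSON involved, generated from Python so it's easy to tweak. It isn't the exact formatting from my screenshots - it assumes a plain text column, and the colour and 'Missing' label are my own choices - but it uses the standard column formatting schema and expression syntax:

```python
import json


def missing_value_format(colour: str = "#a80000", label: str = "Missing") -> dict:
    """Build column-formatting JSON that flags empty values in red.

    Assumes a single-line text column; a person column would need
    @currentField.title instead of @currentField.
    """
    return {
        "$schema": "https://developer.microsoft.com/json-schemas/sp/v2/column-formatting.schema.json",
        "elmType": "div",
        # Show a placeholder label when nothing was extracted for this item.
        "txtContent": f"=if(@currentField == '', '{label}', @currentField)",
        "style": {
            "color": f"=if(@currentField == '', '{colour}', '')",
            "font-weight": "=if(@currentField == '', 'bold', 'normal')",
        },
    }


if __name__ == "__main__":
    # Paste the printed JSON into the column's 'Format this column' pane.
    print(json.dumps(missing_value_format(), indent=2))
```

Generating the JSON this way is just a convenience - you can equally type it straight into the formatting pane in the list or library.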

5. Combine Syntex with Power BI reporting to measure document compliance

Expanding on the theme behind the formatting approaches shown above, many business processes rely on documents being completed to a certain standard - the document represents an important form of information exchange or capture. The RAMS review process I discuss above is a great example of this. I'll talk about that particular scenario in more detail in a future article, but as we're discussing general concepts today this idea of "Syntex + reporting" is, I feel, worthy of being a top Syntex tip.

Consider that for any process where information completeness is important, measuring and reporting on this is likely to be valuable. For my client's scenario where fully completed RAMS method statements are a guardrail to help ensure safe working, we implemented a Power BI dashboard to provide insights into what Syntex was finding. As you can see in the image below, this measures overall 'process compliance' levels, and measures such as the average fidelity of documents from individual subcontractor companies. This allows us to create a subcontractor 'leader board', helping the team understand where there may be greater risk and what to investigate:

This does rely on your Syntex AI models being accurate for the documents you are processing, but even without full accuracy the insights help guide you to where improvements can be made.
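
To give a feel for the kind of measure behind a dashboard like this, here's a simplified Python sketch. The real thing is built in Power BI and the field names below are hypothetical, but the idea is the same - score each document on how many required values Syntex found, then average per subcontractor:

```python
from collections import defaultdict

# Hypothetical column names for values Syntex extracts from each document.
REQUIRED_FIELDS = ["hazards", "control_measures", "supervisor"]


def completeness(doc: dict) -> float:
    """Fraction of required fields that hold a non-empty extracted value."""
    filled = sum(1 for field in REQUIRED_FIELDS if str(doc.get(field) or "").strip())
    return filled / len(REQUIRED_FIELDS)


def leaderboard(docs: list[dict]) -> list[tuple[str, float]]:
    """Average completeness per subcontractor, best first."""
    scores = defaultdict(list)
    for doc in docs:
        scores[doc["subcontractor"]].append(completeness(doc))
    averages = {name: sum(vals) / len(vals) for name, vals in scores.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample_docs = [  # made-up data for illustration
        {"subcontractor": "Alpha", "hazards": "Working at height",
         "control_measures": "Harness", "supervisor": "J. Doe"},
        {"subcontractor": "Beta", "hazards": "Dust",
         "control_measures": "", "supervisor": None},
    ]
    for name, score in leaderboard(sample_docs):
        print(f"{name}: {score:.0%}")
```

Subcontractors at the bottom of the list are the ones whose documents most often have gaps, so that's where to start investigating.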


Syntex is an amazing technology and while other AI-based document processing solutions have existed for a while (often vertical-specific, such as legal or contracts management solutions), the advantage Syntex has of being baked into your core collaboration platform is profound and opens many doors. With a good understanding of the constructs and capabilities together with some imagination, Syntex can be the foundation for some high value solutions. Hopefully this post provides some inspiration and useful tips!