Showing posts with label Syntex/SharePoint Premium. Show all posts

Wednesday, 12 July 2023

My real-world Microsoft Syntex demo videos - and a note on Copilot

I've written quite a lot about Microsoft Syntex AI on this blog and spoken about it at various conferences, and my firm view is that while the arrival of ChatGPT and Microsoft Copilots changes many aspects of how we work, neither fundamentally changes the value Syntex can bring to an organisation. I recently gave a talk with the snappy title "AI choices for apps and automation in 2023 - understanding AI Builder, Azure Cognitive Services, Microsoft Syntex, and Azure OpenAI" at the excellent European Power Platform Conference in Dublin, and as you might infer, this covered Syntex alongside other options in the Microsoft AI toolbox. Along with some other fun conversations, I had feedback from a couple of attendees along the lines of "those Syntex demos are great, how could we show our colleagues?" - so this prompted me to get the videos narrated and uploaded to YouTube. Here's a shot from the conference, by the way - it was a great event:


You can find these videos below. If you watch them, I'd suggest that listening to the voiceover or enabling captions will give you a better understanding of what's being shown. But while we're here, I think it's worth expanding on the relevance of Syntex today - does Syntex get disintermediated by ChatGPT and Microsoft 365 Copilot, given their ability to directly answer questions of your documents and data?

Syntex in the era of ChatGPT and Microsoft Copilots

AI has always had a wide variety of tools and approaches, but for many it's more confusing than ever now - notably, I find some execs are conflating ChatGPT with AI in general, rather than seeing a broader picture. Regarding Microsoft Syntex, it's certainly true to say generative AI like ChatGPT didn't exist a couple of years ago when Syntex was introduced - the AI landscape has truly changed dramatically in that time. So it's entirely valid to question how things fit together and whether the newer technologies remove the need.

In short, neither of those technologies diminishes the value Syntex AI can add to a business - if anything, they will be increasingly used together, as Microsoft have shown with the Syntex plugin for Microsoft 365 Copilot.

Syntex has unique capabilities when it comes to understanding what's in your documents and allowing you to derive insights and automate processes in which they are involved. Because Syntex is trained specifically by a human on a representative set of your key documents (machine teaching, not machine learning), it builds a greater understanding of the different variations, and ultimately a more granular comprehension of your documents. If you want to reliably detect (i.e. classify) which documents are which in your Microsoft 365 tenant - and this is powerful because contracts, board packs, proposals, statements of work, order forms, purchase orders, receipts, safety reports, risk assessments etc. are all very different documents used in different ways - Syntex is the technology to do that. Syntex is also the technology to extract the most relevant info from your documents, thus helping you to unlock data trapped inside documents and automate processes. The possibilities are endless, and Microsoft 365 and SharePoint have evolved to be an intelligent platform capable of so much more than simply allowing documents to be thrown in there.

The examples below try to give a flavour of this. They are both real-world scenarios from either inside my company (Advania/Content+Cloud) or work we've done with clients. While not shown in these videos, we also have a few more examples - such as implementing Syntex in the insurance industry to understand insurance policy documents and accelerate approvals. One powerful aspect of Syntex is that because it's AI rather than OCR (which looks for content in a specific place in the page or document, such as would be the case for invoices from one company), the AI has more intelligence and allows for a lot of variation within instances of the document. Consistency or uniformity between the documents is not necessarily required. 

Both videos have 1-2 mins of context setting before the demo starts.

Example 1: Syntex fundamentals - accelerating our SOW process

In this first example, we look at Microsoft Syntex AI fundamentals in a real-world process - bringing automation to our client engagement process at Advania/Content+Cloud. We run a lot of projects for our clients, and in each a Statement of Work is created to describe the work, its scope, the project costs, and more. Syntex helps accelerate the weekly pipeline review of upcoming projects and creation of the engagement documentation. Take a look:




Example 2: More advanced Syntex - automating a risk assessment process

Where the first example focuses on identifying Statement of Work documents in our tenant and analysing the documents to understand the project pipeline, this next example relates to risk assessments - in this case, AI is used to automate the first pass of documents received by subcontractors. Syntex reads the documents and if certain information is not found, it automatically rejects the submission and e-mails the subcontractor - attaching the original document along with a message detailing the missing information. A Power BI analytics dashboard provides insights on the process and overall compliance levels. In this case, Syntex AI is taking the burden off a human team and ensuring they focus on the more complex/higher risk cases:
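To make the first-pass logic concrete: the check described above boils down to "extract fields, compare them against a required list, and reject with a message if anything is missing". The Python sketch below illustrates only that decision step. It is not Syntex or Power Automate code, and the field names, the `REQUIRED_FIELDS` list, the `Decision` class and the `first_pass` function are all hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical field names - the real ones depend on the extractors in your model.
REQUIRED_FIELDS = ["Start date", "Site address", "Assessor name"]


@dataclass
class Decision:
    accepted: bool
    missing: list = field(default_factory=list)
    message: str = ""


def first_pass(extracted: dict) -> Decision:
    """Reject a submission if any required field is absent or blank."""
    missing = [
        name for name in REQUIRED_FIELDS
        if not str(extracted.get(name) or "").strip()
    ]
    if not missing:
        return Decision(accepted=True)
    # This text stands in for the e-mail sent back to the subcontractor.
    message = (
        "Your risk assessment could not be accepted because the following "
        "information was not found: " + ", ".join(missing) + "."
    )
    return Decision(accepted=False, missing=missing, message=message)
```

In the real solution the rejection message would be e-mailed with the original document attached, and only submissions that pass this step would reach the human team.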



Summary

Hopefully those videos give you more insight into how Microsoft Syntex works and how it gets used in practice. As more emerges on Microsoft 365 Copilot (still not expected until end 2023 for General Availability), it's becoming increasingly clear that it is designed to work alongside Syntex rather than replace it. Any time a deeper, more granular understanding of your documents is required than is supported by Microsoft 365 Copilot - and there will be many such scenarios as you pursue your AI and automation goals - Syntex will still be needed to provide that in-depth, process-specific capability. Copilot will be able to call into Syntex if you have it, unlocking more scenarios such as asking questions of a document or asking Copilot to summarise the document based on Syntex capabilities. In other words, another example of the "better together" product story that Microsoft work hard on.

You didn't think Microsoft were going to cannibalise one of their own products unnecessarily did you? :)

Sunday, 27 November 2022

Microsoft Syntex - December 2022 update and compiled articles

Microsoft Syntex has been one of Microsoft's biggest announcements in 2022 - which can be somewhat confusing because it existed previously as SharePoint Syntex since 2020 - but Syntex is expanding massively from "AI that understands your documents so you can automate processes" to an entire suite of advanced capabilities related not just to your documents, but also your images and videos. Some of the bigger recent or forthcoming features include document eSignatures, annotations, image recognition, automated document summaries and translations, auto video transcription and much more - many of which were announced at Microsoft's Ignite conference in October 2022. I'm hearing Microsoft folks say that "Syntex could be as big or bigger than the Power Platform in time", which is an interesting thought given the impact that has had.

Over the last 2 years I've been writing a lot about Syntex on this blog and thought it would be a good time to do three things:
  • Provide a 'Syntex on a page' round-up of the current and future capabilities    
  • Provide links to my Syntex articles from one place
  • Provide links to a couple of YouTube videos demonstrating Syntex in action from conference talks I've given
This article provides these.

Syntex on a page - December 2022


If you're confused about what's in Syntex and what's coming, I like to put things into these buckets:
  • Content understanding and processing - using AI to understand your documents and automate something 
    • Example - find all risk assessments missing a start date and contact the supplier organisation
    • Example - read an insurance contract and save the policy details to a database
  • Content assembly - automated creation of new documents
    • Example - generate a new contract for every new starter
  • Content management and governance - premium document management capabilities 
    • Example - send a completed contract for eSignature (like DocuSign or Adobe Sign but fully integrated into Microsoft 365) and move location once complete
    • Example - automated detection of pay review documents so specific security or policies can be applied without manual tagging 
The image below (click to enlarge) shows what's in Syntex today and what's in the roadmap:





Hopefully that helps position the today/tomorrow capabilities somewhat.
 

Compilation of my articles

I’ve been slowly creating a back catalogue of Syntex articles as I research, learn, and write about the technology. I’ve covered concepts such as training Syntex to read and understand documents, extracting data from forms, automating the creation of new documents, using Syntex in a fully automated process (Straight Through Processing) and various hints and tips articles. Here's a list of Syntex articles which might be useful, starting with the fundamentals and moving into more advanced topics:
Note that there have been some renames along the way, and some of those articles might contain the old names. Here are some examples:
  • SharePoint Syntex -> Microsoft Syntex
  • Document understanding -> Unstructured document processing
  • Forms Processing -> Structured document processing
Hopefully the links above are useful on your Syntex learning journey.

My Syntex videos on YouTube

Seeing Syntex in action can bring it to life a lot more than reading about it. This link points to a couple of the demos I've shown at conferences with a talkover for YouTube: 

Reminder - why is Syntex important?

One way or another, automation will always be a theme of many of the I.T. projects undertaken in the next few years. The trend is increasing, with analysts predicting a $30b market by 2024 (IDC) and Gartner saying 60% of organisations are pursuing four or more automation initiatives. There are many technologies in the space, but Microsoft Syntex changes the game because advanced AI and document automation tools are now baked into the core productivity platform used by 91% of the world’s top businesses (i.e. Microsoft 365) - inexpensive, readily available and democratised for every business.
 
It's no surprise Microsoft are investing heavily here. Every organisation has thousands of processes needing human input. If you consider Financial Services as just one industry, banks deal with applications for loans, mortgages, credit cards, loyalty schemes and many other products. Insurance companies quote, sell, and renew policies for home, car, travel, pet cover and more – and those are just the obvious products and services. Zooming out, every industry you can think of has an entire ecosystem of processes that can be optimised.
 
As Syntex powers are amplified and more capabilities are added, spending time evaluating what Syntex can unlock is likely to be valuable for many organisations. 

Wednesday, 27 July 2022

Identifying Syntex use cases - how the SharePoint Syntex assessment tool can help

If you're a regular reader of this blog, you'll know I'm a big advocate of what Microsoft are doing with SharePoint Syntex. In short, Syntex brings intelligent automation to every organisation for processes which involve documents - and in most businesses today, that's a substantial proportion of processes. We're in the middle of a big shift in technology where cloud power is commoditising advanced AI and automation capabilities so that they are no longer restricted to expensive, specialist, and often industry-specific tools. Examples of this include the legal and engineering sectors, in which it has been common to invest in specialist proposal generation and contract automation software, but usually at significant cost. Instead, these AI and automation tools are now baked into core platforms such as Microsoft 365 and democratised so that 'ordinary' employees can tap into them - add-on licensing might be required, but these tools have never been so widely available.

As market awareness grows, Syntex is featuring in quite a few of my client conversations at the moment. Organisations are considering how this new tool could help them, and in common with other innovative technologies, one of the challenges is identifying use cases within the business where Syntex could have a big impact. I have a lot of thoughts on this in general, but one thing Microsoft have done to help is release the Microsoft 365 Assessment Tool (gradually replacing what was the PnP Modernization Scanner) which now has a 'Syntex mode' specifically - this can be used to assess your tenant and usage for Syntex automation opportunities. The idea is that by scanning your SharePoint landscape and IA for certain characteristics, this could uncover areas of the business using SharePoint in certain ways where Syntex could help. In reality, Syntex really shines where documents are part of a complex or time-consuming process - and a tool can only go so far in identifying that. But the idea has merit, so let's explore what the tool provides and how it's used.

Later, we'll also consider a more rounded approach to identifying document automation and Syntex opportunities.

What the Syntex Assessment Tool provides

Once you've done the work to install and configure the tool (covered below), an assessment is run to scan your Microsoft 365 tenant in 'Syntex adoption' mode. This launches a scanning process across your entire SharePoint Online estate which, depending on tenant size, will take some time. Once execution is complete, a Power BI report is created as the output - allowing you to slice and drill around your data in later analysis. The theme of the tool is to identify areas of 'SharePoint intensity' - examples include your largest document libraries or document libraries where custom columns and/or content types have been created. Other insights include your most heavily used content types and libraries with names matching common Syntex usages (e.g. invoices and contracts). The full list of report elements and descriptions from the tool is:

  • Libraries with custom columns - Identify libraries where Syntex can automatically populate columns, improving consistency
  • Column usage - Identify patterns of column usage, to target Syntex models where they will have the maximum benefit 
  • Libraries with custom content types - Identify libraries using custom content types, where Syntex models can be used to automatically categorize files. 
  • Content type usage - Identify patterns of content type usage, to target Syntex models where they will have the maximum benefit
  • Libraries with retention labels - Identify libraries where retention labels are used, where Syntex can be used to automate and improve consistency
  • Library size - Identify large libraries where classification and metadata can improve the content discovery experience
  • Library modernization status - Identify libraries which may need to be modernized to fully make use of Syntex
  • Prebuilt model candidates - Identify libraries where names or content types suggest a prebuilt model could be applied
  • Syntex model usage - Review the current use of Syntex models in your site
So that's an overview, but it's more helpful to look at the results of running the tool - the sections below dive into the output, and then towards the end we'll zoom out again to consider the role of the tool overall.

Looking at real-life results - the Power BI report from my tenant

The screenshots below show the report output from one of my tenants - this isn't a production tenant but does have 1000+ sites and several years of activity.

Assessment overview

Provides an overview of the assessment run you performed, covering how many sites were processed successfully vs. any failures. I had 15 failures out of 1169 site collections for example:

 

Libraries with custom columns

A fairly useful indicator of 'SharePoint intensity', because if lots of columns have been created it shows that tagging/metadata is important here. This could indicate that having Syntex extractors automatically tag each document could be powerful. I have 574 such libraries in my tenant:

 

Column usage

Similar to the above, but focused on re-use of your custom columns and most common custom column types:

 

Libraries with custom content types

Again, a potential sign that files here are important because the library is using custom content types:

 

Content type usage

Gives you insight into your top content types - how many lists each one is applied to, how many items are assigned to the content type etc.

 

Libraries with retention labels

You might ask 'how are retention labels relevant to Syntex'? Remember that a key Syntex capability relates to information governance - the ability to automatically recognise potentially sensitive documents from their contents (e.g. contracts, CVs, NDAs, HR documents etc.) and ensure they are retained (or disposed of) with appropriate compliance. Since this is a non-production tenant I don't have too much of this, but you may do:

 

Library size

Again, knowing where your biggest libraries are can help you understand SharePoint hot spots, where many documents potentially relate to a process:

 

SharePoint modernisation status

This one is less directly connected to Syntex (relating as it does to the modern/classic status of the library in general), but relevant because Syntex can only be used on modern libraries. If you find important libraries still in classic status, you'll need to modernise them for the Syntex options to show up:


Prebuilt model candidates

Syntex ships with prebuilt AI models for receipts and invoices. This report element is simple but can be highly effective - essentially, 'find all the libraries in my tenant which have receipt or invoice in the name'. Most likely your production tenant *will* have this content somewhere, and Syntex could help provide insights or automate processes here:
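The 'receipt or invoice in the name' idea is simple enough to sketch in a few lines of Python. This is only an illustration of the heuristic as described here, not the assessment tool's actual implementation (its real matching rules may well differ), and the function name and sample library names are hypothetical.

```python
import re

# Keywords taken from the description above; the tool's real rules may differ.
KEYWORDS = ("receipt", "invoice")
_PATTERN = re.compile("|".join(KEYWORDS), re.IGNORECASE)


def prebuilt_model_candidates(library_names):
    """Return the library names that mention a receipt or invoice."""
    return [name for name in library_names if _PATTERN.search(name)]


# Made-up library names, purely for illustration.
libraries = ["Supplier Invoices 2022", "Board packs", "Expense receipts", "HR"]
print(prebuilt_model_candidates(libraries))
```

Run against the made-up names above, this picks out "Supplier Invoices 2022" and "Expense receipts" and ignores the other two.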

 

Syntex model usage

This last page in the report gives insight into any existing Syntex usage in your tenant. In my case I have 8 models, and because none of them have executed recently, the number of items classified in the last 30 days shows as 0:

So that's the tool output. Now let's turn our attention to how to run it in your tenant. 

Running the Microsoft 365 assessment tool

The tool itself is command-line based, hosted on GitHub, and comes in Windows, macOS and Linux flavours. Here's what you'll need:
  • A machine to run the tool
  • The tool downloaded from GitHub - see Releases · pnp/pnpassessment · GitHub
  • To register an AAD app with certificate-based auth - you'll register this in your tenant to allow the tool read access to your sites and workflows
The tool itself has a mode to help create the self-signed cert and get the AAD app registered. The command to do this is detailed on the Authentication page in the documentation. Most likely you will want to register a dedicated app for this tool rather than piggy-back on something else, because any SPO throttling will then be first restricted to this app rather than a critical production solution you have. 

The permissions required may need some thought because the 'optimal' permissions (which are needed for a full assessment) for application scope are:
  • Graph: Sites.Read.All
  • SharePoint: Sites.FullControl.All
The tool can perform a less complete audit with more restrictive permissions however - you'll get a less informative report with some sections missing, and whether that provides enough decision-making info to you is for you to decide. All of this is documented on the Permission Requirements page in the docs.

Once you're set up with authentication, it's a matter of running the tool in Syntex mode with the --syntexfull flag (if that's the assessment type you're able to run):

microsoft365-assessment.exe start --mode syntex --authmode application `
    --tenant chrisobrienXX.sharepoint.com --applicationid [AAD app ID] `
    --certpath "[cert path and thumbprint]" `
    --syntexfull
 
Your Power BI report will emerge once the tool has trawled through your SharePoint estate.
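If you'd rather script the run than type the command by hand, the sketch below builds the same argument list in Python. It uses only the flags that appear in the command shown above. The `build_syntex_assessment_args` helper, the `contoso` tenant name and the idea of dropping `--syntexfull` when you can't run the full assessment are my own assumptions for illustration, so check the tool's documentation before relying on them.

```python
def build_syntex_assessment_args(tenant, application_id, cert_path, full=True):
    """Build the argument list for a Syntex-mode assessment run."""
    args = [
        "microsoft365-assessment.exe", "start",
        "--mode", "syntex",
        "--authmode", "application",
        "--tenant", tenant,
        "--applicationid", application_id,
        "--certpath", cert_path,
    ]
    if full:
        # Only add this if you have the permissions for a full assessment.
        args.append("--syntexfull")
    return args


args = build_syntex_assessment_args(
    tenant="contoso.sharepoint.com",         # placeholder tenant name
    application_id="<AAD app ID>",           # placeholder, as in the command above
    cert_path="<cert path and thumbprint>",  # placeholder, as in the command above
)
print(" ".join(args))
# To actually launch the tool, pass the list to subprocess.run(args, check=True).
```

Building a list rather than one long string avoids the quoting and line-continuation issues that tend to creep in when a command like this is copied between shells.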

Conclusions

So we've seen how the tool is used and what the report provides. But what value does it provide in the real world?

My recommendation - use the tool as ONE input
It was a great idea to extend the Microsoft 365 assessment tool for Syntex, and I fully agree that there are certain indicators of SharePoint use that align strongly with Syntex. However, the tool is no substitute for real process mining in your organisation - and I've no doubt the tool creators (the Microsoft PnP team) take the same view. My recommendation is to use the tool if you can, but perhaps think of it as background research before talking to the business. When working with a client to identify possible scenarios to automate with Syntex it's useful to talk to different teams and functions. I like to ask questions like these to uncover situations where Syntex automation could add high value:
  • Which document types are most important to the business? Why?
  • What types of documents do you have which are time or labour intensive (for people to read and process or create)?
  • What types of documents do you have which have a significant process around them?
  • What types of documents do you create in large volumes?
  • Which documents are part of a transmittal or submittal process? In other words, which documents are exchanged with other parties rather than spend their entire lifecycle within your organisation?
  • Which documents contain sensitive information, and should therefore potentially have information protection policies applied to them to support compliance?

Hopefully this analysis of the Syntex assessment tool has been useful. Syntex is a powerful tool to bring automation to an organisation's critical processes and we're going to see a lot more of it in the future.

Tuesday, 10 May 2022

Automate creation of new documents with SharePoint Syntex Content Assembly

With the recent Content Assembly feature, SharePoint Syntex can now create templated documents as part of an automated process - all native to Microsoft 365. This is an important part of intelligent automation since so many processes involve a document - in the consumer world your mortgage documents, an insurance quote, a letter confirming a policy change and so on. While financial and ERP systems often take care of invoices, sales orders and purchase orders, the business world revolves around proposals, contracts, safety reports, compliance certificates, checklists, affiliate agreements and many other types of standard document. I know what you're thinking - sure, there have always been ways to automate document creation, but custom code or a 3rd party product have typically been needed. Often these solutions have been very expensive indeed, such as some of those for intelligent proposal generation. 

Another significant thing here is that I don't remember ever seeing a solution 100% baked into the core Microsoft experience in such an intuitive way - that means potentially everyone in the organisation can make use of automated document creation, and without training on specialist tools. It's this combination of things that makes me believe that what Microsoft are doing with Syntex (and the journey they're on with Content Services in general) is extremely significant in the market.

Scenario used in my example

Before I dive into showing how Syntex Content Assembly works, some details on my scenario - here I want to easily create documents for different open roles here at Content+Cloud. We have 100+ open roles at the moment and these documents are used in internal hiring and to brief external agencies, so simplifying the creation of these documents and reducing the copying and pasting that people currently do is a big win. The documents combine variable content (the role details) with standard content (our benefits and working environment etc.). Using Syntex Content Assembly, we can automate the document creation and Syntex ensures the respective details for the role are dropped into placeholders in the template.

Let's start with the fundamentals.

Syntex Content Assembly ingredients
Understanding how Content Assembly works starts with understanding the two main ingredients needed:
  • A SharePoint list - this will contain an item for each document instance to create, with the values to drop in
  • A 'modern template' - this is an Office document specially created with Syntex support, providing the outer template for the document


Creating a Syntex modern template

Step 1 - creating the list


Since you'll automate the creation of documents from items in a SharePoint list, the first thing to do is to define and create that list. In my case it's a list for our open roles and I've added SharePoint columns for the variable pieces of information which will be dropped into the documents:  

It's very important to have this list available and ready (at least its structure and columns) before going to create the modern template. This is because the template creation process involves you picking the columns and specifying the location in the document where this information should display.

Step 2 - creating the template itself


To do this you'll need a user with a Syntex license. A licensed user will see a new option in a SharePoint document library to create a modern template:

Have your Word document ready when you click the item above - but note that it can be a "filled" version of your document rather than a specially prepared template. You'll upload it from your PC rather than select it from an existing SharePoint location:

Once your document has been selected we go to the next phase.
Adding content placeholders to the template
At this point you'll see a panel on the right which will help you drop the placeholders into the document template. Simply highlight some text and then the 'add placeholder' control appears on the right:

The placeholder control asks you to name the placeholder and specify what should be dropped into it when new documents are created from the template. There are two options here:

Text shown | Use when
Enter text or select a date | The user should enter some new text as each document is created from the template
Select from choices in a column of a list or library | The text should be pulled from the list item as each document is created from the template

For the most part you'll probably be using list item values for the vast majority of values dropped into your documents - adding text ad-hoc may be useful sometimes but the real power is in more automated creation from your structured data. Let's look at each in more detail:

Option 1 - entering an ad-hoc value

When creating the template, you'll specify what kind of data can be used for this value, but note this isn't adding any kind of column to the list containing your list items - it just constrains the data that can be entered:

When using the template, a control is presented to the user (matching the data type) to allow the ad-hoc value to be entered:

Option 2 - selecting from a list item

Using this approach, when creating the template you'll browse to the SharePoint list you're using and choose the column which holds values for this 'field' in the document:
First choose the list:
...and then the column:
Once selected you'll see confirmation of the selected column, including if you come back in later to edit:
When using the template, once the user has selected an item in the list for the document instance being created, Syntex will automatically fill in this placeholder from your list item - as it will do for ALL of your placeholders in the document. A placeholder which is bound in this way can't be overtyped, but that makes sense because you are creating the document from a 'record' in your list:

Finalising the template
The template creation process from here is essentially repeating the above steps for each placeholder you need in your template. For each placeholder, either select that the value should be entered ad-hoc or (more likely) pulled from the list item. Once this is done you'll have lots of placeholders in your template, with each one listed on the right-hand side showing where the value will come from:
At this point the template can be published. Hit the publish button and see the confirmation:
So once we have the template in place, what does creating a document look like?

Creating documents using Syntex Content Assembly


At this point all of the one-off, upfront work to create the template is done - so creating new document instances from the template is quick and easy. No more copy and paste between documents! The critical point to note here is that currently document creation does involve a manual step. I think we can be sure that Power Automate actions will be here very soon to provide end-to-end automation, but for now here's how the process works - your Syntex modern template appears on the 'New' menu for your SharePoint document library:
Selecting the menu option takes you into creating a document instance from your template:
You can give the document a name (the default is the template name) using the editable name highlighted with the red arrow. On the right-hand side, all the placeholders are currently empty - the only reason the document in the image below has content is because in this case my template has content in it (as opposed to empty placeholder values). In any case, they will be overwritten in the next step. To create a document using one of my items in the SharePoint list, I simply choose a field (most likely the title field) and I see a picker of my list items:
Once an item is selected, all values are pulled from the list item and dropped into the corresponding placeholder in the document. You'll need to hit the 'Refresh to preview' button:
The document is now created!
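Conceptually, what has just happened is a merge: each placeholder is either bound to a list column or left for ad-hoc entry, and the selected list item supplies the bound values. The Python sketch below models that idea only. It says nothing about how Syntex works internally, and the `{{Name}}` placeholder syntax, the column names and the `assemble` function are all invented for illustration.

```python
import re

# Hypothetical template text; {{Name}} is an invented placeholder syntax.
TEMPLATE = "Role: {{RoleTitle}}\nLocation: {{Location}}\nClosing date: {{ClosingDate}}"

# Placeholder -> list column binding. None means "entered ad hoc by the user".
BINDINGS = {"RoleTitle": "Title", "Location": "Office", "ClosingDate": None}


def assemble(template, bindings, list_item, ad_hoc):
    """Fill each placeholder from the list item, or from ad-hoc input if unbound."""
    def resolve(match):
        name = match.group(1)
        column = bindings[name]
        value = list_item[column] if column is not None else ad_hoc[name]
        return str(value)
    return re.sub(r"\{\{(\w+)\}\}", resolve, template)


# A made-up list item standing in for one row of the open roles list.
item = {"Title": "Cloud Architect", "Office": "London"}
print(assemble(TEMPLATE, BINDINGS, item, ad_hoc={"ClosingDate": "30 June 2022"}))
```

The one-off effort sits in defining the bindings (the template), after which each new document needs nothing more than a choice of list item plus any ad-hoc values.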


As you'd expect, it lands in the document library where you created and applied the modern template:

So what do we take away from this?

 
The significance of Syntex Content Assembly
What we're seeing here is another big step in the democratization of automation solutions. Sure, there's some complexity as you get used to Syntex Content Assembly, but for the first time it's completely native and baked into Microsoft 365 (specifically the SharePoint document library interface) and doesn't require any coding, scripting, or complex configuration.

Every single organisation today has people copy/pasting into documents - statements of work, project plans, legal agreements, sales pitches, supplier/partner/employee contracts, role descriptions, and many more. In the best-case scenario, a lot of time is being wasted. In the worst-case, human error can creep in and in some cases can have a serious impact on work and people. With Syntex Content Assembly, Microsoft is making a serious play in document automation for the masses - and a segment of specialised product vendors are likely to be worried.

Today there are quite a few limitations. A couple of clicks are needed to create the document (so arguably not full automation), there's a one-to-one relationship between the template and a document library, and for now it's only Word. Additionally, in my experience Syntex is a bit finicky in terms of what can be in your Word template. These are worth discussing in more detail, so I'll cover these and more in my next post:

Thursday, 22 April 2021

SharePoint Syntex - teaching AI to extract contents of structured documents with Form Processing

In previous articles on SharePoint Syntex I've talked mainly about the document processing approach - in this post I'll discuss its counterpart, form processing. For those following along, my overall set of articles on this theme so far is:

Syntex - document processing

Syntex - general
That last article in particular is designed to help you understand the difference between the two models and when to use each one. As you read about Syntex you might form the view that "form processing is for things like invoices and order forms and document understanding is for everything else" - certainly some of the guidance implies this. However, that position is far too simplistic - there are differences in licensing, capabilities, supported file types and more - and you'll want to get this decision right to avoid having to rework AI models. My "tips for choosing" article might be helpful since it has a table of differences and details of licensing aspects to look out for.

But today, we focus on form processing!

Syntex form processing - integrated AI Builder

As the briefest of recaps, Syntex form processing is typically more suited to highly-structured and consistent document formats compared to document processing - that much is true. Since the AI Builder technology within Microsoft's Power Platform is used, there are a few implications to consider:
  • To use it, AI Builder credits are needed in addition to Syntex licenses (see AI Builder calculator). However, if your org has 300+ Syntex licenses you receive a generous allowance of 1m credits - this more than gets you started
  • Supported file types include JPG, PNG or PDF - but not Office files
  • Entire tables can be extracted from the document (in contrast to document processing)
  • The model is applied via a Power Automate Flow to the SharePoint document library where your documents reside (i.e. where you create the model from) - but there is no easy way to use this in other locations
In short, it's AI Builder conveniently built into SharePoint document libraries - so you don't have to do the integration or somehow pass each document to the model, it's taken care of for you.

Our invoice format

Before we get started on the process, it's worth seeing the format of documents used in this process. Like many classic examples of this type, they are invoices:


Implementing Syntex form processing

The approach followed here can be summarised as:
  1. Define the information to extract (i.e. teach Syntex what the fields are e.g. "Invoice Reference", "Invoice Date" etc.)
  2. Add documents for analysis
  3. Tag documents (i.e. teach Syntex where to find the relevant content in the document)
  4. Train the model
  5. Test
  6. Use in your document library
In your SharePoint document library, find "Automate" > "AI Builder" > "Create a model to process forms":



You'll see this message alerting you that AI Builder credits are needed:

Give your model a name - I'm using "COB invoice" for now. I want a new SharePoint content type to be created with this name so these documents are easily identified and classified amongst any others:

Syntex then begins to create your AI model:


Once the model has been created we define which information within the document we want to extract:

As the image shows, I start specifying some things I want to extract such as:
  • The invoice date
  • The invoice reference
  • The VAT number
Syntex now allows me to supply a collection of documents to train the model:

I create a new collection of documents for my invoice scenario:


I have some invoices ready to go, so I select those to upload:


Once uploaded I'm ready to analyze!


Once the analysis is complete we move into the tagging phase.

The tagging phase

As you move your mouse, Syntex allows you to highlight portions of the document by drawing boxes around identified pieces of text. By doing this, you map them to the fields you defined at the beginning - these appear in a picker for selection, with a checkbox indicating whether you've already mapped this item. So I move through the document teaching Syntex what is the invoice reference, what is the date, the supplier name and so on.
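For those who like to see the shape of the data, tagging can be thought of as building a list of "this box on this page is this field" records. The Python sketch below is a rough mental model only - the `Tag` class, the coordinate convention and the sample values are my own assumptions, not AI Builder's actual storage format. It also mimics the checkbox in the picker by working out which fields still need tagging.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """One drawn box mapped to a field (hypothetical structure)."""
    field: str
    page: int
    box: tuple  # (left, top, right, bottom) as fractions of the page - an assumption
    text: str


# Field names as defined at the start of the model; values below are made up.
FIELDS = ["Invoice Reference", "Invoice Date", "VAT Number"]


def untagged_fields(fields, tags):
    """Return the fields that have no tag yet, in their original order."""
    tagged = {tag.field for tag in tags}
    return [name for name in fields if name not in tagged]


tags = [
    Tag("Invoice Reference", 1, (0.62, 0.10, 0.80, 0.13), "INV-0042"),
    Tag("Invoice Date", 1, (0.62, 0.14, 0.80, 0.17), "22/04/2021"),
]

remaining = untagged_fields(FIELDS, tags)
print(remaining)
```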




As you can see, Syntex allows me to pick something as granular as an individual word or even character, or expand to pick a phrase or string of characters. Items with a green border are already tagged:

Tables can also be tagged in this way:
Once I'm done tagging I'm presented with a summary of the model, with a list of the fields I've defined:




We're now ready to move into the training phase.

The training phase

We start by hitting the Train button:



Once the model has been trained you can either run a quick test against a new document (not one used for training) or go ahead and publish it to your SharePoint document library:



Let's go ahead and publish the model. Once I have a published version, any subsequent changes will create a draft - this allows me to test things out (and get them wrong) whilst not disrupting the extraction that's already in place.

Once a model is published, we can go ahead and use it:


This makes the model available for use in a Power Automate Flow, and the person using it will need to consent to the connections being used:


The resulting Flow looks like this:


If you're interested in the mechanics, the piece that does the extraction is this - the "Predict" action for AI Builder which links to the model we just created:


The results

So let's go back to the invoice format we are using:




When this file is uploaded to SharePoint, initially it's just any old document:


...but then after a couple of minutes the document is correctly identified and classified as a "COB Invoice" and the values I trained the model for are extracted:


Excellent. Now I can drag in many old invoices and have them properly classified and summarised:


...and after a couple of minutes:



Conclusion


Syntex is hugely powerful in automatically unlocking critical data from documents - it doesn't need to be buried inside any more. At the beginning of this series, we discussed how the best research suggests knowledge workers spend 20-30% of their time just searching for information or expertise, and many of us would recognise that having to open many documents to check their contents can contribute to this. As above, I can build SharePoint document library views so that information is readily accessible or the view is sorted, filtered or grouped according to extracted information.
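To make the "sorted and grouped view" idea concrete, here's a small Python sketch of what such a view does once the metadata exists as columns. SharePoint views handle this for you through configuration - the column names (`Supplier`, `InvoiceDate`), file names and the `group_view` function are hypothetical and only there to illustrate the point.

```python
from collections import defaultdict
from datetime import date

# Hypothetical library rows, as if the model had already extracted the columns.
rows = [
    {"Name": "inv-001.pdf", "Supplier": "Fabrikam", "InvoiceDate": date(2021, 3, 2)},
    {"Name": "inv-002.pdf", "Supplier": "Contoso", "InvoiceDate": date(2021, 1, 15)},
    {"Name": "inv-003.pdf", "Supplier": "Fabrikam", "InvoiceDate": date(2021, 2, 9)},
]


def group_view(rows, group_by, sort_by):
    """Group file names by one column, ordering each group by another column."""
    groups = defaultdict(list)
    for row in sorted(rows, key=lambda r: r[sort_by]):
        groups[row[group_by]].append(row["Name"])
    # Sort the group headings alphabetically, as a grouped view typically shows them.
    return dict(sorted(groups.items()))


view = group_view(rows, "Supplier", "InvoiceDate")
print(view)
```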

These benefits go far beyond search and views though. Having my documents correctly identified means that I can apply security and compliance policies to them, for example a conditional access policy which means employees can't print or download sensitive contracts from an unmanaged device, or a retention policy that means a Master Services Agreement is retained for 6 years. Syntex can drive these approaches so that policies are applied by the AI recognising the document, and this can work across documents of wildly varying formats so long as there's some consistency that a document understanding rule can be applied to.
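As a simple illustration of why classification matters for retention, the Python sketch below works out a "retain until" date from the content type. In reality this is configured through Microsoft 365 retention labels and policies rather than code - the `RETENTION_YEARS` table and `retain_until` function are purely hypothetical, using the 6-year Master Services Agreement example from above.

```python
from datetime import date

# Hypothetical mapping of content type to retention period in years.
RETENTION_YEARS = {"Master Services Agreement": 6}


def retain_until(content_type, created):
    """Return the date a document must be kept until, or None if no rule applies."""
    years = RETENTION_YEARS.get(content_type)
    if years is None:
        return None
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # 29 February has no equivalent in a non-leap target year - fall back to the 28th.
        return created.replace(year=created.year + years, day=28)


print(retain_until("Master Services Agreement", date(2021, 4, 22)))
```

The point is that none of this can happen until the document is recognised as a Master Services Agreement in the first place - an unclassified document simply has no rule to apply.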

Being able to automatically extract information also means I can build process automation around my documents - for example, if something comes in for a certain region or above a certain value, I can route approval processes or notifications accordingly. There are many possibilities here alone.
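Here's a rough sketch of that kind of routing logic in Python. In practice you'd most likely build this as conditions in a Power Automate Flow rather than write code - the threshold, regions, email addresses and the `route` function are all made-up values for illustration.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    """Values assumed to have been extracted by the model (hypothetical fields)."""
    reference: str
    region: str
    total: float


# Illustrative rules only - not taken from any real configuration.
APPROVAL_THRESHOLD = 10_000.0
REGIONAL_APPROVERS = {
    "EMEA": "emea-finance@example.com",
    "APAC": "apac-finance@example.com",
}


def route(invoice):
    """Return the list of actions to trigger for an invoice."""
    steps = []
    if invoice.total >= APPROVAL_THRESHOLD:
        steps.append("approval:finance-director@example.com")
    approver = REGIONAL_APPROVERS.get(invoice.region)
    if approver:
        steps.append(f"notify:{approver}")
    # Nothing matched, so the invoice is simply filed.
    return steps or ["auto-file"]


print(route(Invoice("INV-0042", "EMEA", 12_500.0)))
```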

Ultimately it comes down to classification and extraction, and there are so many possible use cases around CVs, proposals, statements of work, RFPs, employee contracts, invoices, sales/purchase orders, service agreements, HR policies and just about any other document type you can think of. This is democratised AI in action, and it's great to have it so accessible in SharePoint.