Tuesday, 11 November 2025

Autonomous agents - Technique 2: Define explicit steps in agent instructions when "reasoning the process" isn't appropriate

One of the main reasons AI agents are such a leap forward in automating work comes down to their ability to reason and dynamically plan in order to reach an outcome - much as we humans do intuitively in our own work. The apps and systems we've developed over the past few decades have essentially all followed set processes with procedural code, and while many have flexible logic and business rules, today's AI models take us several levels beyond this. Ultimately we move away from a world of apps with explicitly coded steps to one where AI agents are often given the end goal or outcome, but not necessarily a granular process for how to get there. So long as the agent has the data, tools, and guidance to do this, it should have a good chance of meeting the outcome. Or that's the theory.

One of the biggest challenges today with autonomous agents is that they can have too much freedom and go off the rails from the intended task. We're not talking about a dystopian takeover - simply unpredictable agents with poor results and a lack of consistency. So, this post covers how to make your agent more reliable by finding the balance between autonomy and specific guidance - it all comes down to how you write the agent instructions.

But first, here's a recap of the full series:

Articles in this series

  1. Techniques for autonomous agents in Copilot Studio - intro 
  2. Scenario video - Microsoft architect with proposal generation
  3. Technique 1 - Getting AI-suitable descriptions right - data, tools, agents themselves 
  4. Technique 2 - Define explicit steps in agent instructions when "reasoning the process" isn't appropriate (this article)
  5. Technique 3 - Provide tools like Agent Flows for steps the agent can’t easily handle
  6. Technique 4 - Leveraging Power Platform and Microsoft 365 capabilities in your agents
  7. Technique 5 - Understand cost, capability, and governance implications of agents you create

Some lessons learnt on agent instructions and autonomy

Going back to our Microsoft 365 architect agent used in this series (which automates some of our work at Advania by considering a business challenge and the client's tech landscape, then making an architectural and product recommendation mapped to what the client is licensed for), here's how I started out with the instructions I gave to the agent.

Poor agent instructions
The user will input a use case or business challenge. Using your knowledge of Microsoft 365 licensing and the client's environment and licensing, provide a recommendation of how a solution could be implemented using the available technologies. Generate a proposal outline to implement the solution - propose the technology(s) to be used, any licensing uplifts required which aren't currently in place or planned, and an overview of what the project would look like. Create a Word document containing the prompt and response. Also log the results to the SharePoint list in the "Syntex-AI-intelligence" site named "Technology proposals".

All seems quite reasonable and specific - I'm outlining specific goals and steps to take, and even being ultra-clear on where to find the SharePoint site to log the output to. However, let's take a look at the result:

What's actually happening here is:
  • The agent only runs one step - which is never going to work
  • The agent is trying to determine the overall architectural solution simply from the first knowledge source (the m365maps.com site, which provides info on Microsoft 365 products and SKUs)
I won't show the end output here to avoid confusing things, but the actual answer given by the agent is very low quality - it does say that Defender for Endpoint Plan 2 could be used (correct) as seen in the screenshot above, but the agent fails on several key things we want it to do: 
  • There's no real consideration of the factors here - the agent appears to be doing only some basic querying. No basis for the recommendation is given, and there's no evaluation of the licensing uplifts which may be required (as asked for in the instructions)
  • The agent didn't log its results to the list as requested
  • The agent didn't create the Word proposal document as requested
So, clearly this isn't working even if it has a hint of the right answer. 

Defining explicit steps in a numbered process

The reality is we need to be much more explicit with agents today - expecting them to be able to "reason the process" from a set of instructions often doesn't work and results will be unpredictable. In particular, where a process has steps that truly need to be followed in sequence - like fetching some data and then reasoning over it - the thing that makes the difference is specifying these steps in a numbered list. 
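Though Copilot Studio instructions are authored as plain text, the pattern itself is simple enough to make concrete in code. Purely as an illustration (this is not Copilot Studio code - just a sketch of the structure), a helper that renders an ordered list of steps as numbered instructions might look like:

```python
def build_sequenced_instructions(steps: list[str], preamble: str = "") -> str:
    """Render an ordered list of steps as numbered agent instructions.

    Numbered steps signal to the model that the process must be followed
    in sequence, rather than leaving it to "reason the process" from a
    loosely worded goal.
    """
    lines = [preamble] if preamble else []
    lines += [f"{i}. {step.strip()}" for i, step in enumerate(steps, start=1)]
    return "\n".join(lines)


instructions = build_sequenced_instructions([
    "Derive the client organisation name from the user's prompt.",
    "Derive the use case or business challenge from the user's prompt.",
    "Use your knowledge sources to establish the client's licensed technologies.",
])
```

The useful habit this encodes is keeping one action per numbered line, with any clarifying guidance attached to the step it belongs to - exactly the shape of the revised instructions that follow.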

Here are my revised instructions:

Sequenced agent instructions
  1. Derive the client organisation name from the user's prompt. If you are unsure of the organisation, ask the user for clarification. 
  2. Derive the use case or business challenge the client is asking about from the user's prompt. If you are unsure of the use case, ask the user for clarification. 
  3. Using the client name you already derived, use your knowledge sources to understand which technologies the organisation is licensed for. 
  4. Now use your knowledge sources to ensure you understand the individual Microsoft products and sub-capabilities of these Microsoft 365 SKUs - build a categorised list for use in later steps. 
  5. Using the data from the previous step, reason to provide a recommendation of Microsoft technologies which could meet the specific requirements. IMPORTANT - be extensive with the rationale for your decision, mapping how capabilities in the proposed Microsoft technology meet specific client requirements and detail this in your output. If no recommendation can be made (e.g. because the use case cannot be met with the licensed technologies), make a recommendation of which technologies could be purchased to help. As part of this reasoning, establish any licensing uplifts which may be required for this client to use the recommended technologies. ## Be sure to detail your chain of thought in the output ##
  6. Also reason to establish any non-Microsoft technologies that could potentially be used to fulfil the requirement. 
  7. Generate a proposal outline to implement the solution - propose the technology(s) to be used, any licensing uplifts required which aren't currently in place or planned, and an overview of what the project would look like to implement the Microsoft-based solution. 
  8. Log the results to the SharePoint list in the "Syntex-AI-intelligence" site named "Technology proposals".
  9. Create a Word document containing all details of client requirement and proposed solution - ensure to use the full reasoned output in the details. Save this to the SharePoint library in the "Syntex-AI-intelligence" site named "Technology proposal documents".

I've expanded on the detail of what I want in some places, but the other major change is switching to a numbered list to instruct the agent to follow those steps in order. Now let's look at the difference:


 
Looking at those images, straight away we can see the agent is now following many more steps - and if we drill into the two "Knowledge" steps as shown above, we can see the query used for each (and the results) now make sense:
  • Advania client data - agent asks it what technologies the specific client is licensed for
  • M365maps.com - agent uses this to discover what capabilities are in each granular Microsoft 365 SKU
This sets us on the path to success.

Results from the agent 

At this point we hope our agent will properly consider the client challenge provided to it (e.g. replace our endpoint protection, plan for a new intranet, introduce a compliance tool etc.), then consider their technology environment and licensing and whether anything in Microsoft 365 fits the bill - and then reason through all of this to produce a technology recommendation, considering any licensing uplifts needed as well as adoption/migration considerations. Here was my prompt:

Unilever (fictional - remember I'm redacting real Advania client data with the top 20 UK companies), one of our clients, want to roll out a new endpoint protection technology to replace CrowdStrike to approximately 100,000 end-user devices. Ransomware protection is a key concern, and the RFP lists automated incident response and remediation, integrated threat intelligence, and sandboxing as required capabilities. 

Notably, they need coverage for 5,000 devices where users only have Microsoft 365 F1 licenses (frontline workers).

Here's the result:

Chris, here’s a complete recommendation and delivery outline for Unilever’s endpoint protection RFP.

Organization: Unilever
  • Use case and business challenge inferred: Replace CrowdStrike across ~100,000 end-user devices.
  • Must meet: strong ransomware protection, automated incident response and remediation, integrated threat intelligence, and sandboxing/detonation.
  • Coverage gap to address: 5,000 devices used by Microsoft 365 F1 frontline workers.
What Unilever is licensed for today (from our records)
  • User licensing: Majority on Microsoft 365 E3 with EMS add-on; ~15,000 on E5; ~10,000 Microsoft 365 F1 frontline licenses.
  • Security stack in place: Entra ID P1, Microsoft Sentinel (Azure Sentinel).
  • Endpoint and platforms: Endpoints managed with Intune; currently using CrowdStrike for endpoint protection; Mimecast for email; AVD and Citrix in use; Azure in UK/US/AUS regions.
Microsoft products and sub-capabilities relevant to this RFP (categorized)
  • Endpoint protection and EDR
    • Microsoft Defender for Endpoint Plan 1 (MDE P1): next‑gen AV, tamper protection, attack surface reduction (ASR), web content filtering, centralized management, cross‑platform support, “Block at First Sight.”
    • Microsoft Defender for Endpoint Plan 2 (MDE P2): adds endpoint detection and response (EDR), threat analytics, vulnerability management (core), and longer searchable data retention; forms the endpoint pillar of Microsoft 365 Defender (XDR).
  • Automated investigation and remediation (AIR)
    • Microsoft 365 Defender suite provides AIR across workloads; Microsoft documentation highlights AIR in Defender for Office 365 Plan 2 (email) as an example capability within the suite.
  • Threat intelligence
    • Threat Analytics within MDE P2 delivers integrated intelligence and analyst‑ready reports that inform investigations.
  • Sandboxing/detonation
    • Endpoint: “Block at First Sight” uses rapid cloud evaluation and detonation for unknown files for preventative blocking on endpoints (P1+).
    • Email/collab: Defender for Office 365 Plan 2 adds Safe Attachments (detonation sandbox) and Threat Explorer for campaign-level investigation.
  • SIEM/SOAR
    • Microsoft Sentinel already present for Unilever; integrates natively with Defender signals for advanced hunting/automation.
  • Device and app management
    • Microsoft Intune already in use for policy, configuration, and deployment orchestration.
Summary recommendation
  • Standardise on Microsoft Defender for Endpoint Plan 2 across the Windows, macOS, Linux, Android, and iOS estate; manage deployment and policy via Intune; unify signals in Microsoft 365 Defender (XDR) and stream to Microsoft Sentinel for SIEM/SOAR.
Why this meets each RFP requirement
  • Ransomware protection
    • Prevent: Next‑gen AV, ASR rules, tamper protection, and “Block at First Sight” provide strong preventative controls on endpoints (P1 baseline).
    • Detect and contain: MDE P2 adds behaviour‑based EDR with live response, threat analytics, and extended searchable telemetry to spot hands‑on‑keyboard activity and lateral movement quickly.
  • Automated incident response and remediation
    • MDE P2 enables automated investigation and remediation on endpoints to contain and fix threats (part of Microsoft 365 Defender’s AIR capabilities).
    • Sentinel playbooks (Logic Apps) can orchestrate cross‑tool actions at scale (contain device, isolate network, reset credentials) using Defender incidents as triggers; Unilever already has Sentinel.
  • Integrated threat intelligence
    • Threat Analytics in MDE P2 provides curated intel mapped to current campaigns and exposures, tied directly to Unilever’s devices and vulnerabilities for prioritisation.
  • Sandboxing/detonation
    • Endpoint: Cloud‑based file analysis and “Block at First Sight” help pre‑empt execution of suspicious payloads on endpoints.
    • Email: If Unilever later consolidates on Defender for Office 365 Plan 2, Safe Attachments detonates content pre‑delivery and contributes to unified incidents in Microsoft 365 Defender. 
How this fits Unilever’s current environment
  • Intune is already the endpoint manager, making at‑scale onboarding, policy configuration (ASR, AV, EDR), and gradual CrowdStrike co‑existence/cutover straightforward.
  • Sentinel is already deployed, so incident fusion, hunting queries, and automation can be turned on rapidly to cover both endpoint and other data sources.
Licensing path to enable the above
  • Users currently on Microsoft 365 E5: MDE P2 rights are included; no uplift needed for those users.
  • Users on Microsoft 365 E3 with EMS add‑on: add either the Microsoft 365 E5 Security add‑on (bundled security including MDE P2) or standalone Microsoft Defender for Endpoint Plan 2 licenses for those users. 
  • Frontline worker devices on Microsoft 365 F1 (5,000 devices): options to cover endpoints with MDE P2
    • Option A (preferred for feature parity): move these users to Microsoft 365 F3 and attach an appropriate security add‑on that grants MDE P2 (for example, an enterprise security add‑on akin to E5 Security for frontline) or assign standalone MDE P2 per user where permitted. Validate final SKU mix with Microsoft licensing because frontline bundles vary by agreement and commercial construct.
    • Option B (kiosk/shared scenarios): if devices are truly shared and user‑based licensing is impractical, explore device‑based licensing options with Microsoft for MDE where applicable. Validate with the Microsoft account team to align with Unilever’s Enterprise Agreement terms.
Note: The SKU matrix shows where endpoint features and EDR/Threat Analytics land (P1 vs P2/E5 Security). Use this to ensure every user segment lands on P2 for EDR/AIR. 

Commercial and operational rationale
  • Reduces agent sprawl by consolidating into the Microsoft stack Unilever already operates (Intune, Sentinel), lowering operational overhead and integration complexity.
  • Meets or exceeds the specified capabilities with unified incidents, threat intel, and at‑scale automation, while improving mean time to detect/respond.
Non‑Microsoft alternatives that meet the specification (for RFP completeness)
  • CrowdStrike Falcon Insight + Falcon X (sandbox) + Falcon Fusion for automation.
  • SentinelOne Singularity Complete/Control with Storyline automation and rollback.
  • Palo Alto Cortex XDR Pro + WildFire for detonation + XSOAR automation.
  • Sophos Intercept X Advanced with XDR + Sophos Sandstorm.
  • Trend Micro Apex One SaaS + Vision One XDR + detonation sandbox. 
Note: These all offer ransomware protection, automated response, TI, and detonation. The decisive factors will be agent footprint, integration with Intune/Sentinel, and TCO relative to Microsoft’s included rights.

Delivery proposal outline for Unilever
  • Scope: Replace CrowdStrike with Microsoft Defender for Endpoint P2 on ~100,000 devices; ensure parity on ransomware protection, AIR, TI, and sandboxing; integrate with Microsoft 365 Defender and Sentinel.
  • Phases and timeline (indicative)
    • Readiness and design (3–4 weeks)
      • Confirm license entitlements per segment; define co‑existence approach with CrowdStrike; finalize policies (AV/ASR/EDR), device groups, and ring strategy; connect Defender to Sentinel with required data connectors and analytics rules.
    • Pilot (4–6 weeks, ~2,000 devices across key personas/regions)
      • Deploy MDE sensor via Intune; validate protection efficacy, AIR behaviour, endpoint isolation, and SOC workflows; tune ASR/tamper protection; validate detonation/“Block at First Sight.”
    • Scale‑out waves (8–12 weeks)
      • Roll out by business unit/geo; enable EDR in block mode; progressively disable CrowdStrike on devices once Defender is in “active” mode and stable; expand Sentinel automations and hunting queries.
    • Cutover and decommission (2–4 weeks)
      • Remove CrowdStrike agent; finalize GPO/MDM baselines; confirm incident volumes, MTTR, and coverage; complete runbook documentation and knowledge transfer for SecOps/Endpoint teams.
    • Hardening and optimization (ongoing)
      • Leverage Threat Analytics and TVM to drive risk reduction; regular red/blue exercises; quarterly policy reviews; rationalize email sandboxing approach if consolidating on Defender for Office 365 Plan 2.
Architecture and integration highlights
  • Intune for deployment/policy; Microsoft Defender for Endpoint P2 on endpoints; Microsoft 365 Defender for incident correlation; Microsoft Sentinel for SIEM/SOAR rules, hunting, and playbooks; integration to ServiceNow/Jira for ticketing as needed.
Roles and responsibilities
  • Our team: solution design, pilot, deployment tooling, Sentinel content and automation, change and cutover runbooks, training.
  • Unilever: environment access, pilot cohort sponsorship, change approvals, SOC/process owners, endpoint engineering partnership.
SharePoint list logging (Technology proposals)
  • Action prepared: Log the proposal to Syntex-AI-intelligence > Technology proposals with fields: Client=Unilever; Use case=Endpoint protection replacement; Scope=100k devices; Key requirements=ransomware, AIR, TI, sandboxing; Recommendation=MDE P2 + M365 Defender + Sentinel; Licensing notes=E5 covered; E3 add E5 Security or MDE P2; F1 coverage options; Status=Proposed; Next steps=Pilot planning.
  • Limitation: I don’t currently have write access via this interface to post to your SharePoint list. If you can grant me access or confirm the list URL and credentials, I will submit this entry immediately on your behalf.
Word document creation (Technology proposal documents)
  • Action prepared: Generate a Word document containing this full recommendation, capability mapping, licensing options, and delivery plan, and save to Syntex-AI-intelligence > Technology proposal documents with a filename like “Unilever – Endpoint Protection Replacement – Proposal.docx.”
  • Limitation: I don’t have write access to create/save documents to your SharePoint library from here. If you share a drop‑off library URL or grant permissions, I’ll create and upload the document automatically; alternatively I can send you the .docx for you to place.
Key decisions for you
  • Confirm the preferred licensing path:
    • E3 population: E5 Security add‑on vs standalone MDE P2.
    • F1 population (5,000 devices): upgrade to F3 + security add‑on vs assign standalone MDE P2 to those users; validate final SKU choices with Microsoft licensing to align with Unilever’s agreement.
    • Confirm pilot cohorts and target start date so we can mobilise.
Citations used: [list of citations]

Phew! The results are impressive when you think about it - the agent has considered all the factors and shown a level of understanding on a par with a highly experienced and capable Microsoft security specialist. We even see that the agent has done what was asked in terms of mapping specific requirements given by the client (e.g. automated incident response and remediation, integrated threat intelligence, and sandboxing in this case) to product features in the proposed solution, Defender for Endpoint.

We do still have two issues though - as the two "limitation" notes in the agent's response call out, the agent wasn't able to:
  • Log the output to the SharePoint list as requested
  • Create the Word proposal document as requested
So, we're not done yet - we'll resolve these in the next article.
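To give a sense of what that logging step involves under the hood, one route (outside Copilot Studio itself) is the Microsoft Graph lists API. The sketch below only builds the request body - the column names ("Client", "UseCase", "Recommendation") are hypothetical placeholders, and the real internal names would come from the "Technology proposals" list's schema:

```python
def build_proposal_log_item(client: str, use_case: str, recommendation: str) -> dict:
    """Build the JSON body for a Microsoft Graph 'create listItem' request.

    Graph expects column values wrapped in a 'fields' object. The column
    names used here are illustrative only, not the actual internal names
    of the target SharePoint list.
    """
    return {
        "fields": {
            "Client": client,
            "UseCase": use_case,
            "Recommendation": recommendation,
        }
    }


# The body would be POSTed (with an OAuth bearer token, omitted here) to:
#   https://graph.microsoft.com/v1.0/sites/{site-id}/lists/{list-id}/items
item = build_proposal_log_item(
    "Unilever",
    "Endpoint protection replacement",
    "Defender for Endpoint P2 + Microsoft 365 Defender + Sentinel",
)
```

Whether this runs as a Graph call, a Power Automate action, or an Agent Flow, the shape of the data the agent must hand over is the same - which is why giving the agent a dedicated tool for this step works better than hoping it improvises one.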

In terms of the cognitive work the agent has successfully performed, at Advania we don't see this as replacing our high-end architects - but we do see it as accelerating our teams and allowing more junior team members to take on more complex work. The work partly switches to verification, judgement, editing where required, and rounding out and adding perspective - all elements which would have been required anyway. Having the agent take on some of the complex research, interpretation, and generation is huge, of course - we really are democratising intelligence and expertise, and therefore moving faster and serving more.

Summary

Autonomous agents will transform work over the next few years, but their effectiveness depends on how we guide them - instruction design matters. Simply stating an end goal and the outline of a process is unrealistic, at least today - instead, breaking instructions into clear, sequenced steps (probably using numbered bullets as I've shown in this article) is needed to get reliable agent behaviour. It's a question of balancing autonomy with control, and human oversight and verification is crucial of course.

The next article will resolve those pesky challenges of the agent not being able to create Word documents to accelerate that part of the process and the team's work.

Articles in this series

  1. Techniques for autonomous agents in Copilot Studio - intro 
  2. Scenario video - Microsoft architect with proposal generation
  3. Technique 1 - Getting AI-suitable descriptions right (data, tools, agents themselves)
  4. Technique 2 - Define explicit steps in agent instructions when "reasoning the process" isn't appropriate (this article)
  5. Technique 3 - Provide tools like Agent Flows for steps the agent can’t easily handle
  6. Technique 4 - Leveraging Power Platform and Microsoft 365 capabilities in your agents
  7. Technique 5 - Understand cost, capability, and governance implications of agents you create

Next article

Technique 3: Provide tools like Agent Flows for steps the agent can’t easily handle

Wednesday, 3 September 2025

Autonomous agents - Technique 1: Getting AI-suitable descriptions right - data, tools, agents themselves

As I opened with in the first article in this series, AI agents are the next generation of solution we'll build over the next few years - this is how work gets automated, at least some of it: business systems which take away some of the human processing so we can focus on more impactful work, powered by highly capable LLMs with the ability to consume data, reason and plan, and use tools like web browsers. In this series we focus on agent-building techniques, and in particular what it takes to build autonomous agents successfully, i.e. the ability for an agent to dynamically reason and plan. Guidance here orients around Microsoft Copilot Studio agents, though it's interesting to reflect that many of the techniques will apply across nearly all AI agent platforms.

In this article, we focus on descriptions - in the context of AI agents, this means descriptions for the data, tools, sub-processes and other things the agent might use to get to its goal. It's interesting because descriptions have been pretty innocuous in the apps/solutions/automations we've built over the last few decades - much like comments in code, only humans read them. However, things are very different when AI is reading them and making decisions about how to proceed based on the words provided.

But first, here's a recap of the full series:

Articles in this series

  1. Techniques for autonomous agents in Copilot Studio - intro 
  2. Scenario video - Microsoft architect with proposal generation
  3. Technique 1 - Getting AI-suitable descriptions right - data, tools, agents themselves (this article)
  4. Technique 2 - Define explicit steps in agent instructions when "reasoning the process" isn't appropriate
  5. Technique 3 - Provide tools like Agent Flows for steps the agent can’t easily handle
  6. Technique 4 - Leveraging Power Platform and Microsoft 365 capabilities in your agents
  7. Technique 5 - Understand cost, capability, and governance implications of agents you create

Descriptions - why they're suddenly vital

Despite all the advances in the underlying AI models, AI agents are not magic and they need help. This is especially the case when:
  • Agents have some autonomy in terms of getting to an outcome (rather than being defined step-by-step, like a coded application)
  • Agents can use data and knowledge sources
  • Agents can call on tools and sub-processes
  • Agents can call other agents (if they understand what they do)

AI needs good descriptions more than humans do
The critical point is that AI agents use descriptions to understand what data to use and when, what tools to call and when, and what sub-processes to run and when. If your descriptions are poor and don't supply enough context, the behaviour of your agent will be unpredictable and won't be as expected.
 
As an example which may hit many agent builders, many agents fail or behave unpredictably because they don't understand what their knowledge sources or tools are for. This often happens when default or auto-generated descriptions are used - they're vague, generic, and unhelpful. For example, if you provide a file as a knowledge source for your agent, Copilot Studio will helpfully use AI to generate a description of the file - but this will essentially be "this knowledge source uses this file". So when I provide a detailed Excel file of Microsoft 365 license SKUs and capabilities (from the excellent M365maps.com), the descriptions generated today do nothing to help the agent understand what this knowledge represents and how it could be used:

The Fix:

Write clear, detailed descriptions for every knowledge source, tool, and even the agent itself (to support agent-to-agent interactions in the future). These descriptions are used by the orchestration engine to decide what to use and when. 

Let's focus on a knowledge source (i.e. some data used by an agent) in Copilot Studio. In the agent I'm using in this series, one knowledge source is an Excel export of 'technology landscape' data for key Advania clients (redacted for the video in the last post). 

Bad example (this is the default description generated by Copilot Studio):

“This knowledge source searches information contained in Microsoft license coverage by client.xlsx.”

Good example:

Instead, a far better description helps the agent understand what's in the data/knowledge - for example: 

“This knowledge source details the technologies, plans, and current Microsoft licensing in place for key Advania clients. It covers aspects like how many Microsoft 365 E3 and E5 licenses are held, and which technologies are used for Security & Compliance, messaging, endpoint protection, and document management.”

The agent can now understand exactly what's in this data and how it can be used. This context is essential for establishing how it should go about the process and what to use when.
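To see why description quality matters mechanically, here's a deliberately simplified sketch of how an orchestrator might score knowledge sources against a query using only their descriptions. Copilot Studio's orchestrator actually uses an LLM for this decision rather than keyword overlap, but the principle holds: the description is the only context the router has, so a vague one gives it nothing to match on.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score_sources(query: str, sources: dict[str, str]) -> list[tuple[str, int]]:
    """Rank knowledge sources by word overlap between query and description."""
    ranked = [(name, len(tokenize(query) & tokenize(desc)))
              for name, desc in sources.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


sources = {
    "client-landscape": (
        "Details the technologies, plans, and current Microsoft licensing in "
        "place for key Advania clients, including endpoint protection tools."
    ),
    "default-description": "This knowledge source searches information contained in a file.",
}
ranked = score_sources(
    "which endpoint protection licensing does this client have", sources
)
# The detailed description gives the router terms to match on; the vague
# auto-generated default barely scores at all.
```

A real LLM-based router is far more forgiving of exact wording than this toy version, but the failure mode is identical: two sources with near-identical generic descriptions are indistinguishable, however capable the model.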

Agent behaviour with poor knowledge descriptions

Before data and knowledge sources are given good descriptions (rather than the AI-generated ones from Copilot Studio), you'll find all sorts of strange behaviours. Things may look good at first because you see the agent referencing the sources:

However, dig deeper and you'll find some sources simply aren't used in the agent's processing:

In fact, what we see is the agent simply referenced one knowledge source and then stopped there - it didn't actually consider the use case it was given or do any kind of reasoning, and while it gives a response that looks rich at first glance, in fact the agent is simply trotting out some generic info related to Microsoft 365 products. We see some of this in the activity map in Copilot Studio which shows what happened when the agent ran:


Agent behaviour with good knowledge descriptions

Fix the descriptions, and suddenly the agent knows how to consider each knowledge source - if we ask the same question to the agent, we see that BOTH knowledge sources are now used. Firstly the agent queries the "Microsoft 365 license SKU matrix" source to consider endpoint protection options:


Next, the agent starts a specific query on Microsoft 365 F1 licensing, which makes sense given this client is referenced in the client data knowledge source as having this restriction for some employees:


Better - but still not right

On the surface, it seems that the agent is coming closer to its purpose - being an effective tech architect with deep Microsoft knowledge. However, if the agent's final responses are closely analysed at this point, we see that:
  • Unfortunately, the agent isn't truly considering the client requirement passed to it (as per the demo video shown in the previous article, I'm asking it to consider a particular client's need to replace endpoint protection, along with some granular requirements). Instead, I'm still getting some 'lightweight consideration' and generic Microsoft 365 product info at best
  • The agent is also not delivering on another element - I'm not getting a draft proposal document generated, although I'm asking for that in the agent instructions
So in short, the agent is now using its data and knowledge sources in a more sensible way, but the overall process being followed isn't what we need. We need a step-by-step consideration of the client's need and the granular requirements specified, any Microsoft 365 products or capabilities which might meet these needs, a lookup and consideration of the technologies the client is currently licensed for, and then a genuinely thought-through recommendation based on all of the above. I won't show the agent output at this stage to avoid confusing things, but suffice to say we have an agent at "first day intern" level at the moment, and for this to be valuable we need a lot more cognitive processing.

On to the next challenge - getting agent instructions right
Now our agent understands the data and knowledge it has access to more clearly, we need to turn our attention to how the agent is guided in the instructions we give. If we make some changes here, the agent will go through a step-by-step process and reason over the key questions being asked of it much more effectively. Which leads us to the next article in this series:

Next article

Technique 2 - Define explicit steps in agent instructions when "reasoning the process" isn't appropriate - coming soon

Tuesday, 29 July 2025

Demo video - Microsoft architect autonomous agent (Copilot Studio)

In the previous article in this series on autonomous agents, we talked about what makes an agent autonomous and some implementation fundamentals specific to Copilot Studio. As with anything AI, seeing an example in context goes a long way to helping understand the possibilities, so this second post provides a video of a real autonomous agent we're starting to use at Advania. The agent effectively becomes a member of our team, using advanced reasoning models to work with complex concepts and accelerate our work. Before that, here's a reminder of what this series looks like:

Articles in this series

  1. Techniques for autonomous agents in Copilot Studio - intro
  2. Scenario video - Microsoft architect with proposal generation (this article)
  3. Technique 1 - Getting AI-suitable descriptions right (for data sources, tools, and agents themselves)
  4. Technique 2 - Define explicit steps in agent instructions when "reasoning the process" isn't appropriate
  5. Technique 3 - Provide tools like Agent Flows for steps the agent can’t easily handle
  6. Technique 4 - Leveraging Power Platform and Microsoft 365 capabilities in your agents
  7. Technique 5 - Understand cost, capability, and governance implications of agents you create

Use case for this agent

If you follow me on LinkedIn you may have seen me post about this agent there. We built this agent to automate some of our work at Advania, in particular some of the complex Microsoft architecture and technology consultancy work we deliver to clients. The scenario is essentially an 'expert Microsoft architect' agent which understands:

➡️ The various technology estates of key Advania clients and what they have licensed - the agent sources this from an internal system we have
➡️ Microsoft 365 product SKUs and licensing, specifically E3/E5 suites and granular capabilities in sub-products like Defender for Endpoint, Defender for Identity etc. - the agent uses the excellent m365maps.com website for this
➡️ How to take a specific client requirement (e.g. a need to roll out a new endpoint protection technology/automate a legal process/reach frontline workers with corporate comms etc.), derive any "strong fit" Microsoft technologies, and map granular requirements specified by the client to product capabilities to support the proposed approach

The video shows:

✅ A quick overview of the agent definition (built in Copilot Studio)
✅ Data sources the agent has access to
✅ The agent reasoning through the supplied use case for one fictional client (Ecosphere Solutions)
✅ Proposed approach with clear rationale - including licensing considerations, implementation details, and how specific requirements are met by the proposed technology
✅ Proposal drafted on company-branded template

Demo video



Reflection on AI agents like this

The power of agents is that they can act as a virtual team member, automating some of the workload and enabling human effort to go to higher order challenges. The interesting thing about this agent in my view is the ability to perform advanced reasoning - thinking through the client's need, the technologies they have access to, exactly what's provided in those, and deriving a good fit if there is one. 

Of course, we don't see the AI agent as replacing Advania architects and consultants much-loved by our clients - this is an accelerant for our teams, not a replacement. But we do see agents like this as helping us deliver more value to clients - bolstering our expertise and helping us respond faster with the accuracy and depth we're known for. It also helps us level-up less experienced consultants or new members to a team. In reality, every business has complex processes and expertise that today's AI agents can unlock - this is an example of what makes sense for us.

Next article

Technique 1 - Getting AI-suitable descriptions right (data, tools, agents themselves)

Techniques for autonomous agents in Copilot Studio - blog series part 1

AI agents are the next generation of solution we'll be building over the coming years - it's clear that many "app front-end plus data" business systems will evolve to be more agentic, in the sense that the application itself will automate more of the processing, and interfaces will become less about forms and more about instructing the AI in natural language. Autonomous agents are one of the most exciting aspects because software and apps simply didn't have this possibility until now, and there's little conjecture that this is a key 'unlock' - both in how work gets automated and made more efficient, and in how AI starts to have an impact on societies and economies. With advanced LLMs and protocols for bringing agents and systems together, we now have the tools to build agents that can reason, act, and deliver outcomes - not just respond to prompts. But with this power comes complexity, and I see many approaching agents with expectations that don't align with today's capabilities. 

This blog article is the first in a series which walks through five key techniques for building effective autonomous agents in Copilot Studio. Each article highlights a common pitfall, explains the underlying concept, and offers practical guidance to help you succeed. Whether you're building agents for internal automation, customer-facing scenarios, or domain-specific copilots, these lessons will help you avoid the traps and unlock the full potential of generative orchestration.

Scenario - an autonomous reasoning agent for Microsoft 365 architecture recommendations
Throughout this series I'll reference an agent I built which acts like one of our most experienced Microsoft architecture consultants at Advania - able to understand the full suite of security and productivity capabilities in Microsoft 365 E3 and E5, consider licensing needs and SKU packaging, and make technology recommendations for a given use case based on a deep understanding of the factors. The next article in the series shows a demo video so you can see the agent "thinking through" the scenario, automating the recommendation process through deep reasoning, and drafting a project proposal on a company templated document - a process which can take hours or days for an experienced architect. This is about accelerating that role, improving accuracy, and levelling-up less experienced architects so their thought process and outputs match those of the most experienced. 

Something we'll focus on in this initial article is that agents aren't autonomous by default in Copilot Studio - the agent has to be built with specific settings enabled and certain keywords used in agent instructions. This post covers these fundamentals, because using all the right techniques won't get you anywhere if the agent isn't set up to behave autonomously - but we'll also start by explaining what we mean by "autonomy" so you understand where we're heading and what such an agent can do. 

What makes an agent autonomous?

There are lots of definitions of this, but I boil it down to four elements - I used this slide in a conference talk recently (at the 2025 European Power Platform Conference):


Importantly, in Copilot Studio some of this is made possible by "generative orchestration" - this isn't enabled by default, but if you want dynamic behaviour you need to toggle this to on in your agent settings:

Unlike classic orchestration (where you define every topic and response), generative orchestration allows the agent to decide how to use its knowledge, tools, and topics to fulfil a request. It’s powerful — but it also means you need to design your agent carefully to guide that autonomy.

So in Copilot Studio you essentially have two modes, where "classic" is the old mode and generative is the new possibility:

If you look at most other AI agent and virtual assistant platforms which have been around for a while (e.g. ServiceNow, Salesforce, Google), all have evolved from this classic "define each and every step of what a user might ask and how the bot/agent should respond" approach to something like generative orchestration where the LLM is essentially deciding how to behave and have the conversation with the user.  
 
For Copilot Studio, Microsoft has a useful table on the Orchestrate agent behavior with generative AI page which goes into more detail on specific behaviour differences:

 

Enabling reasoning in Copilot Studio agents

In the first image above, we discussed that a key element of autonomy is the "able to reason and plan" aspect. To be able to use reasoning in your agent, this also needs to be enabled in your agent settings (within the 'Generative AI' section) and is only possible if you're using generative orchestration. The settings are bundled together in the same area:




As highlighted in the small text in the blue box, the other critical thing is to use the "reason" keyword specifically in your agent instructions. This tells Copilot Studio to use a reasoning model (e.g. OpenAI o1 at the time of writing) rather than a standard LLM, and this won't happen if you describe it in other words.

Here's an example of the reason keyword being used in agent instructions - in this case, an agent I built where this is one of the agent steps to complete: 
5. Using the data from the previous step, reason to provide a recommendation of how a solution could be implemented using the available technologies. As part of this reasoning, establish any licensing uplifts which may be required for this client to use the recommended technologies. IMPORTANT - be extensive with the rationale for your decision, detailing how capabilities in the proposed technology meet specific requirements.
I'll show this agent in action throughout this article series. As you can imagine, the instruction above tells the agent to use a reasoning model for this step in order to derive the recommendation I'm asking for - that's important, because we're asking for "thinking" rather than more standard LLM processing. 

Agents need help
If it's not clear already, what we're really saying in this series is that agents need help - the work these days is preparing data, creating tools and sub-processes for agents to call into, and refining natural language descriptions and instructions until an agent behaves in the way you want. This is a new form of coding in some ways, but it doesn't all happen magically - understanding the critical techniques is key.

Articles in this series

  1. Techniques for autonomous agents in Copilot Studio - intro (this article)
  2. Scenario video - Microsoft architect with proposal generation
  3. Technique 1 - Getting AI-suitable descriptions right (data, tools, agents themselves)
  4. Technique 2 - Define explicit steps in agent instructions when "reasoning the process" isn't appropriate
  5. Technique 3 - Provide tools like Agent Flows for steps the agent can’t easily handle
  6. Technique 4 - Leveraging Power Platform and Microsoft 365 capabilities in your agents
  7. Technique 5 - Understand cost, capability, and governance implications of agents you create

Next article

Scenario video - Microsoft architect with proposal generation

Monday, 19 May 2025

My AI conference talks in early 2025 - autonomous agents, SharePoint agents, and AI choices

As we come into conference season in late spring/early summer 2025, I’m honoured to be speaking at Europe’s biggest events in the Microsoft space on AI topics that I’m super passionate about. The events are coming up fast, and as any conference speaker will tell you, preparing for talks on cutting-edge AI technologies which are in preview and changing from week to week in the run-up is easily the most fun way to spend your evenings and weekends 😊 Like many others, I wouldn’t change it for anything though of course – all part of the fun.

In case you’re interested, here’s a quick view on what I’ll be covering. Both events provide a great platform to delve into the latest advancements, network with Microsoft AI implementers, and gain valuable knowledge.

European Collaboration Summit 2025

The first event on my calendar is the European Collaboration Summit 2025, scheduled for May 26-28 in Düsseldorf, Germany. This is the largest Microsoft-focused event in Europe by attendee count (around 3,000), with a great line-up of sessions, workshops, expos, and networking opportunities. I have two talks here:

Understanding Microsoft AI options – comparing Microsoft 365 Copilot, Copilot Studio agents, SharePoint agents, and building on Azure OpenAI

Navigating the Microsoft AI landscape and making good choices is complex. There are many aspects to consider including the focus of different options, AI capabilities, consumption or licensing costs, possible data integrations, user experience factors, boundaries and limitations, potential to automate business processes and more. Crafting an appropriate AI strategy with even just these options (and nothing outside of Microsoft) is certainly not trivial for most businesses:

We’ll explore the respective strengths and implications of each of these options, hopefully delving into some aspects you may not have considered. My other talk at ECS 2025 is:
Copilot agents in SharePoint - understanding AI capabilities, governance, costs, and more

SharePoint agents are so far one of the lesser-known parts of Microsoft’s AI story but could revolutionise how we work with documents forever. After all, SharePoint is the platform which stores the world’s documents, and collectively we add 2 billion documents there per day. Something has to change in how we interact with this information, and with generative AI being woven into SharePoint in this way, SharePoint agents are absolutely part of the answer.

This talk will focus on understanding the AI capabilities within SharePoint, including governance, costs, and other important factors. We will examine how Copilot agents can enhance your SharePoint experience and what to keep in mind while leveraging these powerful tools.

Conference link: https://collabsummit.eu/

European Power Platform Conference 2025

Following the European Collaboration Summit, I will be heading to Vienna in June for the European Power Platform Conference 2025 (the event is June 16-19 specifically). With the Power Platform’s new role as a core pillar in Microsoft's AI strategy, particularly in terms of Copilot Studio agents, this is a great time for an event with this focus. I’ll be presenting on:

Techniques for Autonomous Agents (Copilot Studio)

It’s certainly fair to say there’s a lot of noise around autonomous agents - and as many apps and automations essentially evolve to be agents - generative AI can absolutely simplify how solutions are defined and implemented, where it’s no longer necessary to specify every last line of code or step in a process. With enough guidance on the outcome you're trying to achieve, AI can often plan the actions required and the specific tasks to execute. It's a new world now!

But can you really say "follow up on each marketing lead by sending a personalised e-mail to the client and update our back-end systems appropriately" and AI magically does all the right things? Where are the boundaries? How autonomous can autonomous agents be? In this session we'll show the top techniques for success, and where you need to draw the line between instructions and specific items for your agent to call.

Conference link: https://www.sharepointeurope.com/european-power-platform-conference/

Summary

These conferences offer a fantastic opportunity to immerse yourself in the latest AI developments, learn from experts, and network with like-minded professionals. As far as I know, some tickets are still available for both events at the time of writing (early May 2025), so it’s not too late if you want to take the opportunity. I’m excited to be part of these events – very much looking forward to some great conversations, sharing some insights, and learning a lot myself from the incredible sessions and discussions.

Hopefully see you at one of them!

Sunday, 23 March 2025

Copilot Studio agent cost calculator - a tool to help forecast costs

Agents are forming a big part of AI these days, bringing the promise of highly-capable assistants with gen AI intelligence and the ability to use tools, data, and apps in multi-step processes. One of Microsoft's big moves recently is to allow agents to be used in the free version of Copilot (Microsoft 365 Copilot chat) by any employee with a Microsoft 365 license - in other words, no need for the circa $30/£23 per user per month "full fat" Microsoft 365 Copilot license. In this approach, the AI usage is simply charged based on consumption in a Pay As You Go manner, thus avoiding the need to license the entire organisation with Microsoft 365 Copilot. This is interesting because it starts to make AI very useful (e.g. by having it understand company data and documents) without committing to the $30/£23 per user per month cost of M365 Copilot. 

But how much do you pay? In the Microsoft world, Copilot Studio agents will be the most popular since they can be created by business users without code - arguably it's a new form of Power Apps, but exposed prominently to all Microsoft 365 users. Other types of agents can be created in the Microsoft world - including declarative agents and custom engine agents - but these typically require pro developers and a more specialist skillset. We're focused on Copilot Studio agents in this article due to their likely popularity. As usual with AI, it pays to understand what costs are likely to be ahead of time - also as usual, the devil is in the detail and there are some surprises. In this post I share an Excel calculator which can help you forecast costs and outline some of the considerations.

Agent costs - and the relationship with Microsoft 365 Copilot

To avoid doubling up of costs, agents are largely free for any employees licensed with Microsoft 365 Copilot - you're already paying for a full Copilot license for these folks after all, so most agent actions are free for these users. Not all actions, however - here's a summary:
  • Users without a M365 Copilot license - all actions within an agent count as 'consumption'
  • Users with a M365 Copilot license - most actions are free, but the exception is 'autonomous actions' which use generative AI to make decisions on how to process
So even before we get to the detail, that's a wrinkle which adds some complexity to predicting costs. Will you (or the business) create agents which rely on autonomous actions? That's hard to say without getting close to use cases and agent design. Unfortunately there's no way to prevent this type of action being used, so even if your organisation licenses lots of users with full M365 Copilot you may still incur AI agent costs. See my footnote at the bottom of this article for more on this.  

For the purposes of simplicity, this article (and the linked Excel calculator) will focus on forecasting costs for users without a M365 Copilot license - in other words, the "free Microsoft 365 Copilot chat + Copilot Studio agents" combo. Lots of organisations are interested in this since organisational use cases get unlocked, potentially with a simpler investment case. If your organisation has lots of M365 Copilot licensed users and you think agents using autonomous actions will be created by the business (there's no way to stop this currently), layering in these additional costs onto my Excel would certainly be possible. 

Before we get to costs, let's think about agent flavours.

Types of agent and surface points 

There are lots of ways of categorising agents, but this classification based on what the agent is primarily there for is helpful:

A retrieval agent could help employees find answers from key documents or a knowledge base, whereas at the other end of the spectrum an autonomous agent could be tasked with competitor research and proposing new service lines for a company. At that level of complexity there's lots to think about of course, but it does illustrate the range of agents that can exist. The above capabilities can obviously be combined, and some agents will use all three elements of retrieval, task completion, and autonomy within a process.

In terms of surface points and user experience, Copilot Studio agents can be used to:

  • Provide a standalone agent experience - for example, an employee onboarding agent embedded in a HR site or in Teams
  • Extend Microsoft 365 Copilot - for example, to tailor behaviour so that questions about company policies are sourced from the HR handbook only
Both of those examples are essentially retrieval agents. 

Copilot Studio pricing and message burn rates 

Copilot Studio agent usage is charged in 'messages' based on complexity of the action. You have two ways of paying for these:
  • Pay As You Go - no upfront commitment, $0.01 per message 
  • Buy a message pack - 25,000 messages per month, also $0.01 per message
Regardless of how you pay, the key element is the 'burn rate' of different actions your agents may use. Microsoft publish the table below to explain how different agent actions consume messages - remember from the earlier note, employees without a Copilot license will be charged for all actions in this list, and even employees with a full M365 Copilot license will incur costs if autonomous actions are used:

April 2025 pricing update
Microsoft announced some significant price decreases for certain agent actions listed below - for example, 'tenant graph grounding' decreased from counting as 30 messages to 10. All of the info in this post and the linked Excel calculator has been updated with these changes.
 




So, there's no way around needing to understand (or guess) what your agents may do. To simplify things, one approach is to imagine a typical agent and the combination of actions it performs, then multiply it by how many such agents you anticipate the business to use. 

Let's look at some scenarios.

Calculating agent costs

We need some quick assumptions. I'm assuming that a typical conversation an employee has with an agent will perform:
  • 3 info lookups, known as tenant graph grounding (i.e. to get the info from Microsoft 365, whether that's a knowledge base, site, or set of documents, to answer the prompt/question)
  • 4 usages of generative AI (i.e. to generate the response text)
You might disagree with these averages, but they are all editable in the calculator. My logic is that most conversations with AI are not single-shot - it's common to ask a follow-up question, and if we say that happens around 3 times in a typical conversation, I believe that's about right. We also need to consider how many conversations per day - in these default calculations, I go for 30 per day. Again, that could be highly variable for you (e.g. based on size of company, how many employees the agent is relevant to etc.) - all editable in the calculator, but I think 30 is reasonable for a useful agent in a medium/large business. 

Scenario 1 - 10 agents, 30 conversations per day

This scenario gives an annual cost of $30,096:


Scenario 2 - 30 agents, 30 conversations per day

"But Chris, we're probably not going to end up with just 10 agents across our entire business - it would be more than that in time." I'm hearing this frequently, so let's model that - costs obviously multiply by 3, giving a yearly total of $90,288:



Both of these costs typically come as a surprise. We can debate exactly what the right parameters should be for each business (and we should, because the inputs used will influence the costs heavily) - but clearly there is something significant to perform diligence on here, and I think it may not be apparent to most. Anyone thinking of using agents without "full" Copilot licensing across the board should be looking at this closely - but it won't necessarily come through in the Microsoft documentation, you need to do your own modelling.
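The arithmetic behind these scenarios can be sketched in a few lines of Python. The burn rates used here (10 messages per tenant graph grounding operation, 2 per generative answer) and the 264 working days per year (22 days x 12 months) are my assumptions to reproduce the figures above - check Microsoft's current burn rate table before relying on them:

```python
# Rough model of Copilot Studio agent costs for users WITHOUT a
# Microsoft 365 Copilot license, paying $0.01 per message (PAYG).
# Burn rates are assumptions based on the April 2025 pricing update.

PRICE_PER_MESSAGE = 0.01        # USD, Pay As You Go
TENANT_GRAPH_GROUNDING = 10     # messages per info lookup (assumed rate)
GENERATIVE_ANSWER = 2           # messages per generated response (assumed rate)

def annual_cost(agents, conversations_per_day,
                groundings=3, gen_answers=4, working_days=264):
    """Forecast yearly agent cost; 264 = 22 working days x 12 months."""
    messages_per_conversation = (groundings * TENANT_GRAPH_GROUNDING
                                 + gen_answers * GENERATIVE_ANSWER)
    cost_per_conversation = messages_per_conversation * PRICE_PER_MESSAGE
    return agents * conversations_per_day * cost_per_conversation * working_days

print(f"Scenario 1 (10 agents): ${annual_cost(10, 30):,.0f}")  # $30,096
print(f"Scenario 2 (30 agents): ${annual_cost(30, 30):,.0f}")  # $90,288
```

As with the Excel calculator, every input is a variable you can adjust to model your own organisation.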

Accessing the calculator
The calculator is a simple Excel file available for download here:


Closing thoughts 

The combo of "minimal M365 Copilot licensing + agents" is quite alluring for organisations where rolling out M365 Copilot to all employees doesn't fit, or those simply struggling with the business case. However, this alternative to full Copilot may not be as cost effective as anticipated - and for their part, Microsoft are clearly hoping that it can work as an on-ramp and tip organisations into making a more substantial investment in their premier AI tool. If not, that's probably fine too because revenue from PAYG usage may be just as beneficial and potentially carry a higher margin. I wouldn't say there's anything too cynical or mercenary from Microsoft here - every major SaaS vendor has a degree of PAYG pricing for AI consumption in their platform, since it brings flexibility and the avoidance of an upfront commitment.

Will agents drive more organisations to license M365 Copilot more broadly? I'd say yes - after all, once agents are in use and departments, teams, and individuals are all using them to get work done, good luck taking them away! Few leaders will be able to do that easily if agents are delivering the gen AI promise.

It also won't be easy to calculate the exact tipping point where it becomes worthwhile to switch to M365 Copilot licensing - that will all depend on your agent usage and number of employees being considered for licenses. Being able to implement cross-charging for AI consumption will also grow in importance, though for some businesses a cost will be a cost wherever it lands.

As ever, AI continues to raise complexities that need expertise on the team (whether in-house or with a partner) to help navigate the choices. This new world of "AI FinOps" is just one part of it - but there's no doubt organisations are realising gains from AI now, so the bigger long-term risk may come from not seeking to master these factors.

References:

Footnote - M365 Copilot licensed users and autonomous actions in agents

I highlight in the article that even M365 Copilot licensed users are charged for autonomous actions in agents, though every other action type is free. I derive this from the Microsoft documentation, where autonomous actions are omitted from the list of actions free for these users:








Thursday, 23 January 2025

Autofill columns in Microsoft 365/SharePoint – comparing AI costs to alternatives

As detailed in the last article, “autofill” columns in SharePoint are a very interesting built-in AI capability in Microsoft 365 where LLM/GPT capabilities can be used very easily with your documents. Unlike Copilot, which can’t really be used in bulk across your documents because it’s driven by the end-user, autofill columns could be used to summarise, categorise, or extract info from hundreds or thousands of documents very easily indeed. Pseudo-automation approaches with Copilot do exist such as creating an agent, but not all employees will do that and autofill columns are perhaps the simplest way of tapping into AI to get some of the benefits. Today, autofill columns use GPT-4 Turbo as the LLM and we can expect Microsoft to continue upgrading as newer models become available. 

As a recap, I gave these examples in the last article:

  • Document summaries – the full document summarised automatically in your preferred format (e.g. 3 bullet points, a few sentences etc.)
  • Key takeaways – a summary focused not on the full document, but the conclusions only
  • Key info extraction – identify the client/project/business unit this file relates to
  • Key info extraction – list the people/organisations/concepts/[something else] mentioned in this document
  • Classification – categorise the document, either generically or according to a list you provide (e.g. into “RFP”, “proposal”, “statement of work”)
  • Automated assessment - based on the contents of the document, should this be approved or sent for human review?
  • New info generation – based on the contents of the document, how does it correspond to X?

If more inspiration is useful, my last article shows the results of implementing a couple of these scenarios, showing the prompt used and what the AI did.

Costs - does this cost more, less, or the same as other AI tools I could use?

Experimenting with AI is generally cheap, but to use in real-world scenarios someone somewhere needs to understand the costs and pricing model closely. However, you’ll rarely see comparison tables on different AI approaches and respective costs published by vendors like Microsoft – you’ll see them for AI models, but rarely for overall approaches. It’s not that AI vendors are necessarily being disingenuous, more that there are a few variables involved such as whether your developer/team/partner has the skills to use different approaches – perhaps they can only use low-code or easy to consume AI for example and code development is out of the question.

But things do open up when you can plug into Azure OpenAI directly for LLM models for instance, because you avoid paying the “abstraction premium” for simpler approaches where Microsoft have done more of the work for you. It was ever thus as we say in England, and this is certainly the case for autofill columns.

Hit me with some numbers

Let’s make two comparisons with other ways you could consume AI, focusing on use cases like those above which align well with autofill columns:

  • Syntex document understanding (for extraction or classification scenarios especially)
  • A Copilot agent or Power Automate solution where the document contents are passed to an AI model (most likely by calling into Azure OpenAI)

Comparing with Syntex costs

[Updated February 2025, see the "newsflash" section later in this article] From March 2025, SharePoint autofill columns are priced at $0.005 per transaction (which means a page) - significantly cheaper than Syntex unstructured document processing at $0.05 per transaction, or the pre-built Syntex models at $0.01 per transaction. However, we may see similar cost reductions in Syntex following the changes to autofill pricing, which would even this out. Between the two Syntex approaches, it comes down to whether there's a pre-built model that suits your needs (e.g. for simple work with contracts, invoices, receipts, forms, PII detection, or language detection) - where there is, costs are much lower at $0.01 per transaction (page).
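For a quick sanity check on those per-page prices, the multipliers fall out directly (prices as quoted in this post - verify against current Microsoft pricing):

```python
# Per-transaction (page) cost comparison, March 2025 list prices.
AUTOFILL = 0.005            # USD per page
SYNTEX_UNSTRUCTURED = 0.05  # USD per page, unstructured document processing
SYNTEX_PREBUILT = 0.01      # USD per page, pre-built Syntex models

print(f"vs Syntex unstructured: {SYNTEX_UNSTRUCTURED / AUTOFILL:.0f}x cheaper")  # 10x
print(f"vs Syntex pre-built:    {SYNTEX_PREBUILT / AUTOFILL:.0f}x cheaper")      # 2x
```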

Result – cheaper with autofill (unless Syntex pricing gets adjusted too)

Syntex specialises in extracting key info from documents – but if you want to use AI in some other way and/or you need generative AI and LLMs you’ll need to look elsewhere. So, what about the idea of a Copilot agent or Power Automate solution?

Comparing with Azure OpenAI costs (via a Copilot agent or Flow)

Here’s where things get interesting. If you have the technical skills to create an agent or automation which calls into AI and passes documents to it, you can reduce AI costs *fairly* dramatically - but note the differential has (happily) come down since writing this article - full details of the update are below. This is partly because going directly to AI is always cheaper than going via an “end-user” approach in Microsoft 365, and partly because you can control the AI model used. With autofill columns the model used is down to Microsoft (GPT-4 Turbo is referenced in the documentation), but with a built solution, choosing one of the latest “mini” models rather than full blown GPT-4 Turbo or similar has a dramatic effect.

A cost comparison calls for a quick spreadsheet. Let’s compare autofill columns with using two of the current AI models but going direct:

  • GPT-4o
  • GPT-4o mini

Here are the details of the scenario I’m modelling (see footnote at the end if you want more on this):

  • 1000 documents per month processed by AI (e.g. summarised, analysed etc.)
  • Each doc has 100 pages
  • Assume average word density
  • Azure OpenAI pricing details – USD (to help compare to autofill pricing which is also in $), East US Azure region, non-provisioned capacity

Here’s the calculation and findings (based on pricing in January 2025) – the costs for the scenario per month are highlighted in blue, with the differential to autofill pricing in red:

Newsflash! Significantly reduced autofill costs
Clearly this entire topic was in focus for Microsoft, because soon after this article was published, a 90% reduction to autofill costs was announced - from $0.05 per page to $0.005 per page, starting March 2025. The change would have already been in motion rather than triggered by anything I highlighted, but it demonstrates that Microsoft *are* being proactive in getting newer and cheaper AI models behind built-in gen AI services like autofill and Copilot, and passing on cost reductions to clients.

*With that said*, the updated calculations below show there can still be a significant differential between autofill and developer-led AI usage. The cost decreases certainly diminish the impact unless you're working at scale, but in some cases that's exactly where the AI value is. See the updated numbers below - what do you think?

 
To summarise:
  • You’ll pay 3.31 times as much to use autofill columns compared to going direct with a similar model (GPT-4o)
  • You’ll pay 55.11 times as much to use autofill columns compared to going direct with a cheaper “mini” model (GPT-4o mini)

The costs for the scenario using each approach are:

  • GPT-4o mini - $9.07 per month
  • GPT-4o - $151.20 per month
  • Autofill columns - $500 per month

Before the update, those numbers were even more striking (at 33x and 551x difference respectively, and a total bill of $5,000 per month using autofill for the scenario) - but even with the February 2025 price reductions, there’s still something to consider. Over a year the difference is magnified, of course - we’d be talking $109 for the cheapest approach with GPT-4o mini vs. $6,000 for autofill columns.
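To make the arithmetic transparent, here’s a small Python sketch of the calculation behind these numbers. The per-1M-token rates are my assumptions based on published Azure OpenAI pay-as-you-go pricing (East US, January 2025), alongside the reduced autofill rate of $0.005 per page - plug in your own rates and volumes to model your scenario:

```python
# Monthly cost model for the scenario described above.
# Assumed rates - check current Azure OpenAI and Syntex pricing for your region.

DOCS_PER_MONTH = 1000
PAGES_PER_DOC = 100
INPUT_TOKENS_PER_PAGE = 600    # average word density (see footnote)
OUTPUT_TOKENS_PER_DOC = 120    # the summary the AI returns per document

# USD per 1M tokens: (input, output)
PRICING = {
    "GPT-4o":      (2.50, 10.00),
    "GPT-4o mini": (0.15, 0.60),
}

AUTOFILL_PER_PAGE = 0.005  # USD per page, post-reduction rate

def monthly_cost(model: str) -> float:
    """Monthly Azure OpenAI cost for the scenario, in USD."""
    in_price, out_price = PRICING[model]
    input_tokens = DOCS_PER_MONTH * PAGES_PER_DOC * INPUT_TOKENS_PER_PAGE
    output_tokens = DOCS_PER_MONTH * OUTPUT_TOKENS_PER_DOC
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

autofill = DOCS_PER_MONTH * PAGES_PER_DOC * AUTOFILL_PER_PAGE
for model in PRICING:
    cost = monthly_cost(model)
    print(f"{model}: ${cost:,.2f}/month ({autofill / cost:.2f}x cheaper than autofill)")
print(f"Autofill columns: ${autofill:,.2f}/month")
```

Running this reproduces the figures above: $151.20/month for GPT-4o (a 3.31x differential), $9.07/month for GPT-4o mini (55.11x), and $500/month for autofill columns.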

Result – going direct to Azure AI is (still) significantly cheaper

What do we take from this? Should autofill columns be avoided?

Even with these cost differences, autofill columns could absolutely be the most cost effective and ‘valid’ choice for your AI use case. The key considerations which come into view are:

  • The scale you’re working at and how much AI is needed to fulfil it (e.g. how many documents and pages)
  • Possible avoidance of development costs

Autofill columns are ready-to-go AI that the business can use in a self-serve way. Certainly, cost governance measures need to be put in place to avoid bill shock, but there are options for this – at the least, someone can be notified when spend looks set to exceed a defined threshold so they can intervene (by implementing an Azure budget on the Syntex meters). With other approaches, AI costs using Azure OpenAI are much lower but may require developer or solution maker time – and that implementation cost could outweigh the cheaper AI costs, depending on the effort required, day rates, and so on.

Overall, any options which put AI in the hands of the business should be welcomed since that’s where processes are understood and value can be easily created. Just take care with those cost governance measures if you enable any form of Pay As You Go AI.

Summary (and a view as a Microsoft partner/solution provider)

As someone working for a partner delivering services at the forefront of Microsoft AI, it’s interesting to consider the impact of “ready to go AI” in Microsoft 365 on companies like ours. If our clients can use AI without our help (such as autofill columns), will we deliver less AI advisory and implementation work? Will all our efforts to deeply understand the technologies, the factors in using each one, and how to create business value with them go to waste? Well, frankly it’s hard to imagine that based on the realities of getting value from AI in production scenarios – if anything, I think the reverse might be true. What we’ve seen in this analysis (and any other time we look at “what to use when” in AI) is:

  • You always pay a significant premium for simplified/abstracted AI – whether it’s Copilot, autofill columns, or something else. As a result, a ‘developed’ solution can save £1000s per year for many use cases due to the sheer difference in AI costs
  • It’s a complex landscape where partners can add a lot of value - understanding the factors takes a lot of context and closeness to Microsoft developments
  • Deriving likely AI costs for a use case is not simple – some modelling is required, using a deep understanding of Microsoft technology. And cost comparisons across approaches are rarely found in the documentation

Nevertheless, the value is out there. While it's certainly true that the right decisions need to be made within a well-run "POC to production" motion, we're seeing more and more AI use cases related to high-value processes where there's a significant cost saving to the organisation. Often there are some accompanying benefits to employees too, and both will only accrue further as time goes on. 

Understanding the right AI technology and approach for the use case is the key of course - no more, no less. 



Footnote - more detail on the cost comparison

For simplicity, I’m considering the AI consumption costs only. Depending on circumstances, you may also have Copilot agent or Power Automate costs to consider.

To compare the AI costs for autofill columns vs. Azure OpenAI, we need to calculate a “per page” cost for both approaches. Autofill columns and Syntex models are priced by page, whereas consuming GPT models directly is always priced by token usage. To compare apples to apples, I modelled AI usage similar to the “create summaries for all my documents” autofill column scenario, using:

  • 600 input tokens per page (average word density of a Word document page is 500-700 tokens, from my testing)
  • 100 pages per document
  • 120 output tokens for the AI response (which corresponds to around 200 words of output, the same as the prompt I used in my autofill column examples above)

These details are shown in the yellow box at the bottom of the Excel sheet and are used in the formulas.
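The token-based pricing can also be converted into an equivalent “per page” figure for a direct comparison with autofill. A minimal sketch, assuming the footnote’s token counts and illustrative GPT-4o rates of $2.50 / $10.00 per 1M input/output tokens (check current Azure OpenAI pricing for your region):

```python
# Convert token-based pricing into an equivalent "per page" cost.

INPUT_TOKENS_PER_PAGE = 600
OUTPUT_TOKENS_PER_PAGE = 120 / 100   # 120 output tokens spread over a 100-page doc

def per_page_cost(input_price_per_1m: float, output_price_per_1m: float) -> float:
    """Equivalent per-page cost of direct token-based pricing, in USD."""
    return (INPUT_TOKENS_PER_PAGE * input_price_per_1m
            + OUTPUT_TOKENS_PER_PAGE * output_price_per_1m) / 1_000_000

gpt4o_page = per_page_cost(2.50, 10.00)
autofill_page = 0.005                # autofill rate after the reduction
print(f"GPT-4o ≈ ${gpt4o_page:.6f}/page vs autofill ${autofill_page}/page")
```

This gives roughly $0.0015 per page for direct GPT-4o usage - and $0.005 / $0.0015 is where the 3.31x differential in the main comparison comes from.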