Tuesday, 11 November 2025

Autonomous agents - Technique 2: Define explicit steps in agent instructions when "reasoning the process" isn't appropriate

One of the main reasons AI agents are such a leap forward in automating work comes down to their ability to reason and dynamically plan in order to reach an outcome - much as we humans intuitively do in our work. The apps and systems we've developed over the past few decades have essentially all followed set processes with procedural code, and while many have flexible logic and business rules, today's AI models take us several levels beyond this. Ultimately we move away from a world of apps with explicitly coded steps to one where AI agents are often instructed with the end goal or outcome, but not necessarily a granular process for how to get there. So long as the agent is given the data, tools, and guidance to do this, it should have a good chance of meeting the outcome. Or that's the theory.

One of the biggest challenges today with autonomous agents is that they can have too much freedom and go off the rails from the intended task. We're not talking dystopian takeover, simply unpredictable agents with poor results and a lack of consistency. So, this post covers how to make your agent more reliable by finding the balance between autonomy and specific guidance - it all comes down to how you write the agent instructions. 

But first, here's a recap of the full series:

Articles in this series

  1. Techniques for autonomous agents in Copilot Studio - intro 
  2. Scenario video - Microsoft architect with proposal generation
  3. Technique 1 - Getting AI-suitable descriptions right - data, tools, agents themselves 
  4. Technique 2 - Define explicit steps in agent instructions when "reasoning the process" isn't appropriate (this article)
  5. Technique 3 - Provide tools like Agent Flows for steps the agent can’t easily handle
  6. Technique 4 - Leveraging Power Platform and Microsoft 365 capabilities in your agents
  7. Technique 5 - Understand cost, capability, and governance implications of agents you create

Some lessons learnt on agent instructions and autonomy

Going back to our Microsoft 365 architect agent used in this series (which automates some of our work at Advania by considering a business challenge and the client's tech landscape, then making an architectural and product recommendation), here's how I started out with the instructions I gave to the agent. 

Poor agent instructions
The user will input a use case or business challenge. Using your knowledge of Microsoft 365 licensing and the client's environment and licensing, provide a recommendation of how a solution could be implemented using the available technologies. Generate a proposal outline to implement the solution - propose the technology(s) to be used, any licensing uplifts required which aren't currently in place or planned, and an overview of what the project would look like. Create a Word document containing the prompt and response. Also log the results to the SharePoint list in the "Syntex-AI-intelligence" site named "Technology proposals".

All seems quite reasonable and specific - I'm outlining specific goals and steps to take, and even being ultra-clear on where to find the SharePoint site to log the output to. However, let's take a look at the result:

What's actually happening here is:
  • The agent only runs one step - which is never going to work
  • The agent is trying to determine the overall architectural solution simply from the first knowledge source (the m365maps.com site, which provides info on Microsoft 365 products and SKUs)
I won't show the end output here to avoid confusing things, but the actual answer given by the agent is very low quality - it does say that Defender for Endpoint Plan 2 could be used (correct) as seen in the screenshot above, but the agent fails on several key things we want it to do: 
  • There's no real consideration of the factors here - the agent appears to be doing some basic querying only. No basis for the recommendation is specified, and there's no evaluation of the licensing uplifts which may be required (as asked for in the instructions)
  • The agent didn't log its results to the list as requested
  • The agent didn't create the Word proposal document as requested
So, clearly this isn't working even if it has a hint of the right answer. 

Defining explicit steps in a numbered process

The reality is we need to be much more explicit with agents today - expecting them to be able to "reason the process" from a set of instructions often doesn't work and results will be unpredictable. In particular, where a process has steps that truly need to be followed in sequence - like fetching some data and then reasoning over it - the thing that makes the difference is specifying these steps in a numbered list. 

Here are my revised instructions:

Sequenced agent instructions
  1. Derive the client organisation name from the user's prompt. If you are unsure of the organisation, ask the user for clarification. 
  2. Derive the use case or business challenge the client is asking about from the user's prompt. If you are unsure of the use case, ask the user for clarification. 
  3. Using the client name you already derived, use your knowledge sources to understand which technologies the organisation is licensed for. 
  4. Now use your knowledge sources to ensure you understand the individual Microsoft products and sub-capabilities of these Microsoft 365 SKUs - build a categorised list for use in later steps. 
  5. Using the data from the previous step, reason to provide a recommendation of Microsoft technologies which could meet the specific requirements. IMPORTANT - be extensive with the rationale for your decision, mapping how capabilities in the proposed Microsoft technology meet specific client requirements and detail this in your output. If no recommendation can be made (e.g. because the use case cannot be met with the licensed technologies), make a recommendation of which technologies could be purchased to help. As part of this reasoning, establish any licensing uplifts which may be required for this client to use the recommended technologies. ## Be sure to detail your chain of thought in the output ##
  6. Also reason to establish any non-Microsoft technologies that could potentially be used to fulfil the requirement. 
  7. Generate a proposal outline to implement the solution - propose the technology(s) to be used, any licensing uplifts required which aren't currently in place or planned, and an overview of what the project would look like to implement the Microsoft-based solution. 
  8. Log the results to the SharePoint list in the "Syntex-AI-intelligence" site named "Technology proposals".
  9. Create a Word document containing all details of client requirement and proposed solution - ensure to use the full reasoned output in the details. Save this to the SharePoint library in the "Syntex-AI-intelligence" site named "Technology proposal documents".
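Copilot Studio instructions are authored as plain text in the agent's Instructions field, but the same pattern applies anywhere you assemble a system prompt programmatically. As a minimal sketch (the step wording here is condensed and illustrative, not the exact instructions above), a small helper that turns an ordered list of steps into a numbered instruction block makes the sequencing explicit and easy to maintain:

```python
def build_sequenced_instructions(steps, preamble=""):
    """Turn an ordered list of step descriptions into a numbered
    instruction block, telling the agent to follow them in order."""
    lines = [preamble] if preamble else []
    lines.append("Follow these steps in order:")
    lines.extend(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return "\n".join(lines)

# Illustrative steps condensed from the instructions above
steps = [
    "Derive the client organisation name from the user's prompt; ask for clarification if unsure.",
    "Derive the use case or business challenge; ask for clarification if unsure.",
    "Use your knowledge sources to establish which technologies the client is licensed for.",
    "Build a categorised list of products and sub-capabilities in those SKUs.",
    "Reason over that data to recommend technologies, detailing your rationale and any licensing uplifts.",
]

instructions = build_sequenced_instructions(steps)
print(instructions)
```

Keeping the steps as a list in source control also makes it easier to review and iterate on instruction changes over time.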

I've expanded on the detail of what I want in some places, but the other major change is switching to a numbered list to instruct the agent to follow those steps in order. Now let's look at the difference:


 
Looking at those images, straight away we can see the agent is now following many more steps - and if we drill into the two "Knowledge" steps shown above, we can see the query used for each (and the results) now make sense:
  • Advania client data - the agent asks what technologies the specific client is licensed for
  • M365maps.com - the agent uses this to discover which capabilities are in each granular Microsoft 365 SKU
This sets us on the path to success.

Results from the agent 

At this point we hope our agent will properly consider the client challenge provided to it (e.g. replace our endpoint protection, plan for a new intranet, introduce a compliance tool etc.), then consider - given their technology environment and licensing - whether anything in Microsoft 365 fits the bill, and then reason through all of this to produce a technology recommendation, covering any licensing uplifts needed as well as adoption/migration considerations. Here was my prompt:

Unilever (fictional - remember I'm substituting real Advania client data with names from the top 20 UK companies), one of our clients, want to roll out a new endpoint protection technology to replace CrowdStrike across approximately 100,000 end-user devices. Ransomware protection is a key concern, and the RFP lists automated incident response and remediation, integrated threat intelligence, and sandboxing as required capabilities. 

Notably, they need coverage for 5,000 devices where users only have Microsoft 365 F1 licenses (frontline workers).

Here's the result:

Chris, here’s a complete recommendation and delivery outline for Unilever’s endpoint protection RFP.

Organization: Unilever
  • Use case and business challenge inferred: Replace CrowdStrike across ~100,000 end-user devices.
  • Must meet: strong ransomware protection, automated incident response and remediation, integrated threat intelligence, and sandboxing/detonation.
  • Coverage gap to address: 5,000 devices used by Microsoft 365 F1 frontline workers.
What Unilever is licensed for today (from our records)
  • User licensing: Majority on Microsoft 365 E3 with EMS add-on; ~15,000 on E5; ~10,000 Microsoft 365 F1 frontline licenses.
  • Security stack in place: Entra ID P1, Microsoft Sentinel (Azure Sentinel).
  • Endpoint and platforms: Endpoints managed with Intune; currently using CrowdStrike for endpoint protection; Mimecast for email; AVD and Citrix in use; Azure in UK/US/AUS regions.
Microsoft products and sub-capabilities relevant to this RFP (categorized)
  • Endpoint protection and EDR
    • Microsoft Defender for Endpoint Plan 1 (MDE P1): next‑gen AV, tamper protection, attack surface reduction (ASR), web content filtering, centralized management, cross‑platform support, “Block at First Sight.”
    • Microsoft Defender for Endpoint Plan 2 (MDE P2): adds endpoint detection and response (EDR), threat analytics, vulnerability management (core), and longer searchable data retention; forms the endpoint pillar of Microsoft 365 Defender (XDR).
  • Automated investigation and remediation (AIR)
    • Microsoft 365 Defender suite provides AIR across workloads; Microsoft documentation highlights AIR in Defender for Office 365 Plan 2 (email) as an example capability within the suite.
  • Threat intelligence
    • Threat Analytics within MDE P2 delivers integrated intelligence and analyst‑ready reports that inform investigations.
  • Sandboxing/detonation
    • Endpoint: “Block at First Sight” uses rapid cloud evaluation and detonation for unknown files for preventative blocking on endpoints (P1+).
    • Email/collab: Defender for Office 365 Plan 2 adds Safe Attachments (detonation sandbox) and Threat Explorer for campaign-level investigation.
  • SIEM/SOAR
    • Microsoft Sentinel already present for Unilever; integrates natively with Defender signals for advanced hunting/automation.
  • Device and app management
    • Microsoft Intune already in use for policy, configuration, and deployment orchestration.
Summary recommendation
  • Standardise on Microsoft Defender for Endpoint Plan 2 across the Windows, macOS, Linux, Android, and iOS estate; manage deployment and policy via Intune; unify signals in Microsoft 365 Defender (XDR) and stream to Microsoft Sentinel for SIEM/SOAR.
Why this meets each RFP requirement
  • Ransomware protection
    • Prevent: Next‑gen AV, ASR rules, tamper protection, and “Block at First Sight” provide strong preventative controls on endpoints (P1 baseline).
    • Detect and contain: MDE P2 adds behaviour‑based EDR with live response, threat analytics, and extended searchable telemetry to spot hands‑on‑keyboard activity and lateral movement quickly.
  • Automated incident response and remediation
    • MDE P2 enables automated investigation and remediation on endpoints to contain and fix threats (part of Microsoft 365 Defender’s AIR capabilities).
    • Sentinel playbooks (Logic Apps) can orchestrate cross‑tool actions at scale (contain device, isolate network, reset credentials) using Defender incidents as triggers; Unilever already has Sentinel.
  • Integrated threat intelligence
    • Threat Analytics in MDE P2 provides curated intel mapped to current campaigns and exposures, tied directly to Unilever’s devices and vulnerabilities for prioritisation.
  • Sandboxing/detonation
    • Endpoint: Cloud‑based file analysis and “Block at First Sight” help pre‑empt execution of suspicious payloads on endpoints.
    • Email: If Unilever later consolidates on Defender for Office 365 Plan 2, Safe Attachments detonates content pre‑delivery and contributes to unified incidents in Microsoft 365 Defender. 
How this fits Unilever’s current environment
  • Intune is already the endpoint manager, making at‑scale onboarding, policy configuration (ASR, AV, EDR), and gradual CrowdStrike co‑existence/cutover straightforward.
  • Sentinel is already deployed, so incident fusion, hunting queries, and automation can be turned on rapidly to cover both endpoint and other data sources.
Licensing path to enable the above
  • Users currently on Microsoft 365 E5: MDE P2 rights are included; no uplift needed for those users.
  • Users on Microsoft 365 E3 with EMS add‑on: add either the Microsoft 365 E5 Security add‑on (bundled security including MDE P2) or standalone Microsoft Defender for Endpoint Plan 2 licenses for those users. 
  • Frontline worker devices on Microsoft 365 F1 (5,000 devices): options to cover endpoints with MDE P2
    • Option A (preferred for feature parity): move these users to Microsoft 365 F3 and attach an appropriate security add‑on that grants MDE P2 (for example, an enterprise security add‑on akin to E5 Security for frontline) or assign standalone MDE P2 per user where permitted. Validate final SKU mix with Microsoft licensing because frontline bundles vary by agreement and commercial construct.
    • Option B (kiosk/shared scenarios): if devices are truly shared and user‑based licensing is impractical, explore device‑based licensing options with Microsoft for MDE where applicable. Validate with the Microsoft account team to align with Unilever’s Enterprise Agreement terms.
Note: The SKU matrix shows where endpoint features and EDR/Threat Analytics land (P1 vs P2/E5 Security). Use this to ensure every user segment lands on P2 for EDR/AIR. 

Commercial and operational rationale
  • Reduces agent sprawl by consolidating into the Microsoft stack Unilever already operates (Intune, Sentinel), lowering operational overhead and integration complexity.
  • Meets or exceeds the specified capabilities with unified incidents, threat intel, and at‑scale automation, while improving mean time to detect/respond.
Non‑Microsoft alternatives that meet the specification (for RFP completeness)
  • CrowdStrike Falcon Insight + Falcon X (sandbox) + Falcon Fusion for automation.
  • SentinelOne Singularity Complete/Control with Storyline automation and rollback.
  • Palo Alto Cortex XDR Pro + WildFire for detonation + XSOAR automation.
  • Sophos Intercept X Advanced with XDR + Sophos Sandstorm.
  • Trend Micro Apex One SaaS + Vision One XDR + detonation sandbox. 
Note: These all offer ransomware protection, automated response, TI, and detonation. The decisive factors will be agent footprint, integration with Intune/Sentinel, and TCO relative to Microsoft’s included rights.

Delivery proposal outline for Unilever
  • Scope: Replace CrowdStrike with Microsoft Defender for Endpoint P2 on ~100,000 devices; ensure parity on ransomware protection, AIR, TI, and sandboxing; integrate with Microsoft 365 Defender and Sentinel.
  • Phases and timeline (indicative)
    • Readiness and design (3–4 weeks)
      • Confirm license entitlements per segment; define co‑existence approach with CrowdStrike; finalize policies (AV/ASR/EDR), device groups, and ring strategy; connect Defender to Sentinel with required data connectors and analytics rules.
    • Pilot (4–6 weeks, ~2,000 devices across key personas/regions)
      • Deploy MDE sensor via Intune; validate protection efficacy, AIR behaviour, endpoint isolation, and SOC workflows; tune ASR/tamper protection; validate detonation/“Block at First Sight.”
    • Scale‑out waves (8–12 weeks)
      • Roll out by business unit/geo; enable EDR in block mode; progressively disable CrowdStrike on devices once Defender is in “active” mode and stable; expand Sentinel automations and hunting queries.
    • Cutover and decommission (2–4 weeks)
      • Remove CrowdStrike agent; finalize GPO/MDM baselines; confirm incident volumes, MTTR, and coverage; complete runbook documentation and knowledge transfer for SecOps/Endpoint teams.
    • Hardening and optimization (ongoing)
      • Leverage Threat Analytics and TVM to drive risk reduction; regular red/blue exercises; quarterly policy reviews; rationalize email sandboxing approach if consolidating on Defender for Office 365 Plan 2.
Architecture and integration highlights
  • Intune for deployment/policy; Microsoft Defender for Endpoint P2 on endpoints; Microsoft 365 Defender for incident correlation; Microsoft Sentinel for SIEM/SOAR rules, hunting, and playbooks; integration to ServiceNow/Jira for ticketing as needed.
Roles and responsibilities
  • Our team: solution design, pilot, deployment tooling, Sentinel content and automation, change and cutover runbooks, training.
  • Unilever: environment access, pilot cohort sponsorship, change approvals, SOC/process owners, endpoint engineering partnership.
SharePoint list logging (Technology proposals)
  • Action prepared: Log the proposal to Syntex-AI-intelligence > Technology proposals with fields: Client=Unilever; Use case=Endpoint protection replacement; Scope=100k devices; Key requirements=ransomware, AIR, TI, sandboxing; Recommendation=MDE P2 + M365 Defender + Sentinel; Licensing notes=E5 covered; E3 add E5 Security or MDE P2; F1 coverage options; Status=Proposed; Next steps=Pilot planning.
  • Limitation: I don’t currently have write access via this interface to post to your SharePoint list. If you can grant me access or confirm the list URL and credentials, I will submit this entry immediately on your behalf.
Word document creation (Technology proposal documents)
  • Action prepared: Generate a Word document containing this full recommendation, capability mapping, licensing options, and delivery plan, and save to Syntex-AI-intelligence > Technology proposal documents with a filename like “Unilever – Endpoint Protection Replacement – Proposal.docx.”
  • Limitation: I don’t have write access to create/save documents to your SharePoint library from here. If you share a drop‑off library URL or grant permissions, I’ll create and upload the document automatically; alternatively I can send you the .docx for you to place.
Key decisions for you
  • Confirm the preferred licensing path:
    • E3 population: E5 Security add‑on vs standalone MDE P2.
    • F1 population (5,000 devices): upgrade to F3 + security add‑on vs assign standalone MDE P2 to those users; validate final SKU choices with Microsoft licensing to align with Unilever’s agreement.
    • Confirm pilot cohorts and target start date so we can mobilise.
Citations used: [list of citations]

Phew! The results are amazing when you think about it - the agent has considered all the factors and showed a level of understanding equal to a highly experienced and capable Microsoft security specialist. We even see that the agent has done what was asked in terms of mapping specific requirements given by the client (e.g. automated incident response and remediation, integrated threat intelligence, and sandboxing in this case) to product features in the proposed solution, Defender for Endpoint. 

We do still have two issues though - as the two "limitation" notes in the agent's response call out, the agent wasn't able to:
  • Log the output to the SharePoint list as requested
  • Create the Word proposal document as requested
So, we're not done yet - we'll come to these resolutions in the next article. 
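For context on what step 8 actually involves behind the scenes: adding an item to a SharePoint list programmatically (for example from a flow's HTTP action or a script) goes through the Microsoft Graph list-items endpoint. This is a hedged sketch only - the site ID, list ID, and internal field names ("Client", "UseCase", etc.) are assumptions you'd need to check against the real "Technology proposals" list, and a real call needs an access token with appropriate Sites write permissions:

```python
import json

# Microsoft Graph endpoint for creating list items
GRAPH_ITEMS_URL = (
    "https://graph.microsoft.com/v1.0/sites/{site_id}/lists/{list_id}/items"
)

def build_list_item(client, use_case, recommendation, status="Proposed"):
    """Build the request body for POST /sites/{site-id}/lists/{list-id}/items.
    Field names here are hypothetical internal column names - verify
    yours in the list settings before using them."""
    return {
        "fields": {
            "Title": f"{client} - {use_case}",
            "Client": client,
            "UseCase": use_case,
            "Recommendation": recommendation,
            "Status": status,
        }
    }

payload = build_list_item(
    client="Unilever",
    use_case="Endpoint protection replacement",
    recommendation="Defender for Endpoint P2 + M365 Defender + Sentinel",
)
print(json.dumps(payload, indent=2))

# A real call would look something like this (token acquisition omitted):
# requests.post(
#     GRAPH_ITEMS_URL.format(site_id="<site-id>", list_id="<list-id>"),
#     headers={"Authorization": f"Bearer {token}"},
#     json=payload,
# )
```

As we'll see, within Copilot Studio this kind of step is better delegated to a tool rather than left to the agent's own reasoning.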

In terms of the cognitive work the agent has successfully performed, at Advania we don't see this as replacing our high-end architects - but we do see it as accelerating our teams and allowing more junior team members to take on more complex work. The work partly switches to verification, judgement, editing where required, and rounding out and adding perspective - all elements which would have been needed anyway. Having the agent take on some of the complex research, interpretation, and generation is huge, of course - we really are democratising intelligence and expertise, and therefore moving faster and serving more.  

Summary

Autonomous agents will transform work over the next few years, but their effectiveness depends on how we guide them - instruction design matters. Simply stating an end goal and the outline of a process is unrealistic, at least today; instead, breaking instructions into clear, sequenced steps (ideally using a numbered list as I've shown in this article) is needed to get reliable agent behaviour. It's a question of balancing autonomy with control, and human oversight and verification remain crucial of course.   

The next article will resolve those pesky challenges of the agent not being able to create Word documents to accelerate that part of the process and the team's work.

Articles in this series

  1. Techniques for autonomous agents in Copilot Studio - intro 
  2. Scenario video - Microsoft architect with proposal generation
  3. Technique 1 - Getting AI-suitable descriptions right (data, tools, agents themselves)
  4. Technique 2 - Define explicit steps in agent instructions when "reasoning the process" isn't appropriate (this article)
  5. Technique 3 - Provide tools like Agent Flows for steps the agent can’t easily handle
  6. Technique 4 - Leveraging Power Platform and Microsoft 365 capabilities in your agents
  7. Technique 5 - Understand cost, capability, and governance implications of agents you create

Next article

Technique 3: Provide tools like Agent Flows for steps the agent can’t easily handle
