Define value before you build your agent

The best investment you can make in an agent's business value happens before any configuration begins. This article guides you through the prebuild work in the Copilot Studio agent lifecycle, so you're ready to start development with clear success criteria and the baseline you need to prove value.

Ask four discovery questions before you build

A strong agent program begins with four questions you answer in sequence, before any technical work starts.

Strategic goal: What business challenge are you addressing? Name a board-visible objective such as revenue growth, cost optimization, customer experience, or compliance posture. If the strategic goal can't fit on a single slide, the agent won't pass the first budget review.
Stakeholder map: Who benefits from the agent, and what matters to each of them? Identify your sponsor, your operator, your end user, and your governor, and recognize that each has a different key performance indicator (KPI). When those KPIs conflict, address the issue now rather than at launch.
Success metrics: How will you recognize that the agent works? Define the before-state quantitatively. For example, a statement like "handle time is 8 minutes, measured over 90 days, with a P90 of 16 minutes" might be your baseline. "Our handle time is too long" isn't a baseline.
Build readiness: Are you ready to build? With a shared definition of success and a documented baseline in hand, your technical team can build toward a clear target. When you skip the first three steps, the fourth becomes a gamble.

Pick use cases with the highest return

You'll typically start with more candidate use cases than you have capacity to build. A short impact-and-effort review helps you prioritize your portfolio in an afternoon rather than a quarter.

How to define impact

Impact is the projected annual value the agent returns once it reaches target adoption.

Estimate it from four inputs:

The volume of work the agent touches.
The share of that work the agent can resolve or accelerate.
The unit value of each completed task (hours saved, errors avoided, revenue retained, or capacity reinvested).
The attribution discount you apply where the agent is one of several contributors to the outcome.

High impact means the annual value at target adoption is large enough to move a named KPI on your executive scorecard: for example, $1 million or more in Agent Assisted Value per year, or a double-digit point shift in a function-level KPI.

Low impact means the value is real but narrow, and the work wouldn't be missed if it continued to be done manually.
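One way to turn the four inputs into a single figure is to multiply them together. The following Python sketch assumes that simple multiplicative model, which this guidance doesn't prescribe. The function names, parameter names, and sample figures are hypothetical. The $1 million threshold comes from the high-impact example above.

```python
def estimate_annual_impact(
    annual_volume: float,
    resolvable_share: float,
    unit_value: float,
    attribution_discount: float,
) -> float:
    """Projected annual value at target adoption (assumed multiplicative model)."""
    if not 0.0 <= resolvable_share <= 1.0:
        raise ValueError("resolvable_share must be between 0 and 1")
    if not 0.0 <= attribution_discount <= 1.0:
        raise ValueError("attribution_discount must be between 0 and 1")
    # Volume x share the agent handles x value per task, reduced by the
    # portion of the outcome credited to other contributors.
    return annual_volume * resolvable_share * unit_value * (1.0 - attribution_discount)


def is_high_impact(annual_value: float, threshold: float = 1_000_000.0) -> bool:
    """Apply the $1 million example threshold; adjust it to your own scorecard."""
    return annual_value >= threshold


# Hypothetical inputs: 400,000 requests a year, 50% resolvable,
# $8 of value per completed task, 25% attribution discount.
impact = estimate_annual_impact(400_000, 0.5, 8.0, 0.25)
print(f"Estimated annual impact: ${impact:,.0f}")
print("High impact" if is_high_impact(impact) else "Low impact")
```

Treat the result as a planning estimate. Replace the sample inputs with figures from your own baseline before you use the number in a business case.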

How to define effort

Effort is the total build cost plus the change burden, measured in engineering weeks and person-hours rather than dollars.

Estimate it from four inputs:

The complexity of the integrations.
The depth of the knowledge corpus the agent needs.
The orchestration pattern (single topic, multi-turn, or generative orchestration).
The size of the enablement program the rollout requires.

Low effort typically means you can build, test, and deploy within a quarter (about three months), with no new integrations, and with enablement delivered in a single training cycle.

High effort typically means the work takes more than a quarter to build, requires a new connector or data pipeline, or spans multiple functions.

How to prioritize the four quadrants

Quadrant	When to act	How to fund it
Quick wins (high impact, low effort)	Build first. Stand up three to five in parallel so your measurement program has real data within the first two quarters.	Fund from operating budget. Quick wins build support for strategic investments later.
Strategic investments (high impact, high effort)	Fund one or two at a time, after at least one quick win reaches baseline.	Request dedicated program funding. These agents justify an AI center of excellence.
Nice to have (low impact, low effort)	Queue for your citizen-maker community or for an internal agent-store listing.	No central funding. Let makers iterate as a skill-building activity.
Reconsider (low impact, high effort)	Don't fund now. Revisit in six months, when better data or lower-cost options might reduce effort.	Park and review later.
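The quadrant table can be expressed as a small lookup. This Python sketch is illustrative only: the `Quadrant` enum, the `classify_use_case` function, and the candidate names are hypothetical, and the sketch assumes you have already decided whether each use case is high or low on impact and effort.

```python
from enum import Enum


class Quadrant(Enum):
    QUICK_WIN = "Quick win"
    STRATEGIC_INVESTMENT = "Strategic investment"
    NICE_TO_HAVE = "Nice to have"
    RECONSIDER = "Reconsider"


def classify_use_case(high_impact: bool, high_effort: bool) -> Quadrant:
    """Map an impact/effort assessment to one of the four quadrants."""
    if high_impact:
        return Quadrant.STRATEGIC_INVESTMENT if high_effort else Quadrant.QUICK_WIN
    return Quadrant.RECONSIDER if high_effort else Quadrant.NICE_TO_HAVE


# Hypothetical candidates: (name, high impact?, high effort?)
candidates = [
    ("Tier-1 HR ticket agent", True, False),
    ("Cross-function order agent", True, True),
    ("Team lunch menu agent", False, False),
    ("Legacy archive search agent", False, True),
]

for name, high_impact, high_effort in candidates:
    quadrant = classify_use_case(high_impact, high_effort)
    print(f"{name}: {quadrant.value}")
```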

Agree on metrics before development starts

Before you reach the build stage, agree on a short list of metric targets for each use case. Use this table to choose the targets.

Metric	Description	Source
Engagement rate	The share of analytics sessions that triggered a custom topic, Escalate, Fallback, or Conversational Boosting.	Measure agent engagement
Resolution rate	The share of engaged sessions that ended with a resolved outcome, either confirmed by the user or implied by the flow.	Measure agent outcomes
Escalation rate	The share of engaged sessions that handed off through Escalate or a Transfer conversation node.	Measure agent outcomes
Abandon rate	The share of engaged sessions that ended after 60 minutes of inactivity without resolution or escalation.	Deflection overview
Customer satisfaction score (CSAT)	The average customer satisfaction score from the End of Conversation survey.	Analyze conversational agents
Routine adoption rate	The share of eligible users who use the agent on four or more active days in any rolling four-week window.	Copilot Dashboard in Viva Insights
Deflection rate	The share of requests resolved through self-service rather than handed off to a human.	Deflection overview
Monthly savings	Time or money saved by successful agent runs, entered per run or per tool by the agent owner.	Analyze time and cost savings for agents
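The session-based rates in the table are simple ratios. The following Python sketch shows how the engagement, resolution, escalation, and abandon rates relate to one another, assuming you have exported session counts from your analytics. The `SessionCounts` class, its field names, and the sample counts are hypothetical and don't correspond to a Copilot Studio API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionCounts:
    total_sessions: int    # all analytics sessions in the period
    engaged_sessions: int  # sessions that met the engagement definition
    resolved: int          # engaged sessions that ended resolved
    escalated: int         # engaged sessions handed off to a human
    abandoned: int         # engaged sessions that ended without either outcome


def rate(numerator: int, denominator: int) -> float:
    """Return a share between 0 and 1, or 0.0 when there is no denominator."""
    return numerator / denominator if denominator else 0.0


def session_rates(counts: SessionCounts) -> dict[str, float]:
    """Engagement uses all sessions; outcome rates use engaged sessions."""
    return {
        "engagement_rate": rate(counts.engaged_sessions, counts.total_sessions),
        "resolution_rate": rate(counts.resolved, counts.engaged_sessions),
        "escalation_rate": rate(counts.escalated, counts.engaged_sessions),
        "abandon_rate": rate(counts.abandoned, counts.engaged_sessions),
    }


# Hypothetical month of traffic.
sample = SessionCounts(
    total_sessions=1000,
    engaged_sessions=800,
    resolved=520,
    escalated=160,
    abandoned=120,
)

for metric, value in session_rates(sample).items():
    print(f"{metric}: {value:.1%}")
```

Because the three outcome rates share the engaged-session denominator, they should add up to about 100 percent. A large gap is a sign that the export is missing an outcome category.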

Learn more:

Agent metrics reference

Capture your telemetry baseline before launch

A value claim without a telemetry baseline won't survive business review. Store at least 90 days of operational data with your agent artifacts so the before state isn't overwritten.

A defensible baseline covers the following measurements:

Transaction or session volume broken out by category, channel, and priority
Cycle-time distribution reported as median, P90, and P99
Error rate broken out by error type, so you can later attribute the specific errors the agent reduces
Fully loaded cost per transaction, including benefits, overhead, and allocated infrastructure
A voice-of-customer signal such as CSAT, net promoter score (NPS), or an equivalent recurring survey
The productive-hour value your team currently invests in the process

Important

If your baseline isn't captured in a telemetry system, a ticketing export, or an enterprise resource planning (ERP) extract, it won't hold up in the first quarterly review. Surveys can supplement the baseline, but they can't replace it.

How to read P90 and P99 statistics in your baseline

Use median, P90, and P99 together to understand the process: the median is the typical case, P90 marks the start of the slow tail, and P99 marks the extreme outliers. You need all three because a median that looks healthy can coexist with a P90 that's slow enough to drive churn.

If you sort every handle time for the period from fastest to slowest, P90 is the value at the 90th percentile position, meaning 90 percent of your cases finish at least that fast, and P99 sits at the 99th percentile. A median handle time of four minutes with a P90 of 20 minutes tells you that one case in 10 takes at least five times as long as the median. That's the tail your agent is often tasked to improve.
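The sort-and-pick description above corresponds to the nearest-rank percentile method. The following Python sketch applies it to a small, made-up sample of handle times chosen to match the four-minute median and 20-minute P90 in the example. The function name and the data are hypothetical, and other tools might interpolate between values and return slightly different results.

```python
import math
from statistics import median


def percentile_nearest_rank(values: list[float], pct: int) -> float:
    """Return the value at the given percentile using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(values)
    # Rank is the 1-based position of the percentile in the sorted list.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical handle times in minutes for 10 cases.
handle_times = [2, 3, 3, 4, 4, 4, 5, 6, 20, 45]

print("Median:", median(handle_times))
print("P90:", percentile_nearest_rank(handle_times, 90))
print("P99:", percentile_nearest_rank(handle_times, 99))
```

With only 10 cases, P99 is the same as the slowest case. A real baseline built on 90 days of data gives the tail percentiles enough observations to be meaningful.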

How to measure the productive time your team currently spends

Your current process takes measurable productive team time. Measure that time to understand what your agent is helping people reinvest.

Count the full-time-equivalent (FTE) headcount that touches the process, including supervisors, quality assurance reviewers, and schedulers if they participate.
Pick a fully loaded compensation figure for each role—typically base salary multiplied by 1.3 to 1.5 to account for benefits, payroll taxes, tools, training, and allocated office overhead. When you don't have role-specific numbers, start with the $72 per hour default from the Copilot Studio agents report.
Estimate each person's share of time spent on this process by using timesheets, workforce management data, or a structured sample.
Multiply. Productive hours invested per month equal the number of FTEs involved multiplied by the hours each FTE spends on the process each month. The productive-hour value equals that total multiplied by your fully loaded hourly rate.

Example: If five HR specialists each spend 30 hours a week on Tier-1 HR tickets at a fully loaded rate of $60 an hour, your team spends 5 × 30 × 4.3 weeks × $60 = $38,700 a month, or about $464,400 a year. That's the productive-hour investment your HR agent helps your team redirect toward higher-value work. Use that number in your first business case review.
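The calculation in the example can be scripted so you can rerun it for each role or process. This Python sketch reproduces the HR figures above. The function names are hypothetical, and the 4.3 weeks-per-month factor is the approximation used in the example rather than a fixed standard.

```python
WEEKS_PER_MONTH = 4.3  # approximation taken from the example above


def monthly_productive_hours(
    fte_count: int,
    hours_per_week: float,
    weeks_per_month: float = WEEKS_PER_MONTH,
) -> float:
    """Hours the team invests in the process each month."""
    return fte_count * hours_per_week * weeks_per_month


def productive_hour_value(
    fte_count: int,
    hours_per_week: float,
    hourly_rate: float,
) -> tuple[float, float]:
    """Return the (monthly, annual) value at a fully loaded hourly rate."""
    monthly = monthly_productive_hours(fte_count, hours_per_week) * hourly_rate
    return monthly, monthly * 12


# Five HR specialists, 30 hours a week each, $60 an hour fully loaded.
monthly_value, annual_value = productive_hour_value(5, 30, 60.0)
print(f"Monthly: ${monthly_value:,.0f}")
print(f"Annual:  ${annual_value:,.0f}")
```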

Plan your Copilot Studio licensing and consumption

Before you approve an agent program, you need a realistic view of what it costs to build and run. Microsoft publishes official licensing, billing, and consumption information on Microsoft Learn to help you estimate your commitment.

Important

Licensing options for Microsoft 365 Copilot and Copilot Studio can change. Refer to the Microsoft documentation rather than relying on a static summary. You can look up the current rates in Billing rates and management and in the Microsoft Copilot Studio licensing guide.

Forecast consumption before you commit

Use the Copilot Studio agent usage estimator to estimate Copilot Credit consumption for your business case. Enter the agent type, expected traffic, orchestration mode, knowledge sources, and tools to get a monthly credit estimate that supports your case. Learn more about the latest rate assumptions in Agent usage estimator.

Put governance in place before you build

Governance isn't a tax on top of your ROI—it's the layer your ROI depends on. Use the following checklist before an agent enters the build stage.

Name an executive accountable for the agent portfolio

Agents don't reach their potential without a named leader who owns the portfolio. This leader makes prioritization, funding, risk, and retirement decisions. In practice, this role belongs to the CEO, COO, CIO, or a dedicated Chief AI Officer who runs regular portfolio reviews.

Run a Responsible AI impact assessment

Before you deploy, document the agent's decision scope, who it affects, the risks it introduces, and the mitigations you put in place. Use the Responsible AI overview for Copilot Studio to understand the Microsoft Responsible AI principles your agent should align with. Use the guidance on applying responsible AI principles to review questions during design, build, and deployment.

Enforce data residency and sensitivity labels

Data residency defines where the data your agent touches is stored, processed, and backed up. Data residency matters for compliance in many jurisdictions. Review the residency options for your Copilot Studio environment in the Power Platform regions overview. Apply sensitivity classifications in Microsoft Purview to the knowledge your agent uses so only content the user is authorized to see flows through the agent's answers.

Publish an escalation path your users can see

Visible governance builds trust. For every agent, publish what the agent does, what it doesn't do, how to escalate to a human, and how to report a problem. Include this message in the agent's greeting, in the escalation topic, and in your rollout communications.

Schedule a quarterly value review

Schedule 30-day, 90-day, and 365-day value reviews before the agent goes live. Use each review to retest ROI inputs against telemetry so you know whether adoption, recapture, and cost match your plan.

Define the decommission criterion at build time

Decide when to turn off the agent. Write that criterion in the agent record at build time: the routine adoption floor, the CSAT floor, the cost ceiling, or the regulatory change that triggers decommissioning. Agents you can retire with confidence are agents you can also scale with confidence.
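One way to make the decommission criterion testable is to record the thresholds as data and compare them with observed metrics at each value review. The following Python sketch is a minimal illustration. The class names, field names, and threshold values are hypothetical, and the regulatory trigger is modeled as a flag that a reviewer sets manually.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecommissionCriteria:
    routine_adoption_floor: float  # minimum share of eligible users, 0 to 1
    csat_floor: float              # minimum average CSAT score
    monthly_cost_ceiling: float    # maximum acceptable monthly run cost


@dataclass(frozen=True)
class ObservedMetrics:
    routine_adoption_rate: float
    csat: float
    monthly_cost: float
    regulatory_change_flagged: bool = False


def breached_criteria(
    criteria: DecommissionCriteria, observed: ObservedMetrics
) -> list[str]:
    """List every decommission trigger that the observed metrics hit."""
    breaches = []
    if observed.routine_adoption_rate < criteria.routine_adoption_floor:
        breaches.append("routine adoption below floor")
    if observed.csat < criteria.csat_floor:
        breaches.append("CSAT below floor")
    if observed.monthly_cost > criteria.monthly_cost_ceiling:
        breaches.append("monthly cost above ceiling")
    if observed.regulatory_change_flagged:
        breaches.append("regulatory change flagged")
    return breaches


# Hypothetical thresholds written into the agent record at build time.
criteria = DecommissionCriteria(
    routine_adoption_floor=0.30,
    csat_floor=3.5,
    monthly_cost_ceiling=5_000.0,
)
review = ObservedMetrics(routine_adoption_rate=0.22, csat=4.1, monthly_cost=4_200.0)

print(breached_criteria(criteria, review) or "No decommission triggers hit")
```

An empty list means the agent stays in service. Any entry is a prompt for the accountable executive to decide, not an automatic shutdown.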

Next step

Once your baseline, prioritized use cases, and governance are in place, organize your measurement around the four value pillars and apply Microsoft's Agent Assisted Hours formula to your agent.

Measure the impact of your agents

Last updated on 2026-06-04
