Reduce custom app permissions and improve setup reliability by sellakumaran · Pull Request #409 · microsoft/Agent365-devTools

sellakumaran · 2026-05-08T01:19:55Z

Removes DelegatedPermissionGrant.ReadWrite.All and AgentIdentity.Create.All from the required CLI app permission set. Agent identity creation now uses Blueprint app-only credentials (AgentIdentity.CreateAsManager auto-granted to Blueprint apps). Principal-scoped oauth2 grants use AgentIdentityBlueprint.UpdateAuthProperties.All. EnsureServicePrincipalForAppIdAsync eliminated for agent identity SPs (id == appId for ServiceIdentity type), removing the Application.ReadWrite.All dependency.

Adds exponential back-off retry loops for AADSTS700016 and Authorization_IdentityNotFound propagation errors on fresh blueprint setups. All propagation-lag retry logs downgraded to Debug (not user-actionable).

Additional fixes:

- --authmode obo with --aiteammate warns instead of hard-erroring
- Messaging endpoint summary shows not-configured vs failed correctly
- Explicit null guard on AgentBlueprintClientSecret before UnprotectSecret
- Stale error message referencing removed permissions corrected
- Retry loop convention aligned (maxAttempts / < throughout)
- ConfigService omits null values from ExtractDynamicProperties to prevent null-overwrite cycle on re-run (issue 408 fix)
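The null-overwrite cycle in the last item can be pictured with a minimal C# sketch. It is an illustration under assumptions, not the actual ConfigService code: `MergeDynamicProperties` and the property keys are hypothetical names, and the real ExtractDynamicProperties may be shaped differently. The idea shown is only that null values are dropped before merging, so a re-run cannot erase a value persisted earlier.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical dynamic properties extracted on a re-run; the secret was not repopulated.
var extracted = new Dictionary<string, string?>
{
    ["agentBlueprintId"] = "bp-123",
    ["agentBlueprintClientSecret"] = null,
};

// Hypothetical state persisted by an earlier run.
var persisted = new Dictionary<string, string?>
{
    ["agentBlueprintClientSecret"] = "protected-secret",
};

var merged = MergeDynamicProperties(persisted, extracted);
Console.WriteLine(merged["agentBlueprintClientSecret"]); // prints "protected-secret"

// Copies the existing state, then applies only the non-null incoming values.
static Dictionary<string, string?> MergeDynamicProperties(
    IReadOnlyDictionary<string, string?> existing,
    IReadOnlyDictionary<string, string?> incoming)
{
    var result = new Dictionary<string, string?>(existing);
    foreach (var pair in incoming.Where(p => p.Value is not null))
    {
        result[pair.Key] = pair.Value;
    }
    return result;
}
```

Without the `Where` filter, the null from the re-run would replace the persisted secret, which is the cycle the fix is described as preventing.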

Validated end-to-end across base, --aiteammate, --m365, and --authmode both paths as Agent ID Developer role with no Application.ReadWrite.All, DelegatedPermissionGrant.ReadWrite.All, AgentIdentity.ReadWrite.All, or AgentIdentity.Create.All on the custom app.

Removes DelegatedPermissionGrant.ReadWrite.All and AgentIdentity.Create.All from the required CLI app permission set. Agent identity creation now uses Blueprint app-only credentials (AgentIdentity.CreateAsManager auto-granted to Blueprint apps). Principal-scoped oauth2 grants use AgentIdentityBlueprint.UpdateAuthProperties.All. EnsureServicePrincipalForAppIdAsync eliminated for agent identity SPs (id == appId for ServiceIdentity type), removing the Application.ReadWrite.All dependency.

Adds exponential back-off retry loops for AADSTS700016 and Authorization_IdentityNotFound propagation errors on fresh blueprint setups. All propagation-lag retry logs downgraded to Debug (not user-actionable).

Additional fixes:

- --authmode obo with --aiteammate warns instead of hard-erroring
- Messaging endpoint summary shows not-configured vs failed correctly
- Explicit null guard on AgentBlueprintClientSecret before UnprotectSecret
- Stale error message referencing removed permissions corrected
- Retry loop convention aligned (maxAttempts / < throughout)
- ConfigService omits null values from ExtractDynamicProperties to prevent null-overwrite cycle on re-run (issue 408 fix)

Validated end-to-end across base, --aiteammate, --m365, and --authmode both paths as Agent ID Developer role with no Application.ReadWrite.All, DelegatedPermissionGrant.ReadWrite.All, AgentIdentity.ReadWrite.All, or AgentIdentity.Create.All on the custom app.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions · 2026-05-08T01:20:09Z

⚠️ Deprecation Warning: The deny-licenses option is deprecated for possible removal in the next major release. For more information, see issue 997.

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

Copilot

Pull request overview

Updates a365 setup flows to reduce required permissions on the custom CLI app and improve reliability on fresh blueprint setups (replication-lag retries, clearer messaging), while aligning generated config behavior to avoid null-overwrite cycles.

Changes:

Switch agent identity creation to use Blueprint app-only credentials and remove DelegatedPermissionGrant.ReadWrite.All from required permission lists.
Add exponential backoff retries for transient Entra propagation errors and downgrade propagation-lag logs to Debug.
Improve setup validation/UX: --authmode obo --aiteammate warns (continues), messaging endpoint summary distinguishes “not configured”, and generated config omits null dynamic properties.

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/Tests/Microsoft.Agents.A365.DevTools.Cli.Tests/Services/Agent365ConfigServiceTests.cs | Adds regression tests ensuring null dynamic properties are omitted and non-null secrets persist. |
| src/Tests/Microsoft.Agents.A365.DevTools.Cli.Tests/Commands/SetupCommandTests.cs | Updates tests for `--authmode` + `--aiteammate` behavior (warning vs error) and validates incompatible modes still fail. |
| src/Microsoft.Agents.A365.DevTools.Cli/Services/GraphApiService.cs | Adds retry/backoff for blueprint token acquisition and agent identity creation; adjusts propagation-lag logging levels. |
| src/Microsoft.Agents.A365.DevTools.Cli/Services/ConfigService.cs | Filters out null values when extracting dynamic properties to avoid null-overwrite on reruns. |
| src/Microsoft.Agents.A365.DevTools.Cli/Constants/AuthenticationConstants.cs | Removes `DelegatedPermissionGrant.ReadWrite.All` from required permissions/scopes. |
| src/Microsoft.Agents.A365.DevTools.Cli/Commands/SetupSubcommands/SetupHelpers.cs | Improves summary output/action-required messaging for “messaging endpoint not configured”. |
| src/Microsoft.Agents.A365.DevTools.Cli/Commands/SetupSubcommands/NonDwBlueprintSetupOrchestrator.cs | Uses blueprint client secret for agent identity creation; removes agent identity SP “ensure” step. |
| src/Microsoft.Agents.A365.DevTools.Cli/Commands/SetupSubcommands/AllSubcommand.cs | Changes `--authmode obo --aiteammate` from hard error to warning; keeps other modes incompatible. |
| CHANGELOG.md | Documents permission reductions, retry behavior, and setup UX fixes. |
| .gitignore | Ignores `docs/min-permissions/`. |

- Reduce required delegated scopes for a365 CLI client app:
  - Use AgentIdentityBlueprint.ReadWrite.All as umbrella for blueprint ops
  - Require AgentIdentityBlueprintPrincipal.Create for SP creation
  - Replace Directory.Read.All with Application.Read.All
  - Remove User.ReadWrite.All, broad blueprint sub-scopes, and AppRoleAssignment.ReadWrite.All
- Update all code, logging, and user guidance to reference new scopes
- Role checks now decode wids claim from MSAL token (no Graph call)
- Improve token acquisition retry logic for blueprint creation
- Update tests and documentation to match new permission model
- Endpoint registration guidance now points to Teams Developer Portal
- Reduces privilege footprint; 7-permission set validated across admin and developer roles
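As a rough illustration of a role check that decodes the wids claim instead of calling Graph, here is a minimal C# sketch. It is not the CLI's CheckDirectoryRoleAsync: `HasRole`, `ToBase64Url`, and `FromBase64Url` are hypothetical helpers, the sample token is unsigned and built locally, and the sketch only reads claims without validating a signature. The GUID is the well-known Entra role template ID for Global Administrator.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Text;
using System.Text.Json;

// Well-known Entra role template ID for Global Administrator.
const string GlobalAdminTemplateId = "62e90394-69f5-4237-9190-012177145e10";

// Build an unsigned sample token ("e30" is the base64url of "{}") so no sign-in is needed.
string payloadJson = "{\"wids\":[\"" + GlobalAdminTemplateId + "\"]}";
string sampleJwt = "e30." + ToBase64Url(Encoding.UTF8.GetBytes(payloadJson)) + ".";

Console.WriteLine(HasRole(sampleJwt, GlobalAdminTemplateId)); // prints "True"

// Returns true/false when wids can be read, or null ("Unknown") when it cannot.
static bool? HasRole(string jwt, string roleTemplateId)
{
    string[] parts = jwt.Split('.');
    if (parts.Length < 2) return null;
    try
    {
        using JsonDocument doc = JsonDocument.Parse(FromBase64Url(parts[1]));
        if (!doc.RootElement.TryGetProperty("wids", out JsonElement wids)
            || wids.ValueKind != JsonValueKind.Array)
        {
            return null; // claim absent: the issuing app may not emit wids
        }
        return wids.EnumerateArray().Any(w =>
            w.ValueKind == JsonValueKind.String &&
            string.Equals(w.GetString(), roleTemplateId, StringComparison.OrdinalIgnoreCase));
    }
    catch (Exception ex) when (ex is FormatException or JsonException)
    {
        return null; // malformed payload segment
    }
}

static string ToBase64Url(byte[] bytes) =>
    Convert.ToBase64String(bytes).TrimEnd('=').Replace('+', '-').Replace('/', '_');

static string FromBase64Url(string segment)
{
    string s = segment.Replace('-', '+').Replace('_', '/');
    s = s.PadRight(s.Length + (4 - s.Length % 4) % 4, '=');
    return Encoding.UTF8.GetString(Convert.FromBase64String(s));
}
```

Returning null rather than false when the claim is missing keeps "not an admin" distinct from "could not tell", which matters when the token comes from an app registration that does not emit wids.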

- Fix off-by-one in retry log {Max} argument: pass maxRetries/maxAttempts instead of maxRetries-1/maxAttempts-1 in three retry loops
- Assert exit code is 0 (not just non-1) in WarnsAndContinues test
- Replace brittle JSON string assertions with JsonNode parsing in SaveStateAsync_NonNullStringProperty_IsWrittenToJson
- Remove misleading 'id == appId' comment in NonDwBlueprintSetupOrchestrator

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 18 out of 19 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

src/Microsoft.Agents.A365.DevTools.Cli/Services/GraphApiService.cs:1526

GetBlueprintAccessTokenAsync increased maxRetries to 12 with exponential backoff capped at 60s. With baseDelaySeconds=5 this can sleep for ~8+ minutes total (5+10+20+40+60*7), which is a large behavioral/operational change and doesn’t match the PR description’s “~60s total”. Consider reducing attempts (or lowering baseDelaySeconds) and rename maxRetries to maxAttempts for clarity since the loop condition is attempt < maxRetries (total attempts).

            const int maxRetries = 12;
            const int baseDelaySeconds = 5;

            for (int attempt = 0; attempt < maxRetries; attempt++)
            {
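The worst-case figure in the comment above can be checked with a short C# sketch. It assumes the delay doubles from baseDelaySeconds on each failed attempt, is capped at 60 seconds, and that no sleep follows the final attempt; `DelaySeconds` and `maxDelaySeconds` are hypothetical names, not taken from GraphApiService.

```csharp
using System;
using System.Linq;

const int maxAttempts = 12;      // total attempts, matching the loop bound in the excerpt
const int baseDelaySeconds = 5;
const int maxDelaySeconds = 60;  // cap mentioned in the review comment (name is an assumption)

// One delay after each failed attempt except the last (assumption).
int[] delays = Enumerable.Range(0, maxAttempts - 1)
    .Select(attempt => DelaySeconds(attempt, baseDelaySeconds, maxDelaySeconds))
    .ToArray();

Console.WriteLine(string.Join("+", delays)); // prints "5+10+20+40+60+60+60+60+60+60+60"
Console.WriteLine($"{delays.Sum()} s");      // prints "495 s"

// Exponential back-off: base * 2^attempt, capped.
static int DelaySeconds(int attempt, int baseSeconds, int capSeconds) =>
    (int)Math.Min(capSeconds, baseSeconds * Math.Pow(2, attempt));
```

Under those assumptions the total is 495 seconds, a little over 8 minutes, which agrees with the 5+10+20+40+60*7 breakdown in the comment.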

- Fix XML comment in InteractiveGraphAuthServiceTests: AgentIdentityBlueprintPrincipal.Create is a separate required scope, not covered by the ReadWrite.All umbrella
- Improve Contains() guard comment in AgentBlueprintService: explicitly states agent user cleanup is disabled (intentional) until create-instance is re-enabled
- Document RequiredPermissionGrantScopes = [] intent: empty routes to standard AuthenticationService token path which already carries all required scopes via RequiredClientAppPermissions (PR #409)
- Document RequiredS2SGrantScopes = [] intent: AppRoleAssignment.ReadWrite.All removed; admins have bypass, developers fall back to PowerShell instructions (PR #409)
- Add detection rules E/F/G to pr-code-reviewer.md to catch these patterns in future reviews

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The "Run tests" step on Ubuntu has hung twice on this branch (4h 50m and 1h+ respectively) with no log surfaced because GitHub publishes job logs only on completion. Add two narrowly-scoped diagnostic guardrails so the next hang fails fast and tells us which test is stuck:

- job-level `timeout-minutes: 20`: bounds the run to ~2x the Windows-local suite time instead of GitHub's 6-hour default.
- `--blame-hang --blame-hang-timeout 5min`: produces a Sequence_*.xml hang report naming the stuck test method (and the test before it) when any single test exceeds 5 minutes.

Also demote the MsalBrowserCredential "Failed to register persistent token cache" warning to Debug. The same exception was already logged at Debug on the line above; the warning text ("auth prompts may be repeated") was not actionable by the user (common cause on headless Linux is no D-Bus/Keychain) and produced noise in CI test output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 21 out of 22 changed files in this pull request and generated 6 comments.

Six Copilot AI comments addressed:

- GraphApiService: rewrite XML doc for CheckDirectoryRoleAsync and its two wrappers to describe the wids-claim implementation (no Graph call, no scope dependency) and document the group-assignment / PIM-eligible limitations. The previous doc still described the old transitiveMemberOf query path.
- SetupHelpers: replace the misleading "uses AgentIdentityBlueprint.ReadWrite.All as umbrella" comment with an explicit note that permissionGrantScopes is intentionally empty and that empty arrays fall through to the standard token path.
- AuthenticationConstants: delete unused AgentIdentityBlueprintDeleteRestoreAllScope and AgentIdentityBlueprintAddRemoveCredsAllScope constants. They contradicted the code that uses the ReadWrite.All umbrella for those operations, and grep confirmed no callers in src/.
- CHANGELOG: correct the retry-timing claim from "~60s total" to several minutes worst case (12 attempts with back-off capped at 60s, about 8 min for the blueprint token retry).
- GraphApiServiceTests: rename IsCurrentUserAdminAsync_GraphFails_ReturnsUnknown and IsCurrentUserAgentIdAdminAsync_GraphReturnsNull_ReturnsUnknown to *_TokenAcquisitionFails_ReturnsUnknown so the names match the now-token-based failure mode.
- MessagingEndpointFailureReasons: extract the four string literals ("NotOwner", "BlueprintMissing", "NotConfigured", "Other") into a shared constant class in Constants/, replacing 11 string-literal usages across AllSubcommand, SetupHelpers, TeamsGraphBackendConfigurator, and AllSubcommandTests.

CI fix:

- MockToolingServerSubcommandTests: remove HandleStartServer_WithValidPort_LogsStartingMessage and HandleStartServer_WithNullPort_UsesDefaultPort. Both started a real Kestrel server via Server.Start() on a fire-and-forget LongRunning task that the test never tore down. On Linux CI this caused two failures: (a) the Theory port 1 case requires root and never binds, and (b) parallel tests collided on the leaked port 5309 binding. --blame-hang-timeout caught the deadlock on the previous run. Remaining tests still cover handler logic (dry-run, background, invalid port, verbose) without binding any port; a comment documents the decision to keep the regression from coming back.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous CI run hung in PermissionsSubcommandTests.ConfigureMcpPermissionsAsync_V1AndMetadataScopes_AreKnownAndProceed for 5+ minutes until --blame-hang-timeout aborted the run.

Root cause: BatchPermissionsOrchestrator's pre-warm call, graph.GraphGetAsync(tenantId, "/v1.0/me?$select=id", ct, scopes: prewarmScopes), now receives scopes: [] because RequiredPermissionGrantScopes was emptied earlier on this branch. Empty scopes route EnsureGraphHeadersAsync to the standard token path (GetGraphAccessTokenAsync), and on a partial mock that falls through to the real MSAL AuthenticationService. On Linux CI with no cached credentials, that blocks waiting for browser/device-code auth. Windows masked it with cached tokens (2s test runtime).

Fix: pre-stub three virtual GraphApiService methods (GraphGetAsync, IsCurrentUserAdminAsync, IsCurrentUserAgentIdAdminAsync) in the test class constructor so the orchestrator gets a null pre-warm response and short-circuits out of Phase 1/2/3 deterministically. Inline comment documents why so a future reader hitting the same pattern in another test class has the reasoning.

Targeted test now runs in 178 ms (was 2 s on Windows, 5+ min hang on Linux). Full suite drops from 12.58 s to 5.18 s for 1392 passing tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 26 out of 27 changed files in this pull request and generated 2 comments.

Two workstreams combined in one commit per request.

A. Copilot review-comment fixes on PR #409:

- GraphApiService.CheckDirectoryRoleAsync: previously acquired the role-check token via AuthenticationService → PowerShell well-known clientId, which does NOT have the wids optional claim configured. The method always returned Unknown, causing BatchPermissionsOrchestrator to treat real Global Admins as non-admins (admin URL printed even when the signed-in user could grant inline). Now routes through _tokenProvider with CustomClientAppId and User.Read, so the JWT comes from the app that actually carries wids.
- GraphApiService.EnsureGraphHeadersAsync: empty IEnumerable<string> previously fell through to the same PowerShell-clientId path. Routing changed to use _tokenProvider whenever (hasScopes || hasCustomApp). Bootstrap escape hatch preserved: no scopes AND no CustomClientAppId still uses legacy AuthenticationService so the initial app lookup doesn't hang on a null clientId.
- GraphApiServiceTests: helper mocks now return the wids JWT via the token provider (matching the new production path). Production methods called by 8 existing tests still pass.
- pr-code-reviewer.md: added Rule H: "JWT claim decoded → verify the token was issued by the app registration that has the claim configured." Cites PR #409 as the concrete example so reviewers ground future analysis.

B. Remove a365 deploy references from CLI code:

- PermissionsSubcommand.cs: help text and runtime "Next step" log no longer reference the long-removed 'a365 deploy'. Both now point at 'a365 publish', the actual next a365 command in the workflow.
- PermissionsSubcommandTests.cs: assertion updated to pin "a365 publish".
- NodeBuildFailedException.cs and NodeDependencyInstallException.cs deleted: dead code since a365 deploy was removed (no throw sites, no test refs).
- ErrorCodes.cs: removed NodeBuildFailed and NodeDependencyInstallFailed (only callers were the deleted exception classes).
- design.md: removed DeployCommand.cs row, removed the five deploy-era service rows from the Services folder tree, replaced the entire "Multiplatform Deployment Architecture" section (IPlatformBuilder interface + Deployment Pipeline mermaid + Restart Mode) with a tight "Multiplatform Project Detection" section that accurately describes what PlatformDetector does today (used by publish, not deploy). Fixed Program.cs sketch.
- CHANGELOG.md: one bullet under [Unreleased] Fixed documenting the user-visible help/log change.

Validation:

- Unit suite: 1392/1392 pass, 7.2s total, no slow tests.
- End-to-end Run 2-retest2 Minimum (cached cache, 8s): all role-check tokens from clientId 716ae110- (test custom app).
- End-to-end Run 2-retest2 Medium (cleared cache, 1m 50s): bootstrap escape hatch correctly used legacy AuthenticationService, no Connect-MgGraph fallback; steady-state used custom app.

Doc-side a365 deploy references in docs/ai-workflows/, docs/agent365-guided-setup/, CLAUDE.md, DEVELOPER.md, and two folder READMEs deferred to a follow-on PR (per user's plan scope).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
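The routing rule described for EnsureGraphHeadersAsync in workstream A, (hasScopes || hasCustomApp) with a bootstrap escape hatch, can be sketched as a small decision function. This is an illustrative assumption, not the method itself: `SelectTokenPath` and the returned path labels are hypothetical, and the real code acquires tokens rather than returning strings.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Empty scopes but a custom app configured: token provider path.
Console.WriteLine(SelectTokenPath(Array.Empty<string>(), "custom-app-id")); // prints "TokenProvider"
// Bootstrap case: no scopes and no custom app yet, so the legacy path is kept.
Console.WriteLine(SelectTokenPath(Array.Empty<string>(), null));            // prints "LegacyAuthenticationService"

static string SelectTokenPath(IEnumerable<string>? scopes, string? customClientAppId)
{
    bool hasScopes = scopes is not null && scopes.Any();
    bool hasCustomApp = !string.IsNullOrWhiteSpace(customClientAppId);
    return hasScopes || hasCustomApp ? "TokenProvider" : "LegacyAuthenticationService";
}
```

The point of the rule is that an empty scope list no longer implies the legacy client; only the combination of no scopes and no custom app does.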

Copilot AI review requested due to automatic review settings May 8, 2026 01:19

sellakumaran requested review from a team as code owners May 8, 2026 01:19

github-actions Bot added the documentation Improvements or additions to documentation label May 8, 2026

Copilot started reviewing on behalf of sellakumaran May 8, 2026 01:20

Copilot AI reviewed May 8, 2026

gwharris7 previously approved these changes May 8, 2026

Copilot AI review requested due to automatic review settings May 10, 2026 15:04

sellakumaran dismissed gwharris7’s stale review via 9ed98b5 May 10, 2026 15:04

Copilot started reviewing on behalf of sellakumaran May 10, 2026 15:05

Copilot AI reviewed May 10, 2026

sellakumaran enabled auto-merge (squash) May 10, 2026 18:53

Copilot AI review requested due to automatic review settings May 10, 2026 19:59

Copilot started reviewing on behalf of sellakumaran May 10, 2026 20:00

Copilot AI reviewed May 10, 2026

sellakumaran and others added 2 commits May 11, 2026 06:22

Copilot AI review requested due to automatic review settings May 11, 2026 13:38

Copilot started reviewing on behalf of sellakumaran May 11, 2026 13:39

Copilot AI reviewed May 11, 2026

Comment thread src/Microsoft.Agents.A365.DevTools.Cli/Services/GraphApiService.cs Outdated

Comment thread src/Microsoft.Agents.A365.DevTools.Cli/Constants/AuthenticationConstants.cs

biswapm approved these changes May 11, 2026

gwharris7 approved these changes May 11, 2026

sellakumaran merged commit 3177ae6 into main May 11, 2026
9 checks passed

sellakumaran deleted the users/sellak/min-permissions branch May 11, 2026 15:31
