vault: integrate linking service identity resolution#21715

Merged
prashantkumar1982 merged 26 commits into develop from codex/pr8-orgresolver-linking
Apr 2, 2026
@prashantkumar1982 prashantkumar1982 commented Mar 26, 2026

Summary

Integrates Vault with the linking service flow so workflow-driven Vault requests can carry both OrgId and WorkflowOwner.

The workflow engine now propagates OrgID through RequestMetadata and forwards it into Vault GetSecretsRequest payloads. That OrgID forwarding is gated by VaultOrgIdAsSecretOwnerEnabled, so when the gate is disabled the workflow side omits OrgID.
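
The gating described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `getSecretsRequest`, `buildGetSecretsRequest`, and the `gateEnabled` flag are hypothetical stand-ins for the real payload type and for the result of checking VaultOrgIdAsSecretOwnerEnabled.

```go
package main

import "fmt"

// getSecretsRequest is a hypothetical stand-in for the Vault GetSecretsRequest payload.
type getSecretsRequest struct {
	WorkflowOwner string
	OrgID         string // left empty when the gate is disabled
}

// buildGetSecretsRequest always forwards the workflow owner and forwards orgID
// only when the gate is enabled. gateEnabled is an assumed boolean standing in
// for the outcome of the VaultOrgIdAsSecretOwnerEnabled check.
func buildGetSecretsRequest(workflowOwner, orgID string, gateEnabled bool) getSecretsRequest {
	req := getSecretsRequest{WorkflowOwner: workflowOwner}
	if gateEnabled {
		req.OrgID = orgID
	}
	return req
}

func main() {
	fmt.Printf("%+v\n", buildGetSecretsRequest("0xabc", "org-1", false))
	fmt.Printf("%+v\n", buildGetSecretsRequest("0xabc", "org-1", true))
}
```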

On the Vault capability side, all gateway request paths (create, update, delete, and list) go through the linker layer. The linker resolves or verifies the org_id / workflow_owner pair and rejects requests whose values do not match.

Execute(GetSecrets) is kept simpler: it treats the incoming GetSecretsRequest identity fields as trusted and validates secret owners against the request WorkflowOwner directly.

@github-actions

👋 prashantkumar1982, thanks for creating this pull request!

To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team.

Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks!

@github-actions

I see you updated files related to core. Please run make gocs in the root directory to add a changeset, and include at least one of the following tags in its text:

  • #added For any new functionality added.
  • #breaking_change For any functionality that requires manual action for the node to boot.
  • #bugfix For bug fixes.
  • #changed For any change to the existing functionality.
  • #db_update For any feature that introduces updates to database schema.
  • #deprecation_notice For any upcoming deprecation functionality.
  • #internal For changesets that need to be excluded from the final changelog.
  • #nops For any feature that is NOP facing and needs to be in the official Release Notes for the release.
  • #removed For any functionality/config that is removed.
  • #updated For any functionality that is updated.
  • #wip For any change that is not ready yet and external communication about it should be held off till it is feature complete.

github-actions bot commented Mar 26, 2026

✅ No conflicts with other open PRs targeting develop

@prashantkumar1982 prashantkumar1982 requested a review from a team as a code owner March 26, 2026 08:08
github-actions bot commented Mar 26, 2026

CORA - Pending Reviewers

All codeowners have approved! ✅

Legend: ✅ Approved | ❌ Changes Requested | 💬 Commented | 🚫 Dismissed | ⏳ Pending | ❓ Unknown

For more details, see the full review summary.

@prashantkumar1982 prashantkumar1982 changed the title from "vault: link org ids via org resolver" to "vault: integrate linking service identity resolution" Mar 26, 2026
		s.lggr.Errorw("get secrets request owner mismatch", "index", idx, "secretOwner", req.Id.Owner, "orgID", resolvedIdentity.OrgID)
		return capabilities.CapabilityResponse{}, fmt.Errorf("secret identifier owner %q does not match org_id %q at index %d", req.Id.Owner, resolvedIdentity.OrgID, idx)
	}
case req.Id.Owner != "":
Contributor

Should this case go first? All other ones depend on it

Contributor Author

No, this case means both the workflow owner and the orgID were left empty in the request, which should never happen.
This is just a sanity check.

Contributor

I think this also allows all of them to be empty, though. I think we should just put a default clause here,

i.e. either the owner or the org ID is non-empty, or we error (nothing else is allowed).
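
A rough sketch of what that suggestion could look like, assuming a hypothetical helper `validateSecretOwner` that receives the already-normalized values. The case order (workflow owner first, then org ID) follows the fragments visible in this diff; the wording of the default error is made up for illustration.

```go
package main

import "fmt"

// validateSecretOwner is a hypothetical helper. It checks the secret identifier
// owner against whichever identity field is set, and errors in the default
// clause when neither is set, so an all-empty request cannot slip through.
func validateSecretOwner(secretOwner, workflowOwner, orgID string, idx int) error {
	switch {
	case workflowOwner != "":
		if secretOwner != workflowOwner {
			return fmt.Errorf("secret identifier owner %q does not match workflow owner %q at index %d", secretOwner, workflowOwner, idx)
		}
	case orgID != "":
		if secretOwner != orgID {
			return fmt.Errorf("secret identifier owner %q does not match org_id %q at index %d", secretOwner, orgID, idx)
		}
	default:
		return fmt.Errorf("neither workflow owner nor org_id is set at index %d", idx)
	}
	return nil
}

func main() {
	fmt.Println(validateSecretOwner("0xabc", "0xabc", "", 0))
	fmt.Println(validateSecretOwner("org-1", "", "org-1", 1))
	fmt.Println(validateSecretOwner("", "", "", 2))
}
```

Note that in this sketch the workflow owner takes precedence when both fields are set; whether that is the desired behaviour is exactly what the thread below discusses.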

		return capabilities.CapabilityResponse{}, fmt.Errorf("secret identifier owner %q does not match workflow owner %q at index %d", req.Id.Owner, resolvedIdentity.WorkflowOwner, idx)
	}
case resolvedIdentity.OrgID != "":
	if req.Id.Owner != resolvedIdentity.OrgID {
Contributor

I'm trying to understand the difference between case normalizedWorkflowOwner != "" and case resolvedIdentity.OrgID != "". Is there any case when both the normalizedWorkflowOwner and resolvedIdentity.OrgID are non-empty?

Contributor Author

Well, I see this logic is pretty convoluted.
This code path is within Execute(), meaning it is only called from a workflow when fetching secrets via GetSecret().
So the CLI will never use this code path.

This call is already trusted because it comes from within the workflow engine with F+1 consensus.

Now, the workflow engine, for a given request, will have both workflowOwner and OrgID. Today, it only sends the workflowOwner, but I am considering having it send both workflowOwner and orgID. This will be helpful while we are in auto-migration mode.

So theoretically we don't need the linking service to even be invoked for this code path, since we trust the inputs here.
@cedric-cordenier what do you think? Should we invoke the Linking Service for each GetSecret() path and validate the mapping?
I think it adds an extra RPC for each read call, which you wanted to avoid?

Contributor

Yes, I don't think we need to invoke the linking service here. We should be able to trust the input coming from the engine if we land in this part of the code.

Comment on lines +69 to +71
if workflowOwner == "" {
	return LinkedVaultRequestIdentity{OrgID: orgID, WorkflowOwner: workflowOwner}, nil
}
Contributor

This can probably be extracted as an independent condition above this if statement, because it will be checked again on L95.


- if req.Id != nil && normalizeOwner(req.Id.Owner) != normalizedWorkflowOwner {
- 	return capabilities.CapabilityResponse{}, fmt.Errorf("secret identifier owner %q does not match workflow owner %q at index %d", req.Id.Owner, request.Metadata.WorkflowOwner, idx)
+ if req.Id != nil && normalizeOwner(req.Id.Owner) != normalizeOwner(r.WorkflowOwner) {
Contributor

It's safer to use request.Metadata.WorkflowOwner because that is always provided by the engine. Using request.WorkflowOwner is probably fine today because we set it in the SecretsFetcher (but please check!), but that might change in the future.

Contributor Author

Well, this method Execute() is only called by the workflows, and they set this field after picking it up from request.Metadata.WorkflowOwner. So it is the same thing imo.
Still, I'm switching to request.Metadata.WorkflowOwner.

if strings.TrimSpace(resolvedOrgID) == "" {
	return LinkedVaultRequestIdentity{}, fmt.Errorf("resolved empty org_id for workflow owner %q", workflowOwner)
}

Contributor

🤔 Why are we fetching the orgID here? Does this duplicate what we're doing on line 86?

Contributor Author

ya, refactored now

}

// Link resolves or verifies the request identity from the caller-provided org and workflow owner.
func (l *OrgIDToWorkflowOwnerLinker) Link(ctx context.Context, orgID string, workflowOwner string) (LinkedVaultRequestIdentity, error) {
Contributor

Can we clean up some of the defensive coding throughout this file? I think we've taken it way too far.

See line 53 -- vaultOrgIDAsSecretOwnerEnabled will not be nil if created via the constructor, since we'll error if that's the case. Same with checking l == nil and the orgResolver; let's just do this check once in the constructor if we need to.
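
To illustrate the pattern being asked for, here is a small sketch with made-up names (`resolver`, `linker`, `newLinker`, `orgIDFor`, and `staticResolver` are all hypothetical and not the types in this PR). Dependencies are validated once in the constructor, so methods can use them without repeating nil checks.

```go
package main

import (
	"errors"
	"fmt"
)

// resolver is a hypothetical dependency of the linker.
type resolver interface {
	Resolve(workflowOwner string) (string, error)
}

// linker is a hypothetical type whose dependencies are validated at construction.
type linker struct {
	resolver resolver
}

// newLinker is the only place where the nil check happens.
func newLinker(r resolver) (*linker, error) {
	if r == nil {
		return nil, errors.New("resolver must not be nil")
	}
	return &linker{resolver: r}, nil
}

// orgIDFor relies on the constructor's guarantee and does not re-check for nil.
func (l *linker) orgIDFor(workflowOwner string) (string, error) {
	return l.resolver.Resolve(workflowOwner)
}

// staticResolver always returns the same org ID; used only for this example.
type staticResolver string

func (s staticResolver) Resolve(string) (string, error) { return string(s), nil }

func main() {
	if _, err := newLinker(nil); err != nil {
		fmt.Println("constructor rejected nil resolver:", err)
	}
	l, _ := newLinker(staticResolver("org-1"))
	orgID, _ := l.orgIDFor("0xabc")
	fmt.Println(orgID)
}
```

With the guard in one place, the error conditions left in the methods are only the ones that matter for the request itself.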

Contributor

At the moment it's very difficult to understand what the important error conditions are and what we're trying to guard

Contributor Author

agreed. cleaned up the fn now.

	return common.HexToAddress(owner).Hex(), nil
}

func vaultOrgIDAsSecretOwnerEnabled(ctx context.Context, gate limits.GateLimiter) (bool, error) {
Contributor

nit: maybe LimitOrFalse since there's nothing vault specific about this, it's a helper

@@ -66,15 +70,18 @@ func NewSecretsFetcher(
	lggr = logger.Named(lggr, "WorkflowEngine.SecretsFetcher")
	lggr = logger.With(lggr, "workflowID", workflowID, "workflowName", workflowName, "workflowOwner", workflowOwner, "phaseID", phaseID)
Contributor

Worth adding the orgID here :) ?

ibrajer previously approved these changes Apr 2, 2026
@prashantkumar1982 prashantkumar1982 added this pull request to the merge queue Apr 2, 2026
Merged via the queue into develop with commit 980066b Apr 2, 2026
232 of 234 checks passed
@prashantkumar1982 prashantkumar1982 deleted the codex/pr8-orgresolver-linking branch April 2, 2026 21:42
5 participants