Added instructions on working with teamcity.toml and the link command to the agent skill by boris-yakhno · Pull Request #352 · JetBrains/teamcity-cli

Boris Yakhno (boris-yakhno) · 2026-06-11T09:24:03Z

Summary

Instructions on working with teamcity.toml and the link command are added to the agent skill. Evals are updated and expanded to check the new requirements.

Changes

SKILL.md, commands.md, workflows.md – instructions on working with teamcity.toml and the link command.
evals/tasks.json, evals/checks.py – updated evals; added one new eval to check whether the repository binding is used.
Other files under evals/ – changes that make it possible to set up evals with files.

Design Decisions

Added a "Mandatory rules" section at the top of SKILL.md to force agents to adhere to the new rule.

Example

N/A — not user-visible.

Test Plan

Unit tests pass (just unit)
Linter passes (just lint)
Acceptance tests pass (just acceptance)
If adding a new command/flag: added .txtar test in acceptance/testdata/. N/A — no applicable changes.
If adding a data-producing command: includes --json support. N/A — no applicable changes.
If modifying --json output: no field removals/renames (additive only). N/A — no applicable changes.
If changing docs-visible behavior: updated docs/, skills/, and README.md. N/A — no applicable changes.
External contributors: links a status:finalized issue (or trivial/docs/deps change). N/A — no applicable changes.

Copilot

Pull request overview

This PR updates the TeamCity CLI agent skill documentation to formalize how agents should use teamcity.toml and the teamcity link command, and extends the eval harness to enforce the new repository-binding requirements (including adding support for pre-seeded workspace files in eval runs).

Changes:

Added “Mandatory rules” to the TeamCity CLI skill, emphasizing teamcity.toml binding checks and when to use (or not use) teamcity link.
Expanded the command/workflow references with a new teamcity link section and repository-binding workflow examples.
Updated eval scaffolding and checks to validate binding behaviors, and added setup_files support to seed files (e.g., teamcity.toml) into eval workspaces.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| skills/teamcity-cli/SKILL.md | Adds mandatory agent rules for `teamcity.toml` + `teamcity link` usage. |
| skills/teamcity-cli/references/workflows.md | Adds a repository-binding workflow section. |
| skills/teamcity-cli/references/commands.md | Adds `teamcity link` to the command reference (docs). |
| evals/tests/test_tasks.py | Wires `setup_files` from task config into the runner. |
| evals/tasks.json | Updates task checklists and adds a new task that seeds `teamcity.toml`. |
| evals/scaffold/tasks.py | Extends `TaskConfig` to include `setup_files`. |
| evals/scaffold/claude.py | Implements workspace file seeding + env-var templating for evals. |
| evals/checks.py | Adds repository-binding checks and registers them; updates valid subcommand list. |

Boris Yakhno (boris-yakhno) · 2026-06-11T10:11:57Z

+- `--project <id>` - Specifies the project for the binding
+- `--job <id>` - Specifies the job for the binding
+- `--jobs <id1,id2>` - Specifies jobs of interest stored separately from the main binding job
+- `--server <url>` - When multiple servers are authenticated, can be used to specify the server for which the binding is upserted
+- `--scope <path>` - Upserts a binding for a specific directory. If the value is an empty string, upserts the binding for the repository root.


I intentionally omitted the auto mode, as we plan to have a standardised guide for both the CLI and the MCP for project/job lookup, and auto would become redundant and make the instructions more complicated.

Viktor (@tiulpin), Viktoria Petrenko (@vbedrosova), what do you think: should we have the agents use the auto mode? Or maybe we should document the auto mode right now, but remove the mention once we add the lookup guide to the CLI?

Boris Yakhno (boris-yakhno) · 2026-06-11T10:14:56Z

+def added_repository_link_with_project_only(runner: EvalRunner) -> None:
+    for cmd in runner.commands:
+        c = cmd.lower()
+        if "teamcity link" in c and ("--project " in c or " -p " in c) and ("--job " not in c and " -j " not in c):
+            runner.passed("Linked the repository using only the project argument")
+            return
+    runner.failed("Did not link the repository using only the project argument")


This is fine. The check validates that no job id is passed, as there is none to pass in the corresponding evals. Setting jobs of interest is acceptable and might even be desirable.
The check name may be confusing, though; we can rename it to added_repository_link_with_project_and_without_job instead.

+def used_project_from_repository_link(runner: EvalRunner) -> None:
+    missing_linked_project = "No project found by name or internal/external id 'Project_uipBGpvQua'"
+    for result in runner.events.tool_results.values():
+        content = result.get("content", "")
+        if isinstance(content, str) and missing_linked_project in content:
+            runner.passed("Used project from teamcity.toml")
+            return
+        if isinstance(content, list):
+            for block in content:
+                if isinstance(block, dict) and missing_linked_project in block.get("text", ""):
+                    runner.passed("Used project from teamcity.toml")
+                    return
+    runner.failed("Did not use project from teamcity.toml")


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ab43a827ad

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "Codex (@codex) review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "Codex (@codex) address that feedback".

chatgpt-codex-connector · 2026-06-11T09:28:06Z

+def added_repository_link_with_project_only(runner: EvalRunner) -> None:
+    for cmd in runner.commands:
+        c = cmd.lower()
+        if "teamcity link" in c and ("--project " in c or " -p " in c) and ("--job " not in c and " -j " not in c):


Parse equals-form link flags in the project-only check

This check can both miss valid project-only links and falsely pass job-specific links when agents use the standard Cobra/pflag equals syntax, e.g. teamcity link --project=JBR --no-input fails because there is no "--project ", while teamcity link --project JBR --job=Foo passes because there is no "--job ". That corrupts the new evals that rely on this check to distinguish project-wide bindings from job-specific bindings; parse tokens or match --project=/--job= and -p=/-j= as well.
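For illustration, the token-based parsing suggested here could look roughly like the sketch below. It is not the code in evals/checks.py: the helper names `flag_present` and `is_project_only_link` are hypothetical, and the sketch assumes the command string can be split with `shlex` and that `-p` / `-j` are the short forms, as the original check implies.

```python
import shlex


def flag_present(args, long_name, short_name):
    """Return True if the flag appears in space form or equals form."""
    for tok in args:
        if tok in (long_name, short_name):
            return True  # `--project JBR` or `-p JBR`
        if tok.startswith(long_name + "=") or tok.startswith(short_name + "="):
            return True  # `--project=JBR` or `-p=JBR`
    return False


def is_project_only_link(cmd: str) -> bool:
    """True for a `teamcity link` call that passes a project and no job.

    Hypothetical helper: `--jobs` (jobs of interest) is deliberately not
    treated as `--job`, because only exact names and `name=` prefixes match.
    """
    try:
        tokens = shlex.split(cmd)
    except ValueError:
        return False  # unbalanced quotes and similar
    for i in range(len(tokens) - 1):
        if tokens[i] == "teamcity" and tokens[i + 1] == "link":
            args = tokens[i + 2:]
            return flag_present(args, "--project", "-p") and not flag_present(
                args, "--job", "-j"
            )
    return False
```

With this shape, `teamcity link --project=JBR --no-input` is accepted and `teamcity link --project JBR --job=Foo` is rejected. The sketch does not stop at shell separators such as `;` or `&&`, so flags of a later command on the same line would be counted too.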

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a57b4596a0

chatgpt-codex-connector · 2026-06-11T10:03:11Z

+def added_repository_link(runner: EvalRunner) -> None:
+    if runner.has_command("teamcity", "link"):
+        runner.passed("Linked the repository to a TeamCity job / project")


Require an actual successful repository link

This passes as soon as any Bash command contains teamcity and link, so an eval run that only does teamcity link --help, teamcity link with no flags, or another failed/usage-only invocation is graded as having created the binding. Because the new tasks use this check to validate that teamcity.toml was completed, this can falsely pass runs that never write or update the binding; the check should verify a non-help/non-validation link invocation succeeded or that teamcity.toml was created/updated by the command.
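A minimal sketch of a stricter check, under stated assumptions: the helper names are hypothetical, `-h` is assumed to be the short help flag (the Cobra default), and the binding file is assumed to be written as `teamcity.toml` at the workspace root. Other link modes (for example the auto mode discussed in this thread) are not covered.

```python
import shlex
from pathlib import Path

HELP_FLAGS = {"--help", "-h"}
TARGET_FLAGS = ("--project", "-p", "--job", "-j")


def is_binding_link_invocation(cmd: str) -> bool:
    """True only for `teamcity link` calls that name a project or job
    and do not ask for help output."""
    try:
        tokens = shlex.split(cmd)
    except ValueError:
        return False
    for i in range(len(tokens) - 1):
        if tokens[i] == "teamcity" and tokens[i + 1] == "link":
            args = tokens[i + 2:]
            if HELP_FLAGS.intersection(args):
                return False  # usage-only invocation
            return any(
                a == f or a.startswith(f + "=") for a in args for f in TARGET_FLAGS
            )
    return False


def binding_file_written(workspace: Path) -> bool:
    """Assumption: a successful link leaves teamcity.toml in the workspace root."""
    return (workspace / "teamcity.toml").is_file()
```

A check could require both conditions, so that a well-formed but failed invocation does not pass on its own.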

Viktor (tiulpin)

Thanks! The link docs are good (verified flags against the binary; add --auto), and setup_files we need anyway. But can we maybe split this? TL;DR: a pull request should change the skill or the checks, never both.

It races #350 (unfortunately, merged only this morning, but I'm happy to help you with resolving conflicts): the allowlist is gone – cli_schema.json is generated from the cobra tree, link/run tree already covered.

Bigger issue: added_repository_link, mentioned_teamcity_toml, etc. on existing tasks assert behavior only SKILL.md mandates. The CONTROL run can't know the rule and doesn't need it for the task, so the baseline can't pass; lift inflates without the skill getting better.

Our invented rule for this: a scored check must be passable by someone who never read SKILL.md; skill-prescribed behavior becomes an unscored tag. used_project_from_repository_link keys on an exact error string the CLI emits automatically in both arms – measures the binary, not the agent.
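To make the effect concrete, here is a small sketch of the scored versus unscored distinction. The `CheckResult` type, the `task_solved` check name, and the definition of lift as a difference in pass rates are assumptions for illustration, not the harness's actual data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool
    scored: bool = True  # False means an informational tag that never affects the score


def pass_rate(results: List[CheckResult]) -> float:
    """Fraction of scored checks that passed; unscored tags are ignored."""
    scored = [r for r in results if r.scored]
    if not scored:
        return 0.0
    return sum(r.passed for r in scored) / len(scored)


def lift(control: List[CheckResult], treatment: List[CheckResult]) -> float:
    """Assumed definition: treatment pass rate minus control pass rate."""
    return pass_rate(treatment) - pass_rate(control)


def arm(rule_followed: bool, scored: bool) -> List[CheckResult]:
    # Both arms solve the task; only the skill arm follows the skill-only rule.
    return [
        CheckResult("task_solved", True),
        CheckResult("mentioned_teamcity_toml", rule_followed, scored=scored),
    ]


inflated = lift(arm(False, scored=True), arm(True, scored=True))
neutral = lift(arm(False, scored=False), arm(True, scored=False))
```

Here `inflated` is 0.5 although both arms solved the task equally well, while `neutral` is 0.0 once the skill-prescribed behavior is recorded as an unscored tag.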

Here's my split suggestion:

setup_files, rebased, + .. guard + unit test. Measurement-neutral, merges now.
the docs; SKILL.md cut to: respect existing teamcity.toml, never hand-edit, link only when asked. Mandatory link on read-only tasks = agents writing into user repos unprompted. CI measures the new text automatically.
one dedicated use-repository-link task: real project binding, underspecified prompt, score whether the bound scope was used. CONTROL can discover the file, so it's fair. Re-baselines main.

P.S. Maybe not all checks follow this rule at the moment, but we need to update old checks if they don't and some work on that is in progress.
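The `..` guard mentioned for setup_files could be sketched as follows. This is an assumption-laden illustration, not the code in evals/scaffold/claude.py: the function name `seed_workspace`, the `$NAME` placeholder syntax, and the mapping shape of `setup_files` (relative path to file body) are all hypothetical.

```python
import os
from pathlib import Path
from string import Template
from typing import Dict, List, Optional


def seed_workspace(
    workspace: Path,
    setup_files: Dict[str, str],
    env: Optional[Dict[str, str]] = None,
) -> List[Path]:
    """Write setup files into the workspace, refusing paths that escape it."""
    root = workspace.resolve()
    values = dict(os.environ if env is None else env)
    written = []
    for rel, body in setup_files.items():
        target = (root / rel).resolve()
        # Guard: `..` segments and absolute paths resolve outside the root.
        if root not in target.parents:
            raise ValueError(f"setup file escapes workspace: {rel!r}")
        target.parent.mkdir(parents=True, exist_ok=True)
        # safe_substitute leaves unknown placeholders untouched instead of raising.
        target.write_text(Template(body).safe_substitute(values))
        written.append(target)
    return written
```

A unit test for the guard would assert that a key such as `../escape.txt` raises `ValueError` and that nothing is written outside the workspace.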

Viktor (tiulpin) · 2026-06-11T16:02:05Z

Thought about this more...

Before we invest more in skill rules for linking: the original problem statement from the issue was already solved.

teamcity run list --revision @head

finds the build for the agent's exact commit, and the agent skill already documents this pattern (workflows reference, "wait for my commit" flow)

What link adds on top is convenience: a persisted choice of which job is "the" job for the repo, default scoping, monorepo paths. Useful – but the ticket's premise ("agent struggles to find the exact build") isn't blocked on it.

Proposal: measure before building. We have an A/B eval harness for the CLI skill.
One task that models this exact flow – repo workspace, "I pushed a fix, is CI green?" – tells us whether agents succeed via --revision @head without a binding, and whether a binding (or skill text about it) actually moves the success rate. If it does, we ship the skill rules with evidence; if it doesn't, this could re-scope to the setup-MVP flow (TW-99789) where the agent creates the pipeline and linking is natural.

Happy to help with that this week – it's one task definition in the existing pipeline

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 244c9516ec

chatgpt-codex-connector · 2026-06-12T09:10:50Z

+
+
+def used_project_from_repository_link(runner: EvalRunner) -> None:
+    linked_project = "No project found by name or internal/external id 'Project_uipBGpvQua'"


Accept the linked job as repository-link usage

In the new use-repository-link eval, teamcity.toml is seeded with both project = "Project_uipBGpvQua" and job = "FooJob", and run list resolves the linked default job when no explicit filter is supplied. An agent that correctly honors the more specific linked job can therefore query FooJob and receive a job/build-type error instead of this hard-coded project-not-found text, causing a false failure even though it used the repository binding. Consider accepting either the linked project or linked job marker for this task.
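One way to accept either marker is sketched below. It matches on the seeded identifiers (`Project_uipBGpvQua`, `FooJob`) rather than on an exact error string, since the job-level error text is not shown in this thread. The helper names are hypothetical, and the content shapes (a string or a list of blocks with a `text` key) follow the check quoted above.

```python
from typing import Any, Iterable, Sequence


def result_text(content: Any) -> str:
    """Flatten a tool result's content into plain text."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        return "\n".join(
            str(block.get("text") or "") for block in content if isinstance(block, dict)
        )
    return ""


def used_linked_target(
    tool_results: Iterable[dict],
    markers: Sequence[str] = ("Project_uipBGpvQua", "FooJob"),
) -> bool:
    """True if any tool result mentions the linked project or the linked job."""
    for result in tool_results:
        text = result_text(result.get("content", ""))
        if any(marker in text for marker in markers):
            return True
    return False
```

A caveat of matching on identifiers: simply printing teamcity.toml would also produce a match, so a real check would probably restrict this to results of `teamcity` commands.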

Boris Yakhno (boris-yakhno) requested review from Copilot and Viktoria Petrenko (vbedrosova) June 11, 2026 09:24

Boris Yakhno (boris-yakhno) requested a review from Viktor (tiulpin) as a code owner June 11, 2026 09:24

Copilot started reviewing on behalf of Boris Yakhno (boris-yakhno) June 11, 2026

Copilot AI reviewed Jun 11, 2026

chatgpt-codex-connector Bot reviewed Jun 11, 2026

Commit a57b459: Added instructions on working with teamcity.toml and the link command to the agent skill

Boris Yakhno (boris-yakhno) force-pushed the reporitory-binding-skill-update branch from ab43a82 to a57b459 Compare June 11, 2026 10:00

chatgpt-codex-connector Bot reviewed Jun 11, 2026

JetBrains (JetBrains) deleted a comment from lligirlburg-wq Jun 11, 2026

Viktor (tiulpin) requested changes Jun 11, 2026

Commit 244c951: Fixes to address code review comments

chatgpt-codex-connector Bot reviewed Jun 12, 2026

JetBrains (JetBrains) deleted a comment Jun 12, 2026

Boris Yakhno (boris-yakhno) wants to merge 2 commits into JetBrains:main from boris-yakhno:reporitory-binding-skill-update
