Expose cached token counts in ResponseMetaInfo #1275

@ramapitecusment

Description
Feature Request: Expose cached token counts in ResponseMetaInfo

Problem

ResponseMetaInfo does not include cached input token counts, even though Koog already parses this data from provider responses.

I'm building a chat application that tracks costs per conversation. OpenAI returns prompt_tokens_details.cached_tokens in responses, and Koog correctly deserializes this into PromptTokensDetails.cachedTokens. However, this data is dropped in createMetaInfo() and never reaches my application code.

Current Behavior

AbstractOpenAILLMClient.kt:482-487

protected fun createMetaInfo(usage: OpenAIUsage?): ResponseMetaInfo = ResponseMetaInfo.create(
    clock,
    totalTokensCount = usage?.totalTokens,
    inputTokensCount = usage?.promptTokens,
    outputTokensCount = usage?.completionTokens
    // promptTokensDetails.cachedTokens is not passed
)

The metadata parameter exists in ResponseMetaInfo.create() but is never used.

Data That Gets Lost

OpenAIDataModels.kt:916-920 - Parsed but not exposed:

@Serializable
public class PromptTokensDetails(
    public val audioTokens: Int? = null,
    public val cachedTokens: Int? = null,  // This field exists
)

OpenAIDataModels.kt:901-907 - Also parsed but not exposed:

@Serializable
public class CompletionTokensDetails(
    public val acceptedPredictionTokens: Int? = null,
    public val audioTokens: Int? = null,
    public val reasoningTokens: Int? = null,
    public val rejectedPredictionTokens: Int? = null,
)

Proposed Solution

Populate the existing metadata: JsonObject? field in ResponseMetaInfo:

protected fun createMetaInfo(usage: OpenAIUsage?): ResponseMetaInfo {
    // Use a named lambda parameter to avoid shadowing `it` in the nested lets
    val tokenDetails = usage?.let { u ->
        buildJsonObject {
            u.promptTokensDetails?.cachedTokens?.let { put("cachedInputTokens", it) }
            u.completionTokensDetails?.reasoningTokens?.let { put("reasoningTokens", it) }
        }.takeIf { it.isNotEmpty() }
    }

    return ResponseMetaInfo.create(
        clock,
        totalTokensCount = usage?.totalTokens,
        inputTokensCount = usage?.promptTokens,
        outputTokensCount = usage?.completionTokens,
        metadata = tokenDetails
    )
}

This is non-breaking since metadata already exists and accepts JsonObject?.
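On the consumer side, application code could then recover the cached count from the metadata object. A minimal sketch, assuming the `"cachedInputTokens"` key proposed above (the helper function here is hypothetical, not an existing Koog API):

```kotlin
import kotlinx.serialization.json.JsonObject
import kotlinx.serialization.json.buildJsonObject
import kotlinx.serialization.json.int
import kotlinx.serialization.json.jsonPrimitive
import kotlinx.serialization.json.put

// Read the cached-token count out of ResponseMetaInfo.metadata,
// falling back to 0 when the provider reported no cache usage.
fun cachedInputTokens(metadata: JsonObject?): Int =
    metadata?.get("cachedInputTokens")?.jsonPrimitive?.int ?: 0

fun main() {
    val meta = buildJsonObject { put("cachedInputTokens", 5000) }
    println(cachedInputTokens(meta)) // prints 5000
    println(cachedInputTokens(null)) // prints 0
}
```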

Why This Matters

Cached tokens cost 50-90% less than regular input tokens depending on provider:

  • OpenAI: 50% discount
  • Anthropic: 90% discount
  • Gemini: 75% discount

Without this data, applications either overcharge users or absorb the difference. For a typical chat session with 5K cached tokens and 1K new tokens using Claude Sonnet, the cost difference is ~$0.013 per request.
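To make the arithmetic explicit, here is a sketch of that example. The prices are assumptions for illustration (roughly $3.00 per million input tokens for Claude Sonnet, with cache reads at the 90% discount noted above); check current provider pricing:

```kotlin
// Assumed prices per million tokens (USD); verify against current provider pricing.
const val INPUT_PRICE_PER_M = 3.00   // regular input tokens
const val CACHED_PRICE_PER_M = 0.30  // cached reads at a 90% discount

// Input cost of a request where some tokens were served from the provider's cache.
fun inputCost(freshTokens: Int, cachedTokens: Int): Double =
    freshTokens / 1e6 * INPUT_PRICE_PER_M + cachedTokens / 1e6 * CACHED_PRICE_PER_M

fun main() {
    // 5K cached + 1K new tokens, billed correctly vs. all 6K billed as fresh input
    val actual = inputCost(freshTokens = 1_000, cachedTokens = 5_000)
    val naive = inputCost(freshTokens = 6_000, cachedTokens = 0)
    println(naive - actual) // ~0.0135 USD overbilled per request when cache data is dropped
}
```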

Files to Modify

  1. AbstractOpenAILLMClient.kt - Update createMetaInfo()
  2. Similar changes needed in Anthropic and Google clients for their cache fields
