Question regarding the role of xim in overdispersion theta estimation #75


### Context

Hi,

I am using `glmGamPoi` to model single-cell RNA-seq count data using the Negative Binomial distribution. The workflow runs successfully, but I have a question regarding the internal implementation details of overdispersion estimation.

In the paper's supplementary material, the quadratic variance-to-mean relationship is defined as:


$$\sigma^2 = \mu + \theta \mu^2$$

However, while digging into the code that estimates $\theta$, I noticed a variable named [xim](https://github.com/const-ae/glmGamPoi/blob/95cffb79b02ce1239112d3c8b4adabfb448940bf/R/overdispersion.R#L274). When I ran tests comparing the estimated value of $\theta$ with and without `xim`, the results did not seem to differ significantly.
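For concreteness, the variance function above can be checked, and solved for $\theta$, with a few lines of base R. This is only a sketch of the textbook relationship, not the estimator used inside `glmGamPoi`: the values of `mu` and `theta` are arbitrary illustrative choices, and `theta_moment` is a hypothetical name for the naive moment-based estimate $(\sigma^2 - \mu) / \mu^2$.

```r
# Illustrative values (assumptions, not taken from real data)
set.seed(1)
mu    <- 5    # true mean
theta <- 0.5  # true overdispersion

# In rnbinom's (mu, size) parameterization, Var = mu + mu^2 / size,
# so size = 1 / theta matches sigma^2 = mu + theta * mu^2
y <- rnbinom(n = 1e5, size = 1 / theta, mu = mu)

# Empirical moments of the simulated counts
m <- mean(y)
v <- var(y)

# Variance implied by the quadratic variance function
expected_var <- mu + theta * mu^2

# Naive moment-based estimate: solve the variance function for theta
theta_moment <- (v - m) / m^2
```

With this many draws, `v` should land close to `expected_var` and `theta_moment` close to `theta`, which is the baseline behaviour I would expect when no additional factor enters the calculation.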

### Questions

1. What is the explicit mathematical or computational role of the `xim` variable during the estimation of $\theta$?
2. Why doesn't the estimation strictly follow the standard variance-mean function above? (For example, is `xim` a stabilization parameter, a transformation step, or a way of handling a specific edge case such as zero-inflation or low counts?)

### Minimal Code Context

The command I used:

```r
fit <- glmGamPoi::glm_gp(
  data         = umi,
  design       = ~1,
  col_data     = data,
  offset       = log_umi,
  size_factors = FALSE
)
```

I would love to understand the underlying intuition behind this design choice. Thank you for developing such a fantastic and high-performance package!
