Mesmer (another product from David Van Valen's lab) has a parameter for specifying microns per pixel. CellSAM seems to lack an equivalent, even though it looks important.
Typically, neural networks perform best on test data that is similar to the training data. In the realm of biological imaging, the most common difference between datasets is the resolution of the data measured in microns per pixel. The resolution of the input data can be specified in app.predict using the image_mpp option.
In this example, CellSAM merges about half a dozen individual cells into each predicted cell. The input image is 1 micron per pixel.
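As a possible workaround until something like image_mpp exists, the image could be resampled to a finer pixel size before segmentation and the resulting mask mapped back afterwards. The sketch below is only an illustration: `resize_nearest` and `resample_to_mpp` are hypothetical helpers, not part of CellSAM or Mesmer, and `target_mpp` is a guess the user has to supply, since I don't know what resolution CellSAM was trained on. Nearest-neighbour sampling is used to keep the example dependency-free; for intensity images an interpolating resize is probably a better choice.

```python
import numpy as np

def resize_nearest(image, new_h, new_w):
    """Nearest-neighbour resize of an (H, W) or (H, W, C) array to (new_h, new_w)."""
    h, w = image.shape[:2]
    # Map each output row/column index back to a source index.
    rows = np.minimum(np.arange(new_h) * h // new_h, h - 1)
    cols = np.minimum(np.arange(new_w) * w // new_w, w - 1)
    return image[rows][:, cols]

def resample_to_mpp(image, image_mpp, target_mpp):
    """Resample so the pixel size changes from image_mpp to target_mpp (both in microns per pixel)."""
    scale = image_mpp / target_mpp  # > 1 upsamples, < 1 downsamples
    h, w = image.shape[:2]
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    return resize_nearest(image, new_h, new_w)

# Example: a 1 micron/pixel image resampled to an assumed 0.5 micron/pixel.
image = np.zeros((256, 256), dtype=np.float32)
upsampled = resample_to_mpp(image, image_mpp=1.0, target_mpp=0.5)
# ... run the segmentation on `upsampled` here to get a label mask ...
mask = np.zeros(upsampled.shape[:2], dtype=np.int32)  # placeholder for the predicted mask
mask_original_size = resize_nearest(mask, *image.shape[:2])
```

Nearest-neighbour is the right choice for mapping the label mask back, because interpolating integer labels would invent cell IDs that don't exist.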