tidyvpc#411
Conversation
|
@A2P2 This isn't ready to merge yet, but at long last it's at the point where comments would be useful. The tidyvpc integration works with all tests passing except vdiffr, but we don't yet have tests for the new functionality. Prediction correction is supported, the code seems robust to some of the variations we get for some controllers (e.g. one NM controller in the test suite creates simulated data with different column names and numbers of rows), but it does not yet support binless VPC. The core of the code is in place (i.e. it can call library(ggPMX)
ctr <- theophylline()
# vary number of bins, no prediction correction
p5 <- pmx_plot_vpc(ctr, predcorr = FALSE, bin = pmx_vpc_bin(style = "quantile", nbins = 5))
p10 <- pmx_plot_vpc(ctr, predcorr = FALSE, bin = pmx_vpc_bin(style = "quantile", nbins = 10))
# vary number of bins, add prediction correction
pc5 <- pmx_plot_vpc(ctr, predcorr = TRUE, bin = pmx_vpc_bin(style = "quantile", nbins = 5))
pc10 <- pmx_plot_vpc(ctr, predcorr = TRUE, bin = pmx_vpc_bin(style = "quantile", nbins = 10))
# type = scatter, with and without predcorr
ps <- pmx_plot_vpc(ctr, type = "scatter", predcorr = FALSE)
pcs <- pmx_plot_vpc(ctr, type = "scatter", predcorr = TRUE)
# does not currently work: although tidyvpc::binless is included in the processing
# pipeline, the rest of the code is not robust to the change in data structure
pbinless <- pmx_plot_vpc(ctr, predcorr = FALSE, bin = pmx_vpc_bin(style = "binless", nbins = 5))
|
|
|
||
| pred_data <- simulated_data %>% | ||
| dplyr::group_by(ID, !!sym(x$idv)) %>% | ||
| dplyr::summarise(PRED = mean(!!sym(x$dv))) %>% |
There was a problem hiding this comment.
This technically may be not exactly the same as PRED that comes from Monolix, since you are recalculating it here, right? ctr$data$predictions$PRED
| observed_data <- observed_data %>% | ||
| dplyr::left_join( | ||
| y = pred_data, | ||
| by = dplyr::join_by(ID, TIME) |
There was a problem hiding this comment.
join_by(TIME) might be fragile. I've fixed before one issue before where TIME in observed and simulated dataset was rounded to different number of significant digits #404
There are probably no alternative, but I'm curios to hear your thoughts.
There was a problem hiding this comment.
I don't have a simple answer for this one. On the face of it this feels like something we could solve with a rolling join, but dplyr only supports rolling joins based on inequalities, e.g., join_by(x$TIME <= y$TIME) so we would have to do precomputation to decide direction. An alternative might be to inspect the two data sets to infer the number of significant digits, create temporary variables by rounding both to the lower-precision version if they differ, and then join based on that. It might be more robust, but I'm not sure
| # out[, zmax := pmax(get(nn[[2]]), value)] | ||
| # out[, zmin := pmin(get(nn[[1]]), value)] | ||
|
|
||
| rug_dt <- data.frame(x = as.numeric(vpc_stats$stats$xbin), y = 1) |
There was a problem hiding this comment.
Should it be y=1 or y = NA_real_ (as in previous code) ?
| percentile = as.character(percentile), | ||
| percentile = gsub("^q", "", percentile), | ||
| percentile = as.numeric(percentile) * 100, | ||
| percentile = paste0("p", percentile) |
There was a problem hiding this comment.
Might want to introduce a helper function for all percentile calculations
| ) | ||
|
|
||
| # Danielle: filter out rows in simulated that are not in observed to prevent | ||
| # tidyvpc erroring |
There was a problem hiding this comment.
I believe tidyvpc also errors when there is TIME=0 point. Should there also be a filter for that?
| # within strat = TRUE as default in order to avoid bugs | ||
|
|
||
| # set seed for reproducible binning | ||
| # Danielle: would it make more sense to preserve the seed in the return value, and |
There was a problem hiding this comment.
Feel free to change if it makes more sense
| ) | ||
|
|
||
| # extract/parse arguments needed for tidyvpc | ||
| nbins <- ifelse(is.null(x$bin$n), 10, x$bin$n) # what should be a default values for nbins? |
There was a problem hiding this comment.
10 sounds ok? Related thing, it's possible to set nbins both via nbins and n:
`ctr%>%pmx_plot_vpc(bin = pmx_vpc_bin(style = "quantile", n = 8))
ctr%>%pmx_plot_vpc(bin = pmx_vpc_bin(style = "quantile", nbins = 8))`
Is it robust?
| if (!is.null(obs)) { | ||
| params <- append( | ||
| list( | ||
| mapping = aes(y = .data[[dv]], x = .data[[idv]]), |
There was a problem hiding this comment.
Just to be sure, where does dv comes from in this case?
There was a problem hiding this comment.
@djnavarro Hi Danielle, thank you, I'was able to run the examples as you described. I've submitted a few comments regarding the code. Hope you can make further progress in implementing the res of tidyvpc arguments. Let me know if I should check or help with something.
second pass at #386, building on initial one