BenchmarkScore implementation #36
- 'task_id' and 'learner_id' have to be factors
@sebffischer still needs some discussion about documentation, etc. before reviewing
* remove independent parameter
* better doc
* no check for ntasks as this class is the container only
@berndbischl @sebffischer please review. It's just the new container class. I will open a new PR with the refactoring of BenchmarkAggr and the separate tests.
So I removed the |
sebffischer
left a comment
Thanks a lot already, looking good! :) Just let me know when you disagree on something
| #' for example the result of [mlr3::BenchmarkResult]`$aggregate` or via [as_benchmark_aggr], | ||
| #' or by passing in a custom dataset of results. Custom datasets must include at the very least, | ||
| #' a character column for learner ids, a character column for task ids, and numeric columns for | ||
| #' a factor column for learner ids, a factor column for task ids, and numeric columns for |
Was this incorrectly documented before, or why was it changed?
| #' @param task_id (`factor(1)`) \cr | ||
| #' String specifying name of task id column. | ||
| #' @param learner_id (`character(1)`)\cr | ||
| #' @param learner_id (`factor(1)`)\cr |
Why must this be a factor? I don't want to write learner_id = factor("regr.lm"), I want to write learner_id = "regr.lm"
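One way to reconcile the two positions, sketched here under assumptions: the argument stays a plain string naming the column, and the column itself is coerced to a factor internally. The helper `coerce_id_column()` and the example data are hypothetical and not part of mlr3benchmark.

```r
# Hypothetical helper, not the package's API: `col` is the column *name*
# given as a plain string; the column itself is coerced to a factor.
coerce_id_column = function(df, col) {
  stopifnot(is.character(col), length(col) == 1L, col %in% names(df))
  if (!is.factor(df[[col]])) {
    df[[col]] = factor(df[[col]])
  }
  df
}

# Made-up example data with character id columns.
df = data.frame(
  learners = c("regr.lm", "regr.rpart"),
  tasks = c("a", "a"),
  rmse = c(0.5, 0.7),
  stringsAsFactors = FALSE
)

# The caller writes a string, never factor("...").
df = coerce_id_column(df, "learners")
```

With this shape the documented type of the argument would remain `character(1)`, while the stored column can still be a factor.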
| #' @description Subsets the data by given tasks or learners. | ||
| #' Returns data as [data.table::data.table]. | ||
| #' @param task (`character()`) \cr |
I would even lean towards removing this subset method, as it behaves more like a class method than an instance method (from an instance method you would expect it to modify the instance in-place, I guess; at least that would be in line with the rest of mlr3). What do you think?
| #' # equivalently | ||
| #' as_benchmark_score(df, task_id = "tasks", learner_id = "learners", iteration = "iters") | ||
| #' | ||
| #' if (requireNamespaces(c("mlr3", "rpart"))) { |
There is require_namespaces() in mlr3misc. I think I would use it and remove the requireNamespaces helper function from mlr3benchmark; they do the same thing, so there is no need to keep it twice.
| #' | ||
| #' @details This class is used as a container of benchmarking results where | ||
| #' multiple learners (models) have been tested against multiple tasks (datasets) | ||
| #' using a resampling scheme. The results stored are the per-resampling |
Here it reads as if the resampling scheme is the same for all task-learner combinations. Is this the case? Otherwise, please rephrase.
| #' @field iterations `(numeric())` \cr Unique resampling iterations. | ||
| iterations = function() unique(private$.dt[[self$col_roles$iteration]]), | ||
| #' @field measures `(character())` \cr Unique measure names. | ||
| measures = function() setdiff(colnames(private$.dt), unlist(self$col_roles)), |
We don't really support adding measures, right? So we can also compute this once and return it here.
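A minimal sketch of that suggestion, assuming the columns never change after construction: the measure names are derived once with the same `setdiff()` expression as in the diff and then stored. The constructor `make_container()` and the example data are hypothetical; the actual class is an R6 class with a private `.dt` field.

```r
# Hypothetical constructor illustrating "compute once, return later".
make_container = function(dt, col_roles) {
  # Same expression as in the active binding, evaluated a single time.
  measures = setdiff(colnames(dt), unlist(col_roles))
  list(dt = dt, col_roles = col_roles, measures = measures)
}

# Made-up example data: three role columns and two measure columns.
dt = data.frame(
  tasks = factor("a"),
  learners = factor("regr.lm"),
  iters = 1L,
  rmse = 0.5,
  mae = 0.4
)
roles = list(task_id = "tasks", learner_id = "learners", iteration = "iters")

container = make_container(dt, roles)
container$measures
```

The trade-off is that a cached value goes stale if columns are ever added later, which is why this only fits if adding measures stays unsupported.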
| private = list( | ||
| .col_roles = character(0), | ||
| .dt = data.table() |
usually we initialize everything to NULL
| ) | ||
| ) | ||
| #' @title Coercions to BenchmarkScore |
| #' @title Coercions to BenchmarkScore | |
| #' @title Conversion to BenchmarkScore |
| #' @title Coercions to BenchmarkScore | ||
| #' | ||
| #' @description Coercion methods to [BenchmarkScore]. |
| #' @description Coercion methods to [BenchmarkScore]. | |
| #' @description Conversion methods to [BenchmarkScore]. |
| loss$lower = loss[, meas] - se * stats::qnorm(1 - (1 - level) / 2) | ||
| loss$upper = loss[, meas] + se * stats::qnorm(1 - (1 - level) / 2) | ||
| ggplot(data = loss, aes_string(x = object$col_roles$learner_id, y = meas)) + | ||
| ggplot(data = loss, aes(x = .data[[object$col_roles$learner_id]], y = .data[[meas]])) + |
You need to set .data to NULL, otherwise some tools will complain that it does not exist.