I should give a more realistic example in the documentation showing that `boot_specieslevel` and `boot_networklevel` expect a list of one or more data frames of interactions. Each interaction (a row in the data frame) must be repeated as many times as it was observed. E.g. if the interaction species_1 x species_2 was observed 5 times, that row must appear 5 times in the data frame.
One misleading workflow in data preparation right now is to build a web matrix with the `table` function from the raw data, which most likely contains only an enumeration of interactions (one row per distinct interaction). The `table` function counts rows, so it does not account for the abundance of the interactions unless they are already repeated within the raw data. There is therefore a high risk of losing data. Once that web is built with `table`, the user then applies `web_matrix_to_df`, which completes a vicious data-processing circle.
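To make the pitfall concrete, here is a minimal base R sketch. The data frame `raw` and its column names (`lower`, `higher`, `n_obs`) are hypothetical and only stand in for a typical raw data sheet with one row per distinct interaction plus a count column.

```r
# Hypothetical raw data: one row per distinct interaction, plus a count column.
raw <- data.frame(
  lower  = c("plant_1", "plant_1", "plant_2"),
  higher = c("insect_1", "insect_2", "insect_1"),
  n_obs  = c(5, 1, 2),
  stringsAsFactors = FALSE
)

# table() only counts rows, so each listed pair contributes 1
# and the n_obs column is silently ignored.
web_lossy <- table(raw$lower, raw$higher)

rows_counted  <- sum(web_lossy)   # number of rows in raw
true_observed <- sum(raw$n_obs)   # number of interactions actually observed
```

In this toy case the web built with `table` holds 3 interactions, while 8 were observed, so 5 observations are lost before any bootstrapping starts.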
So the user tends to build the web matrix from a data frame and then transform the matrix back into an expanded data frame. This is an unnecessary detour, and I guess it was inspired by how I constructed the example with Safariland from bipartite. A more realistic case is to take the raw interaction data and expand (explode) the rows based on an abundance column; the result can then be used directly with `boot_specieslevel` or `boot_networklevel`. There is no need to use `table`, especially since it opens a misleading path towards data loss.
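A minimal sketch of that direct route in base R follows. The data frame `raw`, its column names (`lower`, `higher`, `n_obs`) and the list name `web_1` are assumptions made up for illustration; the bootstrap call itself is left out because its arguments should be taken from the package documentation.

```r
# Hypothetical raw data, e.g. as imported from Excel:
# one row per distinct interaction, with an abundance column.
raw <- data.frame(
  lower  = c("plant_1", "plant_1", "plant_2"),
  higher = c("insect_1", "insect_2", "insect_1"),
  n_obs  = c(5, 1, 2),
  stringsAsFactors = FALSE
)

# Repeat each row index n_obs times, then keep only the two species columns.
row_idx  <- rep(seq_len(nrow(raw)), times = raw$n_obs)
expanded <- raw[row_idx, c("lower", "higher")]
rownames(expanded) <- NULL

# One row per observed interaction: plant_1 x insect_1 now appears 5 times.
n_rows <- nrow(expanded)
n_p1i1 <- sum(expanded$lower == "plant_1" & expanded$higher == "insect_1")

# The bootstrap functions expect a list of one or more such data frames.
lst <- list(web_1 = expanded)
```

The expanded data frame keeps every observation, so neither `table` nor `web_matrix_to_df` is needed on this path.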
So, try to write a simple, more realistic usage example of `boot_specieslevel` and `boot_networklevel` that does not need `web_matrix_to_df`, which seems to be needed only in rare cases. Users tend to have their raw data as a data frame from Excel rather than as a web matrix/community matrix.