Title: | Rescales Data Based on Other Data |
---|---|
Description: | Rescales columns in a data frame based on the columns in a second data frame. For example, a column can be rescaled by subtracting the mean and dividing by the standard deviation. The user can pass the names of functions that calculate the value to subtract and/or the value to divide by. The package was developed for making predictions from models with rescaled variables. For the predictions to be valid, the new data frame must have its predictor variables rescaled based on the original data. |
Authors: | Joe Thorley [cre, aut], Ayla Pearson [ctb], Nadine Hussein [ctb], Poisson Consulting [fnd, cph] |
Maintainer: | Joe Thorley <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.0.0.9006 |
Built: | 2024-11-01 16:19:46 UTC |
Source: | https://github.com/poissonconsulting/rescale |
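The workflow that motivates the package can be sketched in base R alone. The example below is an illustration and does not use the package's code: it fits a model on a standardised predictor and then rescales new data with the mean and standard deviation of the original data. The object names (`train`, `newdata`, `hp_z`) are made up for the example.

```r
# Base R sketch (not the package's implementation): rescale new data
# using statistics computed from the original data.
train <- datasets::mtcars
newdata <- data.frame(hp = c(100, 200))

hp_mean <- mean(train$hp)
hp_sd <- sd(train$hp)

# Standardise the predictor in the original data and fit the model.
train$hp_z <- (train$hp - hp_mean) / hp_sd
fit <- lm(mpg ~ hp_z, data = train)

# Rescale the new data with the ORIGINAL mean and sd so that the
# coefficients of `fit` apply on the same scale.
newdata$hp_z <- (newdata$hp - hp_mean) / hp_sd
pred <- predict(fit, newdata = newdata)

# Sanity check: the predictions agree with a model fitted on the raw predictor.
fit_raw <- lm(mpg ~ hp, data = train)
pred_raw <- predict(fit_raw, newdata = newdata)
stopifnot(isTRUE(all.equal(unname(pred), unname(pred_raw))))
```

Had `newdata` been standardised with its own mean and standard deviation, the two sets of predictions would no longer agree.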
Check Valid Rescalers
check_valid_rescalers(x, x_name = substitute(x))
x |
A character vector to test for valid rescalers. |
x_name |
A string of the object name. |
Throws an informative error if the check fails; otherwise returns the original object.
Get Rescaler Colnames
get_rescaler_colnames(x)
x |
A character vector. |
A character vector of the rescaler column names.
get_rescaler_colnames(c("log(mean)*", "sqrt(cc)="))
Centers and scales columns in a data frame based on the columns in a second data frame. A column is centered by subtracting the mean and scaled by dividing by the standard deviation. Columns in scale are automatically added to center so they are standardised.
rescale(data, data2 = data, center = character(0), scale = character(0))
data |
The data frame to center and scale. |
data2 |
A data frame to use for the centering and scaling. |
center |
A character vector of the columns to center. |
scale |
A character vector of the columns to scale (after centering). |
The data frame with rescaled columns.
See also: scale, rescale_f and rescale_c
rescale(datasets::mtcars, scale = "mpg")
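The documented behaviour can be approximated with a few lines of base R. The helper below, `rescale_sketch`, is hypothetical and written only to illustrate the description above; it is not the package's source and omits any input checking the package may perform.

```r
# Hypothetical helper illustrating the documented behaviour; it is not
# the package's source code.
rescale_sketch <- function(data, data2 = data,
                           center = character(0), scale = character(0)) {
  # Evaluate the default before `data` is modified below.
  force(data2)
  # Columns to scale are also centered, so they end up standardised.
  center <- union(center, scale)
  for (col in center) {
    data[[col]] <- data[[col]] - mean(data2[[col]])
  }
  for (col in scale) {
    data[[col]] <- data[[col]] / sd(data2[[col]])
  }
  data
}

# Standardise mpg using its own mean and sd.
out <- rescale_sketch(datasets::mtcars, scale = "mpg")

# Center a new observation using the mean of the original data.
new_obs <- data.frame(mpg = 20)
out_new <- rescale_sketch(new_obs, datasets::mtcars, center = "mpg")
```

In the second call the subtracted value is the mean of `mpg` in `datasets::mtcars`, not in `new_obs`, which is what the `data2` argument is for.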
Rescales transformed columns in a data frame based on the transformed columns in a second data frame. The columns are rescaled by subtracting values and then dividing by values.
rescale_c(data, data2 = data, colnames = character(0))
data |
The data frame to rescale. |
data2 |
A data frame to use for the rescaling. |
colnames |
A character vector of column names to transform and/or rescale. |
The column names can include a single function for the transform as well as one of the following suffixes: + (subtract mean), - (subtract min), = (subtract min and add 1), / (divide by sd) and * (subtract mean and divide by sd).
The data frame with transformed and rescaled columns.
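To make the column-name syntax concrete, the sketch below splits a specification into its transform function, column name and suffix. It assumes each entry has the form `fun(column)suffix`, with the function and the suffix both optional; `parse_spec` is a hypothetical helper, and the package may parse or validate these strings differently.

```r
# Hypothetical parser for specifications such as "log(mpg)*" or "hp+".
# It illustrates the documented syntax and is not the package's parser.
parse_spec <- function(x) {
  # Trailing suffix, if any: one of + - = / *
  has_suffix <- grepl("[-+=/*]$", x)
  suffix <- ifelse(has_suffix, substr(x, nchar(x), nchar(x)), "")
  body <- sub("[-+=/*]$", "", x)
  # Optional single transform function wrapped around the column name.
  has_fun <- grepl("^[A-Za-z.][A-Za-z0-9._]*\\(.*\\)$", body)
  transform <- ifelse(has_fun, sub("\\(.*$", "", body), "")
  column <- ifelse(has_fun, sub("^[^(]*\\((.*)\\)$", "\\1", body), body)
  data.frame(spec = x, transform = transform, column = column,
             suffix = suffix, stringsAsFactors = FALSE)
}

specs <- parse_spec(c("log(mpg)*", "sqrt(disp)=", "hp+", "wt"))
print(specs)
```

Each row of `specs` holds one specification: for `"log(mpg)*"` the transform is `log`, the column is `mpg` and the suffix is `*`, while `"wt"` has neither a transform nor a suffix.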
Rescales transformed columns in a data frame based on the transformed columns in a second data frame. The columns are rescaled by subtracting values and then dividing by values.
rescale_f(data, data2 = data, transform = list(), subtract = list(), divide_by = list())
data |
The data frame to rescale. |
data2 |
A data frame to use for the rescaling. |
transform |
A named list where the name(s) indicate the function(s) to use for the transformations and the elements indicate the columns to transform. |
subtract |
A named list where the name(s) indicate the function(s) to use when determining the value to subtract and the elements indicate the columns. |
divide_by |
A named list where the name(s) indicate the function(s) to use when determining the value to divide by and the elements indicate the columns. |
The data frame with transformed and rescaled columns.
rescale_f(datasets::mtcars, transform = list(log = "mpg"), subtract = list(mean = c("mpg", "disp"), min = "gear"), divide_by = list(sd = c("mpg", "hp")))
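For orientation, the arithmetic this call describes can be written out in base R for three of the columns. The sketch assumes the order stated in the description (transform, then subtract, then divide) and assumes the statistics are computed on the transformed values of `data2`; it has not been checked against the package's output.

```r
# Base R sketch of the documented order of operations, assuming that
# the statistics come from the transformed columns of data2.
data <- datasets::mtcars
data2 <- datasets::mtcars

# mpg: log transform, subtract the mean, divide by the sd.
log_mpg <- log(data$mpg)
log_mpg2 <- log(data2$mpg)
mpg_rescaled <- (log_mpg - mean(log_mpg2)) / sd(log_mpg2)

# gear: no transform, subtract the minimum only.
gear_rescaled <- data$gear - min(data2$gear)

# hp: no transform, divide by the sd only (no centering).
hp_rescaled <- data$hp / sd(data2$hp)
```

Because `data2` equals `data` here, `mpg_rescaled` has mean 0 and standard deviation 1, `gear_rescaled` starts at 0, and `hp_rescaled` has standard deviation 1 but a non-zero mean.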