
Recodes a variable into a new one by applying a format, which replaces a chain of if-clauses with a single mapping from original values to new categories.

Usage

recode(data_frame, new_var, ...)

recode_multi(data_frame, ...)

Arguments

data_frame

A data frame which contains the original variables to recode.

new_var

The name of the newly created and recoded variable.

...

In recode(), pass the original variable name that should be recoded along with the corresponding format container, in the form: variable = format.

In recode_multi(), multiple variables can be recoded in one call and multilabels can be applied. This overwrites the original variables and duplicates rows wherever multilabels apply. The function is useful when you want to apply format containers first and then work on the result with other packages.
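The row duplication caused by multilabels can be sketched in base R. This is only an illustration of the principle under simplifying assumptions, not the package's implementation: the names people, sex_lookup and expanded are hypothetical, and the multilabel is modelled as a plain lookup table in which every original value appears once per label it belongs to.

```r
# Hypothetical base R illustration; not how recode_multi() is implemented.
people <- data.frame(id = 1:3, sex = c(1, 2, 1))

# A multilabel as a lookup table: each value belongs to "Total"
# and to its own category, so it is listed twice.
sex_lookup <- data.frame(
    sex   = c(1, 2, 1, 2),
    label = c("Total", "Total", "Male", "Female"),
    stringsAsFactors = FALSE)

# Joining on the original value yields one row per matching label,
# which is why rows are duplicated.
expanded <- merge(people, sex_lookup, by = "sex")

# Overwrite the original variable with its label.
expanded$sex   <- expanded$label
expanded$label <- NULL

nrow(people)    # 3
nrow(expanded)  # 6

# The expanded data can then be passed to any other tool, here base aggregate().
counts <- aggregate(id ~ sex, data = expanded, FUN = length)
```

Because every person now also appears in a "Total" row, a simple count per category returns the totals without a separate aggregation step.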

Value

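recode() returns a data frame with the newly recoded variable. recode_multi() returns a data frame in which the original variables are overwritten by their recoded values.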

Details

recode() is based on the 'SAS' function put(), which provides an efficient and readable way to generate new variables with the help of formats.

When creating a format, you can write the code the way you think about the problem: this new category consists of these original values. You then apply the new categories to the original values to create a new variable, with no need for multiple if_else statements.
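The contrast between the two styles can be sketched in base R alone. This is a minimal illustration, not the package's implementation: age_format and lookup are hypothetical names, and the format is modelled as a named list that is flattened into a value-to-label table.

```r
# Hypothetical base R illustration; not how recode() is implemented.
age <- c(5, 17, 18, 30, 64, 65, 99)

# Style 1: one nested ifelse() condition per category boundary.
age_group_if <- ifelse(age <= 17, "under 18",
                ifelse(age <= 24, "18 to under 25",
                ifelse(age <= 54, "25 to under 55",
                ifelse(age <= 64, "55 to under 65", "65 and older"))))

# Style 2: state once which original values make up each category ...
age_format <- list(
    "under 18"       = 0:17,
    "18 to under 25" = 18:24,
    "25 to under 55" = 25:54,
    "55 to under 65" = 55:64,
    "65 and older"   = 65:100)

# ... flatten that description into a value-to-label table ...
lookup <- data.frame(
    value = unlist(age_format, use.names = FALSE),
    label = rep(names(age_format), lengths(age_format)),
    stringsAsFactors = FALSE)

# ... and apply it to the original values.
age_group_fmt <- lookup$label[match(age, lookup$value)]

# Both styles produce the same categories.
same_result <- identical(age_group_if, age_group_fmt)
```

In the second style the category definitions live in one place and can be reused for other variables or data frames, which is the idea behind format containers.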

See also

Creating formats: discrete_format() and interval_format().

Functions that also make use of formats: frequencies(), crosstabs(), any_table().

Examples

# Example formats
age. <- discrete_format(
    "under 18"       = 0:17,
    "18 to under 25" = 18:24,
    "25 to under 55" = 25:54,
    "55 to under 65" = 55:64,
    "65 and older"   = 65:100)

# Example data frame
my_data <- dummy_data(1000)

# Call function
my_data <- my_data |> recode("age_group1", age = age.)

# Formats can also be passed as characters
my_data <- my_data |> recode("age_group2", age = "age.")

# Multilabel recode
sex. <- discrete_format(
    "Total"  = 1:2,
    "Male"   = 1,
    "Female" = 2)

income. <- interval_format(
    "Total"              = 0:99999,
    "below 500"          = 0:499,
    "500 to under 1000"  = 500:999,
    "1000 to under 2000" = 1000:1999,
    "2000 and more"      = 2000:99999)

multi_data <- my_data |> recode_multi(sex = sex., income = income.)