R generate lagged value based on outstanding calculated value in data frame / tibble

Ask Time:2020-05-05T03:00:55         Author:lwarode

I am not very experienced with loops and loop-equivalent functions in R (e.g. purrr and the apply() family), so I suspect my problem is rather easy to solve. However, after spending some time on it without getting a working result, asking the community seems more reasonable.

To illustrate the problem, consider the built-in mtcars dataset. I want to create a variable that is based on a certain column, e.g. wt (weight). To that end, the data frame is arranged in descending order of wt as follows:

library(tidyverse)
# mtcars ships with base R (package datasets), so it needs no library() call

df <- mtcars %>% 
  arrange(desc(wt)) 

Next, I would like to create a variable that starts from the highest value of wt. Every subsequent value should be the lagged (previous) value of this new variable divided by a certain divisor (2), but that previous value has itself not been calculated yet. Ignoring that problem for a moment, the code would look like this:

df <- mtcars %>% 
  arrange(desc(wt)) %>% 
  mutate(wt_2 = if_else(wt == max(wt),
                        wt,
                        lag(wt_2) / 2))

I know that this mutate call does not work, because wt_2 must already exist when it is referenced in the else branch (the false argument of if_else). It would run if wt_2 were created first in a separate step. This would imply:

df <- mtcars %>% 
  arrange(desc(wt)) %>% 
  mutate(wt_2 = if_else(wt == max(wt),
                        wt,
                        0)) %>% 
  mutate(wt_2 = if_else(wt_2 != max(wt),
                        lag(wt_2) / 2,
                        wt_2))

However, only the second observation is assigned a calculated value. The problem is that each value depends on the value before it, which must already have been calculated. Therefore I think some kind of looping mechanism is necessary. With the code above, the result is:

glimpse(df$wt_2)
 num [1:32] 5.42 2.71 0 0 0 ...
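
One possible reading of why only the second row is filled: lag() is vectorised, so it shifts the whole column once, using the values that exist at that moment, and never feeds a freshly computed value into the next row. The sketch below reproduces that behaviour in base R on a toy vector. The vector x and its values are made up for illustration, and c(NA, head(x, -1)) is used as a stand-in for dplyr::lag(x).

```r
# Toy stand-in for wt_2 after the first mutate(): the maximum, then zeros.
# (Illustrative values, not taken from mtcars.)
x <- c(8, 0, 0, 0)

# A vectorised lag shifts the existing values down by one position, all at once.
lagged <- c(NA, head(x, -1))   # base R equivalent of dplyr::lag(x)

# Halving the shifted vector only "sees" the original 8 once;
# every later element is computed from the old zeros, not from the new 4.
halved <- lagged / 2

print(halved)
# [1] NA  4  0  0
```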

The third value should be 2.71 / 2 = 1.355, the fourth value 1.355 / 2, and so on.

The new variable wt_2 should not refer to wt except for the highest value (5.42, or 5.424 unrounded). Every other observation should be assigned the value of the previous observation of the same variable (following the logic of lag) divided by 2 (another divisor would also be possible, but I chose 2 for this example). The problem is that this is not possible with the code above, since only the first observation, or the first and second observations, are assigned the intended values. It would be possible to calculate every value manually, but it should also be possible to calculate the values more easily with a loop-related function.
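
The calculation described above is a recurrence: the first value is max(wt), and each later value is the previous wt_2 divided by the divisor. A minimal sketch of one way to compute it, using base R's Reduce() with accumulate = TRUE rather than a tidyverse function, is shown below. The names divisor and closed_form are introduced here for illustration only, and the sketch assumes the maximum of wt is unique, as it is in mtcars.

```r
# mtcars ships with base R (package datasets); sort by wt, heaviest first.
df <- mtcars[order(-mtcars$wt), ]

divisor <- 2  # the divisor chosen in the question

# Reduce(accumulate = TRUE) feeds each result back in as the next input:
# start at max(wt), then divide the previous result by `divisor`
# once for every remaining row. The index argument `i` is unused.
df$wt_2 <- Reduce(function(prev, i) prev / divisor,
                  seq_len(nrow(df) - 1),
                  accumulate = TRUE,
                  init = max(df$wt))

# Because the divisor is constant, the same values follow from a closed form:
# wt_2[i] = max(wt) / divisor^(i - 1)
closed_form <- max(df$wt) / divisor^(seq_len(nrow(df)) - 1)
stopifnot(isTRUE(all.equal(df$wt_2, closed_form)))

print(head(df$wt_2, 4))
# [1] 5.424 2.712 1.356 0.678
```

Inside a dplyr pipeline the closed form could presumably be written as mutate(wt_2 = max(wt) / 2^(row_number() - 1)), which avoids an explicit loop altogether. The recurrence form is the more general one, since it would still apply if the step depended on something other than a constant divisor.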

Author: lwarode. Reproduced under the CC 4.0 BY-SA copyright license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/61599652/r-generate-lagged-value-based-on-outstanding-calculated-value-in-data-frame-ti