Sometimes your sampling rate is too high; group_every allows you to
down-sample by creating "bins" that can subsequently be summarised. When using n, data
needs to be regularly sampled; if there are gaps in time, the bin duration will differ.
Works well with calculate_summary() for movement data.
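To see why gaps matter when binning by n, the following minimal sketch uses base R only. It is an illustration under stated assumptions, not the way group_every is implemented: the timestamps, the bin sizes, and the variable names (bin_by_n, bin_by_seconds, span_by_n) are all hypothetical.

```r
# Illustrative base R only; this is NOT how group_every() is implemented.
# Hypothetical timestamps in seconds, with a gap between 5 and 10.
time <- c(0, 1, 2, 3, 4, 5, 10, 11)

# Bin by a fixed number of observations (4 per bin), numbered from 0.
bin_by_n <- (seq_along(time) - 1) %/% 4

# Bin by a fixed duration (5 seconds per bin), numbered from 0.
bin_by_seconds <- floor(time / 5)

# Time spanned by each observation-count bin.
span_by_n <- tapply(time, bin_by_n, function(t) max(t) - min(t))
print(span_by_n)

# Number of observations that fall in each duration bin.
print(table(bin_by_seconds))
```

Here the first observation-count bin spans 3 seconds and the second spans 7 seconds, because the gap falls inside it. The duration bins keep a fixed width instead, at the cost of holding unequal numbers of observations.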
Examples
## Group every 5 seconds
df_time <- data.frame(
  time = seq(from = 0.02, to = 100, by = 1/30), # time at 30Hz, slightly offset
  y = rnorm(3000))                              # random numbers
df_time |>
  group_every(seconds = 5) |>          # group for every 5 seconds
  dplyr::summarise(time = min(time),   # summarise for time and y
                   mean_y = mean(y)) |>
  dplyr::mutate(time = floor(time))    # floor to get the round second number
#> # A tibble: 20 × 3
#> bin time mean_y
#> <dbl> <dbl> <dbl>
#> 1 0 0 -0.0869
#> 2 1 5 -0.159
#> 3 2 10 0.0381
#> 4 3 15 0.000112
#> 5 4 20 -0.180
#> 6 5 25 -0.116
#> 7 6 30 0.103
#> 8 7 35 0.0454
#> 9 8 40 0.121
#> 10 9 45 -0.0381
#> 11 10 50 -0.000178
#> 12 11 55 -0.0674
#> 13 12 60 0.0563
#> 14 13 65 -0.000104
#> 15 14 70 0.00515
#> 16 15 75 0.0320
#> 17 16 80 0.114
#> 18 17 85 0.0628
#> 19 18 90 0.00866
#> 20 19 95 0.0418
# Group every n observations
df <- data.frame(
  x = seq_len(1000),  # observation index 1 to 1000
  y = rnorm(1000))    # random numbers
df |>
  group_every(n = 30) |>               # group every 30 observations together
  dplyr::summarise(mean_x = mean(x),
                   mean_y = mean(y))
#> # A tibble: 34 × 3
#> bin mean_x mean_y
#> <dbl> <dbl> <dbl>
#> 1 1 15.5 -0.208
#> 2 2 45.5 -0.274
#> 3 3 75.5 -0.463
#> 4 4 106. -0.0246
#> 5 5 136. -0.211
#> 6 6 166. -0.0850
#> 7 7 196. 0.0452
#> 8 8 226. -0.0140
#> 9 9 256. -0.163
#> 10 10 286. 0.219
#> # ℹ 24 more rows
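
The 34 rows in the output above follow from the arithmetic: 1000 observations in groups of 30 give 33 full bins plus one partial bin. The sketch below checks this in base R. It is not the group_every implementation; the bin formula and the variable names are assumptions chosen to mirror the bin numbering shown in the output.

```r
# Illustrative base R only; this is NOT how group_every() is implemented.
# The numbering from 1 is an assumption that mirrors the output above.
x <- seq_len(1000)
bin <- (x - 1) %/% 30 + 1   # observations 1-30 -> bin 1, 31-60 -> bin 2, ...

n_bins <- max(bin)                          # same as ceiling(1000 / 30)
bin_sizes <- as.vector(table(bin))          # observations per bin
mean_x <- as.vector(tapply(x, bin, mean))   # per-bin mean of x

print(n_bins)
print(bin_sizes[n_bins])    # size of the final, partial bin
print(head(mean_x, 3))
```

The last bin holds only 10 observations, so a summary of that bin rests on fewer samples than the others.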