best method for downsampling? #2

@jeff-goldsmith

Description

if you have functions measured over a rich grid, you might want to downsample (e.g., go from minute-level wearable-device data to 5-minute or one-hour increments). if that's your goal, you might prefer to average over bins -- but i don't think there's a good way to do that right now, is there?

tf_evaluate lets you evaluate on a new grid, but uses interpolation rather than averaging. and tf_integrate could work, sort of, in that you get the average value between lower and upper -- but it produces one scalar per function, so you'd have to do some kind of loop over bins.
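
for concreteness, the loop-over-bins version would look something like this (just a sketch -- it assumes a tf vector act with domain [0, 1440], and that tf_integrate returns the definite integral, so dividing by the bin width gives the bin mean; if it already returns the mean over [lower, upper], drop the division):

library(tidyfun)

# hypothetical: act is a tf vector of minute-level activity on [0, 1440]
breaks <- seq(0, 1440, by = 60)             # hourly bin edges
bin_means <- sapply(seq_len(length(breaks) - 1), function(i) {
  lo <- breaks[i]
  hi <- breaks[i + 1]
  # definite integral over the bin, divided by its width = bin average
  tf_integrate(act, lower = lo, upper = hi) / (hi - lo)
})
# bin_means: one column per bin, one row per curve in act
# (a plain vector if act holds a single function)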

for what it's worth, my current workaround is to unnest, aggregate, then nest and re-merge. something like:

library(tidyverse)
library(tidyfun)

hour_data <- 
  activity_df %>% 
  select(id, activity) %>% 
  tf_unnest(activity) %>%                            # long format: id, activity_arg, activity_value
  mutate(hour = floor((activity_arg - 1) / 60)) %>%  # assign each minute to an hourly bin
  group_by(id, hour) %>% 
  summarize(act_hour = mean(activity_value), .groups = "drop") %>% 
  tf_nest(act_hour, .id = id, .arg = hour)           # back to one tf column, act_hour
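
the re-merge step is then just a join back to the original data (a sketch; it assumes activity_df has one row per id, so the hourly tf column act_hour lands alongside the minute-level one):

activity_both <- 
  activity_df %>% 
  left_join(hour_data, by = "id")   # adds act_hour next to the original activity column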
