Description
Hi @robitalec ,
let's take the following case:
library(data.table)
library(spatsoc)

# Six observations: five from user A, one from user B
datetime <- as.POSIXct(c("2019-06-01 00:00:01",
                         "2019-06-01 00:01:38",
                         "2019-06-01 00:05:49",
                         "2019-06-01 00:06:01",
                         "2019-06-01 00:07:59",
                         "2019-06-01 00:11:04"), tz = "UTC")
user <- c(rep("A", 5), "B")
dt <- data.table(datetime, user)
What I would like to find out is whether any record of one user is <= 5 minutes apart from a record of another user (and also colocated, but let's leave that out for the moment).
The spatsoc way would be to slice the datetime into 5-minute pieces and see if records fall into those.
group_times(dt, datetime = "datetime", threshold = "5 minutes")
Result
datetime user minutes timegroup
1: 2019-06-01 00:00:01 A 0 1
2: 2019-06-01 00:01:38 A 0 1
3: 2019-06-01 00:05:49 A 5 2
4: 2019-06-01 00:06:01 A 5 2
5: 2019-06-01 00:07:59 A 5 2
6: 2019-06-01 00:11:04 B 10 3
This, however, puts rows 5 and 6 into different groups, although their observations are within the defined threshold (they are 3 minutes 5 seconds apart). So, in the end, I would rather get something like
datetime user timegroup
1: 2019-06-01 00:07:59 A 1
2: 2019-06-01 00:11:04 B 1
or even
ego_datetime ego_user alter_datetime alter_user
1: 2019-06-01 00:07:59 A 2019-06-01 00:11:04 B
Are there any plans to do something like this? Or maybe I've missed something in the library?
There are robust methods to deal with that type of stuff, but they are quite brute-force (copious use of left joins...) and maybe / probably there is a more efficient way.
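For reference, here is a minimal sketch of one less brute-force option: a data.table non-equi join that matches each record to every record falling in the 5 minutes after it, then keeps only pairs from different users. This is plain data.table, not a spatsoc function, and the helper names (`ego`, `alter`, `lower`, `upper`, `join_time`, `threshold`) are made up for illustration. It assumes the window only looks forward in time, so each pair is reported once with the earlier record as the ego.

```r
library(data.table)

datetime <- as.POSIXct(c("2019-06-01 00:00:01",
                         "2019-06-01 00:01:38",
                         "2019-06-01 00:05:49",
                         "2019-06-01 00:06:01",
                         "2019-06-01 00:07:59",
                         "2019-06-01 00:11:04"), tz = "UTC")
user <- c(rep("A", 5), "B")
dt <- data.table(datetime, user)

threshold <- 5 * 60  # window length in seconds (assumed forward-looking)

# Each "ego" record gets a window [datetime, datetime + threshold]
ego <- dt[, .(ego_datetime = datetime, ego_user = user,
              lower = datetime, upper = datetime + threshold)]

# Each "alter" record keeps a copy of its time to join on, because the
# join column itself is overwritten by the window bounds in the result
alter <- dt[, .(alter_datetime = datetime, alter_user = user,
                join_time = datetime)]

# Non-equi join: alter records whose time falls inside an ego window
pairs <- alter[ego,
               on = .(join_time >= lower, join_time <= upper),
               nomatch = NULL,
               .(ego_datetime, ego_user, alter_datetime, alter_user)]

# Drop self-matches and pairs within the same user
pairs <- pairs[ego_user != alter_user]
print(pairs)
```

On the example data this should leave only the pair of user A at 00:07:59 and user B at 00:11:04. A colocation check could presumably be added as a further filter on the resulting pairs, but that is not covered here.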