Function for generating RLE-like groups

I've seen this scenario come up quite a few times on SO:

``` r
require(data.table)
set.seed(2L)
DT <- data.table(x=sample(3,10,TRUE), y=1:10)
#     x  y
#  1: 1  1
#  2: 3  2
#  3: 2  3
#  4: 1  4
#  5: 3  5
#  6: 3  6
#  7: 1  7
#  8: 3  8
#  9: 2  9
#10: 2 10
```

The task is to add a column `z`, based on column `x`, that starts at `1` and keeps the same value (i.e. the same group) for as long as successive values of `x` are equal. In this case, `z` would be:

``` r
z <- as.integer(c(1, 2, 3, 4, 5, 5, 6, 7, 8, 8))
#  [1] 1 2 3 4 5 5 6 7 8 8
```

This can be accomplished quite easily with `data.table`'s internal utility functions `uniqlist` and `uniqlengths`. Here's a preliminary illustration:

``` r
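rle_index <- function(vec) {
    ## start index of each run of equal values; no copy in R 3.1.0+
    ulist <- data.table:::uniqlist(list(vec))
    ## length of each run, derived from the start indices
    ulen <- data.table:::uniqlengths(ulist, length(vec))
    ## repeat each run's id as many times as the run is long
    rep(seq_along(ulist), ulen)
}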
rle_index(DT$x)
#  [1] 1 2 3 4 5 5 6 7 8 8
```
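
For comparison, here is a minimal base R sketch of the same idea that avoids the unexported `data.table:::` functions. The name `rle_index_base` is hypothetical (it is not part of `data.table`), and it is likely slower than the internal-function version on large vectors. Note that base `rle()` treats each `NA` as its own run.

``` r
## Hypothetical helper, not part of data.table: run ids using base rle()
rle_index_base <- function(vec) {
    runs <- rle(vec)                             # lengths of consecutive equal values
    rep(seq_along(runs$lengths), runs$lengths)   # one id per run, repeated run-length times
}

## column x from the example above, typed in by hand
x <- c(1L, 3L, 2L, 1L, 3L, 3L, 1L, 3L, 2L, 2L)
z <- rle_index_base(x)
z
#  [1] 1 2 3 4 5 5 6 7 8 8
```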

Typical usage would then be:

``` r
DT[, z := rle_index(x)]
## or to use with grouping
DT[, sum(y), by=list(rle_index(x))]
```

Here's an [SO post](http://stackoverflow.com/q/21421047/559784) where this comes up, and [another](http://stackoverflow.com/q/21511257/559784).

