sqrt_trans, scale limit expansion, and missing breaks #980

Prompted by a posting on the mailing list (https://groups.google.com/d/topic/ggplot2/IUje5H0jwm4).
# Summary

Specific problem: Breaks near 0 are not displayed when the square root transformation is applied to a scale.

General problem: Scale expansion in transformed coordinate space can produce values that are not meaningfully (or correctly) invertible to data space, which leads to improperly excluded breaks.
# Reproducible example:

``` r
library("ggplot2")
library("scales")

DF <- data.frame(x = seq(0,1,by=0.1),
                 y = seq(0,1,by=0.1))

ggplot(DF, aes(x=x, y=y)) + 
  geom_point() + 
  scale_x_sqrt() +
  scale_y_continuous()
```
# Expected result

A plot with breaks labeled at 0, 0.25, 0.50, 0.75, and 1.00
# Actual results

![Actual results](https://cloud.githubusercontent.com/assets/463252/3401986/6130d7b2-fd56-11e3-8b1f-8c62e15899fd.png)

Note that there is no 0 break on the x-axis.
# Discussion

The error occurs because expanding the limits in coordinate space produces negative values. Transformed back to data space, these give incorrect limits, and the breaks are determined (or at least restricted) by those limits. Stepping through what effectively happens when the breaks are computed shows:

``` r
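st <- sqrt_trans()
# Transform the data limits into coordinate space
(x <- st$transform(c(0, 1)))
## [1] 0 1
# Expand the limits by 5% of their width
(x <- expand_range(x, 0.05, 0))
## [1] -0.05  1.05
# Invert the expanded limits: squaring turns -0.05 into +0.0025
(limits <- st$inverse(x))
## [1] 0.0025 1.1025
(breaks <- st$breaks(limits))
## [1] 0.00 0.25 0.50 0.75 1.00
st$transform(breaks)
## [1] 0.0000 0.5000 0.7071 0.8660 1.0000
st$transform(limits)
## [1] 0.05 1.05
# The break at 0 falls below the lower limit of 0.05 and is censored
censor(st$transform(breaks), st$transform(limits))
## [1]     NA 0.5000 0.7071 0.8660 1.0000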
```

The real problem is that the result of the `expand_range` call lies outside the range of the transformation, that is, outside the domain of its inverse. How should such out-of-range values be treated?
# Workarounds
## Don't square negative values

One solution is an alternative transformation whose inverse does not square negative values. A transformation should be one-to-one within its domain, and `sqrt_trans` is, but its inverse will happily accept negative values, which cannot occur if everything is constrained to the domain. A simple approach is to map all negative values to 0:

``` r
mysqrt_trans <- function() {
  trans_new("mysqrt", 
            transform = base::sqrt,
            inverse = function(x) ifelse(x<0, 0, x^2),
            domain = c(0, Inf))
}
```
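
As a rough check of this workaround, the steps from the Discussion can be repeated with the clamped inverse. This is only a sketch: the transformation is rebuilt inline so the block runs on its own, and the breaks are hard-coded to the values from the walkthrough rather than recomputed.

``` r
library("scales")

# Same definition as mysqrt_trans() above, built inline
st <- trans_new("mysqrt",
                transform = base::sqrt,
                inverse = function(x) ifelse(x < 0, 0, x^2),
                domain = c(0, Inf))

# Expand the transformed limits, then invert them with the clamped inverse
x <- expand_range(st$transform(c(0, 1)), 0.05, 0)
limits <- st$inverse(x)   # the lower limit is now 0 instead of 0.0025

# Breaks copied from the Discussion (an assumption, not recomputed here)
breaks <- c(0, 0.25, 0.5, 0.75, 1)

# With the lower limit at 0, the break at 0 is no longer censored
kept <- censor(st$transform(breaks), st$transform(limits))
anyNA(kept)
```

If the clamp behaves as intended, `anyNA(kept)` is `FALSE`, so the break at 0 survives.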
## Squish range before inverting

I am not sure whether ggplot2/scales assume that transformations are monotonic or merely one-to-one. I cannot come up with a useful transformation that is not monotonic, though I can create a pathological one. If we assume monotonicity, it is reasonable to squish any values outside the range (not the domain) of the transformation before inverting, bringing them back to the nearest extreme. A more general approach for an inverse would therefore be:

``` r
mysqrt_trans <- function() {
  domain <- c(0, Inf)
  transform <- base::sqrt
  range <- transform(domain)
  trans_new("mysqrt", 
            transform = transform,
            inverse = function(x) squish(x, range=range)^2,
            domain = domain)
}
```
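
The same idea could be factored out so that it applies to any monotonic transformation. The sketch below uses a hypothetical helper, `squished_trans`, which is not part of scales; it assumes that `transform(domain)` yields the two extremes of the range.

``` r
library("scales")

# Hypothetical helper (not part of scales): wrap a monotonic transformation so
# that its inverse first squishes its input into the transformation's range.
squished_trans <- function(name, transform, inverse, domain) {
  # sort() so that a decreasing transformation still gives c(low, high)
  trans_range <- sort(transform(domain))
  trans_new(name,
            transform = transform,
            inverse = function(x) inverse(squish(x, range = trans_range)),
            domain = domain)
}

st <- squished_trans("mysqrt", base::sqrt, function(x) x^2, c(0, Inf))

# -0.05 is pulled back to 0 before squaring; in-range values are unchanged
inverted <- st$inverse(c(-0.05, 0.5, 1.05))
```

Here `inverted` should come out as 0, 0.25, and 1.1025, so the expanded lower limit maps back to the edge of the domain rather than to a small positive number.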
## Squish to range whenever values are extended

This approach makes it the responsibility of the code that manipulates transformed (coordinate space) values to squish them to the appropriate range whenever there is any chance that the range is violated. If monotonicity is assumed, I think any interpolation should be safe, but the result of any operation that can produce a value more extreme than the existing extremes would need to be squished. If this approach is taken, it would be worth adding a `range` component to the `trans` object, which is just the result of `transform(domain)`.

The inverse could then either assume that its input lies in the range or check that before proceeding (just as `transform` currently may or may not check the domain before proceeding). Ideally, the transformation would throw an error if `transform` is called with values outside `domain` or `inverse` is called with values outside `range`. This would help identify places where calling code is not behaving appropriately.
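
A minimal sketch of such a strict transformation follows. The names `strict_sqrt_trans` and `check_within` are hypothetical, and the range is kept in a local variable because `trans` objects have no `range` component today.

``` r
library("scales")

# Hypothetical strict variant: both directions fail loudly on bad input
strict_sqrt_trans <- function() {
  domain <- c(0, Inf)
  trans_range <- sqrt(domain)   # the proposed `range` component

  # Stop if any non-missing value lies outside limits; otherwise return x
  check_within <- function(x, limits, what) {
    bad <- !is.na(x) & (x < limits[1] | x > limits[2])
    if (any(bad)) {
      stop("values outside the ", what, " of the transformation: ",
           paste(x[bad], collapse = ", "), call. = FALSE)
    }
    x
  }

  trans_new("strict_sqrt",
            transform = function(x) sqrt(check_within(x, domain, "domain")),
            inverse = function(x) check_within(x, trans_range, "range")^2,
            domain = domain)
}

st <- strict_sqrt_trans()
st$inverse(c(0, 1.05))   # in range, so this inverts normally

# An expanded lower limit such as -0.05 now raises an error
result <- tryCatch(st$inverse(-0.05),
                   error = function(e) conditionMessage(e))
```

A transformation like this would be expected to fail wherever calling code passes expanded limits to `inverse` without squishing them first, which is exactly what makes it useful for finding those call sites.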
