[BUGZILLA #17216] substring() and propagation of names #6391

Open
MichaelChirico opened this issue May 19, 2020 · 3 comments

@MichaelChirico

Quoting https://stat.ethz.ch/pipermail/r-devel/2014-January/068167.html :

-------------------
On 12/13/2013 01:07 AM, Hervé Pagès wrote:

Hi,

In R < 3.0.0, we used to get:

> substring(c(A="abcdefghij", B="123456789"), 2, 6:2)
A       B       A       B       A
"bcdef"  "2345"   "bcd"    "23"     "b"

But in R >= 3.0.0, we get:

> substring(c(A="abcdefghij", B="123456789"), 2, 6:2)
[1] "bcdef" "2345"  "bcd"   "23"    "b"

The names are not propagated anymore.

Looks like a regression introduced at commit 59891 where many functions
were modified to use rep_len() instead of rep() internally. Problem is,
unlike rep(), rep_len() does not propagate the names. Given the number
of files that were touched by this commit, a lot of functions could be
affected. For example complex():

In R < 3.0.0:

complex(modulus=c(a=-1, b=0.77))
        a        b
 -1.00+0i  0.77+0i

But in R >= 3.0.0:

complex(modulus=c(a=-1, b=0.77))
 [1] -1.00+0i  0.77+0i

etc...

Cheers,
H.


Is this an intended change or a bug? I can't find anything about
this in the NEWS file. The man page for substring() in R >= 3.0.0
still states:

Value:

...

For ‘substring’, a character vector of length the longest of the
arguments.  This will have names taken from ‘x’ (if it has any
after coercion, repeated as needed), and other attributes copied
from ‘x’ if it is the longest of the arguments).

Also note that the first argument of substring() is 'text' not 'x'.

Thanks,
H.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
<CENSORING FROM DETECTED PHONE NUMBER ONWARDS; SEE BUGZILLA>


METADATA

  • Bug author - Suharto Anggono
  • Creation time - 2017-01-21 06:18:31 UTC
  • Bugzilla link
  • Status - ASSIGNED
  • Alias - None
  • Component - Misc
  • Version - R 3.3.*
  • Hardware - All All
  • Importance - P5 minor
  • Assignee - R-core
  • URL - https://stat.ethz.ch/pipermail/r-deve...
  • Modification time - 2020-02-03 08:23 UTC
@MichaelChirico

For the call

substring(c(A="abcdefghij", B="123456789"), 2, 6:2)

there is still a contradiction with the substring() documentation, which says the names will be propagated.
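
Until the behaviour and the documentation agree, code that relies on the names can reattach them by hand. A minimal sketch, assuming a character input and a hypothetical helper named substring_keep_names() (it is not part of R); it recycles the names with rep(), which keeps each name paired with its element:

```r
# Hypothetical helper, not part of base R: calls substring() and then
# recycles the names of `text` to the length of the result.
substring_keep_names <- function(text, first, last = 1000000L) {
  out <- substring(text, first, last)
  nms <- names(text)
  if (!is.null(nms) && length(out) > 0L) {
    # rep() with length.out recycles A, B, A, B, ... as needed
    names(out) <- rep(nms, length.out = length(out))
  }
  out
}

res <- substring_keep_names(c(A = "abcdefghij", B = "123456789"), 2, 6:2)
print(res)
```

When the input is already the longest argument, the helper just reassigns the names it started with, so it should be harmless in that case.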


METADATA

  • Comment author - elin.waring
  • Timestamp - 2020-01-17 02:59:37 UTC

@MichaelChirico

Since this is a backward compatibility break that happened in 2013 or maybe 2017, I think that correcting would potentially cause more problems.
However, the incorrect documentation needs to be fixed.
This could be moved into the documentation category.


METADATA

  • Comment author - elin.waring
  • Timestamp - 2020-02-02 15:40:55 UTC

@MichaelChirico

(In reply to elin.waring from comment #2)

Since this is a backward compatibility break that happened in 2013 or maybe
2017, I think that correcting would potentially cause more problems.
However, the incorrect documentation needs to be fixed. 
This could be moved into the documentation category.

Good question. Thank you for unearthing the issue.
Hervé Pagès's analysis that this happened with commit 59891 is correct: that commit replaced rep() with rep_len() in many functions (on July 19, 2012, not later),
and rep_len() is documented to not return attributes, hence it drops names.
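
The difference between the two functions can be seen directly. A small sketch using only base R; the variable names are illustrative:

```r
x <- c(A = "abcdefghij", B = "123456789")

# rep() recycles the names together with the values
with_rep <- rep(x, length.out = 5)

# rep_len() returns a bare vector: attributes, including names, are dropped
with_rep_len <- rep_len(x, 5)

print(names(with_rep))
print(names(with_rep_len))
```

The values themselves are identical in both results; only the names differ, which is why a function that switched from rep() to rep_len() internally would lose names whenever its first argument had to be recycled.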

You are right, Elin, that fixing it now may cause quite a few problems in checks (e.g. of CRAN and Bioc packages).
OTOH, it is really "not R-like" to drop names in a transformation such as substring() -- or 'complex()' for that matter.
I'd argue we should change back to become "R-like" for consistency.

R-core -- and others: this is an RFC ... maybe one I should even voice on R-devel?


METADATA

  • Comment author - Martin Maechler
  • Timestamp - 2020-02-03 08:23:01 UTC
