Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

format_col/format_list_item printing generics for customization #3414

Merged
merged 14 commits into from
Aug 6, 2021

Conversation

MichaelChirico
Copy link
Member

Successor to @mllg's original PR (I don't have push access there so just continuing here) #3338.

I've built upon @mllg's initial idea to allow two dimensions of print output customization -- one at the column level and one at the row level (for list columns). The latter was covered in #3338.

Currently test 1293 fails:

# Bug #5435 - print.data.table and digits option:
DT <- structure(list(fisyr = 1995:1996, er = list(c(1, 3), c(1, 3)),
    eg = c(0.0197315833926059, 0.0197315833926059), esal = list(
        c(2329.89763779528, 2423.6811023622), c(2263.07456978967,
        2354.16826003824)), fr = list(c(4, 4), c(4, 4)), fg =
c(0.039310363070415,
    0.039310363070415), fsal = list(c(2520.85433070866, 2520.85433070866
    ), c(2448.55449330784, 2448.55449330784)), mr = list(c(5,
    30), c(5, 30)), mg = c(0.0197779376457164, 0.0197779376457164
    ), msal = list(c(2571.70078740157, 4215.73622047244),
c(2497.94263862333,
    4094.82600382409))), .Names = c("fisyr", "er", "eg", "esal",
"fr", "fg", "fsal", "mr", "mg", "msal"), class = c("data.table",
"data.frame"), row.names = c(NA, -2L))

ans1 = capture.output(print(DT, digits=4, row.names=FALSE))
ans2 = c(" fisyr  er      eg      esal  fr      fg      fsal    mr      mg      msal",
         "  1995 1,3 0.01973 2330,2424 4,4 0.03931 2521,2521  5,30 0.01978 2572,4216",
         "  1996 1,3 0.01973 2263,2354 4,4 0.03931 2449,2449  5,30 0.01978 2498,4095")
test(1293, ans1, ans2)

IIUC the problem is that digits is passed on through ... format_list_item and this messes with the output since numerics in lists have digits applied as well.

In a sense this is a more consistent interpretation, so I didn't bother keeping the status quo behavior, but don't want to overwrite the old test behavior without consulting here first.

@mattdowle
Copy link
Member

Can you explain why not toString please? I don't follow the comments that followed #3338 (comment). R 3.1.0 has toString so no backporting is needed.

@MichaelChirico
Copy link
Member Author

Its just shy of being flexible enough; see your comment here:

#2562 (comment)

@mattdowle
Copy link
Member

I sort of see. Just shy meaning the space in collapse=", " in toString.default? This PR is defining our own methods though. Can't they be toString methods rather than our own new class? We will rarely be falling back to toString.default and if we do, we could always mask toString.default with our own that uses collapse=",".

@MichaelChirico
Copy link
Member Author

OK. For some reason I thought toString was somehow recent but apparently it's quite old:

https://github.com/wch/r-source/blame/5a156a0865362bb8381dcd69ac335f5174a4f60c/src/library/base/R/toString.R

I'm still not quite convinced though, as I'm still hung up on the following:

This PR introduces two new generics: format_col and format_list_item. Are we aiming to replace both of these by toString? Seems like it reduces flexibility. Otherwise, we keep just one and then have e.g. format_col and toString for column/list item formatting?

@MichaelChirico
Copy link
Member Author

@mattdowle merged this to master and overwrote changes built in #3500 with what was originally proposed here. I think the approach here is cleaner. Also subsumed the NEWS note there into the item for this PR.

@codecov
Copy link

codecov bot commented May 3, 2019

Codecov Report

Merging #3414 (c57336b) into master (2791043) will decrease coverage by 2.29%.
The diff coverage is 100.00%.

❗ Current head c57336b differs from pull request most recent head 51389d4. Consider uploading reports for the commit 51389d4 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3414      +/-   ##
==========================================
- Coverage   99.47%   97.17%   -2.30%     
==========================================
  Files          75       66       -9     
  Lines       14808    12632    -2176     
==========================================
- Hits        14730    12275    -2455     
- Misses         78      357     +279     
Impacted Files Coverage Δ
R/print.data.table.R 92.30% <100.00%> (-7.70%) ⬇️
R/last.R 61.11% <0.00%> (-38.89%) ⬇️
src/fcast.c 72.88% <0.00%> (-27.12%) ⬇️
R/cedta.R 87.50% <0.00%> (-12.50%) ⬇️
src/fmelt.c 88.63% <0.00%> (-11.37%) ⬇️
R/fcast.R 90.42% <0.00%> (-9.58%) ⬇️
src/frank.c 91.15% <0.00%> (-8.85%) ⬇️
R/utils.R 91.66% <0.00%> (-8.34%) ⬇️
R/bmerge.R 93.22% <0.00%> (-6.78%) ⬇️
src/dogroups.c 93.03% <0.00%> (-6.64%) ⬇️
... and 62 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0820aab...51389d4. Read the comment docs.

@mllg
Copy link
Contributor

mllg commented Jun 27, 2019

Just checking back ... is this PR dead?

FWIW, here is the approach of custom printers for tibbles: https://pillar.r-lib.org/reference/pillar_shaft.html

@randomgambit
Copy link

happy to do some testing when ready. This is a must have

@MichaelChirico
Copy link
Member Author

Related: #605

@mattdowle mattdowle modified the milestones: 1.12.4, 1.13.0 Sep 19, 2019
@mattdowle mattdowle modified the milestones: 1.12.7, 1.12.9 Dec 8, 2019
@MichaelChirico
Copy link
Member Author

OK finally merged & updated this PR against current master. We fail one test:

DT2 = data.table(
  Dcol = as.Date('2016-01-01') + 0:2,
  Pcol = as.POSIXct('2016-01-01 01:00:00', tz = 'UTC') + 86400L*(0:2),
  gcol = TRUE, Icol = as.IDate(16801) + 0:2,
  ucol = `class<-`(1:3, 'asdf')
)
test(1610.2, capture.output(print(DT2, class=TRUE)),
     c("         Dcol                Pcol   gcol       Icol   ucol",
       "       <Date>              <POSc> <lgcl>     <IDat> <asdf>",
       "1: 2016-01-01 2016-01-01 01:00:00   TRUE 2016-01-01      1",
       "2: 2016-01-02 2016-01-02 01:00:00   TRUE 2016-01-02      2",
       "3: 2016-01-03 2016-01-03 01:00:00   TRUE 2016-01-03      3"))

That's because in this PR I removed the timezone argument... I originally did so before a release.

Now I guess we'd have to do a deprecation cycle. Or should we continue to support timezone argument?

Default behavior in this PR is to always print the time zone if it's there.

.ci/ci.R Outdated Show resolved Hide resolved
@MichaelChirico
Copy link
Member Author

MichaelChirico commented May 19, 2020

TODO: address expression column printing:

#4196 (comment)

nvm, it's already done

mattdowle added a commit that referenced this pull request Aug 5, 2021
@mattdowle
Copy link
Member

@MichaelChirico This looks good to me to merge now. You?

char.trunc <- function(x, trunc.char = getOption("datatable.prettyprint.char")) {
trunc.char = max(0L, suppressWarnings(as.integer(trunc.char[1L])), na.rm=TRUE)
if (!is.character(x) || trunc.char <= 0L) return(x)
idx = which(nchar(x) > trunc.char)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not crucial for this PR in particular, but wouldn't nchar(x, 'width') be more appropriate?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Would need a test too in a new PR: your Chinese multi-byte data?

@MichaelChirico
Copy link
Member Author

Made a few tweaks, ready for merge now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants