Lua: let List constructor split string argument on whitespace #9835

Closed
bpj opened this issue Jun 1, 2024 · 3 comments

bpj commented Jun 1, 2024

Describe your proposed improvement and the problem it solves.

A little more DWIMery in the Lua API: let the List constructor, when given a string as its argument, split that string on whitespace, so that, for example, List("foo bar baz") returns a three-element list {"foo", "bar", "baz"}.

This is useful in the (to me at least) common case where you want to create a list of classes, or when you want to loop over a fixed list of strings, e.g. keys of metadata fields to validate.

I'm the first to admit that this is very lazy,[1] but the class field in a table passed to pandoc.Attr already works like this.

Describe alternatives you've considered.

pandoc.Attr({class = string}).classes

works so-so if you already have a string in a variable.

Currently I'm using a function built on string:gmatch('%S+'), but with the check for whether the argument is already table-ish and the loop around gmatch,[2] that's quite a bit of boilerplate in almost every filter I write. (I do have my own utilities library, but when I might share the filter with others I always end up copying the functions I use into the filter file!)

Footnotes

  1. I also admit that I'm missing Perl's @array = qw/foo bar baz/ operator and @array = $string =~ /\S+/g construct!

  2. helper.to_list = function(val, pat)
      if 'table' ~= type(val) then
        local str = tostring(val)
        pat       = tostring(pat or '%S+')
        val       = { }
        for s in str:gmatch(pat) do
          val[#val + 1] = s
        end
      end
      return pandoc.List(val)
    end
    

    As you can see, this function does a bit more, in that it allows a custom pattern, but I'm not asking for that!

bpj added the enhancement label Jun 1, 2024
Collaborator

tarleb commented Jun 1, 2024

I wouldn't want to modify pandoc.List, but I'd be open to adding a pandoc.text.split function, e.g., by wrapping Data.Text.splitOn.

Collaborator

tarleb commented Jun 2, 2024

Alternative: we could also make it easier to turn the gmatch iterator into a list:

pandoc.List.from_iterator(str:gmatch '%S+')

That could also be used with other iterators like io.lines, file:lines, pairs, etc.

local my_keys = pandoc.List.from_iterator(pairs(my_table))

We could probably overload pandoc.List for extra convenience.

pandoc.List(str:gmatch '%S+')
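
To illustrate the idea outside of pandoc, here is a minimal plain-Lua sketch of collecting an iterator into a table. The local function from_iterator below is a hypothetical stand-in written for this example; it is not pandoc's implementation and returns an ordinary table rather than a pandoc.List. It assumes that only the first value of each iteration step is kept, which is why pairs yields the keys.

```lua
-- Hypothetical helper, not pandoc's implementation: collect the first
-- value produced at each iteration step into a plain Lua table.
local function from_iterator(iter, state, control)
  local result = {}
  -- The generic `for` accepts the usual iterator triple directly.
  for value in iter, state, control do
    result[#result + 1] = value
  end
  return result
end

-- Splitting on whitespace via gmatch, as in the original request.
local words = from_iterator(("foo bar baz"):gmatch("%S+"))

-- Collecting table keys; pairs has no defined order, so sort afterwards.
local keys = from_iterator(pairs({ a = 1, b = 2 }))
table.sort(keys)
```

Because the helper takes the iterator triple as separate arguments, any call that returns an iterator (gmatch, pairs, io.lines) can be passed straight through without wrapping.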

tarleb closed this as completed in 05aa184 Sep 21, 2024
Author

bpj commented Sep 21, 2024

Great! 👍 The Penlight List class constructor does something similar by calling its iterator generator on anything that isn't a table. That can actually be nasty, since a string becomes a list of bytes, which is easy to forget. Taking an iterator as an argument is much better!
