|
| 1 | +# Namespace for Builtin Functions |
| 2 | + |
| 3 | +## Preamble |
| 4 | + |
| 5 | + Author: Paul Evans <PEVANS> |
| 6 | + Sponsor: |
| 7 | + ID: 0009 |
| 8 | + Status: Exploratory |
| 9 | + |
| 10 | +## Abstract |
| 11 | + |
| 12 | +Define a mechanism by which code can request functions that are provided directly by the core interpreter. |
| 13 | + |
| 14 | +## Motivation |
| 15 | + |
| 16 | +Utility functions are currently requested with the `use` statement, either to import symbols from a utility module (some of which ship with core perl; many more are available on CPAN) or to enable them as named features with the `use feature` pragma module. |
| 17 | + |
| 18 | +```perl |
| 19 | +use List::Util qw( first any sum ); |
| 20 | +use Scalar::Util qw( refaddr reftype ); |
| 21 | +use feature qw( say fc ); |
| 22 | +... |
| 23 | +``` |
| 24 | + |
| 25 | +In this case, `say` and `fc` are provided by the core interpreter but everything else comes from other files loaded from disk. The `Scalar::Util` and `List::Util` modules happen to be dual-life shipped with core perl, but many others exist on CPAN. |
| 26 | + |
| 27 | +This document proposes a new mechanism for providing commonly-used functions that should be considered part of the language: |
| 28 | + |
| 29 | +```perl |
| 30 | +use builtin qw( reftype ); |
| 31 | +say "The reference type of ref is ", reftype($ref); |
| 32 | +``` |
| 33 | + |
| 34 | +or |
| 35 | + |
| 36 | +```perl |
| 37 | +# on a suitably-recent perl version |
| 38 | +say "The reference type of ref is ", builtin::reftype($ref); |
| 39 | +``` |
| 40 | + |
| 41 | +**Note:** This proposal largely concerns itself with the overall *mechanism* used to provide these functions, and expressly does not go into a full detailed list of individual proposed functions that ought to be provided. A short list is given containing a few likely candidates to start with, in order to experiment with the overall idea. Once found stable, it is expected that more functions can be added on a case-by-case basis, whether through the RFC process or not, as individual cases require. In any case, it is anticipated that this list would be maintained on an ongoing basis as the language continues to evolve, with more functions being added at every release. |
| 42 | + |
| 43 | +## Rationale |
| 44 | + |
| 45 | +While true syntax additions such as infix operators or control-flow keywords have often been added to perl over the years, there has been little advancement in regular functions - or at least, operators that appear to work as regular functions. Where they have been added, the `use feature` mechanism has been used to enable them. This has two notable downsides: |
| 46 | + |
| 47 | +* It confuses users, by conflating the control-flow or true-syntax named features (such as `try` or `postderef`), with ones that simply add keywords that look and feel like regular functions (such as `fc`). |
| 48 | + |
| 49 | +* Because `feature` is a core-shipped module that is part of the interpreter it cannot be dual-life shipped to CPAN, so newly-added functions cannot easily be provided to older perls. |
| 50 | + |
| 51 | +As there have not been many new regular functions added to the core language itself, the preferred mechanism thus far has been to add functions to core-shipped dual-life modules such as [Scalar::Util](https://metacpan.org/pod/Scalar::Util). This itself is not without its downsides. Although it is possible to import functions lexically, almost no modules do this, opting instead for a package-level import into the caller. This has the effect of making every imported utility function visible from the caller's namespace - which can be especially problematic if the caller is attempting to provide a class with methods: |
| 52 | + |
| 53 | +```perl |
| 54 | +package A::Class { |
| 55 | + use List::Util 'sum'; |
| 56 | + ... |
| 57 | +} |
| 58 | + |
| 59 | +say A::Class->new->sum(1, 2, 3); # inadvertently visible method |
| 60 | + |
| 61 | +# This will result in some large number being printed, because the |
| 62 | +# List::Util::sum() function will be invoked with four arguments - the |
| 63 | +# numerical value of the new instance reference, in addition to the three |
| 64 | +# small integers given. |
| 65 | + |
| 66 | +``` |
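The leak described above can be checked directly with `can`. The following is a minimal runnable sketch; the `new` constructor and `total` method are hypothetical additions for illustration only and are not part of this proposal.

```perl
use strict;
use warnings;

package A::Class {
    use List::Util 'sum';    # package-level import into A::Class

    # Hypothetical constructor and method, added only for illustration
    sub new   { my $class = shift; return bless {}, $class }
    sub total { my $self  = shift; return sum(@_) }
}

my $obj = A::Class->new;

# The intended use: sum() called as a plain function inside the class
my $total = $obj->total(1, 2, 3);

# The unintended side effect: sum() can also be resolved as a method
my $leaked = A::Class->can('sum') ? 1 : 0;

print "total=$total leaked=$leaked\n";
```

Here `$leaked` is true because the imported function sits in the `A::Class` symbol table, which is exactly where method resolution looks.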
| 67 | + |
| 68 | +A related issue here is that the process of adding new named operators to the perl language is very involved and requires a lot of steps - many updates to the lexer, parser, core list of opcodes, and so on. This creates a high barrier-to-entry for any would-be implementors who wish to provide a new regular function. |
| 69 | + |
| 70 | +## Specification |
| 71 | + |
| 72 | +This document proposes an implementation formed from two main components as part of core perl: |
| 73 | + |
| 74 | +1. A new package namespace, `builtin::`, which is always available in the `perl` interpreter (and thus acts in a similar fashion to the existing `CORE::` and `Internals::` namespaces). |
| 75 | + |
| 76 | +2. A new pragma module, `use builtin`, which lexically imports functions from the builtin namespace into its calling scope. |
| 77 | + |
| 78 | +As named pragma modules are currently implemented by the same file import mechanism as regular modules, this necessitates the creation of a `builtin.pm` file to contain at least part of the implementation - perhaps the `&builtin::import` function itself. This being the case, a third component can be created: |
| 79 | + |
| 80 | +3. A new module `builtin.pm` which can be dual-life shipped as a CPAN distribution called `builtin`, to additionally contain an XS implementation of the provided functions. |
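Because `use Module LIST` behaves like `BEGIN { require Module; Module->import(LIST); }`, the pragma's behaviour lives in its `import` method. A rough sketch of that dispatch follows, using a hypothetical stand-in package `mypragma` defined inline. This is an assumption-laden illustration: a real pragma would live in its own `.pm` file, and a real `builtin` import would install lexical names rather than merely record the request.

```perl
use strict;
use warnings;

package mypragma {    # hypothetical stand-in, not the proposed module
    our @requested;

    # Called at compile time by `use mypragma LIST`
    sub import {
        my ($class, @names) = @_;
        push @requested, @names;
        return;
    }
}

# Mark the package as already loaded so `use` does not search for a file
BEGIN { $INC{'mypragma.pm'} = __FILE__ }

use mypragma qw( reftype blessed );

my $requested = join ',', @mypragma::requested;
print "requested: $requested\n";
```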
| 81 | + |
| 82 | +This combination of components has the following properties: |
| 83 | + |
| 84 | +* By use of the new package namespace, code written for a sufficiently-new version of perl can already make use of the provided functions by their fully-qualified name: |
| 85 | + |
| 86 | +```perl |
| 87 | +say "The reference type of anonymous arrays is ", builtin::reftype([]); |
| 88 | +``` |
| 89 | + |
| 90 | +* Code can be written which imports these functions to act as regular named functions, in a similar way familiar from utility modules like `Scalar::Util`: |
| 91 | + |
| 92 | +```perl |
| 93 | +use builtin 'reftype'; |
| 94 | +say "The reference type of anonymous arrays is ", reftype([]); |
| 95 | +``` |
| 96 | + |
| 97 | +* Named functions are imported lexically into the calling block, not symbolically into the calling package. This does differ from the behaviour of traditional utility function modules, but more closely matches the expectations from other pragma modules such as `feature`, `strict` and `warnings`. Overall it is felt this is justified by its lowercase name suggesting its status as a special pragma module. |
| 98 | + |
| 99 | +* Object classes do not inadvertently expose them all as named methods: |
| 100 | + |
| 101 | +```perl |
| 102 | +package A::Class { |
| 103 | + use builtin 'sum'; |
| 104 | + ... |
| 105 | +} |
| 106 | + |
| 107 | +say A::Class->new->sum(1, 2, 3); # results in the usual "method not found" exception behaviour |
| 108 | +``` |
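The scoping difference can be approximated on existing perls with a lexical subroutine (`my sub`, available by default since perl 5.26). This is only a hand-written emulation of lexical visibility, not the proposed implementation; the class name `Another::Class` and its `new` and `total` methods are hypothetical.

```perl
use v5.26;
use warnings;

package Another::Class {
    use List::Util ();    # load the module, import nothing

    # Lexical wrapper: visible only within this block, never in the
    # package symbol table
    my sub sum { return List::Util::sum(@_) }

    # Hypothetical constructor and method, added only for illustration
    sub new   { my $class = shift; return bless {}, $class }
    sub total { my $self  = shift; return sum(@_) }
}

my $obj    = Another::Class->new;
my $total  = $obj->total(1, 2, 3);
my $leaked = Another::Class->can('sum') ? 1 : 0;

say "total=$total leaked=$leaked";
```

The function remains usable inside the block, but method resolution cannot find it because nothing was added to the package.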
| 109 | + |
| 110 | +Although this document does not wish to fully debate the set of functions actually provided, some initial set is required in order to bootstrap the process and experiment with the mechanism. Rather than proposing any new functions with unclear design, I would recommend sticking simply to copying existing widely-used functions that already ship with core perl in utility modules: |
| 111 | + |
| 112 | +* From `Scalar::Util`, copy `blessed`, `refaddr`, `reftype`, `weaken`, `isweak`. |
| 113 | + |
| 114 | +* From `Internals`, copy `getcwd` (because it is used by some core unit tests, and it would be nice to remove it from the `Internals` namespace where it ought never have been in the first place). |
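For reference, the `Scalar::Util` candidates listed above behave as follows when imported from their current home. This is a small sketch using the existing module; the class name `Some::Class` is made up for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw( blessed refaddr reftype weaken isweak );

my $obj = bless [], 'Some::Class';

my $class = blessed($obj);    # package the reference is blessed into
my $type  = reftype($obj);    # underlying type, ignoring the blessing

my $copy = $obj;
my $same = refaddr($obj) == refaddr($copy) ? 1 : 0;    # same referent

weaken($copy);                # $copy no longer keeps the array alive
my $weak = isweak($copy) ? 1 : 0;

print "class=$class type=$type same=$same weak=$weak\n";
```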
| 115 | + |
| 116 | +## Backwards Compatibility |
| 117 | + |
| 118 | +This proposal does not introduce any new syntax or behavioural change, aside from a new namespace for functions and a new pragma module. As previous perl versions do not have a `builtin::` namespace nor a `use builtin` pragma module, no existing code will be written expecting to make use of them. Thus no compatibility concerns are expected. |
| 119 | + |
| 120 | +As a related note, by creating a dual-life distribution containing the `builtin.pm` pragma module along with a polyfill implementation of any functions it ought to contain, this can be shipped to CPAN in order to allow code written using this new mechanism to be at least partly supported by older perl versions. Because the pragma still works as a regular module, code written using the `use builtin ...` syntax would work as intended on older versions of perl if the dual-life `builtin` distribution is installed. |
| 121 | + |
| 122 | +## Security Implications |
| 123 | + |
| 124 | +No security implications are anticipated from the builtin function mechanism itself. However, each function that is added will have to be considered individually. |
| 125 | + |
| 126 | +## Examples |
| 127 | + |
| 128 | +## Prototype Implementation |
| 129 | + |
| 130 | +None yet. |
| 131 | + |
| 132 | +## Future Scope |
| 133 | + |
| 134 | +As this proposal does not go into a full list of what specific functions might be provided by the mechanism, this is the main area to address in future. As a suggestion, I would make the following comments: |
| 135 | + |
| 136 | +* Most of the `Scalar::Util` functions should be candidates. |
| 137 | + |
| 138 | +* Most of the `List::Util` functions that do not act as higher-order functionals can probably be included. This would be functions like `sum`, `max`, `pairs`, `uniq`, `head`, etc. The higher-order functionals such as `reduce` or its specialisations like `first` and `any` would not be candidates, because of their "block-like function as first argument" parsing behaviour at compiletime. |
| 139 | + |
| 140 | +* Some of the `POSIX` functions that act abstractly as in-memory data utilities, such as `ceil` and `floor`. I would not recommend adding the bulk of the operating system interaction functions from POSIX. |
| 141 | + |
| 142 | +* There are other RFCs or Pre-RFC discussions that suggest adding new named functions that would be good candidates for this module. They can be considered on their own merit, by reference to this RFC. At time of writing this may include new functions to handle core-supported boolean types (RFC 0008) or the new module-loading function (RFC 0006). |
| 143 | + |
| 144 | +* Once a stable set of functions is defined, consider creating version-numbered bundles in a similar theme to those provided by `feature.pm`: |
| 145 | + |
| 146 | +```perl |
| 147 | +use builtin ':5.40'; # imports all those functions defined by perl v5.40 |
| 148 | +``` |
| 149 | + |
| 150 | +* Once version-numbered bundles exist, consider whether the main `use VERSION` syntax should also enable them; i.e. |
| 151 | + |
| 152 | +```perl |
| 153 | +use v5.40; # Does this imply use builtin ':5.40'; ? |
| 154 | +``` |
| 155 | + |
| 156 | +## Rejected Ideas |
| 157 | + |
| 158 | +### Multiple Namespaces |
| 159 | + |
| 160 | +An initial discussion had been to consider giving multiple namespaces for these functions to live in, such as `string::` or `ref::`. That was eventually rejected as being overly complex, and inviting a huge number of new functions. By sticking to a single namespace for all regular functions, we apply a certain amount of constraining pressure to limit the number of such functions that are provided. |
| 161 | + |
| 162 | +## Open Issues |
| 163 | + |
| 164 | +### Package Name |
| 165 | + |
| 166 | +What is the package name these functions are provided in? |
| 167 | + |
| 168 | +The discussion above used `builtin::`. Other proposed suggestions include `function::` or `std::`. |
| 169 | + |
| 170 | +### Pragma Module Name |
| 171 | + |
| 172 | +What is the module name for the lexical-import pragma? |
| 173 | + |
| 174 | +The discussion above used `use builtin`, to match the package name. Technically it does not have to match the package name. In particular, if during implementation or initial use it is found to be problematic that the name does match, the import module could use a plural form of the same word, as in: |
| 175 | + |
| 176 | +```perl |
| 177 | +use builtins qw( function names here ); |
| 178 | +``` |
| 179 | + |
| 180 | +### Version Numbering |
| 181 | + |
| 182 | +How to version number the `builtin` pragma module? |
| 183 | + |
| 184 | +This becomes especially interesting when considering the dual-life module distribution provided as a polyfill for older perls. |
| 185 | + |
| 186 | +A case can be made that its version number should match the version of perl itself for which it provides polyfill functions. Thus, code could write: |
| 187 | + |
| 188 | +```perl |
| 189 | +use builtin v5.40; |
| 190 | +# and now all the builtin:: functions from perl v5.40 are available |
| 191 | +``` |
| 192 | + |
| 193 | +This does initially seem attractive, until one considers the possibility that a dual-life implementation of these polyfills might contain bugs that require later revisions to fix. How would the version numbering of the dual-life distribution reflect the fact that the implementation contains a bugfix on top of these? |
| 194 | + |
| 195 | +### Polyfill for Unavailable Semantics |
| 196 | + |
| 197 | +While not directly related to the question of how to provide builtin functions to new perls, by offering to provide a dual-life module on CPAN as a polyfill for older perl releases, the question arises of what to do if older perls cannot support the semantics of a provided function. The current suggestion of copying existing functions out of places like `Scalar::Util` does not cause this problem, but when we consider some of the additional RFCs we run into some more complex edge-cases. |
| 198 | + |
| 199 | +For example, RFC 0008 proposes adding new functions `true` and `false` to provide real language-level boolean values, and an `isbool` predicate function to enquire whether a given value has boolean intention. The first two can be easily provided on older perls, but polyfilling this latter function is not possible, because the question of "does this value have boolean intention?" is not a meaningful question to ask on such perls. There are a number of possible ways to handle this situation: |
| 200 | + |
| 201 | +1. Refuse to import the symbol - `use builtin 'isbool'` would fail at compiletime |
| 202 | + |
| 203 | +2. Import, but refuse to invoke the function - `if( isbool $x ) { ... }` would throw an exception |
| 204 | + |
| 205 | +3. Give a meaningful but inaccurate answer - `isbool $x` would always return false, as the concept of "boolean intention" does not exist here |
| 206 | + |
| 207 | +Each of these could be argued as the correct behaviour. While it is not directly a question this RFC needs to answer, it is at least acknowledged that some added polyfill functions would have this question, and it would be encouraged that all polyfilled functions should attempt to act as consistently as reasonably possible in this regard. |
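The three options can be contrasted in a small sketch. Everything here is hypothetical: the package `Hypothetical::Polyfill` and its function names are invented for the comparison and do not describe how any real polyfill distribution behaves.

```perl
use strict;
use warnings;

package Hypothetical::Polyfill {
    use Carp 'croak';

    # Functions whose semantics this (older) perl cannot support
    my %UNSUPPORTED = ( isbool => 1 );

    # Option 1: refuse at import time
    sub import_strict {
        my ($class, @names) = @_;
        for my $name (@names) {
            croak "'$name' cannot be provided on this perl"
                if $UNSUPPORTED{$name};
        }
        return 1;
    }

    # Option 2: import succeeds, but calling the function throws
    sub isbool_throwing { croak "isbool is not supported on this perl" }

    # Option 3: always answer false, since no value can have
    # boolean intention on this perl
    sub isbool_always_false { return !!0 }
}

my $opt1 = eval { Hypothetical::Polyfill->import_strict('isbool'); 1 }
    ? 'imported' : 'refused';
my $opt2 = eval { Hypothetical::Polyfill::isbool_throwing(1); 1 }
    ? 'returned' : 'threw';
my $opt3 = Hypothetical::Polyfill::isbool_always_false(1)
    ? 'true' : 'false';

print "option 1: $opt1, option 2: $opt2, option 3: $opt3\n";
```

Option 1 surfaces the problem earliest, option 2 defers it to the first call, and option 3 never signals it at all, which is the trade-off a consistent policy would need to settle.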
| 208 | + |
| 209 | +## Copyright |
| 210 | + |
| 211 | +Copyright (C) 2021, Paul Evans. |
| 212 | + |
| 213 | +This document and code and documentation within it may be used, redistributed and/or modified under the same terms as Perl itself. |