Skip to content

Conversation

@dumbbell
Copy link
Collaborator

@dumbbell dumbbell commented Feb 1, 2022

Let's take the following code:

-export([outer_function/2]).

outer_function(#my_record{field = ...} = MyRecord, Term) ->
    (...),
    inner_function(MyRecord, Term);
outer_function(_, _) ->
    (...).

inner_function(#my_record{field = ...} = MyRecord, Term) ->
    (...);
inner_function(_, _) ->
    (...).

Both outer_function/2 and inner_function/2 take a tuple as their first argument and do a pattern matching with it. outer_function/2 is exported and inner_function/2 is only called by outer_function/2.

In this situation, the compiler may perform an optimization on inner_function/2: it is sure that the function will take a tuple of the expected form because it was already checked in outer_function/2 and it is the only caller. Therefore, it will skip any verifications and will read the content of the tuple directly using the get_tuple_element instruction.

When compiled from source, the compiler adds annotations of the following form at the beginning of inner_function/2 assembly:

{'%', {var_info, {x,0}, [{type, {t_tuple, 2, true, #{...}}}]}}.

However, the assembly we recover from an already compiled module misses those annotations. And when we compile the same assembly without those annotations, the compilation fails with:

{bad_type,
 {needed, {t_tuple, _Size, _, _Fields} = NeededType},
 {actual, any}}

This is caused by checks that we are only passing tuples to a get_tuple_element instruction for instance.

This patch handle this compilation failure by forging a var_info annotation based on the error message and adding it at the beginning of the failing function. The compilation is then retried.

The recreated annotation doesn't exactly match the original one. But it is sufficient to please the compiler. I make the assumption this is fine because the code we extract was already compiled successfully once. Therefore, there should be no situation where for example get_tuple_element would be called with something else than a tuple.

If the annotated code still fails to compile with a similar error for the same variable/register annotation, the extraction aborts. This is to prevent infinite retry loops because of an incorrectly forged var_info.

This added error handling matches specifically the situation described above for now. We may add more in the future based on real situations we encounter.

Let's take the following code:

    -export([outer_function/2]).

    outer_function(#my_record{field = ...} = MyRecord, Term) ->
        (...),
        inner_function(MyRecord, Term);
    outer_function(_, _) ->
        (...).

    inner_function(#my_record{field = ...} = MyRecord, Term) ->
        (...);
    inner_function(_, _) ->
        (...).

Both `outer_function/2` and `inner_function/2` take a tuple as their
first argument and do a pattern matching with it. `outer_function/2` is
exported and `inner_function/2` is only called by `outer_function/2`.

In this situation, the compiler may perform an optimization on
`inner_function/2`: it is sure that the function will take a tuple of
the expected form because it was already checked in `outer_function/2`
and it is the only caller. Therefore, it will skip any verifications and
will read the content of the tuple directly using the
`get_tuple_element` instruction.

When compiled from source, the compiler adds annotations of the
following form at the beginning of `inner_function/2` assembly:

    {'%', {var_info, {x,0}, [{type, {t_tuple, 2, true, #{...}}}]}}.

However, the assembly we recover from an already compiled module misses
those annotations. And when we compile the same assembly without those
annotations, the compilation fails with:

    {bad_type,
     {needed, {t_tuple, _Size, _, _Fields} = NeededType},
     {actual, any}}

This is caused by checks that we are only passing tuples to a
`get_tuple_element` instruction for instance.

This patch handles this compilation failure by forging a `var_info`
annotation based on the error message and adding it at the beginning of
the failing function. The compilation is then retried.

The recreated annotation doesn't exactly match the original one. But it
is sufficient to please the compiler. I make the assumption this is fine
because the code we extract was already compiled successfully once.
Therefore, there should be no situation where for example
`get_tuple_element` would be called with something else than a tuple.

If the annotated code still fails to compile with a similar error for
the same variable/register annotation, the extraction aborts. This is to
prevent infinite retry loops because of an incorrectly forged `var_info`.

This added error handling matches specifically the situation described
above for now. We may add more in the future based on real situations we
encounter.
An external unexported function is `fun module:function/0` when `module`
has `function/0` but doesn't export it.

We can't allow to extract this type of function because its assembly
code might not work outside of the context it was compiled for. For
instance, it could have some optimizations which make Erlang crash
because they only work in the original module.

Without this, extracting and executing `inner_function/2` in the
testsuite added in the previous commit woudl cause Erlang to termine
with a Segmentation fault.
@dumbbell dumbbell added the enhancement New feature or request label Feb 1, 2022
@dumbbell dumbbell self-assigned this Feb 1, 2022
@dumbbell dumbbell added bug Something isn't working and removed enhancement New feature or request labels Feb 1, 2022
@coveralls
Copy link

coveralls commented Feb 1, 2022

Pull Request Test Coverage Report for Build 267

  • 21 of 27 (77.78%) changed or added relevant lines in 1 file are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.03%) to 81.637%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/khepri_fun.erl 21 27 77.78%
Files with Coverage Reduction New Missed Lines %
src/khepri_tx.erl 1 88.33%
Totals Coverage Status
Change from base Build 264: -0.03%
Covered Lines: 1267
Relevant Lines: 1552

💛 - Coveralls

@dumbbell dumbbell marked this pull request as ready for review February 2, 2022 08:49
@dumbbell dumbbell merged commit fd9bb17 into main Feb 2, 2022
@dumbbell dumbbell deleted the add-var_info-to-assembly branch February 2, 2022 08:49
the-mikedavis added a commit to the-mikedavis/khepri that referenced this pull request Feb 4, 2022
Given some code like the following:

    trim_leading_dash1(<<$-, Rest/binary>>) -> trim_leading_dash1(Rest);
    trim_leading_dash1(Binary)              -> Binary.

The compiler will add an annotation for the {x,0} register

    {'%',{var_info,{x,0},[accepts_match_context]}}.

This annotation is not available in the disasm, so when we recompile
this function, the compiler gives an error.

This change detects which x-register should accept the match context
by peeking the first instruction after the entrypoint label. This
instruction can either be a {test,bs_start_match3,..} instruction or
bs_start_match4 (if the compiler can determine that the function will
_always_ be passed a binary), both of which contain the register
which will be matched.

closes rabbitmq#44
see also rabbitmq#36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants