Skip to content

Trade-offs in Control Flow Analysis #9998

Open
@RyanCavanaugh

Description

Some rough notes from a conversation @ahejlsberg and I had earlier about trade-offs in the control flow analysis work based on running the real-world code (RWC) tests. For obvious reasons I'll be comparing what Flow does with similar examples to compare and contrast possible outcomes.

The primary question is: When a function is invoked, what should we assume its side effects are?

One option is to be pessimistic and reset all narrowings, assuming that any function might mutate any object it could possibly get its hands on. Another option is to be optimistic and assume the function doesn't modify any state. Both of these seem to be bad.

This problem spans both locals (which might be subject to some "closed over or not" analysis) and object fields.

Optimistic: Bad behavior on locals

The TypeScript compiler has code like this:

enum Token { Alpha, Beta, Gamma }
let token = Token.Alpha;
function nextToken() {
    token = Token.Beta;
}
function maybeNextToken() {
    if (... something ...) {
        nextToken();
    }
}

function doSomething() {
    if (token !== Token.Alpha) {
        maybeNextToken();
    }
    // is this possible?
    if (token === Token.Alpha) {
        // something happens
    }
}

Optimistically assuming token isn't modified by maybeNextToken incorrectly flags token === Token.Alpha as an impossibility. However, in other cases, this is a good check to do! See later examples.

Optimistic: Bad behavior on fields

The RWC suite picked up a "bug" that looked like this:

// Function somewhere else
declare function tryDoSomething(x: string, result: { success: boolean; value: number; }): void;

function myFunc(x: string) {
    let result = { success: false, value: 0 };

    tryDoSomething(x, result);
    if (result.success === true) { // %%
        return result.value;
    }

    tryDoSomething(x.trim(), result);
    if (result.success === true) { // ??
        return result.value;
    }
    return -1;
}

The ?? line here is not a bug in the user code, but we thought it was, because after the %% block runs, the only remaining value in result.success's domain is false.

Pessimistic: Bad behavior on locals

We found actual bugs (several!) in partner code that looked like this:

enum Kind { Good, Bad, Ugly }
let kind: Kind = ...;
function f() {
    if (kind) {
        log('Doing some work');
        switch (kind) {
            case Kind.Good:
                // unreachable!
        }
    }
}

Here, we detected the bug that Kind.Good (which has the falsy value 0) is not in the domain of kind at the point of the case label. However, if we were fully pessimistic, we couldn't know that the global function log doesn't modify the global variable kind, thus incorrectly allowing this broken code.

Pessimistic: Bad behavior on fields, example 1

A question on flowtype SO is a good example of this

A smaller example that demonstrates the behavior:

function fn(arg: { x: string | null }) {
    if (arg.x !== null) {
        alert('All is OK!');
        // Flow: Not OK, arg.x could be null
        console.log(arg.x.substr(3));
    }
}

The problem here is that, pessimistically, something like this might be happening:

let a = { x: 'ok' };
function alert() {
    a.x = null;
}
fn(a);

Pessimistic: Bad behavior on fields, example 2

The TS compiler has code that looks like this (simplified):

function visitChildren(node: Node, visit: (node: Node) => void) {
    switch(node.kind) {
        case SyntaxKind.BinaryExpression:
            visit(node.left);
            visit(node.right); // Unsafe?
            break;
        case SyntaxKind.PropertyAccessExpression:
            visit(node.expr);
            visit(node.name); // Unsafe?
            break;
    }
}

Here, we discriminated the Node union type by its kind. A pessimistic behavior would say that the second invocations are unsafe, because the call to visit may have mutated node.kind through a secondary reference and invalidated the discrimination.

Mitigating with (shallow) inlining / analysis

Flow does some assignment analysis to improve the quality of these errors, but it's obviously short of a full inlining solution, which wouldn't be even remotely practical. Some examples of how to defeat the analysis:

// Non-null assignment can still trigger null warnings
function fn(x: string | null) {
    function check1() {
        x = 'still OK';
    }

    if (x !== null) {
        check1();
        // Flow: Error, x could be null
        console.log(x.substr(0));
    }
}
// Inlining is only one level deep
function fn(x: string | null) {
    function check1() {
        check2();
    }
    function check2() {
        x = null;
    }

    if (x !== null) {
        check1();
        // Flow: No error
        console.log(x.substr(0)); // crashes
    }
}

Mitigating with const parameters

A low-hanging piece of fruit is to allow a const modifier on parameters. This would allow a much faster fix for code that looks like this:

function fn(const x: string | number) {
  if (typeof x === 'string') {
    thisFunctionCannotMutateX();
    x.substr(0); // ok
  }
}

Mitigating with readonly fields

The visitChildren example above might be mitigated by saying that readonly fields retain their narrowing effects even in the presence of intervening function calls. This is technically unsound as you may have both a readonly and non-readonly alias to the same property, but in practice this is probably very rare.

Mitigating with other options

Random ideas that got thrown out (will add to this list) but are probably bad?

  • pure modifier on functions that says this function doesn't modify anything. This is a bit impractical as we'd realistically want this on the vast majority of all functions, and it doesn't really solve the problem since lots of functions only modify one thing so you'd really want to say "pure except for m"
  • volatile property modifier that says this "this property will change without notice". We're not C++ and it's perhaps unclear where you'd apply this and where you wouldn't.

Metadata

Assignees

No one assigned

    Labels

    DiscussionIssues which may not have code impact

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions