__call__ not supported #24

Closed
khatchad opened this issue Jan 20, 2023 · 11 comments · Fixed by ponder-lab/ML#58 or #108

@khatchad
Collaborator

__call__ doesn't show up in the call graph.

@khatchad
Collaborator Author

Suppose we have this input code stored in A.py:

import tensorflow as tf


# Create an override model to classify pictures
class SequentialModel(tf.keras.Model):

  def __init__(self, **kwargs):
    super(SequentialModel, self).__init__(**kwargs)

    self.flatten = tf.keras.layers.Flatten(input_shape=(28, 28))

    # Add a lot of small layers
    num_layers = 100
    self.my_layers = [tf.keras.layers.Dense(64, activation="relu")
                      for n in range(num_layers)]

    self.dropout = tf.keras.layers.Dropout(0.2)
    self.dense_2 = tf.keras.layers.Dense(10)

  def __call__(self, x):
    x = self.flatten(x)

    for layer in self.my_layers:
      x = layer(x)

    x = self.dropout(x)
    x = self.dense_2(x)

    return x


if __name__ == '__main__':
    input_data = tf.random.uniform([20, 28, 28])
    print("Input:")
    print(type(input_data))
    print(input_data)

    model = SequentialModel()
    result = model(input_data)

    print("Output:")
    print(type(input_data))
    print(result)

The problematic expression above is result = model(input_data), because that implicitly invokes SequentialModel.__call__. When building a call graph, I am not even seeing a node for __call__, which is strange. Perhaps it only gets created if there is a call to the function? I would expect to see a method reference of < PythonLoader, Lscript A.py/SequentialModel/__call__, do()LRoot; > somewhere in the following CG nodes:
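As a minimal sketch of the Python semantics involved (plain Python, no TensorFlow; the `Doubler` class is hypothetical and exists only for illustration): calling an instance dispatches to `__call__` as looked up on the instance's type, so the three spellings below should all reach the same function.

```python
class Doubler:
    """Hypothetical stand-in for SequentialModel; not part of A.py."""

    def __call__(self, x):
        return 2 * x


obj = Doubler()

implicit = obj(21)                      # same shape as `model(input_data)`
explicit = obj.__call__(21)             # explicit spelling of the same call
via_type = type(obj).__call__(obj, 21)  # roughly how the interpreter dispatches

print(implicit, explicit, via_type)
```

A call graph builder that models only explicit attribute calls would see the second form but not the first, which matches the symptom described here.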

Node: synthetic < PythonLoader, Lcom/ibm/wala/FakeRootClass, fakeRootMethod()V > Context: Everywhere
Node: synthetic < PythonLoader, Lcom/ibm/wala/FakeRootClass, fakeWorldClinit()V > Context: Everywhere
Node: <Code body of function Lscript A.py> Context: CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@2 ]
Node: synthetic < PythonLoader, Ltensorflow, import()Ltensorflow; > Context: CallStringContext: [ script A.py.do()LRoot;@88 ]
Node: synthetic < PythonLoader, Lwala/builtin/type, do()LRoot; > Context: CallStringContext: [ script A.py.do()LRoot;@114 ]
Node: synthetic < PythonLoader, Lscript A.py/SequentialModel, do()LRoot; > Context: CallStringContext: [ script A.py.do()LRoot;@117 ]
Node: synthetic < PythonLoader, Lwala/builtin/type, do()LRoot; > Context: CallStringContext: [ script A.py.do()LRoot;@122 ]
Node: <Code body of function Lscript A.py/SequentialModel/__init__> Context: CallStringContext: [ script A.py.SequentialModel.do()LRoot;@12 ]
Node: synthetic < PythonLoader, Lsuperfun, do()LRoot; > Context: DelegatingContext [A=super call, B=CallStringContext: [ script A.py.SequentialModel.__init__.do()LRoot;@5 ]]
Node: synthetic < PythonLoader, Lwala/builtin/range, do()LRoot; > Context: CallStringContext: [ script A.py.SequentialModel.__init__.do()LRoot;@25 ]
Node: synthetic < PythonLoader, LCodeBody, __Lscript A.py/SequentialModel/__init__/comprehension1()LRoot; > Context: CallStringContext: [ script A.py.SequentialModel.__init__.do()LRoot;@26 ]
Node: <Code body of function Lscript A.py/SequentialModel/__init__/comprehension1> Context: CallStringContext: [ CodeBody.__Lscript A.py/SequentialModel/__init__/comprehension1()LRoot;@2 ]
Node: synthetic < PythonLoader, Ltensorflow/functions/uniform, do()LRoot; > Context: CallStringContext: [ script A.py.do()LRoot;@111 ]
Node: synthetic < PythonLoader, Ltensorflow/functions/uniform, read_data()LRoot; > Context: CallStringContext: [ tensorflow.functions.uniform.do()LRoot;@0 ]

@khatchad
Collaborator Author

If I change the input file to use result = model.__call__(input_data) instead, I am seeing these nodes:

Node: synthetic < PythonLoader, Lcom/ibm/wala/FakeRootClass, fakeRootMethod()V > Context: Everywhere
Node: synthetic < PythonLoader, Lcom/ibm/wala/FakeRootClass, fakeWorldClinit()V > Context: Everywhere
Node: <Code body of function Lscript A.py> Context: CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@2 ]
Node: synthetic < PythonLoader, Ltensorflow, import()Ltensorflow; > Context: CallStringContext: [ script A.py.do()LRoot;@88 ]
Node: synthetic < PythonLoader, Lwala/builtin/type, do()LRoot; > Context: CallStringContext: [ script A.py.do()LRoot;@114 ]
Node: synthetic < PythonLoader, Lscript A.py/SequentialModel, do()LRoot; > Context: CallStringContext: [ script A.py.do()LRoot;@117 ]
Node: synthetic < PythonLoader, Lwala/builtin/type, do()LRoot; > Context: CallStringContext: [ script A.py.do()LRoot;@123 ]
Node: <Code body of function Lscript A.py/SequentialModel/__init__> Context: CallStringContext: [ script A.py.SequentialModel.do()LRoot;@12 ]
Node: synthetic < PythonLoader, Lsuperfun, do()LRoot; > Context: DelegatingContext [A=super call, B=CallStringContext: [ script A.py.SequentialModel.__init__.do()LRoot;@5 ]]
Node: synthetic < PythonLoader, Lwala/builtin/range, do()LRoot; > Context: CallStringContext: [ script A.py.SequentialModel.__init__.do()LRoot;@25 ]
Node: synthetic < PythonLoader, LCodeBody, __Lscript A.py/SequentialModel/__init__/comprehension1()LRoot; > Context: CallStringContext: [ script A.py.SequentialModel.__init__.do()LRoot;@26 ]
Node: <Code body of function Lscript A.py/SequentialModel/__init__/comprehension1> Context: CallStringContext: [ CodeBody.__Lscript A.py/SequentialModel/__init__/comprehension1()LRoot;@2 ]
Node: synthetic < PythonLoader, Ltensorflow/functions/uniform, do()LRoot; > Context: CallStringContext: [ script A.py.do()LRoot;@111 ]
Node: synthetic < PythonLoader, L$script A.py/SequentialModel/__call__, trampoline2()LRoot; > Context: CallStringContext: [ script A.py.do()LRoot;@120 ]
Node: synthetic < PythonLoader, Ltensorflow/functions/uniform, read_data()LRoot; > Context: CallStringContext: [ tensorflow.functions.uniform.do()LRoot;@0 ]
Node: <Code body of function Lscript A.py/SequentialModel/__call__> Context: CallStringContext: [ $script A.py.SequentialModel.__call__.trampoline2()LRoot;@2 ]

@khatchad
Collaborator Author

And, the diff between the two:

7c7
< Node: synthetic < PythonLoader, Lwala/builtin/type, do()LRoot; > Context: CallStringContext: [ script A.py.do()LRoot;@122 ]
---
> Node: synthetic < PythonLoader, Lwala/builtin/type, do()LRoot; > Context: CallStringContext: [ script A.py.do()LRoot;@123 ]
13a14
> Node: synthetic < PythonLoader, L$script A.py/SequentialModel/__call__, trampoline2()LRoot; > Context: CallStringContext: [ script A.py.do()LRoot;@120 ]
14a16
> Node: <Code body of function Lscript A.py/SequentialModel/__call__> Context: CallStringContext: [ $script A.py.SequentialModel.__call__.trampoline2()LRoot;@2 ]

The first difference (@122 versus @123) just looks like a shift in the IR, probably because there's one more value corresponding to the "new" function invocation.
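One way to separate an offset-only change from genuinely new nodes is to mask the `@<number>` call-site offsets before comparing the two dumps. This is a hedged sketch, not part of the analysis: the `normalize` and `new_nodes` helpers are hypothetical, and the node strings are shortened for illustration.

```python
import re

OFFSET = re.compile(r"@\d+")


def normalize(node: str) -> str:
    """Mask call-site offsets such as '@122' so they do not register as differences."""
    return OFFSET.sub("@N", node)


def new_nodes(before, after):
    """Return nodes in `after` whose normalized form does not occur in `before`."""
    seen = {normalize(n) for n in before}
    return [n for n in after if normalize(n) not in seen]


# Shortened, illustrative node strings (not the full dumps above).
implicit_call = [
    "Node: synthetic < PythonLoader, Lwala/builtin/type, do()LRoot; > Context: [ script A.py.do()LRoot;@122 ]",
]
explicit_call = [
    "Node: synthetic < PythonLoader, Lwala/builtin/type, do()LRoot; > Context: [ script A.py.do()LRoot;@123 ]",
    "Node: <Code body of function Lscript A.py/SequentialModel/__call__> Context: [ trampoline2()LRoot;@2 ]",
]

added = new_nodes(implicit_call, explicit_call)
print(added)
```

Masking every offset is deliberately coarse: two nodes that differ only in their call-site offset collapse into one, so this is suited to spotting added methods rather than added contexts.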

@khatchad
Collaborator Author

I would think that __call__ needs to be added alongside this code. Is it a built-in function?

builtinFunctions.put("__delete__", Either.forRight(2));

khatchad pushed a commit that referenced this issue Oct 26, 2023
Adding tests for testing calling a method. We add four tests that include:

1. Calling model indirectly using `call` and `__call__` (2 tests)
2. Calling a model directly using `call` and `__call__` (2 tests)

Related to #24
@khatchad khatchad self-assigned this Nov 21, 2023
@khatchad khatchad linked a pull request Nov 29, 2023 that will close this issue
@khatchad
Collaborator Author

I would think that __call__ needs to be added alongside this code. Is it a built-in function?

builtinFunctions.put("__delete__", Either.forRight(2));

I don't think so. Specifically, __init__ isn't listed as one.

@khatchad
Collaborator Author

khatchad commented Nov 29, 2023

I was able to switch the target to the correct IMethod; however, the points-to analysis is still wrong. By switching the receiver in the target selector, I was able to get a node for the __call__() trampoline, but it has no callees:

callees of node trampoline2 : []

IR of node 11, context CallStringContext: [ script tf2_test_model_call.py.do()LRoot;@113 ]
synthetic < PythonLoader, L$script tf2_test_model_call.py/SequentialModel/__call__, trampoline2()LRoot; >
CFG:
BB0[0..0]
    -> BB1
    -> BB5
BB1[1..1]
    -> BB2
    -> BB5
BB2[2..2]
    -> BB3
    -> BB5
BB3[3..3]
    -> BB4
    -> BB5
BB4[4..4]
    -> BB5
BB5[-1..-2]
Instructions:
BB0
0   v3 = getfield < PythonLoader, LRoot, $function, <PythonLoader,LRoot> > v1
BB1
1   v4 = checkcast <PythonLoader,Lscript tf2_test_model_call.py/SequentialModel/__call__>v3
BB2
2   v5 = getfield < PythonLoader, LRoot, $self, <PythonLoader,LRoot> > v1
BB3
3   v6 = invokeFunction < PythonLoader, LCodeBody, do()LRoot; > v4,v5,v2 @2 exception:v7
BB4
4   return v6
BB5

However, v1 still points to the wrong thing:

[Node: synthetic < PythonLoader, L$script tf2_test_model_call.py/SequentialModel/__call__, trampoline2()LRoot; > Context: CallStringContext: [ script tf2_test_model_call.py.do()LRoot;@113 ], v1] --> [SITE_IN_NODE{synthetic < PythonLoader, Lscript tf2_test_model_call.py/SequentialModel, do()LRoot; >:Lobject in CallStringContext: [ script tf2_test_model_call.py.do()LRoot;@111 ]}]

The "receiver" is still Lobject (this is in "test 1", which uses the callable). However, it is the following in "test 4" (explicit call to __call__()):

[Node: synthetic < PythonLoader, L$script tf2_test_model_call.py/SequentialModel/__call__, trampoline2()LRoot; > Context: CallStringContext: [ script tf2_test_model_call.py.do()LRoot;@114 ], v1] --> [SMIK:SITE_IN_NODE{synthetic < PythonLoader, Lscript tf2_test_model_call.py/SequentialModel, do()LRoot; >:L$script tf2_test_model_call.py/SequentialModel/__call__ in CallStringContext: [ script tf2_test_model_call.py.do()LRoot;@111 ]}@creator:Node: synthetic < PythonLoader, Lscript tf2_test_model_call.py/SequentialModel, do()LRoot; > Context: CallStringContext: [ script tf2_test_model_call.py.do()LRoot;@111 ]]

The "receiver" here is: L$script tf2_test_model_call.py/SequentialModel/__call__. Thus, even though we have the correct IMethod being selected, somehow the pointer analysis is still wrong.
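In Python terms, the gap between the two points-to results resembles the difference between an instance and the bound method obtained from it. The sketch below is an analogy only: mapping the trampoline's `$function` and `$self` fields onto Python's `__func__` and `__self__` is an assumption, not something confirmed against the WALA source, and the `Model` class is hypothetical.

```python
class Model:
    """Hypothetical stand-in for SequentialModel."""

    def __call__(self, x):
        return ("called", self, x)


model = Model()
bound = model.__call__          # a bound-method object, distinct from `model`

# The bound method carries both pieces that the trampoline reads from v1.
function = bound.__func__       # analogous to `getfield $function v1`
receiver = bound.__self__       # analogous to `getfield $self v1`

# Analogous to `invokeFunction v4, v5, v2`: call the function with self and the argument.
result = function(receiver, "input")

print(receiver is model, function is Model.__call__, result == model("input"))

# The plain instance has neither field, which is the situation in "test 1".
print(hasattr(model, "__func__"), hasattr(model, "__self__"))
```

If the analogy holds, a trampoline whose v1 points to the plain object (Lobject) would have nothing to read `$function` from, which would be consistent with the empty callee list.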

@khatchad
Collaborator Author

khatchad commented Nov 29, 2023

In the working case (test 4), by the time we hit com.ibm.wala.ipa.callgraph.propagation.PropagationCallGraphBuilder.addConstraintsFromNewNodes(IProgressMonitor) to process the newly found node, curiously the pointer analysis already has:

[Node: synthetic < PythonLoader, L$script tf2_test_model_call4.py/SequentialModel/__call__, trampoline2()LRoot; > Context: CallStringContext: [ script tf2_test_model_call4.py.do()LRoot;@114 ], v1] ->
     SMIK:SITE_IN_NODE{synthetic < PythonLoader, Lscript tf2_test_model_call4.py/SequentialModel, do()LRoot; >:L$script tf2_test_model_call4.py/SequentialModel/__call__ in CallStringContext: [ script tf2_test_model_call4.py.do()LRoot;@111 ]}@creator:Node: synthetic < PythonLoader, Lscript tf2_test_model_call4.py/SequentialModel, do()LRoot; > Context: CallStringContext: [ script tf2_test_model_call4.py.do()LRoot;@111 ]

@khatchad
Collaborator Author

Thus, at the point of adding the "new" node, we already know that v1 refers to SequentialModel.__call__(), and v3 is then assigned from v1. But only constraints from v3 are generated, which is too late. How does it know about v1 before processing the "new" node?

@khatchad
Collaborator Author

The problem may have something to do with v1 being implicit in the IR above, i.e., there exists no explicit assignment of v1.

@khatchad
Collaborator Author

Ah, because this isn't a "static" method (not sure what that means for Python), v1 must point to the implicit parameter (i.e., the receiver object).

@khatchad
Collaborator Author

When we get to com.ibm.wala.ipa.callgraph.propagation.PropagationCallGraphBuilder.getTargetForCall(CGNode, CallSiteReference, IClass, InstanceKey[]), the pointer analysis does not contain this key, so there must be something in between that adds it.
