Add certain types of indirect function calls to the C++ call graph #7520
Description
I'm dealing with a codebase that makes a lot of calls to function pointers stored in global/static arrays, which results in the call graph (`Function.calls`) not being very helpful. Here's a small example:
typedef unsigned long uintptr_t;

void func1() {}
void func2() {}
void func3() {}
void func4() {}

struct table_entry_t {
  int other_data;
  uintptr_t func;
};

uintptr_t func_table[] = {(uintptr_t)&func1, (uintptr_t)&func2};
table_entry_t nested_func_table[] = {{0, (uintptr_t)&func3},
                                     {1, (uintptr_t)&func4}};

void foo() {
  for (int i = 0; i < sizeof(func_table) / sizeof(func_table[0]); i++) {
    ((void (*)())func_table[i])();
  }
  for (int i = 0; i < sizeof(nested_func_table) / sizeof(nested_func_table[0]);
       i++) {
    ((void (*)())nested_func_table[i].func)();
  }
}
When I ask CodeQL for the functions that `foo` calls, I'd like it to return `func1`, `func2`, `func3`, and `func4`. I've solved this issue for now by implementing a class that models function tables defined using array aggregate literals, which I'll show below, but I'm mainly opening this issue to ask whether CodeQL can add better support for these indirect calls in the built-in call graph.
In the example below, I've defined `FunctionTableArrayAggregateLiteral` to model function tables, and then added a custom `edges` predicate which checks not only for direct function calls with `a.calls(b)`, but also for indirect calls through a function table (technically I don't verify that the function pointer is actually called, but this hasn't been a problem for me so far):
import cpp

class FunctionTableArrayAggregateLiteral extends ArrayAggregateLiteral {
  FunctionTableArrayAggregateLiteral() { this.getAChild*() instanceof FunctionAccess }

  Function getFunctions() { result = this.getAChild*().(FunctionAccess).getTarget() }
}

query predicate edges(Function a, Function b) {
  a.calls(b)
  or
  exists(FunctionTableArrayAggregateLiteral functionTable |
    // The function table is accessed inside `a`
    functionTable.getEnclosingVariable().getAnAccess().getEnclosingFunction() = a and
    // All functions inside the table are added as edges
    b = functionTable.getFunctions()
  )
}

predicate getCallGraph(Function start, Function end, string startName, string endName) {
  edges+(start, end) and
  start.hasName(startName) and
  end.hasName(endName)
}

from Function start, Function end
where getCallGraph(start, end, "entry_func", end.getName())
select start, end
Of course this can lead to false positives in certain cases (just because one function in a function table is used doesn't mean all of them are), so I don't expect this exact type of solution to be used, but maybe something like it can be considered.
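To make the two false-positive cases concrete, here is a minimal sketch in the same style as the example above. All names (`handlers`, `handler_a`, `handler_b`, `call_first_only`, `first_is_set`, `call_log`) are hypothetical and only serve as illustration. As the `edges` predicate is written, I'd expect it to report both handlers as callees of both functions, even though only one of those four edges exists at runtime:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical log used only to observe which handlers actually run.
static std::vector<std::string> call_log;

void handler_a() { call_log.push_back("handler_a"); }
void handler_b() { call_log.push_back("handler_b"); }

// Function table in the same style as func_table above.
std::uintptr_t handlers[] = {(std::uintptr_t)&handler_a,
                             (std::uintptr_t)&handler_b};

// Case 1: only one entry is ever invoked. The table is accessed here, so a
// table-wide edge rule would add call_first_only -> handler_b as well.
void call_first_only() {
  ((void (*)())handlers[0])();
}

// Case 2: the table is read but nothing is called through it. Any access
// counts for the table-wide rule, so it would add edges to both handlers.
bool first_is_set() {
  return handlers[0] != 0;
}
```

Running `call_first_only()` followed by `first_is_set()` leaves exactly one entry, `handler_a`, in `call_log`. A more precise model would probably need to check that the table access feeds a call expression, and ideally which index is used.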