[Program Analysis] Intraprocedural Analysis

CHA analysis is used to make too conservative assumptions to the method call in the Intraprocedural Analysis. All the results of the analysis should be. save. According to the Lattice Theory, the must and may analysis should be less precise. So the Interprocedural Analysis to see the data flow in the BB and the Call Graph to see the data flow propagation between functions and raise the precision of the analysis is very important.

For Better precision, we need Interprocedural Analysis: propagate data-flow information along interprocedural control-flow edges i.e., call and return edges. and Class Hierarchy analysis is a classical method to construct Call Graph.

The tradeoff between fast and precision is a long-been-discussed topic.

  • Class hierarchy analysis(CHA)
  • Rapid type analysis(RTA)
  • Variable type analysis(VTA)
  • Pointer analysis(k-CFA)

The ref1's author Jeffery Dean is the architect in Google who initiated TensorFlow/map-reduc e/big table/spanner. He carried out program analysis during his Ph.D.

Class Hierarchy Analysis

IMAGE 2022-04-27 09:36:58

Method call in Java


The virtual call is for OOP.

At program runtime, a virtual call is determined to be a specific call determined by two things.

  • The recipient of the content returned by virtual call: c
    The method signature at the point of the call: m
  • The function call with the form: o1.foo(...)2, will be resolved according to the following 2 points
    • Receiver object's type
    • call site's function signature.

class C{T foo(P p, Q q, R r){...}}->C.foo(P, Q, R)


Where c is the type of receiver object, and m is the function signature of call sites. dispatch function in c to find and function signature m compatible with the non-abstract function non-abstract method, if not found then go to c's parent class to continue to find until found, the function is the core of CHA analysis.

Algorithm

This algorithm can resolve virtual function calls based on the declared type of the receiver object (as distinguished from the actual type). It requires the entire inheritance chain of the program as an information base, e.g. A a = ... The algorithm in ; a.foo() assumes that the variable a can point to class A and all its subclasses, and resolves multiple possible target methods accordingly. Here is the algorithm for CHA:

The virtual function call only needs to get the declared type of the receiver variable at the call site, and then does Dispatch resolution on that type and all subclass objects of that type. This algorithm only considers the type of the receiver variable at the call site and its inheritance chain and ignores other data flow and control flow information, so it is very fast, but the accuracy is average. Therefore, its main application is the complement the IDE to meet the speed and fit the needs, but also keep the accuracy in a good acceptable range.

Call Graph Construction

Somewhat similar to BFS for the method from the entry.

Inter-procedural Control-Flow Graph

Reference

  1. Optimization of OOP using Static CHA @POPL1995