[Program Analysis] Intermediate Representation

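I'm back to studying software engineering 😄, tinkering with all things Java. Today I was messing around with Soot, and just getting one command line to run was maddening. Unlike my summer research, though, where I had to write ASM visitors on top of Phosphor, this time I only need to deal with Jimple, Soot's three-address code.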

 java -cp "/Users/yiweiyang/.m2/repository/org/soot-oss/soot/4.2.1/soot-4.2.1.jar:/Users/yiweiyang/.m2/repository/org/slf4j/slf4j-api/1.7.5/slf4j-api-1.7.5.jar:/Users/yiweiyang/.m2/repository/org/slf4j/slf4j-log4j12/1.7.5/slf4j-log4j12-1.7.5.jar:/Users/yiweiyang/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/Users/yiweiyang/.m2/repository/de/upb/cs/swt/axml/2.0.0/axml-2.0.0.jar:/Users/yiweiyang/.m2/repository/com/google/guava/guava/19.0/guava-19.0.jar:/Users/yiweiyang/.m2/repository/org/ow2/asm/asm/9.2/asm-9.2.jar:/Users/yiweiyang/.m2/repository/org/ow2/asm/asm-commons/9.2/asm-commons-9.2.jar:/Users/yiweiyang/.m2/repository/org/ow2/asm/asm-tree/9.2/asm-tree-9.2.jar:/Users/yiweiyang/.m2/repository/de/upb/cs/swt/heros/1.2.2/heros-1.2.2.jar:/Users/yiweiyang/.m2/repository/org/ow2/asm/asm-util/9.2/asm-util-9.2.jar" soot.Main --soot-class-path /Users/yiweiyang/project/test_soot/target/classes:/Users/yiweiyang/.sdkman/candidates/java/current/jre/lib/rt.jar  Bear

AST vs. IR

IR: Three-Address Code (3AC)

3AC in Real Static Analyzer: Soot

Soot

Hello world example using API

    public static SootClass generateClass() {
        // Load dependencies
        Scene.v().loadClassAndSupport("java.lang.Object");
        Scene.v().loadClassAndSupport("java.lang.System");
        Scene.v().loadNecessaryClasses();

        // Create the class HelloWorld as a public class that extends Object
        SootClass sClass = new SootClass("HelloWorld", Modifier.PUBLIC);
        sClass.setSuperclass(Scene.v().getSootClass("java.lang.Object"));
        Scene.v().addClass(sClass);
        // Create: public static void main(String[])
        SootMethod mainMethod = new SootMethod("main",
                Arrays.asList(new Type[] {ArrayType.v(RefType.v("java.lang.String"), 1)}),
                VoidType.v(), Modifier.PUBLIC | Modifier.STATIC);
        sClass.addMethod(mainMethod);
        // Create the Jimple body for main
        JimpleBody jimpleBody = createJimpleBody(mainMethod);
        // Set the jimple body as the active one
        mainMethod.setActiveBody(jimpleBody);
        return sClass;
    }

The corresponding Jimple:

public class HelloWorld extends java.lang.Object
{

    public static void main(java.lang.String[])
    {
        java.lang.String[] frm1;
        java.io.PrintStream tmpRef;
        frm1 := @parameter0: java.lang.String[];
        tmpRef = <java.lang.System: java.io.PrintStream out>;
        virtualinvoke tmpRef.<java.io.PrintStream: void println(java.lang.String)>("Hello world!");
        return;
    }
}
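
The angle-bracket strings in the Jimple above, such as `<java.io.PrintStream: void println(java.lang.String)>`, follow the pattern "declaring class, return type, method name, parameter types". As a minimal sketch, the same string can be rebuilt from plain Java reflection. `SootStyleSignature`, `of`, and `ofPublic` are hypothetical names used only for illustration and are not part of Soot; the comma-without-space separator between parameters is an assumption about how Soot prints multi-parameter signatures.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.stream.Collectors;

public class SootStyleSignature {
    // Hypothetical helper (not a Soot API): formats a reflected method as
    // <declaring class: return type name(parameter types)>.
    static String of(Method m) {
        String params = Arrays.stream(m.getParameterTypes())
                .map(Class::getTypeName)              // e.g. java.lang.String[] for arrays
                .collect(Collectors.joining(","));    // assumed separator: comma, no space
        return "<" + m.getDeclaringClass().getName() + ": "
                + m.getReturnType().getTypeName() + " "
                + m.getName() + "(" + params + ")>";
    }

    // Convenience wrapper that looks up a public method and hides the checked exception.
    static String ofPublic(Class<?> owner, String name, Class<?>... paramTypes) {
        try {
            return of(owner.getMethod(name, paramTypes));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints <java.io.PrintStream: void println(java.lang.String)>
        System.out.println(ofPublic(java.io.PrintStream.class, "println", String.class));
    }
}
```

The printed string matches the signature used in the `virtualinvoke` statement of the HelloWorld listing.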

Using Commandline

Do-While Loop


public class DoWhileLoop3ACExample {
    public static void main(String[] args) {
        int[] arr = new int[10];
        int i = 0;
        do {
            i = i + 1;
        } while (arr[i] < 10);
    }
}

The corresponding Jimple:

public class DoWhileLoop3ACExample extends java.lang.Object {
    public void <init>() {
        DoWhileLoop3ACExample r0;
        r0 := @this: DoWhileLoop3ACExample;
        specialinvoke r0.<java.lang.Object: void <init>()>();
        return;
    }

    public static void main(java.lang.String[]) {
        int[] r0;
        int $i0, i1;
        java.lang.String[] r1;
        r1 := @parameter0: java.lang.String[];
        r0 = newarray (int)[10];
        i1 = 0;
     label1:
        i1 = i1 + 1;
        $i0 = r0[i1];
        if $i0 < 10 goto label1;
        return;
    }
}
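
The `$i0` temporary in the listing above exists because each three-address statement carries at most one operator, so the array read `arr[i]` has to be stored before it can be compared. A rough sketch of that lowering for nested arithmetic expressions is below. The class `ThreeAddressLowering`, the tiny `Expr` AST, and the `$t0`, `$t1` temporary names are all hypothetical; Soot's own naming and lowering differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

public class ThreeAddressLowering {
    // Hypothetical mini-AST: an expression is either a variable or a binary operation.
    interface Expr {}

    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
    }

    static final class Bin implements Expr {
        final char op;
        final Expr left, right;
        Bin(char op, Expr left, Expr right) { this.op = op; this.left = left; this.right = right; }
    }

    private final List<String> code = new ArrayList<>();
    private int nextTemp = 0;

    // Returns the name that holds the value of e, emitting one statement per operator.
    private String lower(Expr e) {
        if (e instanceof Var) {
            return ((Var) e).name;          // variables need no statement
        }
        Bin b = (Bin) e;
        String l = lower(b.left);           // operands first, left to right
        String r = lower(b.right);
        String t = "$t" + nextTemp++;       // fresh temporary for this operator
        code.add(t + " = " + l + " " + b.op + " " + r);
        return t;
    }

    // Lowers "target = e" into a list of three-address statements.
    static List<String> lowerAssign(String target, Expr e) {
        ThreeAddressLowering ctx = new ThreeAddressLowering();
        String result = ctx.lower(e);
        ctx.code.add(target + " = " + result);
        return ctx.code;
    }

    public static void main(String[] args) {
        // x = a + b * c
        Expr e = new Bin('+', new Var("a"), new Bin('*', new Var("b"), new Var("c")));
        // prints:
        // $t0 = b * c
        // $t1 = a + $t0
        // x = $t1
        lowerAssign("x", e).forEach(System.out::println);
    }
}
```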

Optimized

The following command can be used to view the basic blocks (BBs).

java -cp $SOOT_PATH soot.tools.CFGViewer --soot-class-path $SOOT_CLASS_PATH --graph=BriefBlockGraph $CLASS_NAME
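
To get a feel for what a block graph contains, here is a minimal sketch of the classic "leader" rule applied by hand to the Jimple `main` of the do-while example: the first statement, every jump target, and every statement right after a jump each start a new block. The `BasicBlockSketch` class and its `Stmt` mini-IR are hypothetical and do not use Soot; the blocks Soot reports for the same method may be labelled or split differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class BasicBlockSketch {
    // Hypothetical mini-IR: one statement with an optional label and an optional jump target.
    static final class Stmt {
        final String label;   // label attached to this statement, or null
        final String text;    // statement text, used only for printing
        final String target;  // label this statement may jump to, or null
        Stmt(String label, String text, String target) {
            this.label = label; this.text = text; this.target = target;
        }
    }

    // Splits a statement list into basic blocks using the leader rule.
    // Only jumps are handled; a return in the middle of a method is not treated specially.
    static List<List<Stmt>> split(List<Stmt> stmts) {
        TreeSet<Integer> leaders = new TreeSet<>();
        if (!stmts.isEmpty()) leaders.add(0);                       // rule 1: first statement
        for (int i = 0; i < stmts.size(); i++) {
            String target = stmts.get(i).target;
            if (target == null) continue;
            for (int j = 0; j < stmts.size(); j++) {
                if (target.equals(stmts.get(j).label)) leaders.add(j);  // rule 2: jump target
            }
            if (i + 1 < stmts.size()) leaders.add(i + 1);           // rule 3: statement after a jump
        }
        List<List<Stmt>> blocks = new ArrayList<>();
        List<Stmt> current = null;
        for (int i = 0; i < stmts.size(); i++) {
            if (leaders.contains(i)) {                              // index 0 is always a leader
                current = new ArrayList<>();
                blocks.add(current);
            }
            current.add(stmts.get(i));
        }
        return blocks;
    }

    // The body of main from the do-while Jimple listing, transcribed by hand.
    static List<Stmt> doWhileMain() {
        List<Stmt> s = new ArrayList<>();
        s.add(new Stmt(null, "r1 := @parameter0: java.lang.String[]", null));
        s.add(new Stmt(null, "r0 = newarray (int)[10]", null));
        s.add(new Stmt(null, "i1 = 0", null));
        s.add(new Stmt("label1", "i1 = i1 + 1", null));
        s.add(new Stmt(null, "$i0 = r0[i1]", null));
        s.add(new Stmt(null, "if $i0 < 10 goto label1", "label1"));
        s.add(new Stmt(null, "return", null));
        return s;
    }

    public static void main(String[] args) {
        List<List<Stmt>> blocks = split(doWhileMain());
        for (int b = 0; b < blocks.size(); b++) {
            System.out.println("Block " + b + ":");
            for (Stmt st : blocks.get(b)) System.out.println("    " + st.text);
        }
    }
}
```

On this input the sketch yields three blocks: the setup before `label1`, the loop body ending in the conditional jump, and the final `return`.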

Method Call


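For method calls, different compiler options make no visible difference in the Jimple, whereas the difference does show up at the ASM (bytecode) level.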

Class

public class Class3AC {
    public static final double pi = 3.14;
    public static void main(String[]args){}
}

The corresponding Jimple:

public class Class3AC extends java.lang.Object
{
    public static final double pi;

    static void <clinit>() {
        double temp$0;
        temp$0 = 3.14;
        <Class3AC: double pi> = temp$0;
        return;
    }

    /** main function */
    public static void main(java.lang.String[]) {
        java.lang.String[] args;
        args := @parameter0: java.lang.String[];
        return;
    }

    /** Default Constructor */
    public void <init>() {
        Class3AC this;
        this := @this: Class3AC;
        specialinvoke this.<java.lang.Object: void <init>()>();
        return;
    }
}

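Three-address code in Soot:

1. @parameter: a method parameter
2. $x: a temporary variable

  1. <method signature>: class + return type + method name + parameter types
  2. <init>: constructor
  3. <clinit>: class initializer (static field initialization and the like)
  4. invokespecial (specialinvoke in Jimple): calls constructors, superclass methods, and private methods
  5. invokevirtual (virtualinvoke in Jimple): instance method call (virtual dispatch)
  6. invokeinterface (interfaceinvoke in Jimple): calls an interface method; it cannot be optimized and has to check the interface implementation
  7. invokestatic (staticinvoke in Jimple): calls a static method
  8. invokedynamic (dynamicinvoke in Jimple): used to run other dynamic languages on the JVM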
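
As a rough illustration of the call kinds listed above, the sketch below puts one Java call site of each kind in a single class. The class `InvokeKinds` and everything in it are hypothetical. The instruction named in each comment is what the call is expected to compile to, not something this program checks; running `javap -c` on the compiled class is one way to confirm it. Private methods are left out because the instruction chosen for them may depend on the compiler's target release.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class InvokeKinds {
    interface Greeter { String greet(); }

    static class Base implements Greeter {
        public String greet() { return "base"; }
    }

    static class Derived extends Base {
        Derived() { super(); }                       // constructor chaining: invokespecial Base.<init>
        @Override public String greet() {
            return "derived+" + super.greet();       // superclass call: invokespecial
        }
    }

    static String helper() { return "static"; }

    static List<String> run() {
        List<String> out = new ArrayList<>();
        out.add(helper());                           // invokestatic
        Derived d = new Derived();                   // new, then invokespecial Derived.<init>
        Base asBase = d;
        out.add(asBase.greet());                     // invokevirtual, dispatched to Derived.greet at run time
        Greeter asIface = d;
        out.add(asIface.greet());                    // invokeinterface
        Supplier<String> lazy = () -> "lambda";      // javac (Java 8+) is expected to emit invokedynamic here
        out.add(lazy.get());                         // invokeinterface on Supplier
        return out;
    }

    public static void main(String[] args) {
        // prints [static, derived+base, derived+base, lambda]
        System.out.println(run());
    }
}
```

Both `asBase.greet()` and `asIface.greet()` reach `Derived.greet` at run time; only the static type of the receiver, and therefore the instruction, differs.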

Static Single Assignment (SSA)

Control Flow Graphs (CFG)

Reference

  1. https://taodaling.github.io/blog/2020/06/01/java%E7%BC%96%E8%AF%91%E4%BC%98%E5%8C%96/
  2. https://mayuwan.github.io/2018/05/08/soot/