[Distributed System] Summary

Synchronization

Logic Clock

Lamport Timestamp

a ->b ==> TS(a) < TS(b) The reverse does not hold.

vector lock

Use vector to store the lock with every cores. Follows the reverse of the property.

Ordering



FLP

To prove that random timeouts can avoid live locking.

  • Lemma 1: Disjoint schedules are commutative
  • Lemma 2: There exists an initial configuration that is bivalent
  • Lemma 3: Starting from a bivalent config., there is always another bivalent config. that is reachable
  • Theorem (Impossibility of Consensus): There is always a run of events in an asynchronous distributed system such that the group of processes never reach consensus (i.e., stays bivalent all the time)

RPC

  • RPC = Remote Procedure Call
  • Proposed by Birrell and Nelson in 1984
  • Important abstraction for processes to call functions in other processes
  • Allows code reuse
  • Implemented and used in most distributed systems, including cloud computing systems
  • Counterpart in Object-based settings is called RMI (Remote Method Invocation)
  • What’s the No. 1 design choice for RPC?

Local Procedure Call (LPC)

The call from one function to another function within the same process

  • Uses stack to pass arguments and return values
  • Accesses objects via pointers(e.g., C) or by reference(e.g., Java)

LPC has exactly-once semantics

Remote Procedure Call

Call from one function to another function, where caller and callee function reside in different processes

  • Function call crosses a process boundary
  • Accesses procedures via global references(e.g., Procedure address = IP + port + procedure number)
  • Cannot use pointers across processes since a reference address in process P1 may point to a different object in another process P2(We need serialization)
    Similarly, RMI (Remote Method Invocation) in Object-based settings

Semantics

  • Under failures, hard to guarantee exactly-once semantics
  • Function may not be executed if
    • Request (call) message is dropped
    • Reply (return) message is dropped
    • Called process fails before executing called function
    • Called process fails after executing called function
    • Hard for the caller to distinguish these cases
  • Function may be executed multiple times if
    • Request (call) message is duplicated

Idempotent OPerations
  • Idempotent operations are those that can be repeated multiple times, without any side effects
  • Idempotent operations can be used with at-least-once semantics

implementation

  • client stub: has same function signature as callee()
  • communication module: Forwards requests and replies to appropriate hosts
  • dispatcher: Selects which server stub to forward a request to
  • server stub: calls callee(), allows it to return a value

ACID

  • Atomicity: "all or nothing"
  • Consistency: "it looks correct to me"
  • Isolation: "as if alone"
  • Durability: "survive failures"


Fault in the Two-phase Commit

  • What if the database node crashes?
  • What if the coordinator crashes?
    • Coordinator writes its decision to disk
    • When it recovers, read the decision from the disk and send it to replicas (or abort if no decision was made before the crash)
  • If the coordinate crashes after preparation, but before broadcasting the decision, other nodes do not know how it has been decided.
    • Replicas participating in the transaction cannot commit or abort after responding “ok” to the prepared request
    • Algorithm is blocked until coordinator recovers






Consensus Protocol

Replication

Fault Tolerance and Reliable Multicast

P2P System

Provenance:

  1. Specialized
  2. light-weight
  3. easy-to-use software to enable Internet-scale file sharing

Data-intensive Computing: Massive

  1. Some Examples
    1. Google MapReduce
      • Yahoo Hadoop/PIG
      • Data parallel computing
    2. IBM Research System S
      • InfosphereStream product
      • Continuous Data stream processing
    3. Microsoft Dryad/ LINQ
      • DAG processing
      • SQL query support
  2. Google Infra
    1. Distributed file systems: GFS
    2. Distributed storage: BigTable
    3. Job scheduler: the work queue
    4. Parallel computation: MapReduce
    5. Distributed lock server: chubby

Map Reduce

  • A user-specified map the function processes a key/value pair to generate a set of intermediate key/value pairs.
  • A user-specified reduce function merges all intermediate values associated with the same intermediate key.

Cloud Computing

Cloud Compute

AI for systems

  • AI for IT operations. AIOps combines big data and ML to automate IT operations processes.
  • digital economy - increasing complexity of modern application arch.
  • no manual intervention
  • increased automation and faster remediation

The proliferation of monitoring tools makes analytics challenging

  • The use of disparate monitoring tools makes it extremely difficult to obtain end-to-end visibility across the entire business service or application
  • 72% of IT organizations rely on up to nine different IT monitoring tools to support the modern application.

Case Study

Use un-supervised data to dig the vast majority of DevOps data.

runtime timeout bug identification tool.

  • Combines timeout-related feature selection and runtime anomaly detection to achieve higher precision.
  • does not require any application instrumentation for bug detection.
  • Evaluates over 19 performance bugs and identifies 18 of them.

Docker container Vulnerability Detection

AIOps @Microsoft

Reference

  1. https://zhuanlan.zhihu.com/p/504176976

[Program Analysis] Intermediate Representation

我又来学软工了😄,倒腾java的种种,今天就在乱搞soot,跑一个命令行真脑瘫。不过zhe次不像暑研要写在Phosphor上的ASM visitor,只需处理三地址码的jimple就好。

 java -cp "/Users/yiweiyang/.m2/repository/org/soot-oss/soot/4.2.1/soot-4.2.1.jar:/Users/yiweiyang/.m2/repository/org/slf4j/slf4j-api/1.7.5/slf4j-api-1.7.5.jar:/Users/yiweiyang/.m2/repository/org/slf4j/slf4j-log4j12/1.7.5/slf4j-log4j12-1.7.5.jar:/Users/yiweiyang/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/Users/yiweiyang/.m2/repository/de/upb/cs/swt/axml/2.0.0/axml-2.0.0.jar:/Users/yiweiyang/.m2/repository/com/google/guava/guava/19.0/guava-19.0.jar:/Users/yiweiyang/.m2/repository/org/ow2/asm/asm/9.2/asm-9.2.jar:/Users/yiweiyang/.m2/repository/org/ow2/asm/asm-commons/9.2/asm-commons-9.2.jar:/Users/yiweiyang/.m2/repository/org/ow2/asm/asm-tree/9.2/asm-tree-9.2.jar:/Users/yiweiyang/.m2/repository/de/upb/cs/swt/heros/1.2.2/heros-1.2.2.jar:/Users/yiweiyang/.m2/repository/org/ow2/asm/asm-util/9.2/asm-util-9.2.jar" soot.Main --soot-class-path /Users/yiweiyang/project/test_soot/target/classes:/Users/yiweiyang/.sdkman/candidates/java/current/jre/lib/rt.jar  Bear

Continue reading "[Program Analysis] Intermediate Representation"

MacVM Hypervisor performance

Mac has launched its API for virtulization and KhaosT has wrapped this api into a fully automated tool to install MacOS on Apple Sillicon just like opening sandbox.

The virtulization API is universal in swift.

let vm = VZVirtualMachine(configuration: configuration, queue: .main)
vm.delegate = self
            
vm.start { [weak self] result in
    switch result {
    case .success:
       self?.document?.isRunning = true
       NSLog("Success")
    case .failure(let error):
       NSLog("Error: \(error)")
     }
}

vm lifecycle

Performance test between M1 and 9700k

# M1 Direct
❯ openssl speed -evp aes-128-cbc aes-256-cbc des-ede3 rsa2048 sha256                                             (base)
Doing sha256 for 3s on 16 size blocks: 13547385 sha256's in 2.98s
Doing sha256 for 3s on 64 size blocks: 6595521 sha256's in 2.91s
Doing sha256 for 3s on 256 size blocks: 2811715 sha256's in 2.98s
Doing sha256 for 3s on 1024 size blocks: 811209 sha256's in 2.90s
Doing sha256 for 3s on 8192 size blocks: 109612 sha256's in 2.96s
Doing sha256 for 3s on 16384 size blocks: 55935 sha256's in 2.94s
Doing des ede3 for 3s on 16 size blocks: 4252004 des ede3's in 2.99s
Doing des ede3 for 3s on 64 size blocks: 1017455 des ede3's in 2.91s
Doing des ede3 for 3s on 256 size blocks: 272326 des ede3's in 2.99s
Doing des ede3 for 3s on 1024 size blocks: 66040 des ede3's in 2.97s
Doing des ede3 for 3s on 8192 size blocks: 8196 des ede3's in 2.94s
Doing des ede3 for 3s on 16384 size blocks: 4052 des ede3's in 2.93s
Doing aes-256 cbc for 3s on 16 size blocks: 28033078 aes-256 cbc's in 2.95s
Doing aes-256 cbc for 3s on 64 size blocks: 7296104 aes-256 cbc's in 2.97s
Doing aes-256 cbc for 3s on 256 size blocks: 1724393 aes-256 cbc's in 2.88s
Doing aes-256 cbc for 3s on 1024 size blocks: 438037 aes-256 cbc's in 2.84s
Doing aes-256 cbc for 3s on 8192 size blocks: 57285 aes-256 cbc's in 2.93s
Doing aes-256 cbc for 3s on 16384 size blocks: 27726 aes-256 cbc's in 2.92s
Doing aes-128-cbc for 3s on 16 size blocks: 91712957 aes-128-cbc's in 2.89s
Doing aes-128-cbc for 3s on 64 size blocks: 24402793 aes-128-cbc's in 2.90s
Doing aes-128-cbc for 3s on 256 size blocks: 8617690 aes-128-cbc's in 2.97s
Doing aes-128-cbc for 3s on 1024 size blocks: 2327697 aes-128-cbc's in 2.98s
Doing aes-128-cbc for 3s on 8192 size blocks: 246604 aes-128-cbc's in 2.60s
Doing aes-128-cbc for 3s on 16384 size blocks: 157122 aes-128-cbc's in 2.95s
Doing 2048 bits private rsa's for 10s: 8364 2048 bits private RSA's in 9.92s
Doing 2048 bits public rsa's for 10s: 229578 2048 bits public RSA's in 7.99s
OpenSSL 1.1.1l  24 Aug 2021
built on: Tue Nov 23 07:53:43 2021 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: /Users/yiweiyang/project/spack/lib/spack/env/clang/clang -fPIC  -O3 -Wall -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -D_REENTRANT -DZLIB -DNDEBUG -I/Users/yiweiyang/project/spack/opt/spack/darwin-monterey-x86_64/apple-clang-13.0.0/zlib-1.2.11-fj3aancm72hxjybw6445d5uuftok6jvr/include
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
des ede3         22753.20k    22377.02k    23316.21k    22769.35k    22837.29k    22658.01k
aes-256 cbc     152043.81k   157222.44k   153279.38k   157940.10k   160163.39k   155569.45k
aes-128-cbc     507753.40k   538544.40k   742804.26k   799852.93k   776992.30k   872639.61k
sha256           72737.64k   145056.13k   241543.30k   286440.70k   303358.62k   311713.96k
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.001186s 0.000035s    843.1  28733.2

# M1 hypervisor
test-vm@test-vms-Mac ~ % openssl speed -evp aes-128-cbc des-ede3 rsa2048 sha256
Doing sha256 for 3s on 16 size blocks: 12417802 sha256's in 2.91s
Doing sha256 for 3s on 64 size blocks: 8605571 sha256's in 2.73s
Doing sha256 for 3s on 256 size blocks: 7519183 sha256's in 2.97s
Doing sha256 for 3s on 1024 size blocks: 3766735 sha256's in 2.99s
Doing sha256 for 3s on 8192 size blocks: 675151 sha256's in 2.98s
Doing des ede3 for 3s on 16 size blocks: 4212079 des ede3's in 3.00s
Doing des ede3 for 3s on 64 size blocks: 1079073 des ede3's in 2.99s
Doing des ede3 for 3s on 256 size blocks: 274746 des ede3's in 2.98s
Doing des ede3 for 3s on 1024 size blocks: 71968 des ede3's in 2.99s
Doing des ede3 for 3s on 8192 size blocks: 8420 des ede3's in 2.98s
Doing aes-128-cbc for 3s on 16 size blocks: 43091935 aes-128-cbc's in 2.98s
Doing aes-128-cbc for 3s on 64 size blocks: 11225248 aes-128-cbc's in 2.99s
Doing aes-128-cbc for 3s on 256 size blocks: 2726199 aes-128-cbc's in 2.98s
Doing aes-128-cbc for 3s on 1024 size blocks: 673152 aes-128-cbc's in 2.96s
Doing aes-128-cbc for 3s on 8192 size blocks: 82678 aes-128-cbc's in 2.98s
Doing 2048 bit private rsa's for 10s: 4518 2048 bit private RSA's in 9.94s
Doing 2048 bit public rsa's for 10s: 103797 2048 bit public RSA's in 9.94s
LibreSSL 3.3.5
built on: date not available
options:bn(64,64) rc4(ptr,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: information not available
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
des ede3         22500.93k    23073.39k    23597.93k    24660.00k    23143.85k
aes-128-cbc     231587.58k   239916.76k   234029.06k   233230.99k   227646.27k
sha256           68226.66k   201824.46k   648416.14k  1288982.79k  1855047.46k
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.002201s 0.000096s    454.4  10441.9

# 9700K Direct
❯ openssl speed -evp aes-128-cbc aes-256-cbc des-ede3 rsa2048 sha256
Doing sha256 for 3s on 16 size blocks: 21369624 sha256's in 2.86s
Doing sha256 for 3s on 64 size blocks: 11887732 sha256's in 2.92s
Doing sha256 for 3s on 256 size blocks: 5599953 sha256's in 2.97s
Doing sha256 for 3s on 1024 size blocks: 1737136 sha256's in 2.98s
Doing sha256 for 3s on 8192 size blocks: 229175 sha256's in 2.92s
Doing sha256 for 3s on 16384 size blocks: 117287 sha256's in 2.97s
Doing des ede3 for 3s on 16 size blocks: 8288297 des ede3's in 2.99s
Doing des ede3 for 3s on 64 size blocks: 2095278 des ede3's in 2.98s
Doing des ede3 for 3s on 256 size blocks: 525764 des ede3's in 2.98s
Doing des ede3 for 3s on 1024 size blocks: 130040 des ede3's in 2.95s
Doing des ede3 for 3s on 8192 size blocks: 16445 des ede3's in 2.99s
Doing des ede3 for 3s on 16384 size blocks: 8217 des ede3's in 2.98s
Doing aes-256 cbc for 3s on 16 size blocks: 49239507 aes-256 cbc's in 2.98s
Doing aes-256 cbc for 3s on 64 size blocks: 12584820 aes-256 cbc's in 2.99s
Doing aes-256 cbc for 3s on 256 size blocks: 3150137 aes-256 cbc's in 2.98s
Doing aes-256 cbc for 3s on 1024 size blocks: 792478 aes-256 cbc's in 2.99s
Doing aes-256 cbc for 3s on 8192 size blocks: 99018 aes-256 cbc's in 2.98s
Doing aes-256 cbc for 3s on 16384 size blocks: 49599 aes-256 cbc's in 2.99s
Doing aes-128-cbc for 3s on 16 size blocks: 205033256 aes-128-cbc's in 2.98s
Doing aes-128-cbc for 3s on 64 size blocks: 84091603 aes-128-cbc's in 2.92s
Doing aes-128-cbc for 3s on 256 size blocks: 21937430 aes-128-cbc's in 2.99s
Doing aes-128-cbc for 3s on 1024 size blocks: 5512203 aes-128-cbc's in 2.98s
Doing aes-128-cbc for 3s on 8192 size blocks: 689395 aes-128-cbc's in 2.98s
Doing aes-128-cbc for 3s on 16384 size blocks: 344040 aes-128-cbc's in 2.99s
Doing 2048 bits private rsa's for 10s: 24679 2048 bits private RSA's in 9.94s
Doing 2048 bits public rsa's for 10s: 800018 2048 bits public RSA's in 9.94s
OpenSSL 1.1.1m  14 Dec 2021
built on: Thu Feb 10 23:52:55 2022 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: /Volumes/DataCorrupted/spack/lib/spack/env/clang/clang -fPIC  -O3 -Wall -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -D_REENTRANT -DZLIB -DNDEBUG -I/Volumes/DataCorrupted/spack/opt/spack/darwin-monterey-skylake/apple-clang-13.0.0/zlib-1.2.11-7yfdtza4upfsjhx6nyigqab6ctnbiqxg/include
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
des ede3         44352.09k    44999.26k    45166.30k    45139.31k    45056.00k    45176.96k
aes-256 cbc     264373.19k   269374.07k   270615.80k   271403.84k   272199.82k   271782.61k
aes-128-cbc    1100849.70k  1843103.63k  1878254.88k  1894126.13k  1895142.23k  1885201.12k
sha256          119550.34k   260553.03k   482689.55k   596921.90k   642945.75k   647013.54k
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.000403s 0.000012s   2482.8  80484.7

# 9700K hypervisor
test-vm@macos-12 ~ %  openssl speed -evp aes-128-cbc aes-256-cbc des-ede3 rsa2048 sha256
Doing sha256 for 3s on 16 size blocks: 6998286 sha256's in 1.76s
Doing sha256 for 3s on 64 size blocks: 5937478 sha256's in 2.40s
Doing sha256 for 3s on 256 size blocks: 3640021 sha256's in 2.89s
Doing sha256 for 3s on 1024 size blocks: 1276928 sha256's in 2.99s
Doing sha256 for 3s on 8192 size blocks: 172371 sha256's in 2.98s
Doing des ede3 for 3s on 16 size blocks: 6392897 des ede3's in 2.97s
Doing des ede3 for 3s on 64 size blocks: 1840224 des ede3's in 2.97s
Doing des ede3 for 3s on 256 size blocks: 486441 des ede3's in 2.99s
Doing des ede3 for 3s on 1024 size blocks: 122497 des ede3's in 2.99s
Doing des ede3 for 3s on 8192 size blocks: 15081 des ede3's in 2.98s
Doing aes-256 cbc for 3s on 16 size blocks: 24622581 aes-256 cbc's in 2.96s
Doing aes-256 cbc for 3s on 64 size blocks: 6913334 aes-256 cbc's in 2.99s
Doing aes-256 cbc for 3s on 256 size blocks: 1779287 aes-256 cbc's in 2.99s
Doing aes-256 cbc for 3s on 1024 size blocks: 993353 aes-256 cbc's in 2.98s
Doing aes-256 cbc for 3s on 8192 size blocks: 130072 aes-256 cbc's in 2.99s
Doing aes-128-cbc for 3s on 16 size blocks: 248465144 aes-128-cbc's in 2.96s
Doing aes-128-cbc for 3s on 64 size blocks: 80633266 aes-128-cbc's in 2.98s
Doing aes-128-cbc for 3s on 256 size blocks: 20577300 aes-128-cbc's in 2.99s
Doing aes-128-cbc for 3s on 1024 size blocks: 5140306 aes-128-cbc's in 2.99s
Doing aes-128-cbc for 3s on 8192 size blocks: 631903 aes-128-cbc's in 2.99s
Doing 2048 bit private rsa's for 10s: 13162 2048 bit private RSA's in 9.97s
Doing 2048 bit public rsa's for 10s: 246512 2048 bit public RSA's in 9.96s
LibreSSL 2.8.3
built on: date not available
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: information not available
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
des ede3         34426.72k    39593.28k    41604.84k    41932.50k    41488.42k
aes-256 cbc     133310.22k   148026.58k   152125.43k   341155.78k   356036.10k
aes-128-cbc    1341209.76k  1730061.34k  1759369.75k  1758876.41k  1729626.63k
sha256           63774.17k   158049.71k   322534.94k   437365.11k   473238.50k
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.000758s 0.000040s   1320.1  24743.0

高级语言 to LLVM 的解释层

最近在做编译原理课程设计的设计,看了很多到 LLVM 的编译器的想法,同时发现 Rust 类型体操作为黑魔法合集也能带给社区很多新鲜玩意,就把之前设计 Chocopy LLVM 层的一些小想法放在这,上科大的同学想玩可以加个piazza,invite code: CHOCOPY。有一部分参考 High Level Constructs to LLVM_IR, 范型的设计更多参考 rust 和 c。

Continue reading "高级语言 to LLVM 的解释层"

POPL 22 attendency

1.21

1.22

CoqPL

PriSC

BPF and MDS

trace point autometa

Detection for kernel, maybe useful for metigation of the MDS.










MDS part





spectre


Cats and Rice game




Too strong to reduce the time difference of ARRAY_MAX and non ARRAY_MAX case

another case


spectre v2

spectre v4












FaCT








Towards Understanding Spectre-PHT in Memory Safe Language



SEEC










A weird thing in arm64 of operator << in gcc-11

I'm trying to do some log stuff in a Compiler project. When I'm trying to use the fmt::format library.

It was safe and sound to run with apple-clang 13, but when it comes to gcc-11 for the following line:

if ((x.second)->is_list_type()) {
    LOG(INFO) << fmt::format("{} : [{}]", x.first,
            ((ClassValueType *)((ListValueType *)x.second)->elementType)->className);
}

LogStream is something like:

class LogStream {
public:
    LogStream() { sstream_ = new std::stringstream(); }
    ~LogStream() = default;

    template <typename T> LogStream &operator<<(const T &val) noexcept {
        (*sstream_) << val;
        return *this;
    }

    friend class LogWriter;

private:
    std::stringstream *sstream_;
};


The operator << gets error reading the memory byte from the fmt byte, possibly because the author of GCC is not aware the pointer passed do not fit in the following ldur style of stream out. On x86 OSX machine, the GCC have some _M_is_leaked() check in the same line and on Windows MSVC, the line has reported the memory leakage for doubly linked pointer.

The compiled code is:

There's trick to maintain a compiler that have a universal error code output.