ChinaSys 2024 Fall

AI Chip

TrEnv

堆扩展不行

remote fork 容器本身。

SIGMETRICS

变长图

低比特量化

visually

like omnitable

Skyloft

NeoMem

failure durability
type 2 accelerator.
PTX JIT is better than NVBit.

December 13, 2022November 18, 2023

possible solutions to 443 and 80 highjack

Once I'm updating my centos VPS to the latest ngnix and kernel, the server weirdly disables 443 and 80 connection.

Using Wireshark, I found multiple connections to my socks server on 1082 from multiple IP seem to be scanned and SYN attacked.

69	1.379581	104.149.139.86	82.102.27.93	TCP	54	1082 → 42168 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
122	2.078443	104.149.139.86	185.174.159.18	TCP	54	1082 → 57925 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
763	13.859168	104.149.139.86	185.174.159.18	TCP	54	1082 → 52535 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
1412	23.098894	82.102.27.93	104.149.139.86	TCP	74	SuperMic_42:b7:48
198	5.122016	104.149.139.86	212.109.221.254	TCP	54	JuniperN_bb:05:01
1352	29.871004	99.84.252.117	104.149.139.86	TLSv1.3	212	SuperMic_42:b7:48

I realize the open file for accept4 will have a limit by ulimit -n max opening file in parallel, which also limits the accept4 syscall. It was reset by kernel updates. Some nginx upload file limits also may be the outcome of this. After setting it to 65534, no 443 and 80 highjacks will be enforced.

February 1, 2022February 15, 2022

centos7/8 podman kernel not support overlayfs

Sounds like I'm messed up the mount of overlay hostPath that stores the containers when updating from centos7 to 8-stream. I also noticed that 9-stream is in beta.

[root@ecs-t6-large-2-linux-20190912001402 ~]# podman ps
Error: kernel does not support overlay fs: 'overlay' is not supported over extfs at "/var/lib/containers/storage/overlay": backing file system is unsupported for this graph driver

For Kubernetes, we automatically apply the pod using yaml file like

metadata:
  name: vo-hostpath-pod
spec:
  containers:
  - name: filebeat
    image: ikubernetes/filebeat:5.6.7-alpine
    env:                            
    - name: REDIS_HOST              
      value: redis.ilinux.io:6379   
    - name: LOG_LEVEL               
      value: info                   
    volumeMounts:                 
    - name: varlog            
      mountPath: /var/log   
    - name: socket                
      mountPath: /var/run/docker.sock
    - name: varlibdockercontainers 
      mountPath: /var/lib/docker/containers
      readOnly: true    
  volumes:            
  - name: varlog  
    hostPath:           
      path: /var/log   
      type: DirectoryOrCreate 
  - name: varlibdockercontainers
    hostPath:
      path: /var/lib/docker/containers
      type: Directory
  - name: socket
    hostPath:
      path: /var/run/docker.sock
      type: Socket

Debbug the command using strace

newfstatat(AT_FDCWD, "/root/bin/crun", 0xc00019e378, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/bin/runc", {st_mode=S_IFREG|0755, st_size=11889776, ...}, 0) = 0
openat(AT_FDCWD, "/etc/selinux/refpolicy/contexts/lxc_contexts", O_RDONLY|O_CLOEXEC) = 9
epoll_ctl(4, EPOLL_CTL_ADD, 9, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=1719281968, u64=140043422935344}}) = 0
fcntl(9, F_GETFL)                       = 0x8000 (flags O_RDONLY|O_LARGEFILE)
fcntl(9, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 0
fstat(9, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0

The open at /etc/selinux/refpolicy/contexts/lxc_contexts is wierd so I think there's sth about the selinux, so I remvoe container-selinux and everythin works fine.

[root@ecs-t6-large-2-linux-20190912001402 ~]# docker ps
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
WARN[0000] Error validating CNI config file /etc/cni/net.d/10-flannel.conflist: [failed to find plugin "flannel" in path [/usr/local/libexec/cni /usr/libexec/cni /usr/local/lib/cni /usr/lib/cni /opt/cni/bin]]
CONTAINER ID  IMAGE       COMMAND     CREATED     STATUS      PORTS       NAMES

Also after upgrade to CentOS 8, don't forget to disable firealld and selinux because it'll update the settings. I debug it through self ping success and couldn't get anything from browser.

[root@ecs-t6-large-2-linux-20190912001402 ~]# sudo ss -tulpn
Netid                 State                  Recv-Q                 Send-Q                                                    Local Address:Port                                  Peer Address:Port                Process
udp                   UNCONN                 0                      0                                                               0.0.0.0:5355                                       0.0.0.0:*                    users:(("systemd-resolve",pid=1045,fd=12))
udp                   UNCONN                 0                      0                                                         127.0.0.53%lo:53                                         0.0.0.0:*                    users:(("systemd-resolve",pid=1045,fd=18))
udp                   UNCONN                 0                      0                                                         192.168.0.173:123                                        0.0.0.0:*                    users:(("ntpd",pid=640,fd=21))
udp                   UNCONN                 0                      0                                                             127.0.0.1:123                                        0.0.0.0:*                    users:(("ntpd",pid=640,fd=18))
udp                   UNCONN                 0                      0                                                               0.0.0.0:123                                        0.0.0.0:*                    users:(("ntpd",pid=640,fd=16))
udp                   UNCONN                 0                      0                                                                  [::]:5355                                          [::]:*                    users:(("systemd-resolve",pid=1045,fd=14))
udp                   UNCONN                 0                      0                                      [fe80::f816:3eff:fe8b:cbe7]%eth0:123                                           [::]:*                    users:(("ntpd",pid=640,fd=22))
udp                   UNCONN                 0                      0                                                                 [::1]:123                                           [::]:*                    users:(("ntpd",pid=640,fd=19))
udp                   UNCONN                 0                      0                                                                  [::]:123                                           [::]:*                    users:(("ntpd",pid=640,fd=17))
tcp                   LISTEN                 0                      9                                                               0.0.0.0:21                                         0.0.0.0:*                    users:(("pure-ftpd",pid=601,fd=5))
tcp                   LISTEN                 0                      511                                                             0.0.0.0:888                                        0.0.0.0:*                    users:(("nginx",pid=2970,fd=20),("nginx",pid=2969,fd=20),("nginx",pid=2792,fd=20))
tcp                   LISTEN                 0                      128                                                             0.0.0.0:8888                                       0.0.0.0:*                    users:(("BT-Panel",pid=764,fd=6))
tcp                   LISTEN                 0                      100                                                           127.0.0.1:25                                         0.0.0.0:*                    users:(("master",pid=1030,fd=16))
tcp                   LISTEN                 0                      511                                                             0.0.0.0:443                                        0.0.0.0:*                    users:(("nginx",pid=2970,fd=22),("nginx",pid=2969,fd=22),("nginx",pid=2792,fd=22))
tcp                   LISTEN                 0                      511                                                           127.0.0.1:6379                                       0.0.0.0:*                    users:(("redis-server",pid=1706,fd=6))
tcp                   LISTEN                 0                      128                                                             0.0.0.0:5355                                       0.0.0.0:*                    users:(("systemd-resolve",pid=1045,fd=13))
tcp                   LISTEN                 0                      1024                                                          127.0.0.1:11211                                      0.0.0.0:*                    users:(("memcached",pid=734,fd=28))
tcp                   LISTEN                 0                      128                                                             0.0.0.0:9999                                       0.0.0.0:*                    users:(("sshd",pid=1244,fd=5))
tcp                   LISTEN                 0                      511                                                             0.0.0.0:80                                         0.0.0.0:*                    users:(("nginx",pid=2970,fd=21),("nginx",pid=2969,fd=21),("nginx",pid=2792,fd=21))
tcp                   LISTEN                 0                      9                                                                  [::]:21                                            [::]:*                    users:(("pure-ftpd",pid=601,fd=6))
tcp                   LISTEN                 0                      100                                                               [::1]:25                                            [::]:*                    users:(("master",pid=1030,fd=17))
tcp                   LISTEN                 0                      150                                                                   *:3306                                             *:*                    users:(("mysqld",pid=2166,fd=19))
tcp                   LISTEN                 0                      128                                                                [::]:5355                                          [::]:*                    users:(("systemd-resolve",pid=1045,fd=15))
tcp                   LISTEN                 0                      1024                                                              [::1]:11211                                         [::]:*                    users:(("memcached",pid=734,fd=29))
tcp                   LISTEN                 0                      128                                                                [::]:9999                                          [::]:*                    users:(("sshd",pid=1244,fd=6))
[root@ecs-t6-large-2-linux-20190912001402 ~]#

January 29, 2022March 18, 2022

高级语言 to LLVM 的解释层

最近在做编译原理课程设计的设计，看了很多到 LLVM 的编译器的想法，同时发现 Rust 类型体操作为黑魔法合集也能带给社区很多新鲜玩意，就把之前设计 Chocopy LLVM 层的一些小想法放在这，上科大的同学想玩可以加个piazza，invite code: CHOCOPY。有一部分参考 High Level Constructs to LLVM_IR, 范型的设计更多参考 rust 和 c。

Continue reading "高级语言 to LLVM 的解释层"

January 22, 2022January 23, 2022

POPL 22 attendency

1.21

1.22

CoqPL

PriSC

BPF and MDS

trace point autometa

Detection for kernel, maybe useful for metigation of the MDS.

MDS part

spectre

Cats and Rice game

Too strong to reduce the time difference of ARRAY_MAX and non ARRAY_MAX case

another case

spectre v2

spectre v4

FaCT

Towards Understanding Spectre-PHT in Memory Safe Language

SEEC

November 8, 2021November 8, 2021

`ucx` not necessarily occupy all the cores at all times even bind by core when using openmpi

root@epyc:~# uname -a
Linux epyc.node2 4.19.0-18-amd64 #1 SMP Debian 4.19.208-1 (2021-09-29) x86_64 GNU/Linux
root@epyc:~# ldd --version
ldd (Debian GLIBC 2.28-10) 2.28
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

In cases of quantum espresso

mpirun -hostfile ../AUSURF112/host --mca pml ucx --mca btl sm,rc,ud,self --mca btl_tcp_if_include 192.168.10.0/24 --bind-to core -x PATH -x LD_LIBRARY_PATH -x OMP_NUM_THREADS=1 -np 256 /home/qe/sb/bin/pw.x -nk 4 -nd 64 -i ./grir443.in > 11_7_out256 2> 11_7_out256err

It would dead for a while

The bug is reported by pthread_rwlock.
78F81AA7630E54DC102C3E66FCF43E97

diff --git 2.28/nptl/pthread_rwlock_common.c 2.29/nptl/pthread_rwlock_common.c
index a290d08332..81b162bbee 100644
--- 2.28/nptl/pthread_rwlock_common.c
+++ 2.29/nptl/pthread_rwlock_common.c
@@ -310,6 +310,7 @@ __pthread_rwlock_rdlock_full (pthread_rwlock_t *rwlock,
          if (atomic_compare_exchange_weak_relaxed
              (&rwlock->__data.__readers, &r, r | PTHREAD_RWLOCK_RWAITING))
            {
+              r |= PTHREAD_RWLOCK_RWAITING;
              /* Wait for as long as the flag is set.  An ABA situation is
                 harmless because the flag is just about the state of
                 __readers, and all threads set the flag under the same

September 30, 2021October 10, 2021

TwinVisor: Hardware-isolated Confidential Virtual Machines for ARM @SOSP2021

The foundation of trustzone

Here's the graph extrated from [1], essentially to tell the root of trust. A secure system depends on every part in the system to cooperate. For SGX, the Trusted Computing Base(Trusted Counter/ RDRAND/ hardware sha/ ECDSA) is the memory region allocated from a reserved memory on the DRAM called the Enclave Page Cache (EPC), which is initialized at the booting time. The EPC is currently limited to 128MB (in IceLake, was raised to 1TB with weakened HW support. Only 96 MB(24K*4KB pages) could be used, 32MB is for various metadata.) To prevent distruptions by physical attack or previledge software attack from cacheline-granularity modification, every cacheline can be assoiciated with a Message Authentication Code(MAC), but this does not prevent replay attack. To extend the trusted region of memory and do not introduce huge overheads, one solution is put the construct the merkle tree, that every cacheline of leaf is assured by MAC and root MAC is stored at EPC. Transaction Memory Abort with SGX can be leveraged to do page fault side-channel. The transaction memory page fault attack on peresistent memory is still under research.

For Riscv, we have currently 2 proposals - Keystone and Penglai for enclave and every vendor has different implementations. Keystone essentially utilize M- mode PMP limited special registers the control permissions of U- mode and S- mode accesses to a specified memory region. The number/priority of PMP could be pre-configured. and the addressing is mode of naturally aligned power-of-2 regions (NAPOT) and base and bound strategy. The machine mode is unavoidable introduce physical memory fragmentation and waste: everytime you enter another enclave, you have to call M- mode once. Good Side is S/U- Mode are both enclaved by M- mode with easy shared buffer and enclave operation throughout all modes. Penglai has upgraded a lot since its debut(from 19 first commit on Xinlai's SoC to OSDI 21). The originality for sPMP is to reduces the TCB in the machine mode and could provides guarded page table(locked cacheline), Mountable Merkle Tree and Shadow Fork to speed up. However, it introduce the double PMPs for OS to handle, and overhead of page table walk could still be high, which makes it hard to be universal.

Starting from Penglai, IPADS continuously focus on S- mode Enclaves. One of the application may be the double hypervisor in the secure/non-secure S- Mode. The Armv8.4 introduce the both secure and non-secure mode hypervisor originally to support cloud native secure hypervisor. TwinVisor is to run unmodified VM images both as normal and confidential VMs. Armv9 introduce the Confidential Compute Architecture(CCA), another similar technology. TwinVisor is an pre-opensource implementation of it.

supported trustzone extention starting from Armv7.

AMBA-AXI bus extension, adding the flags secure read and write address lines: AWPROT and ARPROT.
extension of controller (or extension of master), adding SCR.NS bits inside ARM Core, so that operations initiated by ARM Core can be marked as "access initiated as secure or access initiated as non-secure".
TZPC extension, TZPC is added to the AXI-TO-APB side to configure the apb controller privileges (or secure controller).
TZASC extension, in the DDRC (DMC) on top of the addition of a memory filter.
MMU support for security extensions:
1. TTBRx_EL0, TTBRx_EL1 extension: In Armv7, these two registers are banked for secure and non-secure attributes, that is, there is a set of such registers in the secure and non-secure worlds, so in linux and tee, each can maintain a memory page table of its own. The secureos and monitor could share the page table if they are both 64 bits.
2. cache extension: add the (non-)secure attributes.
3. VSTTBR_EL2 extension: Since Armv8.4, when the non-secure world uses TTBR_EL2 to translate the address, the entry attribute is checked to be secure and will be translated by itself.
GIC to secure extensions. The trap is devided into group0, secure group1 and non-secure group1. The group0 and secure group1 will not trap to linux.

Proposed Attack Model

The author mentioned physical attack or previledge software attack from N-VM to S-VM, this can be prevent by controlling the transmission channel.

TACTOC attack led by Shared Pages for General-purpose Registers, check-after-load way [50] by reading register values before checking them.

Design

Horisontal trap: modifies the N-visor to logically deprivledge N-visor without sharing the data. Exeptional Return(ERET) is the only sensitive instruction affect trusted chain, it intercepted by TZASC and repoted to S-visor.
Shadow S2PT: shadow page table of VSTTBR_EL2, used in kvm, too. It has page fault with different status when in different world.
Split Continuous Memory Allocation: Tricks to improves utilization and speed up memory management in Twinvisor. In linux, buddy allocator used to decide a continuous memory is big enough for boot and do CMA, this is for better performance of IOMMU that require physical memory to be continuous. (This deterministic algorithm makes it easy for memory probing and memory dump by e.g. row hammer/DRAMA ).
Efficient world switch: change NS bit in SCR_EL3 register in EL3, side core polling and shared memory to avoid context switches
Shadow PV I/O: use shadow I/O rings and shdow DMA buffer to be transparent to S-VMs. reduce ring overhead by do IRQ only when WFx instructions.

Experiment

Suppose

The world switch does not happen so frequently.

Hardware

Kirin 990. (Not scalable to Big machines, because KunPeng920 is not yet Armv8.4, scability is not convincible)

Reference

A Survey on RISC-V Security: Hardware and Architecture TAO LU, Marvell Semiconductor Ltd., USA
MIT 6.888
ShieldStore: Shielded In-memory Key-value Storage with SGX
Improving the Performance and Endurance of Encrypted Non-volatile Main Memory through Deduplicating Writes
RiscV Spec 1.11
Armv7 TZ
lwn CMA and IOMMU

August 24, 2021February 16, 2022

Phosphor - My Pitfalls writing dependency

Currently, I'm busy writing emails for my Ph.D and taking TOEFL and taking care of the Quantum ESPRESSO library changing and MadFS Optimization, so it may waste some time. Till now, I have to apply the DTA tool of phosphor for the java order dependency project.

about surfire integration into normal tests.

Maven extension
- Integration into Maven add the redirector
  - Insert phosphor plugin one class by one into.
  - Configuration to the phosphor
  - Class Visitor, Method Visitor, Adaptor Mode Visitor
- Mutable field in the Dependency Tainter
  - Start the taint for some place attach the tainted check after the test
  - Assert the junit stuf in check=omparison.
  - Brittle assertions in check(Taint) recursively.
- Output the tainted version into the sufire executable folder
Debug
- mvn install -Dmaven.surefire.debug -f /Volumes/DataCorrupted/project/UIUC/bramble/integration-tests/pom.xml and attach the trace point.
  - Start from the maven compilation.

Brittle Assertion

This outputs only the dependency for one test introduced in Oracle Polish JPF. For dependenct between test1 and test2,

For NPE, get the pair by idflakies test first.

JVM Asm

Reference

https://www.kingkk.com/2020/08/ASM%E5%8E%86%E9%99%A9%E8%AE%B0/

June 7, 2021November 18, 2023

Rethinking File Mapping Structures for Persistent Memory.

April 12, 2021November 18, 2023