ASPLOS23 attendance



Xiangshan Tutorial



Mostly an introduction to their FireSim setup, so I just asked them when they will update it for the F1 VU9P.


Performance vs. Correctness When Writing Low-Level HPC Code

Exploring Performance of Cache-Aware Tiling Strategies in MLIR Infrastructure

Intel's oneDNN approach on MLIR.

PyAIE: A Python-based Programming Framework for Versal ACAP AI Engines


A Scalable Formal Approach for Correctness-Assured Hardware Design

By the master Jin Yang, who already presented this at AHA.

Designing a Dataflow Hardware Accelerator with an Abstract Machine

ASTRA-sim: Enabling SW/HW Co-Design Exploration for Distributed Deep Learning Training Platforms


Accelerating Sparse Tensor Algebra by Overbooking Buffer Occupancy

Detecting Microarchitectural Vulnerabilities via Fuzz Testing of White-box CPUs

Uses fuzzing to find Store Bypass.

ConstSpec: Mitigating Cache-based Spectre Attacks via Fine-Grained Constant-Time Accesses

SMAD: Efficiently Defending Against Transient Execution Attacks

This one is by a student of the mentor I was assigned this time. The mentor is well known for GPU side-channel work.

FireSim and Chipyard User/Developer Workshop

Integrating a high performance instruction set simulator with FireSim to cosimulate operating system boots, by Tenstorrent


Session 1B: Shared Memory/Mem Consistency

The chair of this session is admit, that guy at VMware who is the best at permuting and combining Intel extensions.

Cohort: Software-Oriented Acceleration for Heterogeneous SoCs

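This paper defines its own L1/L2 caches and a crypto accelerator on an FPGA, then works out how to tie them together. With CXL.cache, that integration would not be a problem.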

Probabilistic Concurrency Testing for Weak Memory Programs

A PCT framework that uses the SC (sequential consistency) specification as assertions to find bugs.


It hits bugs faster.

The heuristic for h is good enough for data structure tests. The assertion tests look great. When I was at ShanghaiTech, there were people using the same tool on PM.
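To make the "SC specification as assertion" idea concrete, here is a minimal sketch. It is not the paper's tool and it is not PCT itself (PCT samples schedules randomly, while this enumerates them exhaustively). It only illustrates the oracle side: enumerate all sequentially consistent interleavings of the classic store-buffering litmus test, then flag any observed outcome outside that set. The names `THREADS`, `sc_outcomes`, and `violates_sc` are made up for this example.

```python
from itertools import combinations

# Store-buffering litmus test, with x and y initially 0:
#   Thread 0: x = 1; r0 = y
#   Thread 1: y = 1; r1 = x
THREADS = [
    [("store", "x", 1), ("load", "y", "r0")],
    [("store", "y", 1), ("load", "x", "r1")],
]


def sc_interleavings(threads):
    """Yield every interleaving of two threads that preserves program order."""
    n0, n1 = len(threads[0]), len(threads[1])
    total = n0 + n1
    # Choose which positions of the global order belong to thread 0.
    for slots in combinations(range(total), n0):
        chosen = set(slots)
        i0 = i1 = 0
        schedule = []
        for pos in range(total):
            if pos in chosen:
                schedule.append(threads[0][i0])
                i0 += 1
            else:
                schedule.append(threads[1][i1])
                i1 += 1
        yield schedule


def run(schedule):
    """Execute one interleaving against a single shared memory (the SC model)."""
    memory = {"x": 0, "y": 0}
    regs = {}
    for op, var, arg in schedule:
        if op == "store":
            memory[var] = arg
        else:
            regs[arg] = memory[var]
    return regs["r0"], regs["r1"]


def sc_outcomes(threads):
    """Return the set of (r0, r1) results reachable under SC."""
    return {run(s) for s in sc_interleavings(threads)}


def violates_sc(observed, threads=THREADS):
    """The assertion: an observed outcome outside the SC set is a weak behavior."""
    return observed not in sc_outcomes(threads)
```

A testing framework would run the program under many schedules on the weak memory model and call something like `violates_sc` on each result. Here the outcome (0, 0) is the one SC forbids, so observing it signals a weak-memory behavior.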


MC Mutants: Evaluating and Improving Testing for Memory Consistency Specifications

Transform disallowed memory to weak memory label.

A binary translator.

Protect the System Call, Protect (Most of) the World with BASTION

Session 2A: Compiler Techniques & Optimization

SPLENDID: Supporting Parallel LLVM-IR Enhanced Natural Decompilation for Interactive Development

Beyond Static Parallel Loops: Supporting Dynamic Task Parallelism on Manycore Architectures with Software-Managed Scratchpad Memories

Graphene: An IR for Optimized Tensor Computations on GPUs

Coyote: A Compiler for Vectorizing Encrypted Arithmetic Circuits

NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers


Session 5B (Storage)

Session 7A (Deep Learning Systems)