TMO: Transparent Memory Offloading in datacenters

Both of the papers are from Dimitrios

Memory offloading

Because the memory footprint on a single node is huge, part of it has to be offloaded to far memory.

To do that, they have to model what the memory footprint looks like. The previous work, built on zswap, has only a single slow memory tier (compressed memory) and relies on offline application profiling, whose only metric is the page-promotion rate.

Transparent memory offloading

The memory tax comes from infrastructure-level functions such as packaging, logging, and profiling, and from microservice functions such as routing and proxy. The memory tax is the primary target of offloading because its SLA is more relaxed than that of the application's own memory.

TMO looks at the resulting performance signal, Pressure Stall Information (PSI), to decide how much memory to offload.

PSI tracks memory, IO, and CPU pressure. They feed these metrics into a controller that adjusts the limits of each workload through cgroup; they call this controller Senpai.
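To make the feedback loop concrete, here is a minimal sketch of one control step. It parses text in the format the kernel exposes under `/proc/pressure/memory` and computes how many bytes to reclaim next. The proportional rule, the function names, and the constants `psi_threshold` and `reclaim_ratio` are my own illustrative assumptions, not necessarily what Senpai ships with.

```python
def parse_psi(text):
    """Parse PSI text (same format as /proc/pressure/memory) into nested dicts."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()  # kind is "some" or "full"
        result[kind] = {k: float(v) for k, v in (f.split("=") for f in fields)}
    return result


def reclaim_bytes(current_mem, psi_some, psi_threshold=0.1, reclaim_ratio=0.0005):
    """Hypothetical proportional rule: reclaim less as pressure nears the threshold,
    and nothing at all once pressure reaches or exceeds it."""
    headroom = max(0.0, 1.0 - psi_some / psi_threshold)
    return int(current_mem * reclaim_ratio * headroom)


# Sample input; on a real machine this would be read from /proc/pressure/memory.
sample = (
    "some avg10=0.05 avg60=0.00 avg300=0.00 total=12345\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
)
psi = parse_psi(sample)
step = reclaim_bytes(current_mem=8 * 2**30, psi_some=psi["some"]["avg10"])
print(f"reclaim {step} bytes in this period")
```

A real agent would run this periodically per cgroup and apply the result through the cgroup v2 memory interface. The sketch only computes the step size.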

IOCost reclaims infrequently used pages to SSD.

Reference

  1. Jing Liu's blog
  2. Software-Defined Far Memory in Warehouse-Scale Computers
  3. Cerebros: Evading the RPC Tax in Datacenters
  4. Beyond malloc efficiency to fleet efficiency: a hugepage-aware memory allocator

TPP: Transparent Page Placement for CXL-Enabled Tiered-Memory

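CXL here can be understood as an external cache-coherent agent, that is, an iMC with cache coherence. Its bandwidth is certainly higher than NUMA's, yet this work only improves performance by 18%. Their strategy is to collect data with the PMU, then load and flush pages according to whether they are hot or cold, giving each page a temperature grading. Personally, I think there must be a good algorithm in the spirit of page coloring that handles this and gets a higher number than that.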
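As a toy illustration of temperature grading, the sketch below counts sampled page accesses, labels each page hot or cold, and derives which pages to promote to local DRAM and which to demote to CXL memory. This is not TPP's actual algorithm. The threshold, the tier names, and the function names are all hypothetical.

```python
from collections import Counter


def grade_pages(samples, hot_threshold=2):
    """Label each sampled page 'hot' or 'cold' by its access count (assumed threshold)."""
    counts = Counter(samples)
    return {page: ("hot" if n >= hot_threshold else "cold") for page, n in counts.items()}


def plan_migration(placement, grades):
    """Promote hot pages that sit in CXL memory; demote cold pages that sit in local DRAM.
    Pages that never appear in the samples are treated as cold."""
    promote = sorted(p for p, tier in placement.items()
                     if tier == "cxl" and grades.get(p) == "hot")
    demote = sorted(p for p, tier in placement.items()
                    if tier == "local" and grades.get(p, "cold") == "cold")
    return promote, demote


# Page numbers as they might come out of PMU-style access sampling (made-up data).
samples = [0x10, 0x10, 0x20, 0x10, 0x30, 0x30]
placement = {0x10: "cxl", 0x20: "local", 0x30: "local", 0x40: "local"}

grades = grade_pages(samples)
promote, demote = plan_migration(placement, grades)
print("promote:", [hex(p) for p in promote])
print("demote:", [hex(p) for p in demote])
```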

IOring Windows at first sight and migration to `monoio`

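Recently I have been writing a cross-platform downloader with LemonHX. What we want is a coroutine scheduler with deterministic latency, so we settled on monoio, which ByteDance open-sourced, and plan to contribute the Windows part.

The parts that mainly need cross-platform abstraction are already written. GAT has just landed in mainline, so contributing here actually feels like the more economical option. ByteDance does not seem to have started using it internally either; they have only run some tests.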