First-generation Memory Disaggregation for Cloud Platforms @Arxiv

Why CXL-based disaggregation:

  1. Memory inefficiency: platform-level memory stranding.
  2. Cloud vendors' constraints on memory disaggregation: the system must require
    no modifications to customer workloads or the guest OS; the system must be
    compatible with virtualization acceleration techniques; and the system must
    be available as commodity hardware.

CXL greatly facilitates fast and deployable disaggregated memory, but it raises open questions:

  1. What should the memory pool look like?
  2. How to balance pool size against the higher latency of larger pools?
  3. How does the provider manage and expose the pooled memory to guest OSes?
  4. How much additional memory latency can cloud workloads tolerate?
  5. How should the provider schedule VMs on machines with CXL memory?

A CXL pool connects to different nodes over 2-32 ports through an external memory controller (EMC).

Ordinary NUMA affinity does not apply to CXL memory, because memory on CXL has no cores of its own. To still present a NUMA-like view, the paper exposes pool memory to a VM's guest OS as a zero-core virtual NUMA (zNUMA) node. This takes the cache coherency inside CXL into consideration, but at a coarser granularity, with only hot/cold states for memory. They also build a prediction model for the latency impact on VM memory accesses.
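As a rough illustration of what a zero-core node looks like from inside a guest, the sketch below lists NUMA nodes that have memory but no CPUs. It assumes a Linux guest and the standard sysfs layout under `/sys/devices/system/node`, where a CPU-less node has an empty `cpulist`. The function name `zero_core_nodes` is made up for this note and is not from the paper.

```python
import glob
import os


def zero_core_nodes(sysfs_root="/sys/devices/system/node"):
    """Return ids of NUMA nodes whose cpulist is empty (memory-only nodes)."""
    nodes = []
    for path in glob.glob(os.path.join(sysfs_root, "node[0-9]*")):
        try:
            with open(os.path.join(path, "cpulist")) as f:
                cpus = f.read().strip()
        except OSError:
            # Not a readable node directory; skip it.
            continue
        if not cpus:
            # No CPUs attached: this is how a zNUMA-style node would appear.
            nodes.append(int(os.path.basename(path)[len("node"):]))
    return sorted(nodes)


if __name__ == "__main__":
    print(zero_core_nodes())
```

Because the node has no cores, a guest scheduler that prefers local memory would be expected to fill the regular nodes first and touch the zero-core node only as overflow, which is the behavior the zNUMA idea relies on.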

CXL standardizes a protocol for memory pooling, which makes it a better fit for disaggregation.

(Someone asked whether cache coherency across heterogeneous ARM/Intel cores gives the same semantics for loads/stores to the pool. The answer: each core observes its own accesses in the same order, so the pool does not break its original ordering.)

Pool memory management is complicated by hardware resource constraints, which force memory allocation to happen at 1 GB granularity. This poses memory fragmentation challenges similar to those of huge pages.

Currently the scheduler policy on CXL is not defined. Does CXL have something like page coloring?
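To make the 1 GB granularity constraint concrete, here is a minimal sketch of a pool that hands out whole 1 GB slices to hosts. The `Pool` class, the host names, and the sizes are hypothetical and only illustrate the rounding and fragmentation effects; this is not the paper's allocator.

```python
import math

SLICE_GB = 1  # allocation granularity mentioned in the notes


class Pool:
    def __init__(self, total_gb):
        # owner[i] is the host that slice i is assigned to, or None if free.
        self.owner = [None] * (total_gb // SLICE_GB)

    def allocate(self, host, request_gb):
        """Assign whole slices to host; return slice indices, or None if full."""
        needed = math.ceil(request_gb / SLICE_GB)
        free = [i for i, o in enumerate(self.owner) if o is None]
        if len(free) < needed:
            return None
        chosen = free[:needed]
        for i in chosen:
            self.owner[i] = host
        return chosen

    def release(self, host):
        """Return all slices owned by host to the pool."""
        for i, o in enumerate(self.owner):
            if o == host:
                self.owner[i] = None

    def free_gb(self):
        return self.owner.count(None) * SLICE_GB


pool = Pool(total_gb=8)
a = pool.allocate("host-a", 2.5)  # rounds up to 3 slices
b = pool.allocate("host-b", 4)    # exactly 4 slices
internal_waste_gb = len(a) * SLICE_GB - 2.5
print(a, b, pool.free_gb(), internal_waste_gb)
```

The rounding in `allocate` is where internal waste comes from: a 2.5 GB request occupies three slices. A slice can also only go back to the pool once everything on it is freed, which is the same kind of pressure huge pages create.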

Memory stranding: when the DRAM-to-core ratio of arriving VMs does not match that of the server's resources, tight packing becomes more difficult and resources may end up stranded.
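A small worked example of stranding, as a sketch under made-up numbers: a server with 8 GB per core receives VMs that ask for 4 GB per core. Once all cores are rented, the leftover memory cannot be sold. The `Server` class and the VM sizes are illustrative assumptions, not figures from the paper.

```python
from dataclasses import dataclass


@dataclass
class Server:
    cores: int
    mem_gb: int
    used_cores: int = 0
    used_mem_gb: int = 0

    def fits(self, cores, mem_gb):
        return (self.used_cores + cores <= self.cores
                and self.used_mem_gb + mem_gb <= self.mem_gb)

    def place(self, cores, mem_gb):
        self.used_cores += cores
        self.used_mem_gb += mem_gb

    def stranded_mem_gb(self):
        # Memory counts as stranded only once no cores are left to rent with it.
        if self.used_cores < self.cores:
            return 0
        return self.mem_gb - self.used_mem_gb


def pack(server, vms):
    """Place (cores, mem_gb) requests in arrival order; return rejected ones."""
    rejected = []
    for cores, mem_gb in vms:
        if server.fits(cores, mem_gb):
            server.place(cores, mem_gb)
        else:
            rejected.append((cores, mem_gb))
    return rejected


server = Server(cores=48, mem_gb=384)  # 8 GB per core
vms = [(8, 32)] * 6                    # arrivals at 4 GB per core
rejected = pack(server, vms)
print(server.stranded_mem_gb())        # 192
```

All 48 cores are in use after six VMs, but only 192 GB of the 384 GB is allocated, so the other 192 GB is stranded. Moving part of each server's DRAM into a shared pool is meant to shrink that leftover.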

Disaggregation System Design

The design must support the performance requirements PR1 VM performance, PR2 resource efficiency, and PR3 small blast radius, and the functional requirements FR1 customer inertia, FR2 compatibility (SR-IOV/DDA), and FR3 commodity hardware.

Hardware configuration


The EMC is a CXL.mem-enabled device.

Software configuration
