Multi-Generation LRU

HeMem argues that sampling based on page-table access bits is slow, so it uses PEBS instead, while TPP builds on AutoNUMA and relies on the kernel's LRU lists to tell hot pages from cold ones. I then found the MGLRU approach, which can additionally pick out aging pages while keeping the better spatial locality of scanning access bits in the page tables (by contrast, an rmap walk targets a single page and does not try to profit from discovering a young PTE).
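To make the generation idea concrete, here is a minimal toy sketch in Python. It is not the kernel's implementation: the class name ToyMGLRU, the max_gens parameter, and the set standing in for PTE accessed bits are all assumptions made for illustration. One aging pass sweeps every tracked page once (a stand-in for a linear page table scan), promotes pages whose accessed bit is set into a new youngest generation, and clears the bits.

```python
class ToyMGLRU:
    """Toy model of generations (hypothetical, not kernel code).

    Each page carries a generation number; a higher number means younger.
    """

    def __init__(self, max_gens=4):
        self.max_gens = max_gens   # assumed cap on live generations
        self.max_seq = 0           # sequence number of the youngest generation
        self.gen_of = {}           # page -> generation number
        self.accessed = set()      # stand-in for the PTE accessed bit

    def add(self, page):
        # New pages start in the current youngest generation.
        self.gen_of[page] = self.max_seq

    def touch(self, page):
        # Hardware would set the accessed bit on a memory access.
        self.accessed.add(page)

    def age(self):
        # One sweep over all pages: create a new youngest generation and
        # move every page with its accessed bit set into it.
        self.max_seq += 1
        min_seq = max(0, self.max_seq - self.max_gens + 1)
        for page in self.gen_of:
            if page in self.accessed:
                self.gen_of[page] = self.max_seq
            elif self.gen_of[page] < min_seq:
                # Fold generations that fell out of the window into the oldest.
                self.gen_of[page] = min_seq
        self.accessed.clear()

    def evict(self, count):
        # Victims come from the oldest generations first.
        victims = sorted(self.gen_of, key=lambda p: self.gen_of[p])[:count]
        for page in victims:
            del self.gen_of[page]
        return victims
```

A short usage example: add pages 0 to 5, touch pages 0 and 1, call age(), touch page 0 again, call age() once more, then call evict(2). The victims come from the pages that were never touched, since they are still in generation 0.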

The focus is on both memory-backed files, which give detailed results, and more general cases such as anonymous pages accessed through the page tables, for which they make assumptions both with and without temporal locality.

Overhead Evaluation through eBPF

Does it match the LRU performance?

According to the DynamoRIO results, keeping the 5% of pages chosen by a perfect LRU in local memory gets to 95% of the performance.
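A claim like this can be checked offline by replaying a page-access trace through an exact LRU of a given capacity. The sketch below is only an illustration under stated assumptions: the function name lru_hit_ratio is hypothetical, the trace is assumed to be a plain list of page numbers (the actual DynamoRIO trace format is not modeled here), and the local-memory hit ratio is used as a rough proxy for performance.

```python
from collections import OrderedDict


def lru_hit_ratio(trace, capacity):
    """Replay a page trace through an exact LRU holding `capacity` pages.

    Returns the fraction of accesses that hit in the LRU-managed
    (local) tier. `trace` is assumed to be a list of page numbers.
    """
    local = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for page in trace:
        if page in local:
            hits += 1
            local.move_to_end(page)        # mark as most recently used
        else:
            local[page] = None
            if len(local) > capacity:
                local.popitem(last=False)  # evict the least recently used
    return hits / len(trace) if trace else 0.0
```

With a real trace, capacity would be set to 5% of the number of distinct pages, and the resulting hit ratio compared against the run where everything fits in local memory.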