Scheduling asynchronous page migration based on the access pattern.
Hardware support is crucial
- Page access scans alone have high latency
- PMU address sampling drastically reduces promotion latency (the time from first access to promotion)
- Earlier promotion improves performance
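
As a rough illustration of the sampling path above, the sketch below folds a batch of sampled addresses into per-page counts and returns the pages that cross a hotness threshold. The names `hot_pages`, `PAGE_SHIFT`, and `threshold` are hypothetical, and 4 KiB pages are assumed; how the samples are actually collected from the PMU is not shown.

```python
from collections import Counter

PAGE_SHIFT = 12  # assumption: 4 KiB pages


def hot_pages(sampled_addrs, threshold):
    """Return page numbers sampled at least `threshold` times, hottest first."""
    # Map every sampled address to its page number and count the hits per page.
    counts = Counter(addr >> PAGE_SHIFT for addr in sampled_addrs)
    # most_common() yields (page, count) pairs in descending count order.
    return [page for page, n in counts.most_common() if n >= threshold]


if __name__ == "__main__":
    samples = [0x1000, 0x1008, 0x2000, 0x1FF0, 0x3000]
    print(hot_pages(samples, threshold=2))
```

A page becomes a promotion candidate as soon as enough samples land on it, which is where the latency advantage over periodic page access scans would come from.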
Per-application policy is crucial
- The ufard they run in userspace follows a per-process control flow
- Each application's page migration policy should be kept separate, with conflicts between policies resolved using PGO
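
One way to picture per-application separation is a policy table keyed by process ID, where each entry might be filled in from an offline profile. This is only a sketch under that assumption: `MigrationPolicy`, `PolicyTable`, `pick_promotions`, and the default values are all hypothetical names and numbers, not part of ufard or any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationPolicy:
    hot_threshold: int   # samples needed before a page counts as hot
    max_promotions: int  # cap on pages promoted per round


# Assumed fallback for processes without a profile-derived policy.
DEFAULT_POLICY = MigrationPolicy(hot_threshold=8, max_promotions=64)


class PolicyTable:
    """Keeps one migration policy per process so policies do not interfere."""

    def __init__(self):
        self._by_pid = {}

    def set_policy(self, pid, policy):
        self._by_pid[pid] = policy

    def policy_for(self, pid):
        return self._by_pid.get(pid, DEFAULT_POLICY)


def pick_promotions(policy, page_counts):
    """Select the hottest pages allowed by one process's policy."""
    hot = [(n, page) for page, n in page_counts.items()
           if n >= policy.hot_threshold]
    # Hottest first; ties broken by page number for a deterministic order.
    hot.sort(key=lambda t: (-t[0], t[1]))
    return [page for _, page in hot[:policy.max_promotions]]
```

Because each process looks up its own entry, a latency-sensitive application can promote aggressively while a batch job in the same machine keeps a conservative threshold.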
Thoughts
- PGO rather than online PEBS? PEBS's overhead is huge, even if you run it in a separate thread or set the sample period to 10k or 2m.
- The TLB cost should be hidden by a CXL.cache atomic cacheline exchange, with no need to update the page table. The page table reuse distance should also be considered, since either way of updating the page table, (1) marking the page read-only and migrating or (2) an atomic exchange, has to be timed against the next use of the page.
- Would eBPF be a better choice for controlling all the policies, with policy offloaded to the RC/EP?
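
The reuse-distance consideration above can be made concrete with a small sketch. It uses one common definition, the number of distinct pages touched between two consecutive accesses to the same page, which may not be exactly the metric intended in these notes. The function name `reuse_distances` and the trace format (a list of page numbers) are assumptions for illustration.

```python
def reuse_distances(trace):
    """For each access, return the number of distinct pages touched since the
    previous access to the same page, or None for a first access."""
    last_seen = {}  # page -> index of its most recent access
    distances = []
    for i, page in enumerate(trace):
        if page in last_seen:
            # Distinct pages strictly between the previous access and this one.
            between = trace[last_seen[page] + 1:i]
            distances.append(len(set(between)))
        else:
            distances.append(None)
        last_seen[page] = i
    return distances


if __name__ == "__main__":
    print(reuse_distances([1, 2, 3, 1, 2, 2]))
```

A page with a long reuse distance leaves more slack to finish a migration before its next use, while a short distance suggests the update would collide with an imminent access. The slicing approach is quadratic in the worst case and is meant only to show the definition.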