Copy-on-Pin: The Missing Piece for Correct Copy-on-Write @ASPLOS’23


Nadav has been working on Intel's virtualization extensions for VMware, along with security mitigations and debugging support built on those extensions. His work also includes userspace remote memory paging [2], which gives VMware a better basis for disaggregation. They have also investigated IOMMU vulnerabilities in the presence of DMA [1] and the performance cost of remote TLB shootdowns (updating a page table incurs a TLB shootdown), which they reduce with concurrent flushing, early acknowledgment, cacheline consolidation, and in-context TLB flushes [3].

This paper examines the interaction between COW and pinned pages. Pinned pages cannot be moved or paged out, so that the OS or I/O devices can access them directly.

Basically, pinned pages must be prevented from being COW-shared. The paper considers how COW interacts with other prevalent OS mechanisms such as POSIX shared mappings, caching references, and page pinning. It defines an invariant: if a page has any private writable mapping, that mapping must be its single, exclusive mapping. It also provides test cases that evaluate COW and page pinning via O_DIRECT read()/write() in combination with fork() and write accesses.
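To make the invariant concrete, here is a minimal sketch of a checker over a toy model of page mappings. It reflects my reading of the invariant rather than anything taken from the paper's code; `Mapping`, its fields, and `violations` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """One page-table entry pointing at a physical page (toy model, hypothetical)."""
    process: str
    page: int        # physical page id
    private: bool    # MAP_PRIVATE-style mapping
    writable: bool   # the entry currently allows writes


def violations(mappings):
    """Return the page ids that break the invariant: a page with any
    private writable mapping must not be mapped by anyone else."""
    by_page = {}
    for m in mappings:
        by_page.setdefault(m.page, []).append(m)
    bad = []
    for page, ms in by_page.items():
        has_private_writable = any(m.private and m.writable for m in ms)
        if has_private_writable and len(ms) > 1:
            bad.append(page)
    return sorted(bad)


# After a correct fork(), both sides map page 7 read-only.
ok = [Mapping("parent", 7, True, False), Mapping("child", 7, True, False)]
# Broken state: the parent can write a page the child still maps.
broken = [Mapping("parent", 7, True, True), Mapping("child", 7, True, False)]
print(violations(ok), violations(broken))
```

In the broken state a write by the parent would be visible to the child, which is exactly the kind of leak the invariant is meant to rule out.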

For the implementation, they built a mechanism, similar in spirit to dynamic taint analysis, that marks each page with an exclusive flag (one could possibly turn this into a hardware-software co-design with CXL, at cacheline or page granularity). The design also introduces refinements to avoid unnecessary copies and handles swapping, migration, and read-only pinning correctly. An implementation of this design was integrated into upstream Linux 5.19 with 747 added and 340 removed lines of code. An evaluation of RelCOP against two prior COW handling schemes shows no noticeable overhead, and RelCOP outperforms PreCOP by up to 23% in the sequential access benchmark and 6% in the random access benchmark.
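The role of the exclusive flag can be sketched with a small simulation. This is a toy model under my own assumptions, not the kernel code: `Page`, `Process`, and their methods are hypothetical, each process holds a single page, and the refinements mentioned above (avoiding unnecessary copies, swapping, migration, read-only pinning) are left out.

```python
class Page:
    def __init__(self, data):
        self.data = bytearray(data)
        self.exclusive = True   # owned by exactly one private mapping
        self.pinned = False     # e.g. target of an in-flight O_DIRECT I/O


class Process:
    def __init__(self, page):
        self.page = page

    def pin(self):
        # Only an exclusive page may be pinned, so break sharing first.
        if not self.page.exclusive:
            self.page = Page(self.page.data)
        self.page.pinned = True
        return self.page

    def fork(self):
        if self.page.pinned:
            # The pinned page stays with the parent; the child gets a copy now.
            return Process(Page(self.page.data))
        # Otherwise share the page and drop exclusivity on it.
        self.page.exclusive = False
        return Process(self.page)

    def write(self, offset, value):
        if not self.page.exclusive:
            # COW fault on a shared page: take a private copy.
            self.page = Page(self.page.data)
        self.page.data[offset] = value


# Pinned case: the device keeps writing into the page the parent still uses.
parent = Process(Page(b"aaaa"))
dma_target = parent.pin()
child = parent.fork()
parent.write(0, ord("b"))
dma_target.data[1] = ord("d")   # simulated device write through the pin

# Unpinned case: ordinary COW sharing followed by a copy on write.
p = Process(Page(b"x"))
c = p.fork()
c.write(0, ord("y"))
```

Because the parent's pinned page is never replaced, the simulated device write lands in memory the parent still maps. In this simplified model the last sharer of a page also copies on its next write, which is the kind of unnecessary copy the refinements are meant to avoid.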


  1. Characterizing, Exploiting, and Detecting DMA Code Injection Vulnerabilities in the Presence of an IOMMU @EuroSys'20
  3. Don't shoot down TLB shootdowns!