Here's some thoughts for CXL outer devices with CPU support, since I think the problem in 2024 is not merely another computational storage or Near Data Computation. It's neither merely the CPU support because the Latency behind loading the data is un tolerable. To support the both side, we still need to expose the Mojo-like superset support compared to Python. CXL is also a transparent tiering, caching, remote memory or cache management problem. The most meaningful integration of CXL is rewrite the application and let the data sharing between multiple host to be minimized under 256MB and indexing huge amount of memory pool. Then we may come to kernel and driver integration that how we can transparently manage the remote cache. Workloads like VectorDB, MCF(A motor graph data workloads) or GAP can do address compression in the driver level. We also see radix tree management for paging or bloom filter in rMMU or rTLB support.