SegCache: a memory-efficient and scalable in-memory key-value cache for small objects (NSDI '21)

The paper is by Yue Lu from Twitter, with collaborators Juncheng Yang and Rashmi from CMU. Pelikan (starting 2020), mostly written by Brian Martin (I really like his obsession with Rust), is the current production in-memory cache for small data at Twitter. It was inspired by the traces captured with eBPF last year, which allow a finer-grained analysis of the memory cache.


[Program Analysis] Intraprocedural Analysis

Intraprocedural analysis has to make overly conservative assumptions about method calls, which is where CHA (class hierarchy analysis) comes in. All the results of the analysis should be safe. According to lattice theory, this makes both must and may analyses less precise. So interprocedural analysis is very important: it looks at the data flow inside each BB and uses the call graph to follow how data flow propagates between functions, which raises the precision of the analysis.
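To make the call-graph part concrete, here is a minimal sketch of how CHA can resolve a virtual call: the declared type of the receiver and all of its subclasses are treated as possible runtime types, and each one is dispatched to the nearest class that declares the method. The `ClassInfo`, `Hierarchy`, `dispatch`, and `resolve_virtual` names are made up for this sketch and do not come from any particular framework.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, minimal class-hierarchy model (illustration only).
struct ClassInfo {
    std::string super;                    // empty for a root class
    std::vector<std::string> subclasses;  // direct subclasses
    std::set<std::string> methods;        // methods declared in this class
};
using Hierarchy = std::map<std::string, ClassInfo>;

// Walk up from `cls` until a class that declares `m` is found.
// Returns "" if no class on the path declares it.
std::string dispatch(const Hierarchy &h, std::string cls, const std::string &m) {
    while (!cls.empty()) {
        const ClassInfo &ci = h.at(cls);
        if (ci.methods.count(m))
            return cls + "::" + m;
        cls = ci.super;
    }
    return "";
}

// CHA for a virtual call on a receiver with declared type `decl`:
// `decl` and every transitive subclass may be the runtime type.
std::set<std::string> resolve_virtual(const Hierarchy &h, const std::string &decl,
                                      const std::string &m) {
    std::set<std::string> targets;
    std::vector<std::string> work{decl};
    while (!work.empty()) {
        std::string c = work.back();
        work.pop_back();
        std::string t = dispatch(h, c, m);
        if (!t.empty())
            targets.insert(t);
        for (const auto &s : h.at(c).subclasses)
            work.push_back(s);
    }
    return targets;
}
```

The result is a safe over-approximation: a call on a receiver declared as the base class gets an edge to every override in the hierarchy, even ones that can never be reached at runtime, which is exactly the imprecision that more precise call-graph construction tries to reduce.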


A command line string to the class constructor.

I need to take a class name as a string and construct the class with that name. I don't want to write a switch-case over every constructor, so let's do the hack!

I found the base implementation on Stack Overflow.

#include <map>
#include <string>

// Type-erased default constructor for T.
template <class T> void* constructor() { return (void*)new T(); }

struct factory
{
   typedef void*(*constructor_t)();
   typedef std::map<std::string, constructor_t> map_type;
   map_type m_classes;

   template <class T>
   void register_class(std::string const& n)
   { m_classes.insert(std::make_pair(n, &constructor<T>)); }

   void* construct(std::string const& n)
   {
      map_type::iterator i = m_classes.find(n);
      if (i == m_classes.end()) return 0; // or throw or whatever you want
      return i->second();
   }
};

factory g_factory;

#define REGISTER_CLASS(n) g_factory.register_class<n>(#n)

The problem is that it does not allow passing arguments to the constructor, and my classes take a module as their constructor argument.

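// Type-erased constructor for T that forwards the module pointer.
// module_ is already an M_ *, so no cast is needed on the argument.
template <class T, typename M_> void *constructor(M_ *module_) {
    return static_cast<void *>(new T{module_});
}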
template <typename M_> struct arg_to_pass {
    typedef void *(*constructor_t)(M_ *);
    typedef std::map<std::string, constructor_t> map_type;
    map_type m_classes;
    M_ *module;

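    // Take the pointer by value so that temporaries such as &mod are accepted.
    // Note: every registration overwrites the single shared module pointer.
    template <class T> void register_class(std::string const &n, M_ *module_) {
        module = module_;
        m_classes.insert(std::make_pair(n, &constructor<T, M_>));
    }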

    void *construct(std::string const &n) {
        auto i = m_classes.find(n);
        if (i == m_classes.end())
            return nullptr; // or throw or whatever you want
        return i->second(module);
    }
};

arg_to_pass<Module> pass_factory;

#define REGISTER_CLASS(n, m_) pass_factory.register_class<n>(#n, m_)

This resolves the problem.
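For comparison, a type-safe variant of the same idea can be sketched as follows. It assumes the registered classes share a common base class, so the registry can hand back a `std::unique_ptr` instead of a `void *`, and it passes the module at construction time instead of storing it. `Module`, `Pass`, `DeadCodePass`, `InlinePass`, `PassRegistry`, and `run_by_name` are all hypothetical names for illustration, not the classes from my project.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins for the real types; only the shape matters here.
struct Module {
    std::string name;
};

struct Pass {
    explicit Pass(Module *m) : module(m) {}
    virtual ~Pass() = default;
    virtual std::string id() const = 0;
    Module *module;
};

struct DeadCodePass : Pass {
    using Pass::Pass;
    std::string id() const override { return "DeadCodePass"; }
};

struct InlinePass : Pass {
    using Pass::Pass;
    std::string id() const override { return "InlinePass"; }
};

// Maps a class name to a factory that forwards the module to the constructor.
class PassRegistry {
  public:
    using Factory = std::function<std::unique_ptr<Pass>(Module *)>;

    template <class T> void register_class(const std::string &name) {
        factories_[name] = [](Module *m) { return std::make_unique<T>(m); };
    }

    // Returns nullptr for names that were never registered.
    std::unique_ptr<Pass> construct(const std::string &name, Module *m) const {
        auto it = factories_.find(name);
        if (it == factories_.end())
            return nullptr;
        return it->second(m);
    }

  private:
    std::map<std::string, Factory> factories_;
};

// Usage: look a pass up by the string given on the command line.
std::string run_by_name(const PassRegistry &reg, const std::string &name, Module *m) {
    auto pass = reg.construct(name, m);
    return pass ? pass->id() : "<unknown>";
}
```

The trade-off is that every registered class must derive from the common base, which the `void *` version does not require. In exchange the caller never has to cast the result, and ownership is explicit.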

FusionFS: Fusing I/O Operations using $CISC_{Ops}$ in Firmware File Systems

The paper is joint work involving my upperclassman Jian Zhang, who is currently pursuing a Ph.D. at Rutgers.

Current Hw-Sw co-design

  • Hardware Trend
    • Design a fast path to reduce latency.
  • Software Trend
    • Do kernel bypass/zero-copy

Good

FusionFS aggregates I/O ops into $CISC_{Ops}$; the fused and offloaded data ops are carried out on the co-processor on the storage device. The higher throughput comes with guarantees of fair resource management, crash consistency, and fast recovery.

  • Kernel FS pushes all the W/R through the VFS layer. This does not necessarily mean it is slow; the time is often spent waiting for heavyweight writeback, page cache misses, I/O queue locks held until the device is ready, or deep VFS call chains.

  • User FS may intercept some of the W/R and bypass the kernel. Some of the userspace semantic fusion is implemented using FUSE.

  • Device FS (before CrossFS, this was Firmware FS) makes the FS library call the firmware directly and wait until it can DMA to memory.

    • Good for Disaggregation & Concurrency throughput
    • Mainly for NVM when speed is high, not applicable to SSDs
  • This paper uses compute offloading, which is widely applied in SmartNICs. Storage plus data processing is transparent to the kernel; the kernel only needs to know that some of the results are fused.

    • write fusion
    • read fusion
    • data replacement for locality
    • PolarDB - PCIe layer compute offloading. I think it could be replaced by CXL.
  1. crc-append is interpreted into $CISC_{Ops}$. Basically, based on predefined rules, the co-processor is able to fuse most data operations, such as the LevelDB CRC or open-read-write-close.

  2. CFS I/O scheduling.

  3. Durability maintained by Micro Tx.
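The rule-based fusion in point 1 can be pictured with a small sketch. This is only my illustration of "match a predefined pattern of simple ops and replace it with one fused op"; the op names, the `FusionRule` struct, and the greedy matching strategy are assumptions, not FusionFS's actual encoding or algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical op names; the paper's real CISC-op format is not modelled here.
using Ops = std::vector<std::string>;

struct FusionRule {
    Ops pattern;        // consecutive simple ops to match
    std::string fused;  // single fused op that replaces them
};

// Does `pattern` occur in `ops` starting at index i?
static bool matches_at(const Ops &ops, std::size_t i, const Ops &pattern) {
    if (pattern.empty() || i + pattern.size() > ops.size())
        return false;
    for (std::size_t k = 0; k < pattern.size(); ++k)
        if (ops[i + k] != pattern[k])
            return false;
    return true;
}

// Greedy left-to-right rewrite: the first matching rule wins at each position,
// and ops that match no rule are passed through unchanged.
Ops fuse(const Ops &ops, const std::vector<FusionRule> &rules) {
    Ops out;
    std::size_t i = 0;
    while (i < ops.size()) {
        bool fused = false;
        for (const auto &r : rules) {
            if (matches_at(ops, i, r.pattern)) {
                out.push_back(r.fused);
                i += r.pattern.size();
                fused = true;
                break;
            }
        }
        if (!fused)
            out.push_back(ops[i++]);
    }
    return out;
}
```

With rules for `crc` followed by `append` and for `open`, `read`, `write`, `close`, a stream of seven simple ops collapses to three, so the host issues fewer requests and the intermediate data never has to leave the device.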

Bad

  1. Large sequential data reads/writes will introduce co-processor overhead, at least for data calculation and buffer storage. Pattern matching could detect such requests and let them bypass the data processing.

  2. This paper shares a lot of similar designs with CrossFS for resource management, durability, and permission checks.

  3. I'm curious why this is not implemented in the SSD main controller. It's meaningless to target NVM, because programmers must do hand-written I/O fusion on such devices anyway.

  4. Performance is roughly the same as NOVA when the device CPU is slow. I don't know whether the additional hardware still has real benefits once I/O thread affinity and other kernel optimizations are applied. However, recovery is really quick because of MicroTx.

Refinement

  1. Kernel bypass could still be applied on top of FusionFS.
  2. An implementation in the SSD main controller or memory controller would be better than adding another CPU.

Reference

  1. POLARDB Meets Computational Storage: Efficiently Support Analytical Workloads in Cloud-Native Relational Database
  2. CrossFS: A Cross-layered Direct-Access File System

Install Asahi Linux

❯ curlie https://alx.sh | sudo sh
HTTP/2 200
server: nginx/1.21.1
date: Sat, 19 Mar 2022 15:19:56 GMT
content-type: text/plain; charset=utf-8
content-length: 967
cache-control: max-age=300
content-security-policy: default-src 'none'; style-src 'unsafe-inline'; sandbox
etag: "af389cf3253fe3f924350c99c293434d1c78883b3c51676bf954211c7bb0872b"
strict-transport-security: max-age=31536000
x-content-type-options: nosniff
x-frame-options: deny
x-xss-protection: 1; mode=block
x-github-request-id: 821E:6DAB:83C117:8E8F7F:623518DF
accept-ranges: bytes
via: 1.1 varnish
x-served-by: cache-fra19151-FRA
x-cache: HIT
x-cache-hits: 1
x-timer: S1647703197.985480,VS0,VE1
vary: Authorization,Accept-Encoding,Origin
access-control-allow-origin: *
x-fastly-request-id: 7ac18bfda3e671762cc598d7521b9663e583d871
expires: Sat, 19 Mar 2022 15:24:56 GMT
source-age: 195
strict-transport-security: max-age=31536000;


Bootstrapping installer:
  Checking version...
  Version: v0.3.5
  Downloading...
  Extracting...
  Initializing...


Welcome to the Asahi Linux installer!

This installer is in an alpha state, and may not work for everyone.
It is intended for developers and early adopters who are comfortable
debugging issues or providing detailed bug reports.

Please make sure you are familiar with our documentation at:
  https://alx.sh/w

Press enter to continue.


By default, this installer will hide certain advanced options that
are only useful for developers. You can enable expert mode to show them.
» Enable expert mode? (y/N): y

Collecting system information...
  Product name: MacBook Pro (16-inch, 2021)
  SoC: Apple M1 Max
  Device class: j316cap
  Product type: MacBookPro18,2
  Board ID: 0xa
  Chip ID: 0x6001
  System firmware: iBoot-7459.101.2
  Boot UUID: 57F7E9CF-77CE-4D0F-83A4-15967BA49F69
  Boot VGID: 57F7E9CF-77CE-4D0F-83A4-15967BA49F69
  Default boot VGID: 57F7E9CF-77CE-4D0F-83A4-15967BA49F69
  Boot mode: macOS
  OS version: 12.3 (21E230)
  System rOS version: 12.3 (21E230)
  No Fallback rOS
  Login user: yiweiyang

Collecting partition information...
  System disk: disk0

Collecting OS information...

Partitions in system disk (disk0):
  1: APFS [Macintosh HD] (850.00 GB, 6 volumes)
    OS: [B*] [Macintosh HD] macOS v12.3 [disk3s1, 57F7E9CF-77CE-4D0F-83A4-15967BA49F69]
  2: (free space: 144.66 GB)
  3: APFS (System Recovery) (5.37 GB, 2 volumes)
    OS: [  ] recoveryOS v12.3 [Primary recoveryOS]

  [B ] = Booted OS, [R ] = Booted recovery, [? ] = Unknown
  [ *] = Default boot volume

Using OS 'Macintosh HD' (disk3s1) for machine authentication.

Choose what to do:
  f: Install an OS into free space
  r: Resize an existing partition to make space for a new OS
  q: Quit without doing anything
» Action (f): f

Choose an OS to install:
  1: Asahi Linux Desktop
  2: Asahi Linux Minimal (Arch Linux ARM)
  3: UEFI environment only (m1n1 + U-Boot + ESP)
  4: Tethered boot (m1n1, for development)
» OS: 1

Downloading OS package info...
.

Minimum required space for this OS: 15.00 GB

Available free space: 144.66 GB

How much space should be allocated to the new OS?
  You can enter a size such as '1GB', a fraction such as '50%',
  the word 'min' for the smallest allowable size, or
  the word 'max' to use all available space.
» New OS size (max): max

The new OS will be allocated 144.66 GB of space,
leaving 167.94 KB of free space.

Enter a name for your OS
» OS name (Asahi Linux):

Choose the macOS version to use for boot firmware:
(If unsure, just press enter)
  1: 12.3
» Version (1): 1

Using macOS 12.3 for OS firmware

Downloading macOS OS package info...
.

Creating new stub macOS named Asahi Linux
Installing stub macOS into disk0s5 (Asahi Linux)
Preparing target volumes...
Checking volumes...
Beginning stub OS install...
++
Setting up System volume...
Setting up Data volume...
Setting up Preboot volume...
++++++++++
Setting up Recovery volume...
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Wrapping up...

Stub OS installation complete.

Adding partition EFI (500.17 MB)...
  Formatting as FAT...
Adding partition Root (144.16 GB)...
Collecting firmware...
Installing OS...
  Copying from esp into disk0s4 partition...
+
  Copying firmware into disk0s4 partition...
  Extracting root.img into disk0s7 partition...
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++Error downloading data (IncompleteRead(10240000 bytes read, 1294336 more expected)), retrying... (1/5)
+Error downloading data (IncompleteRead(1130496 bytes read, 10403840 more expected)), retrying... (2/5)
+++++++++++++++++++++++++++++++++++++Error downloading data (IncompleteRead(5979282 bytes read, 5555054 more expected)), retrying... (1/5)
+++++++
Preparing to finish installation...
Collecting installer data...

To continue the installation, you will need to enter your macOS
admin credentials.

Password for yiweiyang:

Setting the new OS as the default boot volume...

Installation successful!

Install information:
  APFS VGID: 052A3A5A-E253-4B46-AF6A-F9615F39844B
  EFI PARTUUID: 487bfda1-77ae-4a4c-8462-3cba91f073a7

To be able to boot your new OS, you will need to complete one more step.
Please read the following instructions carefully. Failure to do so
will leave your new installation in an unbootable state.

Press enter to continue.




When the system shuts down, follow these steps:

1. Wait 15 seconds for the system to fully shut down.
2. Press and hold down the power button to power on the system.
   * It is important that the system be fully powered off before this step,
     and that you press and hold down the button once, not multiple times.
     This is required to put the machine into the right mode.
3. Release it once 'Entering startup options' is displayed,
   or you see a spinner.
4. Wait for the volume list to appear.
5. Choose 'Asahi Linux'.
6. You will briefly see a 'macOS Recovery' dialog.
   * If you are asked to 'Select a volume to recover',
     then choose your normal macOS volume and click Next.
     You may need to authenticate yourself with your macOS credentials.
7. Once the 'Asahi Linux installer' screen appears, follow the prompts.

Press enter to shut down the system.

Twitter card `ERROR: Failed to fetch page due to: ChannelClosed` pitfalls behind Nginx proxy

Twitter does not seem to accept my newly set up WordPress behind nginx; it fails with `ERROR: Failed to fetch page due to: ChannelClosed`.

I captured the traffic with tcpdump -i eth1 -X -e -w test.cap on my psychz VPS.

It looks like the handshakes come back with a different SSL version. Setting ssl_ciphers 'ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:!3DES'; as suggested in https://wordpress.org/support/topic/twitter-card-images-not-showing/ did not work. Eventually I found that I had not switched on Cloudflare's Full encryption mode. After turning it on, everything is good.