Summary
- Motivation for the paper: existing dynamic race detectors are built on Lamport's happened-before relation, while in practice multithreaded programs rely on mutex locks to protect shared variables from data races caused by other threads accessing them; Eraser instead checks whether such a locking discipline is actually followed.
- They use the lockset algorithm: on every access to shared data, Eraser observes which locks the accessing thread holds and infers which locks protect which data items. Since Eraser has no way to know the intended purpose of a lock, it must derive the lock-protection relationship from the execution history. For each shared variable v, Eraser maintains a candidate lock set C(v), which holds all locks that have so far protected v. When the variable is initialized, C(v) is the set of all possible locks. On each access, C(v) <- C(v) $\cap$ (set of locks held by the current thread), so the set can only shrink or stay the same. Termination: if all threads protect the same data with consistent locks, C(v) reaches a fixed point and the refinement stabilizes; if C(v) becomes empty, the locking is inconsistent and a potential race is reported. To weaken this overly strict rule, Eraser allows initialization without holding a lock, exempts read-only data from locking, and supports single-writer/multiple-reader read-write locks. If no other thread can refer to a piece of data, there is no need to lock it; this is true for newly allocated data. To avoid false alarms, Eraser delays refinement of the candidate set until after initialization is complete; since it is hard to tell when initialization is complete, refinement is delayed again until the shared variable is first accessed by a second thread. If the shared variable is only ever read or written by one thread, refinement is never performed. Likewise, read-only data needs no protection; this is supported by warning only about shared variables that are written by more than one thread.
If C(v) is updated because of a write, only locks held in write mode are retained, since a lock held merely in read mode does not protect the writer against other writers (a minimal sketch of this refinement is given below).
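Not the paper's code, but a minimal sketch of this state machine and refinement under the assumptions above; the names (`struct shadow`, `on_access`, `lockset_t`) are invented, and locksets are plain bitmasks here, whereas Eraser actually represents them as indices into a table of sorted lock vectors:

```c
/*
 * Minimal sketch of the lockset refinement -- NOT Eraser's actual code.
 * Locksets are bitmasks for brevity.
 */
#include <stdint.h>
#include <stdio.h>

typedef uint64_t lockset_t;       /* bit i set => lock i is in the set */

enum vstate { VIRGIN, EXCLUSIVE, SHARED, SHARED_MODIFIED };

struct shadow {                   /* per-word shadow state for variable v */
    enum vstate st;
    int         first_tid;        /* owning thread while EXCLUSIVE */
    lockset_t   cset;             /* candidate lock set C(v) */
};

/*
 * Called on every load/store of shared variable v by thread `tid`.
 * `held`  = locks currently held by tid (read or write mode),
 * `wheld` = the subset held in write mode.
 */
void on_access(struct shadow *v, int tid, int is_write,
               lockset_t held, lockset_t wheld)
{
    switch (v->st) {
    case VIRGIN:                  /* first access: treat as initialization */
        v->st = EXCLUSIVE;
        v->first_tid = tid;
        v->cset = ~(lockset_t)0;  /* C(v) starts as "all locks" */
        return;
    case EXCLUSIVE:               /* still a single thread: no refinement */
        if (tid == v->first_tid)
            return;
        v->st = is_write ? SHARED_MODIFIED : SHARED;
        break;
    case SHARED:                  /* read-shared: refine but do not report */
        if (is_write)
            v->st = SHARED_MODIFIED;
        break;
    case SHARED_MODIFIED:
        break;
    }

    /* Refinement: a write only counts locks held in write mode. */
    v->cset &= is_write ? wheld : held;

    if (v->st == SHARED_MODIFIED && v->cset == 0)
        fprintf(stderr, "possible data race detected by thread %d\n", tid);
}
```

Reporting only in the shared-modified state is what implements the "warn only about variables written by multiple threads" relaxation described above.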
Eraser is implemented on Digital Unix for the Alpha processor using the ATOM [Srivastava and Eustace 1994] binary modification system. Eraser takes an unmodified binary as input and produces a new binary that is functionally identical but has the Eraser runtime inserted into it. It instruments each global/heap load and store to maintain C(v) (distinguishing them from stack accesses via the stack pointer), each lock acquire/release call and each thread creation/termination to maintain the set of locks held by every thread, and each memory allocation to initialize C(v), since every word has the potential to become a shared variable. (This is so much trouble mostly because the changes are made to the binary, not the source code.) It also maintains the set of stack addresses that are accessed through registers other than the stack pointer. When a race is detected, Eraser logs to a file the line number, a backtrace of the stack frames, the thread ID, the memory address, the type of access, the program counter and stack pointer values, and so on, to locate the problematic code; if that is not enough to find the bug, Eraser can be made to log every access to the variable.
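As a loose illustration (these entry points are invented, not ATOM's or Eraser's real interface), the rewritten binary could call into a runtime like the following, which keeps the per-thread lock sets and forwards every monitored load/store to the refinement routine sketched earlier:

```c
/*
 * Conceptual sketch of the runtime entry points the rewritten binary would
 * call; names (on_lock_acquire, on_load, shadow_for, ...) are illustrative.
 * on_access() is the refinement routine from the previous sketch.
 */
#include <stdint.h>

typedef uint64_t lockset_t;
#define MAX_THREADS 64

struct shadow;                                  /* per-word shadow state, see above */
extern struct shadow *shadow_for(void *addr);   /* word address -> shadow entry */
extern void on_access(struct shadow *v, int tid, int is_write,
                      lockset_t held, lockset_t wheld);

static lockset_t locks_held[MAX_THREADS];       /* locks held by each thread (any mode) */
static lockset_t write_held[MAX_THREADS];       /* subset held in write mode */

/* Inserted around every lock acquire/release call in the binary. */
void on_lock_acquire(int tid, int lock_id, int write_mode)
{
    locks_held[tid] |= (lockset_t)1 << lock_id;
    if (write_mode)
        write_held[tid] |= (lockset_t)1 << lock_id;
}

void on_lock_release(int tid, int lock_id)
{
    locks_held[tid] &= ~((lockset_t)1 << lock_id);
    write_held[tid] &= ~((lockset_t)1 << lock_id);
}

/* Inserted at every global/heap load and store; plain stack accesses made
 * through the stack pointer are not instrumented. */
void on_load(int tid, void *addr)
{
    on_access(shadow_for(addr), tid, 0, locks_held[tid], write_held[tid]);
}

void on_store(int tid, void *addr)
{
    on_access(shadow_for(addr), tid, 1, locks_held[tid], write_held[tid]);
}
```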
- The authors found that the main overhead comes from ATOM requiring a procedure call for each instrumented event, though upgrading to the 1996 version of ATOM allows these calls to be inlined. They conclude that performance was not a major concern because it was not the main purpose of the prototype, and the slowdown by a factor of 10 to 30x is accepted as unavoidable.
Critique
- Eraser monitors C/C++ allocators and lock operations, which requires binary rewriting and runtime function calls. It still cannot observe kernel-level or microarchitecture-level races: for example, if a softirq changes memory state, the simple model of observing from the side is no longer valid. To cover memory safety and race conditions inside the kernel, the modeling would have to include all such possible state changes, which would make the tool much more sound.
- This work is a combination of art and engineering. It is art because it adopts a deliberately weaker formalization and observes each thread's state changes as the program runs; it is engineering because binary rewriting is dirty work.
- It is the start of the whole world of debugging and software engineering around the kernel, and I think tracing bugs inside the kernel is a long-discussed topic. Newer eBPF tools can attach probes to (and even live-patch) the kernel, reducing the overhead of kernel instrumentation and staying portable without writing a kernel module or hacking LSM to do so.
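As a rough illustration of that direction (a hypothetical program, not from the paper), a minimal libbpf-style kprobe can trace kernel mutex acquisitions without loading a traditional kernel module; it only observes lock events and is not a full race detector:

```c
/*
 * Hypothetical sketch: a libbpf-style kprobe that traces kernel mutex
 * acquisitions (mutex_lock may be inlined on some kernel configurations).
 */
#include <linux/bpf.h>
#include <linux/ptrace.h>
#include <bpf/bpf_helpers.h>

char LICENSE[] SEC("license") = "GPL";

SEC("kprobe/mutex_lock")
int trace_mutex_lock(struct pt_regs *ctx)
{
    __u64 id = bpf_get_current_pid_tgid();

    /* Log the acquiring task's PID to the kernel trace pipe. */
    bpf_printk("mutex_lock by pid %d", (int)(id >> 32));
    return 0;
}
```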