Proposal for *An Online Systematic Scheduling Algorithm over Distributed IO Systems*

In resource allocation for distributed systems in high-performance computing, we do not really know which devices, such as disks or NICs (network interface cards), are more likely to be worn out or are currently off duty, either of which can delay getting the data ready. The current solution is random or round-robin scheduling to avoid wear, plus dynamic routing for the fastest speed. We can use the collected data to automate this.

An experienced system administrator may know which parameters to tweak, such as the stride on distributed file systems, the network MTU for InfiniBand cards, and the route used to fetch data. Today, eBPF (extended Berkeley Packet Filter) can record information such as IO latency on storage nodes and network latency across the topology as time-series data. We can use this data to predict which topology, stride, and other parameters are likely to be the best way to fetch data.

The data arrives online, so the prediction function can be online reinforcement learning. As in a k-armed bandit, the reward can be a function of latency gain and device-wear parameters. The updates can come from the real-time latency of disks and networks. The information given to the RL agent can include where the data is located on disk, which data is accessed most frequently (DBMS queries or random small files), and how often each disk fails.
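To make the bandit framing concrete, here is a minimal sketch of an epsilon-greedy k-armed bandit choosing among candidate IO routes. Everything in it is an assumption for illustration: the class name `EpsilonGreedyScheduler`, the linear reward `-(latency + wear_weight * wear)`, and the simulated latency and wear numbers are hypothetical, not measurements or a settled design.

```python
import random


class EpsilonGreedyScheduler:
    """Epsilon-greedy k-armed bandit; each arm is a candidate IO route or device."""

    def __init__(self, n_arms, epsilon=0.1, wear_weight=0.5, seed=None):
        self.epsilon = epsilon          # probability of exploring a random arm
        self.wear_weight = wear_weight  # hypothetical trade-off between latency and wear
        self.counts = [0] * n_arms      # number of pulls per arm
        self.values = [0.0] * n_arms    # running mean reward per arm
        self.rng = random.Random(seed)

    def reward(self, latency_ms, wear):
        # Assumed reward shape: lower latency and lower wear are both better.
        return -(latency_ms + self.wear_weight * wear)

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, latency_ms, wear):
        # Incremental mean update from one observed request.
        self.counts[arm] += 1
        r = self.reward(latency_ms, wear)
        self.values[arm] += (r - self.values[arm]) / self.counts[arm]


def simulate(steps=2000, seed=0):
    """Run the scheduler against three made-up routes with noisy latency."""
    rng = random.Random(seed)
    mean_latency = [8.0, 3.0, 5.0]  # ms, invented for the simulation
    wear = [0.1, 0.2, 4.0]          # per-request wear cost, invented
    sched = EpsilonGreedyScheduler(3, epsilon=0.1, wear_weight=0.5, seed=seed)
    for _ in range(steps):
        arm = sched.select()
        latency = max(0.0, rng.gauss(mean_latency[arm], 1.0))
        sched.update(arm, latency, wear[arm])
    return sched


if __name__ == "__main__":
    s = simulate()
    print("pulls per route:", s.counts)
    print("estimated reward per route:", [round(v, 2) for v in s.values])
```

In a real deployment the simulated latency would be replaced by the eBPF-collected measurements, and a contextual bandit could take data placement and access frequency as features; both are left out of this sketch.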

Benchmarks and evaluation can measure the statistical gain in system latency and the overall disk wear after stress tests.

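An undergraduate's life of hitting walls on the way to a PhD

Saving yourself is the first law of heaven. - Xiong Peiyun, 《慈悲与玫瑰》 (Compassion and Roses)

I'm an ordinary person, so ordinary that every day I feel powerless in all sorts of ways: my abilities can't find the right outlet, and my time is too limited to juggle publishing papers, applications, the TOEFL and GRE, and TA work all at once. It's all too hard! After talking with Shu about my application ideas, I learned about the option of doing a master's in China, and I also felt how powerless I am as an undergraduate. Still, I want to give it a shot, and go to work if I don't get a PhD, because I think the three-year master's program in China is too long for training a researcher who can solve a given problem, and too short for a doctoral student exploring the vision of a larger field. I also feel there aren't many chances to get to the US; looking at the situations of 乔神/USTCqzy/caoshuxin, I think MSRA really is the most successful study-abroad agency of this century.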

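On research directions

My prediction is that Computer Systems (with Arch) is a good direction, both in terms of current trends and future funding. DBMS is sure to be a hot direction (as a file-system and software-engineering grunt, let me praise the rival camp), since data is king right now. On the Arch side there is the recently released NVDIMM, with many directions to extend. And the very hype around ML systems dooms that direction to be a bubble.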

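A weak GPA

I feel that in ability and in the vision with which I look at problems, I'm someone with fairly good big-picture skills, but I lack the persistence to sit down and reproduce things right away. Although I have lost some of my interest in hacking code for supercomputing competitions, I'm still very interested in mentoring classmates and teaching them to modify code. I have quite a few aftereffects left over from the gaokao; simply put, I freeze up in exams, and I really don't want to be ground down competing in a crowd of maybe "high scores, low ability" people.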

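What counts as success

I talked briefly with my parents; I can't convince them and they can't convince me. They have always thought I'm not good for much. In many senses, my living allowance was insufficient for a long time, and only in the last year did it improve a little. My daring to dream comes largely from my dad's confidence, but with a difference: what I see is not quite what people in high positions see, and revolutions always come from the bottom up.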

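Why some things need to be open source and some closed source

I think all exploration is

A summer research internship may be the only path to a Ph.D.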

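The NSF-funded positions listed at this hub only take US residents or green-card holders. I once saw a thread on 一亩三分地 (1point3acres) about why the US and China pay differently for the same work. One explanation was interesting: a visa is like a semipermeable membrane, and not everyone can afford the cost of a master's plus OPT. That price is a bit of an overpay, though. And programming really is a young person's job; learning more only pays off while you are learning fastest. I feel that only summer research or a future PhD in California, at UIUC, or at the other three big schools would suit me.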

General Requirement

  1. Online Application
  2. One-page Personal Statement: why this professor? why this program?
  3. Official Transcript
  4. Curriculum Vitae (CV)
  5. Your Top Faculty Choices - If no faculty matches your interest, please indicate your preferred professor whose research area best aligns with your interest (preferably a professor with a similar aim). You can learn about each faculty member’s research area by referring to the Samueli School of Engineering website.
  6. Two References – Required email and the

UCI onsite ranking 1.30

  1. https://www.ics.uci.edu/~harris/
    1. Electronic Design Automation from Natural Language
    2. Embedded Systems
    3. Social Engineering Attack
    4. Functional Verification
  2. https://faculty.sites.uci.edu/zhouli/
    1. IoT
    2. Embedded System
  3. Fadi Kurdahi: digital systems
  4. Mohammad Al Faruque
  5. https://www.ics.uci.edu/~mlevorat/
    1. Real-Time distributed computing in wireless systems
    2. Wireless systems for AI and AI for wireless systems
    3. IoT and Healthcare
  6. https://chenli.ics.uci.edu/research/
    1. database
  7. https://www.ics.uci.edu/~xhx/
    1. AI data mining
  • Final choice: Zhou Li (张一帆 is here) and two other systems faculty.

UIUC Online ranking undecided

Caltech 2.22

Harvard undecided

CMU 2.2

  • They have everything: fuzzing, Arch, DBMS. If I could get in here, even the MCS would be fine. https://applygrad.cs.cmu.edu/apply/bio.php
  • Awaiting Recommendation
  • No reply

WUSTL 2.17

  • https://sites.wustl.edu/csereu/apply/ HPC, writing C++
  • Awaiting Recommendation
  • Rej

UCR

  • https://connect.ucr.edu/register/MSRIP Gu Yan & Sun, parallel algorithms
  • Posted
  • Rej

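Summer research

Summer research interview notes

I'll be taken on by UIUC; I'll add the interview notes later. In April I saw someone post in an admissions group chat that UIUC SE had about 200 applicants this year and only around 20 would get in, so I applied. They had me do a small experiment on fixing a flaky test; it was called an experiment, but it was really just a concurrency RAW bug and not difficult, and then I got in. I started work in July, running OD (order-dependent) flaky test experiments, building on a TACAS paper published last year. [2022.2] I now feel Darko only gave me a lukewarm or negative recommendation. I was busy with supercomputing after the semester started, and Wing didn't have much time to keep track either, so he just handed me a task: DTA instrumentation in Bramble using the visitor pattern. I wasn't very familiar with the JVM; writing Java was fine, but debugging at the Maven stage was really tough. At first we were rushing for ICST, but the Peking University guy flaked, and so did I.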

In meetings, Darko likes lying down best

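What I did in summer research

I fixed bugs with a classmate from Peking University. Darko gave lots of suggestions on fixing bugs, and then we worked out whether the fault lay with the tool, the project, or Java 1.8. At the start it was just fixing flaky-test bugs, most of them concurrency bugs in systems like HDFS and the HBase database.

First semester of senior year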

SC21

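I was mainly working on my thesis and SC21. Being promoted from team captain to student coach felt good, but the problem is that effort there does nothing for my applications; its only use is probably getting Prof. Yin to put in a good word for me when I apply.

Of course, in terms of building skills, it did develop leadership. At least we now have the same talent-regeneration pipeline (wiki/teaching/yole spirit) that NTU/THU have. It also counts as finally reaching a milestone on the dream I had when I entered college. I think that for the next few years, finishing in the top three every time won't be a problem.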

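From wanting to take the thesis seriously to being fairly laid-back lately

On how to find a research direction

Working with an AP (assistant professor), you don't need to worry about lacking ideas; it's with tenured faculty that you need to think about it. I see that many people who can't come up with new ideas go intern at a big company for a few months, which lets them bring back some projects through collaboration, especially industry data, like the CMU ECE professor I reached out to.

Progress updates go at the top, and I also update on gradcafe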

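Cornell seems to be stuck; I found I had filled in a strange "by mail" option. Submitted by Dec. 1 CST.

UIUC/CMU/UW/MIT/UTAustin/Northwestern/UCSD/Purdue/UCLA/Wisconsin/UMass/GaTech/UCSC: submitted by Dec. 15 CST; now waiting for interviews.

12.22 Informal interview with a certain great Chicago school.

1.17 Interview with a professor I had contacted at a certain Chicago school. He said my PS had problems, which means no chance.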

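1.19 Interview with Prof. Daniel Genkin of GT, a Meltdown author. I seemed to pass his bar, and after the interview it felt secure. He has 8 Ph.D. students and does not care about defense at all; I felt eBPF was of little use to him (after watching the PriSC opening talk, I now think it could work). He said there was a recent browser RIDL issue. However, I have heard unverified rumors of problems around him. Actually I would rather go to Kim, who did Rudra. Later I wrote an email and was offered a chat.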

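1.21 UCSC rising star. He said I would join to do hw/sw codesign, then FPGA virtualization, then Optane high-performance systems. I'm quite interested in every project. At least his connections are all at UMich.

1.25 Interview scheduled with a CMU ECE professor I had contacted. Her papers are too experimental-data-driven, so I don't much want to go to her group, but something is better than nothing. I felt I had failed right after the interview; she asked about cache replacement and logistics.

2.3 Received the UCSC offer; probably nowhere else will come through.

2.4 UW rejection

2.15 UMadison rejection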

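It's more or less over. I roughly know my level and the gap to the competition. In a perfectly competitive market it is winner takes all, and the more buffs you stack, the stronger you are. I directly filtered out pan-Chinese nationalist professors; someone like me who writes here about the gap between China and the US is presumably not good labor for them. The schools that interviewed me all have advisor-driven admissions; the rest have presumably all rejected me. Finding a stable place and a nice advisor to do my own research is good enough. Congratulations also to a friend on her UIUC offer, to a senior on his ETHz offer, and to 叶神 on the UCB offer.

Original answer below

Non-985/211 school (双非), GPA 3.01/4 65% / TOEFL 103 / not submitting the GRE. Applying only to US System/Arch/SE Ph.D. programs (dream school UIUC, a place where second-rate students go in and first-rate students come out; the person I dream of becoming is Chris Lattner), or else going to industry for the money. After the Ph.D. I will most likely take it easy for a few years and then start a company. Homepage.

My GPA is low because I knew I am not cut out for exams and wanted to learn SE, OS, PL, and Arch early, so I took many courses with outrageous amounts of coding. Also, my health is not great; I am MtF, but I wrote male on the application forms.

In freshman year I did some CTFs and hackathons, and that summer I published an ISSTA paper with the school's security lab. In sophomore summer I interned on the Jump Trading Linux team and got a return offer. I spent half a year in a group at school on NVM and as a Compiler TA. I led the supercomputing team for 2 years, published a critique, and took a second place. In junior year I did UIUC SE summer research.

UIUC groups I want to join. According to sources, some groups most likely have no openings this year, so I am not applying to those; I might get in on diversity.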

  1. Ghose Memory
  2. Charitm Sec
  3. Lingming Zhang AI Fuzzing
  4. Jianhuang Memory
  5. Darko Marinov Flaky test

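Done reaching out to professors; the payoff feels low. I applied to every school where I got a reply. Reply rate 8/17; none of the professors I contacted has picked me up so far, so it seems I am just not competitive.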

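  1. Emmett Witchel, UT Austin, Serverless; committee-driven admissions; according to sources, Austin people all leave for industry money; replied
  2. Michael Swift, Madison, NVM; committee-driven; the professor is very nice; replied
  3. Jishen Zhao, UCSD, ML/NVM (seems to have shifted toward ML); advisor-driven; no reply
  4. Brandon Lucia, CMU ECE, NVM; advisor-driven; no reply
  5. Dimitrios Skarlatos, CMU ECE, NVM; advisor-driven; no reply
  6. Barisk, UMich, NVM/FPGA; a stronghold for 密院 (UM-SJTU Joint Institute) students; advisor-driven; no reply, but I have chatted with both jiachen and ian. He only takes students he has worked with; reportedly a senior was even given a small project during the interview to test his level.
  7. Xinyu Xing, NW, AI/kernel Sec; advisor-driven; replied
  8. Changhee Jung, Purdue, NVM; committee-driven; contacted; replied, welcomes an application
  9. Mengjia, MIT, NVM/Sec; committee-driven; replied, welcomes an application
  10. Moin, Gatech, Memory; advisor-driven; no reply
  11. Dkohlbre, UW, RISC-V TEE; committee-driven; replied, welcomes an application
  12. Christina Delimitrou, Cornell, Serverless; committee-driven; replied, welcomes an application
  13. Andrew R. Quinn, UCSC, FPGA/OS; no reply
  14. J. Eliot B. Moss, UMass, Database/Parallel Algorithm; committee-driven; no reply
  15. Harry Xu, UCLA, data-race related Java SE/Formal/NVM; committee-driven; no reply
  16. Akshitha Sriraman, CMU, low latency/HPC metrics/cache replacement; advisor-driven; contacted on Twitter
  17. Alexandros, GaTech, HPC Network; advisor-driven; no reply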

UMich SoP

My motivation is simple: I need a Ph.D. to investigate a direction that is worth fighting for with my life and valuable to society. With the rapid growth of the Chinese economy and the huge research investment that followed, I have witnessed, at least over the past three years at ShanghaiTech, extraordinary scientific progress in all disciplines. China also provides huge markets in which research results can be deployed quickly, and companies are becoming willing to offer higher salaries and better equipment to new graduates who dig into their research fields. However, most professors at our school care only about short-term gains and put much of their effort into applications of established ideas, and the situation is only worse at other institutes. In addition, no profitable company there is founded on technical infrastructure the way Nvidia, Intel, and Xilinx are; instead, companies like Tencent and ByteDance profit from the unsophisticated public's time. That is why the U.S.A. is still the origin of innovation today. In China, the general public's pursuit of better technology has turned into self-imposed comfort with the current circumstances. I am not like that; from the bottom of my heart, I want to use technology to bring change.

I recently published a paper on adversarial examples in AI security at ISSTA21 as the fourth author, under the supervision of Prof. Fu Song. I helped the first author, Ph.D. candidate Zhe Zhao, run most of the experiments during my freshman summer. The paper uses the label change rate observed under model mutation testing to distinguish adversarial examples and then applies this to defend the data, a technique we called Attack as Defense. I got to know how software engineering testing works on artificial intelligence and how it could apply elsewhere, such as language specifications for smart contracts, operating system concurrency, and computer architecture semantics. That is what my two other work-in-progress projects mainly focus on: using the Z3 solver to verify possible timestamp attacks and arithmetic overflow in the Diem Move language. During two years of weekly seminars at the System and Software Security Lab, I picked up ideas such as decision procedures (basically, the origin and application of SMT solvers as the combination of logic and programs), fuzzing techniques, and Capture The Flag, a security competition.

From my sophomore year on, my main focus turned to practice driven by industrial needs. GeekPie_HPC is where I devote my time. We just took second place at SC21-SCC. I would say I put obscure system knowledge into production on high-performance heterogeneous systems. For example, I learned how the Linux system call flock works in class, but only when I found it misbehaving on GPFS, where a link step saw stale data, did I dig deeper into its semantics; I resolved it by calling fsync to force synchronization manually. I knew CUDA only as a library user relying on PyTorch autograd, which of course runs on the GPU; only when I compared different compiler hints across different HPC algorithms and MPI scatter/reduce and alltoallv did I figure out how data moves on the GPU. My school established a long-term connection with Jump Trading after we won the supercomputing cluster competition, through which the recruiter learned that our students are distinctive at solving problems with the right tools. My experience at Jump Trading in my sophomore summer let me dig into more cutting-edge technology: eBPF and the Intel mesh microarchitecture. However, the main focus in industry is quite different. I mostly worked on dynamic kernel inspection of the distributed filesystem for different lease users, and applied a core-affinity strategy that accounts for core-to-NUMA, DDR, NIC, and GPU latency. From what I perceive through my former colleagues, most production-level engineers have only a bachelor's degree and are trained by the company, like my mentor, but the really big, secret things are usually brought by Ph.D.s, like the author of eBPF or the reverse-engineering work on Intel processors.

This summer, I remotely joined Darko Marinov's group as an REU (Research Experience for Undergraduates) student and worked with Ruidong Zhu, a peer from Peking University, on order-dependent tests. It was a brand-new direction for me: pure software testing of order-dependent JUnit tests. Flakiness means that a test may pass or fail across different runs. This can be triggered by order-dependent values, which Darko's iDFlakies tool, run automatically on Azure, can identify. Their previous work explains the cleaners, polluters, and victims with respect to specific variables and values. Their latest work, submitted to ICSE21, introduces non-idempotent tests, which can be identified by running tests one after another in isolated methods, classes, or the entire suite to see whether they may be flaky. We ran a dynamic taint analysis tool called PraDet on all the runnable tests in three of their latest test suites and reported the results. We are currently modifying a more advanced tool to address its limitations. Throughout the process, I have been intrigued by the passion of my mentor Wing Lam and by Darko's energetic thinking, in contrast to his lazy, lying-down posture.

In choosing UMich, I am captivated by a school that selects people with an intrinsic aptitude for engineering problem solving and cultivates them into world-class researchers, like Baris Kasikci. The recently published paper "Rethinking File Mapping for Persistent Memory" at FAST21 is really amazing. The authors propose using hashing for file mapping. In an example given in the paper, PMem is divided into a file data region and a metadata region; if the logical address to be mapped is <inum=1, iblk=21> and the offset of this logical block in the hash table is i, then the physical block address corresponding to this logical block is (file data region start address + i*4KB). Baris publishes 5+ papers every year. These world-class research opportunities make the CS department of UMich especially attractive to me. It would be a privilege to study under the guidance of its remarkable faculty during "A New Golden Age for Computer Architecture".

I have enjoyed applying what I learned in classes such as computer architecture and compiler principles to my research. I have also cultivated a broad interest in other areas, such as reinforcement learning, as a source of inspiration. I seek different kinds of creativity in engineering and the beauty of an idea once it is realized. It is this creative will that I wish to pursue in UMich's Ph.D. program and afterward as a researcher in industry. My learning experience under my advisor's guidance convinced me not only of the potential of research but also of the value of teaching. I have also enjoyed working as an undergraduate teaching assistant for the compiler course. Through my studies, I expect to become, and will work hard to be, a productive researcher and teacher.

UCSD PS

First of all, my experience has made me an open-minded, highly motivated person who does not take current circumstances for granted. I think that momentum and curiosity were cultivated through travel and experience. As a social practice project in my sophomore summer, 20 other students and I went to Pingtang, home of the Five-hundred-meter Aperture Spherical Telescope. We investigated how this externality affected local tourism, from the first year's influx of capital to the second year's oversaturation, and how it changed with the downturn of the Chinese economy. China's investment in infrastructure is fundamental to everyone in rural areas, and socialism is taking effect through targeted poverty alleviation in the Xi era. The poverty line is defined as 800 RMB per year per family, and a family still under this line by 2020 either has a disabled member or is unwilling to work. However, criticism has been directed at the push for everyone to engage in smallholder farming of crops like strawberries that do not suit the local environment. I traveled solo to Hong Kong during the protests, and to Singapore, Malaysia, Thailand, India, and Nepal, within 12 days. I witnessed big countries' hegemony and small countries' self-esteem. I witnessed the deep inequality and poverty in this world and the importance of building network and highway infrastructure.

An open mind takes me naturally into diverse environments. My previous employer, Jump Trading, is a place that embraces diversity. There I first realized that a tiny office can hold multiple races, LGBTQ+ people, multiple native languages, and multiple religions. To communicate fluently without barriers, all we did was respect one another without discrimination. The colleague who worked with me is MtF (male to female); besides referring to her as "her", we avoided talk about sex and men's jokes. My mentor was born in Malaysia; his mother is from England and his father is from Hong Kong, so he is quite familiar with Cantonese. From a technical perspective, people who graduated from French schools focus more on mathematical proof and intuition, while those from American schools care more about implementation and effectiveness. We value people from every background, which took me a while to adjust to, since I come from a largely single-race, single-religion country. Every year there are 3 top-tier supercomputing cluster competitions, and I lead the team competing against prestigious universities like UCSD, UIUC, and Gatech. Our team GeekPie_HPC has recruited 2 women out of 6 members for daily training and the eventual competition. We strongly encourage female computer science students to join, given how few women there are in the department.

My research taste and delight come from my curiosity. Many silly things happen when choosing courses and taking exams; I got accustomed to taking the hardest courses, which challenge me under pressure. Once I am determined to do something, I focus on it until I figure it out, or give it up because I learn that it does not fit me. For me, college overall has been a time of testing failures. The projects and exams are similar to a... I know that I have many shortcomings, but that has not dampened my drive to solve the hardest open questions.

Jung making-connection Letter

I’m a CS undergrad from ShanghaiTech specializing in general systems. I picked up most of my practical skills in GeekPie HPC. I spent some time working on eBPF and Intel processor microarchitecture at Jump Trading Shanghai (which turned out to be mostly engineering effort and talking with other people, but it got me into the microarchitecture world). During summer 2021, I worked on Java flaky testing with Darko Marinov from UIUC. During my time at Chundong's lab, we discussed at length your papers on failure tolerance, memory-ordering bugs, and performance on Optane persistent memory. I referred to your paper for a general understanding of how to tune performance on Optane memory. I think I could put my energy into these topics if I had the opportunity to join your team. Are you recruiting Ph.D. or master's students this year?

Best,
Yiwei

  1. SUSTech 飞跃手册 (study-abroad handbook) @wjc's recommendation
  2. 孙明瑞@n+e's recommendation
  3. James.Qiu@Zhihuihu