Database - vickieGPT’s blog

October 7, 2020 | February 9, 2022

On database of Go and Kubernetes and Rust

During a few days of the long vacation in China, I found some fun things to play with. The original intent was to find a fast framework for my ugly blog, which turned out to be pointless. But the process of figuring things out was really interesting.

My demands

What I need is a working gitment embed so that spam comments get filtered. I could write that by hand, but I currently don't have time to study JavaScript systematically; reusing the latest wheel is enough for me. Besides, I really want a Markdown writing experience, so Gatsby, Hexo and Hugo are my candidates.

Rust

First, I looked at some blogs written in Rust, which took me a great deal of effort. https://github.com/ramsayleung/blog was a fantastic one. I found that it uses diesel to map database tables to Rust structs, like:

table! {
    post (id) {
        id -> Int4,
        title -> Varchar,
        subtitle -> Varchar,
        raw_content -> Text,
        rendered_content -> Text,
        create_time -> Timestamp,
        modify_time -> Timestamp,
        post_type -> Int4,
        hit_time -> Int4,
        published -> Bool,
        slug_url -> Varchar,
        enable_comment -> Bool,
        tag -> Jsonb,
    }
}

table! {
    user (id) {
        id -> Int4,
        username -> Varchar,
        hashed_password -> Varchar,
        create_time -> Timestamp,
        modify_time -> Timestamp,
        email -> Varchar,
        avatar_url -> Nullable<Varchar>,
    }
}

table! {
    visitor_log (id) {
        id -> Int4,
        ip -> Inet,
        access_time -> Timestamp,
        user_id -> Int4,
    }
}

allow_tables_to_appear_in_same_query!(
    post,
    user,
    visitor_log,
);

I then looked into which database it uses. I guessed it was PostgreSQL, and I guessed right. For a static blogging website, a database seems laggy and outdated. I eventually found that even though the database API is written in Rust, calling PostgreSQL is not that fast: tens of milliseconds per foreign-key lookup. The web part is pure JavaScript, not wasm. The Rust part only handles the request logic against the backend, and JavaScript still fetches the data from the database, which is sad.

Rust is still not a frontend-ready language, although it claims to be fast and high-throughput when dealing with data. There is https://github.com/SASUKE40/yew-starter for wasm, but it still needs JavaScript, so why not just use JavaScript?

Nearly all data storage in languages that use an API relies on mapping

For example, JSON.

HUGO

Hugo is written in Go. To begin with, I have some experience handling time-series data (LSM) from HPC workloads through a Go API. Go really is an out-of-the-box language, so you don't have to care much about memory leaks or semaphores in multithreaded programs. Because many companies use the language, there are plenty of resources and a community for CRUD and business code, from databases to HTTP servers, from JSON to YAML. Hugo is just another part of it. I learned a lot from this blog: https://draveness.me/few-words-time-management/.

Gatsby

Gatsby is implemented in React and requires React components. I'm not very familiar with JavaScript and have only done one project, with LEAFERX, a nice guy. I eventually went back to PHP and WordPress.

Why Rust is not ready and Go is ready.

While choosing a blog framework, I noted that Rust is currently porting everything into its own good and safe logic. The schema of Rust's diesel is just clumsy. Rust is not ready for high-throughput programs until it has better packages for native web deployment. Go is ready because it has its own coroutines; C++2a is only catching up on that later. Go is also winning over Java developers by offering C++-like speed in a few lines of code. Like Drogan/Drongan.

http package of `go`

Go's net/http package wraps both the HTTP client and the HTTP server implementations. To support better extensibility, it introduces the net/http.RoundTripper and net/http.Handler interfaces. net/http.RoundTripper is an interface for executing an HTTP request: the caller passes the request as an argument and gets back the response to it. net/http.Handler is mainly used by the HTTP server to respond to client requests.

scheduler of `go`

Signal-based preemptive scheduler - 1.14 to present

Implements true signal-based preemptive scheduling.
Garbage collection triggers preemptive scheduling when it scans the stack.

There are still not enough preemption points to cover all edge cases.

static void schedule(G *gp) {
schedlock();
if(gp != nil) {
    gp->m = nil;
    uint32 v = runtime·xadd(&runtime·sched.atomic, -1<<mcpuShift);
    if(atomic_mcpu(v) > maxgomaxprocs)
        runtime·throw("negative mcpu in scheduler");
    switch(gp->status){
    case Grunning:
        gp->status = Grunnable;
        gput(gp);
        break;
    case ...:
    }
} else {
    ...
}
gp = nextgandunlock();
gp->status = Grunning;
m->curg = gp;
gp->m = m;
runtime·gogo(&gp->sched, 0);
}

How overlay network is written in `go`.

Overlay networking is not actually a new technology. It is a computer network built on top of another network, a form of network virtualization whose adoption has been driven by the evolution of cloud virtualization in recent years.

In practice, we typically use Virtual Extensible LAN (VxLAN) to set up an overlay network. With it, two physical machines can reach each other over a Layer 3 IP network.

Reference

https://draveness.me/whys-the-design-overlay-network/
Kubernetes 源码剖析 (Kubernetes Source Code Analysis)

September 23, 2020 | February 8, 2022

[Database] Disk Representations: Files, Pages, Records

Tables are stored as logical files
- Files consist of pages
  - Pages contain a collection of records

Pages are managed
- On disk by the disk space manager: pages are read/written to physical disk/files
- In memory by the buffer manager: higher levels of the DBMS only operate in memory

DataBase Files

DB file: a collection of pages, each containing a collection of records.

- Unordered Heap Files
  - Records placed arbitrarily across pages
- Clustered Heap Files
  - Records and pages are grouped
- Sorted Files
  - Pages and records are in sorted order
- Index Files
  - B+ Trees, Linear Hashing, ...
  - May contain records or point to records in other files

Unordered Heap Files

A collection of records in no particular order; not to be confused with the heap data structure.
Must keep track of records, free space and pages.

Heap File implemented as List

Doubly linked list
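To make the list idea concrete, here is a minimal in-memory sketch, assuming the header tracks two doubly linked lists (pages with free space and full pages). The names `Page`, `HeapFile` and `PAGE_CAPACITY` are made up for illustration, and a real DBMS would store byte pages on disk rather than Python lists.

```python
from typing import Iterator, Optional, Tuple

PAGE_CAPACITY = 2  # hypothetical: records per page, kept tiny so the example is easy to trace


class Page:
    def __init__(self, page_id: int) -> None:
        self.page_id = page_id
        self.records: list = []
        self.prev: Optional["Page"] = None
        self.next: Optional["Page"] = None


class HeapFile:
    """Header keeps two doubly linked lists: pages with free space and full pages."""

    def __init__(self) -> None:
        self.free_head: Optional[Page] = None
        self.full_head: Optional[Page] = None
        self.num_pages = 0

    def _push(self, which: str, page: Page) -> None:
        # Insert the page at the front of the chosen list.
        head = getattr(self, which)
        page.prev, page.next = None, head
        if head is not None:
            head.prev = page
        setattr(self, which, page)

    def _unlink(self, which: str, page: Page) -> None:
        # Doubly linked, so removal needs no traversal.
        if page.prev is not None:
            page.prev.next = page.next
        else:
            setattr(self, which, page.next)
        if page.next is not None:
            page.next.prev = page.prev
        page.prev = page.next = None

    def insert(self, record) -> Tuple[int, int]:
        page = self.free_head
        if page is None:  # no page with free space: allocate a new one
            page = Page(self.num_pages)
            self.num_pages += 1
            self._push("free_head", page)
        page.records.append(record)
        if len(page.records) == PAGE_CAPACITY:  # page just became full
            self._unlink("free_head", page)
            self._push("full_head", page)
        return page.page_id, len(page.records) - 1  # record id = (page, slot)

    def scan(self) -> Iterator:
        # A full scan walks both lists; the order of records is arbitrary.
        for head in (self.full_head, self.free_head):
            page = head
            while page is not None:
                yield from page.records
                page = page.next


hf = HeapFile()
rids = [hf.insert(r) for r in ["a", "b", "c"]]
print(rids)  # [(0, 0), (0, 1), (1, 0)]
print(sorted(hf.scan()))  # ['a', 'b', 'c']
```

The back pointers are what make moving a page between the two lists cheap once it fills up.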

Comparison of different file type

September 16, 2020 | February 8, 2022

[Database] Storing Data - Disks and Buffers

Architecture of a DBMS

SQL client of DBMS

Relational Operators

files and index management

Buffer Management

September 13, 2020 | February 9, 2022

[Database] SQL

Admin
Grading

Homework: 20%
Quizzes: 10%
Course project: 25%
Midterm: 20%
Final exam: 25%

Reason to have a Database

Utility

Data processing backs essentially every app
Databases of one form or another back most apps
The principles taught in this class back nearly everything in computing

Centrality

Data is at the center of modern society.
Unprecedented in its nature and significance
- Particular and voluminous
- Often asymmetric
  - low value in isolation, high value when aggregated
- Difficult to protect
The infrastructure determines what’s possible

September 13, 2020 | February 9, 2022

[Database] SQL Cond.

Recitation - 1A106 - Monday 8-9

Relational database

A container that stores a collection of relations
- Relation
  - Schema - fixed
  - Instance - changes often
    - multiset
- Tuple
- Attribute
- DDL
  - Sailor

    GROUP BY: aggregate functions or SELECT
- DML

SELECT S.dept, AVG(S.gpa), COUNT(*)
FROM Students S
WHERE S.gender = 'F'
GROUP BY S.dept
HAVING COUNT(*) >= 2
ORDER BY S.dept;

Distinct Aggregate

SELECT COUNT(DISTINCT S.name)
FROM Students S
WHERE S.dept = 'CS';

SELECT DISTINCT COUNT(S.name)
FROM Students S
WHERE S.dept = 'CS';
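The two queries look alike but count different things. A small sketch with Python's built-in sqlite3 module, assuming a made-up Students table in which the name Ann appears twice in CS:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (name TEXT, dept TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [("Ann", "CS", 3.9), ("Ann", "CS", 3.1), ("Bob", "CS", 3.5), ("Cy", "EE", 3.0)],
)

# Duplicates are removed before counting, so Ann is counted once.
count_distinct = conn.execute(
    "SELECT COUNT(DISTINCT S.name) FROM Students S WHERE S.dept = 'CS'"
).fetchall()

# All three CS rows are counted; DISTINCT then applies to the single result row.
distinct_count = conn.execute(
    "SELECT DISTINCT COUNT(S.name) FROM Students S WHERE S.dept = 'CS'"
).fetchall()

print(count_distinct)  # [(2,)]
print(distinct_count)  # [(3,)]
```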

e.g

SELECT S.name, AVG(S.gpa) 
FROM Students S
GROUP BY S.dept;

Each dept group contains multiple names, so S.name is ambiguous here. Either group by S.name or wrap the column in an aggregate function.

SUMMARY for SQL1

Relational model has well-defined query semantics
Modern SQL extends “pure” relational model (some extra goodies for duplicate row, non-atomic types... more in next lecture)
Typically, many ways to write a query
- DBMS figures out a fast way to execute a query, regardless of how it is written.

DML: multiple tables

SELECT [DISTINCT] <column expression list>
FROM <single table>
[WHERE <predicate>]
[GROUP BY <column list>
[HAVING <predicate>] ]
[ORDER BY <column list>]
[LIMIT <integer>];

select - collection

Join queries

SELECT [DISTINCT] <column expression list>
FROM <table1 [AS t1], ... , tableN [AS tn]>
[WHERE <predicate>]
[GROUP BY <column list>
[HAVING <predicate>] ]
[ORDER BY <column list>];

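Cross (Cartesian) product

All pairs of tuples, concatenated

Use WHERE to filter: information on sailors that have reservation records

SELECT S.sid, S.sname, R.bid
FROM Sailors AS S, Reserves AS R
WHERE S.sid=R.sid -- match the foreign key to the primary key

Find the pairs first, then eliminate; rows first, then columns.

AS gives column names and table aliases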
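The "product first, filter second" reading can be checked directly. A minimal sketch with sqlite3, assuming tiny made-up Sailors and Reserves tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
INSERT INTO Sailors VALUES (1, 'Popeye'), (2, 'OliveOyl'), (3, 'Garfield');
INSERT INTO Reserves VALUES (1, 102), (2, 102), (1, 101);
""")

# Without WHERE, every sailor is paired with every reservation: 3 * 3 rows.
product_size = conn.execute(
    "SELECT COUNT(*) FROM Sailors AS S, Reserves AS R"
).fetchone()[0]

# WHERE keeps only the pairs whose sid values match.
joined = conn.execute("""
    SELECT S.sid, S.sname, R.bid
    FROM Sailors AS S, Reserves AS R
    WHERE S.sid = R.sid
    ORDER BY S.sid, R.bid
""").fetchall()

print(product_size)  # 9
print(joined)  # [(1, 'Popeye', 101), (1, 'Popeye', 102), (2, 'OliveOyl', 102)]
```

Garfield has no reservation, so he drops out of the filtered result.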

self-join

AS can also name the columns of the product.

Table aliases are a must in this case.

Arithmetic Expression

SELECT S.age, S.age-5 AS age1, 2*S.age AS age2
FROM Sailors AS S, Sailors AS S2
WHERE 2*S.age = S2.age - 1

SQL calculator

SELECT
log(1000) as three,
exp(ln(2)) as two,
cos(0) as one,
ln(2*3) = ln(2) + ln(3) as sanity;

4 - 1

string comparisons

Old School SQL (Like)

SELECT S.sname
FROM Sailors S
WHERE S.sname LIKE 'B_%'

Standard Regular Expressions

SELECT S.sname
FROM Sailors S 
WHERE S.sname ~ 'B.*'
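The two patterns are not quite equivalent. A rough sketch, assuming a made-up list of names: LIKE is run through sqlite3, while Python's `re` module stands in for the `~` operator, which is PostgreSQL syntax and matches anywhere in the string unless the pattern is anchored with `^`.

```python
import re
import sqlite3

names = ["Bob", "B", "Alice", "McBride"]  # hypothetical sample data

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sname TEXT)")
conn.executemany("INSERT INTO Sailors VALUES (?)", [(n,) for n in names])

# LIKE 'B_%': a B, then exactly one character, then anything (so at least two characters).
like_matches = [
    row[0]
    for row in conn.execute(
        "SELECT S.sname FROM Sailors S WHERE S.sname LIKE 'B_%' ORDER BY S.sname"
    )
]

# Unanchored regex, the behaviour of ~ 'B.*': a B anywhere in the string is enough.
regex_unanchored = [n for n in names if re.search(r"B.*", n)]

# Anchored at the start, the behaviour of ~ '^B.*'.
regex_anchored = [n for n in names if re.search(r"^B.*", n)]

print(like_matches)      # ['Bob']
print(regex_unanchored)  # ['Bob', 'B', 'McBride']
print(regex_anchored)    # ['Bob', 'B']
```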

combining Predicates

Subtle connections between:

Boolean logic in WHERE (i.e., AND, OR)
Traditional Set operations (i.e. INTERSECT, UNION)

Sid’s of sailors who reserved a red OR a green boat.

SELECT R.sid
FROM Boats B, Reserves R 
WHERE R.bid=B.bid AND (B.color='red' OR B.color='green')

Sid’s of sailors who reserved a red AND a green boat.

SELECT R.sid
FROM Boats B, Reserves R
WHERE R.bid=B.bid AND B.color='red'

INTERSECT

SELECT R.sid
FROM Boats B, Reserves R
WHERE R.bid=B.bid AND B.color='green'

UNION ALL ~ OR set operation
INTERSECT ~ AND set operation
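A quick way to see the correspondence is to run all the variants on the same data. A sketch with sqlite3, assuming made-up Boats and Reserves rows; the helper `sids` is only for this example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Boats (bid INTEGER, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
INSERT INTO Boats VALUES (101, 'red'), (102, 'green'), (103, 'blue');
INSERT INTO Reserves VALUES (1, 101), (1, 102), (2, 101), (3, 102), (4, 103);
""")


def sids(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


base = "SELECT R.sid FROM Boats B, Reserves R WHERE R.bid = B.bid AND "
red = base + "B.color = 'red'"
green = base + "B.color = 'green'"

either_or = sids(base + "(B.color = 'red' OR B.color = 'green')")
either_union_all = sids(red + " UNION ALL " + green)
both_intersect = sids(red + " INTERSECT " + green)
# Replacing OR with AND does not work: no single boat is both red and green.
naive_and = sids(base + "(B.color = 'red' AND B.color = 'green')")

print(either_or)         # [1, 1, 2, 3]
print(either_union_all)  # [1, 1, 2, 3]
print(both_intersect)    # [1]
print(naive_and)         # []
```

Sailor 1 shows up twice in the OR results because both of that sailor's reservations qualify.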

set semantics

Set: a collection of distinct elements
Standard ways of manipulating / combining sets
- Union
- Intersect
- Except
Treat tuples within a relation as elements of a set

Default to set semantics
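The difference between set and multiset behaviour shows up as soon as the inputs contain duplicates. A small sketch with sqlite3, assuming two made-up single-column tables A and B:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (x INTEGER);
CREATE TABLE B (x INTEGER);
INSERT INTO A VALUES (1), (1), (2);
INSERT INTO B VALUES (2), (3);
""")


def run(op):
    """Combine A and B with the given set operator and return sorted values."""
    sql = f"SELECT x FROM A {op} SELECT x FROM B"
    return sorted(row[0] for row in conn.execute(sql))


union = run("UNION")          # set semantics: duplicates removed
union_all = run("UNION ALL")  # multiset semantics: duplicates kept
intersect = run("INTERSECT")
except_ = run("EXCEPT")

print(union)      # [1, 2, 3]
print(union_all)  # [1, 1, 2, 2, 3]
print(intersect)  # [2]
print(except_)    # [1]
```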

Relational tables

SQL table	Relation
Set or multiset	Set
Ordering	No ordering

A table is just SQL's interpretation of a relation.

Distinct

\(A = \langle x, y \rangle\)
\(B = \langle y \rangle\)
\(A / B = \langle x \rangle\)

Relational Division: “Find sailors who’ve reserved all boats.”
Said differently: “sailors with no counterexample missing boats”

SELECT S.sname
FROM Sailors S
WHERE NOT EXISTS
  (SELECT B.bid
   FROM Boats B
   WHERE NOT EXISTS
     (SELECT R.bid
      FROM Reserves R
      WHERE R.bid = B.bid AND R.sid = S.sid))
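The double NOT EXISTS is easier to trust after running it. A minimal sketch with sqlite3, assuming made-up data in which Popeye reserved every boat and OliveOyl is missing boat 102:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT);
CREATE TABLE Boats (bid INTEGER);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
INSERT INTO Sailors VALUES (1, 'Popeye'), (2, 'OliveOyl');
INSERT INTO Boats VALUES (101), (102);
INSERT INTO Reserves VALUES (1, 101), (1, 102), (2, 101);
""")

# Outer NOT EXISTS: there is no boat B such that ...
# Inner NOT EXISTS: ... sailor S has no reservation for B.
reserved_all = conn.execute("""
    SELECT S.sname
    FROM Sailors S
    WHERE NOT EXISTS
      (SELECT B.bid
       FROM Boats B
       WHERE NOT EXISTS
         (SELECT R.bid
          FROM Reserves R
          WHERE R.bid = B.bid AND R.sid = S.sid))
""").fetchall()

print(reserved_all)  # [('Popeye',)]
```

For OliveOyl the inner query is empty for boat 102, so boat 102 is a counterexample and she is filtered out.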

ARGMAX

max's other property

SELECT *
FROM Sailors S WHERE S.rating >= ALL
(SELECT S2.rating FROM Sailors S2)

SELECT *
FROM Sailors S WHERE S.rating =
(SELECT MAX(S2.rating) FROM Sailors S2)

Both are okay and both return every sailor tied for the highest rating. By contrast, ORDER BY rating DESC LIMIT 1 is not stable and does not produce ties.
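To see how ties behave, here is a sketch with sqlite3, assuming a made-up Sailors table with two sailors sharing the top rating. SQLite does not support `>= ALL`, so only the MAX form is run, next to an ORDER BY ... LIMIT 1 query for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER);
INSERT INTO Sailors VALUES (1, 'Popeye', 10), (2, 'OliveOyl', 10), (3, 'Garfield', 7);
""")

# Comparing against MAX keeps every row that reaches the maximum.
with_ties = conn.execute("""
    SELECT S.sname
    FROM Sailors S
    WHERE S.rating = (SELECT MAX(S2.rating) FROM Sailors S2)
    ORDER BY S.sname
""").fetchall()

# LIMIT 1 keeps a single row; which of the tied sailors it is depends on the engine.
top_one = conn.execute(
    "SELECT sname FROM Sailors ORDER BY rating DESC LIMIT 1"
).fetchall()

print(with_ties)     # [('OliveOyl',), ('Popeye',)]
print(len(top_one))  # 1
```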

INNER JOIN

Cuts down on the ANDs needed with a cross product

SELECT s.*, r.bid
FROM Sailors s, Reserves r WHERE s.sid = r.sid
AND ...

SELECT s.*, r.bid
FROM Sailors s INNER JOIN Reserves r ON s.sid = r.sid
WHERE ...

Join variants

SELECT <column expression list>
FROM table_name
[INNER | NATURAL | {LEFT | RIGHT | FULL} [OUTER]] JOIN table_name
ON <qualification_list>
WHERE ...

The shadowed manifest list capacity.

Left Outer join

SELECT s.sid, s.sname, r.bid
FROM Sailors2 s LEFT OUTER JOIN Reserves2 r 
ON s.sid = r.sid;

Unmatched columns are set to NULL
Every row of Sailors2 is kept (at least one output row per sailor)

SELECT r.sid, b.bid, b.bname
FROM Reserves2 r RIGHT OUTER JOIN Boats2 b 
ON r.bid = b.bid

Unmatched columns are set to NULL
Every row of Boats2 is kept (at least one output row per boat)
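A left outer join on small data makes the NULL padding visible. A sketch with sqlite3, assuming made-up Sailors2 and Reserves2 rows; only LEFT is shown, since a right outer join is the mirror image with the operands swapped and older SQLite versions do not accept RIGHT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors2 (sid INTEGER, sname TEXT);
CREATE TABLE Reserves2 (sid INTEGER, bid INTEGER);
INSERT INTO Sailors2 VALUES (1, 'Popeye'), (2, 'OliveOyl'), (3, 'Garfield');
INSERT INTO Reserves2 VALUES (1, 101), (1, 102), (2, 101);
""")

left_joined = conn.execute("""
    SELECT s.sid, s.sname, r.bid
    FROM Sailors2 s LEFT OUTER JOIN Reserves2 r
    ON s.sid = r.sid
    ORDER BY s.sid, r.bid
""").fetchall()

for row in left_joined:
    print(row)
# (1, 'Popeye', 101)
# (1, 'Popeye', 102)
# (2, 'OliveOyl', 101)
# (3, 'Garfield', None)
```

Popeye appears twice because he has two reservations, and Garfield is kept with bid set to NULL (None in Python).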

NATURAL & INNER

SELECT s.sid, s.sname, r.bid FROM Sailors s, Reserves r WHERE s.sid = r.sid
AND s.age > 20;

SELECT s.sid, s.sname, r.bid
FROM Sailors s INNER JOIN Reserves r ON s.sid = r.sid
WHERE s.age > 20;

SELECT s.sid, s.sname, r.bid
FROM Sailors s NATURAL JOIN Reserves r WHERE s.age > 20;

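Equi-joins and natural joins only output the rows for which the ON condition holds. The three queries above are equivalent, assuming sid is the only column name the two tables share.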

FULL OUTER JOIN

s.bname is set to NULL where it does not exist
b.name and b.bid are set to NULL where they do not exist

Views

Named Queries

Makes development simpler
Often used for security
Not Materialized

CREATE VIEW Redcount AS
SELECT B.bid, COUNT(*) AS scount
FROM Boats2 B, Reserves2 R
WHERE R.bid = B.bid AND B.color = 'red'
GROUP BY B.bid
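"Not materialized" means the view's query is re-run each time the view is read. A sketch with sqlite3, assuming made-up Boats2 and Reserves2 rows, that reads the view before and after a new reservation is added:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Boats2 (bid INTEGER, bname TEXT, color TEXT);
CREATE TABLE Reserves2 (sid INTEGER, bid INTEGER);
INSERT INTO Boats2 VALUES (101, 'Nina', 'red'), (102, 'Pinta', 'red'), (103, 'Santa Maria', 'blue');
INSERT INTO Reserves2 VALUES (1, 101), (2, 101), (1, 102), (3, 103);

CREATE VIEW Redcount AS
SELECT B.bid, COUNT(*) AS scount
FROM Boats2 B, Reserves2 R
WHERE R.bid = B.bid AND B.color = 'red'
GROUP BY B.bid;
""")

read_view = "SELECT bid, scount FROM Redcount ORDER BY bid"

before = conn.execute(read_view).fetchall()
conn.execute("INSERT INTO Reserves2 VALUES (3, 102)")  # one more reservation of a red boat
after = conn.execute(read_view).fetchall()

print(before)  # [(101, 2), (102, 1)]
print(after)   # [(101, 2), (102, 2)]
```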

Subqueries in FROM

Like a “view on the fly”!

SELECT bname, scount
FROM Boats2 B,
     (SELECT B.bid, COUNT(*)
      FROM Boats2 B, Reserves2 R
      WHERE R.bid = B.bid AND B.color = 'red'
      GROUP BY B.bid) AS Reds(bid, scount)
WHERE Reds.bid = B.bid AND scount < 10

Common table expression (WITH)

another view on the fly

WITH Reds(bid, scount) AS
(SELECT B.bid, COUNT(*)
FROM Boats2 B, Reserves2 R
WHERE R.bid = B.bid AND B.color = 'red'
GROUP BY B.bid)
SELECT bname, scount
FROM Boats2 B, Reds
WHERE Reds.bid = B.bid AND scount < 10

ARGMAX GROUP BY

The sailor with the highest rating per age

WITH maxratings(age, maxrating) AS (SELECT age, max(rating)
FROM Sailors
GROUP BY age)

SELECT S.*
FROM Sailors S, maxratings m
WHERE S.age = m.age
AND S.rating = m.maxrating;
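The per-group argmax can be checked on a few rows. A sketch with sqlite3, assuming a made-up Sailors table in which two sailors aged 39 tie for the top rating:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age INTEGER);
INSERT INTO Sailors VALUES
  (1, 'Popeye', 10, 22),
  (2, 'OliveOyl', 11, 39),
  (3, 'Garfield', 1, 27),
  (4, 'Bob', 5, 22),
  (5, 'Luis', 11, 39);
""")

# The CTE computes the best rating per age; the join keeps the sailors who reach it.
best_per_age = conn.execute("""
    WITH maxratings(age, maxrating) AS
      (SELECT age, MAX(rating)
       FROM Sailors
       GROUP BY age)
    SELECT S.sid, S.sname, S.rating, S.age
    FROM Sailors S, maxratings m
    WHERE S.age = m.age
      AND S.rating = m.maxrating
    ORDER BY S.sid
""").fetchall()

print([row[0] for row in best_per_age])  # [1, 2, 3, 5]
```

Bob is dropped because Popeye has a higher rating at the same age, while OliveOyl and Luis are both kept as a tie.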

NULL values

Field values are sometimes unknown
– SQL provides a special value NULL for such situations.
– Every data type can be NULL
The presence of null complicates many issues. E.g.:
– Selection predicates (WHERE)
– Aggregation
But NULLs come naturally from outer joins

NULL op x is NULL
Explicit NULL checks

SELECT * FROM sailors WHERE rating IS NOT NULL;

If the WHERE predicate evaluates to NULL, the row is not output.
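These rules are easy to verify on a tiny table. A sketch with sqlite3, assuming a made-up sailors table where OliveOyl's rating is unknown; the helper `names` is only for this example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sailors (sname TEXT, rating INTEGER)")
conn.executemany(
    "INSERT INTO sailors VALUES (?, ?)",
    [("Popeye", 10), ("OliveOyl", None), ("Garfield", 7)],
)


def names(where):
    """Return the sorted names of the sailors that pass the WHERE clause."""
    sql = f"SELECT sname FROM sailors WHERE {where}"
    return sorted(row[0] for row in conn.execute(sql))


# NULL op x is NULL (None in Python).
null_plus_one = conn.execute("SELECT NULL + 1").fetchone()[0]

# For OliveOyl both predicates evaluate to NULL, so she appears in neither result.
high = names("rating > 8")
not_high = names("NOT (rating > 8)")

# Explicit NULL check.
known = names("rating IS NOT NULL")

print(null_plus_one)  # None
print(high)           # ['Popeye']
print(not_high)       # ['Garfield']
print(known)          # ['Garfield', 'Popeye']
```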
