Intro to Parallelism
Architectures and Software Structures: we will focus on the shared-nothing architecture here 😋
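Kinds of Query Parallelism
side note: intra- means within a single query or operator; inter- means across multiple queries or operators at the same level.
Parallel Data Access
Data Partitioning across Machines: round robin deals the tuples out to the machines in turn, so each machine ends up with about the same amount of data, regardless of the tuples' values.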
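The partitioning schemes that come up in this part (round robin, plus the hash and range partitioning mentioned below) can be sketched roughly as follows. This is a minimal illustration, not how any particular system does it: the function names, the use of `zlib.crc32` as the hash, and the "sorted split points" layout for ranges are all assumptions made for the example.

```python
import bisect
import zlib
from typing import Any, Callable, List, Sequence


def round_robin(rows: Sequence[Any], n_sites: int) -> List[List[Any]]:
    """Deal rows out in turn: row i goes to site i mod n_sites."""
    sites: List[List[Any]] = [[] for _ in range(n_sites)]
    for i, row in enumerate(rows):
        sites[i % n_sites].append(row)
    return sites


def hash_partition(rows: Sequence[Any], n_sites: int,
                   key: Callable[[Any], Any]) -> List[List[Any]]:
    """Send each row to the site chosen by a hash of its key.

    crc32 over repr(key) is an arbitrary, deterministic choice for this sketch.
    """
    sites: List[List[Any]] = [[] for _ in range(n_sites)]
    for row in rows:
        h = zlib.crc32(repr(key(row)).encode())
        sites[h % n_sites].append(row)
    return sites


def range_partition(rows: Sequence[Any], bounds: Sequence[Any],
                    key: Callable[[Any], Any]) -> List[List[Any]]:
    """Split by sorted split points: site 0 holds keys < bounds[0],
    site i holds bounds[i-1] <= key < bounds[i], the last site holds the rest."""
    sites: List[List[Any]] = [[] for _ in range(len(bounds) + 1)]
    for row in rows:
        sites[bisect.bisect_right(bounds, key(row))].append(row)
    return sites
```

Round robin balances load well but tells you nothing about where a given key lives; hash and range partitioning trade some balance for that knowledge.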
Parallel scans: scan on every machine in parallel, then merge the results.
$\sigma_p$ : a selection operator that can skip entire sites that have no matching tuples, when the data is range or hash partitioned.
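To see why a selection can skip whole sites, here is a small sketch for the range-partitioned case. It assumes keys are stored directly as values and that site `i` holds keys between consecutive sorted split points (`bounds`); both the layout and the names `sites_for_range` and `select_range` are made up for illustration.

```python
import bisect
from typing import List, Sequence


def sites_for_range(bounds: Sequence[int], lo: int, hi: int) -> List[int]:
    """Indices of the sites that may hold keys in [lo, hi].

    Assumed layout: site 0 holds keys < bounds[0], site i holds
    bounds[i-1] <= key < bounds[i], and the last site holds the rest.
    """
    first = bisect.bisect_right(bounds, lo)
    last = bisect.bisect_right(bounds, hi)
    return list(range(first, last + 1))


def select_range(sites: Sequence[Sequence[int]], bounds: Sequence[int],
                 lo: int, hi: int) -> List[int]:
    """Evaluate lo <= key <= hi, scanning only the sites that can match."""
    out: List[int] = []
    for i in sites_for_range(bounds, lo, hi):
        out.extend(k for k in sites[i] if lo <= k <= hi)
    return out


sites = [[1, 5], [10, 15], [20, 25]]
bounds = [10, 20]
print(sites_for_range(bounds, 12, 18))      # only site 1 is scanned
print(select_range(sites, bounds, 12, 18))
```

Under hash partitioning the same trick works for equality predicates only: hashing the constant gives the single site that can hold a match. Under round robin no site can be skipped.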
🎉
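Intro: the principles of a transaction are ACID.
Isolation (Concurrency): each transaction should appear to run on its own; however, we do not actually fall back on serial execution 😅
Atomicity and Durability
Consistency
Concurrency Control
Basic notation
Serial equivalence: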
$Def1:$ Serial Schedule: each transaction executes in a serial order, one after the other, without any intervening actions from other transactions.
$Def2:$ Equivalent schedules: two schedules are equivalent if they involve the same transactions, each transaction's actions appear in the same order, and both schedules have the same effect on the database's final state.
$Def3:$ Serializable: a schedule is serializable if it is equivalent to some serial schedule.
Conflict Serializability: conflict operations?
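As a sketch of the usual textbook answer: two operations conflict when they come from different transactions, touch the same object, and at least one of them is a write. A schedule is conflict serializable exactly when its precedence graph (an edge from Ti to Tj whenever an operation of Ti comes before a conflicting operation of Tj) has no cycle. The tuple encoding `(transaction, "R" or "W", object)` and the function names below are assumptions for the example.

```python
from typing import Dict, List, Set, Tuple

Action = Tuple[str, str, str]  # (transaction, "R" or "W", object)


def conflicts(a: Action, b: Action) -> bool:
    """Different transactions, same object, at least one write."""
    return a[0] != b[0] and a[2] == b[2] and "W" in (a[1], b[1])


def precedence_graph(schedule: List[Action]) -> Dict[str, Set[str]]:
    """Edge Ti -> Tj if an action of Ti precedes a conflicting action of Tj."""
    graph: Dict[str, Set[str]] = {t: set() for t, _, _ in schedule}
    for i, a in enumerate(schedule):
        for b in schedule[i + 1:]:
            if conflicts(a, b):
                graph[a[0]].add(b[0])
    return graph


def is_conflict_serializable(schedule: List[Action]) -> bool:
    """True iff the precedence graph is acyclic (depth-first cycle check)."""
    graph = precedence_graph(schedule)
    state: Dict[str, int] = {}  # 1 = on the current DFS path, 2 = finished

    def has_cycle(node: str) -> bool:
        state[node] = 1
        for nxt in graph[node]:
            if state.get(nxt) == 1:
                return True
            if nxt not in state and has_cycle(nxt):
                return True
        state[node] = 2
        return False

    return not any(has_cycle(t) for t in graph if t not in state)


# T1 reads A, T2 overwrites A, then T1 writes A: edges T1 -> T2 and T2 -> T1.
interleaved = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
print(is_conflict_serializable(interleaved))
```

Conflict serializability is a sufficient condition for serializability in the sense of $Def3$, and it is the one that is cheap to check.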
Two Phase Locking (2PL)
Strict 2PL: same as 2PL, but with a stricter rule for releasing locks: all locks are released at once, when the transaction ends.
The pink area is Strict 2PL.
Lock Management: there is a lock manager, which maintains a hash table keyed on the names of the objects being locked.
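A toy version of such a lock table might look like the sketch below. It is single-threaded bookkeeping only (no real blocking), it assumes just shared (`"S"`) and exclusive (`"X"`) modes, and the class and method names are invented for illustration. `release_all` mirrors the Strict 2PL rule of dropping every lock together when the transaction ends.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class LockManager:
    """Hash table keyed on object name; each entry has a granted set and a FIFO wait queue."""

    def __init__(self) -> None:
        self.table: Dict[str, dict] = {}

    def _compatible(self, entry: dict, txn: str, mode: str) -> bool:
        # Modes held by *other* transactions on this object.
        others = [m for t, m in entry["granted"].items() if t != txn]
        if mode == "S":
            return all(m == "S" for m in others)
        return not others  # "X" needs the object to itself

    def acquire(self, txn: str, name: str, mode: str) -> bool:
        """Return True if granted now, False if the request was queued."""
        entry = self.table.setdefault(name, {"granted": {}, "queue": deque()})
        held = entry["granted"].get(txn)
        if held == "X" or held == mode:
            return True  # already holds a lock at least this strong
        # New requests may not jump the queue; upgrades (held is not None) may.
        may_try = held is not None or not entry["queue"]
        if may_try and self._compatible(entry, txn, mode):
            entry["granted"][txn] = mode
            return True
        entry["queue"].append((txn, mode))
        return False

    def release_all(self, txn: str) -> List[Tuple[str, str, str]]:
        """Drop every lock and pending request of txn; return newly granted requests."""
        woken: List[Tuple[str, str, str]] = []
        for name, entry in self.table.items():
            entry["granted"].pop(txn, None)
            queue: Deque = deque(r for r in entry["queue"] if r[0] != txn)
            entry["queue"] = queue
            while queue and self._compatible(entry, *queue[0]):
                t, m = queue.popleft()
                entry["granted"][t] = m
                woken.append((t, name, m))
        return woken
```

A real lock manager would block the requesting thread instead of returning `False`, and would also need deadlock handling on top of this table.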
Deadlocks 🤔
Why do they happen? side note:
prioritizing upgrades can avoid #2; unlike the OS, which can impose a fixed order on the required resources……
Avoiding deadlocks: timeout first. TIMEOUT is a not-so-bad idea 🤔
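The timeout idea can be sketched with Python's standard `threading.Lock`, whose `acquire` accepts a `timeout` in seconds. The helper name `acquire_with_timeout` and the "give everything back and let the caller retry" policy are assumptions for this example, standing in for a transaction that aborts when it waits too long.

```python
import threading
from typing import Sequence


def acquire_with_timeout(locks: Sequence[threading.Lock], timeout_s: float) -> bool:
    """Try to take every lock, waiting at most timeout_s for each.

    On a timeout, release whatever was taken so far and report failure,
    so the caller can abort and retry instead of waiting forever.
    """
    held = []
    for lock in locks:
        if lock.acquire(timeout=timeout_s):
            held.append(lock)
        else:
            for h in reversed(held):
                h.release()
            return False
    return True


a, b = threading.Lock(), threading.Lock()
b.acquire()                                   # someone else holds b
print(acquire_with_timeout([a, b], 0.05))     # times out on b, gives a back
b.release()
print(acquire_with_timeout([a, b], 0.05))     # now both can be taken
```

The weak spot is choosing the timeout: too short aborts transactions that were merely waiting, too long leaves real deadlocks in place for a while.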