0%

sudo -i switch to the root user

passwd change password for current user

create alias

1
alias ip = 'curl ipinfo.io/ip'

this is available only in the current shell session

to make persistent, add that into ~/.bashrc

~/.bashrc file is executed whenever bash shell started

~ represents user's home directory

enable no password login

  1. edit the config file
1
sudo nano /etc/pam.d/gdm-password
  1. add rule to the first line:
1
auth sufficient pam_succeed_if.so user = sdcgvhgj

run a script on startup

  • crontab -e edit the cron table
  • add a new line at the bottom
1
@reboot /path/to/script arg1 arg2
  • crontab -l list cron table

another way: use systemd

append a line to crontab

1
(crontab -l; echo "0 4 * * * myscript")| crontab -

difference between ; and &&

link

In cmd1 && cmd2, cmd2 is only executed if cmd1 succeeds (returns 0).

In cmd1 ; cmd2, cmd2 is executed in any case.

https://changkun.de/modern-cpp/zh-cn/00-preface/

https://www.learncpp.com/

Install GCC compiler on Windows

It is recommended to use MinGW-w64

  1. download the release of w64devkit
  2. unzip
  3. add w64devkit\bin to the PATH environment variable

same to install CMake

C++并发

(看了好多资料,要么依赖太多要么讲的太少,终于找到一篇合适的教程,这人讲的真好,举的例子也贴近实际)

  • thread

    • join, detach
    • yield, get_id, sleep_for, sleep_until
    • thread::hardware_concurrency() //硬件并发数
  • mutex

    • once_flag, call_once
    • mutex, timed_mutex, recursive_mutex, shared_mutex (用于读写模型)
    • lock(), try_lock(), unlock()
    • lock(locable_1, locable_2) 标准库的实现保证不会死锁
    • lock_guard, unique_lock, shared_lock, scoped_lock
  • condition_variable

    • works only with unique_lock
    • check -> unlock -> wait -> lock -> check -> unlock -> wait -> lock -> ...
    • check -> go on
  • future

    • async
    • packaged_task
    • promise
  • parallel algorithm in <algorithm> and <numeric>

    1
    2
    3
    sort(execution::seq, copy1.begin(), copy1.end());
    sort(execution::par, copy2.begin(),copy2.end());
    sort(execution::par_unseq, copy2.begin(),copy2.end());

C++内存模型

https://paul.pub/cpp-memory-model/

C++中值的类别

https://paul.pub/cpp-value-category/

Misc

polymorphism with smart opinters

1
2
static_pointer_cast
dynamic_pointer_cast

子类对象被cast成父类,成员函数调的是子类的函数

Google C++ Style

https://google.github.io/styleguide/cppguide.html

Short Tutorial

https://www.thegeekstuff.com/2016/02/c-plus-plus-11/

enum

1
2
3
4
enum class CoordinateArea { FirstArea, SecondArea, ThirdArea, FourthArea};

CoordinateArea caOne = CoordinateArea::FirstArea;
CoordinateArea caSome= CoordinateArea::FourhtArea;

lambda function

1
2
3
4
[firstPart](secondPart) TypeYouReturn{ BodyOfLambda}(acctualParameters);

vectror<int> iVector;
for_each( begin(iVector), end(iVector), [](int n){if(n%2==0)cout<<n<<end;});

[firstPart]

  • [] nothing

  • [&] reference

  • [=] copy

  • [this]

static assertion

1
2
3
static_assert(evaluatedExpression, stringMessage);

static_assert(sizeof(long long int)>=16;”This is unexpected”);

Move and &&

one could use rvalue as reference as well

1
MovableClass&& operator=(MovableClass&&); 

Pointers

1
2
3
4
5
6
7
unique_ptr<someType> suniquePtr(new someType(args));
...
uniquePtr.release();

shared_ptr<someType> somePtr(new someType(args));

weak_ptr<someType> weakPtr= somePtr;

Tuple

1
auto tuple = make_tuple(“triangle”, ‘t’, 10, 15, 20);

菜鸟教程

C

存储类

  • auto:默认的存储类
  • register:放寄存器,无地址
  • static:
    • 修饰局部变量:在函数调用之间保持局部变量的值
    • 修饰全局变量:变量的作用域限制在声明它的文件内(修饰函数同理)
  • extern:
    • 修饰局部变量:引用全局变量
    • 修饰全局变量:引用另一个文件的变量

静态数组

  • int a[10];
  • 存储在栈或者全局数据区
  • 大小固定
  • 生命周期取决于作用域
  • sizeof(a) 返回占用字节数
  • 很多情况下,a等价于a[0] 的指针

动态数组

  • int *a = (int *)malloc(10 * sizeof(int));

  • 存储在堆,用free(a);释放内存

  • 大小可变,用realloc重新分配内存

    a = (int *)realloc(a, 20 * sizeof(int));

枚举

1
2
3
4
enum DAY{
MON=1, TUE, WED, THU, FRI, SAT, SUN
};
enum DAY day = WED;

指针数组

1
const char *names[] = {"Zara Ali","Hina Ali","Nuha Ali","Sara Ali"};

函数指针

1
2
int (*fun_ptr)(int, int) = fun;
fun_ptr(x, y);

函数指针数组

1
void (*fun_ptr_arr[])(int, int) = {add, subtract, multiply};

函数指针可以作为函数的参数

命令行参数

  • argc 表示传入参数的个数,没有参数时值为1

  • argv[0] 表示程序的名称

C++

类成员默认为private

构造函数,析构函数,复制构造函数

1
2
3
4
5
6
7
8
9
classname(int x, int y): x(x), y(y) {
...
}
~classname() {
...
}
classname (const classname &obj) {
...
}

友元函数,友元类

public:任何

private:自己,友元

protected:自己,友元,子类

public继承:不变

protected继承:public变protected

private继承:全变private

内联函数 inline

this:

  • 指向当前对象
  • 只能成员函数中使用

类的静态变量

  • 在类外使用::初始化

类的静态函数

虚函数 virtual

  • 告诉编译器不要静态链接
  • 动态链接,后期绑定
  • 允许用基类的指针来调用子类的这个函数

纯虚函数:只有函数声明没有函数定义

1
virtual int f(int, int) = 0;

抽象类:有纯虚函数的类,不能实例化对象

泛型

1
2
template <typename T>
template <class T>

explicit构造函数:禁止隐式类型转换

const成员函数:不会修改成员变量

Reference: Git tutorial, Git docs Git Book

Configuration

1
2
git config --global http.proxy 127.0.0.1:10809
git config --global https.proxy 127.0.0.1:10809
1
2
git config --global user.name sdcgvhgj
git config --global user.email sdcgvhgj@gmail.com
1
git config --global core.editor "nano -w"

where --global means it works for all repositories

SSH key

  1. Generate SSH key:
1
ssh-keygen -o -t rsa -C "sdcgvhgj@gmail.com"
  1. Copy your key:
1
cat /root/.ssh/id_rsa.pub
  1. Add your key at GitHub key settings

Clone

1
git clone https://github.com/USERNAME/REPOSITORY.git

What this does:

  • make new folder
  • initialize as a Git repository
  • create remote origin
  • fetch all branches
  • check out the default branch

Add Remote

1
git remote add upstream REMOTE_URL

add a new remote called upstream

Fetch

1
git fetch REMOTE-NAME

grabs all the new remote-tracking branches and tags without merging those changes into your own branches.

Merge

1
git merge REMOTE-NAME/BRANCH-NAME

merge a remote-tracking branch (i.e., a branch fetched from a remote repository) with your local branch

Pull

1
git pull REMOTE-NAME BRANCH-NAME

complete both git fetch and git merge

use git merge --abort to take the branch back to where it was in before you pulled

Contribute

1
git branch my-branch

create a new branch

1
git checkout my-branch

switch to that branch

1
git add file1.md file2.md

or

1
git add .

stage the changed files

1
git commit -m "edit file1.md, file2.md"

commit changes

Push

1
git push REMOTE-NAME BRANCH-NAME

push your local changes to your online repository

1
git push REMOTE-NAME LOCAL-BRANCH-NAME:REMOTE-BRANCH-NAME

push the LOCAL-BRANCH-NAME to your REMOTE-NAME, but it is renamed to REMOTE-BRANCH-NAME

1
git push -u REMOTE-NAME BRANCH-NAME

-u specifies default branch for future git push and git pull commands

non-fast-forward error:

  • occurs when another person has pushed to the same branch as you

  • pull first before push

Others

git init initialize your repository (create .git file in current folder)

git status shows the status of changes as untracked, modified, or staged

git log --graph show log

git branch -v shows the branches being worked on locally

git branch BRANCH-NAME create branch

git branch -d BRANCH-NAME delete branch

git branch -M BRANCH-NAME rename the current branch to BRANCH-NAME

git remote -v shows remote URLs

git checkout BRANCH-NAME switch branch

git tag list all tags

git reset --hard HEAD throw all uncommitted changes

git commit --amend -m "commit info"

git rebase main rebase current branch to main

git stash save "info" save uncommitted files

git stash pop

git revert

git reflog

1
2
3

g++ gdbdemo.cpp -ggdb -o main
gdb main

https://stackoverflow.com/questions/41420814/how-to-view-the-internal-data-of-a-smart-pointer-inside-gdb

cntr+xa: TUI

basics

1
2
3
4
5
6
7
8
9
10
11
12
13
14
run FILE_NAME
start //break main & run
break FILE_NAME:FUNCTION_NAME
where //trace along with the line number.
next
list
list ROW_NUMBER
print VARIABLE_NAME
print VARIABLE_NAME=VALUE
info locals
continue
disable
make FILE_NAME
restart

others

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
info gdb 					//Manual
info locals //Vars in local scope
info variables //Vars declared outside current scope
info functions //Names and datatypes of all defined functions
info b //List all breakpoints
break funcName //Set breakpoint at function funcName (short: b funcName)
break file::line //Set breakpoint at line in file
layout next //Cycle through the layouts of gdb
p var //Print the value of variable var
p var = value //Force set value to a var
run //Start the program
start //Synonymous to (b main && run). Puts temporary b at main
next //Execute the current line in source (short: n)
step //Step into function call at current line (short: s)
finish //Finish the execution of current function (short: fin)
continue //Resume execution (After a breakpoint) (short: c)
refresh //Repaint the interface (To fix corrupted interface)
shell cmd //Run shell command cmd from gdb prompt
python gdb.execute(cmd) //Run a gdb command cmd from python prompt
set print pretty on //Enable pretty printing
(Put in ~/.gdbinit)
$ gdb -c core.num //Examine the dumped core file from a SIGSEGV(shell command)
bt //Print backtrace
break _exit //Breakpoint at exit of program
whatis expr //Print datatype of expr
ptype expr //Detailed print of datatype of expr
watch var //Stop when var is modified
watch -l foo //Watch foo loaction
rwatch foo //Stop when foo is read
watch foo if foo>10 //Watch foo conditionally
delete //Delete all breakpoints
delete breakpoint_no //Delete breakpoint breakpoint_no
command breakpoint_no //Run user listed commands when breakpoint is hit
(End list of commands with 'end')
file executable //Load the executable for debugging from inside gdb
quit //Quit (short: q)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
#!/bin/bash

echo "password:"
read password
echo "your password is: $password"

set -x
apt update -y
# apt upgrade -y

echo ----------Trojan----------
cd /root
mkdir trojan
cd trojan
wget https://github.com/p4gefau1t/trojan-go/releases/download/v0.10.6/trojan-go-linux-amd64.zip
apt install -y unzip
unzip trojan-go-linux-amd64.zip
curl https://get.acme.sh | sh
apt install -y socat
ln -s /root/.acme.sh/acme.sh /usr/local/bin/acme.sh
acme.sh --register-account -m sdcgvhgj@qq.com
acme.sh --issue -d sdcgvhgj.top --standalone -k ec-256
acme.sh --installcert -d sdcgvhgj.top --ecc --key-file /root/trojan/server.key --fullchain-file /root/trojan/server.crt
echo "
run-type: server
local-addr: 0.0.0.0
local-port: 10000
remote-addr: tcs.nju.edu.cn
remote-port: 80
password:
- $password
ssl:
cert: /root/trojan/server.crt
key: /root/trojan/server.key
sni: sdcgvhgj.top
" > config.yaml
nohup ./trojan-go > trojan.log 2>&1 &
iptables -t nat -A PREROUTING -p tcp --dport 10000:12000 -j REDIRECT --to-port 10000
iptables -t nat -A PREROUTING -p udp --dport 10000:12000 -j REDIRECT --to-port 10000
# (crontab -l; echo "@reboot iptables -t nat -A PREROUTING -p tcp --dport 10000:12000 -j REDIRECT --to-port 10000") | crontab -
# (crontab -l; echo "@reboot iptables -t nat -A PREROUTING -p udp --dport 10000:12000 -j REDIRECT --to-port 10000") | crontab -
(crontab -l; echo "@reboot nohup /root/trojan/trojan-go --config /root/trojan/config.yaml > /root/trojan/trojan.log 2>&1 &") | crontab -
# iptables -t nat -L PREROUTING -nv --line-number

echo ----------Aria----------
apt install -y aria2
mkdir /root/.aria2
touch /root/.aria2/aria2.session
touch /root/.aria2/aria2.conf
mkdir /home/download
echo "
dir=/home/download
input-file=/root/.aria2/aria2.session
save-session=/root/.aria2/aria2.session
log=/root/.aria2/aria2.log
continue=true
enable-rpc=true
rpc-secret=$password
rpc-listen-port=6800
rpc-allow-origin-all=true
rpc-listen-all=true
#rpc-secure=true
#rpc-certificate=/root/trojan/server.cer
#rpc-private-key=/root/trojan/server.key
" > /root/.aria2/aria2.conf
aria2c --daemon=true --conf-path=/root/.aria2/aria2.conf

echo ----------Filebrowser----------
cd /root
mkdir filebrowser
cd filebrowser
wget https://github.com/filebrowser/filebrowser/releases/download/v2.23.0/linux-amd64-filebrowser.tar.gz
tar -zxvf linux-amd64-filebrowser.tar.gz
./filebrowser config init
./filebrowser config set --address 0.0.0.0 --baseurl '/f' --port 10001 --log ./filebrowser.log --root /home
./filebrowser users add admin $password --perm.admin
nohup ./filebrowser > /dev/null 2>&1 &

echo ----------Docker----------
for pkg in docker.io docker-doc docker-compose podman-docker containerd runc; do sudo apt-get remove $pkg; done
apt update
apt install -y ca-certificates curl gnupg
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
chmod a+r /etc/apt/keyrings/docker.gpg
echo \
"deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
"$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
apt update
apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

echo ----------Nginx----------
apt install -y nginx
cd /var/www/html
mkdir a
cd a
wget https://github.com/mayswind/AriaNg/releases/download/1.3.3/AriaNg-1.3.3.zip
unzip AriaNg-1.3.3.zip
mkdir /home/files
echo '
server {
listen 443 ssl;
server_name sdcgvhgj.top;
ssl_certificate /root/trojan/server.crt;
ssl_certificate_key /root/trojan/server.key;
root /var/www/html;
location / {
try_files $uri $uri/ =404;
}
location /a {
try_files $uri $uri/ =404;
}
location /f {
proxy_pass http://127.0.0.1:10001;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Real-Ip $remote_addr;
}
location /files {
autoindex on;
root /home;
}
}
server {
listen 80;
root /var/www/html;
server_name sdcgvhgj.top;
location / {
rewrite ^(.*)$ https://$host$1 permanent;
}
location /a {
try_files $uri $uri/ =404;
}
}
' > /etc/nginx/sites-enabled/default
nginx -t
systemctl start nginx

echo ----------NextCloud----------
systemctl stop nginx
acme.sh --issue -d cloud.sdcgvhgj.top --standalone -k ec-256
acme.sh --installcert -d cloud.sdcgvhgj.top --ecc --key-file /root/trojan/cloud.key --fullchain-file /root/trojan/cloud.crt
docker run \
--detach \
--sig-proxy=false \
--name nextcloud-aio-mastercontainer \
--restart always \
--publish 20001:8080 \
--env APACHE_PORT=10002 \
--env APACHE_IP_BINDING=0.0.0.0 \
--volume nextcloud_aio_mastercontainer:/mnt/docker-aio-config \
--volume /var/run/docker.sock:/var/run/docker.sock:ro \
nextcloud/all-in-one:latest
echo '
server {
listen 80;
listen [::]:80; # comment to disable IPv6

if ($scheme = "http") {
return 301 https://$host$request_uri;
}

listen 443 ssl http2; # for nginx versions below v1.25.1
listen [::]:443 ssl http2; # for nginx versions below v1.25.1 - comment to disable IPv6

# listen 443 ssl; # for nginx v1.25.1+
# listen [::]:443 ssl; # for nginx v1.25.1+ - keep comment to disable IPv6

server_name cloud.sdcgvhgj.top;

location / {
proxy_pass http://127.0.0.1:10002$request_uri;

proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Port $server_port;
proxy_set_header X-Forwarded-Scheme $scheme;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header Accept-Encoding "";
proxy_set_header Host $host;

client_body_buffer_size 512k;
proxy_read_timeout 86400s;
client_max_body_size 0;
}

ssl_certificate /root/trojan/cloud.crt;
ssl_certificate_key /root/trojan/cloud.key;

ssl_session_timeout 1d;
ssl_session_cache shared:MozSSL:10m; # about 40000 sessions
ssl_session_tickets off;

ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-CHACHA20-POLY1305;
ssl_prefer_server_ciphers on;
}
' > /etc/nginx/sites-enabled/cloud
nginx -t
systemctl start nginx

https://lele1894.blogspot.com/2022/04/blog-post.html

youtube小姐姐

  • https://www.youtube.com/watch?v=XbkR4hWo_W8

  • https://www.youtube.com/@the_influencer_/videos

bilibili直播rtmp推流链接 https://link.bilibili.com/p/center/index#/my-room/start-live

OBS其直播推流及视频录制教程 https://www.bilibili.com/read/cv6446984?spm_id_from=333.999.0.0

merge two webm files (video and audio) in ffmpeg

ffmpeg -i video.webm -i audio.webm -c copy output.mkv

Reading Notes of: Quantum Computation and Quantum Information

2.1 Linear Algebra

vector space (e.g. \(\mathbb{C}^n\))

a vector subspace of a vector space \(V\) is a subset \(W\) of \(V\) such that \(W\) is also a vector space

a basis of a vector space

the dimension of a vector space is the number of elements in the basis

a linear operator between vector spaces \(V\) and \(W\)

a matrix representation of the operator \(A:V\rightarrow W\) \[ A|v_j\rangle=\sum_iA_{ij}|w_i\rangle \] To make the connection between matrices and linear operators we must specify a set of input and output basis states for the input and output vector spaces of the linear operator.

inner product : 1. linear 2. commutation 3. (v,v)>=0

inner product space : a vector space equipped with an inner product

a Hilbert space is exactly the same thing as an inner product space

orthogonal

norm

unit vector

orthonormal

orthonormal basis (produced by Gram-Schmidt procedure from a basis)

a matrix representation is with respect to orthonormal input and output bases

a vector has a matrix representation with respect to an orthonormal basis

the inner product of two vectors is equal to the vector inner product between two matrix representations of those vectors

define outer product \(|w\rangle\langle v|\) as a linear operator from \(V\) to \(W\), making use of the inner product

completeness relation: \(\sum_i|i\rangle\langle i|=I\)

representation of any linear operator in the outer product notation: \[ A=\sum_{ij}|w_i\rangle\langle w_i|A|v_j\rangle\langle v_j| =\sum_{ij}\langle w_i|A|v_j\rangle |w_i\rangle\langle v_j|=\sum_{ij}A_{ij}|w_i\rangle\langle v_j| \] Cauchy-Schwarz inequality : \(|\langle v|w\rangle|^2\leq\langle v|v\rangle\langle w|w\rangle\)

eigenvector, eigenvalue of a linear operator

The characteristic function depends only upon the operator \(A\), and not on the specific matrix representation used for \(A\).

eigenspace corresponding an eigenvalue \(v\) is a vector subspace of \(V\)

diagonal representation \(A=\sum_i\lambda_i|i\rangle\langle i|\) , where \(|i\rangle\) form an orthonormal set of eigenvectors for \(A\).

an eigenspace is called degenerate when it's more than one dimensional

the adjoint / Hermitian conjugate of a linear operator on a Hilbert space

a Hermitian / self-adjoint operator

the projector onto the subspace \(W\) is \(P\equiv\sum_{i=1}^k|i\rangle\langle i|\). \(W\) is a \(k\)-dimensional vector subspace of the \(d\)-dimensional vector space \(V\). \(|1\rangle,\dots,|k\rangle\) is an orthonormal basis for \(W\). The definition is independent of the orthonormal basis used for \(W\).

The orthogonal complement of \(P\) is the operator \(Q\equiv I-P\).

normal : \(AA^\dagger=A^\dagger A\)

normal \(\Longleftrightarrow\) diagonalizable

Hermitian \(\Longleftrightarrow\) diagonalizable and having real eigenvalues

projector \(\Longleftrightarrow\) diagonalizable and having 0/1 eigenvalues

unitary \(\Longleftrightarrow\) diagonalizable and having modulus-1 eigenvalues

unitary : \(U^\dagger U=I\)

unitary operators also satisfies \(UU^\dagger=I\), therefore normal and diagonalizable. unitary operators preserve inner products between vectors.

outer product representation of unitary operators : \(U=\sum_i|w_i\rangle\langle v_i|\), where \(|v_i\rangle\) is any orthonormal set and \(|w_i\rangle\equiv U|v_i\rangle\) is therefore also an orthonormal set.

All eigenvalues of a unitary matrix have modulus 1.

positive operator : for any \(|v\rangle\), \((|v\rangle,A|v\rangle)\) is real, non-negative

A positive operator is necessarily Hermitian. (Any \(A\) can be written as \(A=B+iC\), where \(B\) and \(C\) are Hermitian)

tensor product of two Hilbert spaces

Kronecker product of two matrices

operator functions \(f(A)\equiv\sum f(\lambda)|\lambda\rangle\langle\lambda|\)

trace \(tr(A)=\sum A_{ii}=\sum \langle i|A|i\rangle\)

trace is invariant under the unitary similarity transformation

thus the trace of an operator is any trace of its matrix representation

commutator \([A,B]=AB-BA\)

anticommutator \(\{A,B\}=AB+BA\)

\([A,B]=0\) if and only if \(A\) and \(B\) are simultaneously diagonalizable

Polar decomposition \(A=UJ=KU\), where \(U\) is unitary, \(J\) and \(K\) are positive operators.

Singular value decomposition \(A=UDV\), where \(U\) and \(V\) are unitary, \(D\) is a diagonal matrix with non-negative entries.

2.2 The postulates of quantum mechanics

Postulate 1: state space

Postulate 2: evolution (of a closed system)

Hadamard gate \(H=\frac{1}{\sqrt 2}(|0\rangle+|1\rangle)\langle 0|+\frac{1}{\sqrt 2}(|0\rangle-|1\rangle)\langle 1|\)

\(H\) is Hermitian and unitary.

Postulate 3: quantum measurement

the measurement of a qubit in the computational basis

distinguishability of quantum states

projective measurements

  • observable \(M\), a Hermitian
  • \(\langle M\rangle\)
  • \(\Delta(M)=\sqrt{\langle M^2\rangle-\langle M\rangle}\)

the Heisenberg uncertainty principle : \[ \Delta(C)\Delta(D)\geq \frac{|\langle\psi|[C,D]|\psi\rangle|}{2} \] measurement of spin along the \(\vec{v}\) axis: an observable \(\vec{v}\cdot\vec{\sigma}=v_1\sigma_1+v_2\sigma_2+v_3\sigma_3\)

POVM \(\{E_m\}\), \(\sum_m E_m=I\), positive operator

one sided error of distinguishing two (not necessarily orthogonal) states

${M_m^}M_m=E_m$ unitary \(U_m\) s.t. \(M_m=U_m\sqrt{E_m}\)

equivalence up to the global phase factor

Postulate 4: the state space of a composite system

a linear operator on a subspace which preserves inner product can be extended to the entire space as a unitary operator

unitary dynamics + projective measurements + ability to introduce ancillary systems = Postulate 3

superdense coding

Bell states: 00+11, 00-11, 01+10, 01-10

for all bell states, \(\langle\psi|E\otimes I|\psi\rangle=\frac{1}{2}(\langle 0|E|0\rangle+\langle 1|E|1\rangle)\), thus undistinguishable by only the first qubit.

density operator \(\rho\):

  • positive operator
  • \(tr(\rho)=1\)

\(tr(\rho^2)\leq 1\) with equality iff \(\rho\) is a pure state \[ \rho=\sum_i p_i|\psi_i\rangle\langle\psi_i|=\sum_j q_j|φ_j\rangle\langleφ_j|\Longleftrightarrow \exists U\ \sqrt {p_i}|\psi_i\rangle=\sum_j u_{ij}\sqrt{q_j}|φ_j\rangle \] an arbitrary density matrix for a qubit may be written as \[ \rho=\frac{I+\vec{r}\cdot\vec{\sigma}}{2} \] partial trace \(\rho^A\equiv tr_B(\rho^{AB})\) defined by \(tr_B(\rho\otimes\sigma)=\rho\ \mathrm{tr}(\sigma)\)

the reduced density operator for Bell states is \(\frac{I}{2}\)

why partial trace? the unique function that satisfies \(tr(M\rho^A)=tr((M\otimes I)\rho^{AB})\), the correct measurement statistics

Schmidt decomposition: \[ |\psi\rangle=\sum_i\lambda_i|i_A\rangle|i_B\rangle \]

  • \(|\psi\rangle\) is any pure state of \(AB\)

  • \(|i_A\rangle\) are orthonormal states for \(A\), \(|i_B\rangle\) are orthonormal states for \(B\)

  • \(\lambda_i\) are non-negative real numbers satisfying \(\sum_i\lambda_i^2=1\)

  • \(\rho^A=\sum_i\lambda_i^2|i_A\rangle\langle i_A|\), \(\rho^B=\sum_i\lambda_i^2|i_B\rangle\langle i_B|\), the eigenvalues of \(\rho^A\) and \(\rho^B\) are identical

  • a symmetry between component systems of a pure state

  • Schmidt bases, Schmidt number

  • Schmidt number quantifies the "amount" of entanglement, preserved under unitary transformation on component system. \(|\psi\rangle\) is a product state if and only if it has Schmidt number 1.

purification:

  • suppose \(\rho^A=\sum_i p_i|i^A\rangle\langle i^A|\)

  • define \(|AR\rangle\equiv \sum_i\sqrt{p_i}|i^A\rangle|i^R\rangle\)

  • then \(tr_R(|AR\rangle\langle AR|)=\rho^A\)

for a pure state phi of AB, calculate the decomposition of rho^A and rho^B, then the Schmidt decomposition of phi is derived? \[ (\rho^A,\rho^B)\text{ uniquely determine } \rho^{AB}? \]

Course Homepage: Advanced Algorithms

Min-Cut and Max-Cut

Karger's Min-Cut Algorithm

pick an edge uniformly at random and contract it

Theorem:

\(\Pr[\textrm{a min-cut is returned}]\geq\frac{2}{n(n-1)}\)

Proof:

  • a min-cut preserves when not contracted
  • \(C\) is a min-cut \(\Rightarrow|C|\leq\frac{2|E|}{|V|}\)

Corollary:

The number of distinct min-cuts \(\leq n(n-1)/2\)

Faster Min-Cut

contract to \(t=n/\sqrt{2}+1\) vertices and branch

runs in \(O(n^2\log n)\) time and succeeds with probability \(\Omega(\frac{1}{\log n})\)

Greedy Cut

a vertex \(v\) joins one of \(S,T\) to maximize current \(E(S,T)\)

1/2-approximation

Random Cut

\(\mathbf{E}[SOL]=|E|/2\)

de-randomization by conditional expectation

mutual independence, k-wise independence

de-randomization by pairwise independence

Fingerprinting

Checking Matrix Multiplication

Freivald’s Algorithm:

pick a uniform random \(r\in\{0,1\}^n\) check whether \(ABr=Cr\)

if \(AB\neq C\) then \(\Pr[ABr=Cr]\leq \frac{1}{2}\)

Polynomial Identity Testing

Input: a polynomial \(f\in\mathbb{F}[x]\) of degree \(d\).

Output: \(f\equiv 0\) ?

Fundamental Theorem of Algebra

Any non-zero \(d\)-degree polynomial \(f\in\mathbb{F}[x]\) has at most \(d\) roots

Randomized algorithm (fingerprinting):

pick a uniform random \(r\in S\) check if \(f(r)=0\)

if \(f\not\equiv 0\) then \(\Pr[f(r)=0]\leq\frac{d}{|S|}\)

Input: a polynomial \(f\in\mathbb{F}[x_1,...,x_n]\) of degree \(d\).

Output: \(f\equiv 0\) ?

Randomized algorithm (fingerprinting):

pick \(r_1,...,r_n\in S\) uniformly and independently at random, check if \(f(r_1,...,r_n)=0\)

Schwartz-Zippel Theorem

\(f\not\equiv 0\Longrightarrow\Pr[f(r_1,...,r_n)=0]\leq\frac{d}{|S|}\)

Checking Distinctness

Input: two multisets \(A={a_1,...,a_n}\) and \(B={b_1,...,b_n}\) where elements in \([n]\)

Output: \(A=B\) ?

\(f_A(x)=\prod_{i=1}^{n}(x-a_i)\)

\(\textrm{FING}(A)=f_A(r), r\in\mathbb{Z}_p\)

if \(A\neq B\) :

  • \(f_A\equiv f_B\) on \(\mathbb{Z}_p\)
  • \(f_A\not\equiv f_B\) on \(\mathbb{Z}_p\) but \(f_A(r)=f_B(r)\)

Hashing and Sketching

Count Distinct Elements

Input: a sequence \(x_1,x_2,...,x_s\in U=[N]\)

Output: an estimation of \(n=|\{x_1,x_2,...,x_s\}|\)

\((\epsilon,\delta)\)-estimator: a random variable \(\hat{Z}\) s.t. \[ \Pr[(1-\epsilon)n\leq\hat{Z}\leq(1+\epsilon)n]\geq1-\delta \] Markov's Inequality

For nonnegative random variable \(X\), for any \(t>0\), \[ \Pr[X\geq t]\leq\frac{\mathbb{E}[X]}{t} \] Chebyshev's Inequality

For random variable \(X\), for any \(t>0\), \[ \Pr[|X-\mathbb{E}[X]|\geq t]\leq\frac{\mathbf{Var}[X]}{t^2} \] Min Sketch:

  • for each \(1\leq j\leq k\), let \(Y_j=\min_{1\leq i\leq s}h_j(x_i)\)

  • return \(\hat{Z}=1/\bar{Y}-1\) where \(\bar{Y}=\frac{1}{k}\sum_{j=1}^{k}Y_j\)

\[ \begin{align*} &\mathbb{E}[\bar{Y}]=\frac{1}{n+1},\mathbf{Var}[\bar{Y}]\leq\frac{1}{k(n+1)^2}\\ &\Pr\big[\hat{Z}<(1-\epsilon)n\textrm{ or }\hat{Z}>(1+\epsilon)n\big] \leq\Pr\left[|\bar{Y}-\mathbb{E}[\bar{Y}]|>\frac{\epsilon/2}{n+1}\right] \leq\frac{4}{k\epsilon^2} \leq\delta \end{align*} \]

Space cost: \(k=\lceil4/(\epsilon^2\delta)\rceil\) real numbers in \([0,1]\)

Storing \(k\) idealized hash functions

Universal Hash Family:

A family \(\mathcal{H}\) of hash functions in \(U\rightarrow[m]\) is \(k\)-universal if for any distinct \(x_1,...,x_k\in U\), \[ \Pr_{h\in\mathcal{H}}[h(x_1)=\dots=h(x_k)]\leq\frac{1}{m^{k-1}} \] Moreover, \(\mathcal{H}\) is strongly \(k\)-universal (\(k\)-wise independent) if for any distinct \(x_1,...,x_k\in U\) and any \(y_1,...,y_k\in[m]\), \[ \Pr_{h\in\mathcal{H}}\left[\bigwedge_{i=1}^kh(x_i)=y_i\right]=\frac{1}{m^k} \] Linear Congruential Hashing

  • Represent \(U\subseteq\mathbb{Z}_p\) for sufficiently large prime \(p\)

  • \(h_{a,b}(x)=((ax+b)\mod p)\mod m\)

  • \(\mathcal{H}=\{h_{a,b}|a\in\mathbb{Z}_p\backslash\{0\},b\in\mathbb{Z}_p\}\)

Theorem:

  • The linear congruential family \(\mathcal{H}\) is 2-wise independent

  • to achieve \(k\)-wise independent: degree-\(k\) polynomial in finite field with random coefficients

Flajolet-Martin Algorithm

BJKST Algorithm

Frequency Estimation

Bloom Filters

Data: a set \(S\) of \(n\) items \(x_1,x_2,...,x_n\in U=[N]\)

Query: an item \(x\in U\)

Determine whether \(x\in S\)

uniform & independent hash function \(h_1,...,h_k:U\rightarrow[m]\)

Data Structure:

  • bit vector \(v\in\{0,1\}^m\), initially all 0's
  • for each \(x_i\): set \(v[h_j(x_i)]=1\) for all \(1\leq j\leq k\)
  • Query x: "yes" iff \(v[h_j(x)]=1\) for all \(1\leq j\leq k\)

choose \(k=c\ln 2\) and \(m=cn\)

space cost: \(m=cn\) bits

time cost: \(k=c\ln 2\)

false positive \(\leq(1-(1-1/m)^{kn})^k\approx(1-e^{-kn/m})^k=2^{-c\ln 2}\leq(0.6185)^c\)

Count-Min Sketch

Data: a sequence \(x_1,x_2,...,x_n\in U=[N]\)

Query: an item \(x\in U\)

Estimate the frequency \(f_x=|\{i:x_i=x\}|\) of \(x\)

\(\hat{f}_x\) : estimation of \(f_x\) within additive error

\(\Pr\big[|\hat{f}_x-f_x|\geq\epsilon n\big]\leq\delta\)

\(k\) independent 2-universal hash functions \(h_1,...,h_k:[N]\rightarrow [m]\)

  • \(\mathrm{CMS}[k][m]\), initialized to all 0's
  • Upon each \(x_i\): \(\mathrm{CMS}[j][h_j(x_i)]++\) for all \(1\leq j\leq k\)
  • Query \(x\): return \(\hat{f}_x=\min_{1\leq j\leq k}\mathrm{CMS}[j][h_j(x)]\)

\(\mathbb{E}\big[\mathrm{CMS}[j][h_j(x)]\big]=f_x+\sum_{y}f_y\Pr[h_j(y)=h_j(x)]\leq f_x+\frac{n}{m}\)

\(\Pr\big[|\hat{f}_x-f_x|\geq\epsilon n\big]=\Pr\left[\forall1\leq j\leq k:\mathrm{CMS}[j][h_j(x)\right]-f_x\geq\epsilon n]\leq \left(\frac{1}{\epsilon m}\right)^k\)

choose \(m=\lceil e/\epsilon\rceil\) and \(k=\lceil\ln(1/\delta)\rceil\)

space cost: \(O(\epsilon^{-1}\log (1/\delta)\log n)\) bits

time cost: \(O(\log(1/\delta))\)

\(O(\log(1/\delta)\log N)\) bits for hash functions

Balls into Bins

1-1 birthday paradox
on-to coupon collector
pre-image size occupancy problem

Birthday Paradox

\(n\) balls are thrown into \(m\) bins

event \(\mathcal{E}\): each bin receives \(\leq 1\) balls

\(\Pr[\mathcal{E}]=\prod_{i=0}^{n-1}(1-\frac{i}{m})\approx e^{-n^2/2m}\)

\(n=\sqrt{2m\ln(1/p)}\Longrightarrow\Pr[\mathcal{E}]=(1\pm o(1))p\)

In a class of \(n>57\) students, with \(>99\%\) probability, there are two students with the same birthday.

Perfect Hashing

Data: a set \(S\) of \(n\) items \(x_1,x_2,...,x_n\in U=[N]\)

Query: an item \(x\in U\)

Determine whether \(x\in S\)

Birthday Paradox by 2-universal hashing:

\(\mathbb{E}[\mathrm{\#\ of\ collisions}]=\sum_{i<j}\Pr[X_i=X_j]\leq\binom{n}{2}\frac{1}{m}\)

\(n\leq\sqrt{2m\epsilon}\Longrightarrow\Pr[\neg\mathcal{E}]=\Pr[Y\geq 1]\leq\mathbb{E}[Y]\leq\epsilon\)

FKS Perfect Hashing

Query(x):

  • retrieve primary hash \(h\)
  • go to bucket \(i=h(x)\)
  • retrieve secondary hash \(h_i\)
  • check whether \(T_i[h_i(x)]=x\)

Size of the \(i\)-th bin: \(|B_i|\)

\(\mathrm{\#\ of\ collision}=\frac{1}{2}\sum|B_i|(|B_i|-1)\Longrightarrow\mathbb{E}\bigg[\sum|B_i|^2\bigg]=\frac{n(n-1)}{m}+n\)

\(\exists h\) from a 2-universal family s.t. the total space cost is \(O(n)\)

\(O(n\log N)\) space, \(O(1)\) time

Coupon Collector

\(X\): number of balls thrown to make all the \(n\) bins nonempty

\(\mathbb{E}[X]=\sum_{i=1}^{n}\frac{n}{n-i+1}=n\sum_{i=1}^{n}\frac{1}{i}=nH(n)\)

expected \(n\ln n+O(n)\) balls

Theorem:

For \(c>0\), \(\Pr[X\geq n\ln n+cn]\leq e^{-c}\)

Proof:

For one bin, it misses all balls with probability:

\(\left(1-\frac{1}{n}\right)^{n\ln n+cn}<e^{-(\ln n+c)}<\frac{1}{ne^c}\)

Then it can be proved by union bound.

Occupancy Problem

\(m\) balls are thrown into \(n\) bins

\(X_i:\) number of balls in the \(i\)-th bin

\(\forall i:\mathbb{E}[X_i]=\frac{m}{n}\)

Theorem:

With high probability, the maximum load is

  • \(O\left(\frac{\log n}{\log \log n}\right)\) when \(m=n\)
  • \(O(\frac{m}{n})\) when \(m\geq n\ln n\)

Proof:

\(\Pr[X_i\geq L]\leq \binom{m}{L}\frac{1}{n^L}\leq\frac{m^L}{L!n^L}\leq\left(\frac{em}{n}\right)^L\frac{1}{L^L}\)

when \(m=n\), this probability \(\leq\left(\frac{e}{L}\right)^L\leq\frac{1}{n^2}\) for \(L=\frac{3\ln n}{\ln\ln n}\)

when \(m\geq n\ln n\), this probability \(\leq e^{-L}\leq\frac{1}{n^2}\) for \(L=\frac{e^2m}{n}\geq e^2\ln n\)

Theorem:

With high probability, the maximum load is

  • \(\Omega\left(\frac{\log n}{\log \log n}\right)\) when \(m=n\)

Proof:

Exactly (?), \(X_1,...,X_n\sim\mathrm{Bin}(m,1/n)\) subject to \(\sum_{i=1}^{n}X_i=m\)

Heuristically, treat \(X_1,...,X_n\) as i.i.d. \(Y_1,...,Y_n\sim\mathrm{Pois}(m/n)\)

  • Theorem: \(\forall m_1,...,m_n\in\mathbb{N}\) s.t. \(\sum_{i=1}^{n}m_i=m\)

\[ \Pr\left[\bigwedge_{i=1}^nX_i=m_i\right]=\Pr\left[\bigwedge_{i=1}^nY_i=m_i\bigg|\sum_{i=1}^nY_i=m\right] \]

  • Theorem (Poisson Approximation): \(\forall\) nonnegative function \(f\)

\[ \mathbb{E}[f(X_1,...,X_n)]\leq e\sqrt{m}\cdot\mathbb{E}[f(Y_1,...,Y_n)] \]

Power of Two Choices

When \(n\) balls are thrown into \(n\) bins using the two-choice paradigm, the maximum load is \(\ln\ln n/\ln 2+\Theta(1)\) w.h.p.

Cuckoo Hashing

Concentration of Measure

Chernoff-Hoeffding Bounds

Chernoff Bound

For independent (or negatively associated) \(X_1,...,X_n\in\{0,1\}\) with \[ X=\sum_{i=1}^{n}X_i\mathrm{\ and\ }\mu=\mathbb{E}[X] \] For any \(\delta>0\), \[ \Pr[X\geq(1+\delta)\mu]\leq\left(\frac{e^\delta}{(1+\delta)^{(1+\delta)}}\right)^\mu\leq\exp\left(-\frac{\mu\delta^2}{3}\right) \] For any \(0<\delta<1\), \[ \Pr[X\leq(1-\delta)\mu]\leq\left(\frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}}\right)^\mu\leq\exp\left(-\frac{\mu\delta^2}{2}\right) \] For \(t\geq 2e\mu\): \[ \Pr[X\geq t]\leq 2^{-t} \] For any \(t>0\): \[ \begin{align*} \Pr[X\geq\mathbb{E}[X]+t]\leq\exp\left(-\frac{2t^2}{n}\right)\\ \Pr[X\leq\mathbb{E}[X]-t]\leq\exp\left(-\frac{2t^2}{n}\right) \end{align*} \] A party of \(O(\sqrt{n\log n})\) can manipulate a vote w.h.p.

Hoeffding Bound

For \(X=\sum_{i=1}^nX_i\), where \(X_i\in[a_i,b_i],1\leq i\leq n\), are independent (or negatively associated),

for any \(t>0\): \[ \begin{align*} \Pr[X\geq\mathbb{E}[X]+t]\leq\exp\left(-\frac{2t^2}{\sum_{i=1}^{n}(b_i-a_i)^2}\right)\\ \Pr[X\leq\mathbb{E}[X]-t]\leq\exp\left(-\frac{2t^2}{\sum_{i=1}^{n}(b_i-a_i)^2}\right) \end{align*} \] Sub-Gaussian Random Variables

A centered (\(\mathbb{E}[Y]=0\)) random variable \(Y\) is said to be sub-Gaussian with variance factor \(\nu\) (denoted \(Y\in\mathcal{G}(\nu)\)) if \[ \mathbb{E}[e^{\lambda Y}]\leq\exp\left(\frac{\lambda^2\nu}{2}\right) \] Chernoff-Hoeffding Bound:

For \(Y=\sum_{i=1}^nY_i\), where \(Y_i\in\mathcal{G}(\nu_i)\), \(1\leq i\leq n\), are independent (or negatively associated) and centered (i.e. \(\mathbb{E}[Y_i]=0\)),

for any \(t>0\): \[ \begin{align*} \Pr[Y\geq t]\leq\exp\left(-\frac{t^2}{2\sum_{i=1}^{n}\nu_i}\right)\\ \Pr[Y\leq-t]\leq\exp\left(-\frac{t^2}{2\sum_{i=1}^{n}\nu_i}\right) \end{align*} \] Poisson Tails

For \(Y\sim\mathrm{Pois}(\mu)\), \[ \begin{align*} k>\mu\Longrightarrow\Pr[X>k]<e^{-\mu}\left(\frac{e\mu}{k}\right)^k\\ k<\mu\Longrightarrow\Pr[X<k]<e^{-\mu}\left(\frac{e\mu}{k}\right)^k \end{align*} \]

The Method of Bounded Differences

McDiarmid’s Inequality:

For independent \(X_1,X_2,...,X_n\), if \(n\)-variate function \(f\) satisfies the Lipschitz condition: for every \(1\leq i\leq n\), \[ |f(x_1,...,x_n)-f(x_1,...,x_{i-1},y_i,x_{i+1},...,x_n)|\leq c_i \] for any possible \(x_1,...,x_n\) and \(y_i\),

then for any \(t>0\): \[ \Pr\bigg[|f(X_1,...,X_n)-\mathbb{E}[f(X_1,...,X_n)|\geq t\bigg]\leq 2\exp\left(-\frac{t^2}{2\sum_{i=1}^nc_i^2}\right) \]

  • Chernoff: sum of Boolean variables, 1-Lipschitz

  • Hoeffding: sum of \([a_i,b_i]\)-bounded variables, \((b_i-a_i)\)-Lipschitz

  • Balls into Bins:

    • \(Y:\) number of empty bins
    • \(Y\) is 1-Lipschitz with respect to \(m\) variables, where \(m\) is the number of balls
  • Pattern Matching

    • \(Y:\) number of substrings of \(X\in\Sigma^n\) matching the pattern \(\pi\in\Sigma^k\)
    • \(Y\) is \(k\)-Lipschitz with respect to \(n\) variables \(X_1,...,X_n\)
  • Sprinkling Points on Hypercube

    • \(Y:\) shortest Hamming distance from \(X\in\{0,1\}^n\) to \(S\subseteq\{0,1\}^n\)
    • \(Y\) is 1-Lipschitz with respect to \(n\) variables \(X_1,...,X_n\)

Martingale

A sequence of random variables \(Y_0,Y_1,...\) is a martingale with respect to \(X_0,X_1,...\) if for all \(t\geq 0\),

  • \(Y_t\) is a function of \(X_0,...,X_t\)
  • \(\mathbb{E}[Y_{t+1}|X_0,X_1,...,X_t]=Y_t\)

A fair gambling game:

  • \(X_i:\) outcome of the \(i\)-th betting
  • \(Y_i:\) capital after the \(i\)-th betting

Azuma's Inequality:

For martingale \(Y_0,Y_1,...\) (with respect to \(X_0,X_1,...\)) satisfying: \[ \forall i\geq 0,\quad |Y_i-Y_{i-1}|\leq c_i \] for any \(n\geq 1\) and \(t>0\) : \[ \Pr\bigg[\big|Y_n-Y_0\big|\geq t\bigg]\leq 2\exp\left(-\frac{t^2}{2\sum_{i=1}^nc_i^2}\right) \] Your capital does not change too fast if:

  • the game is fair (martingale)
  • payoff for each gambling is bounded

Doob Martingale

A Doob sequence \(Y_0,Y_1,...,Y_n\) of an \(n\)-variate function \(f\) with respect to a random vector \((X_1,...,X_n)\) is: \[ \forall 0\leq i\leq n,\quad Y_i=\mathbb{E}\big[f(X_1,...,X_n)|X_1,...,X_i\big] \] Theorem: The Doob sequence \(Y_0,Y_1,...,Y_n\) is a martingale with respect to \(X_1,...,X_n\).

Then McDiarmid’s Inequality can be proved using Azuma's Inequality of a Doob Martingale with respect to independent random variables.

Dimension Reduction

Johnson-Lindenstrauss Theorem

Theorem:

\(\forall 0<\epsilon<1\), for any set \(S\) of \(n\) points from \(\mathbb{R}^d\), there is a \(\phi:\mathbb{R}^d\rightarrow\mathbb{R}^k\) with \(k=O(\epsilon^{-2}\log n)\), such that \(\forall x,y\in S:\) \[ (1-\epsilon)||x-y||_2^2\leq||\phi(x)-\phi(y)||_2^2\leq(1+\epsilon)||x-y||_2^2 \] Actually, \(\phi(x)=Ax\)

The probabilistic method: for random \(A\in\mathbb{R}^{k\times d}\) \[ \Pr\big[\forall x,y\in S:(1-\epsilon)||x-y||_2^2\leq||\phi(x)-\phi(y)||_2^2\leq(1+\epsilon)||x-y||_2^2\big]=1-O\bigg(\frac{1}{n}\bigg) \] Efficient construction of random \(A\in\mathbb{R}^{k\times d}\):

  • projection onto uniform random \(k\)-dimensional subspace
  • independent Gaussian entries
  • i.i.d. -1/+1 entries

JLT (i.i.d. Gaussian entries)

entries of \(A\) are chosen i.i.d. from \(\mathcal{N}(0,1/k)\)

JLT (uniform k-dim subspace)

the \(k\) rows \(A_1,...,A_k\) of \(A\in\mathbb{R}^{k\times d}\) are orthogonal unit vectors \(\in\mathbb{R}^d\) chosen uniformly at random and let \(y_i=\sqrt{\frac{d}{k}}Ax_i\)

(?)

Data: \(n\) points \(y_1,y_2,...,y_n\in X\)

Query: a point \(x\in X\)

Find the data-point \(y_i\) that is closest to \(x\).

\(c\)-ANN (Approximate Nearest Neighbor):

Find a \(y_i\) such that \(\mathrm{dist}(x,y_i)\leq c\cdot\min_{1\leq j\leq n}\mathrm{dist}(x,y_j)\)

\((c,r)\)-ANN (Approximate Near Neighbor):

return a \(y_i\) that \(\mathrm{dist}(x,y_i)\leq c\cdot r\) if \(\exists y_j\) s.t. \(\mathrm{dist}(x,y_j)\leq r\)

​ "no" if \(\forall y_i, \mathrm{dist}(x,y_i)>c\cdot r\)

​ arbitrary if otherwise

if \((\sqrt{c},r)\)-ANN can be solved for any \(r\), then \(c\)-ANN can be solved by using it \(\log_\sqrt{c}(D_\max/D_\min)\) times

\((c,r)\)-ANN in Hamming space (by Dimension Reduction)

  • sample \(k\times d\) Boolean matrix \(A\) with i.i.d. entries \(\in\mathrm{Bernoulli}(p)\)

  • let \(z_i=Ay_i\in\{0,1\}^k\) on finite field \(\mathrm{GF}(2)\)

  • store all \(s\)-balls \(B_s(u)=\{y_i\mid\mathrm{dist}(u,z_i)\leq s\}\) for all \(u\in\{0,1\}^k\)

  • To answer query \(x\in\{0,1\}^d:\)

    • retrieve \(B_s(Ax)\)
    • if \(B_s(Ax)=\emptyset\) return "no"
    • else return any \(y_i\in B_s(Ax)\)

space: \(O(n2^k)\) query time: \(O(kd)\)

for suitable \(k=O(\log n)\), \(p\) and \(s\): \[ \begin{align*} \forall x,y\in\{0,1\}^d:\quad \mathrm{dist}(x,y)\leq r\quad\quad\ \Rightarrow\Pr[\mathrm{dist}(Ax,Ay)>s]<1/n^2\\ \quad\quad\quad\quad\quad\quad\quad\quad\ \mathrm{dist}(x,y)> c\cdot r\quad\Rightarrow\Pr[\mathrm{dist}(Ax,Ay)\leq s]<1/n^2 \end{align*} \]

Locality Sensitive Hashing

A random \(h:X\rightarrow U\) is an \((r,cr,p,q)\)-LSH if \(\forall x,y\in X:\) \[ \begin{align*} \mathrm{dist}(x,y)\leq r\Longrightarrow\Pr[h(x)=h(y)]\geq p\\ \mathrm{dist}(x,y)>c\cdot r\Longrightarrow\Pr[h(x)=h(y)]\leq q \end{align*} \]

an \((r,cr,p,q)\)-LSH \(h:X\rightarrow U\) \(\Longrightarrow\) an \((r,cr,p^k,q^k)\)-LSH \(h:X\rightarrow U^k\)

\((c,r)\)-ANN by LSH

suppose we have \((r,cr,p^*,1/n)\)-LSH \(g:X\rightarrow U\)

Hash functions:

  • i.i.d. instances \(g_1,\ldots,g_l\) of \(g\), where \(l=1/p^*\)

Data structure:

  • \(l\) sorted tables
  • in table-\(j\): sort \(y_1,y_1,\ldots,y_n\) according to \(g_j(y_i)\)

Upon query \(x\in X\):

  • find \(10l\) such \(y_i\) that \(\exists j,g_j(x)=g_j(y_i)\) by binary search
  • if encounter a \(y_i\) that \(\mathrm{dist}(x,y_i)\leq c\cdot r\) return this \(y_i\)
  • else return "no"

space: \(O(n/p^*)\) time: \(O((\log n)/p^*)\) one-sided error <0.5

suppose we have \((r,cr,p,q)\)-LSH \(h:X\rightarrow U\),

then we have \((r,cr,n^{-\rho},1/n)\)-LSH \(g:X\rightarrow U^k\),

where \(\rho=\frac{\log p}{\log q}\)

then we can solve \((c,r)\)-ANN with

space: \(O(n^{1+\rho})\) time: \(O(n^\rho\cdot\log n)\) one-sided error <0.5

\((c,r)\)-ANN in Hamming space (by LSH)

\(\forall x\in\{0,1\}^d:h(x)=x_i\) for uniform random \(i\in[d]\)

\(h:\{0,1\}^d\rightarrow \{0,1\}\) is an \((r,cr,1-r/d,1-cr/d)\)-LSH

\(\rho=\frac{\log(1-r/d)}{\log(1-cr/d)}\leq\frac{1}{c}\)

space: \(O(n^{1+1/c})\) time: \(O(n^{1/c}\cdot\log n)\) one-sided error <0.5

Max-Cut

Greedy Algorithm for Max-Cut

Local Search Algorithm for Max-Cut

  • initially, \((S,T)\) is an arbitrary cut
  • if \(\exists v\) switching side increases cut, then switch side

in a local optima (fixed point): \[ \begin{align*} \forall &v\in S:|E(v,S)|\leq|E(v,T)|\Longrightarrow2|E(S,S)|\leq|E(S,T)|\\ \forall &v\in T:|E(v,T)|\leq|E(v,S)|\Longrightarrow2|E(T,T)|\leq|E(S,T)|\\ &\Longrightarrow |E(S,S)|+|E(T,T)|\leq|E(S,T)|\\ &\Longrightarrow |E(S,T)|\geq\frac{1}{2}|E|\geq\frac{1}{2}OPT \end{align*} \]

Scheduling

Instance: \(n\) jobs \(j=1,\ldots,n\) with processing times \(p_j\in\mathbb{R}^+\)

Solution: An assignment of \(n\) jobs to \(m\) identical machines that minimizes the makespan \(C_\max\)

Graham's List Algorithm

assign job to the current least heavily loaded machine

\(OPT\geq\max_{1\leq j\leq n} p_j\)

\(OPT\geq\frac{1}{m}\sum_{j=1}^n p_j\)

\(C_{i^*}-p_l\leq \frac{1}{m}\sum_{j\neq l}p_j\)

\(p_l\leq\max_{1\leq j\leq n}p_j\)

\(C_\max=C_{i^*}\leq(1-\frac{1}{m})p_l+\frac{1}{m}\sum_{1\leq j\leq n}p_j\leq(2-\frac{1}{m})OPT\)

tight

Longest Processing Time (LPT)

\(p_1\geq p_2\geq\dots\geq p_n\)

assign job to the current least heavily loaded machine

WLOG: \(C_{i^*}>\max_{1\leq j\leq n}p_j\Longrightarrow p_l\leq p_{m+1}\)

\(OPT\geq p_m+p_{m+1}\geq 2p_{m+1}\)

\(p_l\leq\frac{1}{2}OPT\)

\(C_\max=C_{i^*}\leq p_l+\frac{1}{m}\sum_{1\leq j\leq n}p_j\leq \frac{3}{2}OPT\)

can be improved to \(4/3\)-approx. with a more careful analysis

has a PTAS

PTAS (Polynomial-Time Approximation Scheme):

\(\forall\epsilon>0\), a \((1+\epsilon)\)-approx. solution can be returned in time \(f(\epsilon)\cdot\mathrm{poly(n)}\)

competitive ratio of an online algorithm

Set Cover

Instance: A sequence of subsets \(S_1,\dots,S_m\subseteq U\).

Find the smallest \(C\subseteq\{1,\dots,m\}\) s.t. \(\bigcup_{i\in C}S_i=U\).

Hitting Set is exactly Set Cover

Greedy Cover:

initially \(C=\emptyset\)

while \(U\neq\emptyset\) do:

add \(i\) with largest \(|S_i\cap U|\) to \(C\)

\(U=U\backslash S_i\)

Rounding Dynamic Programs

(to be done)