Liberating the Smalltalk lurking in C and Unix: https://www.youtube.com/watch?v=LwicN2u6Dro
procps https://gitlab.com/procps-ng/procps :
provides ps and top, but also lesser-known tools like w, pmap, pwdx, slabtop.
graph processes/fifos/pipes/sockets https://github.com/zevv/lsofgraph
sudo lsof -n -F | ./lsofgraph | dot -Tpng > foo.png
IPC
Beej's Guide to Unix IPC http://beej.us/guide/bgipc/
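A minimal pipe+fork sketch in C (illustrative only, not taken from the guide):
/* Parent writes a message into a pipe; the forked child reads it out. */
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int fds[2];                      /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) == -1) { perror("pipe"); return 1; }
    if (fork() == 0) {               /* child: read from the pipe */
        char buf[64];
        close(fds[1]);
        ssize_t n = read(fds[0], buf, sizeof buf - 1);
        if (n > 0) { buf[n] = '\0'; printf("child got: %s\n", buf); }
        return 0;
    }
    close(fds[0]);                   /* parent: write to the pipe */
    write(fds[1], "hello", strlen("hello"));
    close(fds[1]);
    wait(NULL);
    return 0;
}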
STDIO buffering http://www.pixelbeat.org/programming/stdio_buffering/
Default buffering modes:
stdin : buffered (line-buffered if TTY)
stdout: buffered (line-buffered if TTY)
stderr: always unbuffered
Default buffer size:
- based on the kernel page-size (typically 4096)
- TTY-connected stdin/stdout default size = 1024
Example:
$ tail -f access.log | cut -d' ' -f1 | uniq
- `cut` stdout buffer collects 4096-byte chunks before sending to uniq.
`tail` would also have this problem, except `tail -f` calls fflush()
on the stdout stream when new data is received (as do `tcpdump -l`,
`grep --line-buffered` and `sed --unbuffered`).
- `uniq` stdout buffer is TTY-connected, so it is automatically flushed when a newline is written to it.
`stdbuf` command (uses setvbuf()) can control this:
$ for i in $(seq 1 99); do printf "$i" ; sleep 1 ; done | stdbuf -o0 head
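stdbuf works by preloading a shim that calls setvbuf() before main(); a minimal
sketch of doing the same directly in C (must run before the first I/O on the stream):
/* _IONBF = unbuffered (stdbuf -o0), _IOLBF = line-buffered (stdbuf -oL),
   _IOFBF = fully buffered (stdbuf -oSIZE). */
#include <stdio.h>

int main(void) {
    setvbuf(stdout, NULL, _IOLBF, 1024);   /* flush at every newline */
    printf("arrives immediately, even when stdout is a pipe\n");
    return 0;
}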
show processes in "D" state (waiting-on-IO; "U" on macOS) every 5 seconds: http://bencane.com/2012/08/06/troubleshooting-high-io-wait-in-linux/
for x in `seq 1 1 10`; do ps -eo state,pid,cmd | grep "^D"; echo "----"; sleep 5; done
The autotools/pkgconfig dance:
# generate build files
sudo apt install autoconf libtool
aclocal
autoconf # requires libtool
automake --add-missing
autoreconf -vfi # kick it if something went wrong
# build it…
./configure
make
sudo make install
# verify that the lib was installed
ldconfig -p | grep foo
sudo ldconfig -v # kick it if something went wrong
The cmake dance:
mkdir build
cd build
cmake ..
cmake --build .   # or just `make` with the default (Makefile) generator
debug
strace: relative timestamps, log to file
strace -o log -r python -c 'import time; time.sleep(1);'
show all open() calls performed by foo and descendants
strace -f -e open foo
# -f      : follow subprocesses
# -e open : event (syscall) filter
strace -k shows the application backtrace (requires strace >= 4.9 built with libunwind)
strace -k -e trace=chdir nvim -u NONE -echdir +":q"
chdir("/home/raf") = 0
> /lib64/libc-2.20.so(chdir+0x7) [0xd8d27]
> /usr/bin/nvim(uv_chdir+0x9) [0x1ac3e9]
> /usr/bin/nvim(init_homedir+0x62) [0xa0c82]
> /usr/bin/nvim(early_init+0xae) [0x15c47e]
> /usr/bin/nvim(main+0xa9) [0x33129]
> /lib64/libc-2.20.so(__libc_start_main+0xf5) [0x21b45]
> /usr/bin/nvim(_start+0x29) [0x354f8]
which files/sockets is a given program using?
lsof -c foo
which processes are using a given port?
lsof -i tcp:8080 -P
which processes are using a file?
lsof /path/to/file
lsof +D /path/to/dir
# linux
fuser -u $(which bash)
/usr/local/bin/bash: 47327(justin) 47349(justin)
macos
snapshot/sample a running process
sample 49814 -f out.log
hold option/alt + click the wifi menubar icon to show connection details (RSSI, channel, tx rate)
$ networkQuality
==== SUMMARY ====
Uplink capacity: 25.120 Mbps
Downlink capacity: 128.988 Mbps
Responsiveness: Low (61 RPM)
Idle Latency: 32.708 milliseconds
linux
Linux perf 60-second checklist https://medium.com/netflix-techblog/linux-performance-analysis-in-60-000-milliseconds-accc10403c55
uptime # load of "0.22 0.00 0.00" means "system was idle until recently"
dmesg -wH # kernel often says what's wrong
vmstat 1
mpstat -P ALL 1 # CPU balance
pidstat 1 # per-process CPU usage
iostat -xz 1
- r/s, w/s, rkB/s, wkB/s: Delivered reads, writes, read Kbytes, and
write Kbytes per second to the device.
- await: Average time for I/O in milliseconds. This is the time that
the application suffers, as it includes both time queued and time
being serviced. Large average times can indicate device
saturation, or device problems.
- avgqu-sz: The average number of requests issued to the device.
Values greater than 1 can be evidence of saturation (although
devices can typically operate on requests in parallel, especially
virtual devices which front multiple back-end disks.)
- %util: Device utilization. This is really a busy percent, showing
the time each second that the device was doing work. Values
greater than 60% typically lead to poor performance (which should
be seen in await). ~100% may indicate saturation. But if the
storage device is a logical device fronting many backend disks,
then 100% utilization may just mean that some I/O is being
processed 100% of the time, however, the back-end disks may be far
from saturated, and may be able to handle much more work.
free -h
- buffers: buffer cache used for block device I/O.
- cached: page cache used by file systems.
sar -n DEV 10
- network interface throughput: rxkB/s and txkB/s, as a measure of workload
sar -n TCP,ETCP 10
- active/s : Locally-initiated TCP connections per second (e.g. connect()).
- passive/s : Remotely-initiated TCP connections per second (e.g. accept()).
- retrans/s : TCP retransmits per second. Indicative of a network or
server issue (unreliable network, or server overloaded and dropping
packets).
Perf-triggered core dumps! https://github.com/Microsoft/ProcDump-for-Linux
# Trigger core dump at CPU >= 65%, up to 3 times, 10-second intervals.
procdump -C 65 -n 3 -p 1234
CPU perf/debug
perf top -d8 --stdio # -g shows callstack
kernel debug
slabinfo, slabtop : kernel heap objects
network perf/debug
netstat -s
tcpretrans
network: simulate slow perf using `tc` ("traffic control") https://stackoverflow.com/a/615757
# `tc qdisc add` creates a rule on the interface; `change` modifies the rule.
tc qdisc {add,change} dev {eth0,wlan0,…} root netem delay 2000ms 10ms 25%
# undo
tc qdisc del dev {eth0,wlan0,…} root
# Show all TCP,UDP sockets in LISTEN state.
ss -tunpo state listening
Linux scheduler
The scheduler maintains a hierarchy of "scheduling domains" based on the physical _hardware_ layout.
Higher-level scheduling domains group physically adjacent scheduling domains, such as chips on the same book.
migration = moving a task from one CPU's run-queue to another
happens when a task has waited too long on a busy run-queue, or when
another run-queue is under-utilized (load balancing); sketch below
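A rough way to watch migrations from userspace (a sketch; sched_getcpu() is a
glibc extension, and `perf stat -e cpu-migrations -p PID` counts the same thing
without any code):
/* Busy-loop and report whenever the scheduler moves us to another CPU.
   On an idle machine this may print only the first line. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void) {
    int last = -1;
    for (long i = 0; i < 500000000L; i++) {
        int cpu = sched_getcpu();            /* vDSO call: cheap */
        if (cpu != last) {
            printf("now on cpu %d (was %d)\n", cpu, last);
            last = cpu;
        }
    }
    return 0;
}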
mmap https://lobste.rs/s/gfllo3/use_mmap_with_care
mmap dedicates a region of address space to some kind of backing storage.
The particular kind of backing storage makes all the difference.
People colloquially use "mmap" to mean "mmap of a conventional disk file", but that is only a particular case.
* sbrk is a legacy path that essentially amounts to an anonymous mmap.
the mmap() model more or less assumes no I/O errors happen. This breaks when a
file is truncated under you (you get SIGBUS). It can also break if the file
disappears (failing media, removed USB stick) or lives on a flaky network filesystem.
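A minimal file-backed mmap sketch in C; if another process truncates the file
while the pages are being touched, that is exactly where the SIGBUS above lands:
/* Map a file read-only and count its newlines. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc != 2) { fprintf(stderr, "usage: %s FILE\n", argv[0]); return 1; }
    int fd = open(argv[1], O_RDONLY);
    if (fd == -1) { perror("open"); return 1; }
    struct stat st;
    if (fstat(fd, &st) == -1 || st.st_size == 0) { close(fd); return 1; }
    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }
    close(fd);                               /* the mapping survives the close */
    long lines = 0;
    for (off_t i = 0; i < st.st_size; i++)   /* page faults pull data from the file */
        if (p[i] == '\n') lines++;
    printf("%ld lines\n", lines);
    munmap(p, st.st_size);
    return 0;
}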
context switch
store/restore the state (context) of a process so that execution can be resumed later.
1. save execution state (all registers and many special control structures)
of current running process to memory
2. load execution state of other process (read in from memory).
3. TLB flush (if needed) is a small part of total overhead.
typical direct cost: 5-7 microseconds
cache invalidation is the largest contributor to total context-switch cost
Indirect costs:
loss of effectiveness for all "cache like" CPU state: caches, TLBs, branch
prediction stuff (branch direction, branch target, return buffer), etc.
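A rough way to reproduce the 5-7 microsecond figure: a classic pipe ping-pong
microbenchmark (a sketch, not rigorous; the result also includes pipe syscall
overhead and varies with hardware and CPU pinning):
/* Each round trip forces two context switches: parent -> child -> parent. */
#include <stdio.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    int p2c[2], c2p[2];
    char b = 'x';
    const long N = 100000;
    if (pipe(p2c) == -1 || pipe(c2p) == -1) { perror("pipe"); return 1; }
    if (fork() == 0) {                       /* child: echo every byte back */
        for (long i = 0; i < N; i++) {
            read(p2c[0], &b, 1);
            write(c2p[1], &b, 1);
        }
        return 0;
    }
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < N; i++) {           /* parent: send, wait for echo */
        write(p2c[1], &b, 1);
        read(c2p[0], &b, 1);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    wait(NULL);
    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("~%.2f us per switch\n", ns / (2.0 * N) / 1000.0);
    return 0;
}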