Linux

一则利用bcc工具包的filetop和opensnoop工具跟踪IO的案例

一 背景说明

两台机器,在server1上运行一个Docker容器,跑的是Python程序的Flask应用,提供一个访问单词热度的接口;server2(172.16.17.36)向server1发起请求调用。

这个案例有意思的是,从server1上明明看到有消耗高CPU资源和磁盘IO的进程,但是,不太直观的看到该进程具体读写的是哪个文件。于是,这里给出不同的分析思路和解决方案。

server1: 172.16.17.7,4C 8G;

server2: 172.16.11.36,24C 64G;

二 分析步骤和思路

1 server1启动container

docker run --name=app -p 10000:80 -itd feisky/word-pop
[root@guoxin7 ~]# docker run --name=app -p 10000:80 -itd feisky/word-pop
3ea1b081cf6d9565b487a56481d0e1182b96d58ae6a33b0f14b1ccc57400ee5f
[root@guoxin7 ~]# 

2 server2验证服务正常

[root@centos-master ~]# curl http://172.16.17.7:10000
hello world
[root@centos-master ~]# 

3 server2向server1发起请求

curl http://172.16.17.7:10000/popularity/word

发现响应特别慢。

4 server1 查看top

Tasks: 244 total,   1 running, 243 sleeping,   0 stopped,   0 zombie
%Cpu(s): 11.3 us, 13.8 sy,  0.0 ni, 74.8 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem : 41037092 total, 35284336 free,   951280 used,  4801476 buff/cache
KiB Swap:  8257532 total,  8257532 free,        0 used. 38038384 avail Mem 
​
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                                  
26611 root      20   0   97048  20532   4056 S  98.2  0.1   1:10.09 /usr/local/bin/python /app.py                                                                                                                            
    6 root      20   0       0      0      0 S   0.4  0.0   0:00.13 [ksoftirqd/0]                                                                                                                                            
    9 root      20   0       0      0      0 S   0.4  0.0   0:09.86 [rcu_sched]  
 ...

发现server1 的进程26611耗费了比较高的CPU资源,top界面按下c字符键,调出进程执行的命令,看到该进程执行了/usr/local/bin/python /app.py 。

5 server1通过iostat查看系统IO资源情况

通过iostat -dx 1; -d 列出磁盘设备利用率;-x 列出extended statistics ;数字1,每秒输出1个报告;

[root@guoxin7 ~]# iostat -dx 1
Linux 3.10.0-1160.59.1.el7.x86_64 (guoxin7)     2022年03月21日  _x86_64_        (4 CPU)
​
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.02     0.24    4.36   18.55   433.67  6473.81   603.08    10.89  474.14   16.02  581.70   6.01  13.77
scd0              0.00     0.00    0.00    0.00     0.15     0.00   114.22     0.00    6.89    6.89    0.00   6.67   0.00
dm-0              0.00     0.00    4.28   18.80   430.82  6481.31   598.76    11.02  476.23   16.71  580.93   5.96  13.75
dm-1              0.00     0.00    0.01    0.00     0.33     0.00    50.09     0.00    1.77    1.77    0.00   1.57   0.00
​
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00  156.00     0.00 73668.00   944.46   144.42 1061.29    0.00 1061.29   6.42 100.10
scd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00  160.00     0.00 75732.00   946.65   145.41 1039.44    0.00 1039.44   6.26 100.10
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
​
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00  257.00     0.00 123472.00   960.87   143.15  929.06    0.00  929.06   3.89 100.00
scd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00  256.00     0.00 122032.00   953.38   145.58  938.41    0.00  938.41   3.95 101.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
​
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    1.00  183.00     8.00 85364.00   927.96   144.19  474.96  119.00  476.90   5.43 100.00
scd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    1.00  194.00     8.00 89984.00   922.99   145.18  451.15  119.00  452.86   5.13 100.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
​
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00  184.00     0.00 85640.00   930.87   146.85 1091.28    0.00 1091.28   5.43 100.00
scd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00  161.00     0.00 75776.00   941.32   147.84 1253.72    0.00 1253.72   6.21 100.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
​
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00  318.00     0.00 148888.00   936.40   144.58  520.76    0.00  520.76   3.14 100.00
scd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00  323.00     0.00 150236.00   930.25   145.55  517.09    0.00  517.09   3.10 100.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
​
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00  157.00     0.00 71612.00   912.25   147.31  453.65    0.00  453.65   6.37 100.00
scd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00  163.00     0.00 74660.00   916.07   148.31  438.91    0.00  438.91   6.13 100.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

持续观察,可以看到,系统的磁盘使用率有抖动,%util有时会达到100%,如上。iops(r/s+w/s)以及throughout(rkB/s+wkB/s)也会明显的变化。这意味着,系统的磁盘IO使用率会随着客户端发起请求,持续升高。那么,究竟是哪个进程在消耗较高的磁盘IO呢?

6 pidstat跟踪server1上进程的IO使用情况

[root@guoxin7 ~]# pidstat -p 26611 1
Linux 3.10.0-1160.59.1.el7.x86_64 (guoxin7)     2022年03月21日  _x86_64_        (4 CPU)
​
13时49分22秒   UID       PID    %usr %system  %guest    %CPU   CPU  Command
13时49分23秒     0     26611   63.00   38.00    0.00  100.00     0  python
13时49分24秒     0     26611   62.00   38.00    0.00  100.00     0  python
13时49分25秒     0     26611   44.00   29.00    0.00   73.00     0  python
13时49分26秒     0     26611    0.00    7.00    0.00    7.00     2  python
13时49分27秒     0     26611    2.00   48.00    0.00   50.00     2  python
13时49分28秒     0     26611   30.00   19.00    0.00   49.00     2  python
13时49分29秒     0     26611   62.00   39.00    0.00  100.00     0  python
13时49分30秒     0     26611   64.00   36.00    0.00  100.00     2  python
13时49分31秒     0     26611   62.00   38.00    0.00  100.00     3  python
13时49分32秒     0     26611   64.00   37.00    0.00  100.00     1  python
13时49分33秒     0     26611   63.00   38.00    0.00  100.00     1  python
13时49分34秒     0     26611   62.00   37.00    0.00   99.00     1  python
^C
平均时间:     0     26611   48.17   33.67    0.00   81.83     -  python
[root@guoxin7 ~]# 

pidstat -d,列出进程磁盘使用率信息:

[root@guoxin7 ~]# pidstat -d -p 26611 1
Linux 3.10.0-1160.59.1.el7.x86_64 (guoxin7)     2022年03月21日  _x86_64_        (4 CPU)
​
13时50分12秒   UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
13时50分13秒     0     26611      0.00 169382.18      0.00  python
13时50分14秒     0     26611      0.00 109853.47      0.00  python
13时50分15秒     0     26611      0.00 134661.39      0.00  python
13时50分16秒     0     26611      0.00 122948.00      0.00  python
13时50分17秒     0     26611      0.00 259640.00      0.00  python
13时50分18秒     0     26611      0.00 284580.00      0.00  python
13时50分19秒     0     26611      0.00 293028.00      0.00  python
13时50分20秒     0     26611      0.00 299768.00      0.00  python
13时50分21秒     0     26611      0.00 279688.00      0.00  python
13时50分22秒     0     26611      0.00 286160.00      0.00  python
13时50分23秒     0     26611      0.00 288036.00      0.00  python
13时50分24秒     0     26611      0.00 291524.00      0.00  python
13时50分25秒     0     26611      0.00  22760.00      0.00  python
13时50分26秒     0     26611      0.00      0.00      0.00  python
13时50分27秒     0     26611      0.00      0.00      0.00  python
13时50分28秒     0     26611      0.00      0.00      0.00  python
13时50分29秒     0     26611      0.00      0.00      0.00  python
13时50分30秒     0     26611      0.00      0.00      0.00  python
13时50分31秒     0     26611      0.00      0.00 778244.00  python
13时50分32秒     0     26611      0.00      0.00 2025068.00  python
13时50分33秒     0     26611      0.00 162403.96      0.00  python
13时50分34秒     0     26611      0.00 282608.00      0.00  python
13时50分35秒     0     26611      0.00 282020.00      0.00  python
13时50分36秒     0     26611      0.00 278336.00      0.00  python
^C
平均时间:     0     26611      0.00 160281.20 116610.32  python
[root@guoxin7 ~]# 

的确,看到26611进程的磁盘IO比较高。

7 strace跟踪进程26611的读写文件

[root@guoxin7 ~]# strace -p 26611
strace: Process 26611 attached
select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=62570}) = 0 (Timeout)
futex(0x7ffcab9dd974, FUTEX_WAIT_PRIVATE, 2, {tv_sec=0, tv_nsec=4998819}) = -1 EAGAIN (资源暂时不可用)
futex(0x7ffcab9dd974, FUTEX_WAIT_PRIVATE, 2, {tv_sec=0, tv_nsec=4999436}) = 0
stat("/usr/local/lib/python3.7/importlib/_bootstrap.py", {st_mode=S_IFREG|0644, st_size=39278, ...}) = 0
stat("/usr/local/lib/python3.7/importlib/_bootstrap.py", {st_mode=S_IFREG|0644, st_size=39278, ...}) = 0
stat("/usr/local/lib/python3.7/importlib/_bootstrap.py", {st_mode=S_IFREG|0644, st_size=39278, ...}) = 0
stat("/usr/local/lib/python3.7/importlib/_bootstrap_external.py", {st_mode=S_IFREG|0644, st_size=59012, ...}) = 0
stat("/usr/local/lib/python3.7/importlib/_bootstrap_external.py", {st_mode=S_IFREG|0644, st_size=59012, ...}) = 0
stat("/usr/local/lib/python3.7/importlib/_bootstrap_external.py", {st_mode=S_IFREG|0644, st_size=59012, ...}) = 0
stat("/usr/local/lib/python3.7/encodings/__init__.py", {st_mode=S_IFREG|0644, st_size=5668, ...}) = 0
stat("/usr/local/lib/python3.7/encodings/__init__.py", {st_mode=S_IFREG|0644, st_size=5668, ...}) = 0
stat("/usr/local/lib/python3.7/encodings/__init__.py", {st_mode=S_IFREG|0644, st_size=5668, ...}) = 0
futex(0x7ffcab9dd8b4, FUTEX_WAIT_PRIVATE, 2, {tv_sec=0, tv_nsec=4999442}) = -1 ETIMEDOUT (连接超时)
futex(0x7ffcab9dd8b4, FUTEX_WAIT_PRIVATE, 2, {tv_sec=0, tv_nsec=4998549}) = 0
futex(0x7f557934c2a4, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7f557eb2b324, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7f557eb2b2cc, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7f557934c234, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7f557eb2b2cc, FUTEX_WAKE_PRIVATE, 1) = 1
stat("/usr/local/lib/python3.7/codecs.py", {st_mode=S_IFREG|0644, st_size=36287, ...}) = 0
stat("/usr/local/lib/python3.7/codecs.py", {st_mode=S_IFREG|0644, st_size=36287, ...}) = 0
futex(0x7ffcab9dd564, FUTEX_WAIT_PRIVATE, 2, {tv_sec=0, tv_nsec=4999526}) = -1 EAGAIN (资源暂时不可用)
futex(0x7ffcab9dd564, FUTEX_WAIT_PRIVATE, 2, {tv_sec=0, tv_nsec=4999624}) = 0
stat("/usr/local/lib/python3.7/codecs.py", {st_mode=S_IFREG|0644, st_size=36287, ..
...
...

并没有看到进程26611对应的读写文件。怎么办呢?有没有其它思路?

8 通过BCC的filetop来跟踪

由于在操作系统层面并没有跟踪到进程26611实际读写的文件是哪个?我们通过BCC的filetop来跟踪内核里读写文件的信息

安装BCC(BPF Compiler Collection)工具包,调用其filetop查看File reads and writes by filename and process。

安装之后,位于/usr/share/bcc/tools路径下:-C表示当前屏幕不清屏的方式持续输出,第1列,表示线程号,COMM线程执行的命令,READS+WRITES分别表示读写次数,R_Kb、W_Kb表示的是读写的大小,T表示方式,R是Read,FILE列,表示的读写的文件名。

[root@guoxin7 tools]# ./filetop -C
Tracing... Output every 1 secs. Hit Ctrl-C to end
​
14:01:07 loadavg: 3.76 2.56 2.18 4/282 7336
​
TID    COMM             READS  WRITES R_Kb    W_Kb    T FILE
7336   python           0      1      0       5175    R 84.txt
7336   python           0      1      0       4785    R 113.txt
7336   python           0      1      0       4589    R 134.txt
7336   python           0      1      0       4492    R 150.txt
7336   python           0      1      0       4443    R 123.txt
7336   python           0      1      0       4394    R 85.txt
7336   python           0      1      0       4345    R 56.txt
7336   python           0      1      0       4296    R 106.txt
7336   python           0      1      0       4248    R 73.txt
7336   python           0      1      0       4052    R 59.txt
7336   python           0      1      0       4052    R 107.txt
7336   python           0      1      0       3662    R 147.txt
7336   python           0      1      0       3466    R 118.txt
7336   python           0      1      0       3369    R 89.txt
7336   python           0      1      0       3369    R 69.txt
7336   python           0      1      0       3369    R 128.txt
7336   python           0      1      0       3320    R 67.txt
7336   python           0      1      0       3320    R 119.txt
7336   python           0      1      0       3271    R 114.txt
7336   python           0      1      0       3222    R 61.txt
​
14:01:08 loadavg: 3.76 2.56 2.18 3/282 7336
​
TID    COMM             READS  WRITES R_Kb    W_Kb    T FILE
7336   python           0      1      0       4394    R 397.txt
7336   python           0      1      0       4296    R 388.txt
7336   python           0      1      0       4199    R 365.txt
7336   python           0      1      0       3955    R 363.txt
7336   python           0      1      0       3857    R 386.txt
7336   python           0      1      0       3857    R 398.txt
7336   python           0      1      0       3808    R 360.txt
7336   python           0      1      0       3710    R 367.txt
7336   python           0      1      0       3710    R 392.txt
7336   python           0      1      0       3417    R 362.txt
7336   python           0      1      0       3173    R 402.txt
7336   python           0      1      0       3027    R 359.txt
7336   python           0      1      0       3027    R 385.txt
7336   python           0      1      0       2978    R 368.txt
7336   python           0      1      0       2929    R 371.txt
7336   python           0      1      0       2929    R 378.txt
7336   python           0      1      0       2880    R 361.txt
7336   python           0      1      0       2880    R 381.txt
7336   python           0      1      0       2832    R 387.txt
7336   python           0      1      0       2783    R 380.txt
Detaching...
[root@guoxin7 tools]# 

9 BCC的opensnoop查看内核open执行的系统调用

同样使用BCC工具包下提供的另一个工具,opensnoop。

[root@guoxin7 tools]# ./opensnoop -p 26611
PID    COMM               FD ERR PATH
26611  python              6   0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/0.txt
26611  python              6   0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/1.txt
26611  python              6   0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/2.txt
26611  python              6   0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/3.txt
26611  python              6   0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/4.txt
26611  python              6   0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/5.txt
26611  python              6   0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/6.txt
26611  python              6   0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/7.txt
26611  python              6   0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/8.txt
26611  python              6   0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/9.txt
26611  python              6   0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/10.txt
26611  python              6   0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/11.txt
...

这次,可以看到26611进程读取的文件的路径,持续跟踪,发现文件的路径会发生变化。大概率猜测一下,可能是前面的Python进程,在创建临时文件,然后操作处理之后,删除了文件。

10 查看Docker 应用代码

进入container内部,查看app.py源代码:

[root@guoxin7 tools]# docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS                   NAMES
3ea1b081cf6d        feisky/word-pop     "python /app.py"    About an hour ago   Up About an hour    0.0.0.0:10000->80/tcp   app
[root@guoxin7 tools]# docker exec -it app /bin/sh
/ # ls
app.py  bin     dev     etc     home    lib     media   mnt     proc    root    run     sbin    srv     sys     tmp     usr     var
/ # cat app.py 
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
​
import os
import uuid
import random
import shutil
from concurrent.futures import ThreadPoolExecutor
from flask import Flask, jsonify
...
@app.route("/popularity/<word>")
def word_popularity(word):
    dir_path = '/tmp/{}'.format(uuid.uuid1())
    count = 0
    sample_size = 1000
​
    def save_to_file(file_name, content):
        with open(file_name, 'w') as f:
            f.write(content)
​
    try:
        # initial directory firstly
        os.mkdir(dir_path)
​
        # save article to files
        for i in range(sample_size):
            file_name = '{}/{}.txt'.format(dir_path, i)
            article = generate_article()
            save_to_file(file_name, article)
​
        # count word popularity
        for root, dirs, files in os.walk(dir_path):
            for file_name in files:
                with open('{}/{}'.format(dir_path, file_name)) as f:
                    if validate(word, f.read()):
                        count += 1
    finally:
        # clean files
        shutil.rmtree(dir_path, ignore_errors=True)
​
    return jsonify({'popularity': count / sample_size * 100, 'word': word})
...
​

的确,发现了创建临时目录和文件,然后处理之后,再删除的操作。

三 小结

1 分析思路

  • 先top看到高CPU的进程;
  • 然后,iostat -dx 1;查看系统IO
  • 接下来,pidstat -d -p pid 1 ;查看指定进程的磁盘IO开销;
  • 想要通过strace -p pid;找到进程读写的文件信息未果;
  • 到此,转而使用BCC工具包的filetop查看内核中的文件读写信息,找到基本的文件名等读写相关信息;
  • 最后,继续使用BCC工具包的opensnoop工具查看内核中open调用的栈信息;

2 另外一种思路

使用strace -f -p pid;可以跟踪进程及其线程的读写信息:

[root@guoxin7 tools]# strace -f -p 26611|grep Write
strace: Process 26611 attached with 3 threads
[pid 12362] close(6)                    = 0
[pid 26612] restart_syscall(<... resuming interrupted poll ...> <unfinished ...>
[pid 26611] select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=989194} <unfinished ...>
[pid 12362] mmap(NULL, 3002368, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5578b8e000
[pid 12362] madvise(0x5644e1264000, 438272, MADV_DONTNEED) = 0
[pid 12362] munmap(0x7f5578e6b000, 1953792) = 0
[pid 12362] open("/tmp/7d5596d8-a8e0-11ec-b3db-0242ac110002/737.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 6
[pid 12362] fcntl(6, F_SETFD, FD_CLOEXEC) = 0
[pid 12362] fstat(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
[pid 12362] ioctl(6, TIOCGWINSZ, 0x7f557934bff0) = -1 ENOTTY (对设备不适当的 ioctl 操作)
[pid 12362] lseek(6, 0, SEEK_CUR)       = 0
[pid 12362] ioctl(6, TIOCGWINSZ, 0x7f557934bf10) = -1 ENOTTY (对设备不适当的 ioctl 操作)
[pid 12362] lseek(6, 0, SEEK_CUR)       = 0
[pid 12362] mmap(NULL, 3002368, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5578f2f000
[pid 12362] write(6, "The king explodes a cat with rab"..., 2999999) = 2999999
[pid 12362] munmap(0x7f5578f2f000, 3002368) = 0
[pid 12362] close(6)                    = 0
[pid 26612] <... restart_syscall resumed>) = 0
[pid 26612] futex(0x7f5579450874, FUTEX_WAIT_PRIVATE, 2, {tv_sec=0, tv_nsec=4999199} <unfinished ...>
[pid 12362] mmap(NULL, 2752512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5578f6c000
[pid 12362] madvise(0x5644e1264000, 438272, MADV_DONTNEED) = 0
[pid 12362] munmap(0x7f5578b8e000, 3002368) = 0
[pid 12362] futex(0x7f5579450874, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 12362] open("/tmp/7d5596d8-a8e0-11ec-b3db-0242ac110002/738.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666 <unfinished ...>
[pid 26612] <... futex resumed>)        = 0
[pid 26612] poll([{fd=4, events=POLLIN}], 1, 500 <unfinished ...>
[pid 12362] <... open resumed>)         = 6
[pid 12362] fcntl(6, F_SETFD, FD_CLOEXEC) = 0
[pid 12362] fstat(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
[pid 12362] ioctl(6, TIOCGWINSZ, 0x7f557934bff0) = -1 ENOTTY (对设备不适当的 ioctl 操作)
[pid 12362] lseek(6, 0, SEEK_CUR)       = 0
[pid 12362] ioctl(6, TIOCGWINSZ, 0x7f557934bf10) = -1 ENOTTY (对设备不适当的 ioctl 操作)
[pid 12362] lseek(6, 0, SEEK_CUR)       = 0
[pid 12362] mmap(NULL, 2752512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5578ccc000
[pid 12362] write(6, "Some guy kicks some guy to know "..., 2749999) = 2749999
[pid 12362] munmap(0x7f5578ccc000, 2752512) = 0
[pid 12362] close(6)                    = 0
[pid 12362] mmap(NULL, 2002944, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5578d83000
[pid 12362] madvise(0x5644e1264000, 438272, MADV_DONTNEED) = 0
[pid 12362] munmap(0x7f5578f6c000, 2752512) = 0
[pid 12362] open("/tmp/7d5596d8-a8e0-11ec-b3db-0242ac110002/739.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 6
[pid 12362] fcntl(6, F_SETFD, FD_CLOEXEC) = 0
[pid 12362] fstat(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
[pid 12362] ioctl(6, TIOCGWINSZ, 0x7f557934bff0) = -1 ENOTTY (对设备不适当的 ioctl 操作)
[pid 12362] lseek(6, 0, SEEK_CUR)       = 0
[pid 12362] ioctl(6, TIOCGWINSZ, 0x7f557934bf10) = -1 ENOTTY (对设备不适当的 ioctl 操作)
[pid 12362] lseek(6, 0, SEEK_CUR)       = 0
[pid 12362] mmap(NULL, 2002944, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5579023000
[pid 12362] write(6, "The king treats some guy to make"..., 1999999) = 1999999
[pid 12362] munmap(0x7f5579023000, 2002944) = 0
[pid 12362] close(6)                    = 0
[pid 12362] mmap(NULL, 4100096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f557899a000
[pid 12362] madvise(0x5644e1264000, 438272, MADV_DONTNEED) = 0
[pid 12362] munmap(0x7f5578d83000, 2002944) = 0
[pid 12362] open("/tmp/7d5596d8-a8e0-11ec-b3db-0242ac110002/740.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 6

3 个人收获

学习到strace -f -p pid;可以跟踪进程以及其线程的文件读写信息;

学会安装使用BCC工具包中的filetop跟踪内核中文件读写信息;opensnoop;跟踪内核中使用open系统调用的栈信息。

4 关于filetop和opensnoop的使用帮助

可以分别加上-h命令行查看,或者到GitHub站点查看相关示例。

四 参考和链接

1 倪朋飞老师的Linux性能优化实战,27 | 案例篇:为什么我的磁盘I/O延迟很高?;

2 BPF Compiler Collection

3 Installing BCC