2008/10/20

Analyzing a process's per-file (fd) write pattern with dtrace

Sometimes, watching a process run, you get curious about its read/write pattern against the files it has open.
For example, a program that reads data from a DB and displays it on screen may have been fine all along, and then one day screen output suddenly becomes slow.

In a situation like that, it would be very handy to be able to see how the process behaves against each of the files it has open.

On Solaris you can list the files a process has open with the pfiles utility or with mdb. Here is how to check which files Thunderbird, the well-known e-mail client, has open.

bhkim@vaio-bhkim:~/Desktop$ pfiles 1180  > /tmp/thunderbird.1180.pfiles
bhkim@vaio-bhkim:~/Desktop$ cat /tmp/thunderbird.1180.pfiles
1180:    /usr/lib/thunderbird/thunderbird-bin
  Current rlimit: 512 file descriptors
   0: S_IFCHR mode:0666 dev:301,0 ino:6815752 uid:0 gid:3 rdev:13,2
      O_RDONLY|O_LARGEFILE
      /devices/pseudo/mm@0:null
   1: S_IFIFO mode:0000 dev:299,0 ino:182 uid:0 gid:50 size:0
      O_RDWR
   2: S_IFIFO mode:0000 dev:299,0 ino:182 uid:0 gid:50 size:0
      O_RDWR
   3: S_IFDOOR mode:0444 dev:311,0 ino:42 uid:0 gid:0 size:0
      O_RDONLY|O_LARGEFILE FD_CLOEXEC  door to nscd[1156]
      /var/run/name_service_door
   4: S_IFSOCK mode:0666 dev:310,0 ino:50338 uid:0 gid:0 size:0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
    SOCK_STREAM
    SO_SNDBUF(16384),SO_RCVBUF(5120)
    sockname: AF_UNIX
    peername: AF_UNIX /tmp/.X11-unix/X0
   5: S_IFIFO mode:0000 dev:299,0 ino:671 uid:101 gid:10 size:0
      O_RDWR
   6: S_IFIFO mode:0000 dev:299,0 ino:671 uid:101 gid:10 size:0
      O_RDWR
   7: S_IFREG mode:0644 dev:182,65544 ino:18136 uid:101 gid:10 size:0
      O_WRONLY|O_CREAT|O_TRUNC
      /export/home/bhkim/.thunderbird/5x1njpem.default/.parentlock
   8: S_IFIFO mode:0000 dev:299,0 ino:672 uid:101 gid:10 size:0
      O_RDWR|O_NONBLOCK
   9: S_IFIFO mode:0000 dev:299,0 ino:672 uid:101 gid:10 size:0
      O_RDWR|O_NONBLOCK
  10: S_IFIFO mode:0000 dev:299,0 ino:673 uid:101 gid:10 size:0
      O_RDWR|O_NONBLOCK
  11: S_IFIFO mode:0000 dev:299,0 ino:673 uid:101 gid:10 size:0
      O_RDWR|O_NONBLOCK
  12: S_IFREG mode:0644 dev:182,65544 ino:18139 uid:101 gid:10 size:140912
      O_RDWR
      /export/home/bhkim/.thunderbird/5x1njpem.default/abook.mab
  13: S_IFREG mode:0644 dev:182,65538 ino:37521 uid:0 gid:2 size:851050
      O_RDONLY|O_LARGEFILE
      /usr/lib/thunderbird/chrome/classic.jar
  14: S_IFREG mode:0644 dev:182,65538 ino:37527 uid:0 gid:2 size:837452
      O_RDONLY|O_LARGEFILE
      /usr/lib/thunderbird/chrome/en-US.jar
  15: S_IFSOCK mode:0666 dev:310,0 ino:31861 uid:0 gid:0 size:0
      O_RDWR FD_CLOEXEC
    SOCK_STREAM
    SO_SNDBUF(16384),SO_RCVBUF(5120)
    sockname: AF_UNIX
    peername: AF_UNIX /tmp/.iiim-bhkim/:0.0
  16: S_IFIFO mode:0000 dev:299,0 ino:674 uid:101 gid:10 size:0
      O_RDWR
  17: S_IFIFO mode:0000 dev:299,0 ino:674 uid:101 gid:10 size:0
      O_RDWR
  18: S_IFIFO mode:0000 dev:299,0 ino:675 uid:101 gid:10 size:0
      O_RDWR
  19: S_IFIFO mode:0000 dev:299,0 ino:675 uid:101 gid:10 size:0
      O_RDWR|O_NONBLOCK
  20: S_IFCHR mode:0644 dev:301,0 ino:78118918 uid:0 gid:3 rdev:149,1
      O_RDONLY
      /devices/pseudo/random@0:urandom
  21: S_IFSOCK mode:0666 dev:310,0 ino:33534 uid:0 gid:0 size:0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
    SOCK_STREAM
    SO_SNDBUF(16384),SO_RCVBUF(5120)
    sockname: AF_UNIX
    peername: AF_UNIX /var/tmp/orbit-bhkim/linc-292-0-48e4789fdb523
  22: S_IFSOCK mode:0666 dev:310,0 ino:38996 uid:0 gid:0 size:0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
    SOCK_STREAM
    SO_REUSEADDR,SO_SNDBUF(49152),SO_RCVBUF(49152)
    sockname: AF_INET 127.0.0.1  port: 41233
  23: S_IFSOCK mode:0666 dev:310,0 ino:31860 uid:0 gid:0 size:0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
    SOCK_STREAM
    SO_REUSEADDR,SO_SNDBUF(16384),SO_RCVBUF(5120)
    sockname: AF_UNIX /var/tmp/orbit-bhkim/linc-49c-0-48e474737f48b
  24: S_IFSOCK mode:0666 dev:310,0 ino:31858 uid:0 gid:0 size:0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
    SOCK_STREAM
    SO_REUSEADDR,SO_SNDBUF(16384),SO_RCVBUF(5120)
    sockname: AF_UNIX /var/tmp/orbit-bhkim/linc-49c-0-48e474737f48b
    peername: AF_UNIX
  25: S_IFREG mode:0644 dev:182,65538 ino:180840 uid:0 gid:2 size:1916121
      O_RDONLY|O_LARGEFILE
      /usr/lib/thunderbird/chrome/toolkit.jar
  26: S_IFREG mode:0755 dev:182,65538 ino:180947 uid:0 gid:2 size:790881
      O_RDONLY|O_LARGEFILE
      /usr/lib/thunderbird/chrome/ko.jar
  27: S_IFREG mode:0644 dev:182,65544 ino:70397 uid:101 gid:10 size:2400716
      O_RDONLY|O_LARGEFILE
      /export/home/bhkim/.thunderbird/5x1njpem.default/XUL.mfasl
  28: S_IFREG mode:0400 dev:182,65544 ino:18228 uid:101 gid:10 size:74694
      O_RDONLY|O_LARGEFILE
      /export/home/bhkim/.thunderbird/5x1njpem.default/extensions/{C0CB8BA3-6C1B-47e8-A6AB-1FAB889562D9}/chrome/quicknote.jar
  29: S_IFREG mode:0644 dev:182,65538 ino:180837 uid:0 gid:2 size:2712048
      O_RDONLY|O_LARGEFILE
      /usr/lib/thunderbird/chrome/messenger.jar
  30: S_IFREG mode:0644 dev:182,65544 ino:18131 uid:101 gid:10 size:32825
      O_RDWR
      /export/home/bhkim/.thunderbird/5x1njpem.default/panacea.dat
  31: S_IFIFO mode:0000 dev:299,0 ino:681 uid:101 gid:10 size:0
      O_RDWR|O_NONBLOCK
  32: S_IFREG mode:0644 dev:182,65544 ino:18143 uid:101 gid:10 size:1617
      O_RDWR
      /export/home/bhkim/.thunderbird/5x1njpem.default/ImapMail/mail-apac.sun.com/%EB%B0%9B%EC%9D%80%20%ED%8E%B8%EC%A7%80%ED%95%A7926b93e.msf
  33: S_IFREG mode:0644 dev:182,65544 ino:29639 uid:101 gid:10 size:1508
      O_RDWR|O_CREAT|O_TRUNC
      /export/home/bhkim/.thunderbird/5x1njpem.default/ImapMail/mail-apac.sun.com/%EB%B0%9B%EC%9D%80%20%ED%8E%B8%EC%A7%80%ED%95%A36b3cbd8.msf
  34: S_IFREG mode:0644 dev:182,65544 ino:38481 uid:101 gid:10 size:1965
      O_RDWR
      /export/home/bhkim/.thunderbird/5x1njpem.default/ImapMail/mail-apac.sun.com/%EB%B0%9B%EC%9D%80%20%ED%8E%B8%EC%A7%80%ED%95%A9ecefa66.msf
  35: S_IFIFO mode:0000 dev:299,0 ino:681 uid:101 gid:10 size:0
      O_RDWR|O_NONBLOCK
  36: S_IFREG mode:0600 dev:182,65544 ino:18120 uid:101 gid:10 size:98304
      O_RDWR FD_CLOEXEC
      /export/home/bhkim/.thunderbird/5x1njpem.default/cert8.db
  37: S_IFREG mode:0600 dev:182,65544 ino:18129 uid:101 gid:10 size:32768
      O_RDWR FD_CLOEXEC
      /export/home/bhkim/.thunderbird/5x1njpem.default/key3.db
  38: S_IFREG mode:0644 dev:182,65544 ino:18186 uid:101 gid:10 size:3554267
      O_RDWR
      /export/home/bhkim/.thunderbird/5x1njpem.default/ImapMail/mail-apac.sun.com/INBOX.msf
  39: S_IFSOCK mode:0666 dev:310,0 ino:25977 uid:0 gid:0 size:0
      O_RDWR|O_NONBLOCK
    SOCK_STREAM
    SO_SNDBUF(49152),SO_RCVBUF(49640)
    sockname: AF_INET 129.158.2.81  port: 36053
    peername: AF_INET 192.18.19.180  port: 993
  41: S_IFREG mode:0644 dev:182,65538 ino:180836 uid:0 gid:2 size:1575881
      O_RDONLY|O_LARGEFILE
      /usr/lib/thunderbird/chrome/comm.jar
  42: S_IFREG mode:0644 dev:182,65544 ino:117215 uid:101 gid:10 size:169599
      O_RDWR
      /export/home/bhkim/.thunderbird/5x1njpem.default/ImapMail/mail-apac.sun.com/Trash.msf
  43: S_IFREG mode:0644 dev:182,65544 ino:18126 uid:101 gid:10 size:0
      O_RDWR|O_CREAT|O_LARGEFILE
      /export/home/bhkim/.thunderbird/5x1njpem.default/urlclassifier2.sqlite
  44: S_IFREG mode:0644 dev:182,65544 ino:18126 uid:101 gid:10 size:0
      O_RDWR|O_CREAT|O_LARGEFILE
      /export/home/bhkim/.thunderbird/5x1njpem.default/urlclassifier2.sqlite
  45: S_IFSOCK mode:0666 dev:310,0 ino:61169 uid:0 gid:0 size:0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
    SOCK_STREAM
    SO_SNDBUF(16384),SO_RCVBUF(5120)
    sockname: AF_UNIX
    peername: AF_UNIX /tmp/dbus-6tM2grR7lu
  46: S_IFREG mode:0644 dev:182,65544 ino:64775 uid:101 gid:10 size:375842
      O_RDWR
      /export/home/bhkim/.thunderbird/5x1njpem.default/ImapMail/imap.gmail.com/INBOX.msf
  47: S_IFSOCK mode:0666 dev:310,0 ino:25978 uid:0 gid:0 size:0
      O_RDWR|O_NONBLOCK
    SOCK_STREAM
    SO_SNDBUF(49152),SO_RCVBUF(49680)
    sockname: AF_INET 129.158.2.81  port: 54161
    peername: AF_INET 209.85.135.109  port: 993
  48: S_IFREG mode:0644 dev:182,65544 ino:65840 uid:101 gid:10 size:65031
      O_RDWR
      /export/home/bhkim/.thunderbird/5x1njpem.default/ImapMail/imap.gmail.com/Trash.msf
  49: S_IFSOCK mode:0666 dev:310,0 ino:13077 uid:0 gid:0 size:0
      O_RDWR
    SOCK_STREAM
    SO_SNDBUF(16384),SO_RCVBUF(5120)
    sockname: AF_UNIX
    peername: AF_UNIX /tmp/fam-bhkim/fam-
  50: S_IFREG mode:0644 dev:182,65544 ino:18130 uid:101 gid:10 size:238135
      O_RDWR
      /export/home/bhkim/.thunderbird/5x1njpem.default/history.mab
  51: S_IFREG mode:0644 dev:182,65544 ino:18176 uid:101 gid:10 size:813104
      O_RDWR
      /export/home/bhkim/.thunderbird/5x1njpem.default/ImapMail/mail-apac.sun.com/Sent-1.msf


That's a lot of open files.

With this information in hand, we can run dtrace as follows (using thread-local self-> variables so writes from different threads can't clobber each other):
# dtrace -q -n 'syscall::write:entry/pid==1180/{self->trace=timestamp;self->fd=arg0;}' -n 'syscall::write:return/self->trace/{this->delta=timestamp-self->trace;@[self->fd]=avg(this->delta/1000);printf("transfer size : %d[%d], response time:%d us\n",arg0,self->fd,this->delta/1000);self->trace=0;self->fd=0;}'

Tidied up into a script, it would look something like this:

#!/usr/sbin/dtrace -s

#pragma D option quiet

syscall::write:entry
/pid == 1180/
{
    self->trace = timestamp;
    self->fd = arg0;
}

syscall::write:return
/self->trace/
{
    this->delta = timestamp - self->trace;
    printf("transfer size : %d[%d], response time:%d us\n",
        arg0, self->fd, this->delta / 1000);
    @[self->fd] = avg(this->delta / 1000);
    self->trace = 0;
    self->fd = 0;
}

Something along those lines, I suppose; I haven't done any fancy formatting of the output.
When the application is sitting idle, the output looks roughly like this:

transfer size : 1[9], response time:8 us
transfer size : 1[11], response time:6 us
transfer size : 1[9], response time:8 us
transfer size : 1[9], response time:8 us
transfer size : 44[4], response time:8 us
transfer size : 1[9], response time:8 us
transfer size : 1[9], response time:8 us
transfer size : 1[9], response time:8 us
^C
transfer size : 1[9], response time:7 us

               11                6
                9                7
                4                8
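The three-row block after ^C is the @[fd] = avg(...) aggregation: one row per fd key with its average response time in us. dtrace's avg() keeps a running sum and count and reports their integer quotient; assuming the lines shown are all the samples fd 9 contributed (six at 8 us plus the 7 us one printed after ^C), the reported value of 7 follows:

```python
# Response times (us) printed for fd 9 above.  Assumption: the
# snippet is taken to show every sample that reached the aggregation.
samples_fd9 = [8, 8, 8, 8, 8, 8, 7]

# avg() is the truncated integer quotient of sum and count
avg_us = sum(samples_fd9) // len(samples_fd9)
print(avg_us)  # 7, matching the "9  7" row
```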

In the printf, the first argument is the size of the write(), the second is the target fd number, and the third is the turnaround time of the write() call. So, from the earlier part of the output,

transfer size : 44[4], response time:8 us
means 44 bytes were written to fd 4, and it took 8 microseconds.

The block at the very end shows the average turnaround time per fd:
about 6 us on fd 11, 7 us on fd 9, and 8 us on fd 4.

Mapping each fd back to a file with the pfiles output from the beginning of this article makes the result very tangible:
fd[4]   AF_UNIX /tmp/.X11-unix/X0 ; the X11 connection, i.e. screen output.
fd[9]   S_IFIFO mode:0000 dev:299,0 ino:672 uid:101 gid:10 size:0 ; a FIFO, so internal IPC.
fd[11]  S_IFIFO mode:0000 dev:299,0 ino:673 uid:101 gid:10 size:0 ; internal IPC as well.

I also ran dtrace while an attachment was being downloaded:
...
transfer size : 1[35], response time:2 us
transfer size : 1[35], response time:2 us
transfer size : 4096[54], response time:27 us
transfer size : 73[54], response time:31 us
transfer size : 1[9], response time:4 us
transfer size : 1[9], response time:4 us
transfer size : 1[9], response time:3 us
transfer size : 1[9], response time:3 us
transfer size : 1[9], response time:2 us
transfer size : 1[9], response time:3 us
transfer size : 1[11], response time:4 us
transfer size : 1[9], response time:2 us
transfer size : 4096[53], response time:48 us
transfer size : 4096[53], response time:36 us
transfer size : 4096[53], response time:35 us
transfer size : 4096[53], response time:38 us
transfer size : 4096[53], response time:37 us
transfer size : 4096[53], response time:39 us
transfer size : 1536[53], response time:40 us
transfer size : 1[9], response time:3 us
transfer size : 1[9], response time:2 us
transfer size : 1[9], response time:2 us
...
               11                2
               35                2
                9                3
                6                4
                4                5
               49               11
               39               21
               54               25
               53               28
That gives the output above. fds 53 and 54 show up here, but they are not in the earlier pfiles listing, so they appear to be new fds created while downloading the attachment. In a case like this you could additionally watch open() to see which files were newly opened.
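A minimal sketch of such an open() watcher, assuming the same pid (the wildcard also catches open64; untested here):

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* remember the pathname argument of open(2)/open64(2) */
syscall::open*:entry
/pid == 1180/
{
    self->path = copyinstr(arg0);
}

/* on return, arg0 holds the new fd (or -1 on failure) */
syscall::open*:return
/self->path != NULL/
{
    printf("open(\"%s\") -> fd %d\n", self->path, arg0);
    self->path = 0;
}
```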

To report, per fd, the total number of bytes written together with the average response time, we can tweak the dtrace invocation slightly:

# dtrace -q -n 'syscall::write:entry/pid==1180/{self->trace=timestamp;self->fd=arg0;}' -n 'syscall::write:return/self->trace/{this->delta=timestamp-self->trace;@w[self->fd]=sum(arg0);@l[self->fd]=avg(this->delta/1000);printf("transfer size : %d[%d], response time:%d us\n",arg0,self->fd,this->delta/1000);self->trace=0;self->fd=0;}' -n 'END{printa(@w,@l);}'

...(omitted)
transfer size : 4096[53], response time:39 us
transfer size : 4096[53], response time:11 us
transfer size : 4096[53], response time:10 us
transfer size : 4096[53], response time:10 us
transfer size : 4096[53], response time:10 us
transfer size : 4096[53], response time:10 us
transfer size : 4096[53], response time:10 us
transfer size : 4096[53], response time:10 us
transfer size : 2079[53], response time:9 us
transfer size : 120[4], response time:11 us
transfer size : 8[4], response time:3 us
...(omitted)


                6                2                4
               30               40            72036
               49               46               11
               38               73            73137
               35              346                2
                9              406                3
               11              533                2
               39              859               17
               54            14925               24
               53           461993               23
                4          1199979                5

fd 4 sees by far the most writing; the process is busy painting the screen. Yet its average response time is only 5 us per write, which is not slow at all. The newly appeared fds 53 and 54, in contrast, write less data but average 23 us and 24 us per write. Those are the attachment fds, which confirms, as expected, that the disk (file system) is the slower path: fd 53 wrote about 460 KB in total, at 23 us per write. At this point it would be even nicer to know the total number of write() calls and the total time spent writing.

Let's extend the dtrace invocation a little further. This time I accessed the Trash ("Deleted Mail") folder.

# dtrace -q -n 'syscall::write:entry/pid==1180/{self->trace=timestamp;self->fd=arg0;}' -n 'syscall::write:return/self->trace/{this->delta=timestamp-self->trace;@c[self->fd]=count();@w[self->fd]=sum(arg0);@t[self->fd]=sum(this->delta/1000);@l[self->fd]=avg(this->delta/1000);printf("transfer size : %d[%d], response time:%d us\n",arg0,self->fd,this->delta/1000);self->trace=0;self->fd=0;}' -n 'END{printa(@c,@w,@t,@l);}'

transfer size : 216[4], response time:5 us
transfer size : 212[4], response time:4 us
transfer size : 112[4], response time:6 us
transfer size : 172[4], response time:6 us
transfer size : 6420[4], response time:11 us
transfer size : 1[9], response time:7 us
^C
               fd       total # of writes total size of write total elapsed  average tt
               30                5              813               78               15
               42                5            29654            52081            10416
               53               12              869              198               16
               35               77             5946              222                2
               11              141              187              334                2
                9              313              313             1022                3
                4             3025          1133659            15468                5
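As a quick cross-check of the table: the "average tt" column is simply "total elapsed" divided by "total # of writes". For the fd 42 row that works out to about 10 ms per call:

```python
# The fd 42 row above: 5 write() calls, 52081 us total elapsed.
calls = 5
total_elapsed_us = 52081

avg_us = total_elapsed_us // calls  # dtrace avg() truncates to an integer
print(avg_us)  # 10416 us, i.e. roughly 10 ms per write()
```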

The row for fd 42 jumps out: 52 ms of total write time over just 5 write() calls, i.e. about 10 ms per call.
What is fd 42?
  42: /export/home/bhkim/.thunderbird/5x1njpem.default/ImapMail/mail-apac.sun.com/Trash.msf

Ah, it is the mailbox file for the Trash folder.
The practical conclusion: moving this mailbox to a faster directory (a faster disk) should speed things up considerably.
