Using lsof in Operations Work
Posted zhchoutai
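Cause (scenario 1): on a Unix system, suppose two processes are using the same file at the same time. If one of them deletes the file, the file is not actually freed at that moment; it is only truly removed once every process that references it has released it. If another process keeps writing to the file, filesystem usage keeps growing, yet ordinary commands cannot find the file.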
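The behaviour described above can be reproduced in a few lines. This is a minimal sketch, assuming a Linux system with /proc mounted; the scratch file created by mktemp and the choice of descriptor 3 are illustrative and not part of the original test.

```shell
#!/bin/sh
# Sketch: a deleted file stays alive while a descriptor is still open.
# Assumes Linux with /proc; the scratch file and fd number are illustrative.

tmp=$(mktemp)              # scratch file standing in for a log
exec 3>"$tmp"              # keep it open for writing on descriptor 3
rm -f "$tmp"               # remove the directory entry

echo "still writable" >&3  # writing through the open descriptor still works

# The path is gone, so ls and find will not show the file any more.
if [ -e "$tmp" ]; then path_state=present; else path_state=gone; fi

# The data is still reachable through the descriptor of this shell.
content=$(cat "/proc/$$/fd/3")

echo "$path_state: $content"

exec 3>&-                  # closing the last descriptor finally frees the space
```

Only when the last descriptor is closed does the kernel release the blocks, which is why restarting or stopping the writing process is what actually frees the space.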
Test:

Testlsof.sh

    #!/bin/sh
    # Print one line per second so the redirected log keeps growing.
    # sh may not be bash, so the POSIX test and arithmetic forms are used.
    cnt=1
    while [ "$cnt" -lt 1000 ]
    do
        echo " TEST lsof command "
        sleep 1
        cnt=$((cnt + 1))
    done

Run Testlsof.sh in one window:

    # sh Testlsof.sh > testlsof.log

Open another window:

    # rm -rf testlsof.log

At this point ls shows that testlsof.log no longer exists. Next, look up the PID of Testlsof.sh:

    # ps -ef|grep Testlsof.sh
    root     10996 10269  0 10:06 pts/3    00:00:00 sh Testlsof.sh
    root     11269 10162  0 10:07 pts/2    00:00:00 grep Testlsof

lsof shows that testlsof.log is still held open by the Testlsof.sh process, but the entry is marked (deleted):

    # lsof -p 10996
    COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
    sh      10996 root  cwd    DIR  253,1     4096 524289 /root
    sh      10996 root  rtd    DIR  253,1     4096      2 /
    sh      10996 root  txt    REG  253,1   938768 475247 /bin/bash
    sh      10996 root  mem    REG  253,1   156872 975392 /lib64/ld-2.12.so
    sh      10996 root  mem    REG  253,1    22536 975399 /lib64/libdl-2.12.so
    sh      10996 root  mem    REG  253,1  1922152 975393 /lib64/libc-2.12.so
    sh      10996 root  mem    REG  253,1   138280 975059 /lib64/libtinfo.so.5.7
    sh      10996 root  mem    REG 253,10    69952 394158 /usr/lib64/gconv/libGB.so
    sh      10996 root  mem    REG 253,10    16976 393954 /usr/lib64/gconv/EUC-CN.so
    sh      10996 root  mem    REG 253,10 99154480 524306 /usr/lib/locale/locale-archive
    sh      10996 root  mem    REG 253,10    26060 394165 /usr/lib64/gconv/gconv-modules.cache
    sh      10996 root    0u   CHR  136,3      0t0      6 /dev/pts/3
    sh      10996 root    1w   REG  253,1     2160 526093 /root/testlsof.log (deleted)   // this is the file still taking up space
    sh      10996 root    2u   CHR  136,3      0t0      6 /dev/pts/3
    sh      10996 root  255r   REG  253,1      105 526097 /root/Testlsof.sh
For scenario 1, the files that still occupy the filesystem can be found as follows.
Without arguments, lsof lists the files opened by all processes.

    # lsof|grep delete
    sh      10996 root    1w   REG  253,1     2720 526093 /root/testlsof.log (deleted)

Then use the PID to find the process name:

    # ps -ef|grep 10996
    root     10996 10269  0 10:06 pts/3    00:00:00 sh Testlsof.sh
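A plain `grep delete` also matches any path that merely contains the word "delete". The sketch below is one way to tighten the filter: it keeps only rows whose last field is exactly `(deleted)` and prints PID, size and path. The helper name `deleted_files` is hypothetical, the sample rows are copied from the transcript, and paths containing spaces are not handled. lsof also has a `+L1` option that selects open files whose link count is below 1; check the local man page before relying on it.

```shell
#!/bin/sh
# deleted_files: hypothetical helper. Reads lsof output on stdin and prints
# "PID SIZE PATH" for rows whose last field is exactly "(deleted)".
# Fields are counted from the end so extra columns after PID do not matter.
deleted_files() {
    awk '$NF == "(deleted)" { print $2, $(NF-3), $(NF-1) }'
}

# Sample rows taken from the transcript; on a live system: lsof | deleted_files
sample='sh 10996 root 1w REG 253,1 2720 526093 /root/testlsof.log (deleted)
sh 10996 root 255r REG 253,1 105 526097 /root/Testlsof.sh'

result=$(printf '%s\n' "$sample" | deleted_files)
echo "$result"
```

Sorting the result numerically on the second column would put the largest space consumers at the bottom.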
Scenario 2: someone else started an application process and left no record of where its log is written. Use lsof to find the log output location.
As follows:
Simulate someone else starting an application that writes a log:

    # nohup sh Testlsof.sh > 111.log &
    [1] 14170
    # nohup: ignoring input and redirecting stderr to stdout

Get the PID, then look at the files this process has open:

    # ps -ef|grep Testlsof
    root     14170 13132  0 10:21 pts/5    00:00:00 sh Testlsof.sh
    root     14183 13132  0 10:21 pts/5    00:00:00 grep Testlsof
    # lsof -p 14170
    COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
    sh      14170 root  cwd    DIR  253,1     4096 524289 /root
    sh      14170 root  rtd    DIR  253,1     4096      2 /
    sh      14170 root  txt    REG  253,1   938768 475247 /bin/bash
    sh      14170 root  mem    REG  253,1   156872 975392 /lib64/ld-2.12.so
    sh      14170 root  mem    REG  253,1    22536 975399 /lib64/libdl-2.12.so
    sh      14170 root  mem    REG  253,1  1922152 975393 /lib64/libc-2.12.so
    sh      14170 root  mem    REG  253,1   138280 975059 /lib64/libtinfo.so.5.7
    sh      14170 root  mem    REG 253,10    69952 394158 /usr/lib64/gconv/libGB.so
    sh      14170 root  mem    REG 253,10    16976 393954 /usr/lib64/gconv/EUC-CN.so
    sh      14170 root  mem    REG 253,10 99154480 524306 /usr/lib/locale/locale-archive
    sh      14170 root  mem    REG 253,10    26060 394165 /usr/lib64/gconv/gconv-modules.cache
    sh      14170 root    0w   CHR    1,3      0t0   3842 /dev/null
    sh      14170 root    1w   REG  253,1      280 526102 /root/111.log   // this is the log output file (0, 1 and 2 are the file descriptors a process opens by default)
    sh      14170 root    2w   REG  253,1      280 526102 /root/111.log   // here file descriptors 1 and 2 are both redirected to 111.log
    sh      14170 root  255r   REG  253,1      105 526097 /root/Testlsof.sh
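Most of the `lsof -p` listing is shared libraries; only descriptors 0, 1 and 2 matter for finding the log. The sketch below filters those rows out of lsof output. The helper name `stdio_targets` is hypothetical and the sample rows are a subset of the transcript. On a live system, `lsof -a -p PID -d 0-2` should narrow the listing in a similar way, assuming the installed lsof supports `-a` and `-d` as documented in its man page.

```shell
#!/bin/sh
# stdio_targets: hypothetical helper. Reads lsof output on stdin and prints
# the FD column and the NAME column for descriptors 0, 1 and 2 only.
stdio_targets() {
    awk '$4 ~ /^[012][rwu]/ { print $4, $NF }'
}

# Sample rows taken from the transcript; live use: lsof -p 14170 | stdio_targets
sample='COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
sh 14170 root cwd DIR 253,1 4096 524289 /root
sh 14170 root 0w CHR 1,3 0t0 3842 /dev/null
sh 14170 root 1w REG 253,1 280 526102 /root/111.log
sh 14170 root 2w REG 253,1 280 526102 /root/111.log
sh 14170 root 255r REG 253,1 105 526097 /root/Testlsof.sh'

targets=$(printf '%s\n' "$sample" | stdio_targets)
echo "$targets"
```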
Scenario 3: find out which process has a given port open

    # lsof -i:22
    COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    sshd     2589 root    3u  IPv4  16268      0t0  TCP *:ssh (LISTEN)
    sshd     2589 root    4u  IPv6  16270      0t0  TCP *:ssh (LISTEN)
    sshd     2665 root    3u  IPv4  16523      0t0  TCP 192.168.1.104:ssh->limt:59580 (ESTABLISHED)
    sshd    10157 root    3u  IPv4  75435      0t0  TCP 192.168.1.104:ssh->limt:49212 (ESTABLISHED)
    sshd    13110 root    3u  IPv4  94454      0t0  TCP 192.168.1.105:ssh->limt:49285 (ESTABLISHED)

Process 2589 has port 22 open for listening on both IPv4 and IPv6.
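The `lsof -i:22` listing mixes the listening daemon with every established session. To answer "who owns the port", only the `(LISTEN)` rows matter. The sketch below extracts the distinct PIDs of those rows; `listening_pids` is a hypothetical helper name and the sample rows come from the transcript. Recent lsof versions also accept `-sTCP:LISTEN` to do this selection directly, and `-nP` to skip name and port lookups; treat both as something to confirm in the local man page.

```shell
#!/bin/sh
# listening_pids: hypothetical helper. Reads lsof -i output on stdin and
# prints each PID that has at least one socket in the LISTEN state.
listening_pids() {
    awk '$NF == "(LISTEN)" { print $2 }' | sort -u
}

# Sample rows taken from the transcript; live use: lsof -i:22 | listening_pids
sample='COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
sshd 2589 root 3u IPv4 16268 0t0 TCP *:ssh (LISTEN)
sshd 2589 root 4u IPv6 16270 0t0 TCP *:ssh (LISTEN)
sshd 2665 root 3u IPv4 16523 0t0 TCP 192.168.1.104:ssh->limt:59580 (ESTABLISHED)'

pids=$(printf '%s\n' "$sample" | listening_pids)
echo "$pids"
```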
Scenario 4: unmounting a filesystem fails with "device is busy"

    # umount /yunwei/
    umount: /yunwei: device is busy.
            (In some cases useful info about processes that use
             the device is found by lsof(8) or fuser(1))
    # lsof /yunwei/
    COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    bash    13132 root  cwd    DIR  253,5     4096    2 /yunwei
    # ps -ef|grep 13132
    root     13132 13110  0 10:15 pts/5    00:00:00 -bash
    root     17262 15204  0 10:42 pts/6    00:00:00 grep 13132
    # kill -9 13132
    # umount /yunwei/
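In this transcript the only thing holding /yunwei busy is a shell whose working directory is on it, so changing directory in that shell would release the mount just as well as `kill -9`. The sketch below lists processes whose working directory is at or below a given path by reading /proc directly. It assumes Linux, only looks at working directories (not open files), and the function name `pids_with_cwd_under` is hypothetical. Pass the path without a trailing slash.

```shell
#!/bin/sh
# pids_with_cwd_under: hypothetical helper. Prints the PID of every process
# whose current working directory is DIR or lies below it.
# Linux only; processes owned by other users are skipped unless run as root.
pids_with_cwd_under() {
    target=$1
    for link in /proc/[0-9]*/cwd; do
        # A process may exit, or be unreadable, between the glob and readlink.
        cwd=$(readlink "$link" 2>/dev/null) || continue
        case "$cwd" in
            "$target"|"$target"/*)
                pid=${link#/proc/}
                echo "${pid%/cwd}"
                ;;
        esac
    done
}

# Example call; /yunwei is the mount point from the transcript.
pids_with_cwd_under /yunwei
```

Each PID printed can then be inspected with `ps -fp PID` before deciding whether to ask its owner to leave the directory or to stop the process.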
5. Viewing a process's open files through /proc
Here a more process has 111.log open:

    # ps -ef|grep more
    root     17369 17330  0 10:44 pts/0    00:00:00 more 111.log
    root     17390 15204  0 10:45 pts/6    00:00:00 grep more
    # cd /proc/17369
    # pwd
    /proc/17369
    # ls
    attr       clear_refs       cwd      fdinfo    maps       mountstats  oom_score      root       smaps  status
    autogroup  cmdline          environ  io        mem        net         oom_score_adj  sched      stack  syscall
    auxv       coredump_filter  exe      limits    mountinfo  numa_maps   pagemap        schedstat  stat   task
    cgroup     cpuset           fd       loginuid  mounts     oom_adj     personality    sessionid  statm  wchan
    # cd fd
    # ls -lrt
    total 0
    lrwx------. 1 root root 64 Dec 19 10:45 2 -> /dev/pts/0
    lr-x------. 1 root root 64 Dec 19 10:46 3 -> /root/111.log
    lrwx------. 1 root root 64 Dec 19 10:46 1 -> /dev/pts/0
    lrwx------. 1 root root 64 Dec 19 10:46 0 -> /dev/pts/0

File descriptor 3 is /root/111.log.
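The same walk through /proc/PID/fd can be wrapped in a small function for hosts where lsof is not installed. This is a sketch assuming Linux; `list_fds` is a hypothetical name, and reading another user's process normally requires root.

```shell
#!/bin/sh
# list_fds: hypothetical helper. Prints "FD -> TARGET" for every descriptor
# of the given PID by resolving the symlinks under /proc/PID/fd.
list_fds() {
    for fd in /proc/"$1"/fd/*; do
        # Descriptors can be closed while we iterate; skip those.
        target=$(readlink "$fd" 2>/dev/null) || continue
        echo "${fd##*/} -> $target"
    done
}

# Example: show the descriptors of the current shell.
list_fds "$$"
```

For the more process in the transcript, `list_fds 17369` would be the equivalent of the `ls -lrt` listing, without the permission and timestamp columns.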
6. The /proc filesystem and the lsof command

    # ps -ef|grep ssh
    root      2589     1  0 07:40 ?        00:00:00 /usr/sbin/sshd
    root     13110  2589  0 10:15 ?        00:00:00 sshd: root@pts/0,pts/6
    root     17491 15204  0 10:47 pts/6    00:00:00 grep ssh
    # cd /proc/2589
    # cd fd
    # ls
    0  1  2  3  4
    # ls -lrt
    total 0
    lrwx------. 1 root root 64 Dec 19 10:29 4 -> socket:[16270]   // two sockets are open
    lrwx------. 1 root root 64 Dec 19 10:29 3 -> socket:[16268]
    lrwx------. 1 root root 64 Dec 19 10:29 2 -> /dev/null
    lrwx------. 1 root root 64 Dec 19 10:29 1 -> /dev/null
    lrwx------. 1 root root 64 Dec 19 10:29 0 -> /dev/null
    # netstat -an|grep 22
    tcp        0      0 192.168.122.1:53            0.0.0.0:*                   LISTEN
    tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN
    tcp        0     52 192.168.1.105:22            192.168.1.103:49285         ESTABLISHED
    tcp        0      0 :::22                       :::*                        LISTEN
    # lsof -p 2589
    COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
    sshd    2589 root  cwd    DIR  253,1     4096      2 /
    sshd    2589 root  rtd    DIR  253,1     4096      2 /
    sshd    2589 root  txt    REG 253,10   526008 581654 /usr/sbin/sshd
    sshd    2589 root  mem    REG  253,1    65928 974878 /lib64/libnss_files-2.12.so
    sshd    2589 root  mem    REG  253,1   243096 975421 /lib64/libnspr4.so
    sshd    2589 root  mem    REG  253,1    17096 975423 /lib64/libplds4.so
    sshd    2589 root  mem    REG  253,1    21256 975422 /lib64/libplc4.so
    sshd    2589 root  mem    REG 253,10   177952 439737 /usr/lib64/libnssutil3.so
    sshd    2589 root  mem    REG  253,1   145720 975396 /lib64/libpthread-2.12.so
    sshd    2589 root  mem    REG  253,1    12592 975408 /lib64/libkeyutils.so.1.3
    sshd    2589 root  mem    REG  253,1    46368 975410 /lib64/libkrb5support.so.0.1
    sshd    2589 root  mem    REG  253,1   386040 975419 /lib64/libfreebl3.so
    sshd    2589 root  mem    REG  253,1  1922152 975393 /lib64/libc-2.12.so
    sshd    2589 root  mem    REG 253,10  1286744 451926 /usr/lib64/libnss3.so
    sshd    2589 root  mem    REG  253,1    17256 975412 /lib64/libcom_err.so.2.1
    sshd    2589 root  mem    REG  253,1   177520 975411 /lib64/libk5crypto.so.3.1
    sshd    2589 root  mem    REG  253,1   944712 975413 /lib64/libkrb5.so.3.3
    sshd    2589 root  mem    REG  253,1   280520 975414 /lib64/libgssapi_krb5.so.2.2
    sshd    2589 root  mem    REG  253,1   113952 975403 /lib64/libresolv-2.12.so
    sshd    2589 root  mem    REG  253,1    43392 975420 /lib64/libcrypt-2.12.so
    sshd    2589 root  mem    REG  253,1   116368 974900 /lib64/libnsl-2.12.so
    sshd    2589 root  mem    REG  253,1    91096 975398 /lib64/libz.so.1.2.3
    sshd    2589 root  mem    REG  253,1    17520 975404 /lib64/libutil-2.12.so
    sshd    2589 root  mem    REG 253,10  1665328 436731 /usr/lib64/libcrypto.so.1.0.0
    sshd    2589 root  mem    REG  253,1   124624 975409 /lib64/libselinux.so.1
    sshd    2589 root  mem    REG  253,1    22536 975399 /lib64/libdl-2.12.so
    sshd    2589 root  mem    REG  253,1    58480 975442 /lib64/libpam.so.0.82.2
    sshd    2589 root  mem    REG  253,1   115536 975434 /lib64/libaudit.so.1.0.0
    sshd    2589 root  mem    REG  253,1    43256 975444 /lib64/libwrap.so.0.7.6
    sshd    2589 root  mem    REG  253,1    12688 975098 /lib64/libfipscheck.so.1.1.0
    sshd    2589 root  mem    REG  253,1   156872 975392 /lib64/ld-2.12.so
    sshd    2589 root    0u   CHR    1,3      0t0   3842 /dev/null
    sshd    2589 root    1u   CHR    1,3      0t0   3842 /dev/null
    sshd    2589 root    2u   CHR    1,3      0t0   3842 /dev/null
    sshd    2589 root    3u  IPv4  16268      0t0    TCP *:ssh (LISTEN)
    sshd    2589 root    4u  IPv6  16270      0t0    TCP *:ssh (LISTEN)

The DEVICE values that lsof shows for the two TCP entries match the socket numbers under /proc (16268 and 16270).
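The correspondence can be checked mechanically: the number inside `socket:[N]` in /proc/PID/fd is the value lsof prints in the DEVICE column for that socket. The sketch below pulls that number out of a link target and lists it per descriptor. It assumes Linux; `socket_inode` and `socket_inodes_of` are hypothetical helper names.

```shell
#!/bin/sh
# socket_inode: hypothetical helper. Prints N for a link target of the form
# "socket:[N]" and prints nothing for any other target.
socket_inode() {
    printf '%s\n' "$1" | sed -n 's/^socket:\[\([0-9][0-9]*\)\]$/\1/p'
}

# socket_inodes_of: hypothetical helper. Prints "FD INODE" for every socket
# descriptor of the given PID, for comparison with lsof's DEVICE column.
socket_inodes_of() {
    for fd in /proc/"$1"/fd/*; do
        target=$(readlink "$fd" 2>/dev/null) || continue
        inode=$(socket_inode "$target")
        [ -n "$inode" ] && echo "${fd##*/} $inode"
    done
    return 0
}

# Example with a link target from the transcript.
socket_inode 'socket:[16270]'
```

Running `socket_inodes_of 2589` on the host in the transcript would be expected to list descriptors 3 and 4, which can then be compared against the `lsof -p 2589` rows by eye or with join.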