How to monitor processes on Linux


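On Linux, the goal is to monitor whether a process is running, waiting, or already dead, and at the same time to monitor the state of the child processes it has started.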
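Before turning to dedicated tools, the state check itself can be sketched directly against the /proc filesystem. The following is a minimal sketch, not part of the original article: the function names are illustrative, and it assumes a Linux system where /proc/<pid>/stat has the layout described in proc(5) (pid, command name in parentheses, state letter, parent pid).

```python
from pathlib import Path

# Single-letter state codes as documented in proc(5).
STATE_NAMES = {
    "R": "running",
    "S": "sleeping (interruptible wait)",
    "D": "waiting (uninterruptible disk sleep)",
    "T": "stopped",
    "Z": "zombie (exited, not yet reaped)",
}


def parse_stat(text):
    """Parse the contents of /proc/<pid>/stat into (pid, name, state, ppid)."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST closing parenthesis.
    left, right = text.rsplit(")", 1)
    pid_text, name = left.split("(", 1)
    fields = right.split()
    return int(pid_text), name, fields[0], int(fields[1])


def process_state(pid):
    """Return the state letter of a process, or None if it no longer exists."""
    try:
        text = Path(f"/proc/{pid}/stat").read_text()
    except OSError:
        return None
    return parse_stat(text)[2]


def child_pids(parent_pid):
    """Return the PIDs whose parent is parent_pid by scanning /proc."""
    children = []
    for entry in Path("/proc").iterdir():
        if not entry.name.isdigit():
            continue
        try:
            pid, _, _, ppid = parse_stat((entry / "stat").read_text())
        except (OSError, ValueError):
            continue  # the process exited while we were scanning
        if ppid == parent_pid:
            children.append(pid)
    return children
```

Calling process_state(pid) and then looking the letter up in STATE_NAMES distinguishes a running process from a waiting or zombie one, and child_pids(pid) gives the direct children to check the same way.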

1. supervise

supervise is a tool from daemontools that monitors and manages applications running on Unix. When an application exits abnormally, supervise restarts the specified program.

Usage:
mkdir test
cd test
vim run        (write the commands you want supervise to run)
chmod +x run   (run must be executable)
cd ..
supervise test (the argument is the directory that contains the run file)
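The restart behaviour described above can be illustrated with a small loop. This is a rough sketch of the idea only, not how daemontools is implemented: the function name, the max_restarts limit, and the delay parameter are all assumptions added so the example terminates.

```python
import subprocess
import sys
import time


def supervise(command, max_restarts=3, delay=1.0):
    """Run command and restart it each time it exits; return the exit codes.

    A real supervisor loops forever; max_restarts exists only so that this
    sketch ends.
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        result = subprocess.run(command)
        exit_codes.append(result.returncode)
        if attempt < max_restarts:
            time.sleep(delay)  # pause briefly before restarting
    return exit_codes
```

For example, supervise([sys.executable, "-c", "raise SystemExit(3)"], max_restarts=2, delay=0.1) starts a child that fails immediately and restarts it twice.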

2. monit

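monit is a small open-source tool for managing and monitoring Unix systems. It can maintain processes automatically and promptly handle problems such as a process exiting abnormally.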

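System: monit can detect problems as they occur, including process state, system CPU load, and memory usage. For example, when the CPU load or memory usage of the apache service becomes too high, monit restarts the apache service.
Processes: monit can monitor daemon processes, including system processes. For example, when a process goes down, monit automatically restarts it.
File system: monit can monitor changes to local files, directories, and file systems, including changes in timestamp, checksum, and size. For example, it can watch a file's sha1 or md5 value to detect whether the file has changed.
Network: monit can monitor network connections and supports TCP, UDP, Unix domain sockets, and protocols such as HTTP and SMTP.
Scheduled scripts: monit can run programs and scripts on a schedule, collect their output, and decide from it whether they succeeded or not.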
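The checksum-based file check mentioned in the list can be sketched in a few lines. This is an illustration of the idea, not monit's own code; the function names are hypothetical, and the stored digest is assumed to come from an earlier run.

```python
import hashlib


def file_digest(path, algorithm="sha1"):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path, known_digest, algorithm="sha1"):
    """Return True if the file's current digest differs from the stored one."""
    return file_digest(path, algorithm) != known_digest
```

A monitor would record file_digest(path) once, then call has_changed(path, recorded) on every polling cycle and raise an alert when it returns True.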
Installation:

sudo apt-get install monit
Edit the configuration:
sudo vim /etc/monit/monitrc
Start, stop, restart:
sudo /etc/init.d/monit start
sudo /etc/init.d/monit stop
sudo /etc/init.d/monit restart
Enable the web page that shows monitoring status:
set httpd port 2812 and
allow 0.0.0.0/0.0.0.0
allow localhost
Add checks:
Note that both start and stop must be defined here; leaving either one out will not work.

1. Monitor by program name

check process test with MATCHING test.py
start program = "/home/yxd/test.py"
stop program = "xxxxx"
2. Monitor by pid file

check process apache with pidfile /var/run/httpd.pid
start program = "/etc/init.d/rcWebServer.sh start https"
stop program = "/etc/init.d/rcWebServer.sh stop https"
if changed pid then alert
Reference: Monitoring critical system processes with monit
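What a pid-file check amounts to can be sketched as follows. This is a simplified illustration under stated assumptions, not monit's implementation: the function names are hypothetical, and the pid file is assumed to contain a single decimal pid.

```python
import os


def read_pid(pidfile):
    """Return the pid stored in pidfile, or None if it is missing or invalid."""
    try:
        with open(pidfile) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return None
    return pid if pid > 0 else None


def is_alive(pid):
    """Check whether a process with this pid exists, without disturbing it."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def check_pidfile(pidfile):
    """Return True if the pid file names a process that is currently alive."""
    pid = read_pid(pidfile)
    return pid is not None and is_alive(pid)
```

For the apache check above this would be called as check_pidfile("/var/run/httpd.pid"); a False result is the point at which a monitor would run the start program.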
3. supervisord

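Supervisor is a client/server system that lets users monitor and control a number of background service processes on UNIX-like operating systems. It is written in Python and is commonly used to restart processes that exit abnormally.
Installation: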

pip install supervisor
View the sample configuration:

echo_supervisord_conf
The output of this command shows the configuration for each section.
Create the configuration file:

echo_supervisord_conf > /etc/supervisord.conf
Configure the application (add this section to /etc/supervisord.conf):

[program:test]
command=python /root/test_supervisor.py
process_name=%(program_name)s
stdout_logfile=/root/test.log
stderr_logfile=/root/test.log
Save, then start:

/usr/bin/supervisord -c /etc/supervisord.conf
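Once supervisord is running, the state of the program can be read with supervisorctl status, which is part of Supervisor but is not covered in the original text. The sketch below parses one line of that output. It assumes the usual "name STATE description" layout, with "pid N, uptime ..." as the description of a RUNNING program; the function name is hypothetical.

```python
def parse_status_line(line):
    """Split one line of `supervisorctl status` output into its parts.

    Assumed layout: <name> <STATE> <description>, for example
    "test    RUNNING   pid 4321, uptime 0:00:12".
    """
    parts = line.split(None, 2)
    if len(parts) < 2:
        return None
    name, state = parts[0], parts[1]
    detail = parts[2].strip() if len(parts) > 2 else ""
    pid = None
    if state == "RUNNING" and detail.startswith("pid "):
        pid_text = detail[4:].split(",", 1)[0].strip()
        if pid_text.isdigit():
            pid = int(pid_text)
    return {"name": name, "state": state, "pid": pid, "detail": detail}
```

A script could feed it each line of the command's output and alert on any state other than RUNNING.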
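Answer A: First use netstat -ant to check whether the service's port is being listened on. If it is not, the service is not running. If you want to monitor processes, use pstree.
Answer B: ps -aux | grep wait shows the waiting processes.
Answer C: ps -aux | grep <process name> should work.
Answer D: top, pstree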
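The port check suggested in the first answer can also be done without netstat by reading /proc/net/tcp. The following is a minimal sketch assuming the standard Linux layout of that file (local address as hex ip:port in the second column, state in the fourth, where 0A means LISTEN); it covers IPv4 only, and the function name is illustrative.

```python
def listening_ports(proc_net_tcp_text):
    """Return the sorted TCP ports in LISTEN state from /proc/net/tcp text."""
    ports = set()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        if state == "0A":  # 0A is the kernel's code for TCP_LISTEN
            ports.add(int(local_address.rsplit(":", 1)[1], 16))
    return sorted(ports)
```

On a live system it would be called as listening_ports(open("/proc/net/tcp").read()); a service whose port is missing from the result is not listening.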

How the Linux hostname affects the Oracle instance and listener

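    On Linux, does changing the hostname affect an Oracle database instance or the listener process? If it does, how can the problem be solved? And do changes to the related entries in /etc/hosts also affect the instance or the listener? There are many scenarios here and the relationships are fairly complex, so we test and verify them with a few examples below.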

    The original contents of /etc/hosts and /etc/sysconfig/network on the server are shown below.

[root@test ~]# more /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
#::1            localhost6.localdomain6 localhost6
127.0.0.1       localhost.localdomain localhost
192.168.27.134  test test 
[root@test ~]# 
 
 
[root@test ~]# more /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=test
GATEWAY=192.168.27.1
[root@test ~]# 

 

 

1: First, suppose the hostname has to be changed to test.eduction.com (adding the domain part). Does that cause a problem? We change HOSTNAME in /etc/sysconfig/network first and then reboot the server.

[root@test ~]# more  /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=test.eduction.com
GATEWAY=192.168.27.1
 
[root@test ~]# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
#::1            localhost6.localdomain6 localhost6
127.0.0.1       localhost.localdomain localhost
192.168.27.134  test test

 

Restarting the database instance afterwards works without any problem, but restarting the listener produces the following error:

[oracle@test ~]$ sqlplus / as sysdba
 
SQL*Plus: Release 10.2.0.5.0 - Production on Sat Jun 18 16:42:21 2016
 
Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
 
Connected to an idle instance.
 
SQL> startup
ORACLE instance started.
 
Total System Global Area 1509949440 bytes
Fixed Size                  2096472 bytes
Variable Size            1392509608 bytes
Database Buffers           67108864 bytes
Redo Buffers               48234496 bytes
Database mounted.
Database opened.
SQL> exit
Disconnected from Oracle Database 10g Release 10.2.0.5.0 - 64bit Production
[oracle@test ~]$ lsnrctl start
 
LSNRCTL for Linux: Version 10.2.0.5.0 - Production on 18-JUN-2016 16:42:47
 
Copyright (c) 1991, 2010, Oracle.  All rights reserved.
 
Starting /u01/app/oracle/product/10.2.0/db_1/bin/tnslsnr: please wait...
 
TNSLSNR for Linux: Version 10.2.0.5.0 - Production
Log messages written to /u01/app/oracle/product/10.2.0/db_1/network/log/listener.log
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=test.eduction.com)(PORT=1521)))
 
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
TNS-12535: TNS:operation timed out
 TNS-12560: TNS:protocol adapter error
  TNS-00505: Operation timed out
   Linux Error: 110: Connection timed out
[oracle@test ~]$ 


 

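When this happens, the hostname entry in /etc/hosts must be changed to match HOSTNAME in /etc/sysconfig/network, which resolves the error above. The corrected entry is the 192.168.27.134 line below: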

[root@test ~]# more /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=test.eduction.com
GATEWAY=192.168.27.1
[root@test ~]# more /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
#::1            localhost6.localdomain6 localhost6
127.0.0.1       localhost.localdomain localhost
192.168.27.134  test.eduction.com test
[root@test ~]#
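The mismatch that causes the listener error can be caught before restarting anything by checking whether the configured hostname appears in /etc/hosts. This is a small sketch with hypothetical function names; it only parses the hosts file text and does not consult DNS or other name services.

```python
def parse_hosts(text):
    """Map every hostname and alias in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # the first matching entry wins
    return mapping


def hostname_is_mapped(hostname, hosts_text):
    """Return True if hostname has an entry in the given hosts-file text."""
    return hostname in parse_hosts(hosts_text)
```

On the server this would be called as hostname_is_mapped(socket.gethostname(), open("/etc/hosts").read()) after importing socket. With the original entry "192.168.27.134  test test" the check fails for test.eduction.com, and with the corrected entry it passes.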

 

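Because this is a test, I changed the domain name. If the domain part is the server's real domain and matches /etc/resolv.conf, then watching the alert log while the database instance starts shows a large number of ORA-07445 and ORA-00108 errors.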


 

2: Change the hostname in /etc/sysconfig/network and make it take effect, as shown below

[oracle@kerry ~]$ more /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
#::1            localhost6.localdomain6 localhost6
127.0.0.1       localhost.localdomain localhost
192.168.27.134  test.eduction.com  test 
[oracle@kerry ~]$ more /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=kerry.eduction.com
GATEWAY=192.168.27.1

 

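The database instance starts without any problem, but starting the listener fails with the same error as above. If the real domain name is used, a different situation arises, and the alert log also contains the following errors: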

Errors in file /u01/app/oracle/admin/SCM2/bdump/scm2_ora_4494.trc:

ORA-07445: exception encountered: core dump [kslgetl()+120] [SIGSEGV] [Address not mapped to object] [0x000000210] [] []

ORA-00108: failed to set up dispatcher to accept connection asynchronously


On this point, see the official document ORA-07445: [kslgetl()+80] Followed by ORA-108: failed to set up dispatcher to accept connection asynchronously (Doc ID 1298804.1)

 

3: If the localhost entry in /etc/hosts is commented out, as shown below, the listener may be affected

 

[root@kerry ~]# more /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
#::1            localhost6.localdomain6 localhost6
#127.0.0.1       localhost.localdomain localhost
192.168.27.134  kerry.eduction.com  kerry 
[root@kerry ~]# 
 
 
[oracle@kerry ~]$ lsnrctl start
 
LSNRCTL for Linux: Version 10.2.0.5.0 - Production on 18-JUN-2016 17:45:37
 
Copyright (c) 1991, 2010, Oracle.  All rights reserved.
 
Starting /u01/app/oracle/product/10.2.0/db_1/bin/tnslsnr: please wait...
 
TNSLSNR for Linux: Version 10.2.0.5.0 - Production
Log messages written to /u01/app/oracle/product/10.2.0/db_1/network/log/listener.log
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=kerry.eduction.com)(PORT=1521)))
 
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
TNS-12547: TNS:lost contact
 TNS-12560: TNS:protocol adapter error
  TNS-00517: Lost contact
   Linux Error: 104: Connection reset by peer


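This happens because I had not configured listener.ora under $ORACLE_HOME/network/admin. Without a listener.ora, the listener falls back to default values (see the official documentation quoted below), and the listener process uses the local 127.0.0.1 configuration to register the listening service. That is why commenting out or deleting the localhost entry in /etc/hosts produces the error above. The official documentation describes this as follows: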

 

Oracle Net Listener Configuration Overview

Note:

Oracle Database 10g and later databases require a version 10 or later listener. Earlier versions of the listener are not supported for use with Oracle Database 10g and later databases. However, you can use a version 10 listener with previous versions of Oracle Database.

A listener is configured with one or more listening protocol addresses, information about supported services, and parameters that control its runtime behavior. The listener configuration is stored in a configuration file named listener.ora.

Because all of the configuration parameters have default values, it is possible to start and use a listener with no configuration. This default listener has a name of LISTENER, supports no services on startup, and listens on the following TCP/IP protocol address:

(ADDRESS=(PROTOCOL=tcp)(HOST=host_name)(PORT=1521))

Supported services, that is, the services to which the listener forwards client requests, can be configured in the listener.ora file or this information can be dynamically registered with the listener. This dynamic registration feature is called service registration. The registration is performed by the PMON process—an instance background process—of each database instance that has the necessary configuration in the database initialization parameter file. Dynamic service registration does not require any configuration in the listener.ora file.

 

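There are two solutions:

1: Configure a listener.ora file under $ORACLE_HOME/network/admin/. After that, commenting out or deleting 127.0.0.1 in /etc/hosts causes no listener problems.

2: Adding localhost to /etc/hosts (on the 192.168.27.134 line below) also solves the problem.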

[root@kerry ~]# more /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
#::1            localhost6.localdomain6 localhost6
#127.0.0.1       localhost.localdomain localhost
192.168.27.134  kerry.eduction.com  kerry localhost
[root@kerry ~]#

For details, see the official document Starting TNS Listener or LSNRCTL Start Yields TNS-12541, Linux Error: 111: Connection Refused (Doc ID 343295.1)
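After applying either fix, a quick way to confirm that the listener port is reachable is a plain TCP connection attempt. This is a generic sketch, not an Oracle tool: the function name is hypothetical, and a successful connection only shows that something accepts connections on the port, not that the listener has registered any services.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused, timed out, and unresolvable host
        return False
```

For the setup above one would try can_connect("localhost", 1521) and can_connect("kerry.eduction.com", 1521); a False result for either name points back at the /etc/hosts entries.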

 

  In addition, we have ignored the configuration inside listener.ora here. If that file uses the hostname rather than the IP address, some problems will also arise.
