OS4X check running daemons

Which processes must run

In order to keep OS4X running, the following processes must exist:

  • os4xrd: The OS4X master receive daemon (parent process of the subsequent receive daemon processes)
    • os4xrd_tcpip: The OS4X receive daemon for accepting TCP/IP connections
    • os4xrd_tcpip_tls: The OS4X receive daemon for accepting TLS secured TCP/IP connections for OFTP2
    • os4xrd_capi_0: The OS4X receive daemon for accepting ISDN connections on the first ISDN controller
    • (os4xrd_capi_x: receive processes for subsequent ISDN controllers, if more controllers are configured)
  • os4xsqd: OS4X send queue daemon. Parent process of running actions started by this process, like sending send queue entries, polling partners, updating CRLs etc.
  • os4xdebugd: OS4X debug daemon which collects debugging information from every OS4X process for problem reporting

In case you have OS4X Enterprise, the following process must also exist:

  • os4xclientd: OS4X client daemon, responsible for user authentication and OS4X Enterprise job processing
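
The process list above can be turned into a quick check. The following is a minimal sketch and not part of OS4X: the function name missing_daemons is hypothetical, and it assumes that the daemon names appear in the command column of a "ps -ef" listing. It reads such a listing on standard input and prints every required name that was not found.

```shell
#!/bin/sh
# Sketch (hypothetical helper, not shipped with OS4X):
# report which required daemons are absent from a "ps -ef" style listing.
# The listing is read from stdin; the daemon names are the arguments.
missing_daemons() {
    listing=$(cat)
    for name in "$@"; do
        # Match the name as a whole word: preceded by start of line, a space
        # or "/", and followed by a space or end of line. This keeps "os4xrd"
        # from matching "os4xrd_tcpip".
        if ! printf '%s\n' "$listing" | grep -Eq "(^|[ /])${name}( |\$)"; then
            printf '%s\n' "$name"
        fi
    done
}
```

A possible call, with the Enterprise daemon included, would be: ps -ef | missing_daemons os4xrd os4xrd_tcpip os4xsqd os4xdebugd os4xclientd. Empty output means that all listed names were found.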

How do the processes save their state

Every OS4X daemon process saves its PID (process ID) in the database. If a PID is already stored in the database at startup, the daemon refuses to start and reports that another instance appears to be running. In rare cases the database does not reflect the real situation: PIDs are stored, but no corresponding processes are running. In this case, you can start the daemon in 'forced' mode, which overrides the PID check at daemon start.

Beware: If more than one instance of the same daemon is running, race conditions occur between the running programs, mostly leading to problems in receiving or sending files. This situation is unsupported and must be avoided!

Identify running processes

In this example, the OS4X receive daemon is checked for existence. In addition, the listening functionality on the configured TCP/IP ports is checked.

All examples are shown in a Linux environment. If you use another supported Unix environment (such as AIX, HP-UX or Solaris), you may need to modify the commands accordingly.

Beware that some operating system environments resolve port numbers to service names, so a check for open ports may show service names defined in a name resolution source (such as "/etc/services", NIS or others) instead of port numbers.

Normal situation

In a running environment, a normal situation can be verified manually with the following commands. If problems occur, refer to your operating system manual or ask your system administrator.

PID files

The daemons write a PID file into the configurable temporary directory. The naming convention is:

<daemon>.pid

The file contains the process ID of the running process, without a trailing newline character. When the daemon starts, it creates the file if it does not exist and overwrites an existing one. On a normal daemon stop, the file is deleted.

Example:

root@os4xvirtual:~# ls -l /opt/os4x/tmp/*.pid
-rw-rw-rw- 1 root     root     5 Mar  9 14:32 /opt/os4x/tmp/os4xclientd.pid
-rw-rw-rw- 1 www-data www-data 5 Mar  9 14:32 /opt/os4x/tmp/os4xdebugd.pid
-rw-rw-rw- 1 www-data www-data 5 Mar  9 14:32 /opt/os4x/tmp/os4xrd.pid
-rw-rw-rw- 1 www-data www-data 5 Mar  9 14:32 /opt/os4x/tmp/os4xsqd.pid
root@os4xvirtual:~# cat /opt/os4x/tmp/os4xclientd.pid
25858root@os4xvirtual:~#
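
Because a PID file is only deleted on a normal stop, a leftover file may point to a process that no longer exists. The following minimal sketch classifies a PID file accordingly. The function name pidfile_status is hypothetical and not part of OS4X, and the "/proc" test assumes Linux.

```shell
#!/bin/sh
# Sketch (hypothetical helper, not shipped with OS4X):
# classify a PID file as "running", "stale", "invalid" or "missing".
pidfile_status() {
    file=$1
    [ -f "$file" ] || { echo "missing"; return 0; }
    # The file holds the bare PID without a trailing newline;
    # command substitution copes with either form.
    pid=$(cat "$file")
    case $pid in
        ''|*[!0-9]*) echo "invalid"; return 0 ;;
    esac
    # "kill -0" sends no signal; it only tests whether the process can be
    # signalled. The /proc test (Linux) covers processes of other users.
    if kill -0 "$pid" 2>/dev/null || [ -d "/proc/$pid" ]; then
        echo "running"
    else
        echo "stale"
    fi
}
```

With the directory from the listing above, a call could look like: pidfile_status /opt/os4x/tmp/os4xrd.pid. Note that "running" only means that some process has this PID; since the operating system reuses PIDs, it is worth confirming the process name with "ps" as well.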

Checking OS4X receive daemon

The process table will be checked for running "os4xrd" processes:

os4xbox:~# ps -ef  | grep os4xrd | grep -v grep
www-data  3771     1  0 08:58 ?        00:00:00 /opt/os4x/bin/os4xrd
www-data  3772  3771  0 08:58 ?        00:00:00 os4xrd_tcpip        
www-data  3773  3771  0 08:58 ?        00:00:00 os4xrd_tcpip_tls    
www-data  3774  3771  0 08:58 ?        00:00:00 os4xrd_capi_0       

Process IDs explained:

  • 3771: OS4X receive daemon master process
  • 3772: daemon listening on port 3305
  • 3773: daemon listening on port 6619
  • 3774: daemon listening on first configured ISDN controller

When checking the open ports, the tool "lsof" is very handy:

os4xbox:~# lsof -p 3772 | grep LISTEN
os4xrd  3772 www-data    3u  IPv4 84960962             TCP *:3305 (LISTEN)
os4xbox:~# lsof -p 3773 | grep LISTEN
os4xrd  3773 www-data    3u  IPv4 84960964             TCP *:6619 (LISTEN)

Checking ISDN incoming call acceptance is somewhat more complicated and is described elsewhere.
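
For scripted checks, the port number can be extracted from the "lsof" output. The following is a minimal sketch; the function names listen_ports and expect_port are hypothetical and not part of OS4X. It assumes the output layout shown above, where the last column is "(LISTEN)" and the column before it ends in ":port".

```shell
#!/bin/sh
# Sketch (hypothetical helpers, not shipped with OS4X).

# Print the port of every LISTEN line found in lsof output on stdin.
listen_ports() {
    awk '$NF == "(LISTEN)" {
        n = split($(NF-1), part, ":")   # e.g. "*:3305" -> "*", "3305"
        print part[n]
    }'
}

# Succeed if the lsof output on stdin contains a listener on port $1.
expect_port() {
    listen_ports | grep -qx "$1"
}
```

A possible call would be: lsof -nP -p 3772 | expect_port 3305. The lsof options "-n" and "-P" suppress host name and port name resolution, so the port is shown as a number even if a service name is defined for it.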

Checking OS4X send queue daemon

In a dormant state, only one "os4xsqd" process is running:

os4xbox:~# ps -ef  | grep os4xsqd | grep -v grep
www-data  3823     1  2 08:59 ?        00:00:35 /opt/os4x/bin/os4xsqd

If child processes are active, they may be transferring files or managing OS4X internal information, such as TSL management or updating CRLs.

Checking OS4X debug daemon

Exactly one instance of the OS4X debug daemon must be running, listening on the configured port for debug messages:

os4xbox:~# ps -ef  | grep os4xdebugd | grep -v grep
www-data  4724     1  2 09:20 ?        00:00:00 /opt/os4x/bin/os4xdebugd
os4xbox:~# lsof -p 4724 | grep LISTEN
os4xdebug 4724 www-data    3u  IPv4 84970858             TCP localhost:10001 (LISTEN)

Checking OS4X client daemon

At least one OS4X Enterprise client daemon process must be running. Child processes may exist, but they must not hold the configured listening port of the OS4X client daemon open.

os4xbox:~# ps -ef  | grep os4xclientd | grep -v grep
www-data  4979     1  2 09:21 ?        00:00:00 /opt/os4x/bin/os4xclientd
os4xbox:~# lsof -p 4979 | grep LISTEN
os4xdebug 4979 www-data    3u  IPv4 84639858             TCP localhost:60000 (LISTEN)

Critical situation

A critical situation exists when PIDs are saved in the database but no corresponding OS4X processes exist.

In this example, the OS4X administrative web interface shows that the daemons should be running with the following PIDs:

PIDList.png

But the processes do not exist:

os4xbox:~# ps -ef |grep 3772 | grep -v grep
os4xbox:~# 

Since no process is running, the PID stored in the database is stale.
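
A plain "grep" for a number also matches other columns of the listing, for example the parent PID of a child process. The following minimal sketch compares the PID column only. The function name pid_in_listing is hypothetical and not part of OS4X; it assumes the "ps -ef" layout used on this page, with the PID in the second column.

```shell
#!/bin/sh
# Sketch (hypothetical helper, not shipped with OS4X):
# succeed only if the "ps -ef" listing on stdin has a line whose
# PID column (second field) equals the PID given as $1.
pid_in_listing() {
    awk -v pid="$1" '$2 == pid { found = 1 } END { exit !found }'
}
```

A possible call would be: ps -ef | pid_in_listing 3772 || echo "PID 3772 is stale". Each PID shown in the web interface can be checked this way before deciding on a forced start.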

When to start in 'forced' mode?

If the critical situation is identified, and you are absolutely sure that the processes really are not running, you can start them in forced mode.

Starting daemons forced on the command line

A normal start attempt fails with an error that describes the situation:

os4xbox:~# /opt/os4x/bin/os4xrd
ERROR: There seems to run another os4xrd2 with pid 3771!
       Either you stop it or you clean up the database.
       Database name: 'os4x'
       SQL statement needed for cleanup:
 UPDATE `os4x_pids` SET `pid`='-1' WHERE `program`='os4xrd' AND `serverID`=0
alternatively, start program with parameter '-f':
  /opt/os4x/bin/os4xrd -f

All daemons understand the optional start parameter "-f" for a forced start:

os4xbox:~# /opt/os4x/bin/os4xrd -f
os4xbox:~# /opt/os4x/bin/os4xsqd -f
os4xbox:~# /opt/os4x/bin/os4xclientd -f
os4xbox:~# /opt/os4x/bin/os4xdebugd -f
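
Since a forced start is only safe when no instance is left, the check and the start can be combined. The following is a minimal sketch and not part of OS4X: the function name forced_start_if_absent is hypothetical, and it assumes that child processes are named after the daemon with a "_" suffix, as the os4xrd examples on this page suggest. It reads a "ps -ef" listing on standard input and runs the given daemon with "-f" only if neither the daemon nor such a child is listed.

```shell
#!/bin/sh
# Sketch (hypothetical helper, not shipped with OS4X):
# start a daemon with "-f" only if the "ps -ef" listing on stdin
# shows neither the daemon itself nor a child named <daemon>_*.
forced_start_if_absent() {
    bin=$1
    name=$(basename "$bin")
    # Whole-word match on the daemon name, optionally followed by "_suffix".
    if grep -Eq "(^|[ /])${name}(_[A-Za-z0-9_]*)?( |\$)"; then
        echo "refusing forced start: a ${name} process is still listed" >&2
        return 1
    fi
    "$bin" -f
}
```

A possible call would be: ps -ef | forced_start_if_absent /opt/os4x/bin/os4xrd. The sketch deliberately refuses rather than stops anything; a process that is still listed should be examined by hand first.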

Starting daemon forced via web interface

The web interface offers an easy way to start the processes in forced mode: just click the "start forced" link. A popup asks whether you are really sure; confirm it with "OK" to perform the forced start:

StartForcedQuestion.png