Error in gen_server also terminates the calling process?

Posted: 2016-03-07 22:13:44

Question:

My gen_server contains a callback like this:

handle_call(error, From, State) ->
    io:format("Inside the handle_call error~n"),
    1/0.

It provides a start (not start_link) function:

start() ->
    gen_server:start({local, ?MODULE}, ?MODULE, [], []).

normal_call() ->
    gen_server:call(?MODULE, normal).

error_call() ->
    gen_server:call(?MODULE, error).

I call the start function from the shell:

c(some_module).
some_module:start().

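Then I call error_call, which terminates the server process because of the division-by-zero error, but it also terminates the shell (the calling process). I don't understand why. They are not linked, yet the shell restarts with a new pid. Is this expected gen_server behavior, or am I doing something wrong?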

Update: it still isn't working, so to help I am posting the full code.

-module(some_test).
-behaviour(gen_server).
-compile(export_all).

%% API functions, called directly by the user

start() ->
    gen_server:start({local, ?MODULE}, ?MODULE, [], []).

normal_call() ->
    gen_server:call(?MODULE, normal, infinity).

error_call() ->
    gen_server:call(?MODULE, error, infinity).

%% service-specific callbacks; should not be called directly

init(_) ->
    io:format("Inside the init ( ~p )~n", [self()]),
    io:format("Inside the init...~n"),
    {ok, nil}.

handle_call(normal, _, _) ->
    io:format("Inside the handle_call normal~n"),
    {reply, ok, nil};

handle_call(error, _, nil) ->
    io:format("Inside the handle_call error~n"),
    1/0.

terminate(Reason, _) ->
    io:format("Inside the terminate~p~n", [Reason]).

%% just to complete

handle_info(_, _) ->
    {noreply, nil}.

handle_cast(_, _) ->
    {noreply, nil}.

code_change(_, _, _) ->
    {ok, nil}.

%% a single test function that proves the calling process was removed immediately; it did not wait for 5 seconds

test_function() ->
    io:format("Id is : ~p~n", [self()]),
    ?MODULE:start(),
    ?MODULE:normal_call(),
    ?MODULE:error_call(),
    io:format("Id is : ~p~n", [self()]).   %% never reached :(

First I ran this:

c(some_test).
some_test:test_function().

and got this output:

20> some_test:test_function().
Id is : <0.84.0>
Inside the init ( <0.92.0> )
Inside the init...
Inside the handle_call normal
Inside the handle_call error
Inside the terminate{badarith,
                        [{some_test,handle_call,3,
                             [{file,"some_test.erl"},{line,29}]},
                         {gen_server,try_handle_call,4,
                             [{file,"gen_server.erl"},{line,629}]},
                         {gen_server,handle_msg,5,
                             [{file,"gen_server.erl"},{line,661}]},
                         {proc_lib,init_p_do_apply,3,
                             [{file,"proc_lib.erl"},{line,240}]}]}

=ERROR REPORT==== 3-Dec-2015::18:36:22 ===
** Generic server some_test terminating
** Last message in was error
** When Server state == nil
** Reason for termination ==
** {badarith,[{some_test,handle_call,3,[{file,"some_test.erl"},{line,29
              {gen_server,try_handle_call,4,
                          [{file,"gen_server.erl"},{line,629}]},
              {gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,6
              {proc_lib,init_p_do_apply,3,
                        [{file,"proc_lib.erl"},{line,240}]}]}
** exception exit: {{badarith,
                        [{some_test,handle_call,3,
                             [{file,"some_test.erl"},{line,29}]},
                         {gen_server,try_handle_call,4,
                             [{file,"gen_server.erl"},{line,629}]},
                         {gen_server,handle_msg,5,
                             [{file,"gen_server.erl"},{line,661}]},
                         {proc_lib,init_p_do_apply,3,
                             [{file,"proc_lib.erl"},{line,240}]}]},
                    {gen_server,call,[some_test,error,infinity]}}
     in function  gen_server:call/3 (gen_server.erl, line 212)
     in call from some_test:test_function/0 (some_test.erl, line 51)
21>

So we can see that the last line, after some_test:error_call(), is never reached. Is that because the calling process is terminated as well?

Answer 1:

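gen_server:call/2,3 is a synchronous messaging function. That means it does not return until the gen_server sends a reply to the sender. In this case the gen_server is crashing, so a response is never sent. The result is that the calling function times out, and the result of the timeout is a crash.

You will notice that the callee crashes immediately but the caller crashes 5 seconds later. That is because the default timeout is 5000 milliseconds (see the gen_server:call documentation). Try setting it to infinity: your calling process will hang, blocking forever on a response that never comes.

[Update: a call made directly from the shell crashes immediately because an exception is raised while it executes. This is different from the timeout expiring: both cases end in a crash, but the exception happens right away.]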
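
A minimal sketch of the point in the update, assuming a hypothetical module named crash_demo (it is not part of the question's code). When the server dies while handling a call, gen_server:call/2 raises an exit in the caller, and the caller survives only if it catches that exit:

```erlang
-module(crash_demo).
-behaviour(gen_server).

%% Hypothetical demo module; all names here are illustrative.
-export([start/0, safe_error_call/0]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Started without a link, as in the question.
start() ->
    gen_server:start({local, ?MODULE}, ?MODULE, [], []).

%% Catch the exit raised by gen_server:call/2 so that it does not
%% propagate and kill the calling process.
safe_error_call() ->
    try gen_server:call(?MODULE, error) of
        Reply -> {ok, Reply}
    catch
        exit:Reason -> {error, Reason}
    end.

init([]) ->
    {ok, nil}.

%% Deliberate crash. erlang:error/1 is used instead of 1/0 only to
%% avoid the compiler's badarith warning.
handle_call(error, _From, _State) ->
    erlang:error(badarith);
handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Message, State) ->
    {noreply, State}.
```

With this wrapper the server still terminates and logs its error report, but the shell keeps its pid, because the exit is turned into an ordinary return value of the form {error, {{badarith, Stack}, {gen_server, call, Args}}}.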

The way around this is to use gen_server:cast/2 to send an asynchronous message. Try defining this:

handle_cast(error, State) ->
    io:format("Inside the handle_cast error~n"),
    1/0.

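This crashes only the callee; the caller carries on the moment the message is sent. It is like throwing a baseball and walking away, rather than throwing a boomerang and waiting for it.

As you gain experience with Erlang you will tend to write code with as many casts as possible, using calls only where the situation really requires a synchronous message (some state value actually depends on the sequence of events). This makes systems more loosely coupled in many ways and makes your calling functions resistant to failures in the processes they send data to.

Edit

The shell crashes because the call it is executing crashes during execution. That does not happen with an asynchronous message, though. I added another module and extended yours to illustrate this. There is now a new module called some_tester that relays whatever we send, so it crashes instead of the shell. Here are the relevant bits: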

-module(some_tester).
-compile(export_all).

start() ->
    gen_server:start({local, ?MODULE}, ?MODULE, [], []).

relay(Nature, Message) ->
    gen_server:cast(?MODULE, {Nature, Message}).


init(State) ->
    ok = z("Starting..."),
    {ok, State}.

handle_call(Message, From, S) ->
    ok = z("Received ~tp from ~tp", [Message, From]),
    {reply, ok, S}.

handle_cast({cast, Message}, S) ->
    ok = some_test:cast(Message),
    {noreply, S};
handle_cast({call, Message}, S) ->
    ok = z("Sending call with ~tp", [Message]),
    Reply = some_test:normal_call(Message),
    ok = z("Received ~tp as reply", [Reply]),
    {noreply, S};
handle_cast({infinite, Message}, S) ->
    ok = z("Sending infinite with ~tp", [Message]),
    Reply = some_test:infinite_call(Message),
    ok = z("Received ~tp as reply", [Reply]),
    {noreply, S};
handle_cast(Message, S) ->
    ok = z("Unexpected ~tp", [Message]),
    {noreply, S}.

And here are the relevant bits of some_test:

start() ->
    gen_server:start({local, ?MODULE}, ?MODULE, [], []).

normal_call(Message) ->
    gen_server:call(?MODULE, Message).

infinite_call(Message) ->
    gen_server:call(?MODULE, Message, infinity).

cast(Message) ->
    gen_server:cast(?MODULE, Message).

% ...

handle_call(normal, _, S) ->
    io:format("Inside the handle_call normal~n"),
    {reply, ok, S};
handle_call(error, _, S) ->
    io:format("Inside the handle_call error~n"),
    {reply, 1/0, S};
handle_call(bad_reply, _, S) ->
    io:format("Inside the handle_call error~n"),
    foo;
handle_call(Message, From, S) ->
    io:format("Received ~tp from ~tp~n", [Message, From]),
    {reply, ok, S}.

handle_cast(error, S) ->
    io:format("Bad arith: ~tp~n", [1/0]),
    {noreply, S};
handle_cast(Message, S) ->
    io:format("Received ~tp~n", [Message]),
    {noreply, S}.

Here is a run playing with it. Note the output of the shell's own calls to self() and flush():

1> c(some_test).
some_test.erl:31: Warning: this expression will fail with a 'badarith' exception
some_test.erl:32: Warning: variable 'S' is unused
some_test.erl:40: Warning: this expression will fail with a 'badarith' exception
{ok,some_test}
2> c(some_tester).
{ok,some_tester}
3> {ok, Test} = some_test:start().
Inside the init ( <0.45.0> )
Inside the init...
{ok,<0.45.0>}
4> {ok, Tester} = some_tester:start().
<0.47.0> some_tester: Starting...
{ok,<0.47.0>}
5> monitor(process, Test).
#Ref<0.0.2.178>
6> monitor(process, Tester).
#Ref<0.0.2.183>
7> self().
<0.33.0>
8> some_tester:relay(call, foo).
<0.47.0> some_tester: Sending call with foo
ok
Received foo from {<0.47.0>,#Ref<0.0.2.196>}
<0.47.0> some_tester: Received ok as reply
9> some_tester:relay(cast, bar).
Received bar
ok
10> some_tester:relay(call, error).
<0.47.0> some_tester: Sending call with error
ok
Inside the handle_call error
Inside the terminate{badarith,
                        [{some_test,handle_call,3,
                             [{file,"some_test.erl"},{line,31}]},
                         {gen_server,try_handle_call,4,
                             [{file,"gen_server.erl"},{line,629}]},
                         {gen_server,handle_msg,5,
                             [{file,"gen_server.erl"},{line,661}]},
                         {proc_lib,init_p_do_apply,3,
                             [{file,"proc_lib.erl"},{line,240}]}]}
Inside the terminate{{badarith,
                         [{some_test,handle_call,3,
                              [{file,"some_test.erl"},{line,31}]},
                          {gen_server,try_handle_call,4,
                              [{file,"gen_server.erl"},{line,629}]},
                          {gen_server,handle_msg,5,
                              [{file,"gen_server.erl"},{line,661}]},
                          {proc_lib,init_p_do_apply,3,
                              [{file,"proc_lib.erl"},{line,240}]}]},
                     {gen_server,call,[some_test,error]}}
11>
=ERROR REPORT==== 3-Dec-2015::22:52:17 ===
** Generic server some_test terminating
** Last message in was error
** When Server state == nil
** Reason for termination ==
** {badarith,[{some_test,handle_call,3,[{file,"some_test.erl"},{line,31}]},
              {gen_server,try_handle_call,4,
                          [{file,"gen_server.erl"},{line,629}]},
              {gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,661}]},
              {proc_lib,init_p_do_apply,3,
                        [{file,"proc_lib.erl"},{line,240}]}]}

=ERROR REPORT==== 3-Dec-2015::22:52:17 ===
** Generic server some_tester terminating
** Last message in was {'$gen_cast',{call,error}}
** When Server state == []
** Reason for termination ==
** {{{badarith,[{some_test,handle_call,3,[{file,"some_test.erl"},{line,31}]},
                {gen_server,try_handle_call,4,
                            [{file,"gen_server.erl"},{line,629}]},
                {gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,661}]},
                {proc_lib,init_p_do_apply,3,
                          [{file,"proc_lib.erl"},{line,240}]}]},
     {gen_server,call,[some_test,error]}},
    [{gen_server,call,2,[{file,"gen_server.erl"},{line,204}]},
     {some_tester,handle_cast,2,[{file,"some_tester.erl"},{line,24}]},
     {gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,615}]},
     {gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,681}]},
     {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}

11> self().
<0.33.0>
12> flush().
Shell got {'DOWN',#Ref<0.0.2.178>,process,<0.45.0>,
              {badarith,
                  [{some_test,handle_call,3,
                       [{file,"some_test.erl"},{line,31}]},
                   {gen_server,try_handle_call,4,
                       [{file,"gen_server.erl"},{line,629}]},
                   {gen_server,handle_msg,5,
                       [{file,"gen_server.erl"},{line,661}]},
                   {proc_lib,init_p_do_apply,3,
                       [{file,"proc_lib.erl"},{line,240}]}]}}
Shell got {'DOWN',#Ref<0.0.2.183>,process,<0.47.0>,
              {{badarith,
                   [{some_test,handle_call,3,
                        [{file,"some_test.erl"},{line,31}]},
                    {gen_server,try_handle_call,4,
                        [{file,"gen_server.erl"},{line,629}]},
                    {gen_server,handle_msg,5,
                        [{file,"gen_server.erl"},{line,661}]},
                    {proc_lib,init_p_do_apply,3,
                        [{file,"proc_lib.erl"},{line,240}]}]},
               {gen_server,call,[some_test,error]}}}
ok

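Read through that carefully.

Comments:

That is not the reason. I have updated the question with the full code and output: it did not terminate after 5 seconds, it terminated immediately. I have now added infinity to gen_server:call, so if that were the reason it should no longer crash, but it still does, so please take another look :) Thanks

@Tiger The shell is crashing because it makes the call directly and the exception propagates to it uncaught, so it crashes immediately. The synchronous call fails before it can collect any return value, so splat! If we separate things by having another process act as a message relay, we see that the shell does not crash but the relay process does. Try running observer:start() to see what is happening inside the system during all of this.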
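
The relay idea in the reply can also be sketched without a second gen_server. Assuming a hypothetical helper module named isolate_demo (not part of the thread's code), the risky call runs in a throwaway process that the caller only monitors, so a crash arrives as a 'DOWN' message instead of an exception:

```erlang
-module(isolate_demo).
-export([isolated/1]).

%% Hypothetical helper; the module and function names are illustrative.
%% Run Fun in a separate, unlinked process and report how it ended.
%% The calling process survives even if Fun crashes.
isolated(Fun) when is_function(Fun, 0) ->
    %% The worker hands its result back through its exit reason.
    {Pid, Ref} = spawn_monitor(fun() -> exit({done, Fun()}) end),
    receive
        {'DOWN', Ref, process, Pid, {done, Result}} -> {ok, Result};
        {'DOWN', Ref, process, Pid, Reason} -> {crashed, Reason}
    end.
```

For example, isolate_demo:isolated(fun() -> some_test:error_call() end) should return a {crashed, Reason} tuple rather than taking the shell down, in the same way the some_tester relay absorbed the crash in the run above.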
