Zygote

Posted 2022-12-11 ayanwan


Related source files:

/frameworks/base/cmds/app_process/App_main.cpp (contains the AppRuntime class)
/frameworks/base/core/jni/AndroidRuntime.cpp
/frameworks/base/core/java/com/android/internal/os/ZygoteInit.java
/frameworks/base/core/java/com/android/internal/os/Zygote.java
/frameworks/base/core/java/android/net/LocalServerSocket.java

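In Android, zygote is the core process through which the whole system creates new processes. On startup, the zygote process first starts the Dalvik virtual machine, then loads the necessary system resources and system classes, and finally enters a listening state.

From then on, when another system module (such as AMS, the ActivityManagerService) wants to create a new process, it simply sends a request to the zygote process. When zygote receives the request, it forks a new process, so the new process is born with its own Dalvik virtual machine and the preloaded system resources already in place.

The zygote process is started by the init process. The description of zygote in the init.rc script shows that its executable is /system/bin/app_process, which means that at boot the system runs the main() function of this executable.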

service zygote /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server
    class main
    socket zygote stream 660 root system
    onrestart write /sys/android_power/request_state wake
    onrestart write /sys/power/state on
    onrestart restart media
    onrestart restart netd

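[Explanation]

zygote is the name that identifies this service (an executable program);

/system/bin/app_process is the location of the executable;

Lines that begin with keywords such as class, user, group, and onrestart are called options. Options describe characteristics of the service, and different services have different options.

The third line means that while Zygote starts up, a socket named zygote is created for it. Its Linux permission is 660, so only its owner (root) and members of the system group can read from and write to it.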
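To make the socket option concrete, here is a minimal sketch that splits such a line into its fields and renders the octal permission the way ls would. The SocketOption class and its methods are hypothetical helpers written for this illustration, not part of Android's init, and the parser assumes all five fields are present.

```java
// Hypothetical helper for illustration only; Android's init parses init.rc in native code.
class SocketOption {
    final String name;   // socket name, e.g. "zygote"
    final String type;   // socket type, e.g. "stream"
    final int mode;      // permission bits, parsed as an octal number
    final String user;   // owning user
    final String group;  // owning group

    SocketOption(String name, String type, int mode, String user, String group) {
        this.name = name;
        this.type = type;
        this.mode = mode;
        this.user = user;
        this.group = group;
    }

    // Parses a line of the form: socket <name> <type> <perm> <user> <group>
    static SocketOption parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length < 6 || !f[0].equals("socket")) {
            throw new IllegalArgumentException("not a socket option: " + line);
        }
        return new SocketOption(f[1], f[2], Integer.parseInt(f[3], 8), f[4], f[5]);
    }

    // Renders the nine permission bits as an ls-style string.
    String modeString() {
        String flags = "rwxrwxrwx";
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 9; i++) {
            boolean set = (mode & (1 << (8 - i))) != 0;
            sb.append(set ? flags.charAt(i) : '-');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SocketOption opt = parse("    socket zygote stream 660 root system");
        // prints: zygote rw-rw---- root:system
        System.out.println(opt.name + " " + opt.modeString() + " " + opt.user + ":" + opt.group);
    }
}
```

The rendered string shows read and write access for the owner and the group, and no access for anyone else.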

Let us look at the Zygote startup flow:

I. Call hierarchy of the functions involved in Zygote startup:
App_main.main
    AndroidRuntime.start
        startVm
        startReg
        ZygoteInit.main
            registerZygoteSocket
            preload
            startSystemServer
            runSelectLoop

The main() function of the zygote service is in frameworks/base/cmds/app_process/App_main.cpp. The key code is as follows:

int main(int argc, char* const argv[])
{
    . . . . . .
    AppRuntime runtime;
    const char* argv0 = argv[0];    // path of the executable; the next argument is -Xzygote
    argc--;
    argv++;
    . . . . . .
    int i = runtime.addVmArguments(argc, argv);
    . . . . . .
    while (i < argc) {
        const char* arg = argv[i++];    // the first one should be the /system/bin directory
        if (!parentDir) {
            parentDir = arg;
        } else if (strcmp(arg, "--zygote") == 0) {
            zygote = true;
            niceName = "zygote";
        } else if (strcmp(arg, "--start-system-server") == 0) {
            startSystemServer = true;
        }
        . . . . . .
    }

    if (niceName && *niceName) {
        setArgv0(argv0, niceName);
        set_process_name(niceName);     // usually renamed to "zygote"
    }
    runtime.mParentDir = parentDir;
    if (zygote) {
        runtime.start("com.android.internal.os.ZygoteInit", args);
    } else if (className) {
        runtime.start("com.android.internal.os.RuntimeInit", args);
    } else {
        fprintf(stderr, "Error: no class name or --zygote supplied.\n");
        app_usage();
        LOG_ALWAYS_FATAL("app_process: no class name or --zygote supplied.");
        return 10;
    }
}

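main() first constructs an AppRuntime object (AppRuntime runtime), then renames the process to "zygote", and finally uses the runtime object to hand the work over to the corresponding *Init class in the Java layer.

Depending on the arguments passed in, there are two ways to start: one uses "com.android.internal.os.RuntimeInit" and the other uses "com.android.internal.os.ZygoteInit", corresponding to the RuntimeInit and ZygoteInit classes. The main difference between the two lies on the Java side, where ZygoteInit clearly does much more than RuntimeInit, for example "preload" and "gc". On the native side, however, both do the same thing: startVm() and startReg().

In short, after the main() function of the Zygote process starts the Dalvik virtual machine, it calls the static main() function of the ZygoteInit class.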
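The argument handling that decides between the two entry classes can be restated as a small Java sketch. This is only an illustration of the logic in the excerpt above: the AppProcessArgs class and its fields are hypothetical, and two points are assumptions rather than facts from the excerpt, namely that leading options starting with '-' go to the VM and that any other argument is taken as the class name.

```java
// Illustrative Java restatement of the argument handling; all names here are hypothetical.
class AppProcessArgs {
    String parentDir;
    String niceName;
    String className;
    boolean zygote;
    boolean startSystemServer;

    static AppProcessArgs parse(String[] argv) {
        AppProcessArgs r = new AppProcessArgs();
        int i = 0;
        // Assumption: leading options that start with '-' are meant for the VM.
        while (i < argv.length && argv[i].startsWith("-")) {
            i++;
        }
        while (i < argv.length) {
            String arg = argv[i++];
            if (r.parentDir == null) {
                r.parentDir = arg;
            } else if (arg.equals("--zygote")) {
                r.zygote = true;
                r.niceName = "zygote";
            } else if (arg.equals("--start-system-server")) {
                r.startSystemServer = true;
            } else {
                // Assumption: anything else is the class to run.
                r.className = arg;
                break;
            }
        }
        return r;
    }

    // Returns the Java entry class, or null when neither mode applies.
    String entryClass() {
        if (zygote) {
            return "com.android.internal.os.ZygoteInit";
        }
        if (className != null) {
            return "com.android.internal.os.RuntimeInit";
        }
        return null;
    }

    public static void main(String[] args) {
        AppProcessArgs r = parse(new String[] {
            "-Xzygote", "/system/bin", "--zygote", "--start-system-server"});
        // prints: com.android.internal.os.ZygoteInit startSystemServer=true
        System.out.println(r.entryClass() + " startSystemServer=" + r.startSystemServer);
    }
}
```

With the command line from init.rc, the sketch selects ZygoteInit; a command line that names a class instead of passing --zygote would select RuntimeInit.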

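1. The call diagram of zygote in the native layer is as follows:

2. In the Java layer, zygote's calls are carried out mainly by ZygoteInit.java. The call diagram is as follows:

The key code of ZygoteInit.java (frameworks/base/core/java/com/android/internal/os/ZygoteInit.java) is as follows: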

public class ZygoteInit {
    ......

    public static void main(String argv[]) {
        try {
            ......

            registerZygoteSocket();

            ......

            ......

            if (argv[1].equals("true")) {
                startSystemServer();
            } else if (!argv[1].equals("false")) {
                ......
            }

            ......

            if (ZYGOTE_FORK_MODE) {
                ......
            } else {
                runSelectLoopMode();
            }

            ......
        } catch (MethodAndArgsCaller caller) {
            ......
        } catch (RuntimeException ex) {
            ......
        }
    }

    ......
}

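ZygoteInit.main() does four main things:

(1) It calls registerZygoteSocket() to create a socket interface used to communicate with ActivityManagerService and others;

(2) It preloads some classes and resources;

(3) It calls startSystemServer() to start the SystemServer component;

startSystemServer() does not call the Java class's main() function directly inside its own body. Instead, an exception is thrown, and the call is made outside startSystemServer(), where the exception is handled.

[Why start via an exception]

(a) First, what does throwing an exception trigger?

When a function throws an exception, the exception is passed up through each calling function in turn until it is caught. If it is never handled, the program eventually crashes.

(b) Second, what happens to the application's stack while the exception propagates?

This involves the function execution model. A program consists of functions (assembly programs aside). Applications written in high-level languages such as C, C++, and Java have their own stack space at run time (a last-in, first-out memory region) that stores function return addresses and temporary data. Each function call pushes the return address and related data onto the stack. When the function finishes, they are popped off, and the CPU uses the return address to continue with the instruction after the call in the calling function.

So, if a thrown exception is not caught in the current function, that function exits abnormally, its frame is popped off the stack, and the exception is passed to the caller, and so on until the exception is caught and handled. Otherwise the program crashes.

Launching via an exception therefore serves mainly to clear the stack frames above ZygoteInit.main, so that when the target main() function exits, the whole application can exit directly. When that main() returns, control goes back to MethodAndArgsCaller.run, which returns straight to ZygoteInit.main. ZygoteInit.main has nothing further to do and returns as well, so the whole application exits completely.

(4) It calls runSelectLoopMode() (listed as runSelectLoop() in section 2.2) to enter an infinite loop that waits on the socket created earlier for requests from ActivityManagerService and others.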
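The stack-clearing effect of the exception-based launch in step (3) can be observed with a small stand-alone experiment. The sketch below is not Android code: TrampolineDemo and its Caller exception are hypothetical stand-ins for ZygoteInit and MethodAndArgsCaller, and the nested setup() calls merely imitate the setup frames that pile up before the target is launched.

```java
class TrampolineDemo {
    // Stand-in for MethodAndArgsCaller: carries the work to run after the stack unwinds.
    static class Caller extends Exception implements Runnable {
        private final Runnable target;

        Caller(Runnable target) {
            this.target = target;
        }

        @Override
        public void run() {
            target.run();
        }
    }

    static int depthAtThrow;  // stack depth at the point where the exception is thrown
    static int depthAtRun;    // stack depth at the point where the target finally runs

    // Number of frames currently on this thread's stack.
    static int depth() {
        return new Throwable().getStackTrace().length;
    }

    // Imitates nested setup calls before the launch.
    static void setup(int level) throws Caller {
        if (level > 0) {
            setup(level - 1);
            return;
        }
        depthAtThrow = depth();
        throw new Caller(() -> depthAtRun = depth());
    }

    // Plays the role of ZygoteInit.main(): catches the exception and runs the target.
    static void launch() {
        try {
            setup(20);
        } catch (Caller caller) {
            caller.run();  // the setup frames have been unwound by now
        }
    }

    public static void main(String[] args) {
        launch();
        System.out.println("depth at throw: " + depthAtThrow);
        System.out.println("depth at run:   " + depthAtRun);
    }
}
```

The depth recorded when the target runs is much smaller than the depth recorded at the throw site, which is the same effect the article attributes to MethodAndArgsCaller: the target starts from a nearly empty stack directly under the top-level main.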

2.1 startSystemServer()

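As described in step (3), startSystemServer() does not call the main() function of the Java class inside its own body; it relies on an exception that is handled outside startSystemServer().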

private static boolean startSystemServer()
        throws MethodAndArgsCaller, RuntimeException {

    . . . . . .
    /* Hardcoded command line to start the system server */
    String args[] = {
        "--setuid=1000",
        "--setgid=1000",
        "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1032,"
                + "3001,3002,3003,3006,3007",
        "--capabilities=" + capabilities + "," + capabilities,
        "--runtime-init",
        "--nice-name=system_server",
        "com.android.server.SystemServer",
    };
    ZygoteConnection.Arguments parsedArgs = null;
    int pid;
    try {
        parsedArgs = new ZygoteConnection.Arguments(args);
        ZygoteConnection.applyDebuggerSystemProperty(parsedArgs);
        ZygoteConnection.applyInvokeWithSystemProperty(parsedArgs);

        // Fork the process that will host the system services
        pid = Zygote.forkSystemServer(parsedArgs.uid, parsedArgs.gid,
                                            parsedArgs.gids, parsedArgs.debugFlags, null,
                                            parsedArgs.permittedCapabilities,
                                            parsedArgs.effectiveCapabilities);
    } catch (IllegalArgumentException ex) {
        throw new RuntimeException(ex);
    }

    // In the newly forked system process, run handleSystemServerProcess()
    if (pid == 0) {
        handleSystemServerProcess(parsedArgs);
    }
    return true;
}

Here:

(1) Zygote.forkSystemServer() calls the Linux fork function through JNI;

(2) In the newly forked child process, startSystemServer() calls handleSystemServerProcess(), which throws a MethodAndArgsCaller exception. caller.run() then starts the main method of com.android.server.SystemServer.
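To show how a hard-coded argument list like the one above could be turned into structured fields, here is a simplified sketch. SpawnArgs is a hypothetical class written for this illustration and is not the real ZygoteConnection.Arguments; it handles only the options that appear in the listing and ignores the rest.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for an argument parser; not the Android implementation.
class SpawnArgs {
    int uid = -1;
    int gid = -1;
    int[] gids = new int[0];
    String niceName;
    boolean runtimeInit;
    final List<String> remainingArgs = new ArrayList<>();

    // Returns the text after the first '=' of an option such as "--setuid=1000".
    private static String value(String arg) {
        return arg.substring(arg.indexOf('=') + 1);
    }

    static SpawnArgs parse(String[] args) {
        SpawnArgs r = new SpawnArgs();
        int i = 0;
        for (; i < args.length; i++) {
            String a = args[i];
            if (a.startsWith("--setuid=")) {
                r.uid = Integer.parseInt(value(a));
            } else if (a.startsWith("--setgid=")) {
                r.gid = Integer.parseInt(value(a));
            } else if (a.startsWith("--setgroups=")) {
                String[] parts = value(a).split(",");
                r.gids = new int[parts.length];
                for (int j = 0; j < parts.length; j++) {
                    r.gids[j] = Integer.parseInt(parts[j].trim());
                }
            } else if (a.startsWith("--nice-name=")) {
                r.niceName = value(a);
            } else if (a.equals("--runtime-init")) {
                r.runtimeInit = true;
            } else if (a.startsWith("--")) {
                // Other options (for example --capabilities) are ignored in this sketch.
            } else {
                break; // first non-option argument: the class to start
            }
        }
        // Everything from the class name onward is kept for the new process.
        for (; i < args.length; i++) {
            r.remainingArgs.add(args[i]);
        }
        return r;
    }

    public static void main(String[] argv) {
        SpawnArgs r = parse(new String[] {
            "--setuid=1000", "--setgid=1000", "--setgroups=1001,1002,3001",
            "--runtime-init", "--nice-name=system_server",
            "com.android.server.SystemServer"});
        // prints: 1000 system_server [com.android.server.SystemServer]
        System.out.println(r.uid + " " + r.niceName + " " + r.remainingArgs);
    }
}
```

The parsed uid, gid, and group list correspond to the values passed to the fork call, while the remaining arguments name the class whose main() is eventually run.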

See also: SystemServer

2.2 runSelectLoop()

private static void runSelectLoop() throws MethodAndArgsCaller {

    ArrayList<FileDescriptor> fds = new ArrayList<FileDescriptor>();
    ArrayList<ZygoteConnection> peers = new ArrayList<ZygoteConnection>();
    FileDescriptor[] fdArray = new FileDescriptor[4];

    fds.add(sServerSocket.getFileDescriptor());
    peers.add(null);

    int loopCount = GC_LOOP_COUNT;
    while (true) {
        int index;

        if (loopCount <= 0) {
            gc();
            loopCount = GC_LOOP_COUNT;
        } else {
            loopCount--;
        }

        try {
            fdArray = fds.toArray(fdArray);
            index = selectReadable(fdArray);
        } catch (IOException ex) {
            throw new RuntimeException("Error in select()", ex);
        }

        if (index < 0) {
            throw new RuntimeException("Error in select()");
        } else if (index == 0) {
            ZygoteConnection newPeer = acceptCommandPeer();
            peers.add(newPeer);
            fds.add(newPeer.getFileDesciptor());
        } else {
            boolean done;
            done = peers.get(index).runOnce();
            if (done) {
                peers.remove(index);
                fds.remove(index);
            }
        }
    }
}

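2.2.1 selectReadable() is called repeatedly inside a while loop.

selectReadable() is a native function that essentially just calls select(). In Linux socket programming, select() monitors a set of file descriptors for changes. Here it waits until one of the monitored descriptors becomes readable, for example when a client connects, and then returns.
Return values:
<0: an internal error occurred.
=0: the server socket itself is readable, which means a client is connecting for the first time. The server calls accept to establish the connection, and the client is represented in zygote by a ZygoteConnection object.
>0: a client that is already connected has started sending data. The value is the index of that client; peers.get(index) returns its ZygoteConnection object, whose runOnce() function is then called to handle the request.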
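The bookkeeping behind these return values, two lists that are kept aligned by index with slot 0 reserved for the server socket, can be modelled without any real sockets. The sketch below is a pure in-memory simulation: SelectLoopModel, its string stand-ins for descriptors and connections, and the done flag passed to dispatch() are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory model of the index bookkeeping; no real sockets or select() are involved.
class SelectLoopModel {
    final List<String> fds = new ArrayList<>();    // stand-ins for file descriptors
    final List<String> peers = new ArrayList<>();  // stand-ins for ZygoteConnection objects
    final List<String> log = new ArrayList<>();    // records what the loop did
    private int nextClient = 1;

    SelectLoopModel() {
        fds.add("server-socket");
        peers.add(null);  // placeholder so that peers and fds share the same indices
    }

    // Handles one index as if it had been returned by select().
    // 'done' says whether the peer finished and should be dropped.
    void dispatch(int index, boolean done) {
        if (index < 0) {
            throw new RuntimeException("Error in select()");
        } else if (index == 0) {
            // Index 0 is the server socket: accept a new client.
            String peer = "client-" + nextClient++;
            peers.add(peer);
            fds.add("fd-of-" + peer);
            log.add("accept " + peer);
        } else {
            // Any other index is an existing client with data to read.
            log.add("runOnce " + peers.get(index));
            if (done) {
                peers.remove(index);
                fds.remove(index);
            }
        }
    }

    public static void main(String[] args) {
        SelectLoopModel loop = new SelectLoopModel();
        loop.dispatch(0, false);  // first client connects
        loop.dispatch(0, false);  // second client connects
        loop.dispatch(1, true);   // first client sends a request and is finished
        loop.dispatch(1, false);  // index 1 now refers to the second client
        // prints: [accept client-1, accept client-2, runOnce client-1, runOnce client-2]
        System.out.println(loop.log);
    }
}
```

Removing a finished peer from both lists at the same index is what keeps the index returned by the next select() call pointing at the right connection.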

2.2.2 runOnce()

boolean runOnce( ) {
    Arguments parsedArgs = null;
    FileDescriptor[] descriptors;

    // Reads one start command from the command socket.
    args = readArgumentList();
    descriptors = mSocket.getAncillaryFileDescriptors();

    // Forks a new VM instance / process.
    // nativeFork is called through JNI
    pid = Zygote.forkAndSpecialize(parsedArgs.uid, parsedArgs.gid,
            parsedArgs.gids, parsedArgs.debugFlags, rlimits);

    // fork returns twice: once in the child and once in the parent
    if (pid == 0) {
        // in child
        serverPipeFd = null;
        handleChildProc(parsedArgs, descriptors, childPipeFd, newStderr);

        // should never get here, the child is expected to either
        return true;
    } else {
        // in parent...pid of < 0 means failure
        childPipeFd = null;
        return handleParentProc(pid, descriptors, serverPipeFd, parsedArgs);
    }
}

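The code above shows where execution continues after the process is created:
Child process: handleChildProc
Parent process: handleParentProc
We are interested in what the child process does, so we continue into handleChildProc.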

// Handles post-fork setup of child proc
private void handleChildProc(Arguments parsedArgs,...) {
    ……
    if (parsedArgs.runtimeInit) {
        if (parsedArgs.invokeWith != null) {
            // Run the process through a system call
            WrapperInit.execApplication(parsedArgs.invokeWith,
                    parsedArgs.niceName, parsedArgs.targetSdkVersion,
                    pipeFd, parsedArgs.remainingArgs);
        } else {
            // Find the main() function of the target class and run it
            RuntimeInit.zygoteInit(parsedArgs.targetSdkVersion,
                    parsedArgs.remainingArgs);
        }
    }
    ……
}

The child process can therefore run in one of two ways:
    WrapperInit.execApplication and RuntimeInit.zygoteInit

1) Running the process through a system call, WrapperInit.execApplication:

public static void execApplication(……) {
    ……
    Zygote.execShell(command.toString());
}

public static void execShell(String command) {
    // using the exec() system call
    nativeExecShell(command);
}

2) Finding the main() function of the target class and running it, RuntimeInit.zygoteInit:

// The main function called when started through the zygote process.
public static final void zygoteInit( ) {
    zygoteInitNative();
    applicationInit(targetSdkVersion, argv);
}

private static void applicationInit( ) {
    // Remaining arguments are passed to the start class's static main
    invokeStaticMain(args.startClass, args.startArgs);
}

RuntimeInit locates the main function of startClass in this way, and the new process's entry point is then started by means of an exception.

static void invokeStaticMain(ClassLoader loader,
        String className, String[] argv)
        throws ZygoteInit.MethodAndArgsCaller {
    ....
    /*
     * This throw gets caught in ZygoteInit.main(), which responds
     * by invoking the exception's run() method. This arrangement
     * clears up all the stack frames that were required in setting
     * up the process.
     */
    throw new ZygoteInit.MethodAndArgsCaller(m, argv);
}
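The part elided before the throw has to find the target's static main method. A minimal sketch of such a lookup with standard Java reflection is shown below. StaticMainLookup and its nested Target class are hypothetical, and unlike the real code this sketch invokes the method immediately instead of wrapping it in an exception for later execution.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class StaticMainLookup {
    static String lastGreeting;  // written by Target.main so that the call can be observed

    // Hypothetical target class, standing in for an application's entry class.
    static class Target {
        public static void main(String[] argv) {
            lastGreeting = "hello " + argv[0];
        }
    }

    // Looks up "public static void main(String[])" on the named class and calls it.
    static void invokeStaticMain(String className, String[] argv) {
        try {
            Class<?> cl = Class.forName(className);
            Method m = cl.getMethod("main", String[].class);
            int modifiers = m.getModifiers();
            if (!(Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
                throw new RuntimeException("Main method is not public and static on " + className);
            }
            // The cast passes the array as a single argument rather than as varargs.
            m.invoke(null, (Object) argv);
        } catch (ReflectiveOperationException ex) {
            throw new RuntimeException("Cannot start " + className, ex);
        }
    }

    public static void main(String[] args) {
        invokeStaticMain(Target.class.getName(), new String[] {"zygote"});
        // prints: hello zygote
        System.out.println(lastGreeting);
    }
}
```

Deferring the invoke call by carrying the Method and its arguments inside an exception, as the listing above does, gives the same result but runs the target only after the setup frames have been unwound.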
