A Brief Introduction to the Zygote Process, with Source Code Analysis

Posted 2022-11-05 wodongx123


Contents

1. Introduction to Zygote
2. Source code walkthrough of forking processes
Appendix: sequence diagram source
References

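1. Introduction to Zygote

Android is built on Linux, so when the device boots, the first user-space process to start is init, and every later process is a descendant of init. zygote is one of these: init starts it after parsing the init.rc file.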

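The Zygote process has two main jobs:

1. Start the SystemServer process. SystemServer is the process that starts the device's system services; the familiar PMS (PackageManagerService), AMS (ActivityManagerService) and others are all started by SystemServer.
2. Hatch app processes on demand while the system is running. Every time we tap an app icon to launch an app, zygote goes to work.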

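About hatching:
Zygote is said to hatch (or spawn) other processes rather than create them. How should we understand that word?

1. Programs on Android run on top of a virtual machine. If every app launch had to bring up a fresh virtual machine, launching would be far too slow.
2. To avoid this, zygote preloads the classes and resources the virtual machine needs when zygote itself starts. Apps created later share them directly, so the virtual machine is not started over and over.
3. When zygote creates a process, it ends up calling the Linux fork system call to duplicate itself into a child process, which is how the sharing described above is achieved.
4. zygote's role is to produce other processes; it does not do application work itself.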

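About fork:
fork is a key concept for zygote. It is a system call provided by the Linux kernel that duplicates the calling process into a child process. What does duplicate mean here? Put simply, the child receives a copy of everything, from variables and the memory space to the instruction currently being executed. (In practice Linux copies memory pages lazily, on first write, so unmodified pages stay physically shared between parent and child.)

After process A calls fork, both A and the duplicated child B receive a return value from fork and continue executing from that point.
Return value: a successful call returns twice, 0 in the child and the child's process ID in the parent; on failure it returns -1 in the caller and no child is created.

If process A has allocated a block of memory and holds a reference to it, then after fork the child B holds the same reference, pointing at its own copy of that memory.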

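2. Source code walkthrough of forking processes

2.1 Starting the Zygote process in the native layer

I don't do much C/C++ development, so I'll skim this part. In short, after the init process reads init.rc, the instructions in that file lead to the following:

1. The virtual machine is started.
2. JNI (Java Native Interface, i.e. the native methods we often see) is registered.
3. Zygote's Java code is entered: the native code looks up the ZygoteInit class through JNI, much like reflection, and calls its main method.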

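2.2 Starting SystemServer

As before, I'll start with my sequence diagram, which summarizes the startup flow (its PlantUML source is in the appendix).

Forking the SystemServer process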

public class ZygoteInit {

    public static void main(String argv[]) {
        // Unrelated code omitted

        if (startSystemServer) {
            Runnable r = forkSystemServer(abiList, zygoteSocketName, zygoteServer);

            // A non-null return value means we are in the child process,
            // so run the main method the child located
            if (r != null) {
                r.run();
                return;
            }
        }
    }

    private static Runnable forkSystemServer(String abiList, String socketName,
            ZygoteServer zygoteServer) {
        // Unrelated code omitted

        int pid;
        /* Fork the SystemServer process */
        pid = Zygote.forkSystemServer(
                parsedArgs.mUid, parsedArgs.mGid,
                parsedArgs.mGids,
                parsedArgs.mRuntimeFlags,
                null,
                parsedArgs.mPermittedCapabilities,
                parsedArgs.mEffectiveCapabilities);

        // 0 means we are in the child, i.e. the SystemServer process
        if (pid == 0) {
            if (hasSecondZygote(abiList)) {
                waitForSecondaryZygote(socketName);
            }

            zygoteServer.closeServerSocket();
            return handleSystemServerProcess(parsedArgs);
        }
        // Non-zero means we are in the parent, i.e. the Zygote process
        return null;
    }
}

public class Zygote {
    static int forkSystemServer(int uid, int gid, int[] gids, int runtimeFlags,
            int[][] rlimits, long permittedCapabilities, long effectiveCapabilities) {
        ZygoteHooks.preFork();

        int pid = nativeForkSystemServer(
                uid, gid, gids, runtimeFlags, rlimits,
                permittedCapabilities, effectiveCapabilities);

        // Set the Java Language thread priority to the default value for new apps.
        Thread.currentThread().setPriority(Thread.NORM_PRIORITY);

        ZygoteHooks.postForkCommon();
        return pid;
    }

    /** Ultimately calls the kernel's fork to duplicate the process */
    private static native int nativeForkSystemServer(int uid, int gid, int[] gids, int runtimeFlags,
            int[][] rlimits, long permittedCapabilities, long effectiveCapabilities);
}

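This shows the basic way fork is handled: the only difference between the new process and its parent is the value fork returns, so the code uses that return value to tell parent from child and to pick the logic to run next.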
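The branch-on-return-value pattern can be sketched in plain Java. Java has no fork of its own, so this sketch does not fork anything: the pid is passed in as a parameter. The class name ForkDispatchSketch and the method afterFork are hypothetical names for illustration and are not part of AOSP.

```java
// Hypothetical sketch: models only the "branch on fork's return value" pattern.
// Nothing here actually forks; the pid is supplied by the caller.
class ForkDispatchSketch {

    /**
     * Mirrors the shape of ZygoteInit.forkSystemServer: returns a Runnable in the
     * "child" (pid == 0) and null in the "parent" (pid > 0).
     */
    static Runnable afterFork(int pid, Runnable childWork) {
        if (pid < 0) {
            // fork reports failure with -1; no child exists in that case
            throw new IllegalStateException("fork failed");
        }
        if (pid == 0) {
            // Child: hand the work back so the caller can run it
            return childWork;
        }
        // Parent: nothing to run, it carries on with its own logic
        return null;
    }

    public static void main(String[] args) {
        Runnable work = () -> System.out.println("running child work");
        Runnable inChild = afterFork(0, work);
        Runnable inParent = afterFork(1234, work);
        if (inChild != null) {
            inChild.run();
        }
        System.out.println("parent got null: " + (inParent == null));
    }
}
```

Returning a Runnable instead of running the work directly lets the caller unwind its own stack frames before the child's real entry point starts.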

How the SystemServer process finds its main method

We pick up from handleSystemServerProcess.

public class ZygoteInit {
    private static Runnable handleSystemServerProcess(ZygoteArguments parsedArgs) {
        // Unrelated code omitted
        return ZygoteInit.zygoteInit(parsedArgs.mTargetSdkVersion,
                parsedArgs.mDisabledCompatChanges,
                parsedArgs.mRemainingArgs, cl);
    }

    public static final Runnable zygoteInit(int targetSdkVersion, long[] disabledCompatChanges,
            String[] argv, ClassLoader classLoader) {
        // Unrelated code omitted
        RuntimeInit.commonInit();
        ZygoteInit.nativeZygoteInit();
        return RuntimeInit.applicationInit(targetSdkVersion, disabledCompatChanges, argv,
                classLoader);
    }
}

public class RuntimeInit {

    protected static Runnable applicationInit(int targetSdkVersion, long[] disabledCompatChanges,
            String[] argv, ClassLoader classLoader) {
        // Unrelated code omitted
        return findStaticMain(args.startClass, args.startArgs, classLoader);
    }

    /**
     * Finds the static main method of className.
     * This is just a reflective lookup of that method on the class.
     */
    protected static Runnable findStaticMain(String className, String[] argv,
            ClassLoader classLoader) {
        Class<?> cl;

        try {
            cl = Class.forName(className, true, classLoader);
        } catch (ClassNotFoundException ex) {
            throw new RuntimeException(
                    "Missing class when invoking static main " + className,
                    ex);
        }

        Method m;
        try {
            m = cl.getMethod("main", new Class[] { String[].class });
        } catch (NoSuchMethodException ex) {
            throw new RuntimeException(
                    "Missing static main on " + className, ex);
        } catch (SecurityException ex) {
            throw new RuntimeException(
                    "Problem getting static main on " + className, ex);
        }

        int modifiers = m.getModifiers();
        if (! (Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
            throw new RuntimeException(
                    "Main method is not public and static on " + className);
        }

        // The Runnable returned here is eventually run in ZygoteInit.main.
        return new MethodAndArgsCaller(m, argv);
    }

    /** The Runnable wrapper defined in RuntimeInit; run() simply invokes the method that was found */
    static class MethodAndArgsCaller implements Runnable {
        /** method to call */
        private final Method mMethod;

        /** argument array */
        private final String[] mArgs;

        public MethodAndArgsCaller(Method method, String[] args) {
            mMethod = method;
            mArgs = args;
        }

        public void run() {
            try {
                mMethod.invoke(null, new Object[] { mArgs });
            } catch (IllegalAccessException ex) {
                throw new RuntimeException(ex);
            } catch (InvocationTargetException ex) {
                Throwable cause = ex.getCause();
                if (cause instanceof RuntimeException) {
                    throw (RuntimeException) cause;
                } else if (cause instanceof Error) {
                    throw (Error) cause;
                }
                throw new RuntimeException(ex);
            }
        }
    }
}

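It is easy to see from this code that what the SystemServer process does is find the static main method of the SystemServer.java class through reflection and run it, after which ZygoteInit's main method returns.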
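The same lookup can be tried outside Android. The following is a minimal, self-contained sketch of the idea, not the AOSP code: StaticMainFinderSketch and DemoEntry are hypothetical names, and DemoEntry merely stands in for SystemServer.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical sketch of "find a static main by reflection and wrap it in a Runnable".
class StaticMainFinderSketch {

    /** Stand-in target class; it plays the role SystemServer plays in the article. */
    public static class DemoEntry {
        static String lastArg = null;

        public static void main(String[] args) {
            lastArg = args.length > 0 ? args[0] : "";
        }
    }

    static Runnable findStaticMain(String className, String[] argv, ClassLoader loader) {
        final Method m;
        try {
            Class<?> cl = Class.forName(className, true, loader);
            m = cl.getMethod("main", String[].class);
        } catch (ClassNotFoundException | NoSuchMethodException ex) {
            throw new RuntimeException("Cannot find static main on " + className, ex);
        }
        int modifiers = m.getModifiers();
        if (!(Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
            throw new RuntimeException("Main method is not public and static on " + className);
        }
        // Nothing is invoked yet; the caller decides when to run it
        return () -> {
            try {
                // Cast to Object so the array is passed as the single String[] argument
                m.invoke(null, (Object) argv);
            } catch (IllegalAccessException | InvocationTargetException ex) {
                throw new RuntimeException(ex);
            }
        };
    }

    public static void main(String[] args) {
        Runnable r = findStaticMain(DemoEntry.class.getName(), new String[] {"hello"},
                StaticMainFinderSketch.class.getClassLoader());
        r.run();
        System.out.println(DemoEntry.lastArg); // prints "hello"
    }
}
```

The cast to Object matters: without it, the String[] would be spread across invoke's varargs parameter and the call would fail with a wrong number of arguments.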

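2.3 Starting an app

Starting an app relies mainly on the ZygoteServer class. Again, the sequence diagram comes first.

It looks a little messy, mainly because everything is wrapped in an infinite loop.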

class ZygoteInit {
    public static void main(String argv[]) {
        // Unrelated code omitted
        ZygoteServer zygoteServer = null;
        Runnable caller;
        try {
            zygoteServer = new ZygoteServer(isPrimaryZygote);
            // In a forked child process, runSelectLoop returns a caller.
            // In the Zygote process itself, runSelectLoop loops forever.
            caller = zygoteServer.runSelectLoop(abiList);
        } catch (Throwable ex) {
            Log.e(TAG, "System zygote died with exception", ex);
            throw ex;
        } finally {
            if (zygoteServer != null) {
                zygoteServer.closeServerSocket();
            }
        }
        // Run the Runnable that was returned in the child process
        if (caller != null) {
            caller.run();
        }
    }
}

class ZygoteServer {

    Runnable runSelectLoop(String abiList) {
        // Unrelated code omitted

        // Put the server socket's FD at the head of the list
        ArrayList<FileDescriptor> socketFDs = new ArrayList<FileDescriptor>();
        socketFDs.add(mZygoteSocket.getFileDescriptor());

        while (true) {
            StructPollfd[] pollFDs = null;
            pollFDs = new StructPollfd[socketFDs.size()];

            int pollIndex = 0;
            for (FileDescriptor socketFD : socketFDs) {
                pollFDs[pollIndex] = new StructPollfd();
                pollFDs[pollIndex].fd = socketFD;
                pollFDs[pollIndex].events = (short) POLLIN;
                ++pollIndex;
            }

            try {
                // Wait for a POLLIN event on the file descriptors;
                // execution blocks here until one arrives
                Os.poll(pollFDs, -1);
            } catch (ErrnoException ex) {
                throw new RuntimeException("poll failed", ex);
            }

            while (--pollIndex >= 0) {

                // Skip this file descriptor if no readable event actually occurred on it
                if ((pollFDs[pollIndex].revents & POLLIN) == 0) {
                    continue;
                }

                if (pollIndex == 0) {
                    // Accept a new connection
                    ZygoteConnection newPeer = acceptCommandPeer(abiList);
                    peers.add(newPeer);
                    socketFDs.add(newPeer.getFileDescriptor());
                } else if (pollIndex < usapPoolEventFDIndex) {
                    try {
                        // Get the Zygote connection
                        ZygoteConnection connection = peers.get(pollIndex);
                        // Get a command, which in practice is the child's static main method
                        final Runnable command = connection.processOneCommand(this);

                        // Member field; after the fork, the child sets it to true
                        if (mIsForkChild) {
                            if (command == null) {
                                throw new IllegalStateException("command == null");
                            }

                            return command;
                        } else {
                            // In the parent (Zygote) process, drop the connection once the peer has closed it
                            if (connection.isClosedByPeer()) {
                                connection.closeSocket();
                                peers.remove(pollIndex);
                                socketFDs.remove(pollIndex);
                            }
                        }
                    } catch (Exception e) {
                        // Exception handling omitted
                    }
                }
            }
        }
    }
}

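This code looks a lot like Looper.loop: both block at one point and only continue processing once something arrives.

File descriptors (FileDescriptor)

FileDescriptor, or FD, shows up throughout runSelectLoop. It is a file descriptor. Here is a condensed version of the encyclopedia entry:

For every running process, the kernel keeps a table (a contiguous block of storage) that records information about each file the process opens, including where the target file is and what operations the application layer performs on it.

Each time a file is opened, we get back that file's index in the table. This index is the file descriptor, and with it we can use Linux system APIs to operate on, or monitor, a file that has been opened and not yet closed.

For more detail, see the encyclopedia entry: https://baike.baidu.com/item/%E6%96%87%E4%BB%B6%E6%8F%8F%E8%BF%B0%E7%AC%A6/9809582?fr=aladdin

Os.poll

Os.poll is the method that makes runSelectLoop block. Underneath, it calls the Linux poll system call, which does the following:

It waits for an event on a set of file descriptors. Which events it waits for depends on how each entry is configured, so let's look at the StructPollfd object.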

public final class StructPollfd {
    /** The file descriptor to poll. */
    public FileDescriptor fd;

    /**
     * The events we're interested in. POLLIN corresponds to being in select(2)'s read fd set,
     * POLLOUT to the write fd set.
     */
    public short events;

    /** The events that actually happened. */
    public short revents;
}

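Here fd is the file descriptor, events is the set of events we are interested in, and revents is the set of events that actually occurred.

In our scenario, the event we wait for is POLLIN, the readable event.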
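Both events and revents are bit masks, which is why runSelectLoop tests revents with a bitwise AND rather than an equality check. Here is a small sketch in plain Java. The class PollEventsSketch is hypothetical, and the constants are the Linux poll(2) bit values written out by hand as an assumption for this demo; on Android the real values come from android.system.OsConstants.

```java
// Hypothetical sketch: how a poll-style revents bit mask is tested.
class PollEventsSketch {
    // Linux poll(2) event bits, hard-coded here for illustration only
    static final short POLLIN = 0x0001;
    static final short POLLOUT = 0x0004;

    /** True if the readable bit is set, whatever other bits are also set. */
    static boolean isReadable(short revents) {
        return (revents & POLLIN) != 0;
    }

    public static void main(String[] args) {
        short both = (short) (POLLIN | POLLOUT);
        System.out.println(isReadable(both));      // true
        System.out.println(isReadable(POLLOUT));   // false
        System.out.println(isReadable((short) 0)); // false
    }
}
```

An equality check such as revents == POLLIN would miss the case where the descriptor is readable and writable at the same time.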

Creating a connection

The if (pollIndex == 0) branch is where a connection gets created. The work is done mainly by acceptCommandPeer, so we start there.

class ZygoteServer {
    private ZygoteConnection acceptCommandPeer(String abiList) {
        try {
            // mZygoteSocket is a LocalServerSocket
            return createNewConnection(mZygoteSocket.accept(), abiList);
        } catch (IOException ex) {
            throw new RuntimeException(
                    "IOException during accept()", ex);
        }
    }

    protected ZygoteConnection createNewConnection(LocalSocket socket, String abiList)
            throws IOException {
        return new ZygoteConnection(socket, abiList);
    }
}

public class LocalServerSocket implements Closeable {

    private final LocalSocketImpl impl;

    public LocalSocket accept() throws IOException {
        LocalSocketImpl acceptedImpl = new LocalSocketImpl();

        impl.accept(acceptedImpl);

        return LocalSocket.createLocalSocketForAccept(acceptedImpl);
    }
}

class LocalSocketImpl {

    protected void accept(LocalSocketImpl s) throws IOException {
        if (fd == null) {
            throw new IOException("socket not created");
        }

        try {
            // The key line is this call to Os.accept
            s.fd = Os.accept(fd, null /* address */);
            s.mFdCreatedInternally = true;
        } catch (ErrnoException e) {
            throw e.rethrowAsIOException();
        }
    }
}

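The core of it all is the call to Os.accept. Put simply, Os.accept takes a pending connection request from the listening socket whose FD is passed in, creates a new connected socket for it, and returns that new socket's FD.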
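The "listening socket stays, accept hands back a new socket" behaviour can be observed with the standard Java socket API. This is only an analogue: it uses java.net.ServerSocket over TCP loopback rather than Android's LocalServerSocket, and AcceptSketch and roundTrip are hypothetical names.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch: accept() yields a new per-connection socket.
class AcceptSketch {

    /** Connects one client to a listener and returns the byte the accepted side read. */
    static int roundTrip(int payload) throws IOException {
        // Port 0 asks the OS for any free port; a backlog of 1 is enough for this demo
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort());
             // accept() returns a NEW socket for this one connection;
             // the listener itself stays open for further clients
             Socket peer = listener.accept()) {
            client.getOutputStream().write(payload);
            client.getOutputStream().flush();
            return peer.getInputStream().read();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip(42)); // prints 42
    }
}
```

In the Zygote code the socket returned by accept is the one whose FD gets appended to socketFDs, while the listening socket keeps its place at index 0.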

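The flow for starting a new process

Putting the pieces together, we can summarize what this infinite loop does overall:

1. At the start, we listen for POLLIN on the FD of mZygoteSocket (at this point the socketFDs list contains only mZygoteSocket's FD).
2. Another process A calls ZygoteProcess.connect to create a connection and send a message (this calls Os.socket to create the socket).
3. Os.poll returns. The first time, this always hits the if (pollIndex == 0) branch, which creates a new Zygote connection (this calls Os.accept).
4. Within the ZygoteProcess.connect flow, process A calls LocalSocketImpl.connectLocal to establish the connection.
5. On the second round of listening for POLLIN, socketFDs holds both mZygoteSocket and the connection we just created.
6. This time the else if (pollIndex < usapPoolEventFDIndex) branch fires: a new process is forked, its main method is found and run, and the child process has been created.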
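The bookkeeping in these steps can be modelled without any real sockets. The following toy sketch uses strings in place of file descriptors, and it assumes the peer closes its connection after one command, so the entry is always removed. SelectLoopBookkeepingSketch and onReadable are hypothetical names, not AOSP APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of the socketFDs list in runSelectLoop; no real I/O happens.
class SelectLoopBookkeepingSketch {
    // Index 0 is always the listening socket; later entries are accepted connections
    final List<String> fds = new ArrayList<>();
    private int nextConnectionId = 1;

    SelectLoopBookkeepingSketch() {
        fds.add("listening-socket");
    }

    /** Handles "the fd at this index became readable" and reports what the loop would do. */
    String onReadable(int index) {
        if (index == 0) {
            // Readable listening socket: a client is waiting, so accept it
            String conn = "connection-" + nextConnectionId++;
            fds.add(conn);
            return "accepted " + conn;
        }
        // Readable connection: a fork command arrived; assume the peer then closes
        String conn = fds.remove(index);
        return "fork requested on " + conn;
    }

    public static void main(String[] args) {
        SelectLoopBookkeepingSketch loop = new SelectLoopBookkeepingSketch();
        System.out.println(loop.onReadable(0)); // accepted connection-1
        System.out.println(loop.onReadable(1)); // fork requested on connection-1
        System.out.println(loop.fds);           // [listening-socket]
    }
}
```

Keeping the listening socket at index 0 is what lets the real loop tell "new client" apart from "command on an existing connection" with a simple index comparison.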

Appendix: sequence diagram source

@startuml

participant ZygoteInit as init
participant Zygote as zygote
participant RuntimeInit as rinit

 
init -> init : main
activate init
init -> init : forkSystemServer
activate init
	init -> zygote : forkSystemServer
	activate zygote
	zygote -> zygote : nativeForkSystemServer\nCalls the kernel to duplicate the process.
	activate zygote
	 zygote --> zygote : return pid\nThe parent gets the child's process ID\nThe child gets 0
	deactivate zygote
	zygote --> init : return pid
deactivate zygote
alt pid != 0
	init --> init :return null
else pid == 0
	init -> init : handleSystemServerProcess\nHandle the SystemServer process
	activate init
		init -> init : zygoteInit
		init -> rinit : applicationInit
		activate rinit
		rinit -> rinit : findStaticMain\nLook up the static main method
		rinit --> init : return mainMethod\nReturn the main method that was found,\ni.e. SystemServer.main
		deactivate rinit
end 

deactivate init

alt return null
	init -> init : This is the zygote process; carry on with the rest of its logic.
else return mainMethod
	init -> init : run MainMethod\nThis is the SystemServer process
	init -> init : return
end

deactivate init
deactivate init

@enduml

References

fork (function), Baidu Baike
https://baike.baidu.com/item/fork/7143171?fr=aladdin
Android source analysis: the Zygote process, Juejin
https://juejin.cn/post/7051507161955827720
zygote, Jianshu
https://www.jianshu.com/p/cbb44fb9d989

Documentation linked in the Android 29 source:

https://man7.org/linux/man-pages/man2/accept.2.html
