iOS Deep Dive into the Internals of Class Loading: How Are Classes Loaded into Memory?

Posted Forever_wj

I. App Launch and dyld Loading

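  • At app launch, before main is called, start in libdyld.dylib runs first. Execution begins in _dyld_start, whose assembly calls dyldbootstrap::start, which then reaches dyld::_main;
  • dyld::_main is dyld's entry point. After the kernel loads dyld, it carries out the launch work in sequence: configuring environment variables, checking whether the shared cache is enabled, initializing the main executable, inserting dynamic libraries, linking the main executable, linking the dynamic libraries, binding weak symbols, running initializers, and locating the program entry point;
  • For the full app launch and dyld loading flow, see my earlier post: iOS之深入解析App启动dyld加载流程的底层原理 (a deep dive into the dyld loading flow at app launch)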

II. How dyld Connects to ObjC

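  • While dyld::_main is running, once dyld reaches the point of linking the main executable, it calls recursiveInitialization recursively;
  • The first run of recursiveInitialization initializes libSystem, following the chain: recursiveInitialization -> doInitialization -> doModInitFunctions -> libSystem_initializer.
  • libSystem's initializer calls libdispatch_init; libdispatch's init calls _os_object_init, which in turn calls _objc_init.
  • _objc_init registers and saves the addresses of map_images, load_images, and unmap_image (through _dyld_objc_notify_register), and this is where class loading begins.
  • From _objc_init, the runtime entry point reached through the system library libSystem, we jump to map_images: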
	/***********************************************************************
	* map_images
	* Process the given images which are being mapped in by dyld.
	* Calls ABI-agnostic code after taking ABI-specific locks.
	*
	* Locking: write-locks runtimeLock
	**********************************************************************/
	void
	map_images(unsigned count, const char * const paths[],
	           const struct mach_header * const mhdrs[]) {
	    mutex_locker_t lock(runtimeLock);
	    return map_images_nolock(count, paths, mhdrs);
	}
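  • As the comment on map_images states, its main job is to process the images that dyld has mapped in (an image here is a loaded binary, such as an executable or a dynamic library).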
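
The registration step can be pictured as the loader holding three function pointers that the runtime hands over once, and calling them back as images come and go. The sketch below is a toy model under stated assumptions, not dyld's code: `ImageNotifier`, `registerCallbacks`, `notifyMapped`, and `eventLog` are hypothetical names, and the callback signatures are simplified to strings instead of mach_header pointers.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified callback signatures.
using MappedFn   = void (*)(const std::vector<std::string>& paths);
using InitFn     = void (*)(const std::string& path);
using UnmappedFn = void (*)(const std::string& path);

// Toy stand-in for the loader side: stores the callbacks, invokes them later.
struct ImageNotifier {
    MappedFn mapped = nullptr;
    InitFn init = nullptr;
    UnmappedFn unmapped = nullptr;

    // Called once by the runtime during its own initialization.
    void registerCallbacks(MappedFn m, InitFn i, UnmappedFn u) {
        assert(m && i && u);  // all three handlers are required in this model
        mapped = m;
        init = i;
        unmapped = u;
    }

    // Called by the loader after a batch of images has been mapped:
    // first the batch-level "mapped" callback, then "init" per image.
    void notifyMapped(const std::vector<std::string>& paths) const {
        if (mapped) mapped(paths);
        if (init) {
            for (const std::string& p : paths) init(p);
        }
    }
};

// Records the order in which callbacks fire, for inspection.
std::vector<std::string>& eventLog() {
    static std::vector<std::string> log;
    return log;
}

void onMapped(const std::vector<std::string>& paths) {
    eventLog().push_back("map:" + std::to_string(paths.size()));
}
void onInit(const std::string& path)     { eventLog().push_back("load:" + path); }
void onUnmapped(const std::string& path) { eventLog().push_back("unmap:" + path); }
```

The point of the design is inversion of control: the runtime never polls for new images, it only reacts when the loader calls back.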

III. map_images Flow Analysis

① map_images_nolock
  • Next, step into the implementation of map_images_nolock. The function is fairly long, so the focus here is on how class information gets loaded:
	void 
	map_images_nolock(unsigned mhCount, const char * const mhPaths[],
	                  const struct mach_header * const mhdrs[])
	{
	    static bool firstTime = YES;
	    header_info *hList[mhCount];
	    uint32_t hCount;
	    size_t selrefCount = 0;
	
	    // Perform first-time initialization if necessary.
	    // This function is called before ordinary library initializers. 
	    // fixme defer initialization until an objc-using image is found?
	    // On the first call, prepare the initialization environment
	    if (firstTime) {
	        preopt_init();
	    }
	
	    if (PrintImages) {
	        _objc_inform("IMAGES: processing %u newly-mapped images...\n", mhCount);
	    }
	
	
	    // Find all images with Objective-C metadata.
	    hCount = 0;
	
	    // Count classes. Size various table based on the total.
	    int totalClasses = 0;
	    int unoptimizedTotalClasses = 0;
	    {
	        uint32_t i = mhCount;
	        while (i--) {
	            const headerType *mhdr = (const headerType *)mhdrs[i];
	
	            auto hi = addHeader(mhdr, mhPaths[i], totalClasses, unoptimizedTotalClasses);
	            if (!hi) {
	                // no objc data in this entry
	                continue;
	            }
	            
	            if (mhdr->filetype == MH_EXECUTE) {
	                // Size some data structures based on main executable's size
	#if __OBJC2__
	                size_t count;
	                _getObjc2SelectorRefs(hi, &count);
	                selrefCount += count;
	                _getObjc2MessageRefs(hi, &count);
	                selrefCount += count;
	#else
	                _getObjcSelectorRefs(hi, &selrefCount);
	#endif
	                
	#if SUPPORT_GC_COMPAT
	                // Halt if this is a GC app.
	                if (shouldRejectGCApp(hi)) {
	                    _objc_fatal_with_reason
	                        (OBJC_EXIT_REASON_GC_NOT_SUPPORTED, 
	                         OS_REASON_FLAG_CONSISTENT_FAILURE, 
	                         "Objective-C garbage collection " 
	                         "is no longer supported.");
	                }
	#endif
	            }
	            
	            hList[hCount++] = hi;
	            
	            if (PrintImages) {
	                _objc_inform("IMAGES: loading image for %s%s%s%s%s\n", 
	                             hi->fname(),
	                             mhdr->filetype == MH_BUNDLE ? " (bundle)" : "",
	                             hi->info()->isReplacement() ? " (replacement)" : "",
	                             hi->info()->hasCategoryClassProperties() ? " (has class properties)" : "",
	                             hi->info()->optimizedByDyld()?" (preoptimized)":"");
	            }
	        }
	    }
	
	    // Perform one-time runtime initialization that must be deferred until 
	    // the executable itself is found. This needs to be done before 
	    // further initialization.
	    // (The executable may not be present in this infoList if the 
	    // executable does not contain Objective-C code but Objective-C 
	    // is dynamically loaded later.
	    // In other words: one-time runtime initialization, deferred until the main executable has been found
	    if (firstTime) {
	        sel_init(selrefCount);
	        arr_init();
	
	#if SUPPORT_GC_COMPAT
	        // Reject any GC images linked to the main executable.
	        // We already rejected the app itself above.
	        // Images loaded after launch will be rejected by dyld.
	
	        for (uint32_t i = 0; i < hCount; i++) {
	            auto hi = hList[i];
	            auto mh = hi->mhdr();
	            if (mh->filetype != MH_EXECUTE  &&  shouldRejectGCImage(mh)) {
	                _objc_fatal_with_reason
	                    (OBJC_EXIT_REASON_GC_NOT_SUPPORTED, 
	                     OS_REASON_FLAG_CONSISTENT_FAILURE, 
	                     "%s requires Objective-C garbage collection "
	                     "which is no longer supported.", hi->fname());
	            }
	        }
	#endif
	
	#if TARGET_OS_OSX
	        // Disable +initialize fork safety if the app is too old (< 10.13).
	        // Disable +initialize fork safety if the app has a
	        //   __DATA,__objc_fork_ok section.
	
	        if (dyld_get_program_sdk_version() < DYLD_MACOSX_VERSION_10_13) {
	            DisableInitializeForkSafety = true;
	            if (PrintInitializing) {
	                _objc_inform("INITIALIZE: disabling +initialize fork "
	                             "safety enforcement because the app is "
	                             "too old (SDK version " SDK_FORMAT ")",
	                             FORMAT_SDK(dyld_get_program_sdk_version()));
	            }
	        }
	
	        for (uint32_t i = 0; i < hCount; i++) {
	            auto hi = hList[i];
	            auto mh = hi->mhdr();
	            if (mh->filetype != MH_EXECUTE) continue;
	            unsigned long size;
	            if (getsectiondata(hi->mhdr(), "__DATA", "__objc_fork_ok", &size)) {
	                DisableInitializeForkSafety = true;
	                if (PrintInitializing) {
	                    _objc_inform("INITIALIZE: disabling +initialize fork "
	                                 "safety enforcement because the app has "
	                                 "a __DATA,__objc_fork_ok section");
	                }
	            }
	            break;  // assume only one MH_EXECUTE image
	        }
	#endif
	
	    }
	    // Read the images
	    if (hCount > 0) {
	        _read_images(hList, hCount, totalClasses, unoptimizedTotalClasses);
	    }
	
	    firstTime = NO;
	    
	    // Call image load funcs after everything is set up.
	    for (auto func : loadImageFuncs) {
	        for (uint32_t i = 0; i < mhCount; i++) {
	            func(mhdrs[i]);
	        }
	    }
	}
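  • Summary of the map_images_nolock flow:
    • Check firstTime; if it is YES, prepare the initialization environment (preopt_init);
    • Count the classes, then size the various tables from the total;
    • Check firstTime again; if it is YES, initialize the tables (sel_init, arr_init);
    • If any image carries Objective-C metadata, call _read_images to read the images, then set firstTime to NO so that later calls skip the one-time setup and go straight to _read_images.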
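
The control flow summarized above can be reduced to a small sketch. It is a simplified model, not the runtime's code: `ImageInfo`, `RuntimeState`, and `mapImagesSketch` are hypothetical names, and the first-time flag lives in a struct here, whereas the real function uses a function-local static.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for header_info: only these two facts matter here.
struct ImageInfo {
    bool hasObjC;    // does the image carry Objective-C metadata?
    int classCount;  // number of classes it declares
};

// Observable state, so the flow can be checked from outside.
struct RuntimeState {
    bool firstTime = true;
    int oneTimeInits = 0;  // how often the one-time setup ran
    int readCalls = 0;     // how often the "read images" step ran
    int totalClasses = 0;  // classes seen across all calls
};

// Mirrors the shape of map_images_nolock: count, one-time setup, read.
void mapImagesSketch(const std::vector<ImageInfo>& images, RuntimeState& state) {
    int objcImages = 0;
    int classes = 0;
    for (const ImageInfo& img : images) {
        if (!img.hasObjC) continue;  // no objc data in this entry
        assert(img.classCount >= 0);
        classes += img.classCount;
        ++objcImages;
    }

    if (state.firstTime) {
        ++state.oneTimeInits;  // stands in for sel_init / arr_init
    }

    if (objcImages > 0) {
        ++state.readCalls;     // stands in for _read_images
        state.totalClasses += classes;
    }

    state.firstTime = false;
}
```

Calling it three times with different batches runs the one-time setup once, while the read step runs only for batches that contain Objective-C images.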
② _read_images
  • One-time setup guarded by a flag
    • The source is as follows:
    if (!doneOnce) {
        doneOnce = YES;
        launchTime = YES;

#if SUPPORT_NONPOINTER_ISA
        // Disable non-pointer isa under some conditions.

# if SUPPORT_INDEXED_ISA
        // Disable nonpointer isa if any image contains old Swift code
        for (EACH_HEADER) {
            if (hi->info()->containsSwift()  &&
                hi->info()->swiftUnstableVersion() < objc_image_info::SwiftVersion3)
            {
                DisableNonpointerIsa = true;
                if (PrintRawIsa) {
                    _objc_inform("RAW ISA: disabling non-pointer isa because "
                                 "the app or a framework contains Swift code "
                                 "older than Swift 3.0");
                }
                break;
            }
        }
# endif

# if TARGET_OS_OSX
        // Disable non-pointer isa if the app is too old
        // (linked before OS X 10.11)
        if (dyld_get_program_sdk_version() < DYLD_MACOSX_VERSION_10_11) {
            DisableNonpointerIsa = true;
            if (PrintRawIsa) {
                _objc_inform("RAW ISA: disabling non-pointer isa because "
                             "the app is too old (SDK version " SDK_FORMAT ")",
                             FORMAT_SDK(dyld_get_program_sdk_version()));
            }
        }

        // Disable non-pointer isa if the app has a __DATA,__objc_rawisa section
        // New apps that load old extensions may need this.
        for (EACH_HEADER) {
            if (hi->mhdr()->filetype != MH_EXECUTE) continue;
            unsigned long size;
            if (getsectiondata(hi->mhdr(), "__DATA", "__objc_rawisa", &size)) {
                DisableNonpointerIsa = true;
                if (PrintRawIsa) {
                    _objc_inform("RAW ISA: disabling non-pointer isa because "
                                 "the app has a __DATA,__objc_rawisa section");
                }
            }
            break;  // assume only one MH_EXECUTE image
        }
# endif

#endif
        // Disable tagged pointers if requested, then initialize the tagged pointer environment
        if (DisableTaggedPointers) {
            disableTaggedPointers();
        }
        
        initializeTaggedPointerObfuscator();

        if (PrintConnecting) {
            _objc_inform("CLASS: found %d classes during launch", totalClasses);
        }

        // namedClasses
        // Preoptimized classes don't go in this table.
        // 4/3 is NXMapTable's load factor
        // Create the hash table gdb_objc_realized_classes
        int namedClassesSize = 
            (isPreoptimized() ? unoptimizedTotalClasses : totalClasses) * 4 / 3;
        gdb_objc_realized_classes =
            NXCreateMapTable(NXStrValueMapPrototype, namedClassesSize);

        ts.log("IMAGE TIMES: first time tasks");
    }
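    • gdb_objc_realized_classes is an NXMapTable, a hash map that is keyed here by class name. Conceptually it is close to Foundation's NSMapTable, which behaves like the familiar NSDictionary (also a hash table) but can additionally hold weak references to its keys and values.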
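
To make the sizing arithmetic concrete, here is a minimal sketch that reproduces the `* 4 / 3` computation and uses a standard hash map as a stand-in for the name table. `ClassStub`, `NameTable`, `insertClass`, and `lookUp` are hypothetical names; `std::unordered_map` is a substitute for illustration, not how NXMapTable is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a class object.
struct ClassStub {
    std::string name;
};

// Same arithmetic as the snippet: 4/3 of the expected class count.
// Preoptimized classes do not go in the table, so they are not counted.
std::size_t namedClassesSize(std::size_t totalClasses,
                             std::size_t unoptimizedTotalClasses,
                             bool preoptimized) {
    std::size_t n = preoptimized ? unoptimizedTotalClasses : totalClasses;
    return n * 4 / 3;
}

using NameTable = std::unordered_map<std::string, const ClassStub*>;

// Creates the table with room reserved up front to avoid early rehashing.
NameTable makeNameTable(std::size_t capacityHint) {
    NameTable table;
    table.reserve(capacityHint);
    return table;
}

// Maps the class name to the class's address.
void insertClass(NameTable& table, const ClassStub* cls) {
    assert(cls != nullptr);
    table[cls->name] = cls;
}

// Returns the class registered under name, or nullptr if there is none.
const ClassStub* lookUp(const NameTable& table, const std::string& name) {
    NameTable::const_iterator it = table.find(name);
    return it == table.end() ? nullptr : it->second;
}
```

Reserving capacity from the class count is the design choice worth noting: the table is filled in one burst at launch, so sizing it once is cheaper than growing it repeatedly.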
  • Fixing up the @selector references left inconsistent by compilation
    • The source is as follows:
	// Fix up @selector references
    static size_t UnfixedSelectors;
    {
        mutex_locker_t lock(selLock);
        for (EACH_HEADER) {
            if (hi->hasPreoptimizedSelectors()) continue;

            bool isBundle = hi->isBundle();
            // _getObjc2SelectorRefs reads the selector reference list from the Mach-O __objc_selrefs section;
            // sel_registerNameNoLock then adds each sel to namedSelectors
            SEL *sels = _getObjc2SelectorRefs(hi, &count);
            UnfixedSelectors += count;
            for (i = 0; i < count; i++) {
                const char *name = sel_cname(sels[i]);
                // A sel is not merely a string; it is a string identified by its address
                SEL sel = sel_registerNameNoLock(name, isBundle);
                if (sels[i] != sel) {
                    sels[i] = sel;
                }
            }
        }
    }

	ts.log("IMAGE TIMES: fix up selector references");
    • Debugging shows the following:

[Screenshot: debugger console output]

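    • The console output shows two selectors with the same name but different addresses. Why? The system contains many frameworks, such as CoreFoundation and CoreMedia, and each image that mentions a selector such as class (the one in the screenshot above) carries its own reference to it. An address inside an image depends on where that image sits: if CoreFoundation's class were at index 0, CoreMedia's would be at 0 plus the size of CoreFoundation, and so on. The fix-up therefore rebases every reference: sel_registerNameNoLock returns the single registered SEL for the name, and any sels[i] that differs from it is overwritten.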
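
The fix-up loop is essentially string interning: equal names from different images are collapsed onto one canonical address. The sketch below illustrates that idea under simplifying assumptions; `Sel`, `SelectorTable`, `registerName`, and `fixUpSelectorRefs` are hypothetical names, and a `std::unordered_set` stands in for the runtime's selector table.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// A "selector" in this model is a pointer to a name string.
using Sel = const std::string*;

// Hypothetical intern table standing in for the runtime's selector table.
class SelectorTable {
public:
    // Returns the canonical pointer for this name, registering it if needed.
    Sel registerName(const std::string& name) {
        assert(!name.empty());
        // insert() returns the existing element when the name is already present.
        return &*names_.insert(name).first;
    }

private:
    // Element addresses in std::unordered_set stay valid across rehashing.
    std::unordered_set<std::string> names_;
};

// Rewrites every per-image reference so it points at the canonical selector.
// Returns how many references had to be changed.
std::size_t fixUpSelectorRefs(std::vector<Sel>& refs, SelectorTable& table) {
    std::size_t changed = 0;
    for (Sel& ref : refs) {
        Sel canonical = table.registerName(*ref);
        if (ref != canonical) {
            ref = canonical;
            ++changed;
        }
    }
    return changed;
}
```

After the fix-up, comparing two selectors is a pointer comparison, which is why the runtime bothers to unique them.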
  • Handling classes: discovery and fix-up of unresolved future classes
    • The source is as follows:
	// Discover classes. Fix up unresolved future classes. Mark bundle classes.
    bool hasDyldRoots = dyld_shared_cache_some_image_overridden();

    for (EACH_HEADER) {
        if (! mustReadClasses(hi, hasDyldRoots)) {
            // Image is sufficiently optimized that we need not call readClass()
            continue;
        }
        // Read every class from the class list in the Mach-O __objc_classlist section
        classref_t const *classlist = _getObjc2ClassList(hi, &count);

        bool headerIsBundle = hi->isBundle();
        bool headerIsPreoptimized = hi->hasPreoptimizedClasses();

        for (i = 0; i < count; i++) {
            Class cls = (Class)classlist[i];
            Class newCls = readClass(cls, headerIsBundle, headerIsPreoptimized);

            if (newCls != cls  &&  newCls) {
                // Class was moved but not deleted. Currently this occurs 
                // only when the new class resolved a future class.
                // Non-lazily realize the class below.
                resolvedFutureClasses = (Class *)
                    realloc(resolvedFutureClasses, 
                            (resolvedFutureClassCount+1) * sizeof(Class));
                resolvedFutureClasses[resolvedFutureClassCount++] = newCls;
            }
        }
    }

	ts.log("IMAGE TIMES: discover classes");
    • Set a breakpoint before and after the readClass call and compare the printed output:

[Screenshot: debugger output before and after readClass]

    • After readClass returns, cls has a class name associated with it; at this point only the class's address and name are stored. Now look at the (abridged) readClass source:
	Class readClass(Class cls, bool headerIsBundle, bool headerIsPreoptimized)
	{
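	    const char *mangledName = cls->mangledName();
	    // Abridged: the weak-superclass and future-class handling is omitted here;
	    // replacing stays nil unless a future class was resolved.
	    Class replacing = nil;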
	    
	    if (headerIsPreoptimized  &&  !replacing) {
	        // class list built in shared cache
	        // fixme strict assert doesn't work because of duplicates
	        // ASSERT(cls == getClass(name));
	        ASSERT(getClassExceptSomeSwift(mangledName));
	    } else {
	        // Register the class under its name
	        addNamedClass(cls, mangledName, replacing);
	        // Insert into the class table, i.e. bring the class read from the Mach-O into memory
	        addClassTableEntry(cls);
	    }
	    return cls;
	}
    • Next, the source of addNamedClass:
	static void addNamedClass(Class cls, const char *name, Class replacing = nil)
	{
	    Class old;
	    if ((old = getClassExceptSomeSwift(name))  &&  old != replacing) {
	        inform_duplicate(name, old, cls);
	
	        // getMaybeUnrealizedNonMetaClass uses name lookups.
	        // Classes not found by name lookup must be in the
	        // secondary meta->nonmeta table.
	        addNonMetaClass(cls);
	    } else {
	        // Map name to the address of cls and insert the pair into the table
	        NXMapInsert(gdb_objc_realized_classes, name, cls);
	    }
	}
    • Now the mangledName method:
	const char *mangledName() { 
	    // fixme can't assert locks here
	    ASSERT(this);
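	    // The same realized check appeared in our earlier analysis of lookUpImpOrForward
	    if (isRealized()  ||  isFuture()) {
	        // If the class is already realized (or is a future class), read name from ro
	        return data()->ro()->name;
	    } else {
	        // Otherwise data() still points at the class_ro_t in the Mach-O; read name from there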
	        return ((const class_ro_t *)data())->name;
	    }
	}
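    • In short, readClass reads a class from the Mach-O into memory. At this point the class carries only two pieces of information, its address and its name; its data is read and assigned to the class later, in step 9.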
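
The two bookkeeping steps in readClass (register by name, then record the address as a known class) can be sketched as follows. This is an illustrative model only: `ClassStub`, `ClassTables`, and `readClassSketch` are hypothetical names, the two standard containers stand in for gdb_objc_realized_classes and the allocated-classes table, and the duplicate handling is reduced to a counter.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Hypothetical class record: at this stage only address and name matter.
struct ClassStub {
    std::string mangledName;
};

struct ClassTables {
    // Plays the role of gdb_objc_realized_classes: name -> class address.
    std::unordered_map<std::string, ClassStub*> byName;
    // Plays the role of the allocated-classes table: every known class address.
    std::unordered_set<ClassStub*> allocated;
    // Number of classes whose name was already taken by another class.
    int duplicates = 0;
};

// Registers a class under its name and records its address as known.
ClassStub* readClassSketch(ClassStub* cls, ClassTables& tables) {
    assert(cls != nullptr);

    // emplace() leaves an existing mapping untouched and reports that via .second.
    std::pair<std::unordered_map<std::string, ClassStub*>::iterator, bool> result =
        tables.byName.emplace(cls->mangledName, cls);
    if (!result.second && result.first->second != cls) {
        // Name clash: keep the first mapping and note the duplicate.
        ++tables.duplicates;
    }

    tables.allocated.insert(cls);
    return cls;
}
```

Note that nothing here touches methods, properties, or protocols, which matches the observation that only the address and name exist after this step.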
  • Fixing up remapped classes that the image file did not load
	// Fix up remapped classes
    // Class list and nonlazy class list remain unremapped.
    // Class refs and super refs are remapped for message dispatching.
    
    if (!noClassesRemapped()) {
        for (EACH_HEADER) {
            Class *classrefs = _getObjc2ClassRefs(hi, &count);
            for (i = 0; i < count; i++) {
                remapClassRef(&classrefs[i]);
            }
            // fixme why doesn't test future1 catch the absence of this?
            classrefs = _getObjc2SuperRefs(hi, &count);
            for (i = 0; i < count; i++) {
                remapClassRef(&classrefs[i]);
            }
        }
    }
    ts.log("IMAGE TIMES: remap classes");
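
The remapping pass boils down to a table lookup per reference: if the referenced class was moved, the reference is overwritten with the new address. The sketch below models that under simplifying assumptions; `ClassStub`, `RemapTable`, `remapRef`, and `remapAll` are hypothetical names, and the early return mirrors the noClassesRemapped() check in the snippet.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical class record.
struct ClassStub {
    int id;
};

// Maps a class's original address to its replacement.
using RemapTable = std::unordered_map<ClassStub*, ClassStub*>;

// Rewrites one reference in place if its target was remapped.
void remapRef(ClassStub*& ref, const RemapTable& remapped) {
    RemapTable::const_iterator it = remapped.find(ref);
    if (it != remapped.end()) {
        ref = it->second;
    }
}

// Applies the fix-up to a whole reference list.
// Returns how many references were changed.
std::size_t remapAll(std::vector<ClassStub*>& refs, const RemapTable& remapped) {
    if (remapped.empty()) return 0;  // nothing was remapped: skip the pass

    std::size_t changed = 0;
    for (ClassStub*& ref : refs) {
        ClassStub* before = ref;
        remapRef(ref, remapped);
        if (ref != before) ++changed;
    }
    assert(changed <= refs.size());
    return changed;
}
```

In the runtime snippet the same pass runs twice per image, once over the class refs and once over the super refs.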