Golang interface and reflection

Posted 2022-12-07 惜暮

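Golang interface use cases
Golang interface data structures
How some Golang interface use cases work
Why use reflection
How reflection works and how it relates to interface
Reflection performance: causes of overhead and benchmarks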

Based on Go 1.13.5.

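Golang interface use cases

First, a brief look at how interfaces are used. There are generally two ways to use an interface: as an interface with methods, or as the empty interface, interface{}.

An interface with methods usually serves as a general-purpose abstraction. The empty interface serves as a stand-in for generics.

In practice, interfaces mostly appear as function parameters, return values, struct fields, and so on.

Beyond knowing how to use them correctly, I think it is worth understanding how they work internally: for example, the performance cost of passing them as parameters, returning them, and calling methods with value versus pointer receivers.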

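Golang interface data structures

As mentioned above, interface variables come in two kinds: with methods and without. The compiler maps them onto two underlying structures, iface and eface. The difference is that iface describes an interface that has methods, while eface describes the empty interface, interface{}, which has none.

The definitions are in the runtime source: iface, eface, and itab in runtime/runtime2.go, interfacetype and _type in runtime/type.go.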

type iface struct {
	tab  *itab
	data unsafe.Pointer
}

type eface struct {
	_type *_type
	data  unsafe.Pointer
}

// Type information plus interface information for an interface with methods.
type itab struct {
	inter *interfacetype
	_type *_type
	hash  uint32 // copy of _type.hash. Used for type switches.
	_     [4]byte
	fun   [1]uintptr // variable sized. fun[0]==0 means _type does not implement inter.
}

// Method information of the interface itself.
type interfacetype struct {
	typ     _type
	pkgpath name
	mhdr    []imethod
}

// Type information of the concrete object stored in the interface.
type _type struct {
	size       uintptr
	ptrdata    uintptr // size of memory prefix holding all pointers
	hash       uint32
	tflag      tflag
	align      uint8
	fieldalign uint8
	kind       uint8
	alg        *typeAlg
	// gcdata stores the GC type data for the garbage collector.
	// If the KindGCProg bit is set in kind, gcdata is a GC program.
	// Otherwise it is a ptrmask bitmap. See mbitmap.go for details.
	gcdata    *byte
	str       nameOff
	ptrToThis typeOff
}

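From the definitions of eface and iface, an interface value is, at the top level, just two pointers: one to type-related information and one to the data of the stored object. That is 16 bytes on a 64-bit platform. The fun field of itab holds the addresses of the concrete type's methods that correspond to the interface's methods, which is what makes dynamic dispatch of interface method calls possible. The itab is generally built when a value is converted on assignment to an interface, unless a cached itab can be reused.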
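The two-word layout can be checked from ordinary code. The sketch below is only an illustration; the helper name ifaceWords is made up for this example, and it measures the size of an interface value in pointer-sized words rather than hard-coding 16 bytes.

```go
package main

import (
	"fmt"
	"unsafe"
)

// ifaceWords reports how many pointer-sized words an empty interface value
// and a non-empty interface value occupy.
func ifaceWords() (empty, nonEmpty uintptr) {
	var e interface{} = 42
	var s fmt.Stringer
	word := unsafe.Sizeof(uintptr(0))
	return unsafe.Sizeof(e) / word, unsafe.Sizeof(s) / word
}

func main() {
	e, s := ifaceWords()
	fmt.Println(e, s) // 2 2, which is 16 bytes each on a 64-bit platform
}
```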

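You may wonder why the fun array has size 1: what if the interface declares several methods? In fact, fun holds the function pointer of the first method; if there are more, they are stored in the memory immediately after it. At the assembly level these pointers are reached simply by adding an offset to the address, so nothing is lost. Incidentally, the methods are sorted lexicographically by name.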
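The itab itself is not reachable from user code, but the same lexicographic ordering shows up in the reflect package, whose documentation states that methods are sorted in lexicographic order. The following sketch only shows reflect's view of the exported methods, not the contents of fun; Gopher and methodNames are names chosen for this example.

```go
package main

import (
	"fmt"
	"reflect"
)

type Gopher struct{}

// Declared out of alphabetical order on purpose.
func (Gopher) Debug() {}
func (Gopher) Code()  {}

// methodNames lists the exported methods of v in the order reflect reports them.
func methodNames(v interface{}) []string {
	t := reflect.TypeOf(v)
	names := make([]string, 0, t.NumMethod())
	for i := 0; i < t.NumMethod(); i++ {
		names = append(names, t.Method(i).Name)
	}
	return names
}

func main() {
	fmt.Println(methodNames(Gopher{})) // [Code Debug]
}
```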

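Now look again at interfacetype, which describes the type of an interface:

type interfacetype struct {
	typ     _type
	pkgpath name
	mhdr    []imethod
}

As you can see, it wraps _type, the struct that describes every kind of data type in Go. It also has an mhdr field, the list of methods the interface declares, and pkgpath, which records the package in which the interface is defined.

A figure showing the complete layout of iface appeared here (image not preserved).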

Here is an example:

package main

import "fmt"

func main() {
	x := 100
	var inter interface{} = x
	fmt.Println(inter)

	g := Gopher{"Go"}
	var c coder = g
	fmt.Println(c)
}

type coder interface {
	code()
	debug()
}

type Gopher struct {
	language string
}

func (p Gopher) code() {
	fmt.Printf("I am coding %s language\n", p.language)
}

func (p Gopher) debug() {
	fmt.Printf("I am debugging %s language\n", p.language)
}

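The assembly printed by go tool compile -S shows that main calls two functions:

func convT64(val uint64) (x unsafe.Pointer)
func convTstring(val string) (x unsafe.Pointer)

The compiler recognizes the type of the data and picks the matching conversion function.

The convTXXX functions above are defined in runtime/iface.go, which contains this comment: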

// The conv and assert functions below do very similar things.
// The convXXX functions are guaranteed by the compiler to succeed.
// The assertXXX functions may fail (either panicking or returning false,
// depending on whether they are 1-result or 2-result).
// The convXXX functions succeed on a nil input, whereas the assertXXX
// functions fail on a nil input.

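In other words, the conv and assert functions do very similar work. The compiler guarantees that a convXXX call succeeds, and it succeeds even on a nil input. An assertXXX call may fail, either by panicking (the one-result form) or by returning false (the two-result form), and it fails on a nil input.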

Here is the full list:

// The functions below convert a value of a given type to an interface.
// convT2E, convT2I and their variants return a complete eface or iface;
// convT16, convT32, convT64, convTstring and convTslice return only the data pointer.

// convert a value to an empty interface
func convT2E(t *_type, elem unsafe.Pointer) (e eface)
// convert a uint16 to the data pointer of an interface
func convT16(val uint16) (x unsafe.Pointer)
// convert a uint32 to the data pointer of an interface
func convT32(val uint32) (x unsafe.Pointer)
// convert a uint64 to the data pointer of an interface
func convT64(val uint64) (x unsafe.Pointer)
// convert a string to the data pointer of an interface
func convTstring(val string) (x unsafe.Pointer)
// convert a slice to the data pointer of an interface
func convTslice(val []byte) (x unsafe.Pointer)
// like convT2E, for a type t that contains no pointers
func convT2Enoptr(t *_type, elem unsafe.Pointer) (e eface)
// convert a value of a concrete type to an interface with methods
func convT2I(tab *itab, elem unsafe.Pointer) (i iface)
// like convT2I, for a type that contains no pointers
func convT2Inoptr(tab *itab, elem unsafe.Pointer) (i iface)
// convert one interface to another interface
func convI2I(inter *interfacetype, i iface) (r iface)

// The functions below are called for type assertions.
func assertI2I(inter *interfacetype, i iface) (r iface)

func assertI2I2(inter *interfacetype, i iface) (r iface, b bool)

func assertE2I(inter *interfacetype, e eface) (r iface)

func assertE2I2(inter *interfacetype, e eface) (r iface, b bool)

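These functions are called when a concrete type is converted to an interface and when a type assertion is performed on an interface. Go 1.13.5, the version I am using, includes some optimizations: for certain types, such as basic numeric types like int, strings, and slices, the conversion only calls mallocgc to allocate new memory and then assigns the value. Converting other concrete types to an interface requires a memory copy in addition to the mallocgc call.

The sections below look at specific cases.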

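How some Golang interface use cases work

The cost of an interface function parameter

A common case is a function whose parameter is interface{} or an interface with methods. For example:

func m1(p interface{})

When the argument is a concrete type, such as a struct or a basic type, it must be converted to an interface at the call site, and that conversion has a performance cost. If the function then uses a type assertion to recover the concrete type, that has a cost too.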
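A minimal sketch of both steps follows. The function describe is hypothetical: each call boxes its concrete argument into an interface{}, and the assertions inside unbox it again.

```go
package main

import "fmt"

// describe is a hypothetical function with an interface{} parameter.
// Callers pay for the conversion to interface{}; the assertions below
// pay again to recover the concrete type.
func describe(p interface{}) string {
	if n, ok := p.(int); ok {
		return fmt.Sprintf("int %d", n)
	}
	if s, ok := p.(string); ok {
		return fmt.Sprintf("string %q", s)
	}
	return fmt.Sprintf("other %T", p)
}

func main() {
	fmt.Println(describe(42))   // int 42
	fmt.Println(describe("go")) // string "go"
	fmt.Println(describe(3.5))  // other float64
}
```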

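For a comparison of the costs, see "golang type assertion and unsafe.Pointer performance comparison".

Assignment to interface{} and to an interface with methods

Assignment is really a type conversion, carried out by a call to one of the convXXX functions. The process is fairly simple; see the source for details.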

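How dynamic types and dynamic dispatch are implemented, when dispatch happens, and what a call costs

First, how dynamic types are implemented. For the empty interface, the dynamic type is described by _type. For a non-empty interface, it is described by itab.

Here is an example that checks the dynamic type of some values.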

package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

type iface struct {
	itab, data uintptr
}

func main() {
	var a interface{} = nil

	bi := new(int)
	*bi = 10
	var b interface{} = bi

	x := 5
	var c interface{} = (*int)(&x)

	ia := *(*iface)(unsafe.Pointer(&a))
	ib := *(*iface)(unsafe.Pointer(&b))
	ic := *(*iface)(unsafe.Pointer(&c))

	fmt.Println(ia, ib, ic)
	fmt.Println(reflect.TypeOf(b) == reflect.TypeOf(c))
}

The output:

{0 0} {17454368 824634166904} {17454368 824634166896}
true

The type-pointer fields of ib and ic hold the same address, so both refer to the same type descriptor. The reflect.TypeOf comparison confirms this.
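The dispatch side of the same mechanism can be sketched as follows. The types Gopher and Rustacean and the helper languages are invented for this example; the point is only that the call c.code() is resolved at run time from the dynamic type stored in each interface value.

```go
package main

import "fmt"

type coder interface{ code() string }

type Gopher struct{}
type Rustacean struct{}

func (Gopher) code() string    { return "Go" }
func (Rustacean) code() string { return "Rust" }

// languages calls code() through the interface; the method that runs is
// chosen at run time from the dynamic type stored in each value.
func languages(cs []coder) []string {
	out := make([]string, 0, len(cs))
	for _, c := range cs {
		out = append(out, c.code())
	}
	return out
}

func main() {
	cs := []coder{Gopher{}, Rustacean{}}
	fmt.Println(languages(cs))          // [Go Rust]
	fmt.Printf("%T %T\n", cs[0], cs[1]) // main.Gopher main.Rustacean
}
```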

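How type conversion works

As the iface source above shows, it involves both the interface type, interfacetype, and the concrete type, _type, and both are members of the itab that iface's tab field points to. In other words, building an itab needs both the interface type and the concrete type.

Here is interfacetype once more: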

type interfacetype struct {
	typ     _type
	pkgpath name
	mhdr    []imethod
}

type imethod struct {
	name nameOff
	ityp typeOff
}

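To decide whether a type satisfies an interface, Go matches the type's method set against the method set the interface requires. If the type's method set fully contains the interface's, the type implements the interface.

Suppose a type has m methods and an interface has n. A naive check takes O(mn) time, but Go keeps both method sets sorted lexicographically by method name, so the check actually takes O(m+n).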
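The linear check can be sketched with plain string slices. This is not the runtime's code; implements is a hypothetical helper, and it sorts its inputs itself, whereas in Go the method tables are already sorted before the check runs.

```go
package main

import (
	"fmt"
	"sort"
)

// implements reports whether every name in want also appears in have.
// Both slices are sorted first, so one forward pass over each is enough.
func implements(have, want []string) bool {
	sort.Strings(have)
	sort.Strings(want)
	j := 0
	for _, w := range want {
		// Skip methods of the type that sort before the wanted name.
		for j < len(have) && have[j] < w {
			j++
		}
		if j == len(have) || have[j] != w {
			return false
		}
		j++
	}
	return true
}

func main() {
	fmt.Println(implements([]string{"debug", "code", "test"}, []string{"code", "debug"})) // true
	fmt.Println(implements([]string{"code"}, []string{"code", "debug"}))                  // false
}
```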

The conversion itself is done by this function in runtime/iface.go:
func convI2I(inter *interfacetype, i iface) (r iface)
It converts one interface to another.

The implementation:

func convI2I(inter *interfacetype, i iface) (r iface) {
	tab := i.tab
	if tab == nil {
		return
	}
	if tab.inter == inter {
		r.tab = tab
		r.data = i.data
		return
	}
	r.tab = getitab(inter, tab._type, false)
	r.data = i.data
	return
}

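The key piece is getitab. I will not go through its source here; read it if you are interested. In short, getitab looks up the global itab hash table by interfacetype and _type. If it finds an entry, it returns it directly; otherwise it builds a new itab from the given interfacetype and _type and inserts it into the table, so the next lookup finds it immediately.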
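The lookup-or-build pattern described above can be sketched with an ordinary map. Everything here is a stand-in: key, entry, and cache are hypothetical names, strings replace the interfacetype and _type pointers, and the runtime's own table is not a Go map.

```go
package main

import "fmt"

// key and entry are hypothetical stand-ins for (interfacetype, _type) and itab.
type key struct{ inter, typ string }
type entry struct{ methods []string }

type cache struct {
	table  map[key]*entry
	builds int // how many entries had to be built from scratch
}

// get returns the cached entry for (inter, typ), building and storing it on a miss.
func (c *cache) get(inter, typ string) *entry {
	k := key{inter, typ}
	if e, ok := c.table[k]; ok {
		return e
	}
	e := &entry{methods: []string{inter + "." + typ}}
	c.table[k] = e
	c.builds++
	return e
}

func main() {
	c := &cache{table: map[key]*entry{}}
	a := c.get("coder", "Gopher")
	b := c.get("coder", "Gopher")
	fmt.Println(a == b, c.builds) // true 1
}
```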

How assertions work and what they cost

Assertions are also implemented by the assertXXX functions in runtime/iface.go; see the source for details.
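At the language level the two failure modes look like this. The helpers asString and mustString are made up for the example; the first uses the two-result form, the second uses the one-result form and recovers from the panic so the behavior can be observed.

```go
package main

import "fmt"

// asString uses the two-result form, which never panics.
func asString(v interface{}) (string, bool) {
	s, ok := v.(string)
	return s, ok
}

// mustString uses the one-result form and reports whether it panicked.
func mustString(v interface{}) (s string, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	return v.(string), false
}

func main() {
	fmt.Println(asString(1))     // prints an empty string and false
	fmt.Println(mustString(1))   // prints an empty string and true
	fmt.Println(mustString("x")) // x false
}
```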

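Why use reflection

Go provides a mechanism for updating variables, inspecting their values, and calling their methods at run time without knowing their concrete types at compile time. This is called reflection.

Two common reasons to use reflection:

Sometimes you need to write a function without knowing the types of the arguments it will receive, perhaps because no convention was agreed on, or because so many types may be passed that no single type can represent them. Reflection helps here.
Sometimes you need to decide which function to call based on some condition, such as user input. You then have to reflect on the function and its arguments and invoke it dynamically at run time.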
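The second case might look like the sketch below. The registry and the functions add and mul are hypothetical; only reflect.ValueOf, Type().NumIn, Call, and Interface are real reflect API. Note that Call still panics if an argument has the wrong type, which this sketch does not guard against.

```go
package main

import (
	"fmt"
	"reflect"
)

func add(a, b int) int { return a + b }
func mul(a, b int) int { return a * b }

// registry maps a name, for instance taken from user input, to a function.
var registry = map[string]interface{}{"add": add, "mul": mul}

// call looks up name and invokes the function with args through reflection.
func call(name string, args ...interface{}) (interface{}, error) {
	fn, ok := registry[name]
	if !ok {
		return nil, fmt.Errorf("unknown function %q", name)
	}
	f := reflect.ValueOf(fn)
	if f.Type().NumIn() != len(args) {
		return nil, fmt.Errorf("%s: want %d args, got %d", name, f.Type().NumIn(), len(args))
	}
	in := make([]reflect.Value, len(args))
	for i, a := range args {
		in[i] = reflect.ValueOf(a)
	}
	return f.Call(in)[0].Interface(), nil
}

func main() {
	fmt.Println(call("mul", 6, 7)) // 42 <nil>
	fmt.Println(call("add", 1))    // <nil> add: want 2 args, got 1
}
```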

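Note, however, that reflection has significant drawbacks. The most important are its performance cost and its effect on code safety.

Go is statically typed, so the compiler catches some type errors ahead of time, but it cannot do this for reflective code. Code that uses reflection may therefore run for a long time before it fails, and when it does it usually panics outright, which can have serious consequences.
Reflection has a considerable performance cost: it can run one to two orders of magnitude slower than ordinary code. Avoid it in the performance-critical parts of a project.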
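The safety problem is easy to reproduce. In this sketch, Student and setIntField are invented for the example; the function compiles regardless of which field name is passed, and the mistake only surfaces as a panic at run time, which is recovered here so it can be shown as an error.

```go
package main

import (
	"fmt"
	"reflect"
)

type Student struct{ Name string }

// setIntField tries to store 18 in the named field. The compiler cannot
// check the field name or its type; mistakes surface as run-time panics.
func setIntField(s *Student, field string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("reflect panic: %v", r)
		}
	}()
	reflect.ValueOf(s).Elem().FieldByName(field).SetInt(18)
	return nil
}

func main() {
	fmt.Println(setIntField(&Student{}, "Name")) // Name is a string, so SetInt panics
	fmt.Println(setIntField(&Student{}, "Age"))  // no such field, so SetInt panics
}
```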

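How reflection works and how it relates to interface

As described earlier, interface is a very powerful abstraction for describing objects in Go. When a concrete value is assigned to an interface variable, the interface stores the value's type information. Reflection is implemented on top of that type information; it is built on types.

The reflect package defines a range of types and functions that make it possible to inspect type information and modify values at run time.

Reflection's Type and interface

Go is a strongly typed language: every variable has a static type, which is fixed at compile time, for example int, []int, or string. Note that this is the declared type, not the underlying type.

For example: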

type TestInt int
var i int
var j TestInt

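Although i and j both have the underlying type int, Go treats them as two different static types, and they cannot be assigned to each other without a conversion.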
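A small check of this using reflect, as a sketch: the helper compare is made up for the example. The two variables have different types but the same kind.

```go
package main

import (
	"fmt"
	"reflect"
)

type TestInt int

// compare reports whether i and j have the same type and the same kind.
func compare() (sameType, sameKind bool) {
	var i int = 1
	var j TestInt = TestInt(i) // "j = i" would not compile without the conversion
	ti, tj := reflect.TypeOf(i), reflect.TypeOf(j)
	return ti == tj, ti.Kind() == tj.Kind()
}

func main() {
	fmt.Println(compare()) // false true
}
```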

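Understanding Go reflection requires understanding the structure of interface; the two are closely related. The underlying structure of interface was described above; here it is again: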

type iface struct {
	tab  *itab
	data unsafe.Pointer
}

// Type information plus interface information for an interface with methods.
type itab struct {
	inter *interfacetype
	_type *_type
	hash  uint32 // copy of _type.hash. Used for type switches.
	_     [4]byte
	fun   [1]uintptr // variable sized. fun[0]==0 means _type does not implement inter.
}

// Method information of the interface itself.
type interfacetype struct {
	typ     _type
	pkgpath name
	mhdr    []imethod
}

// Type information of the concrete object stored in the interface.
type _type struct {
	size       uintptr
	ptrdata    uintptr // size of memory prefix holding all pointers
	hash       uint32
	tflag      tflag
	align      uint8
	fieldalign uint8
	kind       uint8
	alg        *typeAlg
	// gcdata stores the GC type data for the garbage collector.
	// If the KindGCProg bit is set in kind, gcdata is a GC program.
	// Otherwise it is a ptrmask bitmap. See mbitmap.go for details.
	gcdata    *byte
	str       nameOff
	ptrToThis typeOff
}

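iface describes a non-empty interface, one with methods. Its counterpart, eface, describes the empty interface, which has no methods; every type in Go "implements" the empty interface.

Now for the basic types in reflect. The package defines an interface and a struct, reflect.Type and reflect.Value, which provide many methods for reading the type information stored in an interface.

reflect.Type is an interface with many methods for obtaining information about a type, and rtype implements it. Go's other type descriptors, such as sliceType, also implement reflect.Type, because each of them combines an rtype with some type-specific information. (The figure that illustrated this is not preserved.)

Here is the definition of rtype: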

// rtype is the common implementation of most values.
// It is embedded in other struct types.
//
// rtype must be kept in sync with ../runtime/type.go:/^type._type.
type rtype struct {
	size       uintptr
	ptrdata    uintptr  // number of bytes in the type that can contain pointers
	hash       uint32   // hash of type; avoids computation in hash tables
	tflag      tflag    // extra type information flags
	align      uint8    // alignment of variable with this type
	fieldAlign uint8    // alignment of struct field with this type
	kind       uint8    // enumeration for C
	alg        *typeAlg // algorithm table
	gcdata     *byte    // garbage collection data
	str        nameOff  // string form
	ptrToThis  typeOff  // type for pointer to this type, may be zero
}

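rtype is the base for Go's other type descriptors and is embedded in many of their structs. In other words, every type descriptor contains an rtype carrying the information common to all types, plus whatever is specific to that kind of type. For example: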

// arrayType represents a fixed array type.
type arrayType struct {
	rtype
	elem  *rtype // array element type
	slice *rtype // slice type
	len   uintptr
}

// chanType represents a channel type.
type chanType struct {
	rtype
	elem *rtype  // channel element type
	dir  uintptr // channel direction (ChanDir)
}

// ... and likewise:
// funcType
// ptrType
// sliceType
// structType
// ...

rtype must also be kept in sync with _type in ../runtime/type.go, presumably so that the type pointer stored in an interface can be converted directly to a *rtype.
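From the outside, the kind-specific parts of arrayType and chanType are visible through reflect.Type methods such as Len, Elem, and ChanDir. The sketch below uses a made-up helper, describeType, and handles only the two kinds shown above.

```go
package main

import (
	"fmt"
	"reflect"
)

// describeType formats the kind-specific information of arrays and channels.
func describeType(v interface{}) string {
	t := reflect.TypeOf(v)
	switch t.Kind() {
	case reflect.Array:
		return fmt.Sprintf("%s len=%d elem=%s", t.Kind(), t.Len(), t.Elem())
	case reflect.Chan:
		return fmt.Sprintf("%s dir=%s elem=%s", t.Kind(), t.ChanDir(), t.Elem())
	}
	return t.Kind().String()
}

func main() {
	fmt.Println(describeType([3]string{}))      // array len=3 elem=string
	fmt.Println(describeType(make(<-chan int))) // chan dir=<-chan elem=int
}
```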

Now the structure of reflect.Value:

// reflect/value.go
type Value struct {
	// typ holds the type of the value represented by a Value.
	typ *rtype

	// Pointer-valued data or, if flagIndir is set, pointer to data.
	// Valid when either flagIndir is set or typ.pointers() is true.
	ptr unsafe.Pointer

	// flag holds metadata about the value.
	// The lowest bits are flag bits:
	//	- flagStickyRO: obtained via unexported not embedded field, so read-only
	//	- flagEmbedRO: obtained via unexported embedded field, so read-only
	//	- flagIndir: val holds a pointer to the data
	//	- flagAddr: v.CanAddr is true (implies flagIndir)
	//	- flagMethod: v is a method value.
	// The next five bits give the Kind of the value.
	// This repeats typ.Kind() except for method values.
	// The remaining 23+ bits give a method number for method values.
	// If flag.kind() != Func, code can assume that flagMethod is unset.
	// If ifaceIndir(typ), code can assume that flagIndir is set.
	flag
	// ......
}

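As you can see, Value carries type information as well as a pointer to the actual value.

The reflect package provides two basic functions for obtaining the interface and struct described above: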

func TypeOf(i interface{}) Type
func ValueOf(i interface{}) Value

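TypeOf extracts the type information of the value held in an interface. Because its parameter is an empty interface, the argument is first converted to interface{} when the function is called. The argument's type information, method set, and value are thereby all stored in the interface variable.

ValueOf returns a reflect.Value that represents the actual variable stored in the interface and gives access to information about it. Its methods often need both type and value information. For example, extracting the fields of a struct requires the field and offset information held by _type (more precisely, structType here) together with what data points to, namely the struct's actual value.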
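A short sketch of how the two are combined to walk a struct: the field names and types come from reflect.Type, the field values from reflect.Value. Student and fields are names chosen for this example.

```go
package main

import (
	"fmt"
	"reflect"
)

type Student struct {
	Name string
	Age  int
}

// fields describes each field of a struct value as "name type = value".
func fields(v interface{}) []string {
	t := reflect.TypeOf(v)   // type side: field names and types
	val := reflect.ValueOf(v) // value side: the data itself
	out := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		out = append(out, fmt.Sprintf("%s %s = %v", f.Name, f.Type, val.Field(i).Interface()))
	}
	return out
}

func main() {
	for _, line := range fields(Student{Name: "Jerry", Age: 18}) {
		fmt.Println(line)
	}
}
```

The helper assumes a struct argument; NumField panics for any other kind.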

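A figure from Lao Qian's 《快学Go语言第十五课——反射》 (Learn Go Fast, Lesson 15: Reflection) was quoted here (image not preserved).

The reflect.TypeOf function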

// TypeOf returns the reflection Type that represents the dynamic type of i.
// If i is a nil interface value, TypeOf returns nil.
func TypeOf(i interface{}) Type {
	eface := *(*emptyInterface)(unsafe.Pointer(&i))
	return toType(eface.typ)
}

// emptyInterface is the header for an interface value.
type emptyInterface struct {
	typ  *rtype
	word unsafe.Pointer
}

// toType converts from a *rtype to a Type that can be returned
// to the client of package reflect. In gc, the only concern is that
// a nil *rtype must be replaced by a nil Type, but in gccgo this
// function takes care of ensuring that multiple *rtype for the same
// type are coalesced into a single Type.
func toType(t *rtype) Type {
	if t == nil {
		return nil
	}
	return t
}

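A call to reflect.TypeOf first converts the argument's concrete type to interface{}, then reinterprets it as an emptyInterface through an unsafe pointer conversion, and finally obtains the actual type object, rtype (which implements reflect.Type). What is returned is the reflect.Type interface, so the type information is available through its methods.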
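The nil handling in toType is visible to callers: a nil interface value yields a nil Type, while a nil pointer still has a type. A small sketch, with typeName as a hypothetical helper:

```go
package main

import (
	"fmt"
	"reflect"
)

// typeName returns the dynamic type of v as a string, or "<nil>" when
// v is a nil interface value and TypeOf therefore returns nil.
func typeName(v interface{}) string {
	t := reflect.TypeOf(v)
	if t == nil {
		return "<nil>"
	}
	return t.String()
}

func main() {
	fmt.Println(typeName(nil))         // <nil>
	fmt.Println(typeName((*int)(nil))) // *int
	fmt.Println(typeName(3))           // int
}
```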

The reflect.ValueOf function

func ValueOf(i interface{}) Value {
	if i == nil {
		return Value{}
	}

	// TODO: Maybe allow contents of a Value to live on the stack.
	// For now we make the contents always escape to the heap. It
	// makes life easier in a few places (see chanrecv/mapassign
	// comment below).
	escapes(i)

	return unpackEface(i)
}

func escapes(x interface{}) {
	if dummy.b {
		dummy.x = x
	}
}

// unpackEface converts the empty interface i to a Value.
func unpackEface(i interface{}) Value {
	e := (*emptyInterface)(unsafe.Pointer(&i))
	// NOTE: don't read e.word until we know whether it is really a pointer or not.
	t := e.typ
	if t == nil {
		return Value{}
	}
	f := flag(t.Kind())
	if ifaceIndir(t) {
		f |= flagIndir
	}
	return Value{t, e.word, f}
}

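reflect.ValueOf returns a reflection Value. The main steps are:

First, call escapes so that the input is allocated on the heap;
reinterpret it as a *emptyInterface through an unsafe pointer conversion;
wrap the type and value from the emptyInterface in a reflect.Value.

A reflect.Value can then be used to read and write the object.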
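Writing only works when the Value is settable, which in practice means starting from a pointer and calling Elem. The sketch below uses invented names (Student, setName); it refuses to write when the Value was built from a copy.

```go
package main

import (
	"fmt"
	"reflect"
)

type Student struct{ Name string }

// setName sets the Name field of target through reflection and reports
// whether the write was possible.
func setName(target interface{}, name string) bool {
	v := reflect.ValueOf(target)
	if v.Kind() == reflect.Ptr {
		v = v.Elem() // follow the pointer to get an addressable Value
	}
	if v.Kind() != reflect.Struct {
		return false
	}
	f := v.FieldByName("Name")
	if !f.IsValid() || !f.CanSet() || f.Kind() != reflect.String {
		return false
	}
	f.SetString(name)
	return true
}

func main() {
	s := Student{}
	fmt.Println(setName(s, "Tom"), s.Name)    // false, Name stays empty: s was passed by value
	fmt.Println(setName(&s, "Jerry"), s.Name) // true Jerry
}
```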

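Reflection performance: causes of overhead and benchmarks

reflect.TypeOf and reflect.ValueOf themselves cost little. Their cost comes mainly from boxing and unboxing the interface, or from creating a new Value.

For the cost of boxing and unboxing, see "golang type assertion and unsafe.Pointer performance comparison".

The benchmarks below measure:

creating an object through reflection versus directly with new;
setting values on an object obtained through reflection, setting fields by name, setting fields by index, and setting them directly.

Test environment: 2015 Mac, 2 cores, 8 GB RAM, Go 1.13.5.

Test code: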

package main

import (
	"reflect"
	"testing"
)

func BenchmarkReflect_New(b *testing.B) {
	var s *Student
	sv := reflect.TypeOf(Student{})
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		sn := reflect.New(sv)
		s, _ = sn.Interface().(*Student)
	}
	_ = s
}

func BenchmarkDirect_New(b *testing.B) {
	var s *Student
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		s = new(Student)
	}
	_ = s
}

func BenchmarkReflect_Set(b *testing.B) {
	var s *Student
	sv := reflect.TypeOf(Student{})
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		sn := reflect.New(sv)
		s = sn.Interface().(*Student)
		s.Name = "Jerry"
		s.Age = 18
		s.Class = "20005"
		s.Score = 100
	}
}

func BenchmarkReflect_SetFieldByName(b *testing.B) {
	sv := reflect.TypeOf(Student{})
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		sn := reflect.New(sv).Elem()
		sn.FieldByName("Name").SetString("Jerry")
		sn.FieldByName("Age").SetInt(18)
		sn.FieldByName("Class").SetString("20005")
		sn.FieldByName("Score").SetInt(100)
	}
}

func BenchmarkReflect_SetFieldByIndex(b *testing.B) {
	sv := reflect.TypeOf(Student{})
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// The source text is cut off here: the rest of this loop, the
		// remaining benchmarks, and the Student type are missing.
		_ = sv
	}
}
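Because the listing above is incomplete, here is a self-contained sketch of the same comparison. It is not the author's code: the Student definition is an assumption whose field types are inferred from the SetString and SetInt calls, the field indexes follow from that assumed field order, and timeIt is a crude timing loop rather than the testing package's benchmark harness. No timings are quoted, since they depend on the machine and Go version.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// Student is an assumed definition; the source never shows it.
type Student struct {
	Name  string
	Age   int
	Class string
	Score int
}

// setDirect fills the struct with ordinary assignments.
func setDirect() Student {
	var s Student
	s.Name, s.Age, s.Class, s.Score = "Jerry", 18, "20005", 100
	return s
}

// setByName fills the struct through reflection, looking fields up by name.
func setByName() Student {
	v := reflect.New(reflect.TypeOf(Student{})).Elem()
	v.FieldByName("Name").SetString("Jerry")
	v.FieldByName("Age").SetInt(18)
	v.FieldByName("Class").SetString("20005")
	v.FieldByName("Score").SetInt(100)
	return v.Interface().(Student)
}

// setByIndex fills the struct through reflection, addressing fields by index.
func setByIndex() Student {
	v := reflect.New(reflect.TypeOf(Student{})).Elem()
	v.Field(0).SetString("Jerry")
	v.Field(1).SetInt(18)
	v.Field(2).SetString("20005")
	v.Field(3).SetInt(100)
	return v.Interface().(Student)
}

// timeIt runs f n times and returns the average duration per call.
func timeIt(n int, f func() Student) time.Duration {
	start := time.Now()
	for i := 0; i < n; i++ {
		_ = f()
	}
	return time.Since(start) / time.Duration(n)
}

func main() {
	const n = 200000
	fmt.Println("direct:  ", timeIt(n, setDirect))
	fmt.Println("by index:", timeIt(n, setByIndex))
	fmt.Println("by name: ", timeIt(n, setByName))
}
```

For real measurements, the same three functions can be wrapped in Benchmark functions and run with go test -bench, as in the original listing.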
