Linux线程的创建与同步

Posted 2021-12-28 悲伤土豆拌饭

tags:

篇首语：本文由小常识网(cha138.com)小编为大家整理，主要介绍了Linux线程的创建与同步相关的知识，希望对你有一定的参考价值。

Linux线程

线程的概念与实现方式
- 线程与进程的区别
- 线程的实现方式
线程的使用
- 线程库中的接口
- 等待一个线程结束
线程同步
- 多线程并发访问同一块内存的问题
- 使用互斥锁实现线程同步
线程安全
多线程中执行fork()

线程的概念与实现方式

线程与进程的区别

进程就是一个正在运行的程序，我们把一个程序运行起来就产生了一个进程，进程有运行、阻塞、就绪三个状态，线程是进程内部的一条执行序列或执行路径，一个进程可以包含多条线程

进程是操作系统 资源分配 的基本单位
线程是操作系统 CPU调度 的基本单位
进程有自己的独立地址空间，线程共享进程中的地址空间
进程创建消耗资源大，线程的创建消耗资源相对较小
进程切换开销大，线程的切换开销相对较小

线程的实现方式

在操作系统中、线程的实现有以下三种方式：

内核级线程
用户级线程
组合级线程

用户级线程：我们在内核中并不提供创建线程的机制，只能创建一个进程，在用户空间我们用线程库（创建、管理、销毁）模拟出多条路径，在用户级创建开销较小可以创建多个线程，缺点是无法使用多处理器的资源，假如现在有两个处理器都是空闲的，内核感知不到自己有多个线程，所以只能在一个处理器上工作

内核级线程：内核直接创建、由内核直接管理线程调度以及线程的结束，内核可以感知到线程的存在，虽然创建开销过大但是可以使用多处理器的资源

组合级线程：组合模型介于以上两者之间，可以在内核创建多个路径，以利用多处理器资源（假如我们有四个处理器，在内核创建四个线程），同时可以在用户空间创建更多的用户线程，并且分别映射到四个线程中，以利用四个处理器，这样我们只需要保证内核级线程与处理器数量相当，其余线程都创建用户级线程，这样既能使用到多处理器资源并且节省了线程创建的开销

Linux中的进程实际上，是与其他进程共享某些资源的另一个进程

线程的使用

线程库中的接口

int pthread_create(pthread_t *thread, const pthread_atte_t *attrm, void *(*start_routine) (void *), void *arg);
创建线程；参数：线程id(创建成功会写入这个变量)，线程的属性（一般给NULL默认），线程函数（void*函数，返回值也为void *），线程函数的参数

我们写一段代码来看一下

#include<unistd.h>
#include<string.h>
#include<assert.h>
#include<pthread.h>


void* fun(void* arg)

	for(int i =0;i<10;i++)
	
		printf("fun run\\n");
		sleep(1);
	

int main()

	pthread_t id;
	pthread_create(&id,NULL,fun,NULL);//id 属性 函数名 函数参数

	for(int i =0;i<5;i++)
	
		printf("main run  主线程会先结束\\n");
		sleep(1);

编译需要链接线程库
gcc -o main main.c -lpthread

执行发现，主线程结束后子线程也会中断（子线程结束并不会妨碍主线程继续运行），这是因为主线程结束会执行exit导致子线程结束，我们使用pthread_exit(NULL)；来组织主进程先于子线程结束

int pthread_exit(void *retval);

#include<unistd.h>
#include<string.h>
#include<assert.h>
#include<pthread.h>


void* fun(void* arg)

	for(int i =0;i<10;i++)
	
		printf("fun run\\n");
		sleep(1);
	
	//我们通常会用在子线程结束后，返回一个信息
	pthread_exit("fun over"); //关闭线程 并传递回信息

int main()

	pthread_t id;
	pthread_create(&id,NULL,fun,NULL);//id 属性 函数名 函数参数

	for(int i =0;i<5;i++)
	
		printf("main run  主线程会先结束\\n");
		sleep(1);
	
	pthread_exit(NULL);

等待一个线程结束

int pthread_join(pthread_t thread, void **retval);
等待指定thread线程的退出，线程未退出时，该方法阻塞
retval：接收thread线程退出时，指定的退出信息

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<string.h>
#include<assert.h>
#include<pthread.h>


void* fun(void* arg)

	for(int i =0;i<10;i++)
	
		printf("fun run\\n");
		sleep(1);
	
	pthread_exit("fun over"); //关闭线程 并传递回信息

int main()

	pthread_t id;
	pthread_create(&id,NULL,fun,NULL);//id 属性 函数名 函数参数

	for(int i =0;i<5;i++)
	
		printf("main run\\n主线程会先结束\\n");
		sleep(1);
	
	char* s = NULL;
	pthread_join(id,(void **)&s);//需要强转为void**再解引用
	printf("join:%s\\n",s);
	exit(0);

编译执行

线程同步

多线程并发访问同一块内存的问题

假设我们给一个全局变量g，当多线程都去访问这个变量会发生什么问题

我们写一段代码来看一看

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<string.h>
#include<assert.h>
#include<pthread.h>

int  wg =0;
void* fun(void * arg)

	for(int i = 0;i<1000;i++)
	
		wg++;
		printf("wg = %d\\n",wg);
	
	pthread_exit(NULL);



int main()

	pthread_t id[5];
	int i =0;
	for(;i<5;i++)
	
		pthread_create(&id[i],NULL,fun,NULL);
	

	for(i = 0;i<5;i++)
	
		pthread_join(id[i],NULL);//等待进程结束 阻塞
	
	exit(0);

我们编译运行多次这个代码会发现

运行结果再发生改变，并且不是每次的结果都是正确的

这是因为当我们多个线程，其中一个线程再读取wg数据自加，但是再操作这个过程的时候，还未将值写入wg的物理地址中，就有另一个进程也去读取了wg导致这两个进程操作自加，最终只加了1

所以会发生小于我们预期的情况

使用互斥锁实现线程同步

线程同步指的是当一个线程再对某个临界资源进行操作时，其他进程都不可以对这个资源进行操作，直到该线程完成操作，其他线程才能操作，也就是协同步调，让线程按预定的先后次序进行运行。
线程同步的方法有四种：

互斥锁
信号量
添加变量
读写锁

我们通过互斥锁进行线程同步，我们对需要同步到资源加一个锁，此时若别的线程也想要访问该资源，就会被阻塞住，只有当加锁的线程解锁，别的线程才能再访问该资源

我们对代码进行修改实现线程同步
初始化：int pthread_mutex_init(pthread_mutex_t *mutex, pthread_mutexattr_t *attr);
加锁：int pthread_mutex_lock(pthread_mutex_t *mutex);
解锁：int pthread_mute_unlock(pthread_mutx_t *mutex);
销毁锁：int pthread_mutex_destroy(pthread_mutex_t *mutex);

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<string.h>
#include<assert.h>
#include<pthread.h>

pthread_mutex_t mutex;//定义锁
int  wg =0;
void* fun(void * arg)

	for(int i = 0;i<1000;i++)
	
		pthread_mutex_lock(&mutex);//加锁
		wg++;
		printf("wg = %d\\n",wg);
		pthread_mutex_unlock(&mutex);//解锁
	
	pthread_exit(NULL);



int main()

	pthread_mutex_init(&mutex,NULL); //初始化锁
	pthread_t id[5];
	int i =0;
	for(;i<5;i++)
	
		pthread_create(&id[i],NULL,fun,NULL);
	

	for(i = 0;i<5;i++)
	
		pthread_join(id[i],NULL);//等待进程结束 阻塞
	
	pthread_mutex_destroy(&mutex);//销毁锁
	exit(0);

经过多次执行结果都相同且正确

线程安全

线程安全即就是在多线程运行的时候，不论线程的调度顺序怎样，最终的结果都是一样的、正确的。那么就说这些线程是安全的。
要保证线程安全需要做到：

对线程同步，保证同一时刻只有一个线程访问临界资源
在多线程中使用线程安全的函数（可重入函数），所谓线程安全的函数指的是：如果一个函数能被多个线程同时调用且不发生静态条件（多线程调用函数不会出错），则我们称为线程安全的

我们先写一个函数来看看线程安全的重要性
让主线程与子线程分别对两个不同的字符串进行分割并输出

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<string.h>
#include<assert.h>
#include<pthread.h>

void* fun(void* arg)

	char buff[128] = "a b c d e f g h w";
	char *s = strtok(buff," ");
	while(s!=NULL)
	
		printf("thread: s=%s\\n",s);
		sleep(1);
		s = strtok(NULL," ");//内部静态变量记录偏移量 会出现错误
	

int main()

	pthread_t id;
	pthread_create(&id,NULL,fun,NULL);
	char str[128] = "1 2 3 4 5 6 7";
	char *s = strtok(str," ");
	while(s!=NULL)
	
		printf("man s=%s\\n",s);
		sleep(1);
		s = strtok(NULL," ");

代码运行出现错误，因为strtok并不是一个线程安全方法，其中有静态变量的偏移量，多进程导致偏移量的错误，strtok在单线程程序中是正常运行的

我们查看帮助手册，发现系统提供了线程安全的strtok_r方法

我们使用线程安全的分割方法来修改代码
char *strtok_r(char *str, const char *delim, char **saveptr);

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<string.h>
#include<assert.h>
#include<pthread.h>

void* fun(void* arg)

	char buff[128] = "a b c d e f g h w";
	char *ptr = NULL;//新建一个指针代替静态变量
	char *s = strtok_r(buff," ",&ptr);//传入地址（char **saveptr）
	while(s!=NULL)
	
		printf("thread: s=%s\\n",s);
		sleep(1);
		s = strtok_r(NULL," ",&ptr);//内部静态变量记录偏移量 会出现错误
	

int main()

	pthread_t id;
	pthread_create(&id,NULL,fun,NULL);
	char str[128] = "1 2 3 4 5 6 7";
	char *ptr = NULL;
	char *s = strtok_r(str," ",&ptr);//strtok_r线程安全方法
	while(s!=NULL)
	
		printf("man s=%s\\n",s);
		sleep(1);
		s = strtok_r(NULL," ",&ptr);

运行结果正确

多线程中执行fork()

我们先提出两个问题：

多线程中某个线程调用fork()，子进程会有和父进程相同数量的线程吗?
父进程被加锁的互斥锁fork()后，再子进程中是否已经加锁？

我们写一段代码来看看

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<string.h>
#include<assert.h>
#include<pthread.h>

//linux 没有线程概念 由进程实现线程
//fork  设计上不会复制所有进程 只复制线程
void* fun(void* arg)

	int i = 0;
	for(;i<5;i++)
	
		printf("fun run pid=%d\\n",getpid());
		sleep(1);
	

int main()

	pthread_t id;
	pthread_create(&id,NULL,fun,NULL);

	fork(); //父进程有两个线程 子进程有一个线程
	int i = 0;	
	for(;i<5;i++)
	
		printf("man run pid=%d\\n",getpid());
		sleep(1);

执行查看

主线程有两条路径，而子线程只有一条，fork()是从执行那一行开始复制的
那么我将fork()添加到子线程中来看看

执行查看，发现子线程有两条路径，且pid不同，这是fork()设计上的问题，是因为再设计多线程程序的时候考虑到将所有的线程或者锁都进行复制会发生难以想象的复杂

所以上面的两个问题

某个线程发生fork()，子进程只会有一条执行路径
父进程被加锁，属于父进程的资源所以父进程有什么子进程也会有什么，而锁的状态是根据fork()那一刻父进程锁的状态而决定的，父进程加锁则子进程加锁，为防止出现因多线程产生的复杂情况，我们再fork()前，加锁一个双重锁，在没有人用锁的时候来进行复制

以上是关于Linux线程的创建与同步的主要内容，如果未能解决你的问题，请参考以下文章