Python Web Crawler, Part 2
Posted by Nice1949
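I. Multiprocessing
1. The fork method (os module, for Unix/Linux systems)
The fork method: called once, returns twice. The reason is that the operating system makes a copy (the child process) of the current process (the parent process). The two processes are almost identical, and fork returns in both of them: in the child the return value is 0, and in the parent it is the child's process ID.
An ordinary function: called once, returns once.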
    import os

    if __name__ == '__main__':
        # getpid() returns the ID of the current process
        print('current Process (%s) start ....' % os.getpid())
        pid = os.fork()
        # In Python, os.fork() raises OSError on failure rather than
        # returning a negative value, so this first branch is only a safeguard
        if pid < 0:
            print('error in fork')
        elif pid == 0:
            print('I am child process (%s) and my parent process is (%s)' % (os.getpid(), os.getppid()))
        else:
            print('I (%s) created a child process (%s).' % (os.getpid(), pid))

Output:

    current Process (3052) start ....
    I (3052) created a child process (3053).
    I am child process (3053) and my parent process is (3052)
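In the fork example above the parent does not wait for the child, so the two processes finish independently and their lines can appear in either order. The following is a minimal sketch, assuming a Unix-like system, of how the parent could wait for the child with os.waitpid and read its exit code. The helper name fork_and_wait and the exit code 7 are illustrative assumptions and are not part of the original post.

```python
import os


def fork_and_wait(exit_code=7):
    """Fork a child that exits with exit_code.

    Returns (child_pid, waited_pid, code) as seen by the parent.
    Hypothetical helper for illustration; requires a Unix-like OS.
    """
    pid = os.fork()
    if pid == 0:
        # Child branch: fork() returned 0 here.
        # os._exit() ends the child at once, skipping cleanup handlers
        # inherited from the parent.
        os._exit(exit_code)
    # Parent branch: fork() returned the child's PID.
    waited_pid, status = os.waitpid(pid, 0)  # block until the child terminates
    # WIFEXITED / WEXITSTATUS decode the raw wait status
    code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
    return pid, waited_pid, code


if __name__ == '__main__':
    child_pid, waited_pid, code = fork_and_wait()
    print('child %s exited with code %s' % (child_pid, code))
```

The single call to os.fork() still returns twice: the child takes the pid == 0 branch and exits, while the parent continues past it and collects the child's status.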
2. multiprocessing (cross-platform)
    import os
    # Import the Process class from the multiprocessing module
    from multiprocessing import Process

    # Code executed by each child process
    def run_proc(name):
        print('Child process %s (%s) Running...' % (name, os.getpid()))

    if __name__ == '__main__':
        print('Parent process %s.' % os.getpid())
        processes = []
        for i in range(5):
            p = Process(target=run_proc, args=(str(i),))
            print('Process will start.')
            # start() launches the process
            p.start()
            processes.append(p)
        # join() waits for a process to finish; joining every child ensures
        # that 'Process end' is printed only after all of them have exited
        for p in processes:
            p.join()
        print('Process end')

Output (the order of the child lines can vary between runs):

    Parent process 2392.
    Process will start.
    Process will start.
    Process will start.
    Process will start.
    Process will start.
    Child process 2 (10748) Running...
    Child process 0 (5324) Running...
    Child process 1 (3196) Running...
    Child process 3 (4680) Running...
    Child process 4 (10696) Running...
    Process end