Coding Series: MapReduce
Posted ronething
tags:
💬
It's really hot.
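💻
Recently I came across a site that reviews open courses from top universities [1] and found the MIT 6.824 distributed systems course there. I already knew a little about it, but I had heard the labs were hard and I didn't have much time, so I never studied it. Seeing it again, I decided to give it a try. Well, if I can't keep up, I'll just drop it.
The main focus of the course is completing the labs. The first lab is MapReduce [2], which should be the easiest of all the labs (I still spent quite a lot of time on it and also looked at someone else's approach, which is rather embarrassing). Completing this lab roughly requires the following knowledge: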
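1. Basic Go syntax, a basic understanding of writing RPC services with the standard library, file operations, and so on.
2. Basic linux/git knowledge. You can also read the lab guidance [3], which describes what you need to know for the course.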
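Development environment requirements:
1. Linux is best, because the test script test-mr.sh uses Linux commands such as timeout, which are not available on macOS.
2. I wrote the code on macOS, tested it myself there, and then ran it on Linux.
The overall workflow can follow the lab description:
- Read the MapReduce paper (and, of course, attend the lectures): the original paper [4] or its Chinese translation [5].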
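- Build and run the mapreduce word count as a test. The lab's sequential version runs mrsequential.go; the commands below build the plugin and start the coordinator: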
$ go build -race -buildmode=plugin ../mrapps/wc.go
$ rm mr-out*
$ go run -race mrcoordinator.go pg-*.txt
- Implement mr/coordinator.go, mr/worker.go, and mr/rpc.go so that the tests pass.
This is where we actually need to fill in code. I drew a simple diagram.
The gist is:
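1. coordinator.go implements the rpc server and needs to provide the AskForTask and SubmitTask methods for workers to call.
AskForTask: fetches a Task, which can be either a Map task or a Reduce task.
SubmitTask: a worker calls this method to submit a Task after it has finished executing it.
2. The worker implements the DoMap and DoReduce methods. DoMap additionally uses a saveIntermediate method to store intermediate results, and DoReduce uses a loadIntermediate method to load them.
3. Three structs are involved: Task, Coordinator, and worker.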
// Task can be either a Map task or a Reduce task.
type Task struct {
    Index       int
    Filename    string
    Types       TaskType
    Completed   bool // whether the task has completed
    Distributed bool // whether the task has been handed out
}
// Coordinator corresponds to the master role in the paper.
type Coordinator struct {
    mu          sync.Mutex
    finished    bool
    mapTasks    []*Task
    reduceTasks []*Task
    mapCount    int
    reduceCount int
}
// worker: mapf and reducef correspond to the two funcs in the plugin.
type worker struct {
    mapf    func(string, string) []KeyValue
    reducef func(string, []string) string
}
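To make the saveIntermediate / loadIntermediate step more concrete, here is a minimal sketch. The post only names the two methods, so the signatures, the `dir` parameter, the `intermediateName` and `demo` helpers, the JSON encoding, and the `mr-<map>-<reduce>` file naming are all assumptions for illustration, not the author's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
	"io"
	"os"
	"path/filepath"
)

// KeyValue mirrors the pair type returned by the plugin's map function.
type KeyValue struct {
	Key   string
	Value string
}

// ihash maps a key to a non-negative int used to choose a reduce partition.
func ihash(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() & 0x7fffffff)
}

// intermediateName is a hypothetical helper: <dir>/mr-<mapIndex>-<reduceIndex>.
func intermediateName(dir string, mapIndex, reduceIndex int) string {
	return filepath.Join(dir, fmt.Sprintf("mr-%d-%d", mapIndex, reduceIndex))
}

// saveIntermediate (assumed signature) partitions kvs by key hash and writes
// one JSON-encoded file per reduce task, including empty ones.
func saveIntermediate(dir string, mapIndex, nReduce int, kvs []KeyValue) error {
	buckets := make([][]KeyValue, nReduce)
	for _, kv := range kvs {
		r := ihash(kv.Key) % nReduce
		buckets[r] = append(buckets[r], kv)
	}
	for r, bucket := range buckets {
		f, err := os.Create(intermediateName(dir, mapIndex, r))
		if err != nil {
			return err
		}
		enc := json.NewEncoder(f)
		for _, kv := range bucket {
			if err := enc.Encode(kv); err != nil {
				f.Close()
				return err
			}
		}
		if err := f.Close(); err != nil {
			return err
		}
	}
	return nil
}

// loadIntermediate (assumed signature) reads the files that every map task
// wrote for one reduce partition.
func loadIntermediate(dir string, reduceIndex, nMap int) ([]KeyValue, error) {
	var kvs []KeyValue
	for m := 0; m < nMap; m++ {
		f, err := os.Open(intermediateName(dir, m, reduceIndex))
		if err != nil {
			return nil, err
		}
		dec := json.NewDecoder(f)
		for {
			var kv KeyValue
			if err := dec.Decode(&kv); err == io.EOF {
				break
			} else if err != nil {
				f.Close()
				return nil, err
			}
			kvs = append(kvs, kv)
		}
		f.Close()
	}
	return kvs, nil
}

// demo saves the output of two map tasks and counts the pairs loaded back.
func demo() (int, error) {
	dir, err := os.MkdirTemp("", "mr-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	inputs := [][]KeyValue{{{"a", "1"}, {"b", "1"}}, {{"a", "1"}, {"c", "1"}}}
	for m, kvs := range inputs {
		if err := saveIntermediate(dir, m, 3, kvs); err != nil {
			return 0, err
		}
	}
	total := 0
	for r := 0; r < 3; r++ {
		kvs, err := loadIntermediate(dir, r, len(inputs))
		if err != nil {
			return 0, err
		}
		total += len(kvs)
	}
	return total, nil
}

func main() {
	fmt.Println(demo())
}
```

Because the partition is chosen from the key alone, every occurrence of a key ends up in the same reduce partition regardless of which map task produced it. A sturdier version would probably write to a temporary file and rename it, so a crashed worker never leaves a half-written file behind.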
4. Some details
- Reduce tasks are processed only after all map tasks have finished.
- The whole job is considered complete when reduceCount >= len(reduceTasks):
    if c.reduceCount >= len(c.reduceTasks) {
        c.finished = true // the whole job is done
    }
- If the task a worker gets is nil, it should wait a while and then request a task again. A nil task can mean that all map tasks have been handed out but the reduce tasks cannot start yet, or that every task has been handed out but not all of them have completed.
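The scheduling rules above can be sketched roughly as follows. This is only an illustration built on the Task and Coordinator structs from the post: the message types (`AskArgs`, `AskReply`, `SubmitArgs`, `SubmitReply`), the `newCoordinator` constructor, and the `pick` helper are hypothetical names. The methods use the net/rpc method shape but are called directly here rather than over a socket, and re-assigning tasks from crashed workers is left out.

```go
package main

import (
	"fmt"
	"sync"
)

type TaskType int

const (
	MapTask TaskType = iota
	ReduceTask
)

type Task struct {
	Index       int
	Filename    string
	Types       TaskType
	Completed   bool
	Distributed bool
}

type Coordinator struct {
	mu          sync.Mutex
	finished    bool
	mapTasks    []*Task
	reduceTasks []*Task
	mapCount    int
	reduceCount int
}

// Hypothetical RPC message types; the names are assumptions.
type AskArgs struct{}
type AskReply struct{ Task *Task }
type SubmitArgs struct {
	Types TaskType
	Index int
}
type SubmitReply struct{}

// newCoordinator is a hypothetical constructor: one map task per input file.
func newCoordinator(files []string, nReduce int) *Coordinator {
	c := &Coordinator{}
	for i, f := range files {
		c.mapTasks = append(c.mapTasks, &Task{Index: i, Filename: f, Types: MapTask})
	}
	for i := 0; i < nReduce; i++ {
		c.reduceTasks = append(c.reduceTasks, &Task{Index: i, Types: ReduceTask})
	}
	return c
}

// pick marks the first undistributed task as handed out and returns a copy,
// or nil when every task in the slice has already been handed out.
func pick(tasks []*Task) *Task {
	for _, t := range tasks {
		if !t.Distributed {
			t.Distributed = true
			cp := *t
			return &cp
		}
	}
	return nil
}

// AskForTask hands out map tasks first, and reduce tasks only once every
// map task has been submitted as completed.
func (c *Coordinator) AskForTask(args *AskArgs, reply *AskReply) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.mapCount < len(c.mapTasks) {
		reply.Task = pick(c.mapTasks) // nil: maps all handed out, some unfinished
	} else {
		reply.Task = pick(c.reduceTasks) // nil: reduces all handed out
	}
	return nil
}

// SubmitTask records a completed task once and updates the finished flag.
func (c *Coordinator) SubmitTask(args *SubmitArgs, reply *SubmitReply) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	tasks, count := c.mapTasks, &c.mapCount
	if args.Types == ReduceTask {
		tasks, count = c.reduceTasks, &c.reduceCount
	}
	if args.Index < 0 || args.Index >= len(tasks) {
		return fmt.Errorf("unknown task index %d", args.Index)
	}
	if t := tasks[args.Index]; !t.Completed {
		t.Completed = true
		*count++
	}
	if c.reduceCount >= len(c.reduceTasks) {
		c.finished = true // the whole job is done
	}
	return nil
}

func main() {
	c := newCoordinator([]string{"pg-a.txt", "pg-b.txt"}, 1)
	var r1, r2, r3, r4 AskReply
	c.AskForTask(&AskArgs{}, &r1)
	c.AskForTask(&AskArgs{}, &r2)
	c.AskForTask(&AskArgs{}, &r3)
	fmt.Println("third ask returned nil:", r3.Task == nil)
	c.SubmitTask(&SubmitArgs{Types: MapTask, Index: 0}, &SubmitReply{})
	c.SubmitTask(&SubmitArgs{Types: MapTask, Index: 1}, &SubmitReply{})
	c.AskForTask(&AskArgs{}, &r4)
	fmt.Println("got a reduce task:", r4.Task != nil && r4.Task.Types == ReduceTask)
	c.SubmitTask(&SubmitArgs{Types: ReduceTask, Index: 0}, &SubmitReply{})
	fmt.Println("finished:", c.finished)
}
```

On the worker side, a nil `reply.Task` would be the signal to sleep briefly (for example with `time.Sleep`) and call AskForTask again, as described above.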
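5. Some pitfalls (things to know)
- The fields of Task need to be exported, because the struct is transmitted through the rpc service; unexported fields cause problems.
- Watch out for permissions on the rpc socket file:
    func coordinatorSock() string {
        s := "/tmp/824-mr-" // the default /var/tmp directory is not writable without sudo, so I changed it to /tmp
        s += strconv.Itoa(os.Getuid())
        return s
    }
- Using fmt.Sscanf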
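Two of these pitfalls are easy to reproduce in isolation. The sketch below is an illustration, not code from the lab: `wireTask`, `roundTrip`, and `parseIntermediateName` are hypothetical names, and the post does not say what fmt.Sscanf was used for, so parsing a file name of the form `mr-<map>-<reduce>` is an assumed use case. The first half relies on the fact that net/rpc encodes arguments with encoding/gob by default, and gob only transmits exported fields.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// wireTask has one exported and one unexported field, to show which of
// them survives a gob round trip.
type wireTask struct {
	Index    int
	filename string // unexported: gob does not transmit it
}

// roundTrip encodes and then decodes a value with gob, the default
// encoding used by net/rpc.
func roundTrip(in wireTask) (wireTask, error) {
	var buf bytes.Buffer
	var out wireTask
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		return out, err
	}
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

// parseIntermediateName extracts both indexes from a name that follows the
// assumed "mr-<map>-<reduce>" pattern.
func parseIntermediateName(name string) (mapIndex, reduceIndex int, err error) {
	_, err = fmt.Sscanf(name, "mr-%d-%d", &mapIndex, &reduceIndex)
	return mapIndex, reduceIndex, err
}

func main() {
	out, err := roundTrip(wireTask{Index: 3, filename: "pg-a.txt"})
	fmt.Printf("Index=%d filename=%q err=%v\n", out.Index, out.filename, err)

	m, r, err := parseIntermediateName("mr-3-7")
	fmt.Println(m, r, err)
}
```

After the round trip `Index` is still 3 but `filename` comes back empty, which is the kind of silent data loss an unexported Task field causes over rpc.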
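PS: I borrowed the approach of a more experienced developer; later I plan to rewrite it as a version that uses channels.
The first version is already up at cloud-org/6.824 [6]. In principle the course does not allow publishing solution code, but I believe MIT students are far better than I am and have no need to look at my code. Well, I'll write about other points when I think of them.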
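🎸
Last weekend I practiced a few lessons from 《音乐人吉他课》.
💽
Last week I bought Sodagreen's (苏打绿) 《夏/狂热》 and 《小宇宙》. As everyone knows, 《夏/狂热》 is already hard to find; the copy I got has a damaged outer case, but that doesn't affect CD playback (it also comes with the lyric booklet and so on). 《小宇宙》 is the 2014 reissue, but I'm happy just to have found it. I especially like the song 《无眠》.
📺
Lately I've also been listening to The Chairs (椅子乐团) a lot, and I found that the Taiwanese drama 《若是一个人》 features many of their songs. It's quite good.
🌞
Well, that was fun. Off to bed early.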
References
[1] Reviews of open courses from top universities: https://conanhujinming.github.io/comments-for-awesome-courses/MIT6.824%E5%88%86%E5%B8%83%E5%BC%8F%E7%B3%BB%E7%BB%9F.html
[2] MapReduce: https://pdos.csail.mit.edu/6.824/labs/lab-mr.html
[3] lab guidance: https://pdos.csail.mit.edu/6.824/labs/guidance.html
[4] Original paper: http://research.google.com/archive/mapreduce-osdi04.pdf
[5] Chinese translation of the paper: https://github.com/pirDOL/kaka/blob/master/Papers/MapReduce-Simplified-Data-Processing-on-Large-Clusters.md
[6] cloud-org/6.824: https://github.com/cloud-org/6.824