MapReduce Shuffle Explained
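MapReduce is Hadoop's native computation framework. In short: its processes are heavyweight and slow to start, but it can carry very large volumes of data; compared with Spark's micro-batch processing and Flink's real-time processing, it is slow. Shuffle is something everyone who writes MR jobs has to master. It is not especially hard, but it is not trivial either.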
A MapReduce program has five stages:
- input
- map
- shuffle
- reduce
- output
Of these, the shuffle stage gets the emphasis here for a simple reason: it matters a great deal.
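1. What the Shuffle process does:
1. Partitioning:
- Decides which Reducer handles the current key; records with the same key are handled by the same Reducer
- By default, the key's hash value is taken modulo the number of Reduce tasks (source code below)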
    public int getPartition(K2 key, V2 value, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
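The default rule quoted above can be tried outside Hadoop. What follows is a minimal sketch in plain Java, not Hadoop's own class: the class name `HashPartitionSketch`, the sample keys and the reducer count of 3 are assumptions made for illustration. It also shows why the hash is masked with `Integer.MAX_VALUE`: `hashCode()` may be negative, and Java's `%` operator would then return a negative partition number.

```java
import java.util.List;

class HashPartitionSketch {
    // Same arithmetic as the getPartition method quoted above:
    // clear the sign bit, then take the remainder by the reducer count.
    static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 3; // hypothetical reducer count
        for (String key : List.of("hadoop", "spark", "flink", "hadoop")) {
            // Equal keys always print the same partition number.
            System.out.println(key + " -> partition " + getPartition(key, numReduceTasks));
        }
        // An Integer's hashCode is its own value, so -7 hashes to -7.
        System.out.println("unmasked: " + (-7 % numReduceTasks));                       // negative remainder
        System.out.println("masked:   " + getPartition(Integer.valueOf(-7), numReduceTasks)); // valid partition
    }
}
```

Because the result depends only on the key's hash and the reducer count, every record with the same key lands on the same Reducer, which is exactly the guarantee partitioning is meant to give.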
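2. Grouping:
- Merges the values that share the same key
- Records whose keys are equal are placed in the same group
- In a MapReduce job, the map method is called once per input line (record), and the reduce method is called once per distinct key
3. Sorting: keys are sorted in dictionary (lexicographic) order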
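The three functions above (partitioning, grouping, sorting) can be sketched together in a few lines of plain Java. This is only an in-memory illustration under stated assumptions, not how Hadoop implements the shuffle: the class name `ShuffleFunctionsSketch`, the word-count style sample data and the use of a `TreeMap` per partition are all choices made for the example. A `TreeMap` keeps `String` keys in lexicographic order and holds one value list per key, so it stands in for both sorting and grouping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ShuffleFunctionsSketch {
    // Returns one sorted, grouped map per reducer: key -> all values for that key.
    static List<TreeMap<String, List<Integer>>> shuffle(
            List<Map.Entry<String, Integer>> mapOutput, int numReduceTasks) {
        List<TreeMap<String, List<Integer>>> partitions = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            partitions.add(new TreeMap<>());
        }
        for (Map.Entry<String, Integer> kv : mapOutput) {
            // Partitioning: hash of the key modulo the number of reducers.
            int p = (kv.getKey().hashCode() & Integer.MAX_VALUE) % numReduceTasks;
            // Grouping: append the value to the list kept for this key.
            partitions.get(p).computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Hypothetical map output of a word count: one (word, 1) pair per word.
        List<Map.Entry<String, Integer>> mapOutput = List.of(
                Map.entry("spark", 1), Map.entry("hadoop", 1),
                Map.entry("flink", 1), Map.entry("hadoop", 1));
        List<TreeMap<String, List<Integer>>> partitions = shuffle(mapOutput, 2);
        for (int p = 0; p < partitions.size(); p++) {
            // Reduce is invoked once per distinct key, with all of its values.
            for (Map.Entry<String, List<Integer>> group : partitions.get(p).entrySet()) {
                int sum = group.getValue().stream().mapToInt(Integer::intValue).sum();
                System.out.println("reducer " + p + ": " + group.getKey() + " = " + sum);
            }
        }
    }
}
```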
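2. How the Shuffle process works in detail:
1. Map-side Shuffle:
- Spill: spilling to disk
  - The output of every Map goes into a ring buffer (in memory, 100 MB by default). The ring buffer is worth studying on its own and is not covered in detail here.
  - Partitioning: every key-value pair is assigned to a partition and tagged with it
  - Sorting: data in the same partition is sorted within that partition
  - When the ring buffer is 80% full, the partitioned and sorted data is written to disk as a file; in the end this produces multiple small files
- Merge: merging
  - The small files produced by the spills are merged
  - Data belonging to the same partition is sorted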
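The spill and merge steps above can be illustrated with a small in-memory model. This is a rough sketch under loud assumptions, not Hadoop's implementation: the buffer here counts records rather than bytes, spills are kept as in-memory lists instead of disk files, and the names `MapSideSpillSketch`, `Rec`, `spill` and `merge` are hypothetical. In Hadoop itself the buffer size and spill threshold are governed by the `mapreduce.task.io.sort.mb` and `mapreduce.map.sort.spill.percent` properties.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class MapSideSpillSketch {
    // One buffered map output record, already tagged with its partition.
    static class Rec {
        final int partition;
        final String key;
        final int value;
        Rec(int partition, String key, int value) {
            this.partition = partition;
            this.key = key;
            this.value = value;
        }
    }

    // Order used for spills and for the merge: partition first, then key.
    static final Comparator<Rec> ORDER =
            Comparator.comparingInt((Rec r) -> r.partition).thenComparing(r -> r.key);

    // Buffer records; whenever the buffer is 80% full, sort it and spill it as one run.
    static List<List<Rec>> spill(List<Rec> mapOutput, int bufferCapacity) {
        int threshold = bufferCapacity * 80 / 100;
        List<List<Rec>> spills = new ArrayList<>();
        List<Rec> buffer = new ArrayList<>();
        for (Rec r : mapOutput) {
            buffer.add(r);
            if (buffer.size() >= threshold) {
                buffer.sort(ORDER);
                spills.add(buffer);
                buffer = new ArrayList<>();
            }
        }
        if (!buffer.isEmpty()) { // final flush when the map task finishes
            buffer.sort(ORDER);
            spills.add(buffer);
        }
        return spills;
    }

    // K-way merge of the sorted spill runs into one sorted run.
    static List<Rec> merge(List<List<Rec>> spills) {
        // Each queue entry is {spill index, position inside that spill}.
        PriorityQueue<int[]> heads = new PriorityQueue<>(
                (a, b) -> ORDER.compare(spills.get(a[0]).get(a[1]), spills.get(b[0]).get(b[1])));
        for (int i = 0; i < spills.size(); i++) {
            if (!spills.get(i).isEmpty()) heads.add(new int[] {i, 0});
        }
        List<Rec> merged = new ArrayList<>();
        while (!heads.isEmpty()) {
            int[] head = heads.poll();
            List<Rec> run = spills.get(head[0]);
            merged.add(run.get(head[1]));
            if (head[1] + 1 < run.size()) heads.add(new int[] {head[0], head[1] + 1});
        }
        return merged;
    }

    public static void main(String[] args) {
        String[] keys = {"d", "b", "a", "c", "f", "e"}; // hypothetical map output keys
        List<Rec> mapOutput = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            mapOutput.add(new Rec(i % 2, keys[i], 1)); // partition tag chosen arbitrarily
        }
        List<List<Rec>> spills = spill(mapOutput, 5);
        System.out.println("spill files: " + spills.size());
        for (Rec r : merge(spills)) {
            System.out.println("partition " + r.partition + " key " + r.key);
        }
    }
}
```

Since every spill is already sorted, the merge only needs to compare the current head of each run, which is why a priority queue over the run heads is enough.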
(Map task ends) The ApplicationMaster is notified, and the Reducers actively pull the data.
2. Reduce-side Shuffle:
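- Multiple threads are started to pull, from every machine, the Map output data that belongs to this Reducer's partition
- Merge:
  - The data for this partition from each Map task's output is merged
  - The complete data set for this partition is sorted
- Grouping: the values of identical keys are merged
3. Optimizing the MapReduce Shuffle: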
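- Combiner: merging
  - A merge is performed ahead of time in the map stage; generally this amounts to running the reduce early, which lowers the load on the reduce stage
  - Not every program is suited to a combiner: the operation has to give the same result when applied to partial groups, as a sum does and a plain average does not
- Compress: compression
  - Can greatly reduce disk and network IO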
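  - Setting up compression in Hadoop:
    - hadoop checknative shows which compression formats are supported natively
    - Common compression formats: snappy, lzo, lz4
    - To change the natively supported compression formats: replace lib/native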
  - Where a MapReduce program can apply compression:
    - Input
    - Intermediate map output (both properties must be specified together)
      - mapreduce.map.output.compress
      - mapreduce.map.output.compress.codec (default: DefaultCodec)
    - Reduce output
      - mapreduce.output.fileoutputformat.compress
      - mapreduce.output.fileoutputformat.compress.codec
  - How to set compression:
    - In the cluster configuration files
    - On the conf object, which takes effect only for the current program
    - As runtime parameters: -Dmapreduce.output.fileoutputformat.compress=true ….
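To make the runtime-parameter option concrete, the sketch below assembles the four compression properties listed above into `-D` flags. It is plain Java with no Hadoop dependency, written only for illustration: the class name `CompressionFlagsSketch` and its helper methods are hypothetical, and the choice of `org.apache.hadoop.io.compress.SnappyCodec` as the codec is an assumption (any codec the cluster supports could be substituted). Note also that `-D` flags are only picked up when the job driver parses generic options, for example by running through `ToolRunner`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CompressionFlagsSketch {
    // Property names are the ones listed above; the codec class is the caller's choice.
    static Map<String, String> compressionProperties(String codecClass) {
        Map<String, String> props = new LinkedHashMap<>();
        // Intermediate map output: switch and codec are set together.
        props.put("mapreduce.map.output.compress", "true");
        props.put("mapreduce.map.output.compress.codec", codecClass);
        // Final reduce output.
        props.put("mapreduce.output.fileoutputformat.compress", "true");
        props.put("mapreduce.output.fileoutputformat.compress.codec", codecClass);
        return props;
    }

    // Renders each property as a -Dname=value command-line flag.
    static List<String> toDFlags(Map<String, String> props) {
        List<String> flags = new ArrayList<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            flags.add("-D" + e.getKey() + "=" + e.getValue());
        }
        return flags;
    }

    public static void main(String[] args) {
        // Assumed codec, chosen only for the example.
        String codec = "org.apache.hadoop.io.compress.SnappyCodec";
        System.out.println(String.join(" ", toDFlags(compressionProperties(codec))));
    }
}
```

The same name/value pairs could instead be written into the cluster configuration files or set on the job's conf object, which are the other two options in the list above.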