WEKA Tutorial (classic tutorial, reposted)


鐩綍 
1. 绠€浠?br />2. 鏁版嵁鏍煎紡
3.鏁版嵁鍑嗗
4. 鍏宠仈瑙勫垯锛堣喘鐗╃鍒嗘瀽锛?br />5. 鍒嗙被涓庡洖褰?br />6. 鑱氱被鍒嗘瀽

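1. Introduction

WEKA stands for Waikato Environment for Knowledge Analysis. Its source code is available from http://www.cs.waikato.ac.nz/ml/weka. The weka is also a bird native to New Zealand, which is where WEKA's main developers come from.
WEKA is an open data mining workbench that brings together a large number of machine learning algorithms for data mining tasks: data preprocessing, classification, regression, clustering, association rules, and visualization through its newer interactive interfaces.
If you want to implement your own data mining algorithm, take a look at WEKA's API documentation. Integrating your own algorithm into WEKA, or even borrowing its approach to build your own visualization tools, is not particularly difficult.

In August 2005, at the 11th ACM SIGKDD International Conference, the Weka group at the University of Waikato received the highest service award in the field of data mining and knowledge discovery. The Weka system has been widely recognized, hailed as a milestone in the history of data mining and machine learning, and is one of the most complete data mining tools available today (with 11 years of development behind it at the time). Weka is downloaded more than ten thousand times per month.
--Adapted from http://www.china-pub.com/computers/common/info.asp?id=29304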

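2. Data Format

You cannot cook without rice, so we first look at the format WEKA expects its data to be in.
Like many spreadsheet or data analysis packages, WEKA works on a data set that is a two-dimensional table like the one in Figure 1.
Figure 1
Some WEKA terminology is needed here. A row of the table is called an instance, which corresponds to a sample in statistics or a record in a database. A column is called an attribute, which corresponds to a variable in statistics or a field in a database. From WEKA's point of view, such a table, or data set, expresses a relation among the attributes. Figure 1 has 14 instances and 5 attributes, and the relation is named "weather".
WEKA stores data in ARFF (Attribute-Relation File Format) files, which are ASCII text files. The table in Figure 1 is stored in the following ARFF file. This is the "weather.arff" file that ships with WEKA; it can be found in the "data" subdirectory of the WEKA installation directory.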

鎶€鏈垎浜? src=
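Note that when this file is opened in Windows Notepad, the lines may not break correctly because of differing line-ending conventions. A text editor such as UltraEdit is recommended for viewing ARFF files.
The contents of the file are explained below.
ARFF files are parsed line by line, so lines must not be broken arbitrarily. Blank lines (or lines containing only spaces) are ignored.
Lines beginning with "%" are comments and are ignored by WEKA. It makes no difference if your copy of "weather.arff" has more or fewer "%" lines.
Apart from comments, an ARFF file has two parts. The first part is the header information, which contains the relation declaration and the attribute declarations. The second part is the data information, that is, the data in the data set. Everything after the "@data" marker is data information.
Relation declaration
The relation name is defined on the first effective line of the ARFF file, in the form
@relation <relation-name>
<relation-name> is a string. If the string contains spaces, it must be quoted (with ASCII single or double quotes).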

Attribute declarations
Attributes are declared with a list of statements beginning with "@attribute". Every attribute in the data set has its own "@attribute" statement defining its name and data type.
The order of these statements matters. First, it determines the attribute's position in the data section. For example, "humidity" is the third attribute declared, so in the comma-separated columns of the data section the third column, 85 90 86 96 ..., holds the "humidity" values. Second, the last attribute declared is called the class attribute; it is the default target variable in classification and regression tasks.
The format of an attribute declaration is
@attribute <attribute-name> <datatype>
where <attribute-name> is a string that must begin with a letter. As with the relation name, it must be quoted if it contains spaces.
WEKA supports four kinds of <datatype>:
numeric ------------------------- numeric
<nominal-specification> ----- nominal
string ---------------------------- string
date [<date-format>] -------- date and time
<nominal-specification> and <date-format> are explained below. The types "integer" and "real" can also be used, but WEKA treats both as "numeric". The keywords "integer", "real", "numeric", "date" and "string" are case-insensitive, as are "relation", "attribute" and "data".
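Numeric attributes
Numeric attributes may be integers or real numbers, but WEKA treats them all as real numbers.

Nominal attributes
A nominal attribute is defined by a <nominal-specification> that lists the possible category names inside curly braces: {<nominal-name1>, <nominal-name2>, <nominal-name3>, ...}. The value of the attribute in the data set must be one of these categories.
For example, the following declaration says that the "outlook" attribute has three categories, "sunny", "overcast" and "rainy", and the "outlook" value of every instance must be one of the three.
@attribute outlook {sunny, overcast, rainy}
Category names that contain spaces must also be quoted.

String attributes
String attributes can contain arbitrary text. This type of attribute is very useful in text mining.
Example:
@ATTRIBUTE LCC string

Date and time attributes
Date and time attributes all use the "date" type, declared as
@attribute <name> date [<date-format>]
where <name> is the attribute name and <date-format> is a string specifying how date or time values are parsed and displayed. The default is the combined date and time format given by ISO-8601, "yyyy-MM-dd'T'HH:mm:ss".
Date strings in the data section must conform to the format given in the declaration (an example follows later in this section).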

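Data information
In the data section, the "@data" marker occupies a line of its own; the rest is the data of the individual instances.

Each instance occupies one line, with its attribute values separated by commas. A missing value is written as a question mark "?", and the question mark cannot be omitted. For example: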
@data 
sunny,85,85,FALSE,no 
?,78,90,?,yes 
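
The header and data rules described so far can be sketched in a few lines of Python. This is only an illustration of the file layout, not WEKA's own loader: it handles unquoted names and dense rows only, the function name `parse_arff` is made up for this example, and the attribute declarations in the sample are an assumed reconstruction of the weather data (only "outlook" and the position of "humidity" are given in the text).

```python
def parse_arff(text):
    """Parse a tiny subset of ARFF: unquoted names, dense rows, no quoting."""
    relation, attributes, rows = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # blank lines and comment lines are ignored
        lower = line.lower()  # the @-keywords are matched case-insensitively
        if in_data:
            # "?" marks a missing value; represent it as None
            rows.append([None if v.strip() == "?" else v.strip()
                         for v in line.split(",")])
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, datatype = line.split(None, 2)
            attributes.append((name, datatype))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


# Assumed, abbreviated sample; the two data rows are the ones quoted above.
SAMPLE = """\
% abbreviated weather data (illustrative excerpt)
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,85,85,FALSE,no
?,78,90,?,yes
"""

relation, attributes, rows = parse_arff(SAMPLE)
class_name = attributes[-1][0]        # the last declared attribute is the class
humidity = [row[2] for row in rows]   # third declared attribute -> third column
print(relation, class_name, humidity, rows[1])
```

The declaration order is what ties a column to its attribute, which is why the sketch keeps `attributes` as an ordered list rather than a dictionary.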


瀛楃涓插睘鎬у拰鍒嗙被灞炴€х殑鍊兼槸鍖哄垎澶у皬鍐欑殑銆傝嫢鍊间腑鍚湁绌烘牸锛屽繀椤昏寮曞彿鎷捣鏉ャ€備緥濡傦細 
@relation LCCvsLCSH 
  @attribute LCC string 
  @attribute LCSH string 
  @data 
  AG5, 鈥楨ncyclopedias and dictionaries.;Twentieth century.鈥?nbsp;
  AS262, 鈥楽cience -- Soviet Union -- History.鈥?nbsp;


Date values must match the format given in the attribute declaration. For example:
@RELATION Timestamps 
  @ATTRIBUTE timestamp DATE "yyyy-MM-dd HH:mm:ss" 
  @DATA 
  "2001-04-03 12:12:12" 
  "2001-05-03 12:59:55" 

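To check whether a date value conforms to a declared format outside WEKA, the Java-style pattern can be translated to Python's `strptime` directives. The sketch below is an assumption-laden illustration: the mapping table covers only the six pattern letters used in the Timestamps example, and the helper names `to_strptime` and `matches` are invented here.

```python
from datetime import datetime

# Assumed mapping from the Java-style pattern letters used in the ARFF
# declaration to Python strptime directives (only the letters needed here).
JAVA_TO_STRPTIME = [("yyyy", "%Y"), ("MM", "%m"), ("dd", "%d"),
                    ("HH", "%H"), ("mm", "%M"), ("ss", "%S")]


def to_strptime(java_pattern):
    """Translate a simple Java date pattern into a strptime format string."""
    out = java_pattern
    for java, py in JAVA_TO_STRPTIME:
        out = out.replace(java, py)
    return out


def matches(value, fmt):
    """Return True if the value parses under the given strptime format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


fmt = to_strptime("yyyy-MM-dd HH:mm:ss")
values = ['"2001-04-03 12:12:12"', '"2001-05-03 12:59:55"']
parsed = [datetime.strptime(v.strip('"'), fmt) for v in values]
print(fmt, parsed)
```

A value written in any other layout, such as "2001/04/03", fails the `matches` check, which mirrors the requirement that data values follow the declared format.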
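Sparse data
Sometimes a data set contains a large number of zero values (in market basket analysis, for example), and storing the data in sparse format saves space.
The sparse format only affects how an instance is written in the data section; the rest of the ARFF file does not change. Consider the following data:
@data
0, X, 0, Y, "class A"
0, 0, W, 0, "class B"
In sparse format this becomes
@data
{1 X, 3 Y, 4 "class A"}
{2 W, 4 "class B"}
Each instance is enclosed in curly braces. Every non-zero attribute value is written as <index> <space> <value>, where <index> is the attribute's index, counting from 0, and <value> is the attribute value. The values are still separated by commas. The values of an instance must be listed in attribute order: {1 X, 3 Y, 4 "class A"} is valid, but {3 Y, 1 X, 4 "class A"} is not.
Note that values not listed in sparse format are not missing values; they are zeros. A missing value must be written explicitly with a question mark.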
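
The conversion between the dense and sparse forms can be sketched as follows. This is a simplified illustration rather than WEKA's parser: the function names are invented for the example, and the naive comma split assumes that no quoted value contains a comma.

```python
def sparse_to_dense(line, num_attributes):
    """Expand one sparse instance such as {1 X, 3 Y, 4 "class A"}."""
    body = line.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("sparse instances are enclosed in curly braces")
    dense = ["0"] * num_attributes  # unlisted values are 0, not missing
    last_index = -1
    for item in body[1:-1].split(","):  # naive: no commas inside quoted values
        if not item.strip():
            continue
        index_text, value = item.strip().split(None, 1)
        index = int(index_text)  # indices count from 0
        if index <= last_index:
            raise ValueError("indices must appear in increasing attribute order")
        dense[index] = value.strip("\"'")
        last_index = index
    return dense


def dense_to_sparse(values):
    """Write a dense list of string values in sparse form, skipping zeros."""
    parts = []
    for index, value in enumerate(values):
        if value == "0":
            continue  # zeros are implied; a missing value "?" is kept explicitly
        text = f'"{value}"' if " " in value else value
        parts.append(f"{index} {text}")
    return "{" + ", ".join(parts) + "}"


print(sparse_to_dense('{1 X, 3 Y, 4 "class A"}', 5))
print(dense_to_sparse(["0", "0", "W", "0", "class B"]))
```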
Relational attributes
WEKA 3.5 added an attribute type called relational, which makes it possible to handle multiple dimensions in the manner of a relational database. This type is not yet widely used, however, and is not covered here.
--Adapted from http://www.cs.waikato.ac.nz/~ml/weka/arff.html and http://weka.sourceforge.net/wekadoc/index.php/en:ARFF_%283.5.3%29

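3. Data Preparation

The first problem when using WEKA for data mining is usually that the data is not in ARFF format. Fortunately WEKA also supports CSV files, a format supported by many other programs. WEKA can also access databases through JDBC.
In this section we first use Excel and Matlab as examples of how to obtain a CSV file. We then see how to convert a CSV file into an ARFF file, since ARFF is the format WEKA supports best. Even with an ARFF file in hand, some preprocessing is still needed before mining can begin.
.* -> .csv
Our example CSV file is bank-data.csv. Opening it in UltraEdit shows that this format is also a text file with comma-separated data, storing a two-dimensional table.
An Excel XLS file can hold several tables in different worksheets, and each worksheet has to be saved as a separate CSV file. Open the XLS file, switch to the worksheet to convert, choose Save As with type CSV, and click "OK" and "Yes" to dismiss the prompts.
In Matlab a two-dimensional table is a matrix, which can be saved in CSV format with the command
csvwrite('filename',matrixname)
Note that CSV files produced by Matlab usually have no attribute names (those produced by Excel may lack them too). WEKA always reads the attribute names from the first line of a CSV file, so without a header line it reads the values of the first row as attribute names. A CSV file produced by Matlab therefore needs to be opened in UltraEdit and given a header line of attribute names by hand. The number of names must match the number of attributes in the data, and the names are again separated by commas.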
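.csv -> .arff
The quickest way to convert CSV to ARFF is the command-line tool that comes with WEKA.
Start the main WEKA program; once the GUI appears, the buttons at the bottom open the individual modules. Click "Simple CLI" to get a command line. In the input box at the very bottom of the new window (the upper area is not editable), enter
java weka.core.converters.CSVLoader filename.csv > filename.arff
to perform the conversion.
WEKA 3.5 also provides an "Arff Viewer" module, which can open a CSV file for browsing and then save it as an ARFF file.
Alternatively, enter the "Explorer" module, open the CSV file with the button at the top, and save it as an ARFF file.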
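
To make clear what such a conversion has to produce, here is a rough Python sketch that writes an ARFF file from CSV text. It is not how `CSVLoader` is implemented: the type guess (all values parse as numbers gives numeric, otherwise nominal) is an assumption made for this example, and the sample rows are invented.

```python
import csv
import io


def is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation):
    """Convert CSV text with a header row into ARFF text (simplified)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    lines = [f"@relation {relation}", ""]
    for col, name in enumerate(header):
        values = [row[col] for row in data if row[col] != "?"]
        if values and all(is_number(v) for v in values):
            datatype = "numeric"
        else:
            # nominal: list the distinct values in order of first appearance
            distinct = list(dict.fromkeys(values))
            datatype = "{" + ",".join(distinct) + "}"
        lines.append(f"@attribute {name} {datatype}")
    lines += ["", "@data"] + [",".join(row) for row in data]
    return "\n".join(lines) + "\n"


# Hypothetical rows in the style of bank-data.csv.
SAMPLE_CSV = "age,sex,income,pep\n48,FEMALE,17546.0,YES\n40,MALE,30085.1,NO\n"
arff = csv_to_arff(SAMPLE_CSV, "bank-data")
print(arff)
```

Because the type of each attribute is guessed from the data, the result should still be inspected by hand.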

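The "Explorer" interface
"Explorer" offers many more functions; it is in fact the most heavily used WEKA module. We first get familiar with its interface and then use it to preprocess the data.
Figure 2
Figure 2 shows "bank-data.csv" opened in the 3.5 version of "Explorer". We divide the interface into 8 regions by function.
Region 1 contains the tabs used to switch between the panels for different mining tasks. This section uses only "Preprocess"; the other panels are covered later.
Region 2 holds some commonly used buttons, including opening, saving and editing data. Here we save "bank-data.csv" as "bank-data.arff".
In region 3, "Choose" a "Filter" to select a subset of the data or transform it in some way. This is the main tool for data preprocessing.
Region 4 shows some basic information about the data set.
Region 5 lists all attributes of the data set. Tick some attributes and click "Remove" to delete them; a deletion can be reverted with the "Undo" button in region 2. The row of buttons above region 5 is for ticking attributes quickly.
Selecting an attribute in region 5 displays a summary of it in region 6. Numeric and nominal attributes are summarized differently. The figure shows the summary of the numeric attribute "income".
Region 7 is a histogram of the attribute selected in region 5. If the last attribute of the data set (the default target variable for classification or regression, as mentioned earlier) is nominal (as "pep" is here), each bar of the histogram is split into colored segments in proportion to the values of that attribute. To split by a different attribute, pick another nominal attribute from the drop-down box above region 7. Selecting "No Class" or a numeric attribute in the drop-down box gives a black-and-white histogram.
Region 8 is the status bar, where the log can be inspected to check for errors. If the weka bird on the right is moving, WEKA is running a mining task. Right-clicking the status bar also allows running Java garbage collection.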

棰勫鐞?nbsp;
bank-data鏁版嵁鍚勫睘鎬х殑鍚箟濡備笅锛?nbsp;
id a unique identification number 
age age of customer in years (numeric) 
sex MALE / FEMALE 
region inner_city/rural/suburban/town 
income income of customer (numeric) 
married is the customer married (YES/NO) 
children number of children (numeric) 
car does the customer own a car (YES/NO) 
save_acct does the customer have a saving account (YES/NO)
current_acct does the customer have a current account (YES/NO)
mortgage does the customer have a mortgage (YES/NO)
pep did the customer buy a PEP (Personal Equity Plan) after the last mailing (YES/NO)

Information such as an ID is usually useless for data mining, so we delete it. In region 5, tick the attribute "id" and click "Remove". Save the new data set and open the ARFF file in UltraEdit. In the attribute declarations, WEKA has already chosen a suitable type for each attribute.
Some algorithms can only handle data in which all attributes are nominal, in which case numeric attributes have to be discretized. This data set has 3 numeric variables: "age", "income" and "children".
"children" takes only 4 values: 0, 1, 2 and 3. For this attribute we edit the ARFF file directly in UltraEdit, changing
@attribute children numeric
to
@attribute children {0,1,2,3}
and that is all.
Reopen "bank-data.arff" in "Explorer" and select the "children" attribute; the "Type" shown in region 6 should now be "Nominal".

To discretize "age" and "income" we use the WEKA filter named "Discretize". In region 3 click "Choose" to bring up the filter tree, then navigate to "weka.filters.unsupervised.attribute.Discretize" and click it. If the tree will not close, click on the "Explorer" panel anywhere outside the tree.
The text box next to "Choose" should now read "Discretize -B 10 -M -0.1 -R first-last". Clicking this text box opens a new window for changing the discretization parameters.
We do not want to discretize all attributes, only the 1st and the 4th (see the numbers to the left of the attribute names in region 5), so change the value of attributeIndices to "1,4". We plan to split each of these two attributes into 3 intervals, so change "bins" to "3". The other fields can stay as they are; click "More" to see what they mean. Click "OK" to return to "Explorer" and then "Apply" to run the filter; "age" and "income" are now discretized into nominal attributes. To discard the discretization, click "Undo" in region 2.
If labels as cryptic as "(-inf-34.333333]" are unappealing, open the saved ARFF file in UltraEdit and replace every occurrence of '\'(-inf-34.333333]\'' with "0_34". Replace the other labels by hand in the same way.
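
The kind of binning performed here can be illustrated with a short equal-width discretization sketch. The assumptions are stated plainly: that the filter splits the observed range into bins of equal width, that "age" ranges from 18 to 67 (which would yield the cut point 34.333333 quoted above), and that the label format follows the one quoted in the text. The sample ages and function names are invented.

```python
def equal_width_cuts(values, bins):
    """Cut points that split [min, max] into `bins` intervals of equal width."""
    low, high = min(values), max(values)
    width = (high - low) / bins
    return [low + width * i for i in range(1, bins)]


def bin_labels(cuts):
    """Interval labels in the style of '(-inf-34.333333]' (assumed format)."""
    edges = ["-inf"] + [f"{c:.6f}" for c in cuts] + ["inf"]
    labels = []
    for i in range(len(edges) - 1):
        closing = ")" if i == len(edges) - 2 else "]"
        labels.append(f"({edges[i]}-{edges[i + 1]}{closing}")
    return labels


def assign_bin(value, cuts):
    """Index of the interval a value falls into (upper bounds inclusive)."""
    for i, cut in enumerate(cuts):
        if value <= cut:
            return i
    return len(cuts)


ages = [18, 23, 35, 48, 51, 67]   # hypothetical sample; range 18..67 assumed
cuts = equal_width_cuts(ages, 3)
labels = bin_labels(cuts)
binned = [assign_bin(a, cuts) for a in ages]
print(labels, binned)
```

Renaming a label such as "(-inf-34.333333]" to "0_34" changes only the name of the category, not the assignment of instances to intervals.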
The data set obtained after these steps is saved as bank-data-final.arff (http://maya.cs.depaul.edu/~classes/ect584/WEKA/data/bank-data-final.arff).
----Adapted from http://maya.cs.depaul.edu/~classes/ect584/WEKA/preprocess.html


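4. Association Rules (Market Basket Analysis)
Note: at present WEKA's association rule analysis is only suitable for demonstration and not for mining large data sets.

We now analyze the "bank-data" data from the previous section with association rules. Open "bank-data-final.arff" in "Explorer" and switch to the "Associate" tab. The default association rule algorithm is Apriori, which we will use, but click the text box to the right of "Choose" to change the default parameters; "More" in the window that opens shows a description of each parameter.
Background
First a quick review of Apriori. For an association rule L->R, support and confidence are commonly used to measure its importance. The support of a rule estimates the probability P(L,R) of observing both L and R in one basket, and the confidence estimates the conditional probability P(R|L) that R appears in a basket given that L appears. The usual goal is to produce rules with high support and high confidence.
Several similar measures can be used in place of confidence to gauge the strength of the association:
Lift: P(L,R)/(P(L)P(R))
Lift=1 means L and R are independent. The larger the value, the stronger the indication that L and R appearing in the same basket is not a coincidence.
Leverage: P(L,R)-P(L)P(R)
Its meaning is similar to that of lift. Leverage=0 means L and R are independent; the larger the leverage, the closer the relationship between L and R.
Conviction: P(L)P(!R)/P(L,!R) (!R means that R does not occur)
Conviction also measures the independence of L and R. From its relation to lift (negate R, substitute into the lift formula and take the reciprocal) it is clear that larger values are better here too.
Note that lift and leverage are symmetric in L and R, whereas confidence and conviction are not.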

Parameter settings
We now want to mine the association rules with support between 10% and 100% and a lift above 1.5, keeping the top 100 by lift. Set "lowerBoundMinSupport" and "upperBoundMinSupport" to 0.1 and 1 respectively, "metricType" to lift, "minMetric" to 1.5 and "numRules" to 100. Leave the other options at their defaults. After "OK", click "Start" in "Explorer" to run the algorithm; the data set summary and the mining results appear in the window on the right.

The 5 rules with the highest lift are:
Best rules found:
1. age=52_max save_act=YES current_act=YES 113 ==> income=43759_max 61 conf:(0.54) < lift:(4.05)> lev:(0.08) [45] conv:(1.85)
2. income=43759_max 80 ==> age=52_max save_act=YES current_act=YES 61 conf:(0.76) < lift:(4.05)> lev:(0.08) [45] conv:(3.25)
3. income=43759_max current_act=YES 63 ==> age=52_max save_act=YES 61 conf:(0.97) < lift:(3.85)> lev:(0.08) [45] conv:(15.72)
4. age=52_max save_act=YES 151 ==> income=43759_max current_act=YES 61 conf:(0.4) < lift:(3.85)> lev:(0.08) [45] conv:(1.49)
5. age=52_max save_act=YES 151 ==> income=43759_max 76 conf:(0.5) < lift:(3.77)> lev:(0.09) [55] conv:(1.72)
For each rule found, WEKA lists four measures of the strength of its association.
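
The four measures can be recomputed from the counts printed with a rule. The sketch below applies the formulas from the background section to rule 1, assuming 600 instances in the data set and reading 113, 80 and 61 as the counts of L, R and of both together. The function name is invented, and the remark about the "+1" is an inference from the printed numbers, not a documented fact.

```python
def rule_metrics(n, count_l, count_r, count_lr):
    """Support, confidence, lift, leverage and conviction of a rule L -> R.

    Assumes count_lr < count_l; otherwise conviction divides by zero.
    """
    p_l, p_r, p_lr = count_l / n, count_r / n, count_lr / n
    p_l_not_r = (count_l - count_lr) / n   # P(L, !R)
    return {
        "support": p_lr,
        "confidence": p_lr / p_l,
        "lift": p_lr / (p_l * p_r),
        "leverage": p_lr - p_l * p_r,
        "conviction": p_l * (1 - p_r) / p_l_not_r,
    }


# Rule 1: L occurs 113 times, R occurs 80 times, both occur 61 times.
m = rule_metrics(n=600, count_l=113, count_r=80, count_lr=61)
for name, value in m.items():
    print(f"{name}: {value:.2f}")

# The printed conv:(1.85) is reproduced if 1 is added to the count of
# (L and not R) in the denominator; this is inferred from the output above.
smoothed_conviction = 113 * (600 - 80) / 600 / (113 - 61 + 1)
print(f"conviction with +1 in the denominator: {smoothed_conviction:.2f}")
```

Confidence, lift and leverage agree with the printed values; the textbook conviction comes out slightly higher than the printed one.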

Command line
The mining task can also be run from the command line. In the "Simple CLI" module, enter a command of the form
java weka.associations.Apriori options -t directory-path\bank-data-final.arff
to run the Apriori algorithm. Note that the file path after "-t" must not contain spaces.
The options we used above are
-N 100 -T 1 -C 1.5 -D 0.05 -U 1.0 -M 0.1 -S -1.0
Using these options on the command line gives the same result as the GUI.
Adding the "-I" option also outputs the frequent itemsets of each size. The command I used is
java weka.associations.Apriori -N 100 -T 1 -C 1.5 -D 0.05 -U 1.0 -M 0.1 -S -1.0 -I -t d:\weka\bank-data-final.arff
The mining results are displayed in the upper part of the window.
----Adapted from http://maya.cs.depaul.edu/~classes/ect584/WEKA/associate.html

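5. Classification and Regression

Background
WEKA puts both classification and regression under the "Classify" tab, and for good reason.
Both tasks have a target attribute (the output variable). We want to predict the target from a set of features (the input variables) of a sample (called an instance in WEKA). To do this we need a training data set in which both the inputs and the output of every instance are known. A predictive model is built by observing the instances in the training set. With this model we can then make predictions for new instances whose output is unknown. The quality of a model is measured by how accurate its predictions are.
In WEKA the target (output) to be predicted is called the Class attribute, a name that presumably comes from the "class" of classification tasks. In general, the task is called classification when the Class attribute is nominal and regression when it is numeric.

Choosing an algorithm
In this section we use the C4.5 decision tree algorithm to build a classification model for bank-data.
We start from the original "bank-data.csv" file. The "ID" attribute is certainly not needed. Since C4.5 can handle numeric attributes, there is no need to discretize every variable into a nominal one as we did for association rules. Nevertheless, we convert the "Children" attribute into a nominal attribute with the two values "YES" and "NO". In addition, the training set contains only half of the instances of the original data set; a number of instances are drawn from the other half as the instances to be predicted, with their "pep" attribute set to missing. The training set prepared this way can be downloaded from http://maya.cs.depaul.edu/~classes/ect584/WEKA/classify/bank.arff; the data set to be predicted, "bank-new.arff", is provided separately.
Open the training set "bank.arff" in "Explorer" and check that it has been prepared as described. Switch to the "Classify" tab and click the "Choose" button to see many classification and regression algorithms arranged by category in a tree. In WEKA 3.5 there is a "Filter..." button below the tree that filters out algorithms unsuitable for the data set. The input attributes of our data set include "Binary" attributes (nominal attributes with only two categories) and numeric attributes, and the Class variable is "Binary", so we tick "Binary attributes", "Numeric attributes" and "Binary class". After clicking "OK" and returning to the tree, some algorithm names have turned red, meaning that they cannot be used. Select "J48" under "trees"; this is the C4.5 algorithm we need, and fortunately it has not turned red.
Click the text box to the right of "Choose" to open a window for setting the algorithm's parameters. "More" shows the parameter descriptions and "Capabilities" shows what kinds of data the algorithm can handle. We keep the default parameters here.
Now look at "Test Options" in the middle left. We have no separate test data set, so to get a reliable estimate of the model's accuracy and guard against overfitting we use 10-fold cross-validation to select and evaluate the model. If cross-validation is unfamiliar, look it up online.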
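Modeling results
Select "Cross-validation" and enter "10" in the "Folds" box. Click "Start" to let the algorithm build the decision tree model. Shortly, a decision tree in text form, together with an error analysis of the tree and other results, appears in "Classifier output" on the right. At the same time an entry showing the time and the algorithm name appears in the "Result list" at the lower left. Changing the model or a parameter and clicking "Start" again adds another entry to the "Result list".

One of the cross-validation results for "J48" is
Correctly Classified Instances 206 68.6667 %
so the accuracy of this model is only about 69%. We might need to process the original attributes or change the algorithm's parameters to improve accuracy, but here we leave it alone and continue with this model.

Right-click the entry that has just appeared in the "Result list" and choose "Visualize tree" from the pop-up menu; a new window shows the decision tree in graphical form. It helps to maximize this window, then right-click and choose "Fit to screen" to see the tree more clearly. When finished, take a screenshot or close the window.

The meaning of the "Confusion Matrix" is explained here.
=== Confusion Matrix ===
  a b <-- classified as
  74 64 | a = YES
  30 132 | b = NO
This matrix says that of the instances whose "pep" is really "YES", 74 were correctly predicted as "YES" and 64 were wrongly predicted as "NO"; of the instances whose "pep" is really "NO", 30 were wrongly predicted as "YES" and 132 were correctly predicted as "NO". 74+64+30+132 = 300 is the total number of instances, and (74+132)/300 = 0.68667 is exactly the proportion of correctly classified instances. The larger the numbers on the diagonal of the matrix, the better the predictions.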
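
The arithmetic on the confusion matrix can be checked with a few lines of Python. Per-class precision and recall are not discussed in the text; they are included here as standard additional measures that follow from the same four numbers.

```python
# Rows are the actual classes, columns the predicted classes.
matrix = [[74, 64],    # actual YES: predicted YES, predicted NO
          [30, 132]]   # actual NO:  predicted YES, predicted NO
labels = ["YES", "NO"]

total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(len(matrix)))  # diagonal entries
accuracy = correct / total


def precision(k):
    """Share of instances predicted as class k that really are class k."""
    return matrix[k][k] / sum(row[k] for row in matrix)


def recall(k):
    """Share of instances of class k that were predicted as class k."""
    return matrix[k][k] / sum(matrix[k])


print(f"accuracy: {correct}/{total} = {accuracy:.5f}")
for k, label in enumerate(labels):
    print(f"{label}: precision {precision(k):.3f}, recall {recall(k):.3f}")
```

The recall for "YES" shows that the model misses almost half of the customers who actually bought a PEP, something the overall accuracy alone does not reveal.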
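Applying the model
We now use the model to make predictions for the data set to be predicted. Note that the attributes of this data set must be set up exactly as in the training data set. Even if the values of the Class attribute are unknown, the attribute must still be present; its value can be set to missing for every instance.
Under "Test Options" select "Supplied test set" and "Set" it to the data set the model should be applied to, here the file "bank-new.arff".
Now right-click the entry just produced in the "Result list" and choose "Re-evaluate model on current test set". The results area on the right gains some additional output that reports how the model performs on this data set. If the Class attribute consists entirely of missing values this output is meaningless; what we care about is the model's predictions on the new data set.
Next choose "Visualize classifier errors" from the right-click menu. A new window opens with a scatter plot of the prediction errors. Click the "Save" button in this window to save an ARFF file. Opening the file shows an additional attribute (predictedpep) in the second-to-last position; its values are the model's predictions for each instance.
Using the command line (recommended)
Although the GUI is convenient for viewing results and setting parameters, the most direct and flexible way to build and apply models is still the command line.
Open the "Simple CLI" module. The command for using "J48" as above has the form
java weka.classifiers.trees.J48 -C 0.25 -M 2 -t directory-path\bank.arff -d directory-path\bank.model
The options "-C 0.25" and "-M 2" are the same as those set in the GUI. "-t" is followed by the full path of the training data set (directory and file name), and "-d" by the full path under which the model is saved. Note that this lets us save the model.
After the command is entered, the resulting tree model and error analysis are shown in the upper part of "Simple CLI" and can be copied into a text file. The error is the one obtained by applying the model to the training set.
The command for applying this model to "bank-new.arff" has the form
java weka.classifiers.trees.J48 -p 9 -l directory-path\bank.model -T directory-path\bank-new.arff
Here "-p 9" says that the true values of the attribute to be predicted are stored in the 9th attribute (that is, "pep"); in this data set they are all unknown and therefore all given as missing values. "-l" is followed by the full path of the model and "-T" by the full path of the data set to be predicted.
After the command is entered, results such as the following appear in the upper part of "Simple CLI":
0 YES 0.75 ?
1 NO 0.7272727272727273 ?
2 YES 0.95 ?
3 YES 0.8813559322033898 ?
4 NO 0.8421052631578947 ?
...
The first column is the "Instance_number", the second column is the "predictedpep" from above, and the fourth column is the original "pep" value in "bank-new.arff" (always the missing value "?" here). The third column is the confidence of the prediction. For instance 0, for example, we are 75% sure that its "pep" value is "YES", and for instance 4 we are 84.2% sure that its "pep" value is "NO".
The command line thus has at least two advantages. One is that the model can be saved, so that when new data to be predicted arrives the saved model can be applied directly without rebuilding it each time. The other is that each prediction comes with a confidence, so predictions can be accepted selectively, for example by considering only those with a confidence above 85%.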
----Adapted from http://maya.cs.depaul.edu/~classes/ect584/WEKA/classify.html



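6. Cluster Analysis

Principle and implementation
The "cluster" of cluster analysis is different from the "class" of classification. The task of clustering is to assign all instances to a number of clusters so that the instances of one cluster gather around a cluster center and are relatively close to one another, while instances of different clusters are relatively far apart. For instances described by numeric attributes, the distance is usually the Euclidean distance.
We now run a cluster analysis on the "bank data" used earlier with the most common algorithm, K-means. The steps of K-means clustering are briefly as follows.
K-means first picks K cluster centers at random. Then: 1) each instance is assigned to the nearest cluster center, giving K clusters; 2) the mean of all instances in each cluster is computed and becomes the new center of that cluster. Steps 1) and 2) are repeated until the positions of the K cluster centers, and hence the cluster assignments, no longer change.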
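The K-means algorithm described above can only handle numeric attributes; a nominal attribute has to be turned into several attributes taking the values 0 and 1. WEKA performs this nominal-to-numeric conversion automatically, and it also standardizes numeric data automatically. For the original data "bank-data.csv" the only preprocessing is therefore to delete the attribute "id", save the data in ARFF format and change the attribute "children" to nominal. The resulting data file is "bank.arff" and contains 600 instances.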
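
What such a conversion looks like can be shown on a toy example. The sketch turns a nominal attribute into 0/1 indicator attributes and rescales a numeric attribute; rescaling to the range [0, 1] is an assumption made for illustration, since the text does not say which standardization WEKA applies, and the three sample rows are invented.

```python
def one_hot(value, categories):
    """One 0/1 attribute per category; exactly one of them is 1."""
    return [1.0 if value == c else 0.0 for c in categories]


def min_max(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical instances with one numeric and one nominal attribute.
rows = [
    {"age": 20, "region": "town"},
    {"age": 40, "region": "rural"},
    {"age": 60, "region": "town"},
]
regions = ["inner_city", "rural", "suburban", "town"]

ages = min_max([row["age"] for row in rows])
vectors = [[age] + one_hot(row["region"], regions) for age, row in zip(ages, rows)]
print(vectors)
```

After the conversion every instance is a purely numeric vector, so Euclidean distances and cluster means are well defined.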
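Open the "bank.arff" file just obtained in "Explorer" and switch to "Cluster". Click "Choose" and select "SimpleKMeans", WEKA's implementation of K-means. Click the text box next to it and change "numClusters" to 6, meaning that we want to group the 600 instances into 6 clusters, that is, K=6. The "seed" parameter below it sets a random seed, which is used to generate the random numbers that determine the initial positions of the K cluster centers. For now we leave it at 10.
Select "Use training set" under "Cluster Mode", click "Start", and look at the clustering result in "Clusterer output" on the right. Alternatively, right-click the result just produced in the "Result list" at the lower left and choose "View in separate window" to browse the result in a new window.
Interpreting the results
First, note this line in the output:
Within cluster sum of squared errors: 1604.7416693522332
This is the criterion for judging the quality of the clustering: the smaller the value, the smaller the distances between instances of the same cluster. Your value may differ; in fact, changing the "seed" parameter may change it. It is worth trying several seeds and keeping the result with the smallest value. With "seed" set to 100, for example, I got
Within cluster sum of squared errors: 1555.6241507629218
so this second result is the one to keep. Trying more seeds might of course give an even smaller value.

Next, the section after "Cluster centroids:" lists the position of each cluster center. For a numeric attribute the center is its mean; for a nominal attribute it is its mode, that is, the value taken by the largest number of instances in the cluster. For numeric attributes the standard deviation within each cluster (Std Devs) is also given.

Finally, "Clustered Instances" gives the number and percentage of instances in each cluster.

To view the clustering result graphically, right-click the result listed in the "Result list" at the lower left and choose "Visualize cluster assignments". The window that opens shows a scatter plot of the instances. The two boxes at the top select the horizontal and vertical axes, and "color" in the second row determines how the points are colored; by default instances are colored by their cluster ("Cluster").
Click "Save" here to save the clustering result as an ARFF file. In this new ARFF file the "instance_number" attribute is the number of an instance and the "Cluster" attribute is the cluster the algorithm assigned it to.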

----Adapted from http://maya.cs.depaul.edu/~classes/ect584/WEKA/k-means.html
