Stata: A Complete Guide to Machine Learning Classifiers
Posted by Stata连享会 (Lianxh)
Lianxh (连享会) homepage: lianxh.cn
Author: Fan Jiacheng (樊嘉诚), Sun Yat-sen University
E-Mail: fanjch676@163.com
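Contents
1. Introduction
2. Theory
2.1 Support vector machines
2.2 Decision trees
2.3 Neural networks
3. The command and its installation
3.1 Overview
3.2 Installation
3.3 Syntax and options
3.4 Notes
4. Stata in practice
4.1 Describing the data
4.2 Model training and results
4.3 Summary of results
5. Conclusion
6. References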
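1. Introduction

"How do we tell a good watermelon from a bad one by features such as its color, its stem, and the sound it makes when tapped?" This is a classification problem of the kind we face constantly in daily life. Academic research is full of classification as well: identifying business cycles to judge the economic outlook, or studying the financial information of listed companies to give early warning of financial distress or crisis. Computer vision, spam filtering, and medical diagnosis are likewise closely tied to classification.

Recall the classic econometric model, Logit regression. It offers a way to handle discrete choice problems and can, in essence, also be viewed as a classification algorithm. With the arrival of the big data era, many classification tasks face high-dimensional data, low data quality, class imbalance, and similar problems. Machine learning (ML) algorithms, among the most popular methods of recent years, open up new ways to address them.

The command introduced in this post, c_ml_stata, uses Python to implement machine learning classification algorithms in Stata. It covers many classifiers, such as support vector machines, decision trees, and neural networks, and it supports cross validation (CV), which makes it a powerful tool for classification problems in Stata. The rest of the post is organized as follows: Section 2 gives a brief theoretical introduction to some of the classification algorithms behind the command; Section 3 introduces the command and explains how to install it; Section 4 uses the command and its example dataset to work through a classification problem in Stata; Section 5 summarizes.

2. Theory

There are many machine learning classification algorithms. Given limited space, this section briefly introduces the theory behind some of the classifiers provided by c_ml_stata, to give a clearer picture of classification problems, the algorithms, and the later use of the command. Readers familiar with these algorithms can skip ahead. The algorithms covered are support vector machines (SVM), decision trees, and neural networks (NN).

2.1 Support vector machines

A support vector machine is a binary classifier. Its basic idea is to use the training set to find a separating hyperplane in the sample space that divides samples of different classes.

SVM learning covers a family of models of increasing complexity. When the training data are linearly separable, hard margin maximization learns a linear classifier, the linearly separable SVM. When the training data are approximately linearly separable, soft margin maximization also learns a linear classifier, the linear SVM. When the training data are not linearly separable, the kernel method combined with soft margin maximization learns a nonlinear SVM.

2.1.1 Support vectors and the margin

We start from the simplest case, the linearly separable SVM. Suppose we are given a training set in feature space

$$T = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\},$$

where $x_i \in \mathbb{R}^d$ is a feature vector with $d$ features and $y_i \in \{-1, +1\}$ is its class label. A separating hyperplane is written $w^\top x + b = 0$, with normal vector $w$ and intercept $b$, and the distance from a sample $x_i$ to it is $|w^\top x_i + b| / \|w\|$. In particular, the samples that lie closest to the separating hyperplane are called "support vectors".

For linearly separable data there are generally infinitely many hyperplanes that separate the two classes correctly, so how do we obtain a unique optimal one? In the figure below there are three points, $A$, $B$, and $C$, representing three instances that are all predicted to lie on the same side of the separating hyperplane. Because $A$ is far from the hyperplane, we are fairly confident that $A$ is classified correctly; by the same logic, the closer a point lies to the hyperplane, the less confident the prediction.

Following this intuition, we want even the points closest to the hyperplane to be classified with high confidence, so maximizing the margin is a natural idea. Note, however, that if we rescale $w$ and $b$ proportionally, the margin changes although the hyperplane itself does not. To obtain a unique solution we therefore need some constraint on the normal vector (such as the normalization $\|w\| = 1$). This leads to the optimization problem

$$\max_{w, b}\ \gamma \quad \text{s.t.}\quad y_i (w^\top x_i + b) \ge \gamma,\ i = 1, \dots, N, \quad \|w\| = 1.$$

Unfortunately, this problem is non-convex and hard to solve. Fortunately, an equivalent transformation turns it into the following convex problem:

$$\min_{w, b}\ \frac{1}{2}\|w\|^2 \quad \text{s.t.}\quad y_i (w^\top x_i + b) \ge 1,\ i = 1, \dots, N.$$

Solving for the optimum $(w^*, b^*)$ gives the optimal separating hyperplane. It can be shown that this hyperplane exists and is unique.

2.1.2 Soft margin

So far we have assumed that a hyperplane can separate the training data exactly and completely. In practice it is often hard to tell whether the data are linearly separable; and even when they are, it is hard to rule out that the apparent separability is the result of overfitting. To reduce the influence of noise in the data, we allow the SVM to misclassify some samples. This calls for the notion of a soft margin. The earlier constraint requires every sample to be classified correctly, that is, $y_i (w^\top x_i + b) \ge 1$, which can be read as a "hard margin". A soft margin allows some samples to violate this constraint, while keeping such violations as few as possible, so the objective can be written as

$$\min_{w, b}\ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \max\bigl(0,\ 1 - y_i (w^\top x_i + b)\bigr),$$

where $C > 0$ is a constant that controls how heavily violations are penalized. Introducing slack variables $\xi_i \ge 0$, this can be rewritten as

$$\min_{w, b, \xi}\ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \xi_i \quad \text{s.t.}\quad y_i (w^\top x_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0,\ i = 1, \dots, N.$$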
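To make the soft-margin trade-off concrete, the sketch below evaluates the objective for a fixed hyperplane, assuming the usual hinge-loss form $\frac{1}{2}\|w\|^2 + C\sum_i \max(0, 1 - y_i(w^\top x_i + b))$. The function name and the toy data are illustrative assumptions, not part of c_ml_stata.

```python
def soft_margin_objective(w, b, X, y, C):
    """Soft-margin SVM objective: 0.5*||w||^2 + C * sum of hinge losses.

    w: weights, b: intercept, X: list of feature tuples,
    y: labels in {-1, +1}, C: penalty on margin violations.
    """
    regularizer = 0.5 * sum(wj * wj for wj in w)
    hinge_total = 0.0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        # The slack xi_i is zero when the sample satisfies y*(w.x+b) >= 1.
        hinge_total += max(0.0, 1.0 - yi * score)
    return regularizer + C * hinge_total


# Hypothetical toy data: the last point sits on the wrong side of the hyperplane.
X = [(2.0, 0.0), (0.0, 2.0), (-2.0, 0.0), (0.5, 0.0)]
y = [1, 1, -1, -1]
w, b = (1.0, 1.0), 0.0

print(soft_margin_objective(w, b, X, y, C=1.0))   # 2.5
print(soft_margin_objective(w, b, X, y, C=10.0))  # 16.0
```

Only the misclassified point contributes a slack of 1.5, so raising $C$ from 1 to 10 raises the penalty on that one violation from 1.5 to 15 while the regularizer stays at 1.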
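2.1.3 Kernel methods

Many real problems are not linearly separable. For such data, a kernel method is commonly used to map the samples from the original space into a higher-dimensional feature space (as shown in the figure below) in which they become linearly separable.

Let $\phi(x)$ denote the feature vector of $x$ after the mapping. The model corresponding to the separating hyperplane in feature space can then be written as

$$f(x) = w^\top \phi(x) + b,$$

so the optimization problem becomes

$$\min_{w, b}\ \frac{1}{2}\|w\|^2 \quad \text{s.t.}\quad y_i \bigl(w^\top \phi(x_i) + b\bigr) \ge 1,\ i = 1, \dots, N.$$

Solving it requires the inner product $\phi(x_i)^\top \phi(x_j)$ of the samples $x_i$ and $x_j$ after they are mapped into feature space. Because the feature space may be very high-dimensional, computing this inner product directly is usually difficult. A function of the form

$$\kappa(x_i, x_j) = \phi(x_i)^\top \phi(x_j)$$

greatly simplifies the computation. The function $\kappa(\cdot, \cdot)$ is called a kernel function. The choice of kernel is not arbitrary; it must satisfy certain conditions, which we do not discuss here for reasons of space. Commonly used kernels include the linear kernel $\kappa(x_i, x_j) = x_i^\top x_j$, the polynomial kernel $\kappa(x_i, x_j) = (x_i^\top x_j)^p$, and the Gaussian (RBF) kernel $\kappa(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)$.

2.1.4 Supplement

An SVM can be recast as a dual problem and solved with the method of Lagrange multipliers. It ultimately reduces to a quadratic programming problem, which is solved by efficient algorithms such as SMO (Sequential Minimal Optimization). Since our aim is only an accessible overview of the basic theory, we do not detail the optimization procedure.

SVMs also have their own strengths and weaknesses, which call for extra care when applying this classifier.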
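The point of a kernel is that the inner product in feature space can be computed without ever forming the mapped vectors. The sketch below checks this for a degree-2 polynomial kernel on two-dimensional inputs, where the explicit map $\phi(x) = (x_1^2, \sqrt{2}x_1x_2, x_2^2)$ is known, and also evaluates a Gaussian (RBF) kernel. The function names and sample points are illustrative assumptions.

```python
import math


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, z):
    """Degree-2 polynomial kernel, computed in the original space."""
    return dot(x, z) ** 2


def phi(x):
    """Explicit feature map whose inner product equals poly2_kernel (2-D inputs only)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def rbf_kernel(x, z, gamma):
    """Gaussian (RBF) kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


x, z = (1.0, 2.0), (3.0, 4.0)
print(poly2_kernel(x, z))        # kernel evaluated directly
print(dot(phi(x), phi(z)))       # same value via the explicit mapping
print(rbf_kernel(x, x, gamma=0.1))
```

For the RBF kernel the implicit feature space is infinite-dimensional, so the direct route on the second line is not even available; the kernel value is all that is ever computed.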
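2.2 Decision trees

2.2.1 Basic model

A decision tree is a tree-based method for classification and regression; as the name suggests, it has a tree structure (see the figure below). A decision tree consists of nodes and directed edges (branches). Nodes are usually divided into the root node, internal nodes, and leaf nodes, which can be pictured as the "root" and "leaves" of a tree, while the directed edges play the role of "branches". To classify with a decision tree, start at the root node, test one feature of the sample (the splitting criterion), and send the sample to a child node according to the result; repeat this test-and-assign step recursively until a leaf node is reached.

2.2.2 Feature selection

A key question in decision tree learning is feature selection: which feature to test and split on at each step. We therefore need a criterion for choosing features. Intuitively, if a feature has good discriminating power, the subsets produced by splitting on it should each belong to a single class as far as possible, that is, the "purity" of the nodes should be high. Below we introduce, in turn, information entropy, conditional entropy, information gain, and the information gain ratio to see how features are selected.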
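Information entropy is a measure of the purity of a sample set. For a random variable $X$ that takes $n$ discrete values with probability distribution

$$P(X = x_i) = p_i, \quad i = 1, \dots, n,$$

the entropy is defined as

$$H(X) = -\sum_{i=1}^{n} p_i \log p_i.$$

In particular, if $p_i = 0$ we define $0 \log 0 = 0$. The logarithm is usually taken to base $2$ or base $e$. The larger the entropy, the greater the uncertainty of the random variable. It can be shown that $0 \le H(X) \le \log n$.

For random variables $(X, Y)$ with joint probability distribution

$$P(X = x_i, Y = y_j) = p_{ij}, \quad i = 1, \dots, n;\ j = 1, \dots, m,$$

the conditional entropy $H(Y \mid X)$ measures the uncertainty of $Y$ given that $X$ is known. It is defined as the expectation, over $X$, of the entropy of the conditional distribution of $Y$ given $X$:

$$H(Y \mid X) = \sum_{i=1}^{n} p_i\, H(Y \mid X = x_i),$$

where $p_i = P(X = x_i)$, $i = 1, \dots, n$.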
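The definitions of entropy and conditional entropy translate almost line by line into code. The sketch below is a minimal illustration with hypothetical function names; it takes probabilities as input, applies the convention $0 \log 0 = 0$, and uses base-2 logarithms by default.

```python
import math


def entropy(probs, base=2.0):
    """H(X) = -sum p_i log p_i, with the convention 0 log 0 = 0."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log(p, base) for p in probs if p > 0)


def conditional_entropy(joint, base=2.0):
    """H(Y|X) = sum_i p_i * H(Y | X = x_i), where joint[i][j] = P(X=x_i, Y=y_j)."""
    total = 0.0
    for row in joint:
        p_i = sum(row)  # marginal probability P(X = x_i)
        if p_i > 0:
            total += p_i * entropy([p / p_i for p in row], base)
    return total


print(entropy([0.5, 0.5]))                    # a fair coin: maximal uncertainty for n = 2
print(entropy([1.0, 0.0]))                    # a certain outcome: no uncertainty
print(conditional_entropy([[0.5, 0.0], [0.0, 0.5]]))      # X determines Y
print(conditional_entropy([[0.25, 0.25], [0.25, 0.25]]))  # X says nothing about Y
```

The two entropy calls sit at the ends of the bound $0 \le H(X) \le \log n$, and the two conditional-entropy calls show the extremes of how much knowing $X$ can reduce uncertainty about $Y$.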
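Information gain measures how much knowing a feature $X$ reduces the uncertainty about the class $Y$. We therefore define the information gain $g(D, A)$ of feature $A$ on training set $D$ as the difference between the empirical entropy $H(D)$ of $D$ and the empirical conditional entropy $H(D \mid A)$ of $D$ given $A$:

$$g(D, A) = H(D) - H(D \mid A).$$

Clearly, features with stronger discriminating power have higher information gain. To select a feature by information gain, compute the information gain of every feature on the training set $D$ and choose the feature with the largest gain.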
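However, using information gain as the splitting criterion is biased toward features that take many distinct values, which is unfair to the other features. The information gain ratio was introduced to correct this. The information gain ratio $g_R(D, A)$ of feature $A$ on training set $D$ is defined as the ratio of its information gain $g(D, A)$ to the entropy $H_A(D)$ of $D$ with respect to the values of $A$:

$$g_R(D, A) = \frac{g(D, A)}{H_A(D)}.$$

2.2.3 Tree generation

Several algorithms generate decision trees, including the classic ID3 and C4.5. To understand the generation process we describe one of them, ID3. Its core idea is to use information gain as the feature selection criterion at every node and to build the tree recursively: starting from the root node, compute the information gain of every candidate feature, split on the feature with the largest gain, create one child node for each value of that feature, and repeat on each child until the information gain of every remaining feature is small or no features remain.

C4.5 is similar to ID3; the difference is that C4.5 uses the information gain ratio as the feature selection criterion. There are many other tree-growing methods as well, such as the CART algorithm.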
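The difference between the ID3 criterion (information gain) and the C4.5 criterion (information gain ratio) shows up clearly on a feature with many distinct values. The sketch below uses hypothetical toy data: `group` is a genuinely informative two-valued feature, while `row_id` is unique per observation and therefore useless for prediction. Function and variable names are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(values):
    """Empirical entropy (base 2) of a list of discrete values."""
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """g(D, A) = H(D) - H(D | A)."""
    n = len(labels)
    groups = {}
    for a, y in zip(feature, labels):
        groups.setdefault(a, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """g_R(D, A) = g(D, A) / H_A(D), with H_A(D) the entropy of the feature values."""
    split_entropy = entropy(feature)
    return information_gain(feature, labels) / split_entropy if split_entropy > 0 else 0.0


labels = [1, 1, 1, 0, 0, 0]
group = ["a", "a", "a", "b", "b", "b"]   # informative feature with two values
row_id = [1, 2, 3, 4, 5, 6]              # one distinct value per observation

for name, feature in [("group", group), ("row_id", row_id)]:
    print(name, information_gain(feature, labels), gain_ratio(feature, labels))
```

Both features reach the maximal information gain of 1 bit, so ID3 cannot tell them apart, whereas the gain ratio divides by the larger split entropy of `row_id` and ranks `group` higher.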
2.2.4 Supplement

After a decision tree has been grown, it usually needs pruning: as the name suggests, some subtrees or leaf nodes are cut from the tree to simplify its structure and prevent overfitting. Decision trees have both advantages and disadvantages.

For this reason many extensions build on them: random forests, which combine trees with bagging, and gradient boosted decision trees (GBDT) and extreme gradient boosting (XGBoost), which combine trees with boosting.
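2.3 Neural networks

2.3.1 Neurons

Neural networks are a popular family of machine learning algorithms that can handle regression, classification, and other problems. The most basic building block of a neural network is the neuron, whose structure is shown in the figure below.

A basic neuron consists of inputs, weights, a bias (or threshold), an activation function, and an output. Take a neuron with several inputs and a single output. It receives $n$ input signals $x_1, \dots, x_n$, that is, $x = (x_1, \dots, x_n)^\top$. These signals are passed along weighted connections; the weighted total input is compared with the neuron's threshold and then processed by the activation function to produce the output. Writing the weight of the $i$-th input signal as $w_i$, the weighted total input is $\sum_{i=1}^{n} w_i x_i$. With threshold $\theta$ and activation function $f$, the output of the neuron is

$$y = f\Bigl(\sum_{i=1}^{n} w_i x_i - \theta\Bigr).$$

The activation function $f$ is usually nonlinear. Commonly used choices include:

Sigmoid: $f(z) = \dfrac{1}{1 + e^{-z}}$
tanh: $f(z) = \dfrac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$
ReLU: $f(z) = \max(0, z)$

ReLU is the usual choice in practice because its form is simple, it is fast to compute, it converges quickly, and it helps avoid problems such as vanishing gradients.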
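A single neuron, as described above, is a weighted sum compared against a threshold and passed through an activation function. The following is a minimal sketch under that description; the function names and the numeric inputs are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    return math.tanh(z)


def relu(z):
    return max(0.0, z)


def neuron_output(x, w, theta, activation):
    """Output f(sum_i w_i * x_i - theta) of one neuron."""
    weighted_input = sum(wi * xi for wi, xi in zip(w, x))
    return activation(weighted_input - theta)


x = [1.0, 2.0]      # input signals
w = [0.5, 0.25]     # connection weights
theta = 0.5         # threshold

for f in (sigmoid, tanh, relu):
    print(f.__name__, neuron_output(x, w, theta, f))
```

Here the weighted input is 1.0, so the activation receives 0.5; ReLU passes it through unchanged, while sigmoid and tanh squash it into their bounded ranges.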
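2.3.2 Feedforward neural networks

A neural network generally consists of an input layer, hidden layers, and an output layer.

The most classic and most common architecture, the feedforward neural network, is built from exactly these layers. It may contain several hidden layers; the neurons of each layer are fully connected to those of the next layer, with no connections within a layer and none that skip layers. See the figure below.

2.3.3 Neural network classification

In general, for a multi-class problem we can define the network output as a $K$-dimensional column vector $y = (y_1, \dots, y_K)^\top$, where each $y_k$ may be continuous or discrete, for example $y_k \in \{0, 1\}$. Suppose the network has $L$ hidden layers and, for simplicity, that the neurons of every hidden layer and of the output layer share the activation function $f$. The input layer receives the classification features as a $d$-dimensional column vector $x$. The feedforward network can then be written recursively as

$$a^{(0)} = x, \qquad a^{(l)} = f\bigl(W^{(l)} a^{(l-1)} + b^{(l)}\bigr),\ \ l = 1, \dots, L + 1, \qquad y = a^{(L+1)},$$

where $W^{(l)}$ and $b^{(l)}$ are the weight matrix and bias vector of layer $l$ (the bias plays the role of the negative threshold), and $f$ is applied element by element. Setting a threshold, or mapping the largest $y_k$ to $1$ and the rest to $0$, gives the classification result.

2.3.4 Supplement

The weights $W^{(l)}$ and biases $b^{(l)}$ of a feedforward network are unknown parameters, usually determined by the error backpropagation (BP) algorithm. Neural networks also come with several points that need attention in use.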
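The recursive form of a feedforward network, $a^{(l)} = f(W^{(l)} a^{(l-1)} + b^{(l)})$, can be sketched in a few lines. The example below assumes one shared activation function for all layers, as in the text, and picks the class with the largest output. The weights are hypothetical hand-picked numbers, not trained values, and training by backpropagation is not shown.

```python
def relu(vector):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in vector]


def affine(W, b, a):
    """Compute W a + b for a weight matrix W (list of rows) and bias vector b."""
    return [sum(wij * aj for wij, aj in zip(row, a)) + bi for row, bi in zip(W, b)]


def forward(x, layers, activation):
    """Propagate x through the layers; each layer is a (W, b) pair."""
    a = list(x)
    for W, b in layers:
        a = activation(affine(W, b, a))
    return a


def classify(output):
    """Return the 1-based index of the largest output component."""
    return max(range(len(output)), key=output.__getitem__) + 1


# Hypothetical network: 2 inputs -> 2 hidden neurons -> 3 output classes.
layers = [
    ([[1.0, -1.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0]),
]

output = forward([3.0, 1.0], layers, relu)
print(output, classify(output))
```

Mapping the largest component to the predicted class corresponds to the rule in the text of setting the largest $y_k$ to 1 and the rest to 0.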
3. The command and its installation

Having made it through the somewhat tedious theory above, we arrive at the star of this post: the c_ml_stata command. Machine learning algorithms are very complex (far more so than the theory sketched above), yet the command integrates many of them concisely and directly and is easy to call. This section gives a basic introduction to the command and then covers its installation, its syntax and options, and some points to note.
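3.1 Overview

c_ml_stata, written by Giovanni Cerulli, is a command that implements machine learning classification algorithms in Stata 16. It uses the Scikit-learn interface in Python for model training, prediction, and related tasks. Among its main features is the cross_validation option, which uses "greedy search" (greed search) over candidate hyperparameter settings to run K-fold cross validation, select the optimal hyperparameters, and tune the classification model.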
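3.2 Installation

c_ml_stata requires Stata 16.0 or later. Type ssc install c_ml_stata at the Stata command line to download it, or use the following command to open the download page:

. view net describe c_ml_stata    // package description

From that page you can view the complete program files and ancillary files that come with the command. To install entirely from the command line, run the following two commands:

. net install c_ml_stata          // install the package
. net get c_ml_stata              // download ancillary files: do-file, .dta, etc.
                                  // stored in the current working directory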
3.3 Syntax and options

The syntax of the main command c_ml_stata is:

c_ml_stata outcome [varlist], mlmodel(modeltype) out_sample(filename)
in_prediction(name) out_prediction(name) cross_validation(name)
seed(integer) [save_graph_cv(name)]
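
The input variables are as follows:

outcome: a numeric, discrete dependent variable (or label) representing the different classes. If the dependent variable has $M$ classes, it is recommended to recode it to take the values $1, 2, \dots, M$. For example, a binary variable taking the values $0$ and $1$ should be recoded as $1$ and $2$. Note: outcome does not accept missing values.

varlist: a list of numeric variables representing the independent variables (or features); it is optional. If a feature is itself categorical, first generate the corresponding numeric, discrete dummy variables. Note: varlist does not accept missing values.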

The options are as follows:

mlmodel(modeltype): specifies the machine learning classification algorithm (model) to use. The choices are:
    tree: classification tree
    randomforest: bagging and random forests
    boost: boosting (boosted trees)
    regularizedmultinomial: regularized multinomial
    nearestneighbor: nearest neighbor (K-nearest neighbors)
    neuralnet: neural network
    naivebayes: naive Bayes
    svm: support vector machine

out_sample(filename): requires a new out-of-sample dataset (test set) that contains only the features (no dependent variable), used for out-of-sample testing. filename is the name of the file holding that dataset.

in_prediction(name): saves the fitted results for the in-sample training data (training and validation sets); name is the file name.

out_prediction(name): saves the predictions for the out-of-sample data (test set); name is the file name. The out-of-sample data are taken from out_sample.

cross_validation(name): setting name to "CV" runs cross validation, 10-fold by default.

seed(integer): the random seed (an integer).

[save_graph_cv(name)]: optional; saves the classification accuracy of the model on the training and validation sets during cross validation, which is used to determine the optimal hyperparameters and model.
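What the cross_validation option does can be illustrated independently of the command: split the data into K folds, hold each fold out in turn, average the validation accuracy, and keep the hyperparameter value with the best average. The sketch below is a plain-Python illustration of that idea, not the code c_ml_stata runs (the command delegates to Scikit-learn). The toy one-dimensional K-nearest-neighbors classifier, the data, and all function names are assumptions made for the example.

```python
import random
from collections import Counter


def k_fold_splits(n, k, seed):
    """Yield (train, validation) index lists for K-fold cross validation (assumes n >= k)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation


def cv_accuracy(fit_predict, param, X, y, k, seed):
    """Mean validation accuracy of one hyperparameter value over the K folds."""
    scores = []
    for train, validation in k_fold_splits(len(y), k, seed):
        preds = fit_predict(param, [X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in validation])
        hits = sum(p == y[i] for p, i in zip(preds, validation))
        scores.append(hits / len(validation))
    return sum(scores) / len(scores)


def grid_search(fit_predict, grid, X, y, k=10, seed=10):
    """Return the hyperparameter value with the highest mean validation accuracy."""
    results = {param: cv_accuracy(fit_predict, param, X, y, k, seed) for param in grid}
    best = max(results, key=results.get)
    return best, results[best]


def knn_fit_predict(n_neighbors, X_train, y_train, X_val):
    """Toy 1-D K-nearest-neighbors classifier used only to exercise the search."""
    preds = []
    for x in X_val:
        order = sorted(range(len(X_train)), key=lambda i: abs(X_train[i] - x))
        votes = Counter(y_train[i] for i in order[:n_neighbors])
        preds.append(votes.most_common(1)[0][0])
    return preds


# Hypothetical data: two well-separated clusters labelled 1 and 2.
X = [0, 1, 2, 3, 4, 10, 11, 12, 13, 14] * 2
y = [1 if x < 5 else 2 for x in X]
best_k, best_score = grid_search(knn_fit_predict, [1, 3, 5], X, y, k=5, seed=10)
print(best_k, best_score)
```

Fixing the seed, as the seed(integer) option does, makes the fold assignment and therefore the selected hyperparameters reproducible.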

Return values of c_ml_stata: these can be inspected with the ereturn list command (numeric results are stored in scalars, string results in macros).
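3.4 Notes

c_ml_stata requires Stata 16 and Python (version 2.7 or later), together with two Python dependencies: Scikit-learn and the Stata Function Interface (SFI).

Neither outcome nor varlist may contain missing values, so check the dataset for missing values (and drop them) before using the command.

The command can be installed or updated with ssc install c_ml_stata, replace.

Further information is available through help c_ml_stata.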
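Before calling the command it can help to confirm that the Python dependencies are visible to the interpreter Stata uses. The sketch below is one hedged way to do that with the standard library only. The module names `sklearn` and `sfi` are the import names for Scikit-learn and the Stata Function Interface; `sfi` is normally importable only from the Python session embedded in Stata, so running the check in a standalone interpreter is expected to report it as missing. The helper name is an illustrative assumption.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python version:", sys.version.split()[0])
print("Missing modules:", missing_modules(["sklearn", "sfi"]))
```

An empty list means both dependencies were found; any name in the list needs to be installed or, for `sfi`, the check needs to be rerun from within Stata.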
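4. Stata in practice

We use the dataset provided with c_ml_stata (stored in c_ml_stata_data_example.dta) for the hands-on example. Because the author of the command gives little further explanation of this dataset, the example serves only as a demonstration of the workflow and has little substantive meaning.

4.1 Describing the data

The dataset has 74 observations, with 4 explanatory variables (named x1, x2, x3, x4) and 1 dependent variable (named y). The 4 explanatory variables are all continuous, while the dependent variable is categorical (discrete), so we use the summarize and tab commands, respectively, for descriptive statistics. As c_ml_stata requires, the input dataset has no missing values. The variable to be classified, y (passed as outcome), is numerically coded, with 1, 2, and 3 representing the three classes.
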
. use "c_ml_stata_data_example.dta", clear
. tab y

          y |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         42       56.76       56.76
          2 |         22       29.73       86.49
          3 |         10       13.51      100.00
------------+-----------------------------------
      Total |         74      100.00

. sum x1-x4

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
          x1 |         74    6165.257    2949.496       3291      15906
          x2 |         74     21.2973    5.785503         12         41
          x3 |         74    3019.459    777.1936       1760       4840
          x4 |         74    187.9324    22.26634        142        233

4.2 Model training and results
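
Since c_ml_stata provides many supervised classification algorithms, we train the models with x1, x2, x3, and x4 as explanatory variables and y as the label. Below we take the support vector machine as the example and describe in detail how to call the command and how to read its output.

4.2.1 Support vector machine

Setting the mlmodel option to svm classifies with the support vector machine algorithm. The in-sample predictions are saved in the file in_pred_svm.dta; the out-of-sample data (containing only the features x1, x2, x3, x4) are read from c_ml_stata_data_new_example.dta, and the out-of-sample predictions are saved in out_pred_svm.dta. Passing CV to the cross_validation option runs cross validation, whose results are saved automatically in CV.dta; if needed, the save_graph_cv option plots the cross validation results and saves the graph. The code and part of the output are as follows:
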
. c_ml_stata y x1-x4, mlmodel(svm) ///
out_sample("c_ml_stata_data_new_example") ///
in_prediction("in_pred_svm") ///
out_prediction("out_pred_svm") ///
cross_validation("CV") ///
seed(10) save_graph_cv("graph_cv_svm")
-------------------------------------------
CROSS-VALIDATION RESULTS TABLE
-------------------------------------------
The best score is:
0.5678571428571428
-------------------------------------------
The best parameters are:
{'C': 1, 'gamma': 0.1}
1
0.1
-------------------------------------------
The best estimator is:
SVC(C=1, break_ties=False, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape='ovr', degree=3, gamma=0.1, kernel='rbf',
max_iter=-1, probability=True, random_state=None, shrinking=True, tol=0.001,
verbose=False)
-------------------------------------------
The best index is:
0
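-------------------------------------------

Part of in_pred_svm.dta is shown in the figure below. Here index is the observation number, matching the observation number in the original data; label_in_pred is the in-sample predicted label; and Prob_1, Prob_2, Prob_3 may denote the probability that the predicted result is not the corresponding class.

The contents of out_pred_svm.dta are similar to those of in_pred_svm.dta. There, label_out_pre is the out-of-sample predicted label, from which one can read off the class that SVM assigns to the out-of-sample observations.

For the support vector machine, following the theory in Section 2, the main hyperparameters are the regularization coefficient $C$ and the kernel parameter $\gamma$ (often written GAMMA). ereturn list returns the selected optimal hyperparameters: the optimal regularization coefficient is $C = 1$ and the kernel parameter is $\gamma = 0.1$; in cross validation the training-set accuracy is $100\%$ and the test-set (that is, validation-set) accuracy is $56.79\%$.
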
. ereturn list
scalars:
e(OPT_C) = 1
e(OPT_GAMMA) = .1
e(TEST_ACCURACY) = .5678571428571428
e(TRAIN_ACCURACY) = 1
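e(BEST_INDEX) = 94.5

Detailed cross validation results can be viewed in CV.dta. In addition, save_graph_cv("graph_cv_svm") plots the cross validation results and saves them as graph_cv_svm.gph, shown below. As index changes (that is, across different hyperparameter combinations), the performance of the model on both the training and validation sets stays the same. Note that this is extremely rare in practical applications.

4.2.2 Decision tree

Setting the mlmodel option to tree classifies with a decision tree. The main hyperparameter of a decision tree is the number of leaf nodes (leaves). The cross validation plot shows that as index increases (more leaf nodes), performance on the training set keeps improving, while performance on the validation set first rises and then declines (a sign of overfitting). This yields the optimal number of leaves, the one with the highest validation-set accuracy. Use ereturn list to view the selected optimal hyperparameter: the optimal number of leaf nodes is 3; in cross validation the training-set accuracy is $81.08\%$ and the test-set (that is, validation-set) accuracy is $63.75\%$.
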
. c_ml_stata y x1-x4, mlmodel(tree) ///
out_sample("c_ml_stata_data_new_example") ///
in_prediction("in_pred_ctree") ///
out_prediction("out_pred_ctree") ///
cross_validation("CV") ///
seed(10) save_graph_cv("graph_cv_ctree")
. ereturn list
scalars:
e(OPT_LEAVES) = 3
e(TEST_ACCURACY) = .6375
e(TRAIN_ACCURACY) = .8108095884215288
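e(BEST_INDEX) = 2

4.2.3 Neural network

Setting the mlmodel option to neuralnet classifies with a neural network. Its main hyperparameters are the number of layers (layers) and the number of neurons (neurons). The in-sample and out-of-sample predictions show that this classifier performs poorly on this dataset: it "brute-forces" every observation into a single class, so its accuracy is no better than the share of that class. The cross validation plot also shows that the neural network is rather sensitive to the choice of hyperparameters, while the training-set accuracy stays at the same value throughout, with no improvement.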
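A useful reference point for results like these is the accuracy of a rule that always predicts the most frequent class. The sketch below computes it from the class frequencies reported by tab y in Section 4.1 (42, 22, and 10 observations); the function name is an illustrative assumption.

```python
def majority_baseline(counts):
    """Return the most frequent class and the accuracy of always predicting it."""
    total = sum(counts.values())
    label = max(counts, key=counts.get)
    return label, counts[label] / total


# Class frequencies of y taken from the tab y output.
counts = {1: 42, 2: 22, 3: 10}
label, accuracy = majority_baseline(counts)
print(label, round(accuracy, 4))  # 1 0.5676
```

Any classifier whose accuracy sits near 0.5676 on this dataset is doing no better than this constant rule; the cross-validated SVM accuracy of 0.5679 reported earlier is almost exactly at that level.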
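4.3 Summary of results

The c_ml_stata command provides 8 classification models. Each model was trained in turn on the in-sample dataset provided, and the comparison table reports, for every model, the best training-set accuracy and the best validation-set accuracy. The models compared are: decision tree, random forest, boosted trees, K-nearest neighbors, neural network, naive Bayes, and support vector machine.

Note: the regularized multinomial model failed to converge and is excluded from the comparison.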
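5. Conclusion

This post is drawing to a close, so let us briefly recap. We looked at what classification problems are and at machine learning algorithms for classification, and we introduced c_ml_stata, a command that combines Python and Stata, together with its usage. Over time, machine learning algorithms have grown more diverse and their range of application wider. Still, the method should never become the end in itself: the economic meaning behind the results always deserves careful thought.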
6. References