Posted by 进击的Coder
Editor's note: Thomas Wolf, science lead at Huggingface, gives a brief overview of the latest methods for universal word embeddings and sentence embeddings.
Image credit: John Christian Fjellestad
Word embeddings and sentence embeddings have become essential components of any deep-learning-based natural language processing system.
They encode words and sentences 📜 as fixed-length dense vectors 📐, which dramatically improves the processing of text data.
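A hot trend is Universal Embeddings: embeddings pre-trained on a large corpus that can be plugged into many downstream task models (sentiment analysis, classification, translation, and so on), automatically improving their performance by incorporating word and sentence representations learned on a larger dataset.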
This is a form of transfer learning. Recently, transfer learning has dramatically improved the performance of NLP models on important tasks such as text classification. See the ULMFiT model proposed by Jeremy Howard and Sebastian Ruder.
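Although unsupervised representation learning of sentences has long been the norm, the last few months (late 2017 and early 2018) have seen a lot of very interesting work showing a shift toward supervised learning and multi-task learning.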
Recent research on universal word and sentence embeddings
This post is a brief primer on the current state of the art in universal word and sentence embeddings. We describe the models shown in bold in the table above:
Strong, fast baselines: FastText, Bag-of-Words
State-of-the-art models: ELMo, Skip-Thoughts, Quick-Thoughts, InferSent, MILA/MSR's General Purpose Sentence Representations, and Google's Universal Sentence Encoder
If you want to catch up on the progress made in word embeddings before 2017, see Sebastian Ruder's "Word embeddings in 2017: Trends and future directions". Ruder has also written an introduction to word embedding techniques, "On word embeddings".
Let's start with word embeddings.
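A wealth of word embedding methods has been proposed over the last five years. The most commonly used models are word2vec and GloVe, both unsupervised approaches based on the distributional hypothesis (words that occur in the same contexts tend to have similar meanings).
While several works augment these unsupervised approaches by incorporating semantic or syntactic knowledge, purely unsupervised approaches made interesting progress in 2017-2018. The most notable are FastText (an extension of word2vec) and ELMo (state-of-the-art contextual word vectors).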
FastText was developed by Tomas Mikolov's team, the same team that proposed word2vec in 2013.
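FastText's main improvement is the inclusion of character n-grams, which makes it possible to compute word representations for words that did not appear in the training data.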
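To make the character n-gram idea concrete, here is a minimal sketch rather than FastText's actual implementation. It assumes n-grams of length 3 to 6 with `<` and `>` as word boundary markers, and a plain dictionary `ngram_vectors` (a hypothetical stand-in for a learned n-gram table) from which an unseen word's vector is averaged.

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word wrapped in boundary markers."""
    wrapped = f"<{word}>"
    return [wrapped[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(wrapped) - n + 1)]

def oov_vector(word, ngram_vectors, dim):
    """Average the vectors of the word's known n-grams (zeros if none are known)."""
    known = [ngram_vectors[g] for g in char_ngrams(word) if g in ngram_vectors]
    return np.mean(known, axis=0) if known else np.zeros(dim)

rng = np.random.default_rng(0)
# Toy n-gram table built from a single "training" word; a real table is learned.
ngram_vectors = {g: rng.normal(size=4) for g in char_ngrams("embedding")}
# "embeddings" was never seen, but it shares most of its n-grams with "embedding".
vec = oov_vector("embeddings", ngram_vectors, dim=4)
```

Because the unseen word shares sub-word units with a seen one, it still gets a non-trivial vector instead of a generic "unknown" token.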
FastText vectors are very fast to train, and pre-trained word vectors trained on Wikipedia and Common Crawl are available for 157 languages. They make a great baseline.
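ELMo (deep contextualized word representations) has recently improved the state of the art in word embeddings by a noticeable margin. It was developed by AI2 and was named a best paper at NAACL 2018.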
Elmo (the Sesame Street character)
In ELMo, embeddings are computed from the internal states of a two-layer bidirectional language model (biLM), hence the name: Embeddings from Language Models.
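As a rough illustration of "computed from the internal states", the sketch below collapses per-layer token states into one vector per token. It assumes the softmax-weighted sum of layers with a scalar `gamma` described in the ELMo paper. The function name, the toy shapes, and the hand-set weights are illustrative assumptions, not the AllenNLP API.

```python
import numpy as np

def mix_layers(layer_states, layer_logits, gamma=1.0):
    """Collapse per-layer token states into one vector per token.

    layer_states: array of shape (num_layers, num_tokens, dim)
    layer_logits: array of shape (num_layers,), unnormalized layer weights
    """
    weights = np.exp(layer_logits - np.max(layer_logits))
    weights /= weights.sum()  # softmax over layers
    # Contract the layer axis: the result has shape (num_tokens, dim).
    return gamma * np.tensordot(weights, layer_states, axes=1)

rng = np.random.default_rng(0)
# Token layer plus two biLM layers, 5 tokens, 8 dimensions (random stand-ins).
states = rng.normal(size=(3, 5, 8))
token_vectors = mix_layers(states, layer_logits=np.array([0.0, 1.0, 2.0]))
```

In a real system the layer weights and `gamma` would be learned together with the downstream task rather than set by hand.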
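Key features of ELMo:
ELMo's inputs are characters rather than words. It can therefore use sub-word units to compute meaningful representations even for out-of-vocabulary words (as FastText does).
ELMo combines the activations of several layers of the biLM (in the ELMo paper, as a learned, task-specific weighted combination). Different layers of a language model encode different kinds of information about a word. Combining all layers lets ELMo draw on several kinds of word representations to improve performance on downstream tasks.
Now let's turn to universal sentence embeddings.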
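There are currently many competing schemes for learning sentence embeddings. While simple baselines such as averaging word embeddings consistently give strong results, several novel unsupervised and supervised approaches, as well as multi-task learning schemes, emerged in late 2017 and early 2018.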
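Bag-of-words approaches
The general consensus in the field is that simply averaging the word vectors of a sentence (the so-called bag-of-words approach) gives a strong baseline for many downstream tasks.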
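A good algorithm for computing such a baseline is given in the paper by Arora et al., "A Simple but Tough-to-Beat Baseline for Sentence Embeddings", published at ICLR last year (2017): take a popular word embedding, encode a sentence as a linear weighted combination of its word vectors, then perform common component removal (remove each vector's projection onto the first principal component). This general method has a deep and powerful theoretical motivation, based on a generative model in which a discourse vector performs a random walk to generate text.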
A bag-of-words baseline even stronger than the algorithm of Arora et al. is the recent Concatenated p-mean Embeddings approach from the Technical University of Darmstadt (thanks to Yaser for pointing out this work).
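As a rough sketch of the idea behind the name, the code below concatenates several power means of a sentence's word vectors. It assumes only the three simplest cases, p = 1 (arithmetic mean), p = +inf (element-wise maximum), and p = -inf (element-wise minimum). The function names are illustrative, and the original work also concatenates several different word embedding types, which this sketch omits.

```python
import numpy as np

def power_mean(vectors, p):
    """Element-wise power mean over the rows of `vectors` (shape: words x dim)."""
    if p == np.inf:
        return vectors.max(axis=0)
    if p == -np.inf:
        return vectors.min(axis=0)
    if p == 1:
        return vectors.mean(axis=0)
    raise ValueError("this sketch only handles p in {1, +inf, -inf}")

def concat_p_means(vectors, ps=(1, np.inf, -np.inf)):
    """Concatenate the power means, giving a vector of size len(ps) * dim."""
    return np.concatenate([power_mean(vectors, p) for p in ps])

word_vecs = np.array([[1.0, -2.0],
                      [3.0, 4.0]])       # two words, two dimensions
sentence_vec = concat_p_means(word_vecs)  # mean, then max, then min
```

The sentence vector grows with the number of means used, trading a larger representation for information that a plain average throws away.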
Bag-of-words approaches lose word order but keep a surprising amount of semantic and syntactic content
Unsupervised approaches
Skip-thoughts vectors, proposed by Jamie Kiros et al. in 2015, are an approach based on unsupervised training.
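Unsupervised schemes learn sentence embeddings as a by-product of learning to predict the sentence or clause that follows a given one. In theory, this approach can make use of any text dataset that contains coherently ordered sentences or clauses.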
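Skip-thoughts vectors are the typical example of learning sentence embeddings without supervision. They can be thought of as the sentence-level counterpart of the skip-gram model for word embeddings: given a sentence, predict the sentences around it. The model is an RNN-based encoder-decoder that reconstructs the surrounding sentences from the current one.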
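The vocabulary expansion scheme in the Skip-thoughts paper is interesting: words unseen during training are handled by learning a linear transformation between the RNN's word embedding space and a word embedding space such as word2vec.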
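A minimal sketch of this vocabulary expansion, assuming the map is fitted by ordinary least squares on the words the two vocabularies share. The matrices here are random stand-ins, and `fit_vocab_map` is an illustrative name rather than a function from the Skip-thoughts code.

```python
import numpy as np

def fit_vocab_map(w2v_shared, rnn_shared):
    """Least-squares matrix W such that w2v_shared @ W approximates rnn_shared."""
    W, *_ = np.linalg.lstsq(w2v_shared, rnn_shared, rcond=None)
    return W

rng = np.random.default_rng(0)
true_map = rng.normal(size=(5, 3))
w2v_shared = rng.normal(size=(50, 5))   # word2vec vectors of the shared words
rnn_shared = w2v_shared @ true_map      # the encoder's vectors for the same words
W = fit_vocab_map(w2v_shared, rnn_shared)

# A word the encoder never saw, but which has a word2vec vector:
unseen_w2v = rng.normal(size=5)
unseen_rnn = unseen_w2v @ W             # its estimated encoder-space embedding
```

Once `W` is fitted, any word with a pre-trained vector can be fed to the encoder, which is how a model trained on a small vocabulary can handle a much larger one.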
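Quick-thoughts vectors are a recent development of Skip-thoughts vectors, presented at this year's ICLR (2018). In this work, the task of predicting the next sentence given the previous one is reformulated as a classification task: the decoder is replaced by a classifier that chooses the next sentence from a set of candidates. This can be seen as a discriminative approximation to the generation problem.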
One major strength of this model is its training speed (an order of magnitude faster than Skip-thoughts), which makes it a competitive option on large datasets.
The Quick-thoughts classifier chooses the next sentence from a set of sentence embeddings
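The classification view can be sketched as follows. The code assumes the sentence embeddings are already computed and scores each candidate by its inner product with the context embedding, followed by a softmax, as in the Quick-thoughts paper. The encoders themselves are omitted, and the vectors are hand-picked for illustration.

```python
import numpy as np

def next_sentence_probs(context_vec, candidate_vecs):
    """Softmax over inner products between a context and candidate embeddings."""
    scores = candidate_vecs @ context_vec
    scores = scores - scores.max()  # subtract the max for numerical stability
    probs = np.exp(scores)
    return probs / probs.sum()

def classification_loss(context_vec, candidate_vecs, true_index):
    """Negative log-probability assigned to the true next sentence."""
    return -np.log(next_sentence_probs(context_vec, candidate_vecs)[true_index])

context = np.array([1.0, 0.0])
candidates = np.array([[1.0, 0.0],    # the true next sentence
                       [0.0, 1.0],    # an unrelated sentence
                       [-1.0, 0.0]])  # a sentence pointing the opposite way
probs = next_sentence_probs(context, candidates)
loss = classification_loss(context, candidates, true_index=0)
```

Training only needs one dot product per candidate instead of decoding a sentence word by word, which is consistent with the speed advantage described above.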
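Supervised approaches
For a long time, supervised learning of sentence embeddings was thought to give lower-quality embeddings than unsupervised approaches. That assumption has recently been overturned, in part following the publication of InferSent.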
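Unlike the unsupervised approaches described above, supervised learning requires a labeled dataset annotated for some task, such as natural language inference (pairs of sentences in an entailment relation, for example) or machine translation (pairs of translated sentences). This raises two questions: which task should be chosen, and how large a dataset is needed to get good embeddings? We discuss these questions in the next subsection, on multi-task learning. Before that, let's look at InferSent, the breakthrough published in 2017.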
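InferSent is an interesting approach because of the simplicity of its architecture. It uses a sentence encoder to train a classifier on the Stanford Natural Language Inference (SNLI) dataset, which contains 570,000 sentence pairs, each labeled with one of three categories: neutral, contradiction, or entailment. Both sentences of a pair are encoded with the same encoder, and the classifier is trained on a pair representation built from the two sentence embeddings. The sentence encoder is a bidirectional LSTM followed by max pooling.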
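Two pieces of this architecture are easy to sketch without a full LSTM: the max pooling over time steps and the pair representation. The feature combination below (both embeddings, their absolute difference, and their element-wise product) follows the InferSent paper. The hidden states are hand-written stand-ins for BiLSTM outputs, and the function names are illustrative.

```python
import numpy as np

def max_pool(hidden_states):
    """Max over time steps; hidden_states has shape (num_tokens, dim)."""
    return hidden_states.max(axis=0)

def pair_features(u, v):
    """Combine two sentence embeddings into a single classifier input."""
    return np.concatenate([u, v, np.abs(u - v), u * v])

# Stand-ins for the BiLSTM hidden states of a premise and a hypothesis.
premise_states = np.array([[1.0, 0.0],
                           [0.5, 2.0]])
hypothesis_states = np.array([[3.0, 1.0],
                              [2.0, 0.0],
                              [1.0, 0.5]])
u = max_pool(premise_states)      # same pooling (and encoder) for both sentences
v = max_pool(hypothesis_states)
features = pair_features(u, v)    # input to the 3-way classifier
```

Because the same encoder produces `u` and `v`, the classifier can only succeed if the encoder packs the relevant meaning of each sentence into its fixed-size vector.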
Multi-task learning
As noted above, supervised learning requires choosing a dataset labeled for a particular task:
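Which supervised training task would learn sentence embeddings that generalize better to downstream tasks?
Multi-task learning can be seen as a generalization of Skip-thoughts, InferSent, and the related unsupervised and supervised learning schemes. It answers the question above by trying to combine several training objectives in a single training scheme.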
Several multi-task learning proposals were published in early 2018. Let's take a brief look at MILA/MSR's General Purpose Sentence Representation and Google's Universal Sentence Encoder.
In the MILA/MSR work, presented at ICLR 2018 (arXiv:1804.00079), Subramanian et al. observe that in order to generalize over a wide range of diverse tasks, it is necessary to encode multiple aspects of the same sentence.
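The authors therefore use a one-to-many multi-task learning framework and learn a universal sentence embedding by switching between tasks. The six chosen tasks (prediction of the next and previous sentence, neural machine translation, constituency parsing, natural language inference, and so on) share the same sentence embedding, obtained from a bidirectional GRU. Experiments suggest that syntactic properties are learned better with a multilingual neural machine translation task, that length and word order are learned better with a parsing task, and that training on natural language inference leads to better encoding of syntactic information.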
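The "shared encoder, switch between tasks" scheme can be illustrated with a deliberately tiny model. This is only a sketch under strong assumptions: the shared encoder is a single linear map instead of a bidirectional GRU, every task is a toy regression instead of a sequence decoder or classifier, and the task names and random targets are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
dim_in, dim_emb = 8, 4
task_names = ("next_sentence", "translation", "parsing", "nli")
params = {
    "encoder": rng.normal(scale=0.1, size=(dim_in, dim_emb)),  # shared by all tasks
    "heads": {t: rng.normal(scale=0.1, size=(dim_emb, 1)) for t in task_names},
}
# Made-up linear targets so that every toy task has something to learn.
targets = {t: rng.normal(size=(dim_in, 1)) for t in task_names}

def train_step(task, x, y, lr=0.05):
    """One gradient step on one task; returns the mean squared error before it."""
    emb = x @ params["encoder"]              # shared "sentence embedding"
    err = emb @ params["heads"][task] - y
    grad_head = emb.T @ err / len(x)
    grad_enc = x.T @ (err @ params["heads"][task].T) / len(x)
    params["heads"][task] -= lr * grad_head  # only this task's head moves
    params["encoder"] -= lr * grad_enc       # the shared encoder always moves
    return float(np.mean(err ** 2))

for step in range(200):
    task = task_names[rng.integers(len(task_names))]  # switch task at every step
    x = rng.normal(size=(16, dim_in))
    train_step(task, x, x @ targets[task])
```

Every step updates the shared encoder but only one task head, so the encoder is pushed toward a representation that serves all tasks at once.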
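Google's Universal Sentence Encoder, published in early 2018, follows the same approach. Its encoder uses a Transformer network trained on a variety of data sources and a variety of tasks, with the aim of dynamically accommodating a wide range of natural language understanding tasks. A pre-trained version of the model is available for TensorFlow.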
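The field of universal word and sentence embeddings has seen many other interesting developments over the last few months, including methods for evaluating these embeddings and research on their inherent bias (a big issue when it comes to universal embeddings). This post could not cover these latest topics, but some links are provided at the end.
I hope you enjoyed this brief overview. If you did, please give it a like.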
Perone et al. recently published a paper comparing ELMo, InferSent, Google's Universal Sentence Encoder, p-mean, Skip-thought, and other embedding methods: arXiv:1806.06259
Hironsan's GitHub repository collects many resources related to word embeddings: Hironsan/awesome-embedding-models
Sentence embedding papers: Skip-Thoughts (arXiv:1506.06726), Quick-Thoughts (openreview/rJvJXZb0W), DiscSent (arXiv:1705.00557), InferSent (arXiv:1705.02364), MILA/MSR's General Purpose Sentence Representations (arXiv:1804.00079), Google's Universal Sentence Encoder (arXiv:1803.11175), and Google's Input-Output Sentence learning on dialog (arXiv:1804.07754).
If you are interested in how sentence embeddings are evaluated, don't miss Facebook's recent work SentEval, and the GLUE benchmark recently published by researchers from NYU, UW, and DeepMind.
Recent developments in word embeddings
The rise of universal sentence embeddings
Conclusion
Related resources
Translated by weakish