Equivalent of from keras.preprocessing.text import one_hot in PyTorch?

Posted 2023-02-16


【Title】: from keras.preprocessing.text import one_hot equivalent in pytorch? 【Posted】: 2021-08-02 20:03:24 【Question】:

I am just getting started with PyTorch for NLP. I found a tutorial that uses from keras.preprocessing.text import one_hot to convert text into a one_hot representation for a given vocabulary size.

For example, the input is:

from keras.preprocessing.text import one_hot

vocab_size = 10000
sentence = ['the glass of milk',
            'the cup of tea',
            'I am a good boy']

onehot_repr = [one_hot(words, vocab_size) for words in sentence]

The output is:

[[6654, 998, 8896, 1609], [6654, 998, 1345, 879], [123, 7653, 1, 5678, 7890]]

How can I do the same thing in PyTorch and get output like the one shown above?


【Answer 1】:

PyTorch fundamentally works with tensors and is not designed to work with strings. However, you can use scikit-learn's LabelEncoder to encode your words as integer indices:

from sklearn import preprocessing

le = preprocessing.LabelEncoder()
le.fit([w for s in sentence for w in s.split()])

onehot_repr = [le.transform(s.split()) for s in sentence]

>>> onehot_repr
[array([10,  5,  8,  7]), array([10,  4,  8,  9]), array([0, 2, 1, 6, 3])]
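If you want something closer to what the Keras function does, note that, as far as I know, one_hot does not build a vocabulary at all: it hashes each word into the range [1, vocab_size - 1], so the exact numbers in the question cannot be reproduced with LabelEncoder. Below is a minimal pure-Python sketch of that idea. The helper names hashed_ids and pad are my own, not part of any library. It uses md5 as a stable hash, so it will not give the same numbers as Keras, and it only lowercases and splits on whitespace without stripping punctuation.

```python
import hashlib


def hashed_ids(text, vocab_size):
    """Map each word to an integer in [1, vocab_size - 1] via a stable hash.

    Hypothetical helper: a simplified take on the hashing trick,
    not the Keras implementation itself.
    """
    words = text.lower().split()
    return [
        int(hashlib.md5(w.encode("utf-8")).hexdigest(), 16) % (vocab_size - 1) + 1
        for w in words
    ]


def pad(seqs, pad_id=0):
    """Right-pad every sequence with pad_id so all rows have equal length."""
    max_len = max(len(s) for s in seqs)
    return [s + [pad_id] * (max_len - len(s)) for s in seqs]


vocab_size = 10000
sentence = ['the glass of milk',
            'the cup of tea',
            'I am a good boy']

# One list of word ids per sentence; the same word always gets the same id.
onehot_repr = [hashed_ids(s, vocab_size) for s in sentence]

# Index 0 is never produced by hashed_ids, so it is free to use for padding.
padded = pad(onehot_repr)
```

Since padded is a rectangular list of integers, torch.tensor(padded) should turn it into a 3 x 5 integer tensor that can be fed to an nn.Embedding layer. Keep in mind that hashing can map two different words to the same id; if that matters, build an explicit vocabulary instead, as in the LabelEncoder approach above.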

