Detecting Unique Terms In A List Contained of Multiple Strings
【Posted】: 2022-01-13 17:28:06
【Question】:
example = ["duran duran sang wild boys in 1984",
           "wild boys don't remain forever wild",
           "who brought wild flowers",
           "it was john krakauer who wrote in to the wild"]
How can I detect the unique terms and put them in a list like this:
['duran', 'sang', 'wild', 'boys', 'in', '1984', "don't", 'remain', 'forever', 'who', 'brought', 'flowers', 'it', 'was', 'john',
'krakauer', 'wrote', 'to', 'the']
My code:
def uniqueterms(a, d, e, f):
    b = a.split()
    c = []
    for x in b:
        if a.count(x) >= 1 and (x not in c):
            c.append(x)
    print((' '.join(c)).split(), end=' ')
    g = d.split()
    h = []
    for y in g:
        if d.count(y) >= 1 and (y not in h):
            h.append(y)
    print((' '.join(h)).split(), end=' ')
    i = e.split()
    j = []
    for z in i:
        if e.count(z) >= 1 and (z not in j):
            j.append(z)
    print((' '.join(j)).split(), end=' ')
    k = f.split()
    m = []
    for t in k:
        if f.count(t) >= 1 and (t not in m):
            m.append(t)
    print((' '.join(m)).split())
>>> uniqueterms(example[0], example[1], example[2], example[3])
['duran', 'sang', 'wild', 'boys', 'in', '1984'] ['wild', 'boys', "don't", 'remain', 'forever'] ['who', 'brought', 'wild', 'flowers'] ['it', 'was', 'john', 'krakauer', 'who', 'wrote', 'in', 'to', 'the', 'wild']
【Question comments】:
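【Answer 1】: *Updated to return the unique words in their order of appearance. The previous version used Python's set(), which does not preserve the input order:
def get_unique_words(text):
    visited = set()   # words seen so far, for fast membership checks
    uniq = []         # unique words in first-seen order
    for word in text.split():
        if word not in visited:
            uniq.append(word)
            visited.add(word)
    return uniq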
To handle a list of strings:
def get_unique_words_from_list_of_strings(str_list):
    return get_unique_words(' '.join(str_list))
Running your example:
words_in_order = get_unique_words_from_list_of_strings(example)
returns
['duran', 'sang', 'wild', 'boys', 'in', '1984', "don't", 'remain', 'forever', 'who', 'brought', 'flowers', 'it', 'was', 'john', 'krakauer', 'wrote', 'to', 'the']
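For comparison, here is a minimal sketch of the same order-preserving idea written with dict.fromkeys instead of an explicit set and list. It is not part of the original answer; the function name unique_terms_in_order is hypothetical, and the sketch assumes Python 3.7 or later, where dicts are guaranteed to keep insertion order.

```python
def unique_terms_in_order(strings):
    """Return each whitespace-separated word once, in first-seen order.

    Hypothetical helper for illustration; relies on dicts preserving
    insertion order (guaranteed from Python 3.7).
    """
    # Flatten every string in the list into a single stream of words.
    words = (word for s in strings for word in s.split())
    # dict.fromkeys keeps only the first occurrence of each key.
    return list(dict.fromkeys(words))


example = ["duran duran sang wild boys in 1984",
           "wild boys don't remain forever wild",
           "who brought wild flowers",
           "it was john krakauer who wrote in to the wild"]

print(unique_terms_in_order(example))
```

Because a dict remembers the order in which keys were inserted, the result is the same from run to run, unlike iterating over a plain set of strings.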
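【Comments】:
To make the solution complete, you should call it as get_unique_words(' '.join(example))
OK, but now how do I sort the list?
Updated to include how to sort the list
Every time I run it, it gives a different, random output, so I can't get this output by any means I know. Any help? ['duran', 'sang', 'wild', 'boys', 'in', '1984', "don't", 'remain', 'forever', 'who', 'brought', 'flowers', 'it', 'was', 'john', 'krakauer', 'wrote', 'to', 'the']
Ah, OK, you want them ordered by their appearance in the original list of strings. That's a bit trickier, give me a moment.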