Getting a specific substring out of a very long string
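I am using regular expressions to extract the data I need for some comparison operations. I scraped https://play.pokemonshowdown.com/data/pokedex.js?fe67f5ac, which contains a list of every possible Pokemon. I use a regular expression to split the Pokemon into tiers, like this:
LC = re.findall(r'name:(.+?)tier:"LC"', data)
and then collect the data like this: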
self.LC = s.join(LC)
self.names = (re.findall(r'name:"(.+?)"', self.LC))
self.stats = (re.findall(r'baseStats:\{(.+?)\}', self.LC))
self.types = (re.findall(r'types:\[(.+?)]', self.LC))
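Unfortunately, when I run the first line of code, it captures almost every Pokemon in the list, which produces an extremely inefficient and useless string. Any help on how to solve this would be greatly appreciated.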
This is what I currently get when I print the first item of the list:
"Bulbasaur",types:["Grass","Poison"],genderRatio:{M:0.875,F:0.125},
baseStats:{hp:45,atk:49,def:49,spa:65,spd:65,spe:45},
abilities:{"0":"Overgrow",H:"Chlorophyll"},heightm:0.7,weightkg:6.9,color:"Green",
evos:["Ivysaur"],eggGroups:["Monster","Grass"],tier:"LC"},ivysaur:{num:2,name:"Ivysaur",types:["Grass","Poison"],..........
What I want is:
"Bulbasaur",types:["Grass","Poison"],genderRatio:{M:0.875,F:0.125},
baseStats:{hp:45,atk:49,def:49,spa:65,spd:65,spe:45},
abilities:{"0":"Overgrow",H:"Chlorophyll"},heightm:0.7,weightkg:6.9,color:"Green",
evos:["Ivysaur"],eggGroups:["Monster","Grass"],
The string should end just before whichever tier I am searching for, in this case LC.
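The over-matching described above can be reproduced on a small sample. The sketch below is only an illustration: the two-entry `data` string and its tier labels are hypothetical stand-ins shaped like pokedex.js, and the second pattern is one possible way to keep a match inside a single entry (it forbids the captured text from crossing another `name:` key), not necessarily the approach the thread settled on.

```python
import re

# Hypothetical two-entry sample shaped like pokedex.js (tiers chosen for illustration).
data = (
    'venusaur:{num:3,name:"Venusaur",types:["Grass","Poison"],'
    'baseStats:{hp:80,atk:82},tier:"OU"},'
    'charmander:{num:4,name:"Charmander",types:["Fire"],'
    'baseStats:{hp:39,atk:52},tier:"LC"}'
)

# Pattern from the question: the lazy group starts at the FIRST "name:" and
# keeps growing until the first tier:"LC", so it swallows the non-LC entry too.
spanning = re.findall(r'name:(.+?)tier:"LC"', data)

# Tempered variant: each consumed character must not begin another "name:",
# so a match cannot cross from one entry into the next.
per_entry = re.findall(r'name:((?:(?!name:).)+?)tier:"LC"', data)

print(spanning)
print(per_entry)
```

With this sample, the first pattern returns one long string containing both Pokemon, while the second returns only the Charmander entry. It still relies on `name:` appearing exactly once per entry, which is an assumption about the real file.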
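Answer
As an alternative, you can use one of the suggestions from "How to convert raw javascript object to python dictionary?" to convert the JavaScript object into proper JSON (only after removing the "exports" assignment at the start of the string and the ";" at the end). From there you can use Python dictionary notation to access and filter the data as needed.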
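As a rough, dependency-free illustration of the conversion idea, the sketch below quotes the bare keys of a small object literal with a regular expression and then parses it with `json.loads`. The sample string and the helper name `js_object_to_dict` are hypothetical. The key-quoting regex is a simplification that assumes no string value contains text shaped like `,word:`, so it is not a general JavaScript parser.

```python
import json
import re

# Hypothetical one-entry sample shaped like the raw pokedex.js payload.
js_text = (
    'exports.BattlePokedex = {bulbasaur:{num:1,name:"Bulbasaur",'
    'types:["Grass","Poison"],abilities:{"0":"Overgrow",H:"Chlorophyll"},'
    'tier:"LC"}};'
)


def js_object_to_dict(text):
    """Parse a simple JS object literal by quoting its bare keys (sketch only)."""
    # Keep only the object literal: from the first "{" to the last "}".
    body = text[text.index("{"): text.rindex("}") + 1]
    # Quote identifier keys that directly follow "{" or ",".
    # Assumption: no string value contains a ',word:' or '{word:' sequence.
    quoted = re.sub(r'([{,])\s*([A-Za-z_]\w*)\s*:', r'\1"\2":', body)
    return json.loads(quoted)


pokedex = js_object_to_dict(js_text)
print(pokedex["bulbasaur"]["tier"])
```

Slicing between the first "{" and the last "}" avoids hard-coding the length of the export prefix, at the cost of assuming the payload is a single object literal.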
Raw JavaScript object
exports.BattlePokedex = {bulbasaur:{num:1,name:"Bulbasaur",types:["Grass","Poison"],genderRatio:{M:0.875,
F:0.125},baseStats:{hp:45,atk:49,def:49,spa:65,spd:65,spe:45},abilities:{"0":"Overgrow",H:"Chlorophyll"},
heightm:0.7,weightkg:6.9,color:"Green",evos:["Ivysaur"],eggGroups:["Monster","Grass"],tier:"LC"},
ivysaur:{num:2,name:"Ivysaur",types:["Grass","Poison"],genderRatio:{M:0.875,F:0.125},
baseStats:{hp:60,atk:62,def:63,spa:80,spd:80,spe:60},abilities:... ...
JavaScript Object Notation (JSON)
{"bulbasaur":{"num":1,"name":"Bulbasaur","types":["Grass","Poison"],"genderRatio":{"M":0.875,"F":0.125},
"baseStats":{"hp":45,"atk":49,"def":49,"spa":65,"spd":65,"spe":45},"abilities":{"0":"Overgrow",
"H":"Chlorophyll"},"heightm":0.7,"weightkg":6.9,"color":"Green","evos":["Ivysaur"],
"eggGroups":["Monster","Grass"],"tier":"LC"},"ivysaur":{"num":2,"name":"Ivysaur","types":["Grass","Poison"],
"genderRatio":{"M":0.875,"F":0.125},"baseStats":{"hp":60,"atk":62,"def":63,"spa":80,"spd":80,"spe":60},
"abilities":... ...
import requests
import json
import _jsonnet
import collections

r = requests.get("https://play.pokemonshowdown.com/data/pokedex.js?fe67f5ac")
d = r.content.decode('utf-8')
# remove the JS export prefix ("exports.BattlePokedex = ", 24 characters) and the ";" at the end
json_str = d[24:-1]
print(json_str)
# convert to JSON
json_obj = _jsonnet.evaluate_snippet("snippet", json_str)
pyDict = json.loads(json_obj)
print(pyDict)
print("Total: {}".format(len(pyDict)))
# group the entries by tier, keeping only the fields of interest
tier_dict = collections.defaultdict(list)
for pkm in pyDict:
    tier = pyDict[pkm].get("tier")
    if tier:
        tier_dict[tier].append({
            pyDict[pkm].get("name"): {
                "baseStats": pyDict[pkm].get("baseStats"),
                "types": pyDict[pkm].get("types")
                # add desired stat here
            }
        })
print(tier_dict)
Total: 1203
Output of tier_dict
{
    'NU': [
        {'Abomasnow': {
            'baseStats': {
                'atk': 92,
                ...
    ],
    'Illegal': [
        {'Abomasnow-Mega': {
            'baseStats': {
                'atk': 132,
                ...
    ],
    'RU': [
        {'Accelgor': {
            'baseStats': {
                'atk': 70,
                ...
    ],
    'OU': [
        {'Aegislash': {
            'baseStats': {
                'atk': 50,
                ...
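Once the data is an ordinary dictionary, picking out a single tier no longer needs any string matching. The following sketch uses a small hypothetical `pokedex` sample (abridged stats, illustrative tier labels, and a made-up entry without a tier) and a hypothetical helper `group_by_tier`. Unlike the list-of-dicts layout above, it maps each tier to a name-keyed dictionary, which is just one possible design choice that makes lookups by name direct.

```python
from collections import defaultdict

# Hypothetical sample mirroring the parsed pokedex dictionary (values abridged).
pokedex = {
    "bulbasaur": {"name": "Bulbasaur", "types": ["Grass", "Poison"],
                  "baseStats": {"hp": 45, "atk": 49}, "tier": "LC"},
    "ivysaur": {"name": "Ivysaur", "types": ["Grass", "Poison"],
                "baseStats": {"hp": 60, "atk": 62}, "tier": "NFE"},
    # Made-up entry with no "tier" key, to show that such entries are skipped.
    "exampleentry": {"name": "Example", "types": ["Normal"]},
}


def group_by_tier(entries):
    """Map tier -> {name: selected fields}, skipping entries without a tier."""
    tiers = defaultdict(dict)
    for entry in entries.values():
        tier = entry.get("tier")
        if tier:
            tiers[tier][entry["name"]] = {
                "baseStats": entry.get("baseStats"),
                "types": entry.get("types"),
            }
    return tiers


tiers = group_by_tier(pokedex)
lc_names = sorted(tiers["LC"])
print(lc_names)
print(tiers["LC"]["Bulbasaur"]["baseStats"])
```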