Getting a JsonDecodeError on Jupyter Notebook
【Title】: Getting a JsonDecodeError on Jupyter Notebook 【Posted】: 2019-10-12 01:22:43 【Question】: I am setting up a Jupyter Notebook that applies a machine learning model from the IBM Watson Studio API to some data from my PostgreSQL database.
While reshaping the data so the API can read it, I get JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1), which I cannot resolve.
Here is the full traceback:
---------------------------------------------------------------------------
JSONDecodeError Traceback (most recent call last)
<ipython-input-114-9d8e7cf98a41> in <module>()
1 import json
2
----> 3 classes = natural_language_classifier.classify_collection('7818d2s519-nlc-1311', reshaped).get_result()
4
5 print(json.dumps(classes, indent=2))
/opt/conda/envs/DSX-Python35/lib/python3.5/site-packages/watson_developer_cloud/natural_language_classifier_v1.py in classify_collection(self, classifier_id, collection, **kwargs)
152 if collection is None:
153 raise ValueError('collection must be provided')
--> 154 collection = [self._convert_model(x, ClassifyInput) for x in collection]
155
156 headers =
/opt/conda/envs/DSX-Python35/lib/python3.5/site-packages/watson_developer_cloud/natural_language_classifier_v1.py in <listcomp>(.0)
152 if collection is None:
153 raise ValueError('collection must be provided')
--> 154 collection = [self._convert_model(x, ClassifyInput) for x in collection]
155
156 headers =
/opt/conda/envs/DSX-Python35/lib/python3.5/site-packages/watson_developer_cloud/watson_service.py in _convert_model(val, classname)
461 if classname is not None and not hasattr(val, "_from_dict"):
462 if isinstance(val, str):
--> 463 val = json_import.loads(val)
464 val = classname._from_dict(dict(val))
465 if hasattr(val, "_to_dict"):
/opt/conda/envs/DSX-Python35/lib/python3.5/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
317 parse_int is None and parse_float is None and
318 parse_constant is None and object_pairs_hook is None and not kw):
--> 319 return _default_decoder.decode(s)
320 if cls is None:
321 cls = JSONDecoder
/opt/conda/envs/DSX-Python35/lib/python3.5/json/decoder.py in decode(self, s, _w)
337
338 """
--> 339 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
340 end = _w(s, end).end()
341 if end != len(s):
/opt/conda/envs/DSX-Python35/lib/python3.5/json/decoder.py in raw_decode(self, s, idx)
353 """
354 try:
--> 355 obj, end = self.scan_once(s, idx)
356 except StopIteration as err:
357 raise JSONDecodeError("Expecting value", s, err.value) from None
JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
Here is the code in my notebook:
from watson_developer_cloud import NaturalLanguageClassifierV1
import pandas as pd
import psycopg2
import json
# connect to the database
conn_string = 'host={} port={} dbname={} user={} password={}'.format('119.203.10.242', 5432, 'mydb', 'locq', 'Mypass***')
conn_cbedce9523454e8e9fd3fb55d4c1a52e = psycopg2.connect(conn_string)
# select the description column
data_df_1 = pd.read_sql('SELECT description from public."search_product"', con=conn_cbedce9523454e8e9fd3fb55d4c1a52e)
# package phrases into format required by Watson
reshaped = json.dumps({'collection': [{'text': t} for t in data_df_1['description']]})
# connect to the Watson Studio API
natural_language_classifier = NaturalLanguageClassifierV1(
iam_apikey='F76ugy8hv1s3sr87buhb7564vb7************'
)
# apply the model to the data
classes = natural_language_classifier.classify_collection('7818d2s519-nlc-1311', reshaped).get_result()
# print the results
print(classes)
When I comment out the classes line and just run print(reshaped), this is the output I get, which is the correct format for Watson Studio:
{
"collection": [
{"text": "Lorem ipsum sjvh hcx bftiyf, hufcil, igfgvjuoigv gvj ifcil ,ghn fgbcggtc yfctgg h vgchbvju."},
{"text": "Lorem ajjgvc wiufcfboitf iujcvbnb hjnkjc ivjhn oikgjvn uhnhgv 09iuvhb oiuvh boiuhb mkjhv mkiuhygv m,khbgv mkjhgv mkjhgv."},
{"text": "Lorem aiv ibveikb jvk igvcib ok blnb v hb b hb bnjb bhb bhn bn vf vbgfc vbgv nbhgv bb nb nbh nj mjhbv mkjhbv nmjhgbv nmkn"},
{"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"},
{"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"},
{"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"},
{"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}
]
}
Please help.
Edit
This is what I just tried:
reshape = json.dumps([{'text': t} for t in data_df_1['description']])
print(reshape)
This is the result I got:
[{"text": "Lorem ipsum sjvh hcx bftiyf, hufcil, igfgvjuoigv gvj ifcil ,ghn fgbcggtc yfctgg h vgchbvju."}, {"text": "Lorem ajjgvc wiufcfboitf iujcvbnb hjnkjc ivjhn oikgjvn uhnhgv 09iuvhb oiuvh boiuhb mkjhv mkiuhygv m,khbgv mkjhgv mkjhgv."}, {"text": "Lorem aiv ibveikb jvk igvcib ok blnb v hb b hb bnjb bhb bhn bn vf vbgfc vbgv nbhgv bb nb nbh nj mjhbv mkjhbv nmjhgbv nmkn"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "lorem sivbnogc hbiuygv bnjiuygv bmkjygv nmjhgv"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "lore juhgv bnmkiuhygv nmkiuhb mkjiuhb mkjgv mkjhygv nmkjuytfrdc mjhygtfvc mkijuytfc vbnmkjuhygtfv bnmkjuhygtfvc mjhygv mjhgv nmjhuygv bnjhb mnhgv mjhgv njhgv bnjhb njhygvbnjkiuhbhjihbv mjhgbv nmkjhbhnjb njhgv njmkjhbvbh nhgv mbhhnb hjbhu njbhn njb n jjijh bb jiji bi jiijib bkiijij b hggg."}, {"text": "Lorem uhygfv bniuhgv nmkjuhgv nmkijuhygv mkihv bjijnb bnjib bjinb bnjub vgvg bhgfc nhgytredxc ngtfv mkjuygfcv bnmjuygv mjhgv bnmkjhgv njhgv njgfvc."}]
I copied the result and replaced reshape with this data:
#reshape = json.dumps([{'text': t} for t in data_df_1['description']])
reshape = [{"text": "Lorem ipsum sjvh hcx bftiyf, hufcil, igfgvjuoigv gvj ifcil ,ghn fgbcggtc yfctgg h vgchbvju."}, {"text": "Lorem ajjgvc wiufcfboitf iujcvbnb hjnkjc ivjhn oikgjvn uhnhgv 09iuvhb oiuvh boiuhb mkjhv mkiuhygv m,khbgv mkjhgv mkjhgv."}, {"text": "Lorem aiv ibveikb jvk igvcib ok blnb v hb b hb bnjb bhb bhn bn vf vbgfc vbgv nbhgv bb nb nbh nj mjhbv mkjhbv nmjhgbv nmkn"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "lorem sivbnogc hbiuygv bnjiuygv bmkjygv nmjhgv"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}, {"text": "lore juhgv bnmkiuhygv nmkiuhb mkjiuhb mkjgv mkjhygv nmkjuytfrdc mjhygtfvc mkijuytfc vbnmkjuhygtfv bnmkjuhygtfvc mjhygv mjhgv nmjhuygv bnjhb mnhgv mjhgv njhgv bnjhb njhygvbnjkiuhbhjihbv mjhgbv nmkjhbhnjb njhgv njmkjhbvbh nhgv mbhhnb hjbhu njbhn njb n jjijh bb jiji bi jiijib bkiijij b hggg."}, {"text": "Lorem uhygfv bniuhgv nmkjuhgv nmkijuhygv mkihv bjijnb bnjib bjinb bnjub vgvg bhgfc nhgytredxc ngtfv mkjuygfcv bnmjuygv mjhgv bnmkjhgv njhgv njgfvc."}]
classes = natural_language_classifier.classify_collection('7818d2s519-nlc-1311', reshape).get_result()
print(classes)
I got a successful response this way, but it is not a good approach. Is there a proper solution?
【Comments】:
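A likely reason the copy-and-paste workaround succeeds is that the pasted literal is a Python list of dicts, whereas json.dumps() produces a single str. The sketch below illustrates only that type difference; the `descriptions` list is a made-up stand-in for data_df_1['description'], and no Watson call is involved.

```python
import json

# Hypothetical stand-in for data_df_1['description'] (not the real data).
descriptions = ["first product", "second product"]

# A Python list of dicts: what the pasted literal evaluates to.
as_list = [{"text": t} for t in descriptions]

# json.dumps() serializes that list into one JSON-formatted string.
as_str = json.dumps(as_list)

print(type(as_list).__name__)
print(type(as_str).__name__)

# json.loads() turns the string back into an equal list of dicts,
# which is effectively what pasting the printed output did by hand.
restored = json.loads(as_str)
print(restored == as_list)
```

Under these assumptions, building the list directly (and skipping json.dumps() altogether) gives the same object as the manual copy-and-paste.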
Can you try json.dumps({"collection": [{"text": t} for t in data_df_1["description"]]})?
Try new_reshaped = json.loads(json.dumps({"collection": [{"text": t} for t in data_df_1["description"]]})) and pass this new_reshaped.
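If you read the source code here, it says the collection should be a list. Check this example. Try json.dumps([{"text": t} for t in data_df_1["description"]]) or json.dumps([{"collection": [{"text": t} for t in data_df_1["description"]]}]).
Did you read the working example (in the link I provided above)?
Can you try manually changing the output of print, reassigning it to another variable, and trying again? Change it to the format shown in the example.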
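【Answer 1】:
The problem is that json.dumps() returns a <class 'str'> (the JSON representation), whereas classify_collection() expects a <class 'list'> as input. So we do not use json.dumps() here; we just replace the single quotes with double quotes (") for the keys and pass the <class 'list'> to the function.
reshape = [{"text": t} for t in data_df_1["description"]]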
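The exact error message can be reproduced without Watson. Judging from the traceback in the question, the SDK iterates over the collection and calls json.loads() on every item that is a str. Iterating over a string yields single characters, so the first item is just "{". The `convert_items` function below is a simplified, hypothetical stand-in written for this illustration, not the SDK's actual code, and `descriptions` is made-up data.

```python
import json

def convert_items(collection):
    """Hypothetical, simplified imitation of the per-item conversion
    visible in the traceback (not the real SDK implementation)."""
    converted = []
    for item in collection:
        if isinstance(item, str):
            # A str item is assumed to be JSON and gets parsed.
            item = json.loads(item)
        converted.append(dict(item))
    return converted

# Made-up stand-in for data_df_1["description"].
descriptions = ["first product", "second product"]
items = [{"text": t} for t in descriptions]

# Passing the list of dicts works: each item is already a dict.
print(convert_items(items))

# Passing the json.dumps() string fails: iteration yields "{" first,
# and "{" alone is not a valid JSON document.
error_message = None
try:
    convert_items(json.dumps({"collection": items}))
except json.JSONDecodeError as err:
    error_message = str(err)
    print(error_message)
```

This matches the position reported in the question (line 1 column 2, char 1): the parser sees an opening brace and then runs out of input where a property name should start.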