在 Jupyter Notebook 上获取 JsonDecodeError

Posted

技术标签:

【中文标题】在 Jupyter Notebook 上获取 JsonDecodeError【英文标题】:Getting a JsonDecodeError on Jupyter Notebook 【发布时间】:2019-10-12 01:22:43 【问题描述】:

我正在设置一个 Jupyter Notebook,它将来自 Ibm watson Studio API 的机器学习模型应用于来自我的 Postgresql 数据库的一些数据。

在重塑数据以供 API 读取时,出现JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1),我无法解决。

这是完整的回溯:

---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
<ipython-input-114-9d8e7cf98a41> in <module>()
      1 import json
      2 
----> 3 classes = natural_language_classifier.classify_collection('7818d2s519-nlc-1311', reshaped).get_result()
      4 
      5 print(json.dumps(classes, indent=2))

/opt/conda/envs/DSX-Python35/lib/python3.5/site-packages/watson_developer_cloud/natural_language_classifier_v1.py in classify_collection(self, classifier_id, collection, **kwargs)
    152         if collection is None:
    153             raise ValueError('collection must be provided')
--> 154         collection = [self._convert_model(x, ClassifyInput) for x in collection]
    155 
    156         headers = 

/opt/conda/envs/DSX-Python35/lib/python3.5/site-packages/watson_developer_cloud/natural_language_classifier_v1.py in <listcomp>(.0)
    152         if collection is None:
    153             raise ValueError('collection must be provided')
--> 154         collection = [self._convert_model(x, ClassifyInput) for x in collection]
    155 
    156         headers = 

/opt/conda/envs/DSX-Python35/lib/python3.5/site-packages/watson_developer_cloud/watson_service.py in _convert_model(val, classname)
    461         if classname is not None and not hasattr(val, "_from_dict"):
    462             if isinstance(val, str):
--> 463                 val = json_import.loads(val)
    464             val = classname._from_dict(dict(val))
    465         if hasattr(val, "_to_dict"):

/opt/conda/envs/DSX-Python35/lib/python3.5/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    317             parse_int is None and parse_float is None and
    318             parse_constant is None and object_pairs_hook is None and not kw):
--> 319         return _default_decoder.decode(s)
    320     if cls is None:
    321         cls = JSONDecoder

/opt/conda/envs/DSX-Python35/lib/python3.5/json/decoder.py in decode(self, s, _w)
    337 
    338         """
--> 339         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    340         end = _w(s, end).end()
    341         if end != len(s):

/opt/conda/envs/DSX-Python35/lib/python3.5/json/decoder.py in raw_decode(self, s, idx)
    353         """
    354         try:
--> 355             obj, end = self.scan_once(s, idx)
    356         except StopIteration as err:
    357             raise JSONDecodeError("Expecting value", s, err.value) from None

JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

这是我笔记本中的代码:

from watson_developer_cloud import NaturalLanguageClassifierV1
import pandas as pd
import psycopg2
import json

# connect to the database
conn_string = 'host= port=  dbname=  user=  password='.format('119.203.10.242', 5432, 'mydb', 'locq', 'Mypass***')
conn_cbedce9523454e8e9fd3fb55d4c1a52e = psycopg2.connect(conn_string)

# select the description column
data_df_1 = pd.read_sql('SELECT description from public."search_product"', con=conn_cbedce9523454e8e9fd3fb55d4c1a52e)

# package phrases into format required by Watson
reshaped = json.dumps('collection': ['text' : t for t in data_df_1['description']])

# connect to the Watson Studio API
natural_language_classifier = NaturalLanguageClassifierV1(
    iam_apikey='F76ugy8hv1s3sr87buhb7564vb7************'
)

# apply the model to the datas
classes = natural_language_classifier.classify_collection('7818d2s519-nlc-1311', reshaped).get_result()

# print the results
print(classes)

当我评论 classes 行并且我只是做 print(reshaped) 时,这是我得到的响应,这是 Watson 工作室的正确格式:


  "collection": [
    
      "text": "Lorem ipsum sjvh  hcx bftiyf,  hufcil, igfgvjuoigv gvj ifcil ,ghn fgbcggtc   yfctgg h vgchbvju."
    ,
    
      "text": "Lorem ajjgvc wiufcfboitf iujcvbnb hjnkjc  ivjhn oikgjvn uhnhgv 09iuvhb  oiuvh boiuhb mkjhv mkiuhygv m,khbgv mkjhgv mkjhgv."
    ,
    
      "text": "Lorem aiv ibveikb jvk igvcib ok blnb v  hb b hb bnjb bhb bhn bn vf vbgfc vbgv nbhgv bb nb nbh nj mjhbv mkjhbv nmjhgbv nmkn"
    ,
    
      "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"
    ,
    
      "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"
    ,
    
      "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"
    ,
    
      "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"
    
  ]

请帮忙。

编辑

这就是我刚刚做的:

reshape = json.dumps(['text' : t for t in data_df_1['description']])


print(reshape)

这是我得到的结果:

["text": "Lorem ipsum sjvh  hcx bftiyf,  hufcil, igfgvjuoigv gvj ifcil ,ghn fgbcggtc   yfctgg h vgchbvju.", "text": "Lorem ajjgvc wiufcfboitf iujcvbnb hjnkjc  ivjhn oikgjvn uhnhgv 09iuvhb  oiuvh boiuhb mkjhv mkiuhygv m,khbgv mkjhgv mkjhgv.", "text": "Lorem aiv ibveikb jvk igvcib ok blnb v  hb b hb bnjb bhb bhn bn vf vbgfc vbgv nbhgv bb nb nbh nj mjhbv mkjhbv nmjhgbv nmkn", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "lorem sivbnogc hbiuygv bnjiuygv bmkjygv nmjhgv", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "lore  juhgv bnmkiuhygv nmkiuhb mkjiuhb mkjgv mkjhygv nmkjuytfrdc mjhygtfvc mkijuytfc vbnmkjuhygtfv bnmkjuhygtfvc mjhygv mjhgv nmjhuygv bnjhb mnhgv mjhgv njhgv bnjhb njhygvbnjkiuhbhjihbv mjhgbv nmkjhbhnjb njhgv njmkjhbvbh nhgv mbhhnb hjbhu njbhn njb n  jjijh bb jiji bi jiijib bkiijij b hggg.", "text": "Lorem uhygfv bniuhgv nmkjuhgv nmkijuhygv mkihv bjijnb bnjib bjinb bnjub vgvg bhgfc nhgytredxc ngtfv mkjuygfcv bnmjuygv mjhgv bnmkjhgv njhgv njgfvc."]

我复制了结果并用这些数据替换了重塑:

#reshape = json.dumps(['text' : t for t in data_df_1['description']])

reshape = ["text": "Lorem ipsum sjvh  hcx bftiyf,  hufcil, igfgvjuoigv gvj ifcil ,ghn fgbcggtc   yfctgg h vgchbvju.", "text": "Lorem ajjgvc wiufcfboitf iujcvbnb hjnkjc  ivjhn oikgjvn uhnhgv 09iuvhb  oiuvh boiuhb mkjhv mkiuhygv m,khbgv mkjhgv mkjhgv.", "text": "Lorem aiv ibveikb jvk igvcib ok blnb v  hb b hb bnjb bhb bhn bn vf vbgfc vbgv nbhgv bb nb nbh nj mjhbv mkjhbv nmjhgbv nmkn", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "lorem sivbnogc hbiuygv bnjiuygv bmkjygv nmjhgv", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx", "text": "lore  juhgv bnmkiuhygv nmkiuhb mkjiuhb mkjgv mkjhygv nmkjuytfrdc mjhygtfvc mkijuytfc vbnmkjuhygtfv bnmkjuhygtfvc mjhygv mjhgv nmjhuygv bnjhb mnhgv mjhgv njhgv bnjhb njhygvbnjkiuhbhjihbv mjhgbv nmkjhbhnjb njhgv njmkjhbvbh nhgv mbhhnb hjbhu njbhn njb n  jjijh bb jiji bi jiijib bkiijij b hggg.", "text": "Lorem uhygfv bniuhgv nmkjuhgv nmkijuhygv mkihv bjijnb bnjib bjinb bnjub vgvg bhgfc nhgytredxc ngtfv mkjuygfcv bnmjuygv mjhgv bnmkjhgv njhgv njgfvc."]


classes = natural_language_classifier.classify_collection('7818d2s519-nlc-1311', reshape).get_result()

print(classes)

我通过这种方式得到了成功的响应.. 但这不是一个很好的方法。有什么解决办法吗?

【问题讨论】:

你能试试这个json.dumps("collection": ["text" : t for t in data_df_1["description"]])吗? 试试这个new_reshaped = json.loads(json.dumps("collection": ["text" : t for t in data_df_1["description"]]))并传递这个new_reshape。 如果你阅读源代码here,它说集合应该是列表。检查this 示例。试试这个json.dumps(["text" : t for t in data_df_1["description"]])json.dumps(["collection": ["text" : t for t in data_df_1["description"]]]) 您是否阅读了运行良好的示例(在我上面提供的链接中)? 您可以尝试手动更改 print 的输出并将其重新分配给另一个变量并尝试吗?更改为示例中显示的格式。 【参考方案1】:

问题在于 json.dumps() 正在返回 &lt;class 'str'&gt;(json 表示),而分类集合()的输入需要 &lt;class 'list'&gt;。因此,我们在这里不使用 json.dumps(),而只是将 replace 用双引号(")作为键并将 &lt;class 'list'&gt; 传递给函数。

reshape = ["text" : t for t in data_df_1["description"]]

【讨论】:

以上是关于在 Jupyter Notebook 上获取 JsonDecodeError的主要内容,如果未能解决你的问题,请参考以下文章

使 Jupyter notebook 以 html 格式执行

Jupyter Notebook 连接到远程配置单元

获取Jupyter Notebook中定义的对象的源代码

无法在子目录中创建Jupyter Notebook

在 Ubuntu 上启动时运行 Jupyter-notebook

访问在 Docker 容器上运行的 Jupyter notebook