Error thrown when using to_sql to write pandas df to mysql
【Posted】: 2021-01-25 18:47:57
【Question】: I'm scraping reddit with praw and storing the records in a pandas df. I connect to my AWS RDS db with a combination of sqlalchemy and pymysql, and use to_sql to append the records to an existing table. Everything seems to work fine until I call the to_sql method. It throws the following error, and I'm not sure where to go from here. Any help or suggestions would be great!
engine = sqlalchemy.create_engine('mysql+pymysql://username:password@database...rds.amazonaws.com:3306/socialdata')
df_comment = pd.DataFrame(comment_table)
df_comment.to_sql(name='reddit_comments', con=engine, index=False, if_exists='append')
Traceback (most recent call last):
File "/Users/ty/Desktop/Python/reddit_scraper.py", line 121, in <module>
df_comment.to_sql(name='reddit_comments', con=engine, index=False, if_exists='append')
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/core/generic.py", line 2605, in to_sql
sql.to_sql(
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/io/sql.py", line 589, in to_sql
pandas_sql.to_sql(
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/io/sql.py", line 1398, in to_sql
table.insert(chunksize, method=method)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/io/sql.py", line 830, in insert
exec_insert(conn, keys, chunk_iter)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/io/sql.py", line 747, in _execute_insert
conn.execute(self.table.insert(), data)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
return meth(self, multiparams, params)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
ret = self._execute_context(
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
self._handle_dbapi_exception(
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1514, in _handle_dbapi_exception
util.raise_(exc_info[1], with_traceback=exc_info[2])
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1256, in _execute_context
self.dialect.do_executemany(
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 148, in do_executemany
rowcount = cursor.executemany(statement, parameters)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/cursors.py", line 188, in executemany
return self._do_execute_many(q_prefix, q_values, q_postfix, args,
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/cursors.py", line 206, in _do_execute_many
v = values % escape(next(args), conn)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/cursors.py", line 120, in _escape_args
return {key: conn.literal(val) for (key, val) in args.items()}
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/cursors.py", line 120, in <dictcomp>
return {key: conn.literal(val) for (key, val) in args.items()}
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/connections.py", line 469, in literal
return self.escape(obj, self.encoders)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/connections.py", line 462, in escape
return converters.escape_item(obj, self.charset, mapping=mapping)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/converters.py", line 27, in escape_item
val = encoder(val, mapping)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/converters.py", line 123, in escape_unicode
return u"'%s'" % _escape_unicode(value)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/converters.py", line 78, in _escape_unicode
return value.translate(_escape_table)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/praw/models/reddit/base.py", line 35, in __getattr__
return getattr(self, attribute)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/praw/models/reddit/base.py", line 36, in __getattr__
raise AttributeError(
AttributeError: 'Redditor' object has no attribute 'translate'
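【Answer 1】: One of the columns in the DataFrame contains a custom "Redditor" object, which doesn't map to a corresponding SQL data type. When a value isn't something obvious like an int, float, or string, pymysql ends up calling the object's translate method (see escape_unicode in the traceback), and Redditor has no such method.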
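To find out which column is responsible before calling to_sql, one option is to scan the records for values whose type is not a plain scalar. This is only a minimal sketch: it assumes comment_table is a list of dicts, the SQL_FRIENDLY tuple is my own rough list rather than pymysql's actual encoder table, and the Redditor class here is a hypothetical stand-in for the praw object.

```python
import datetime
import decimal
from typing import Any, Dict, List, Set

# Assumption: a rough list of types a MySQL driver can usually encode directly.
SQL_FRIENDLY = (
    str, bytes, int, float, bool, type(None),
    decimal.Decimal, datetime.date, datetime.time,
)


def find_unsupported_columns(records: List[Dict[str, Any]]) -> Dict[str, Set[str]]:
    """Map each column name to the names of any non-scalar types found in it."""
    problems: Dict[str, Set[str]] = {}
    for row in records:
        for column, value in row.items():
            if not isinstance(value, SQL_FRIENDLY):
                problems.setdefault(column, set()).add(type(value).__name__)
    return problems


class Redditor:
    """Hypothetical stand-in for praw's Redditor, used only for this sketch."""


# Hypothetical sample shaped like the rows that end up in the DataFrame.
sample = [{"comment_id": "abc", "author": Redditor(), "score": 3}]
problems = find_unsupported_columns(sample)
print(problems)  # {'author': {'Redditor'}}
```

Any column that shows up in the result needs to be converted to a string or number before the insert.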
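If Redditor is just a wrapper around a username and other metadata, you can remap that column to a string or numeric representation of the Redditor object. If it is a class you defined yourself, you can add a translate() method to the Redditor class that returns the appropriate value. For example, if Redditor.id contains the value you want stored in the column: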
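class Redditor:
    def translate(self, table):
        # pymysql calls value.translate(_escape_table), so accept the table argument.
        # Replace self.id with the value you care about.
        return str(self.id).translate(table)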
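Note that the traceback shows the driver calling value.translate(_escape_table) with one argument, so a translate method on your own class has to accept that argument and should return an escaped string. The sketch below illustrates this with a hypothetical Author class and a simplified escape table that I made up for the example; it is not pymysql's real table.

```python
class Author:
    """Hypothetical user-defined class; not praw's Redditor."""

    def __init__(self, author_id: str) -> None:
        self.id = author_id

    def translate(self, table):
        # Delegate to str.translate so the caller's escape table is applied.
        return str(self.id).translate(table)


# Assumption: a simplified escape table covering only quotes and backslashes.
escape_table = str.maketrans({"'": "\\'", "\\": "\\\\"})

# Mimics the quoting step seen in the traceback: "'%s'" % value.translate(table)
escaped = "'%s'" % Author("o'brien").translate(escape_table)
print(escaped)
```

This only helps for classes you control. For third-party objects, converting the column beforehand is the simpler route.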
Or do the conversion in pandas before you save:
df[REDDITOR_COLUMN] = df[REDDITOR_COLUMN].apply(lambda x: x.id)
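One caveat with the lambda above: it raises AttributeError if a cell is None, which can happen when a comment has no author (for example, a deleted account). A slightly more defensive converter might look like the sketch below. FakeRedditor is a hypothetical stand-in for the praw object, and the choice of the name attribute instead of id is an assumption about which value you want to store.

```python
from typing import Any, Optional


class FakeRedditor:
    """Hypothetical stand-in for praw's Redditor, used only for this sketch."""

    def __init__(self, name: str) -> None:
        self.name = name


def author_to_text(author: Any) -> Optional[str]:
    """Return a plain string for an author-like object, or None if it is missing."""
    if author is None:
        return None  # a missing author is stored as NULL
    if isinstance(author, str):
        return author  # already a plain string
    # Assumes the object exposes a `name` attribute; otherwise fall back to str().
    return str(getattr(author, "name", author))


authors = [FakeRedditor("alice"), None, "bob"]
converted = [author_to_text(a) for a in authors]
print(converted)  # ['alice', None, 'bob']
```

With a DataFrame, the same function could be passed to df[REDDITOR_COLUMN].apply(author_to_text) before calling to_sql.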
【Comments】:
Hey, you're right, I got ahead of myself for a second without checking. I've edited the post accordingly, thanks!