How to use regex replace to replace a special character?
【Posted】: 2020-03-24 06:45:50
【Question】: I am trying to use regex replace to replace "\" with \, but I am not getting the right result. I want to remove the double quotes that surround the backslash. Could you help me with how to do this?
Example:
"\""warfarin was discontinued 3 days ago and xarelto was started when the INR was 2.7, and now the INR is 5.8, should Xarelto be continued or stopped?"
Expected result:
\"warfarin was discontinued 3 days ago and xarelto was started when the INR was 2.7, and now the INR is 5.8, should Xarelto be continued or stopped?"
【Comments】:
Has your problem been solved?
【Answer 1】:
Does this solve your problem?
import re
re.sub(r'"\\"', r'\\', text)  # returns text with each "\" sequence replaced by a single backslash
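To see what that call does, here is a minimal sketch in plain Python. The sample string is a shortened version of the one in the question, and the variable names `text` and `cleaned` are only illustrative.

```python
import re

# Shortened sample from the question: it starts with the three
# characters "\" followed by one more double quote.
text = r'"\""warfarin was discontinued 3 days ago"'

# The pattern r'"\\"' matches a quote, a literal backslash, and a quote.
# The replacement r'\\' becomes a single backslash after re processes it.
cleaned = re.sub(r'"\\"', r'\\', text)

print(cleaned)  # \"warfarin was discontinued 3 days ago"
```

Raw strings keep the pattern readable: without the `r` prefix, every backslash would have to be doubled again for the Python string literal.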
【Comments】:
Hi Alex, thanks for your reply. I am still facing the same issue. I used this:
df = df.withColumn('QSTN', regexp_replace(col('QSTN'), '"\\"', '\\'))
To save the DataFrame I am using:
df.repartition(1).write.format('com.databricks.spark.csv').mode('overwrite').save(output_path, escape= '\"', sep='|',header='True',nullValue=None)
【Answer 2】:
Try the following solution:
import pandas as pd
from pyspark.sql.functions import regexp_replace, col

# Assumes an existing SparkSession named `spark`.
df = spark.createDataFrame([
    (1, '"\\""warfarin was discontinued 3 days ago and xarelto was started when the INR was 2.7, and now the INR is 5.8, should Xarelto be continued or stopped?"')
], ("ID","textVal"))

pd.set_option('display.max_colwidth', 200)  # show the full text column
# The pattern matches the three characters "\" and replaces them with a single backslash.
df2 = df.withColumn('textVal', regexp_replace(col('textVal'), '\\"\\\\\"', '\\\\'))
df2.toPandas()
ID textVal
0 1 \"warfarin was discontinued 3 days ago and xarelto was started when the INR was 2.7, and now the INR is 5.8, should Xarelto be continued or stopped?"
Hope this helps!
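The difference between this pattern and the one tried in the comment comes down to two layers of escaping: first the Python string literal, then the regex engine. The sketch below uses Python's `re` module as a stand-in for the Java regex engine behind Spark's `regexp_replace`. The two agree on these particular escapes, but treat that as an assumption to check on a real cluster. The sample string is shortened and the variable names are illustrative.

```python
import re

# Shortened sample from the question.
sample = r'"\""warfarin was discontinued 3 days ago"'

# Pattern literal from Answer 2 (non-raw) and an equivalent raw spelling.
answer_pattern = '\\"\\\\\"'
assert answer_pattern == r'\"\\"'  # five characters: \ " \ \ "

# The regex engine reads \" as a quote and \\ as one backslash,
# so this pattern matches the three characters "\".
answer_matches = re.findall(answer_pattern, sample)
print(answer_matches)  # ['"\\"']

# Pattern literal from the comment: only three characters, " \ ".
# The regex engine reads that as a quote followed by an escaped quote,
# so it matches two consecutive quotes and never sees a backslash.
comment_pattern = '"\\"'
comment_matches = re.findall(comment_pattern, sample)
print(comment_matches)  # ['""']
```

Writing the patterns as raw strings (for example `r'"\\"'`) removes the first escaping layer and makes it easier to see what the regex engine actually receives.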