Formatting and Compacting JSON with Python

Posted 2023-04-08 To Be Yourself


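Formatting

JSON is a data format we use constantly at work. Formatted (indented) and compact storage take different amounts of disk space: an indented file can be close to twice the size of its compact equivalent. So it is sometimes worth storing JSON compactly (on a single line), although that makes the file awkward to read.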
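The size difference is easy to measure. The following is a minimal sketch, assuming some made-up sample data (the `records` list is hypothetical and not from the original scripts); the actual ratio depends on how deeply nested the data is and how long the values are.

```python
import json

# Hypothetical sample data, used only to compare output sizes.
records = [{"id": i, "name": f"item-{i}", "tags": ["a", "b"]} for i in range(100)]

# Indented output versus single-line output with the separator spaces removed.
pretty = json.dumps(records, indent=4, ensure_ascii=False)
compact = json.dumps(records, separators=(",", ":"), ensure_ascii=False)

# Compare the sizes in bytes, as they would be on disk in UTF-8.
pretty_size = len(pretty.encode("utf-8"))
compact_size = len(compact.encode("utf-8"))

print("pretty :", pretty_size, "bytes")
print("compact:", compact_size, "bytes")
print("ratio  :", round(pretty_size / compact_size, 2))
```

Both strings decode back to the same data, so the extra bytes are purely whitespace.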

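Scenarios:

Notepad opens JSON files the fastest, but it has no formatting feature.
Notepad++ can format JSON, but it needs a plugin that is installed over the network, so it cannot be set up on an isolated intranet.
VS Code has formatting built in (right-click -> Format), but repeating that for thousands of files is far too slow.

So a Python script can be used to convert JSON back and forth between the formatted and compact forms.

The examples below use Python 3.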

# coding=utf-8
import json
import os


def getFileCon(filename):
    # Read the whole file as text; "utf_8_sig" also accepts files that start with a BOM.
    with open(filename, encoding="utf_8_sig") as f:
        return f.read()


def writeFile(filepath, con):
    # Write UTF-8 explicitly so the result does not depend on the platform's default encoding.
    with open(filepath, "w", encoding="utf-8") as f:
        f.write(con)


if __name__ == "__main__":
    filePath = '.'
    # os.walk visits the start directory and every subdirectory below it.
    for filepath, dirnames, filenames in os.walk(filePath):
        for f in filenames:
            # Only .json files are processed.
            if not f.endswith(".json"):
                continue
            g = os.path.join(filepath, f)
            try:
                con = json.loads(getFileCon(g))
                # indent=4 pretty-prints; ensure_ascii=False keeps Chinese characters readable.
                writeFile(g, json.dumps(con, indent=4, ensure_ascii=False))
                print(g, 'OK')
            except Exception as e:
                # Report the file that failed and carry on with the rest.
                print(g, e)

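Copy this script into the target directory, open cmd there, and run python formatjsonAll.py. Every JSON file in that directory, including its subdirectories, is reformatted in place.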
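Because the script overwrites files in place, it may be worth trying the idea on a throwaway directory first. The following is a rough sketch of that, not the script above: `format_tree` is a hypothetical helper name, and it uses `pathlib` and `tempfile` from the standard library instead of `os.walk`.

```python
import json
import tempfile
from pathlib import Path


def format_tree(root, indent=4):
    """Pretty-print every *.json file under root in place; return the paths rewritten."""
    done = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8-sig"))
        except (OSError, ValueError) as exc:
            # Unreadable or invalid files are reported and left untouched.
            print(path, exc)
            continue
        path.write_text(json.dumps(data, indent=indent, ensure_ascii=False), encoding="utf-8")
        done.append(path)
    return done


# Try it on a temporary directory instead of real data.
with tempfile.TemporaryDirectory() as tmp:
    sub = Path(tmp) / "sub"
    sub.mkdir()
    sample = sub / "a.json"
    sample.write_text('{"name":"凉凉","n":1}', encoding="utf-8")
    changed = format_tree(tmp)
    result_text = sample.read_text(encoding="utf-8")
    print(result_text)
```

The temporary directory is deleted when the `with` block ends, so nothing is left behind.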

Compacting

# coding=utf-8
import json
import os


def getFileCon(filename):
    # Read the whole file as text; "utf_8_sig" also accepts files that start with a BOM.
    with open(filename, encoding="utf_8_sig") as f:
        return f.read()


if __name__ == "__main__":
    filePath = '.'
    # os.walk visits the start directory and every subdirectory below it.
    for filepath, dirnames, filenames in os.walk(filePath):
        for f in filenames:
            # Only .json files are processed.
            if not f.endswith(".json"):
                continue
            g = os.path.join(filepath, f)
            try:
                con = json.loads(getFileCon(g))
                # Without indent, json.dump writes everything on a single line.
                # Passing separators=(",", ":") as well would drop the remaining spaces.
                with open(g, "w", encoding="utf-8") as outfile:
                    json.dump(con, outfile, ensure_ascii=False)
                print(g, 'OK')
            except Exception as e:
                # Report the file that failed and carry on with the rest.
                print(g, e)
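The two scripts differ only in how the data is written back, so the conversion itself could be folded into one helper with a flag. This is a minimal sketch under that assumption; `reformat_json_text` and its `compact` parameter are hypothetical names, and the compact branch also passes `separators=(",", ":")`, which the script above does not.

```python
import json


def reformat_json_text(text, compact=False):
    """Return text re-serialized as one-line compact JSON or as indented JSON."""
    data = json.loads(text)
    if compact:
        # separators=(",", ":") drops the spaces that the default separators add.
        return json.dumps(data, ensure_ascii=False, separators=(",", ":"))
    return json.dumps(data, ensure_ascii=False, indent=4)


original = '{"name": "凉凉", "items": [1, 2, 3]}'
compact_text = reformat_json_text(original, compact=True)
pretty_text = reformat_json_text(compact_text)

print(compact_text)  # {"name":"凉凉","items":[1,2,3]}
print(pretty_text)
```

Converting in either direction leaves the data unchanged; only the whitespace differs.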

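Using json.dump

json.dumps() serializes a Python object to JSON and returns the result as a string.
json.dump() serializes a Python object to JSON and writes it to a file-like object fp, so it is the one to use when working with files.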

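with open(g, "w", encoding="utf-8") as outfile:
    json.dump(con, outfile, ensure_ascii=False)

# g is the file name and con is the parsed JSON data. ensure_ascii defaults to True, which escapes every non-ASCII character as \uXXXX; setting it to False writes Chinese characters as they are.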
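The relationship between the two functions can be checked without touching the disk. In this small sketch, `io.StringIO` stands in for an open text file, and the `con` value is made-up sample data.

```python
import io
import json

con = {"name": "凉凉", "count": 2}  # made-up sample data

# json.dumps returns the JSON text as a str.
as_string = json.dumps(con, ensure_ascii=False)

# json.dump writes the same text to any object with a write() method.
buffer = io.StringIO()  # stands in for an open text file
json.dump(con, buffer, ensure_ascii=False)
written = buffer.getvalue()

print(as_string)             # {"name": "凉凉", "count": 2}
print(written == as_string)  # True
```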

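The json library's usage notes can be read with help("json").

By default the encoded JSON string is compact, and keys are written in the dict's own order rather than sorted. The dumps method offers several optional parameters that make the output more readable or smaller.
(1) sort_keys tells the encoder to output dict keys in sorted order (a to z).

(2) indent pretty-prints the data with indentation so it reads more clearly; its value is the number of spaces per indentation level.

(3) separators sets the item and key separators. separators=(',', ':') removes the space after ',' and ':', which helps when transmitting data, where leaner is better.

(4) skipkeys: during encoding, dict keys may only be str, int, float, bool or None. Any other key type raises TypeError, unless skipkeys=True is given, in which case those keys are skipped.

(5) To output real Chinese characters, specify ensure_ascii=False.

With no configuration, that is, with the defaults, '凉凉' is output as ASCII \uXXXX escape sequences rather than as the Chinese characters themselves.
This is because json.dumps escapes all non-ASCII characters by default when serializing.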
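The five parameters can be seen side by side in a short sketch. The data is made up for illustration, and the tuple key is there only to trigger the skipkeys behavior.

```python
import json

data = {"b": 1, "a": "凉凉", (1, 2): "tuple key"}

# A tuple is not a valid JSON object key, so the default call raises TypeError.
try:
    json.dumps(data)
    raised = False
except TypeError:
    raised = True

# skipkeys=True silently drops the entry with the unsupported key.
skipped = json.dumps(data, skipkeys=True, ensure_ascii=False)

sorted_text = json.dumps({"b": 1, "a": 2}, sort_keys=True)  # keys in a-to-z order
indented = json.dumps({"a": [1, 2]}, indent=2)              # two spaces per level
tight = json.dumps({"a": [1, 2]}, separators=(",", ":"))    # no spaces after , and :

escaped = json.dumps("凉凉")                       # ASCII-only \uXXXX escapes
readable = json.dumps("凉凉", ensure_ascii=False)  # the characters themselves

print(raised)       # True
print(skipped)      # {"b": 1, "a": "凉凉"}
print(sorted_text)  # {"a": 2, "b": 1}
print(indented)
print(tight)        # {"a":[1,2]}
print(escaped)
print(readable)     # "凉凉"
```

Both `escaped` and `readable` decode back to the same string with json.loads; ensure_ascii only changes how the text is written, not what it means.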
