Mini Program Size Optimization: Optimizing Large Text

Posted by smallcoder


Background

Yesterday I took over a mini program and was asked to add a few pages. Once the pages were written, the preview failed. Why? The package size had exceeded 2 MB. Mini programs do currently support subpackaging, which raises the limit to 4 MB, but with future business growth in mind I decided to do a round of size optimization first.
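For context, subpackaging is declared in app.json. A minimal sketch, where the package root and page paths are hypothetical placeholders rather than paths from this project:

{
  "pages": ["pages/index/index"],
  "subpackages": [
    {
      "root": "packageA",
      "pages": ["pages/detail/detail"]
    }
  ]
}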

Removing unused data

To cut the size down, start with the big files. The first large file I found was address.js, and it was hefty: a full 145 KB. Let's take a look at it.

// The original data looks roughly like this

module.exports = [{"code":"110000","region":"北京","regionEntitys":[{"code":"110100","region":"北京市","regionEntitys":[{"code":"110101","region":"东城区"},{"code":"110102","region":"西城区"},{"code":"110105","region":"朝阳区"},{"code":"110106","region":"丰台区"},{"code":"110107","region":"石景山区"},{"code":"110108","region":"海淀区"},{"code":"110109","region":"门头沟区"},{"code":"110111","region":"房山区"},{"code":"110112","region":"通州区"},{"code":"110113","region":"顺义区"},{"code":"110114","region":"昌平区"},{"code":"110115","region":"大兴区"},{"code":"110116","region":"怀柔区"},{"code":"110117","region":"平谷区"},{"code":"110118","region":"密云区"},{"code":"110119","region":"延庆区"},{"code":"110199","region":"其他区"}]}]},{"code":"120000","region":"天津","regionEntitys":[{"code":"120100","region":"天津市","regionEntitys":[{"code":"120101","region":"和平区"},{"code":"120102","region":"河东区"},{"code":"120103","region":"河西区"},{"code":"120104","region":"南开区"},{"code":"120105","region":"河北区"},{"code":"120106","region":"红桥区"},{"code":"120110","region":"东丽区"},{"code":"120111","region":"西青区"},{"code":"120112","region":"津南区"},{"code":"120113","region":"北辰区"},{"code":"120114","region":"武清区"},{"code":"120115","region":"宝坻区"},{"code":"120116","region":"滨海新区"},{"code":"120117","region":"宁河区"},{"code":"120118","region":"静海区"},{"code":"120119","region":"蓟州区"},{"code":"120199","region":"其他区"}]}]},{"code":"130000","region":"河北省","regionEntitys":[{"code":"130100","region":"石家庄市","regionEntitys":[{"code":"130102","region":"长安区"},{"code":"130104","region":"桥西区"},{"code":"130105","region":"新华区"},{"code":"130107","region":"井陉矿区"},{"code":"130108","region":"裕华区"},{"code":"130109","region":"藁城区"},{"code":"130110","region":"鹿泉区"},{"code":"130111","region":"栾城区"},{"code":"130121","region":"井陉县"},{"code":"130123","region":"正定县"},{"code":"130125","region":"行唐县"},{"code":"130126","region":"灵寿县"},{"code":"130127","region":"高邑县"},{"code":"130128","region":"深泽县"},{"code":"130129","region":"赞皇县"},{"code":"130130","region":"无极县"},{"code":"130131","region":"平山县"},{"code":"130132","region":"元氏县"},{"code":"130133","region":"赵县"},{"code":"130183","region":"晋州市"},{"code":"130184","region":"新乐市"},{"code":"130199","region":"其他区"}]},{"code":"130200","region":"唐山市","regionEntitys":[{"code":"130202","region":"路南区"},{"code":"130203","region":"路北区"},{"code":"130204","region":"古冶区"},{"code":"130205","region":"开平区"},{"code":"130207","region":"丰南区"},{"code":"130208","region":"丰润区"},{"code":"130209","region":"曹妃甸区"},{"code":"130223","region":"滦县"},{"code":"130224","region":"滦南县"},{"code":"130225","region":"乐亭县"},{"code":"130227","region":"迁西县"},{"code":"130229","region":"玉田县"},{"code":"130281","region":"遵化市"},{"code":"130283","region":"迁安市"},{"code":"130299","region":"其他区"}]},{"code":"130300","region":"秦皇岛市","regionEntitys":[{"code":"130302","region":"海港区"},{"code":"130303","region":"山海关区"},{"code":"130304","region":"北戴河区"},{"code":"130306","region":"抚宁区"}, ...], ... ]

Looking at the pages that consume this data, I found that the code field is never used, so the first step is to strip it out with a global replace.

let str = JSON.stringify(data)
// Strip the unused code field
str = str.replace(/"code":"\d{6}",/g, '')

After the replacement the file is down to 90 KB, an immediate 38% reduction in size.

Shortening key names

Dropping the code field saved quite a lot, but there is still room for more. Let's shorten the long key names and see how much that buys us.

// Rename regionEntitys to E
// Rename region to R

// Note: regionEntitys must be replaced first; replacing region first
// would turn it into "REntitys" and the longer pattern would no longer match
str = str.replace(/regionEntitys/g, 'E')
str = str.replace(/region/g, 'R')

The file is now 68 KB. Just by shortening the key names, we shaved off another 24%.

Data dictionary

Can we shrink it any further at this point? Of course. Just as the key names can be shortened, the Chinese strings share many common characters that can be pulled out into a data dictionary. First, count which characters occur most often:

let hashMap = {};
for(let i = 0, len = str.length; i < len; i++){
  let char = str[i];
  // Skip JSON punctuation and the shortened keys E / R
  if(['{', '}', '[', ']', ':', ',', '"', 'E', 'R'].indexOf(char) > -1) continue
  if(!hashMap[char]){
    hashMap[char] = 0
  }
  hashMap[char] += 1
}

let sortList = [];
for(let i in hashMap){
  sortList.push([i, hashMap[i]])
}

// The top 20 entries of sortList, sorted by frequency
[ [ '县', 1503 ],
  [ '区', 1305 ],
  [ '市', 667 ],
  [ '其', 341 ],
  [ '他', 341 ],
  [ '族', 198 ],
  [ '山', 172 ],
  [ '治', 161 ],
  [ '自', 160 ],
  [ '城', 157 ],
  [ '州', 147 ],
  [ '阳', 132 ],
  [ '江', 125 ],
  [ '安', 120 ],
  [ '南', 109 ],
  [ '东', 85 ],
  [ '平', 82 ],
  [ '宁', 80 ],
  [ '河', 78 ],
  [ '西', 74 ] ]

Once the counting is done, do one more global text replacement:

let top20 = sortList.sort((a, b) => b[1] - a[1]).slice(0, 20).map(i => i[0])
let keyMap = 'abcdefghijklmnopqrstuvwxyz';

const pat = new RegExp(`(${top20.join('|')})`, 'g')
// Replace each of the 20 most frequent characters with a single ASCII letter
str = str.replace(pat, (hit) => {
  let index = top20.indexOf(hit);
  return keyMap.charAt(index);
})

The compressed text now looks something like this:

let region = [{"R":"北京","E":[{"R":"北京c","E":[{"R":"pjb"},{"R":"tjb"},{"R":"朝lb"},{"R":"丰台b"},{"R":"石景gb"},{"R":"海淀b"},{"R":"门头沟b"},{"R":"房gb"},{"R":"通kb"},{"R":"顺义b"},{"R":"昌qb"},{"R":"大兴b"},{"R":"怀柔b"},{"R":"q谷b"},{"R":"密云b"},{"R":"延庆b"},{"R":"deb"}]}]},{"R":"天津","E":[{"R":"天津c","E":[{"R":"和qb"},{"R":"spb"},{"R":"stb"},{"R":"o开b"},{"R":"s北b"},{"R":"红桥b"},{"R":"p丽b"}, ...]

After this pass the text is 56 KB, another 17% reduction.
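These three passes (dropping code, shortening keys, dictionary substitution) are meant to run once, offline, rather than inside the mini program itself. A minimal Node.js sketch of such a build step, assuming the original data lives in address.js and the result is written to address.min.js (the file names and the output module shape are my assumptions, not from the original demo):

// build-address.js -- one-off offline compression script (a sketch; file names are assumptions)
const fs = require('fs')
const data = require('./address.js')

let str = JSON.stringify(data)
str = str.replace(/"code":"\d{6}",/g, '')   // drop the unused code field
str = str.replace(/regionEntitys/g, 'E')    // shorten key names (order matters)
str = str.replace(/region/g, 'R')

// Count character frequencies, skipping JSON punctuation and the short keys
const skip = new Set(['{', '}', '[', ']', ':', ',', '"', 'E', 'R'])
const counts = {}
for (const ch of str) {
  if (!skip.has(ch)) counts[ch] = (counts[ch] || 0) + 1
}
const top20 = Object.entries(counts)
  .sort((a, b) => b[1] - a[1])
  .slice(0, 20)
  .map(([ch]) => ch)
const keyMap = 'abcdefghijklmnopqrstuvwxyz'
str = str.replace(new RegExp(`(${top20.join('|')})`, 'g'),
  (hit) => keyMap.charAt(top20.indexOf(hit)))

// Ship the compressed data together with the dictionary needed to decode it
fs.writeFileSync('./address.min.js',
  `module.exports = { region: ${str}, top20: ${JSON.stringify(top20)} }`)

A consumer would then require address.min.js and feed its region and top20 fields into a decode step like the getRegion shown below.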

Parsing the text

Because the text is now dictionary-compressed, it has to be decoded before use, reusing the extracted dictionary:

// Build the reverse pattern from the substitute letters that were actually used
let letters = keyMap.slice(0, top20.length).split('')
let pat = new RegExp(`(${letters.join('|')})`, 'g')

function getRegion() {
  let data = null;
  try {
    data = JSON.parse(JSON.stringify(region).replace(pat, (hit) => {
      // Map each ASCII letter back to the original Chinese character
      return top20[keyMap.indexOf(hit)];
    }))
  } catch (error) {
    throw new Error(error);
  }
  return data
}

In testing, decoding takes around 5 ms, which is well within an acceptable range.
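For reference, a quick way to reproduce that measurement is to time a call to getRegion() in the developer tools console (a rough sketch; the ~5 ms figure is from the original test and will vary by device):

// Rough timing of the decode step; results vary by device
const t0 = Date.now()
const list = getRegion()
console.log(`decoded ${list.length} provinces in ${Date.now() - t0} ms`)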

After all this effort, the file finally went from the original 145 KB down to 56 KB, a 61% reduction in size overall.

As you can see, compression helps, but the biggest win was still removing the unused field. Following the same line of thinking, the next step is to optimize the images.

example

The demo for this article is here: 小程序体积优化(1)--优化大文件 demo
