Chapter 1: Simple Crawling of Dynamic Pages

Posted 2020-12-19 wby-110

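When AWS DMS is used to replicate data from MySQL to PostgreSQL, the tool itself does not carry over secondary objects such as indexes, and it has some additional limitations. Rebuilding the indexes by hand is tedious, so I wrote a script: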

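Run the following in MySQL. It looks up which indexes exist in MySQL and generates the corresponding index-creation statements for PostgreSQL: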

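-- Run in MySQL. Emits one PostgreSQL CREATE INDEX statement per non-primary index.
-- Unique indexes are emitted as plain indexes; the index name reuses the MySQL name instead of a random suffix to avoid collisions.
select cc.table_name,
       concat(cc.ct_inx, '(',
              group_concat(concat('"', substring_index(substring_index(cc.colns, ',', b.help_topic_id + 1), ',', -1), '"')
                           order by b.help_topic_id),
              ');') as add_inx
from (select concat('create index "idx_', ma.table_name, '_', ma.index_name, '" on "', ma.table_name, '" ') as ct_inx,
             ma.colns, ma.table_name, ma.index_name
      from (select a.TABLE_NAME as table_name, a.INDEX_NAME as index_name,
                   group_concat(a.COLUMN_NAME order by a.SEQ_IN_INDEX) as colns
            from information_schema.statistics a
            -- where a.TABLE_SCHEMA = 'copytrading' and a.TABLE_NAME in ('t_trades')
            group by a.TABLE_NAME, a.INDEX_NAME) ma
      where ma.index_name <> 'PRIMARY') cc
join mysql.help_topic b
  on b.help_topic_id < (length(cc.colns) - length(replace(cc.colns, ',', '')) + 1)
group by cc.table_name, cc.index_name, cc.ct_inx
order by 1;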
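
To make the intent of the query easier to follow, the same grouping can be sketched in Python on a few in-memory rows. This is only an illustration, not part of the original script: the function name `build_index_statements`, the sample rows, and the `idx_<table>_<index>` naming are assumptions, and the rows merely mimic the `TABLE_NAME`, `INDEX_NAME`, `SEQ_IN_INDEX`, and `COLUMN_NAME` columns of `information_schema.statistics`.

```python
from collections import defaultdict


def build_index_statements(rows):
    """Build PostgreSQL CREATE INDEX statements.

    rows: iterable of (table_name, index_name, seq_in_index, column_name),
    shaped like rows from information_schema.statistics (hypothetical input).
    """
    columns = defaultdict(list)
    for table, index, seq, column in rows:
        if index == "PRIMARY":  # primary keys are skipped, as in the query
            continue
        columns[(table, index)].append((seq, column))

    statements = []
    for (table, index), cols in sorted(columns.items()):
        # Sort by seq_in_index so the column order of the index is preserved.
        quoted = ",".join(f'"{name}"' for _, name in sorted(cols))
        statements.append(
            f'create index "idx_{table}_{index}" on "{table}" ({quoted});'
        )
    return statements


# Hypothetical sample data; real rows would come from MySQL.
sample_rows = [
    ("t_trades", "PRIMARY", 1, "id"),
    ("t_trades", "idx_user_time", 2, "open_time"),
    ("t_trades", "idx_user_time", 1, "user_id"),
    ("t_orders", "idx_user", 1, "user_id"),
]

for statement in build_index_statements(sample_rows):
    print(statement)
```

Grouping by both table and index name matters here: two tables that happen to index the same column list (for example `user_id`) must still produce two separate statements.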

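Then take the generated statements to PostgreSQL and execute them directly. PostgreSQL then has the same indexes that MySQL had; without them it is easy to run into performance problems such as replication lag.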
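
Because the target tables may keep receiving changes from DMS while the indexes are being built, one option is to create them with PostgreSQL's `CREATE INDEX CONCURRENTLY`, adding `IF NOT EXISTS` so that the script can be re-run safely. The sketch below rewrites generated statements accordingly. The helper name `make_concurrent` is hypothetical, and it assumes every input is a plain `create index ...` statement. Note that `CONCURRENTLY` cannot run inside a transaction block, so each statement would have to be executed on its own.

```python
import re

# Matches a leading "create index " that is not already followed by
# "concurrently" or "if not exists".
_PREFIX = re.compile(
    r"^\s*create\s+index\s+(?!concurrently\b|if\s+not\s+exists\b)",
    re.IGNORECASE,
)


def make_concurrent(statement):
    """Rewrite 'create index ...' as 'create index concurrently if not exists ...'."""
    if not _PREFIX.match(statement):
        raise ValueError(f"not a plain CREATE INDEX statement: {statement!r}")
    return _PREFIX.sub(
        "create index concurrently if not exists ", statement, count=1
    )


# Hypothetical generated statement, used only for illustration.
generated = [
    'create index "idx_t_trades_idx_user_time" on "t_trades" ("user_id","open_time");',
]

for statement in generated:
    print(make_concurrent(statement))
```

Whether a concurrent build is worth the longer build time depends on how busy the target is; on an idle target the plain statements are sufficient.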
