MySQL indexing and Using filesort

【Posted】: 2015-10-05 15:16:45 【Question】:

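This is related to my last problem. I added two new columns to the listings table: views_point, which groups views (it is incremented for every 100 views), and publishedon_hourly, which stores the publish date truncated to the hour (year-month-day hour only), in order to generate some unique values.

Here is my new table: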

CREATE TABLE IF NOT EXISTS `listings` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `type` tinyint(1) NOT NULL DEFAULT '1',
  `hash` char(32) NOT NULL,
  `source_id` int(10) unsigned NOT NULL,
  `link` varchar(255) NOT NULL,
  `short_link` varchar(255) NOT NULL,
  `cat_id` mediumint(5) NOT NULL,
  `title` mediumtext NOT NULL,
  `description` mediumtext,
  `content` mediumtext,
  `images` mediumtext,
  `videos` mediumtext,
  `views` int(10) unsigned NOT NULL DEFAULT '0',
  `views_point` int(10) unsigned NOT NULL DEFAULT '0',
  `comments` int(11) DEFAULT '0',
  `comments_update` int(11) NOT NULL DEFAULT '0',
  `editor_id` int(11) NOT NULL DEFAULT '0',
  `auther_name` varchar(255) DEFAULT NULL,
  `createdby_id` int(10) NOT NULL,
  `createdon` int(20) NOT NULL,
  `editedby_id` int(10) NOT NULL,
  `editedon` int(20) NOT NULL,
  `deleted` tinyint(1) NOT NULL,
  `deletedon` int(20) NOT NULL,
  `deletedby_id` int(10) NOT NULL,
  `deletedfor` varchar(255) NOT NULL,
  `published` tinyint(1) NOT NULL DEFAULT '1',
  `publishedon` int(11) unsigned NOT NULL,
  `publishedon_hourly` int(10) unsigned NOT NULL DEFAULT '0',
  `publishedby_id` int(10) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `hash` (`hash`),
  KEY `views_point` (`views_point`),
  KEY `listings` (`publishedon_hourly`,`published`,`cat_id`,`source_id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 ROW_FORMAT=FIXED AUTO_INCREMENT=365513 ;

When I run a query like this:

SELECT *
FROM listings
WHERE (`publishedon_hourly` BETWEEN
       UNIX_TIMESTAMP( '2015-09-5 00:00:00' )
       AND UNIX_TIMESTAMP( '2015-10-5 12:00:00' ))
  AND (published =1)
  AND cat_id IN ( 1, 2, 3, 4, 5 )
ORDER BY `views_point` DESC
LIMIT 10 

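it runs fine; its EXPLAIN output was attached as a screenshot.

But when I narrow the date range from one month to one day, like this: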

SELECT *
FROM listings
WHERE (`publishedon_hourly` BETWEEN
       UNIX_TIMESTAMP( '2015-09-5 00:00:00' )
       AND UNIX_TIMESTAMP( '2015-09-5 12:00:00' ))
  AND (published =1)
  AND cat_id IN ( 1, 2, 3, 4, 5 )
  ORDER BY `views_point` DESC
  LIMIT 10 

then the query becomes slow and a filesort appears. Does anyone know why, and how I can fix it?

Sample data (from the slow query)

INSERT INTO `listings` (`id`, `type`, `hash`, `source_id`, `link`, `short_link`, `cat_id`, `title`, `description`, `content`, `images`, `videos`, `views`, `views_point`, `comments`, `comments_update`, `editor_id`, `auther_name`, `createdby_id`, `createdon`, `editedby_id`, `editedon`, `deleted`, `deletedon`, `deletedby_id`, `deletedfor`, `published`, `publishedon`, `publishedon_hourly`, `publishedby_id`) VALUES
(94189, 1, '44a46d128ce730c72927b19c445ab26e', 8, 'http://Larkin.com/sapiente-laboriosam-omnis-tempore-aliquam-qui-nobis', '', 5, 'And Alice was more and.', 'So they got settled down again very sadly and quietly, and.', 'Dormouse. ''Fourteenth of March, I think it so quickly that the Gryphon only answered ''Come on!'' and ran the faster, while more and more sounds of broken glass, from which she concluded that it was looking down at them, and then a voice sometimes choked with sobs, to sing this:-- ''Beautiful Soup, so rich and green, Waiting in a natural way. ''I thought you did,'' said the Dormouse, without considering at all what had become of it; and as it.', NULL, '', 200, 19700, 0, 0, 0, 'Max', 0, 1441442729, 0, 0, 0, 0, 0, '', 1, 1441442729, 1441440000, 0),
(19030, 1, '3438f6a555f2ce7fdfe03cee7a52882a', 3, 'http://Romaguera.com/voluptatem-rerum-quia-sed', '', 2, 'Dodo said, ''EVERYBODY.', 'I wish I hadn''t to bring but one; Bill''s got the.', 'I wonder what they''ll do well enough; don''t be particular--Here, Bill! catch hold of this remark, and thought to herself. (Alice had no idea what Latitude or Longitude I''ve got to the confused clamour of the other queer noises, would change to dull reality--the grass would be offended again. ''Mine is a long way. So she went on. ''I do,'' Alice said nothing; she had succeeded in curving it down ''important,'' and some were birds,) ''I suppose so,''.', NULL, '', 800, 19400, 0, 0, 0, 'Antonio', 0, 1441447567, 0, 0, 0, 0, 0, '', 1, 1441447567, 1441447200, 0),
(129247, 4, '87d2029a300d8b4314508786eb620a24', 10, 'http://Ledner.com/', '', 4, 'I ever saw one that.', 'The Cat seemed to be a person of authority among them,.', 'I BEG your pardon!'' she exclaimed in a natural way again. ''I wonder what was the same height as herself; and when she looked down at her feet as the question was evidently meant for her. ''I can tell you my history, and you''ll understand why it is I hate cats and dogs.'' It was all dark overhead; before her was another long passage, and the blades of grass, but she had sat down a very little! Besides, SHE''S she, and I''m sure I have dropped them, I wonder?'' As she said to herself; ''his eyes are so VERY tired of being all alone here!'' As she said to itself ''Then I''ll go round a deal.', NULL, '', 1000, 19100, 0, 0, 0, 'Drake', 0, 1441409756, 0, 0, 0, 0, 0, '', 1, 1441409756, 1441407600, 0),
(264582, 2, '5e44fe417f284f42c3b10bccd9c89b14', 8, 'http://www.Dietrich.info/laboriosam-quae-eaque-aut-dolorem', '', 2, 'Alice asked in a very.', 'THINK; or is it directed to?'' said the Mock Turtle,.', 'I can listen all day to such stuff? Be off, or I''ll have you executed.'' The miserable Hatter dropped his teacup and bread-and-butter, and then unrolled the parchment scroll, and read as follows:-- ''The Queen will hear you! You see, she came upon a little of the players to be lost, as she spoke--fancy CURTSEYING as you''re falling through the wood. ''It''s the stupidest tea-party I.', NULL, '', 800, 18700, 0, 0, 0, 'Kevin', 0, 1441441192, 0, 0, 0, 0, 0, '', 1, 1441441192, 1441440000, 0),
(44798, 1, '567cc77ba88c05a4a805dc667816a30c', 14, 'http://www.Hintz.com/distinctio-nulla-quia-incidunt-facere-reprehenderit-sapiente-sint.html', '', 5, 'The Cat seemed to Alice.', 'And the moral of that is--"Be what you mean,'' said Alice..', 'Alice very politely; but she felt very lonely and low-spirited. In a little faster?" said a sleepy voice behind her. ''Collar that Dormouse,'' the Queen said severely ''Who is it directed to?'' said the Footman, and began staring at the Footman''s head: it just at first, but, after watching it a violent blow underneath her chin: it had no pictures or conversations in it, ''and what is the capital of Paris, and Paris is the same thing, you know.'' ''I DON''T.', NULL, '', 300, 17600, 0, 0, 0, 'Rocio', 0, 1441442557, 0, 0, 0, 0, 0, '', 1, 1441442557, 1441440000, 0),
(184472, 1, 'f852e3ed401c7c72c5a9609687385f65', 14, 'https://www.Schumm.biz/voluptatum-iure-qui-dicta-modi-est', '', 4, 'Alice replied, so.', 'I should have liked teaching it tricks very much, if--if.', 'NEVER come to the Dormouse, not choosing to notice this question, but hurriedly went on, ''What''s your name, child?'' ''My name is Alice, so please your Majesty,'' said Two, in a great thistle, to keep back the wandering hair that WOULD always get into her face. ''Wake up, Alice dear!'' said her sister; ''Why, what a dear quiet thing,'' Alice went on, spreading out the answer to shillings and pence. ''Take off your hat,'' the King had said that day. ''No, no!'' said the Gryphon. ''They can''t have anything to say, she simply bowed, and took the watch and looked at it again: but he could.', NULL, '', 900, 17600, 0, 0, 0, 'Billy', 0, 1441407837, 0, 0, 0, 0, 0, '', 1, 1441407837, 1441407600, 0),
(344246, 2, '09dc73287ff642cfa2c97977dc42bc64', 6, 'http://www.Cole.com/sit-maiores-et-quam-vitae-ut-fugiat', '', 1, 'IS the use of a.', 'And when I learn music.'' ''Ah! that accounts for it,'' said.', 'Gryphon answered, very nearly carried it out loud. ''Thinking again?'' the Duchess by this time.) ''You''re nothing but a pack of cards, after all. I needn''t be so stingy about it, you know--'' ''But, it goes on "THEY ALL RETURNED FROM HIM TO YOU,"'' said Alice. ''Call it what you mean,'' the March Hare, ''that "I breathe when I breathe"!'' ''It IS the same side of WHAT? The other guests had taken his watch out of it, and talking over its head. ''Very uncomfortable for the first to speak. ''What size do you like to go and get.', NULL, '', 600, 16900, 0, 0, 0, 'Enrico', 0, 1441406107, 0, 0, 0, 0, 0, '', 1, 1441406107, 1441404000, 0),
(19169, 1, '116c443b5709e870248c93358f9a328e', 12, 'http://www.Gleason.com/et-vero-optio-exercitationem-aliquid-optio-consectetur', '', 4, 'Let this be a lesson to.', 'Sir, With no jury or judge, would be very likely to eat.', 'I wonder who will put on your head-- Do you think I can find them.'' As she said this, she was quite out of sight before the end of every line: ''Speak roughly to your little boy, And beat him when he sneezes; For he can EVEN finish, if he had never heard of such a subject! Our family always HATED cats: nasty, low, vulgar things! Don''t let him know she liked them best, For this must ever be A secret, kept from all the creatures wouldn''t be so kind,'' Alice replied, so eagerly that the way I want to get very tired of being upset, and their curls got entangled together. Alice was not a regular rule: you invented it just grazed his nose, you know?'' ''It''s the thing Mock Turtle would be only.', NULL, '', 700, 16800, 0, 0, 0, 'Unique', 0, 1441407961, 0, 0, 0, 0, 0, '', 1, 1441407961, 1441407600, 0),
(192679, 1, '06a33747b5c95799034630e578e53dc5', 10, 'http://www.Pouros.com/qui-id-molestias-non-dolores-non', '', 5, 'Rabbit just under the.', 'KNOW IT TO BE TRUE--" that''s the jury-box,'' thought Alice,.', 'Mock Turtle, who looked at Two. Two began in a hoarse, feeble voice: ''I heard every word you fellows were saying.'' ''Tell us a story.'' ''I''m afraid I can''t tell you how it was too dark to see what I should say "With what porpoise?"'' ''Don''t you mean by that?'' said the King; and as it was indeed: she was now more than Alice could not make out exactly what they WILL do next! As for pulling me out of court! Suppress him! Pinch him! Off with his head!"'' ''How dreadfully savage!'' exclaimed Alice. ''That''s the first witness,'' said the Duchess. ''Everything''s got a moral, if only you can find it.'' And she squeezed herself up and ran the faster, while more and more faintly came, carried on the end of every line:.', NULL, '', 800, 15900, 0, 0, 0, 'Gene', 0, 1441414720, 0, 0, 0, 0, 0, '', 1, 1441414720, 1441411200, 0),
(251878, 4, '3eafacc53f86c8492c309ca2772fbfe9', 5, 'http://www.Schinner.info/tempora-et-est-qui-nulla', '', 2, 'NOT!'' cried the Mouse,.', 'Twinkle, twinkle--"'' Here the Queen till she heard the.', 'Alice and all of them even when they hit her; and the sounds will take care of the gloves, and she dropped it hastily, just in time to begin at HIS time of life. The King''s argument was, that she had forgotten the Duchess to play croquet with the Dormouse. ''Write that down,'' the King added in an undertone to the fifth bend, I think?'' ''I had NOT!'' cried the Mouse, sharply and very neatly and simply arranged; the only difficulty was, that if something wasn''t done about it in less than a pig, my dear,'' said Alice, a little wider. ''Come, it''s pleased so far,'' said the Gryphon. ''Do you play croquet with the glass table and the King hastily said, and went by without noticing her. Then followed the Knave ''Turn them over!'' The Knave of.', NULL, '', 500, 15900, 0, 0, 0, 'Demarcus', 0, 1441414681, 0, 0, 0, 0, 0, '', 1, 1441414681, 1441411200, 0);

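【Question comments】:

I think you got lucky that the data in your first query could be fully resolved by the uniqueness of its views_point field. That is not the case for the second, because a range scan is needed. Add a key on publishedon_hourly, and consider getting those rows from a subquery on published and cat_id.

Could you please add some sample data as INSERT INTO ... statements? Preferably some records "inside" '2015-09-5 00:00:00' AND '2015-09-5 12:00:00' and some "outside". Oh, and by the way: which version of MySQL are you using? When a query is supposed to use more than one index "at the same time", you should take a close look at dev.mysql.com/doc/refman/5.5/en/index-merge-optimization.html and pick the "right" version on the left-hand side.

@bishop Can you give me the query? I don't understand what you mean by adding it in a subquery.

Something like: SELECT * FROM listings L1 JOIN listings L2 ON L1.id=L2.id AND L2.published=1 AND L2.cat_id IN (1,2,3,4,5) WHERE L1.publishedon_hourly BETWEEN UNIX_TIMESTAMP('2015-09-05 00:00:00') AND UNIX_TIMESTAMP('2015-09-05 12:00:00'); Here we reduce the set of rows to consider by rewriting part of the WHERE as a self-join. Instead of a self-join, you could also use an inline view. Totally untested.

What I mean is that MySQL looked at your month-span query and noticed that all the WHERE conditions could be satisfied using rows obtainable from the views_point index. In your day-span query, that is not the case. Why depends entirely on your data, which is why query optimization without knowing the underlying data is largely a guessing game. See @VolkerK's comment.

【Answer 1】: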

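In your first query, the ORDER BY is resolved using the views_point index, because that is the index MySQL chose to read the rows, so the rows already arrive in sorted order.

In the second query, MySQL resolves the WHERE part using a different index, listing_pcs. That index cannot be used to satisfy the ORDER BY, so MySQL uses a filesort instead, which is the best option when an index cannot be used.

MySQL only uses an index for sorting if it is the same index it uses for the WHERE conditions. This is what the manual means by:

In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that match the WHERE clause. These cases include the following:

The key used to fetch the rows is not the same as the one used in the ORDER BY:

SELECT * FROM t1 WHERE key2=constant ORDER BY key1;

So what can you do:

    Try increasing your sort_buffer_size config option to make filesorts as efficient as possible. A result set that is too large for the sort buffer forces MySQL to break the sort into chunks, which is slower.

    Force MySQL to choose a different index. It is worth noting that different MySQL versions choose the default index differently. Version 5.1, for example, is pretty bad, because the query optimizer was heavily rewritten for that release and still needed a lot of refinement. Version 5.6 is pretty good.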

    SELECT *
    FROM listings
    FORCE INDEX (views_point)
    WHERE (`publishedon_hourly` BETWEEN
           UNIX_TIMESTAMP( '2015-09-5 00:00:00' )
           AND UNIX_TIMESTAMP( '2015-09-5 12:00:00' ))
      AND (published =1)
      AND cat_id IN ( 1, 2, 3, 4, 5 )
    ORDER BY `views_point` DESC
    LIMIT 10
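
As a rough way to check whether the first suggestion is worth pursuing, the sketch below looks at the Sort_merge_passes status counter before and after raising sort_buffer_size for the current session only. The 4 MB value is purely illustrative (an assumption, not a recommendation from this answer), and the right size depends on your server's memory.

```sql
-- Number of merge passes filesort has needed so far in this session.
SHOW SESSION STATUS LIKE 'Sort_merge_passes';

-- Raise the sort buffer for this session only.
-- 4 MB is an illustrative value, not a tuned recommendation.
SET SESSION sort_buffer_size = 4 * 1024 * 1024;

-- Re-run the slow query here, then compare the counter again.
SHOW SESSION STATUS LIKE 'Sort_merge_passes';
```

If the counter barely moves between runs, the sort is probably already fitting in memory and a larger buffer is unlikely to help.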
    

【Comments】:

【Answer 2】:

This looks like some kind of news database, so consider building some kind of monthly news archive.

Consider this solution; it is not the best, but it may help.

Add these columns to the listings table:

publishedmonth tinyint(2) UNSIGNED NOT NULL DEFAULT '0'
publishedyear tinyint(2) UNSIGNED NOT NULL DEFAULT '0'
publishedminute mediumint(6) UNSIGNED NOT NULL DEFAULT '0'

Add this index key to the listings table:

ADD KEY published_month (publishedmonth,publishedyear,publishedminute)

Set these values from your PHP code when inserting:

publishedmonth will hold date('n')
publishedyear will hold date('y')
publishedminute will hold date('jHi')

Load a large number of records, then test this query:

SELECT * FROM listings WHERE publishedmonth = 2 AND publishedyear = 17 ORDER BY publishedminute
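
To try this on a table that already has rows, the columns and key above would also need to be created and backfilled. The following is a minimal sketch, assuming the existing publishedon column holds a Unix timestamp and that the MySQL session time zone matches the time zone PHP's date() would use. The format string '%e%H%i' is intended to mirror PHP's 'jHi'.

```sql
-- Add the three columns and the composite key in one statement.
ALTER TABLE listings
  ADD COLUMN publishedmonth  tinyint(2)   UNSIGNED NOT NULL DEFAULT '0',
  ADD COLUMN publishedyear   tinyint(2)   UNSIGNED NOT NULL DEFAULT '0',
  ADD COLUMN publishedminute mediumint(6) UNSIGNED NOT NULL DEFAULT '0',
  ADD KEY published_month (publishedmonth, publishedyear, publishedminute);

-- Backfill existing rows from the Unix timestamp in publishedon.
-- Assumption: the session time zone matches the one PHP would use.
UPDATE listings
SET publishedmonth  = MONTH(FROM_UNIXTIME(publishedon)),
    publishedyear   = YEAR(FROM_UNIXTIME(publishedon)) % 100,
    publishedminute = CAST(DATE_FORMAT(FROM_UNIXTIME(publishedon), '%e%H%i') AS UNSIGNED);
```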

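【Comments】:

I predict that splitting the date will not help. Since you "accepted" the answer, please say what you found after testing.

【Answer 3】:

    The EXPLAIN says listings_pcs, but the SHOW CREATE TABLE does not list that index. Are we missing something?

    If you only need a few columns, do not use SELECT *. The TEXT columns in particular will prevent one form of performance speedup during the query.

    Subqueries usually slow down a query. However, in your case (fetching lots of MEDIUMTEXT, and using LIMIT), it may be efficient to first get the ids in a subquery, then fetch the bulky columns. ("Lazy evaluation"; see below.)

    A range value (publishedon_hourly) is best put last in an index, not first.

    It is usually best to start an index with the '=' columns (published).

    The optimizer sometimes mistakenly chooses to focus on the ORDER BY instead of the WHERE. (In your case, neither is very efficient.)

    INDEX(published, views_point) may avoid the sort while helping somewhat with the WHERE.

    Having a flag (published) that is always tested in queries adds complexity and inefficiency to the schema.

    BETWEEN is inclusive, so the second query actually scans 12 hours plus one second.

    Splitting a date into year+month+day usually does more harm than good.

    Do not set sort_buffer_size to more than 1% of RAM. Otherwise, you may run into other problems.

    FORCE INDEX may help today, but hurt tomorrow when the constants change. Caveat emptor.

    It is usually better to put "click_count" or "likes" or "upvotes" into a separate table. This separates the rapidly changing counters from the bulky, relatively static data, so there is less interference between the two.

    If you do the above, simply delete unpublished rows from the counter table, which simplifies several things.

    Most people vilify filesort, but usually something else is the real villain; in your case, the number and size of the rows.

    Please provide EXPLAIN FORMAT=JSON SELECT ...; it may contain some interesting clues.

    Your findings are strange enough to warrant filing a bug at bugs.mysql.com.

I would add these indexes, in the order given, and see what the optimizer picks: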

INDEX(published, views_point)  -- aiming at the ORDER BY, plus picking up '='
INDEX(published, cat_id, publishedon_hourly) -- possibly the best for WHERE
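
A minimal sketch of how those two indexes might be created and then checked is shown below. The index names idx_published_views and idx_published_cat_hourly are made up for illustration, and the short column list in the SELECT is just an example of avoiding SELECT *.

```sql
-- Index names are hypothetical; pick whatever fits your naming scheme.
ALTER TABLE listings
  ADD INDEX idx_published_views (published, views_point),
  ADD INDEX idx_published_cat_hourly (published, cat_id, publishedon_hourly);

-- See which index the optimizer picks for the one-day range.
EXPLAIN
SELECT id, title, views_point
FROM listings
WHERE publishedon_hourly BETWEEN UNIX_TIMESTAMP('2015-09-05 00:00:00')
                             AND UNIX_TIMESTAMP('2015-09-05 12:00:00')
  AND published = 1
  AND cat_id IN (1, 2, 3, 4, 5)
ORDER BY views_point DESC
LIMIT 10;
```

The key and Extra columns of the EXPLAIN output show which index was chosen and whether "Using filesort" is still present.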

Or, perhaps, use "lazy evaluation":
SELECT  L.*
    FROM  listings AS L
    JOIN (
        SELECT  id
            FROM  listings
            WHERE  `publishedon_hourly` BETWEEN UNIX_TIMESTAMP(...)
                                            AND UNIX_TIMESTAMP(...) 
              AND  published = 1
              AND  cat_id IN ( 1, 2, 3, 4, 5 )
            ORDER BY  `views_point` DESC
            LIMIT  10
         ) AS s  ON L.id = s.id
ORDER BY views_point DESC

-- with
INDEX(published, cat_id, publishedon_hourly, views_point, id)

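Notes:

    The subquery will be "Using index"; that is, the index is covering.

    There will be two filesorts. One is in the subquery, but it works from the index rather than the bulky text columns. The other sorts only 10 rows, bulky though they are.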
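
One way to check the "Using index" claim is to create the covering index and run EXPLAIN on the inner query alone. This is only a sketch: the index name idx_lazy_cover is hypothetical, and the one-day range from the question is filled in as an example.

```sql
-- Hypothetical name for the covering index suggested above.
ALTER TABLE listings
  ADD INDEX idx_lazy_cover (published, cat_id, publishedon_hourly, views_point, id);

-- EXPLAIN only the inner query; every column it touches is in the index.
EXPLAIN
SELECT id
FROM listings
WHERE publishedon_hourly BETWEEN UNIX_TIMESTAMP('2015-09-05 00:00:00')
                             AND UNIX_TIMESTAMP('2015-09-05 12:00:00')
  AND published = 1
  AND cat_id IN (1, 2, 3, 4, 5)
ORDER BY views_point DESC
LIMIT 10;
```

If the index is used as intended, the Extra column should mention "Using index", typically alongside "Using filesort" for the sort on views_point.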

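【Comments】:

+10. It seems the listings_pc index was created with the ultrasecret elusivity=1 option.

【Answer 4】:

Very strange behavior. Without seeing the relevant data, it is hard to understand why views_point would not be used for the sort. You could try giving MySQL an index hint to use views_point for this sort.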

SELECT * FROM listings
  USE INDEX FOR ORDER BY (`views_point`)
WHERE
  (
    `publishedon_hourly` BETWEEN UNIX_TIMESTAMP( '2015-09-5 00:00:00' )
    AND UNIX_TIMESTAMP( '2015-09-5 12:00:00' )
  )
  AND (published =1)
  AND cat_id IN ( 1, 2, 3, 4, 5 )
ORDER BY `views_point` DESC LIMIT 10

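【Comments】:

I don't think anything changed. Can you help me find another solution?

@UnixMan Can you post the EXPLAIN for the query I suggested?

Here is the EXPLAIN (the table is prefixed, but otherwise the same): i.stack.imgur.com/PGRrP.png

【Answer 5】:

The query optimizer is not perfect. This is one of the cases where it makes a bad decision. It happens in some borderline cases. If the data in your table changed even slightly, it might use the other index and run the query faster.

Rather than wait for that, you can change your listing_pcs index. It includes source_id, which you are not using. So why not replace it with views_point?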

KEY `listings` (`publishedon_hourly`,`published`,`views_point`,`cat_id`)

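Also, using tinyint(1) does little for speed or space savings. It still takes a full byte. Likewise, mediumint(5) takes 3 bytes. Combine deleted, type, cat_id and published into one column, and put the index on that column.

【Comments】:

Combining those columns may add complexity that is not worth it.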
