How can I optimize a LIMIT query to access data from a huge table faster?


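I am trying to fetch data from a table that is over 9 GB in size and holds millions of records. I use that data to populate a DataTable. I fetch the records in chunks, 10 per page, via Ajax and an SQL LIMIT query.

[Image: pagination]

As the image above shows, there are 223,740 pages. The first page loads quickly, but when I jump straight to the last page, or to any other page at a high offset, the query takes forever to load the data.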

/**
 * Returns one page of 10 evaluations with the assigned coach's initials.
 * $start is the row offset of the first record on the page.
 */
public static function getAllEvaluationsWithNameForDataTable($start){
    $queryBuilder = new Builder();

    return $queryBuilder
        ->from(array('e' => static::class))
        ->leftJoin('CxFrameworkModelsCommonUserCxUser', 'e.cx_hc_user_id = u.id', 'u')
        ->columns('e.id, e.first_name, u.initials as assigned_coach, e.gender, e.email, e.phone, e.age, e.version, e.evaluation_status, e.ip_address, e.date_created, e.date_updated')
        ->orderBy('e.id asc')
        ->limit(10, $start) // limit(row count, offset)
        ->getQuery()
        ->execute()
        ->toArray();
}

PHP function / controller:

public function getEvaluationsAction() {
        // Enable Json response
        $this->setJsonResponse();
        // This action can be called only via ajax
        $this->requireAjax();

        // Forward to access denied if current user is not allowed to view evaluation details
        if (!$this->CxAuth->currentUserIsAllowedTo('VIEW', CxEbEvaluation::getClassResourceName()))
            return $this->forwardToAccessDeniedError();

        if(isset($_GET['start'])){
            $start = $this->request->get('start');
        }else{
            $start = 10;
        }

        $recordsTotal = count(CxEbEvaluation::getAllForDataTable(array('id')));

        //Get Evaluations from DB
        $evaluation_quizzes = CxEbEvaluation::getAllEvaluationsWithNameForDataTable(intval($start));

        //for getting base URL
        $url = new Url();

        $data = array();

        foreach ($evaluation_quizzes as $key => $quiz) {
            $data[ $key ][ 'id' ] = $quiz[ 'id' ];
            $data[ $key ][ 'first_name' ] = $quiz[ 'first_name' ];
            if($quiz[ 'assigned_coach' ]){
                $data[ $key ][ 'assigned_coach' ] = $quiz['assigned_coach'];
            }else{
                $data[ $key ][ 'assigned_coach' ] = "Not assigned";
            }

            $data[ $key ][ 'gender' ] = $quiz[ 'gender' ];
            $data[ $key ][ 'email' ] = $quiz[ 'email' ];
            $data[ $key ][ 'phone' ] = $quiz[ 'phone' ];
            $data[ $key ][ 'age' ] = $quiz[ 'age' ];
            $data[ $key ][ 'version' ] = $quiz[ 'version' ];
            $data[ $key ][ 'quiz' ] =  $url->get('/admin/get-evaluation-quiz-by-id');
            $data[ $key ][ 'manage-notes-messages-and-calls' ] =  $url->get('/admin/manage-notes-messages-and-calls');
            $data[ $key ][ 'date_created' ] = date("m/d/Y H:i:s", $quiz[ 'date_created' ]);
            $data[ $key ][ 'evaluation_status' ] = $quiz[ 'evaluation_status' ];
        }
        // Return data array
        return array(
            "recordsTotal"    => $recordsTotal,
            "recordsFiltered" => $recordsTotal ,
            "data"            => $data //How To Retrieve This Data
        );
        // Return data
    }

JavaScript:

cx.common.data.cxAdminDataTables.EbEvaluation = $CxRecordsTable.cxAdminDataTable({
        ajaxUrl: '<?php echo $this->CxHelper->Route('eb-admin-get-evaluations')?>' + eqQuizIdQueryString,
        serverSide: true,
        processing: true,
        recordsFiltered :true,
        columns: [
            cx.common.admin.tableEditColumn('id',{ delete: true }),
            { data: 'first_name' },
            { data: 'assigned_coach' },
            { data: 'gender' },
            { data: 'email' },
            { data: 'phone' },
            { data: 'age' },
            cx.common.admin.tableLinkColumn('quiz', quizLinkOptions),
            cx.common.admin.tableEditColumn('id', healthCoachLinkOptions),
            cx.common.admin.tableLinkColumn('manage-notes-messages-and-calls', manageNotesMessagesAndCalls),
            { data: 'date_created' },
            cx.common.admin.tableSwitchableColumn('evaluation_status', {
                editable: true,
                createdCell: function (td, cellData, rowData, row, col){
                    $(td).data('evaluation-status-id', rowData.id);
                },
                onText: 'Complete',
                offText: 'In progress'
            })
        ],
        toolbarOptions:{
            enabled: false
        },          success: function (data) {
                            cx.common.data.cxAdminDataTables.EbEvaluation.cxAdminDataTable("reloadAjax");
                        }
                    });
                }
                else {
                    $row.removeClass('alert');
                }
            });
        }
    });

I hope the question is clear. If anything else is needed, let me know and I will provide it.

(From the comments)

SELECT  e.id AS id, e.first_name AS first_name,
        u.initials AS assigned_coach,
        e.gender AS gender, e.email AS email, e.phone AS phone,
        e.age AS age, e.version AS version,
        e.evaluation_status AS evaluation_status,
        e.ip_address AS ip_address, e.date_created AS date_created,
        e.date_updated AS date_updated
    FROM  evaluation_client AS e
    LEFT JOIN  cx_user AS u  ON e.cx_hc_user_id = u.id
    ORDER BY  e.id ASC
    LIMIT  :APL0 OFFSET :APL1
Answer

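The question and answers at Why does MYSQL higher LIMIT offset slow the query down?, linked by Masivuye Cokile, together with the link given there, https://explainextended.com/2009/10/23/mysql-order-by-limit-performance-late-row-lookups/, give an excellent rundown of why large-offset queries are slow. Basically, for LIMIT 150000, 10 MySQL still scans through all 150,000 skipped rows, even though it discards them afterwards. To speed things up, you can:

  • Use sequential pagination, i.e. "show 10 entries after ID #N". It is very fast and a good option, but it gives up real page numbers; your users are left with "next / prev" links and/or an approximate page number that you can calculate with a count query.
  • Or create an index on id and then force MySQL to do an index-only search.

For the second approach, you have to rewrite the query from the first form below into the second: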

SELECT ... 
  FROM table t 
WHERE ...
ORDER by t.id ASC
LIMIT 150000, 10

SELECT  ...
  FROM (
        SELECT  id
        FROM    table
        ORDER BY
                id ASC
        LIMIT 150000, 10
        ) o
JOIN table t
  ON t.id = o.id
WHERE ...
ORDER BY t.id ASC
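
The rewrite above can be tried out with a small sketch. It uses Python's built-in sqlite3 module only so that it runs anywhere; the table name evaluation, its columns, and the helper fetch_page are made up for illustration, and SQLite's performance says nothing about MySQL. The sketch only shows the shape of the two queries and that they return the same rows.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE evaluation (id INTEGER PRIMARY KEY, first_name TEXT)")
conn.executemany(
    "INSERT INTO evaluation (id, first_name) VALUES (?, ?)",
    [(i, f"name{i}") for i in range(1, 1001)],
)

# Plain form: full rows are paged with LIMIT/OFFSET.
PLAIN = """
    SELECT id, first_name
      FROM evaluation
     ORDER BY id ASC
     LIMIT ? OFFSET ?
"""

# Rewritten form: the subquery pages over the primary key only, and the
# outer join fetches full rows just for the ids that survive the LIMIT.
DEFERRED = """
    SELECT t.id, t.first_name
      FROM (SELECT id
              FROM evaluation
             ORDER BY id ASC
             LIMIT ? OFFSET ?) AS o
      JOIN evaluation AS t ON t.id = o.id
     ORDER BY t.id ASC
"""


def fetch_page(sql, page_size, offset):
    """Run one of the paging queries and return the rows as a list of tuples."""
    return conn.execute(sql, (page_size, offset)).fetchall()


plain_rows = fetch_page(PLAIN, 10, 900)
deferred_rows = fetch_page(DEFERRED, 10, 900)
```

Both calls return the same ten rows; whether the second form is actually faster on a given MySQL table is something to confirm with EXPLAIN and timing on real data.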

Or, since you are not limited to a single query, you can first look up the id of the first item on the page:

SELECT id 
  FROM table 
 ORDER BY id ASC 
 LIMIT 150000, 1

Then use that id to retrieve the actual data:

SELECT ...
  FROM table
 WHERE id >= $id
   AND ...
 ORDER BY id ASC
 LIMIT 0, 10
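
A minimal sketch of this two-step lookup, again using Python's sqlite3 purely so it is runnable. The table evaluation, its columns, and the function names first_id_of_page and fetch_page_two_step are assumptions for illustration, not part of the original code base.

```python
import sqlite3


def first_id_of_page(conn, offset):
    """Step 1: scan only the id column to find where the page starts."""
    row = conn.execute(
        "SELECT id FROM evaluation ORDER BY id ASC LIMIT 1 OFFSET ?",
        (offset,),
    ).fetchone()
    return row[0] if row else None


def fetch_page_two_step(conn, offset, page_size=10):
    """Step 2: fetch full rows starting at that id, with no offset at all."""
    start_id = first_id_of_page(conn, offset)
    if start_id is None:
        return []  # the offset is past the end of the table
    return conn.execute(
        "SELECT id, first_name FROM evaluation "
        "WHERE id >= ? ORDER BY id ASC LIMIT ?",
        (start_id, page_size),
    ).fetchall()


# Hypothetical data: only even ids, so the id sequence has gaps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE evaluation (id INTEGER PRIMARY KEY, first_name TEXT)")
conn.executemany(
    "INSERT INTO evaluation (id, first_name) VALUES (?, ?)",
    [(i, f"name{i}") for i in range(2, 2001, 2)],
)

page = fetch_page_two_step(conn, 500, 10)
past_end = fetch_page_two_step(conn, 5000, 10)
```

Because the first query still uses an offset, gaps in the id sequence do not shift the page boundaries; the cost of the offset is simply paid on the narrow id index instead of on full rows.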
Another answer

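The pattern SELECT whatever FROM vast_table ORDER BY something LIMIT 10 OFFSET large_number is a notorious performance antipattern. Why? Because it has to examine a great many rows in order to return just a few.

If your id value is the primary key (or any indexed column), you can page on it instead: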

SELECT whatever FROM vast_table WHERE id BETWEEN large_value AND large_value+9 ORDER BY id;

Or you can try:

SELECT whatever FROM vast_table WHERE id >= large_value ORDER BY id LIMIT 10;

If there are gaps in your id values, this will not paginate perfectly. But it performs quite well.
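
The effect of gaps can be seen in a short sketch. It runs on Python's sqlite3 only for convenience; the functions fetch_after and iterate_pages and the sample data are hypothetical. It uses the strict "id > last seen id" variant of the second query, which is one common way to step from page to page, and it assumes positive ids.

```python
import sqlite3


def fetch_after(conn, last_id, page_size=10):
    """Return up to page_size rows whose id is greater than last_id."""
    return conn.execute(
        "SELECT id, whatever FROM vast_table "
        "WHERE id > ? ORDER BY id ASC LIMIT ?",
        (last_id, page_size),
    ).fetchall()


def iterate_pages(conn, page_size=10):
    """Yield pages in id order, remembering the last id seen on each page."""
    last_id = 0  # assumption: all ids are positive
    while True:
        rows = fetch_after(conn, last_id, page_size)
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]


# Hypothetical data: ids 1..100 with every multiple of 3 missing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vast_table (id INTEGER PRIMARY KEY, whatever TEXT)")
conn.executemany(
    "INSERT INTO vast_table (id, whatever) VALUES (?, ?)",
    [(i, f"row{i}") for i in range(1, 101) if i % 3 != 0],
)

# A BETWEEN range of width 10 returns fewer than 10 rows when ids are missing.
between_rows = conn.execute(
    "SELECT id FROM vast_table WHERE id BETWEEN ? AND ? ORDER BY id",
    (1, 10),
).fetchall()

# The LIMIT form always fills a page while rows remain.
pages = list(iterate_pages(conn, 10))
```

With this data the BETWEEN query returns a short page, while every page from iterate_pages except the last holds exactly 10 rows. The trade-off is that the client has to pass the last id it saw instead of a page number.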

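Another answer

The problem turned out to be related to the data type of the date column in my table. I was using an int data type for the date field; once I changed the column's type to datetime, search results came back within seconds.

I found the solution at http://dbscience.blogspot.com/2008/08/can-timestamp-be-slower-than-datetime.html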
