[CDash] slow label query

Matějů Miroslav, Ing. Mateju.Miroslav at azd.cz
Mon Feb 27 07:31:46 UTC 2017


Hi Paul,

As far as I know (but I am rather a C++ programmer like you than a DB expert), it is better to let MySQL do as much as possible and avoid the overhead of transfers and conversions between MySQL and PHP.

I think your issue is related to my problem with build2test table performance that I described in the mail thread Optimizing build2test table<http://public.kitware.com/pipermail/cdash/2016-August/001692.html>. Unfortunately, I haven’t been able to test and push the solution yet but it is quite easy to implement in an existing installation: Just add the testid column as the second column to the buildid index for the build2test table. I have done it using PhpMyAdmin on my production machine. It took about 2 hours (for more than 100 million rows) and the submissions (which were the main issue for me) begun to be processed about 3 times faster after that.

Hope this helps,

Ing. Miroslav Matějů
Programmer Analyst

AŽD Praha s.r.o.
Technology Division
Research and Development
Žirovnická 2/3146, 106 17  Prague
Czech Republic
Phone: +420 267 287 476
Web: www.azd.cz<http://www.azd.cz/>

From: CDash [mailto:cdash-bounces at public.kitware.com] On Behalf Of Wolfenbarger, Paul R
Sent: Thursday, February 23, 2017 5:35 PM
To: cdash at public.kitware.com
Subject: [CDash] slow label query

I am hoping for someone with more sql understanding than I have to help with an issue. In api/v1/index.php at line 1004 there is a label query that is causing a bottleneck in my installation

                $label_query_base =
                    "SELECT b2t.status, b2t.newstatus
                    FROM build2test AS b2t
                    INNER JOIN label2test AS l2t ON
                    (l2t.testid=b2t.testid AND l2t.buildid=b2t.buildid)
                    WHERE b2t.buildid = '$buildid' AND
                    l2t.labelid IN $label_ids";
                $label_filter_query = $label_query_base . $limit_sql;
                $labels_result = pdo_query($label_filter_query);

This is returning two non-indexed fields from build2test and joining on indexed fields and since my build2test and label2test tables are each about 43 million records, this takes 20 to 300 seconds each. Multipled by at least 40 builds in the loop and you can see my issue.

As a c++ programmer my first instinct was to split the query and make all the operations work on indexes. I did some queries at the command line and got sub-second returns from each query and the php would now look something like

                $testids = pdo_query(
                  “select testid from label2test
                   where label2test.buildid=$buildid
                   and label2test.labelid in $label_ids") ;
                $test_ids = implode("','", $testids) ;
                $label_filter_query =
                  "select status, newstatus
                   from build2test where buildid = $buildid
                   and testid in $test_ids" ;

However, from my reading it seems that multiple queries of this sort are frowned on in the sql world.  Before I set up a test instance to check this out do any of you have any insight into how to re-structure the query to get the kind of performance I need?

Thanks!

----------------------------------------------------------------------------------------
"If you understand what you're doing, you're not learning anything."
  -- A. L.

Paul R. Wolfenbarger, P.E., M.S.
Computational Simulation Infrastructure
DevOps Product Owner
Sandia National Laboratories
P.O. Box 5800 MS 0845
Albuquerque, NM  87185-0845
505-844-5458 phone
prwolfe at sandia.gov<mailto:prwolfe at sandia.gov>
----------------------------------------------------------------------------------------


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://public.kitware.com/pipermail/cdash/attachments/20170227/b248c97c/attachment-0001.htm>


More information about the CDash mailing list