[CDash] Duplicate rows in build2test

Matějů Miroslav, Ing. Mateju.Miroslav at azd.cz
Thu May 19 13:40:09 UTC 2016


Hi,

I am sending a few updates regarding the build2test problem. I copied my DB to a virtual machine and ran a few experiments to reduce the size of the build2test table. I was experimenting with an older backup with a size of 9.3 GiB. (The DB on my production server is currently 17.5 GiB, of which the build2test table alone takes 11.7 GiB.)
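
For anyone who wants to compare figures on their own installation, here is a minimal sketch of one way to read the data and index size of a single table. The mail does not say how the sizes above were measured, so this query is an assumption, not the method actually used. It also assumes the CDash database is the currently selected one.

```sql
-- Data and index size of one table, in GiB, as reported by the server.
-- For InnoDB these are estimates; table_rows in particular is approximate.
SELECT table_name,
       engine,
       table_rows,
       ROUND(data_length  / POW(1024, 3), 2) AS data_gib,
       ROUND(index_length / POW(1024, 3), 2) AS index_gib
FROM information_schema.TABLES
WHERE table_schema = DATABASE()   -- assumes the CDash database is selected
  AND table_name = 'build2test';
```

Running it before and after each experiment gives comparable numbers for both the InnoDB and the MyISAM version of the table.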

I tried to remove the duplicates using the method from https://stackoverflow.com/a/3312066/711006. First of all, I had to convert the table to MyISAM because of an InnoDB bug (https://stackoverflow.com/a/8053812/711006). The MyISAM version is about half the size of the InnoDB version.
There were 1.9 million duplicate rows out of 67.6 million (about 2.8 %). Removing them reduced the data size only slightly, from 1.3 GiB to 1.2 GiB, while creating the new unique index made the index grow from 2.3 GiB to 3.2 GiB. I have not tried converting the table back to InnoDB yet, but I would expect a similar inflation of the index. So I no longer recommend creating the unique index, at least until I have checked it with InnoDB.
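
The procedure described above can be sketched roughly as follows. This is a sketch under assumptions, not the exact statements that were run: it assumes that duplicates are rows sharing the same (buildid, testid) pair, and uniq_build_test is a hypothetical index name. ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this only works on older servers, and it is best tried on a copy of the database first.

```sql
-- Assumption: two rows count as duplicates when they share (buildid, testid).
-- Preview a few duplicated pairs before changing anything.
SELECT buildid, testid, COUNT(*) AS copies
FROM build2test
GROUP BY buildid, testid
HAVING COUNT(*) > 1
LIMIT 10;

-- Work around the InnoDB problem with ALTER IGNORE by switching engines first.
ALTER TABLE build2test ENGINE = MyISAM;

-- With IGNORE, only one row of each duplicate set is kept and the
-- conflicting rows are deleted. uniq_build_test is a hypothetical name.
ALTER IGNORE TABLE build2test
  ADD UNIQUE INDEX uniq_build_test (buildid, testid);
```

The unique index is what both removes the duplicates and prevents new ones, which is also why it cannot be had without the index growth reported above.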

I also found that the status column, of type varchar(10), contains just three distinct values. The command
    SELECT status FROM build2test PROCEDURE ANALYSE()\G
even recommended changing the type to
    ENUM('failed','notrun','passed') NOT NULL

I tried it, and with InnoDB the table size decreased from 9.3 GiB to 8.5 GiB, with both the data and the index getting smaller. The MyISAM version, however, grew slightly (from 4.4 GiB to 5 GiB).
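
A minimal sketch of the type change, assuming the column really holds only the three values listed. The exact ALTER statement used in the experiment is not shown in this mail, so treat this as one possible form. An ENUM with up to 255 members is stored in one byte per row. In strict SQL mode a value outside the list makes the statement fail, and in non-strict mode it is converted to an empty string, so it is worth checking the existing values first. Note also that PROCEDURE ANALYSE() is deprecated since MySQL 5.7.18 and removed in 8.0.

```sql
-- Confirm which values actually occur before narrowing the type.
SELECT status, COUNT(*) AS occurrences
FROM build2test
GROUP BY status;

-- MODIFY replaces the whole column definition, so restate any attribute
-- (such as a DEFAULT) that the existing column has and should keep.
ALTER TABLE build2test
  MODIFY status ENUM('failed','notrun','passed') NOT NULL;
```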

I am going to try more optimization methods if time allows.

Best regards,

Miroslav Matějů