[Cdash] CDash causing high load and errors when deleting builds

Martin Apel martin.apel at simpack.de
Thu Oct 13 10:27:02 UTC 2011


Hi Julien,

we are using MyISAM tables inside MySQL.

Martin

On 13/10/11 12:22, Julien Jomier wrote:
> Hi Martin,
>
> We actually noticed the same things on a couple of CDash instances. The
> main issue is that deleting a build requires a lot of table locks
> because a lot of rows have to be deleted from many tables. The current
> CDash SVN is using a multi-delete mechanism which takes more memory but
> should increase the speed of the delete.
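
One possible reading of the "multi-delete" idea is to remove the rows for many builds with a single statement per table, instead of issuing one statement per build and taking the table lock each time. The following is only a minimal sketch of that idea, not CDash's actual code: it uses Python's sqlite3 module as a stand-in for MySQL, and the table and column names (build, build2test, buildid) as well as the helper delete_builds_batched are assumptions made for illustration.

```python
import sqlite3


def delete_builds_batched(conn, build_ids, child_tables):
    """Delete several builds with one DELETE per table instead of one per build."""
    if not build_ids:
        return
    placeholders = ",".join("?" for _ in build_ids)
    with conn:  # run all deletes in a single transaction
        for table in child_tables:
            # Table names come from a fixed, trusted list (hypothetical schema).
            conn.execute(
                f"DELETE FROM {table} WHERE buildid IN ({placeholders})",
                build_ids,
            )
        conn.execute(f"DELETE FROM build WHERE id IN ({placeholders})", build_ids)


# Tiny in-memory schema, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE build (id INTEGER PRIMARY KEY);
    CREATE TABLE build2test (buildid INTEGER, testid INTEGER);
    INSERT INTO build VALUES (1), (2), (3);
    INSERT INTO build2test VALUES (1, 10), (1, 11), (2, 12), (3, 13);
""")

delete_builds_batched(conn, [1, 2], ["build2test"])
remaining_builds = [row[0] for row in conn.execute("SELECT id FROM build")]
remaining_links = conn.execute("SELECT COUNT(*) FROM build2test").fetchone()[0]
print(remaining_builds, remaining_links)
```

Batching trades a longer id list held in memory for fewer statements, which matches the memory-versus-speed trade-off described above. MySQL additionally offers a multi-table DELETE syntax, which this sketch does not use.
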
>
> Do you know if you are using MyISAM or InnoDB (or others) as your SQL
> storage engine?
>
> Julien
>
> On 12/10/2011 10:44, Martin Apel wrote:
>> Hi all,
>>
>> I have been chasing a strange phenomenon with CDash for some weeks and
>> have gathered a few facts so far:
>>
>> The visible effect was that CDash regularly (but not reproducibly)
>> failed to accept submissions from CMake. In most cases the Build.xml
>> file was accepted, but the subsequently submitted Configure.xml and
>> Test.xml were not. The clients had messages in their log files such as
>> "Operation too slow. Less than 1 bytes/sec transfered the last 120
>> seconds".
>>
>> Searching for the cause I found out that the server machine (a Linux box
>> with Debian Lenny running CDash 1.8.2 on Apache 2.2.9) had a CPU load of
>> 100 % for about three hours while deleting old builds during the night.
>> The process causing the load is mysqld, which performs nearly no I/O,
>> but consumes a full CPU. During this time, it seems that network packets
>> were dropped.
>>
>> After some more investigation I found the CDash configuration option
>> CDASH_ASYNCHRONOUS_SUBMISSION, which was previously set to false.
>> I set this to true a few days ago, and since then no submissions have
>> been lost. However, the high load caused by deleting those builds
>> remains, and last night the CDash log showed the following messages:
>>
>> [2011-10-12T03:50:38][INFO][pid=24652](removeFirstBuilds): about to query for builds to remove
>> [2011-10-12T03:50:38][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23097 projectid: 1
>> [2011-10-12T03:58:05][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23099 projectid: 1
>> [2011-10-12T04:07:33][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23092 projectid: 1
>> [2011-10-12T04:17:00][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23094 projectid: 1
>> [2011-10-12T04:24:28][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23103 projectid: 1
>> [2011-10-12T04:33:52][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23098 projectid: 1
>> [2011-10-12T04:34:00][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23119 projectid: 1
>> [2011-10-12T04:43:27][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23120 projectid: 1
>> [2011-10-12T04:52:58][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23121 projectid: 1
>> [2011-10-12T05:02:03][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23123 projectid: 1
>> [2011-10-12T05:11:40][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23122 projectid: 1
>> [2011-10-12T05:19:13][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23124 projectid: 1
>> [2011-10-12T05:26:39][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23131 projectid: 1
>> [2011-10-12T05:40:03][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23127 projectid: 1
>> [2011-10-12T05:55:18][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23126 projectid: 1
>> [2011-10-12T06:00:03][ERROR][pid=23419](taking lock: projectid=1, other processor pid='22563' apparently stalled, lastupdated='2011-10-12 03:51:34'): AcquireProcessingLock
>> [2011-10-12T06:00:03][ERROR][pid=23419](1 submission records assumed stalled, reset to status=0): ResetApparentlyStalledSubmissions
>> [2011-10-12T06:00:41][INFO][pid=22563](pid '22563' does not own lock anymore: abandoning loop...): ProcessSubmissions
>> [2011-10-12T06:00:41][ERROR][pid=22563](lock not released, unexpected pid mismatch: pid='23419' mypid='22563' - attempt to unlock a lock we don't own...): ReleaseProcessingLock
>> [2011-10-12T06:02:46][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23128 projectid: 1
>> [2011-10-12T06:12:08][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23125 projectid: 1
>> [2011-10-12T06:12:15][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24891 projectid: 1
>> [2011-10-12T06:13:22][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24893 projectid: 1
>> [2011-10-12T06:14:08][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24892 projectid: 1
>> [2011-10-12T06:14:53][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24894 projectid: 1
>> [2011-10-12T06:15:40][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24896 projectid: 1
>> [2011-10-12T06:16:26][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24897 projectid: 1
>> [2011-10-12T06:17:34][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24899 projectid: 1
>> [2011-10-12T06:18:28][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24898 projectid: 1
>> [2011-10-12T06:19:17][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24901 projectid: 1
>> [2011-10-12T06:20:03][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24900 projectid: 1
>> [2011-10-12T06:21:11][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24902 projectid: 1
>> [2011-10-12T06:21:58][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24908 projectid: 1
>> [2011-10-12T06:22:49][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24903 projectid: 1
>> [2011-10-12T06:23:34][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24904 projectid: 1
>> [2011-10-12T06:24:41][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24906 projectid: 1
>> [2011-10-12T06:25:31][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24905 projectid: 1
>> [2011-10-12T06:26:16][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24907 projectid: 1
>> [2011-10-12T06:27:24][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24909 projectid: 1
>> [2011-10-12T06:28:14][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24912 projectid: 1
>> [2011-10-12T06:29:10][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24911 projectid: 1
>> [2011-10-12T06:30:17][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24910 projectid: 1
>> [2011-10-12T06:31:02][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24914 projectid: 1
>> [2011-10-12T06:31:53][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24913 projectid: 1
>> [2011-10-12T06:33:02][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24915 projectid: 1
>> [2011-10-12T06:33:49][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24917 projectid: 1
>> [2011-10-12T06:34:38][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24919 projectid: 1
>> [2011-10-12T06:35:29][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24918 projectid: 1
>> [2011-10-12T06:36:39][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24916 projectid: 1
>> [2011-10-12T06:37:25][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24920 projectid: 1
>> [2011-10-12T06:38:19][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24921 projectid: 1
>> [2011-10-12T06:39:04][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24922 projectid: 1
>> [2011-10-12T06:40:13][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24923 projectid: 1
>> [2011-10-12T06:40:59][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24924 projectid: 1
>> [2011-10-12T06:42:01][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24927 projectid: 1
>> [2011-10-12T06:43:01][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24926 projectid: 1
>> [2011-10-12T06:44:09][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24925 projectid: 1
>> [2011-10-12T06:44:55][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24930 projectid: 1
>> [2011-10-12T06:46:28][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24928 projectid: 1
>> [2011-10-12T06:47:14][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24929 projectid: 1
>> [2011-10-12T06:48:46][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24931 projectid: 1
>> [2011-10-12T06:49:57][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24934 projectid: 1
>> [2011-10-12T06:51:26][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24932 projectid: 1
>> [2011-10-12T06:53:23][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24936 projectid: 1
>> [2011-10-12T06:55:20][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24933 projectid: 1
>> [2011-10-12T06:56:54][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24937 projectid: 1
>> [2011-10-12T06:58:47][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24949 projectid: 1
>> [2011-10-12T06:59:55][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24951 projectid: 1
>> [2011-10-12T07:00:43][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24950 projectid: 1
>> [2011-10-12T07:01:38][INFO][pid=24652](removeBuildsGroupwise): removing old buildid: 24952 projectid: 1
>>
>> The time span of these messages corresponds exactly to the period of
>> continuously high load.
>> Does anybody have an idea what is going on there and why it takes ages
>> to delete those old builds?
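
To put numbers on "takes ages", the gap between consecutive "removing old buildid" messages can be computed from the log. The following is a rough sketch only: it assumes that the time until the next removal message approximates the time spent deleting the previous build, and the function name removal_durations is made up for this example. The sample lines are copied from the log above.

```python
import re
from datetime import datetime

# Matches the "removing old buildid" lines of the CDash log shown above.
LINE_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\]"
    r"\[INFO\]\[pid=\d+\]"
    r"\((?P<func>removeFirstBuilds|removeBuildsGroupwise)\): "
    r"removing old buildid: (?P<buildid>\d+)"
)


def removal_durations(lines):
    """Return (buildid, seconds until the next removal message) pairs."""
    events = []
    for line in lines:
        match = LINE_RE.search(line)  # search() tolerates a leading ">> " quote prefix
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S")
            events.append((int(match.group("buildid")), ts))
    # Pair each event with its successor; the last event has no successor.
    return [
        (buildid, (next_ts - ts).total_seconds())
        for (buildid, ts), (_, next_ts) in zip(events, events[1:])
    ]


sample = [
    "[2011-10-12T03:50:38][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23097 projectid: 1",
    "[2011-10-12T03:58:05][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23099 projectid: 1",
    "[2011-10-12T04:07:33][INFO][pid=24652](removeFirstBuilds): removing old buildid: 23092 projectid: 1",
]
durations = removal_durations(sample)
for buildid, seconds in durations:
    print(f"build {buildid}: {seconds:.0f} s")
```

For the three sample lines this reports 447 s and 568 s, i.e. roughly seven to ten minutes per build in the removeFirstBuilds phase.
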
>>
>> Martin
>>


-- 

Martin Apel                                    
Software Architect                              
Phone:   + 49 8105 77266-53
E-Mail:  martin.apel at simpack.de

SIMPACK AG
Friedrichshafener Strasse 1, 82205 Gilching, Germany
info at simpack.de, www.simpack.com
Phone: + 49 8105 77266-0
Fax:   + 49 8105 77266-11


Executive Board: Dr. Alexander Eichberger, Dr. Lutz Mauer
Chair of Supervisory Board: Silvia Förster (CPA)
Commercial Register München HRB 181 229


