[vtk-developers] cdash/gerrit emails about failing tests...

Thu Jan 31 19:30:04 EST 2013

I'll be adding a coverage submission from karego-at, the machine that
does valgrind. If you've got some spare hardware sitting around,
please consider setting it up to run coverage and memory checking
for VTK too.

In general I don't think that the problem is a lack of training or
discipline or insufficient tools (although I don't disagree that we
could improve all of the above). I think that the second law of
software thermodynamics is at play. The larger the code base, the more
tedious effort is required to keep it tidy. The tools we have have
gotten significantly better, and I don't think gerrit (or its
wonderful UI) is a serious problem. Our dev tools just haven't
improved faster than the number of developers and the commit volume
they've let us reach.

It is also not true that once the dashboard is clean, it will stay
that way if only the developers are more disciplined. What happens
when code developed years ago has some problem on a new OS that is
added to the dashboard? The dashboards will always need manual
intervention from time to time. The problem is that no one has that
time, all the time.

Every developer on the list is primarily concerned with the specific
tasks they have to get done. Occasionally one of us gets pulled back
to the task of cleaning up (usually before VTK and PV releases, or
when things get really bad such as after a huge major-release type
refactoring of master), but none of us always keeps at least one eye
on the dashboard.

So yeah, let's discuss how to improve our processes, and all chip
in like the good team we are to clean up the dashboard. But no, I don't
think that anyone should be ashamed of the state of things.

David E DeMarle
Kitware, Inc.
R&D Engineer
21 Corporate Drive
Clifton Park, NY 12065-8662
Phone: 518-881-4909

On Wed, Jan 30, 2013 at 6:43 PM, David Gobbi <david.gobbi at gmail.com> wrote:
> Hi Bill,
>
> I'm just saying that the situation isn't quite as bad as the dashboard
> makes it look.  And, to be honest, I remember plenty of times in years
> past where the dashboard was much worse for extended periods of time.
>
> As far as the dashboard is concerned, the number of things that have
> to be fixed is small and quite manageable:
>
> 1) The coverage machine needs to be more stable; you can't be doing
> coverage on a bleeding-edge system.
>
> 2) The 25 tests that fail on all machines must be fixed.  This is a pretty
> small number.  Heck, in the past I've fixed that number of failing tests
> by myself in a week during my spare time.  Unfortunately I don't have
> as much spare time as I used to.  But I can take 5 of the 25.
>
> 3) Valgrind tests.  Most developers ignore this part of the dashboard
> completely.  This is not good.
>
> There would be a #4, compiler warnings, but the dashboard is
> remarkably clean in this regard, so warnings are a low priority at the
> moment.
>
> Now the overall issue of developer participation in the code quality
> process... that's a much bigger issue than the dashboard alone.
> Is it a mentorship issue, i.e. are new developers not being taught
> the "ways of the source"?  Are there too many developers, i.e. too
> many cats to herd?  Does gerrit make it too time-consuming to
> submit follow-up fixes when people break the dashboard?  (I myself
> have found that some developers do not respond when I ask for
> a review... and I feel guilty about going to the "reliable" reviewers
> over and over again).
>
>  - David
>
> On Wed, Jan 30, 2013 at 3:10 PM, Bill Lorensen <bill.lorensen at gmail.com> wrote:
>> David,
>>
>> In years past, I gave many talks bragging about the high quality of
>> our toolkits. I would often give a live demo and point to the nightly
>> dashboard. We and others used software quality as a selling point of
>> our commitment to open source processes. I know for certain that we
>> won at least two large government grants because of our commitment to
>> quality.
>>
>> We also gave many GE internal talks, touting our process, and I
>> believe that prompted many GE businesses to improve their software processes.
>>
>> I suspect that you, as our first outside developer, also promoted the
>> quality of VTK.
>>
>> Bill
>>
>> On Wed, Jan 30, 2013 at 4:57 PM, Bill Lorensen <bill.lorensen at gmail.com> wrote:
>>> I'm saying that the machine that reports coverage and the machine that
>>> runs valgrind test less than 1/2 the code.
>>>
>>> I agree that there are so many failing tests that we have no idea
>>> about the quality of vtk.
>>>
>>> In the past, we bragged about our process. We cannot do that anymore.
>>>
>>> Bill
>>>
>>> On Wed, Jan 30, 2013 at 4:51 PM, David Gobbi <david.gobbi at gmail.com> wrote:
>>>> On Wed, Jan 30, 2013 at 1:25 PM, Bill Lorensen <bill.lorensen at gmail.com> wrote:
>>>>
>>>>> Coverage is down to 44%. This means we test less than 1/2 of vtk's code.
>>>>> Why? Because over 900 tests are failing on the coverage machine:
>>>>> http://open.cdash.org/viewTest.php?onlyfailed&buildid=2789553
>>>>
>>>> Your statement that we test less than 1/2 of the code is false.  There are
>>>> some dashboard machines (e.g. hythloth) that cover much more.  I know that
>>>> I'm being picky with semantics here, but the truth is, we have so many
>>>> failing tests that the dashboard isn't even able to produce accurate code
>>>> quality metrics.
>>>>
>>>>  - David
>>>
>>>
>>>
>>> --
>>> Unpaid intern in BillsBasement at noware dot com