[VTK ARB] Git-SVN

Berk Geveci berk.geveci at kitware.com
Thu Oct 8 09:23:22 EDT 2009


> As someone that appears to have lots of experience using all these tools,
> what is your personal preference?

It is hard to give a straightforward answer so I will give a long,
oblique answer instead :-) If you want an executive summary, jump to
the last paragraph.

I now do pretty much all of my development using git. I use the git
repository, maintained by Brad King, that tracks ParaView. We also push
it here: http://github.com/Kitware/ParaView/ so that it is publicly
accessible. I love git as a version control system. It is well designed
and very powerful. Having a local repository allows you to do many
things you cannot do with a centralized version control system:

* Having the luxury of version control even if you are not ready to
publish your work or don't have network access. This has saved me from
losing work many, many times. Plus I can go back to something I did 2
weeks ago and it is documented in the commit log. I can do diffs and
such to remember what I did before.

* It is very, very fast.

* I can do fancy things like "git grep" to search through a version in
the repository without even checking it out.

Any downsides? Not really. BUT we don't have to switch the main VTK
repository to git for me to make use of any of this. Tracking another
repository (especially an svn one) with git works really well.

So what is the difference between svn and git (or mercurial) on the
central repo? Well, there is one thing you cannot do when tracking
cvs/svn with git/mercurial: merge changes the DVCS way. A common
workflow for DVCS is as follows:

1. Branch the trunk to work on something (ideally ALL development
happens on branches). The branch is local to the user's repo.
2. Write code, commit, repeat many times.
3. Now I want changes from the trunk because I want to see if I broke
something in the latest version. Merge from trunk. I am still on the
branch, but its newest commit is now a merge commit that has both the
previous branch tip and the trunk tip as parents.
4. Repeat 2 and 3 multiple times.
5. Now I want to commit to the trunk. I check out the trunk and merge
my branch into it. I then push this merge commit (after testing, of
course). This automatically pushes upstream all of the commits that it
depends on.
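
To make the steps above concrete, here is a toy model of the commit
graph they produce. It is only a sketch and not how git stores anything:
the Commit class, the ancestors() helper and the commit names (t1, b1,
m1, ...) are all made up for illustration. The point it shows is that a
merge commit has two parents, and that pushing the final merge drags
along every commit it depends on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Hypothetical stand-in for a commit: a name plus its parent commits."""
    name: str
    parents: tuple = ()


def ancestors(commit):
    """Return the names of every commit reachable from `commit`, itself included."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current.name not in seen:
            seen.add(current.name)
            stack.extend(current.parents)
    return seen


# Step 1: branch from the trunk at t1.
t1 = Commit("t1")
# Step 2: local commits on the branch.
b1 = Commit("b1", (t1,))
b2 = Commit("b2", (b1,))
# Meanwhile, someone else commits t2 to the trunk.
t2 = Commit("t2", (t1,))
# Step 3: merge trunk into the branch; the merge commit has two parents.
m1 = Commit("m1", (b2, t2))
# Step 2 again: more work on the branch.
b3 = Commit("b3", (m1,))
# Step 5: merge the branch into the trunk.
m2 = Commit("m2", (t2, b3))

# Everything that pushing m2 would have to send upstream.
print(sorted(ancestors(m2)))
```

Because m2 reaches both t2 and b3, the whole branch history (b1, b2, m1,
b3) ends up in the central repo along with it.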

The final version of the central repo will have the whole graph of all
the commits I made. You cannot do this when the central repo is
cvs/svn because they use a tree to represent the history instead of a
DAG. So, you are stuck with linearizing your history by "rebasing"
instead of merging. Rebasing essentially means changing the history by
pulling all commits from trunk and putting them before my changes on
the branch - which moves the branch point to the end of trunk. It
works but it is not the DVCS way.
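
For contrast, here is an equally rough sketch of what rebasing does to
the same kind of history. Again, the rebase() function and the commit
names are hypothetical, and a real rebase also has to deal with
conflicts, which this ignores. It only shows the shape of the result:
one straight line in which every commit has a single parent, with the
branch commits replayed after the trunk tip.

```python
def rebase(branch_commits, trunk_commits):
    """Replay branch commits on top of the trunk tip and return the linear history.

    Each replayed commit gets a prime mark because it is a new commit:
    its parent changed, so it is no longer the commit that was originally made.
    """
    return trunk_commits + [name + "'" for name in branch_commits]


trunk = ["t1", "t2", "t3"]  # t2 and t3 landed after the branch point t1
branch = ["b1", "b2"]       # local commits originally made on top of t1

linear = rebase(branch, trunk)
print(" -> ".join(linear))
```

The merge commits from the DAG version are gone, and so is the record of
where the branch really started.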

Am I making sense? This is hard to explain.

The bottom line? Personally, I would love it if we were using git or
mercurial as our central repo. It took me a long time to get used to
DVCS (as much as I got used to it) but now that I am there, I love it.
BUT a DAG history is harder to grasp and work with than a tree history
so it would be a burden on some developers.

I think that the choice depends on the answer to this question: Are we
going to do a lot of distributed development in VTK?
* If every team working on core VTK will have write access to the VTK
repository, they can branch as necessary (especially with the lightweight
branching of svn) to do work that can impact stability. Then svn
is the best choice.
* If many teams working on core VTK will not have write access
(because we don't trust them or we don't have the resources to keep up
with them), DVCS is the right choice.
If we can focus on answering this question, we can easily make a
decision between svn and git/hg.

I apologize for the long message.

-berk


On Mon, Oct 5, 2009 at 3:46 PM, Claudio Silva <csilva at sci.utah.edu> wrote:
> Berk,
>
> Thanks for the links. At Utah, we moved to SVN a few years back, and
> although I have used git a bit, I wouldn't consider myself an expert.
>
> As someone that appears to have lots of experience using all these tools,
> what is your personal preference?
>
> Claudio.
>
>
>
> On Oct 5, 2009, at 12:24 PM, Berk Geveci wrote:
>
>> I don't think that we should think of this in terms of Git vs. SVN. We
>> should instead think of it in terms of centralized version control
>> (CVS, SVN etc.) vs. distributed version control. These two articles
>> are a good summary of the differences between the two:
>>
>> http://www.ericsink.com/entries/dvcs_dag_1.html
>> http://www.ericsink.com/entries/dvcs_dag_2.html
>>
>> If we decide that distributed version control is the right thing for
>> VTK, we can either pick a tool that is well supported on all platforms
>> we are interested in (probably Mercurial) or we can wait until one of
>> them becomes mature enough before switching.
>>
>> If we decide that centralized version control is the right thing, we
>> should probably switch to SVN as soon as we can. Tracking SVN with Git
>> is possible but it is more limited than using Git or Mercurial for
>> everything (see the articles above to understand why).
>>
>> -berk
>>
>> On Mon, Oct 5, 2009 at 3:33 AM, Paolo Quadrani <p.quadrani at cineca.it>
>> wrote:
>>>
>>> Dear all,
>>> attached you can find a document that makes a comparison between Git and
>>> SVN.
>>>
>>> For the new version of our MAF framework we decided to use SVN as the main
>>> repository for code sharing. Each developer who wants to can then keep a
>>> local Git repository to manage the history of their own local code changes
>>> and to keep recording changes even without a network connection.
>>>
>>> Just a comment on code contribution: if someone wants to contribute
>>> code, I think that you cannot accept it without documentation (in doxygen
>>> style, for example) and a related test class that proves that the code
>>> works.
>>>
>>> I can also propose that code submission be done through a web
>>> interface that presents a form to be filled in with the fundamental
>>> information that the code must have; you decide what is important for you
>>> and for validating the code. This means that you filter the code at the
>>> start, accept only valid code, and don't lose time reviewing tons of
>>> classes sent via mail.
>>> There should also be some pre-checks that coding conventions are
>>> respected. For MAF we have some Python scripts that check that classes
>>> are documented and that the documentation is written in doxygen style,
>>> scripts that check that the code respects the coding convention, and so
>>> on.
>>>
>>> Cheers
>>>
>>> Paolo Quadrani
>>> _________________________________
>>> CINECA
>>> System and Technology Department
>>>
>>> Via Magnanelli 6/3  40033
>>> Casalecchio di Reno
>>> Italy
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Arb mailing list
>>> Arb at vtk.org
>>> http://public.kitware.com/cgi-bin/mailman/listinfo/arb
>>>
>>>
>
>


