Mercurial vs Git performance

I’ve just converted the Netbeans/main Mercurial repository to Git using fast-export:

git clone http://repo.or.cz/r/fast-export.git
mkdir netbeans-git
cd netbeans-git
git init
~/fast-export/hg-fast-export.sh -r ~/netbeans-hg

The conversion took more than 24 hours. Using Python’s cProfile, I found that all measurable time was spent in Mercurial’s patch extraction. I did not investigate further, but here are two hypotheses:

  • Mercurial is not designed to work on large source trees.
  • fast-export is not using Mercurial in the most efficient way.

Having such a huge repository in both Mercurial and Git is a good opportunity to measure how much slower Mercurial is than Git. For the following tests, I’m using Git 1.7.4.1 and Mercurial 1.7.5. I chose two commands:

  • status, which shows how the tool scales with a large source tree.
  • log, which shows how the tool scales with a large number of commits.

The Netbeans/main repository contains more than 190000 commits, and the current source tree contains 90519 files.
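
Figures like these can be obtained with stock commands. The following is a minimal sketch, not necessarily how the numbers above were produced; the function names are made up for illustration, and each function assumes it is run from inside the corresponding working copy.

```shell
# Hypothetical helpers (names are illustrative); run from inside the repository.

# Number of commits reachable from the current Git HEAD.
# Pass --all instead of HEAD to count commits reachable from every ref.
git_commit_count() {
    git rev-list --count HEAD
}

# Number of files tracked in the Git index.
git_file_count() {
    git ls-files | wc -l | tr -d ' '
}

# Number of Mercurial changesets: print one dot per changeset, count the dots.
hg_commit_count() {
    hg log --template '.' | wc -c | tr -d ' '
}

# Number of files in the manifest of the working directory's parent revision.
hg_file_count() {
    hg manifest | wc -l | tr -d ' '
}
```

For example, running git_commit_count and git_file_count inside netbeans-git prints the two numbers for the converted repository. Note that the Git count above only follows HEAD, while the Mercurial count covers every changeset in the repository, so the two can differ.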

Here is the Git test script:

echo REPO SIZE
du -hs .git
echo STATUS TIMING
#Clear disk cache
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
bash -c "time git status"
echo LOG TIMING
#Clear disk cache
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
bash -c "time git log > /dev/null"

And the Mercurial one:

echo REPO SIZE
du -hs .hg
echo STATUS TIMING
#Clear disk cache
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
bash -c "time hg status"
echo LOG TIMING
#Clear disk cache
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
bash -c "time hg log > /dev/null"

And here is the result:

Git                           Mercurial
-----------------             -----------------
REPO SIZE                     REPO SIZE
613M    .git                  2,7G    .hg

STATUS TIMING                 STATUS TIMING
real    0m17.240s             real    0m44.854s
user    0m0.632s              user    0m2.944s
sys     0m1.400s              sys     0m1.880s

LOG TIMING                    LOG TIMING
real    0m10.798s             real    1m1.934s
user    0m4.236s              user    0m48.823s
sys     0m0.384s              sys     0m1.848s

Looking at the real/user/sys values, we can see that in the status command both tools spend most of their wall-clock time waiting rather than computing, and that Mercurial does much more disk access, so here the performance problem doesn’t come from Python. On the other hand, the Mercurial log command spends most of its time on the CPU (about 50 s of user and sys time out of 62 s), which is more expected behavior.
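
The reasoning above rests on simple arithmetic: whatever part of the real time is not covered by user + sys was spent waiting, which with a cold cache mostly means disk. Here is a small sketch of that calculation applied to the timings from the table; the helper names to_seconds and wait_time are made up for illustration, and the result is only an estimate of the I/O wait.

```shell
# Convert a bash `time` value such as "1m1.934s" to seconds.
to_seconds() {
    echo "$1" | LC_ALL=C awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f", $1 * 60 + $2 }'
}

# Time not accounted for by the CPU: real - (user + sys).
# Arguments: real, user, sys, in the format printed by `time`.
wait_time() {
    LC_ALL=C awk -v r="$(to_seconds "$1")" \
                 -v u="$(to_seconds "$2")" \
                 -v s="$(to_seconds "$3")" \
        'BEGIN { printf "%.3f\n", r - (u + s) }'
}

echo "git status wait: $(wait_time 0m17.240s 0m0.632s 0m1.400s) s"   # 15.208
echo "hg status wait:  $(wait_time 0m44.854s 0m2.944s 0m1.880s) s"   # 40.030
echo "git log wait:    $(wait_time 0m10.798s 0m4.236s 0m0.384s) s"   # 6.178
echo "hg log wait:     $(wait_time 1m1.934s 0m48.823s 0m1.848s) s"   # 11.263
```

By this estimate, hg status waits about 40 s against 15 s for git status, while hg log loses most of its time to roughly 50 s of CPU work.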

Here are two interesting (yet a bit old) articles about Git vs Mercurial comparison:
