[PATCH] Partially-cached gits are now cleaned up instead of persisting.

Richard Maw richard.maw at codethink.co.uk
Mon Oct 7 13:39:43 BST 2013

On Thu, Oct 03, 2013 at 02:44:38PM +0100, Daniel Firth wrote:
> @@ -259,16 +276,19 @@ class LocalRepoCache(object):
>                  return self.get_repo(reponame)
>              else:
>                  errors.append(error)
> -
>          repourl = self._resolver.pull_url(reponame)
>          path = self._cache_name(repourl)
> +        tmp_path = self._mkdtemp(self._cachedir)
> +        target = os.path.join(tmp_path, "actual")
>          try:
> -            self._git(['clone', '--mirror', '-n', repourl, path])
> +            self._git(['clone', '--mirror', '-n', repourl, target])
>          except cliapp.AppException, e:
>              errors.append('Unable to clone from %s to %s: %s' %
> -                          (repourl, path, e))
> +                          (repourl, tmp_path, e))
> +            self._rmtree(tmp_path)
>              raise NoRemote(reponame, errors)

If the tmp_path is only cleared up on failure, --cachedir will fill up
with junk quickly.

I would recommend that instead of cloning into a subdirectory of your
temporary directory, clone directly into it, git will handle cloning
into an empty directory correctly.

That way, you wouldn't need to get rid of the temporary directory,
it will have become the actual one.

> +        self._move(target, path)
>          return self.get_repo(reponame)

shutil.move has similar semantics to mv, so if the repository is already
cloned, such as by a second morph process, then if `path` does exist,
your fresh clone would be copied into it.

I would recommend using os.rename instead. This will raise an exception
if the directory already exists, and since you've created the temporary
directory in the same filesystem, you don't need the added semantics
shutil.move has for copying it if it is on a different filesystem.

More information about the baserock-dev mailing list