On Tue, Nov 26, 2013 at 04:52:20PM +0000, Lars Wirzenius wrote:
On Mon, Nov 25, 2013 at 12:55:31PM +0000, Richard Maw wrote:
> temporary_build_branch will collect uncommitted changes in the system
> branch provided, returning the repository and ref to the body of the
> while statement.
> The temporary build branch is cleaned up after the body of the while
Yep, not sure why I made that mistake.
However, it seems to me that this function should be part of the
SystemBranch class, and not a utility function. Possibly even its own
dedicated class, if the code would otherwise make the SystemBranch
class very large, but I'd prefer to wait with that until it happens.
It's large enough and has enough independent functionality that I think
it ought to be a separate class.
> +def hash_morphologies(gd, morphologies, loader): # pragma: no
> + '''Hash morphologies and return object info'''
> + for morphology in morphologies:
> + loader.unset_defaults(morphology)
> + sha1 = gd.store_blob(loader.save_to_string(morphology))
> + yield 0100644, sha1, morphology.filename
This function name and docstring don't mention git at all. It'd be
good to do that, to set a context, and to also document what the
"object info" that is returned means. The magic octal constant could
do with a comment or, preferably, a name that explains what it is.
Ack, will expand documentation in the next iteration.
I'd make this function return a list, rather than emulate lazy
evaluation using a generator. The caller must otherwise know they're
dealing with a generator, and if they don't know or remember that,
there's a potential source of bugs. Keeping it simple and returning a
list is better, since we are very much unlikely to be dealing with
millions of morphologies.
In this case I'm not using yield to make it into a generator for lazy
evaluation, I was using yield since it's less code than creating,
appending and returning the list.
However I'll agree that the potential for bugs is a real problem. When
I was testing various methods which were generators, but I only cared
about the side-effects, not the output, I got incredibly confused why
it wasn't doing what I thought it would.
I propose we add a clause to coding style that generator functions
shouldn't have side-effects unless it's inconcievable that anyone would
run it just for the side effects.
I'm not sure about the hash_morphologies name either. Should
function be re-named similarly to how store_blob got renamed?
Possibly, but I'll probably make it a private method of the
TemporaryBuildBranch class anyway so it probably doesn't matter too much.
I'm not sure if this makes it simpler than building a context manager
I found it simpler and more readable to do it as a wrapped generator
function, since I find it easier to read as
1. start here and do any setup
2. yield any values to caller here in a try block
3. if you want to suppress a caught exception, just don't re-raise it
in an except block
4. clean up in the finally block
1. write an __init__ method that takes parameters, but may or may not do
any setup, depending on if you want the context manager to be
2. write an __enter__ method that does some setup and returns any values
to the caller
3. write an __exit__ method
3a. if you want it to behave differently based on whether an exception
was raised check the parameters for the exception type, if any of
them are None then you didn't have an exception.
3b. if you want to suppress the exception then return True.
Also, it allows you to nest another context manager in the implementation
without messing about, calling __enter__ and __exit__ directly.
Basically, if you want to factor out common cleanup the
contextlib.contextmanager decorator is useful, but if you want to have
a context manager with other methods, you've got to deal with the pain,
and it's easiest just to provide a .close() method to do cleanup, so
if people want to use it as a context manager to automatically clean up
they can just wrap the creation with contextlib.closing().
However, the function is quite long as it currently exists,
and it would be good to see it split up into smaller functions. If
this becomes a SystemBranch method, or its own class, that might be a
good place to split it up into helper methods.
There's quite a few obvious ways to split up the functionality into
methods, so I'm going to do it as a class with a .close() method for
cleaning up, since it's more concise to use contextlib.closing() than
> +def temporary_build_branch(status, sb, build_ref_prefix,
> + name, email, build_uuid): # pragma: no cover
Eight argument is a lot. That's a bad code smell. If this were its own
class, some of those could be provided via the initialiser or class
attributes. Not sure what else to do to fix that code smell, though.
I haven't got an opinion on how the code actually works. It's a bit
much to grasp at once: splitting it up and having well-named helper
functions/methods would probably clear it up nicely.
As a class, I think we can get away with only two parameters to the
constructor. All the rest can be passed to the helper methods.
I was concerned that it wouldn't neatly map from a sequential function
to a bunch of helper methods, since they needed to be run in order,
but while there's an initial order of having to update the build refs
before pushing, and adding uncommitted changes will overwrite changes
made in inject_build_refs, so you should call inject_build_refs after
add_uncommitted_changes, it makes a fair amount of sense to make it
possible to use them independently.
I'm considering dropping the Temporary bit of the name, since we could
easily tweak the behaviour so that if there's no uncommitted changes it
returns the ref of the system branch, rather than the temporary build
ref, and if there's no uncommitted changes then push will push your
system branch, rather than the temporary branch.