1. "master", which is the dev state of the project I have branched
2. "stable", a stable branch of the project
3. "master-dev", which is my branch of master
4. "stable-dev", my branch of stable
At some point I made the decision to start doing most development in "stable-dev", as "master" has diverged to the point that my patches don't make any sense anymore, as some modules got rewritten. Therefore master-dev is actually behind a few months at this point. Additionally, the same applies in reverse: My "master-dev" and "stable-dev" branches have refactored some modules to the point where some patches coming from "master" or "stable" need major rework in order to apply.
Now I want to clean this up, so I get a "master-dev" that I can actually work with. This means merging the changes from "stable-dev" and "master". This is where it starts to get complicated.
Let's first look at the "stable-dev" branch: It got merged with "stable" at a number of points, which means it imports a number of changes that were rebased from "master", with compatibility adjustments, mostly hidden in merge commits. Additionally, there are some commits that are actually exclusive to the "stable" branch (version numbers etc.) that I don't want in "master-dev", or it would get messy.
My decision here was (is?) to manually filter and rebase "my" changes, hoping that I don't miss anything important in the merge commits. I'm entirely unsure though what state of "master-dev" I should actually rebase to, given that I'm skipping the interleaved commits coming from "master". If I was going by the ideal of having buildable intermediate results as often as possible and minor headaches about hiding magic in merges, I would have to "replay" the equivalent commits from "master" in the right order, which is positively too much work.
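One way to at least enumerate "my" changes before filtering them by hand is to ask Git for the non-merge commits that are reachable from stable-dev but not from stable. The sketch below builds a throwaway repository to show the idea. All file names and commit subjects are made up, and it does not recover adjustments hidden inside merge commits.

```shell
# Scratch repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > core.txt
git add core.txt
git commit -q -m "initial"
git branch -m stable                      # stands in for the upstream "stable" branch

git checkout -q -b stable-dev
echo mine1 > patch1.txt
git add patch1.txt
git commit -q -m "my change 1"

git checkout -q stable
echo 1.1 > VERSION
git add VERSION
git commit -q -m "bump version"           # exclusive to stable

git checkout -q stable-dev
git merge -q --no-edit stable             # periodic merge from stable
echo mine2 > patch2.txt
git add patch2.txt
git commit -q -m "my change 2"

# Non-merge commits on stable-dev that stable does not have, oldest first.
mine=$(git log --no-merges --reverse --format=%s stable..stable-dev)
echo "$mine"
```

The resulting list is only a starting point for a manual cherry-pick or rebase plan; which base to replay it onto remains the open question.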
Second, there are the incompatible changes from the "master" branch itself. Now it's reasonably easy to say that I want to merge with "master", as that's the history that I care about here. Yet doing this as a monolithic merge means that I would be hiding possibly thousands of lines of new code in a merge.
On the other hand, just "splitting" the merge into trivial and non-trivial changes (the former in a monolithic commit, the latter as new commits referencing the original changes) is also quite a nightmare. A bit down the line I find myself having to "amend" the merge constantly as changes move into and out of my "trivial" classification. And the intermediate states are completely nonsensical, which means that I have to fly blind for a week.
Right now I'm about to scrap all my work up to this point and move to an approach that has worked in the past - namely doing the merge in "stages" up to the "non-trivial" commits, then having one "merge" commit for each of the non-trivial commits that has the actual changes. Maybe then with the rebased commits interleaved.
Argh. Why is there no system where I can say "everything from master, master-dev and stable-dev, minus anything exclusive to stable, go figure out the dependencies and call me if something happens that's actually interesting". Darcs already does it, how frickin' hard can it be. Any better ideas about how to get this working better? :/
> Now I want to clean this up, so I get a "master-dev" that I can actually work with. This means merging the changes from "stable-dev" and "master". This is where it starts to get complicated.
In my understanding, you should merge/cherry-pick
* stable => stable-dev
* master => master-dev
You shouldn't have to merge from stable-dev to master-dev, since anything that has been committed to stable should also be in master (from which you merge) in the main project anyway. The exception is a changeset that is exclusive to stable, for example one that fixes a bug in some piece of code that has already been changed all over in master. But in that case, you don't want to have it in master-dev either.
git rebase -i -p for squashing commits into merges does seem to cause some major explosions - so far it doesn't seem to actually create merges, and produces quite nonsensical new conflicts. It also eats the "rebase-todo" file on every occasion without backup, which means that I had to rebuild my whole rebase plan a few times over before arriving at this insight.
Next up: Let's see whether rerere actually manages to make my life easier for a change.
I think I'm just going to write my own shell scripts now instead of hoping for Git to surprise me. Here's how I rebase merges: I basically create a new merge commit and resolve any conflicts using the diff of an existing merge, which you have to specify manually. That, together with a lot of "git show x | patch -p1", will probably give me something I can work with. Anybody have a better idea for how to achieve this?
#!/bin/bash
BASE=$(git rev-parse HEAD)
MERGE=$1 # The commit we want to merge with
REBASE=$2 # The merge we want to rebase
REBASE_BASE=$3 # Our equivalent relative to the rebase commit
git merge "$MERGE" --no-commit
# Files the merge left unmerged (relies on word splitting, so no spaces in paths)
CONFLICTS=$(git diff --name-only --diff-filter=U)
if [[ -n $CONFLICTS ]]; then
# Reset file to state before merge
git checkout "$BASE" -- $CONFLICTS
# Import diff from other merge
git diff "$REBASE_BASE" "$REBASE" -- $CONFLICTS | patch -p1
# Add to index
git add -- $CONFLICTS
fi
git rebase -i -p to get over its strange merge problems.
Bad news is that it seemingly at random made non-merge commits out of merges. I'd swear I did the exact same thing every time. Maybe squashing kills merges even if rebasing managed to recreate them or something? Aaaargh...
1. never make a mistake while merging
2. rebase instead, which means throwing away history
3. leave significant amounts of history which are known to be nonsensical (aka unbuildable)
Which all seem impracticable at large scale. What a disappointment. I will probably grit my teeth and go with option 3, even if it means hundreds of "oops" commits.
Basically, neither of them really addresses the issue of patch dependencies. Instead they reason about revision dependencies, which is pretty restrictive. And when people want to break out of that (as they always will want to do within about the first five minutes) they fall back to "rebase" mechanics, which just dump a whole lot of patches on you so you can construct another completely pointless revision history out of them.
The way I have come to look at it is that I am building a "story" - not of something that actually happened, but of something that could have happened in an ideal world. A world where I never had to go back on a change, or never had to completely rewrite a feature because the code it was based on was changed in parallel. That's unquestionably a good idea, as it documents your changes better than a jumbled mess of back-and-forth. Yet all this should have references to where development actually came from, in case a bug needs to be traced along that path (which is why I don't want to do a mega-rebase, as the people on Stack Overflow suggest).
So if that's what Git workflow demands, I try to construct it. Staged merges sort-of manage to get me there, but now the restrictions on squashing are really derailing my efforts...
(Addendum: Just to clarify, I am talking about work here, specifically merging this and this, which means getting through about 150k LOC in changes with conflicts all over the place. All this is way too subtle for a project the size of OpenClonk, as we don't even have that much code or activity in the first place.)
This is the third time I have recreated this merge tree - going for option 4 that I didn't mention above, which is stop using Git's machinery and do stuff by hand using
But now, again, I notice that there is a mistake in the highlighted merge. Yay me.
Or was it just something that you took over from CR and where you did not do the path adjustments etc?
I am trying to be as smart about it as humanly possible, but the merge nightmares are just unavoidable.
git merge with the same parents was an appropriate way to rebase a merge...? They're not even looking at the contents of the merge in question...
I'm now doing "git merge --no-commit -s ours ...; git cherry-pick -m x -n ...; git commit ...", which seems a lot saner to me.
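Spelled out on a throwaway repository, that sequence might look like the sketch below. Branch, file, and commit names are invented for illustration, and "-m 1" assumes that the first parent of the old merge is the development line being replayed.

```shell
# Scratch repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "A"
upstream=$(git symbolic-ref --short HEAD)
base_commit=$(git rev-parse HEAD)

# Old development line: X, then a merge M that also carries a manual fix-up.
git checkout -q -b topic
echo "topic work" > topic.txt
git add topic.txt
git commit -q -m "X"
old_x=$(git rev-parse HEAD)

git checkout -q "$upstream"
echo "upstream fix" > upstream.txt
git add upstream.txt
git commit -q -m "B"

git checkout -q topic
git merge -q --no-commit "$upstream"
echo "adjusted for upstream" >> topic.txt
git add topic.txt
git commit -q -m "M"
old_merge=$(git rev-parse HEAD)

# Rewritten development line: same change as X, but a new commit.
git checkout -q -b topic-rebased "$base_commit"
git cherry-pick -n "$old_x"
git commit -q -m "X (rewritten)"

# Recreate the merge on the rewritten line.
git merge -q --no-commit -s ours "$upstream"   # record the second parent, keep our tree
git cherry-pick -m 1 -n "$old_merge"           # replay what M changed relative to its first parent
git commit -q -m "M (recreated)"

parent_count=$(git show -s --format=%P HEAD | wc -w | tr -d ' ')
echo "parents of recreated merge: $parent_count"
```

The point of going through "-s ours" first is that Git only records the parent relationship, while the content of the merge comes entirely from the old merge commit, including anything that was fixed up by hand in it.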
> 2. rebase instead, which means throwing away history
The msysgit developers have decided that the answer to not wanting to throw history away is "duplicate it". They rebase, and then keep the old history by manufacturing a merge commit that points to the old history, but takes the state of the repository entirely from the rebased branch.
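As I understand that description, the pattern could be sketched roughly as follows on a scratch repository. The branch names are made up, and this is my reading of the approach rather than the msysgit project's actual tooling.

```shell
# Scratch repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "A"
upstream=$(git symbolic-ref --short HEAD)

git checkout -q -b topic
echo "topic work" > topic.txt
git add topic.txt
git commit -q -m "X"
old_tip=$(git rev-parse HEAD)

git checkout -q "$upstream"
echo "upstream change" > upstream.txt
git add upstream.txt
git commit -q -m "B"

# Rebase a copy of the topic branch, then tie the old history back in.
git checkout -q -b topic-rebased topic
git rebase -q "$upstream"
rebased_tree=$(git rev-parse 'HEAD^{tree}')
git merge -q -s ours -m "Keep pre-rebase history reachable" topic

final_tree=$(git rev-parse 'HEAD^{tree}')
```

The "ours" strategy ignores the content of the old branch completely, so the merge commit only exists to keep the pre-rebase commits reachable for later archaeology.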
And it still doesn't provide any insight into the "and what if I screw up on the way there?" problem :(
> And it still doesn't provide any insight into the "and what if I screw up on the way there?" problem :(
Oh, that's easy: You rebase again. And now you have three sets of commits!
* To make it look as if you did not screw up, simply wait until the master branch has a few more commits. Clearly, you rebased in order to bring your branch up to date with regards to the master branch! Nothing to see here, move along!
* Alternatively, only keep the original set of commits and throw the intermediate stage away. Since there's only one merge commit at the end, that should be possible.
I think the official answer is to have smaller topic branches that contain no commits that are not intended to land in the master branch, and rebase those regularly, or preferably merge them into master. I guess the theory is that the work to adapt the patches is bigger than the work to arrange them into a nice history, and once you avoid the bigger part the smaller goes away on its own.
I might add that even though I put a week of work into this, I am still months away from catching up with HEAD. I suppose you could say the mistake was doing changes spread this badly over the source code - but I sort of have no choice at this point.
If you try to rebase this interactively as follows:
Probably with the intention of getting
You instead get
A - E
With E dangling there, not reached by anything and therefore effectively removed. Reason is that the merge-enabled rebase process insists on re-using the same parents, being completely inconsistent with using rebase for reordering commits.
If we now want to edit A, like squash in a later commit, doing a "linear" rebase like I was suggesting would leave you with a situation like follows (here for the B branch):
pulling the unmodified A commit in again as the parent of C. Hence the current rebase process will go down both paths - requiring it to be "magical" about the commit's parents and removing considerable flexibility in reordering patches.
Currently I'm thinking that something like follows would be a sensible way to put generalized rebases:
branch thingy being automatically generated in the initial version of the git-todo and marking points in the rebase process where Git should do a git checkout of the new version of A. This makes the branching explicit, but the merging implicit - which still leaves some space for surprises, but would be closer to the spirit of Git's rebase for my taste.
and now do
Therefore moving F towards A and removing D entirely. Apart from the discussion whether that sort of operation would be a terrible idea or not, the (I feel) best result should be
Meaning that the "magic" will have to recreate merge E detecting both that B has gotten a new child that it didn't have before and is a more likely target for the merge, and that D was removed and should be replaced by its parent. Hm.
(I hope nobody minds me randomly musing here. It's off-topic, I'm not breaking much, right? ;) )
A --- B --- C
- X --- Y
Now while reviewing & testing Y, I find that a bug is blocking me from proper testing - which is fixed in C. So I merge C:
A --- B --- C
- X --- Y --- Z
And now I can figure out all the problems that Y had, and produce a fix F, which I really want to squash with Y, as it has nothing to do with Z:
A --- B --- C
- X --- Y --- Z --- F
But this is precisely something that unmodified Git can't do.
But now, when F already incorporates changes from Z (C), it is something hard to achieve (unless you go through all of your modified files and reset some of them to the "pre-Z" state).
A --- B --- C
- X --- Y --- Z
F --- Z'
I mean, I do want a branch with all changes together, so the merge Z' is what I would do next. I still don't like this extra complexity. The thing is - in most cases Z and F are completely independent as far as the code goes, therefore they should be interchangeable. I just have to test with Z because it otherwise breaks the build at some point.
> Z and F are completely independent as far as the code goes therefore they should be interchangeable
That is why they should be implemented in separate branches. If you want interchangeability you should keep them in separate branches and continue developing your code. When the "time comes" you can merge either Z or F (depending on which one is more appropriate) or perform a rebase in case you have already committed something on top of Z or F.
> I still don't like this extra complexity.
Well, this is the (in)famous "Git-way": Branches are cheap. Git encourages you to create a new branch for every change which you suspect may be discarded in the end.
> If you want interchangeability you should keep them in separate branches and continue developing your code.
But I can't develop my code when I can't test. And testing requires Z.
> Branches are cheap.
I'm not concerned with breaking Git, just my own brain. This approach just doesn't scale for complicated merges where I have to juggle a good dozen of Fs and Zs.
> Now while reviewing & testing Y, I find that a bug is blocking me from proper testing - which is fixed in C. So I merge C:
I think that's the mistake. Instead, discard the broken merge and merge X and C directly. Git rerere (I don't know whether that's enabled by default yet) should automatically reuse the merge resolutions you committed in Y, so you're not discarding your work. Generally, only merging "known stable" commits like releases is recommended, though I don't know how feasible that is in your case.
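A small sketch of what that would look like, on a scratch repository with invented names. Note that rerere has to be switched on explicitly here, and that it only replays a resolution when the same conflict shows up again, which is why the later upstream fix in this example touches a different file.

```shell
# Scratch repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"
git config rerere.enabled true            # enabled explicitly for this repository

echo base > file.txt
git add file.txt
git commit -q -m "A"
upstream=$(git symbolic-ref --short HEAD)

git checkout -q -b topic
echo topic > file.txt
git commit -q -am "X"

git checkout -q "$upstream"
echo upstream > file.txt
git commit -q -am "B"

# First merge (Y): conflicts, resolved by hand; rerere records the resolution.
git checkout -q topic
git merge "$upstream" || true
echo resolved > file.txt
git add file.txt
git commit -q -m "Y"

# Upstream gains the fix C, which touches a different file.
git checkout -q "$upstream"
echo fix > fix.txt
git add fix.txt
git commit -q -m "C"

# Discard Y and merge C directly on top of X.
git checkout -q topic
git reset -q --hard HEAD~1
git merge "$upstream" || true             # same conflict again, rerere replays the resolution
replayed=$(cat file.txt)
echo "file.txt after replay: $replayed"
```

The second merge still stops and waits for "git add" and "git commit", so there is a chance to review what rerere filled in before it becomes part of the history.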
Plus, stable commits are not only exceptionally tricky to identify - for documentation reasons I would like to merge in the problematic commits in as much isolation as possible. If I include all fixes in the merge, there's a decent chance that a different non-trivial merge issue will pop up.
And sure, if the fix is only available together with lots of other changes, discarding the merge commit with the broken commit might not be the best idea - but it's probably worth a try. Minimizing broken commits in the history is just as useful as other aspects of a helpful history.
The colors stand for commit authors, with me being light blue. At least Firefox is able to search the graph, so a few pointers...
- 7bf44c3 is the merge head I'm trying to push forward along the main branch that goes to the right - which would actually go a lot further to the right at this point.
- Some might note 251be09, which is actually a side-branch of my changes that would make the merges even harder, which is why I'm holding out on merging them.
- Around 4783ce7 you can see the situation I was referring to when I started ranting - these are mostly easy "1 line" commits that fix things that came up in the history following f1da701. None of the merges on the way are easy by any stretch of the imagination (each has a few hundred LOC of complexity because of conflicts). Squashing those together into the second development line starting with 312ae09 took me multiple days. Now I finally have a Git that can do this with --recreate-merges, so this is less scary.
- My current situation is that I am wrestling with d92bd17, which is a massive 4000 LOC change - which only over the next dozen commits actually becomes stable. I should probably have selected a better merge point, but it's damn hard to know something like that in advance.
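For readers on a current Git: as far as I know, the --recreate-merges option mentioned above ended up in upstream Git under the name --rebase-merges. The sketch below assumes Git 2.18 or newer and uses a scratch repository with invented branch names. It only shows the non-interactive case, where a side branch and its merge are rebuilt on a new base.

```shell
# Scratch repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "A"
upstream=$(git symbolic-ref --short HEAD)

git checkout -q -b topic
echo "topic work" > topic.txt
git add topic.txt
git commit -q -m "X"

git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "F"

git checkout -q topic
git merge -q --no-ff -m "Merge feature" feature

git checkout -q "$upstream"
echo "upstream change" > upstream.txt
git add upstream.txt
git commit -q -m "B"

# Rebase topic, including its merge commit, onto the new upstream tip.
git checkout -q topic
git rebase -q --rebase-merges "$upstream"

parent_count=$(git show -s --format=%P HEAD | wc -w | tr -d ' ')
echo "parents of rebased tip: $parent_count"
```

With "-i" added, the generated todo list spells out the branch points and merges as explicit commands, which is at least in the spirit of the explicit branching discussed earlier in the thread.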