.. SPDX-License-Identifier: GPL-2.0

=====================================
Handling messy pull-request diffstats
=====================================

Subsystem maintainers routinely use ``git request-pull`` as part of the
process of sending work upstream.  Normally, the result includes a nice
diffstat that shows which files will be touched and how much of each will
be changed.  Occasionally, though, a repository with a relatively
complicated development history will yield a massive diffstat containing a
great deal of unrelated work.  The result looks ugly and obscures what the
pull request is actually doing.  This document describes what is happening
and how to fix things up; it is derived from The Wisdom of Linus Torvalds,
found in Linus1_ and Linus2_.

.. _Linus1: https://lore.kernel.org/lkml/CAHk-=wg3wXH2JNxkQi+eLZkpuxqV+wPiHhw_Jf7ViH33Sw7PHA@mail.gmail.com/
.. _Linus2: https://lore.kernel.org/lkml/CAHk-=wgXbSa8yq8Dht8at+gxb_idnJ7X5qWZQWRBN4_CUPr=eQ@mail.gmail.com/

A Git development history proceeds as a series of commits.  In simplified
form, mainline kernel development looks like this::

    ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN

If one wants to see what has changed between two points, a command like
this will do the job::

    $ git diff --stat --summary vN-rc2..vN-rc3

Here, there are two clear points in the history; Git will essentially
"subtract" the beginning point from the end point and display the resulting
differences.  The requested operation is unambiguous and easy enough to
understand.

When a subsystem maintainer creates a branch and commits changes to it, the
result in the simplest case is a history that looks like::

    ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN
                          |
                          +-- c1 --- c2 --- ... --- cN

If that maintainer now uses ``git diff`` to see what has changed between
the mainline branch (let's call it "linus") and cN, there are still two
clear endpoints, and the result is as expected.  So a pull request
generated with ``git request-pull`` will also be as expected.  But now
consider a slightly more complex development history::

    ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN
               |          |
               |          +-- c1 --- c2 --- ... --- cN
               |                    /
               +-- x1 --- x2 --- x3

Our maintainer has created one branch at vN-rc1 and another at vN-rc2; the
two were subsequently merged at c2.  Now a pull request generated for cN
may end up being messy indeed, and developers often end up wondering why.

What is happening here is that there are no longer two clear endpoints for
the ``git diff`` operation to use.  The development culminating in cN
started in two different places; to generate the diffstat, ``git diff``
ends up having to pick one of them and hope for the best.  If the diffstat
starts at vN-rc1, it may end up including all of the changes between there
and the second starting point (vN-rc2), which is certainly not what our
maintainer had in mind.  With all of that extra junk in the diffstat, it
may be impossible to tell what actually happened in the changes leading up
to cN.

Maintainers often try to resolve this problem by, for example, rebasing the
branch or performing another merge with the linus branch, then recreating
the pull request.  This approach tends not to lead to joy at the receiving
end of that pull request; rebasing and/or merging just before pushing
upstream is a well-known way to get a grumpy response.

So what is to be done?  The best response when confronted with this
situation is indeed to do a merge with the branch you intend your work to
be pulled into, but to do it privately, as if it were the source of shame.
Create a new, throwaway branch and do the merge there::

    ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN
               |          |                                        |
               |          +-- c1 --- c2 --- ... --- cN             |
               |                    /               |              |
               +-- x1 --- x2 --- x3                 +--------------+-- TEMP

The merge operation resolves all of the complications resulting from the
multiple beginning points, yielding a coherent result that contains only
the differences from the mainline branch.  Now it will be possible to
generate a diffstat with the desired information::

    $ git diff -C --stat --summary linus..TEMP

Save the output from this command, then simply delete the TEMP branch;
definitely do not expose it to the outside world.  Take the saved diffstat
output and edit it into the messy pull request, yielding a result that
shows what is really going on.  That request can then be sent upstream.
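
The throwaway-merge procedure can be tried out safely in a scratch
repository.  What follows is only a sketch under stated assumptions: the
branch name ``for-linus``, the helper function ``clean_diffstat`` and the
tiny demo history are all made up for illustration, while "linus" and TEMP
are the names used in the text.  The helper creates TEMP at the tip of the
topic branch, merges mainline into it, prints the diffstat and then deletes
TEMP again.

```shell
#!/bin/sh
# Sketch only.  "for-linus" and clean_diffstat are hypothetical names;
# "linus" and TEMP follow the text.  Everything runs in a scratch repository.
set -e

# Print the diffstat of topic branch $2 against mainline $1 by way of a
# throwaway merge.  If the merge conflicts, the script stops with TEMP still
# checked out so that the conflict can be resolved by hand.
clean_diffstat() {
    git checkout -q -b TEMP "$2"      # throwaway branch at the tip (cN)
    git merge -q --no-edit "$1"       # the private merge with mainline
    git diff -C --stat --summary "$1"..TEMP
    git checkout -q "$2"
    git branch -q -D TEMP             # TEMP must never be pushed anywhere
}

# Build a tiny history shaped like the diagrams in the text.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/linus

# Record one commit that adds a file named after the commit.
commit() { echo "$1" > "$1"; git add "$1"; git commit -q -m "$1"; }

commit rc1; git tag vN-rc1
commit rc2; git tag vN-rc2
git checkout -q -b x vN-rc1;         commit x1
git checkout -q -b for-linus vN-rc2; commit c1
git merge -q --no-edit x             # the merge at c2
commit c3
git checkout -q linus;               commit rc3

# Save the diffstat so that it can be edited into the pull request.
stat=$(clean_diffstat linus for-linus)
printf '%s\n' "$stat"
```

In a real tree only the body of ``clean_diffstat`` is relevant.  In this
demo the saved output should list just the files added on the topic side
(x1, c1 and c3), and nothing from the mainline commit made after the
branch point.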