.. SPDX-License-Identifier: GPL-2.0

=====================================
Handling messy pull-request diffstats
=====================================

Subsystem maintainers routinely use ``git request-pull`` as part of the
process of sending work upstream.  Normally, the result includes a nice
diffstat that shows which files will be touched and how much of each will
be changed.  Occasionally, though, a repository with a relatively
complicated development history will yield a massive diffstat containing a
great deal of unrelated work.  The result looks ugly and obscures what the
pull request is actually doing.  This document describes what is happening
and how to fix things up; it is derived from The Wisdom of Linus Torvalds,
found in Linus1_ and Linus2_.

.. _Linus1: https://lore.kernel.org/lkml/CAHk-=wg3wXH2JNxkQi+eLZkpuxqV+wPiHhw_Jf7ViH33Sw7PHA@mail.gmail.com/
.. _Linus2: https://lore.kernel.org/lkml/CAHk-=wgXbSa8yq8Dht8at+gxb_idnJ7X5qWZQWRBN4_CUPr=eQ@mail.gmail.com/

A Git development history proceeds as a series of commits.  In a simplified
manner, mainline kernel development looks like this::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN

If one wants to see what has changed between two points, a command like
this will do the job::

  $ git diff --stat --summary vN-rc2..vN-rc3

Here, there are two clear points in the history; Git will essentially
"subtract" the beginning point from the end point and display the resulting
differences.  The requested operation is unambiguous and easy enough to
understand.

When a subsystem maintainer creates a branch and commits changes to it, the
result in the simplest case is a history that looks like::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN
                          |
                          +-- c1 --- c2 --- ... --- cN

If that maintainer now uses ``git diff`` to see what has changed between
the mainline branch (let's call it "linus") and cN, there are still two
clear endpoints, and the result is as expected.  So a pull request
generated with ``git request-pull`` will also be as expected.  But now
consider a slightly more complex development history::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN
                |         |
                |         +-- c1 --- c2 --- ... --- cN
                |                   /
                +-- x1 --- x2 --- x3

Our maintainer has created one branch at vN-rc1 and another at vN-rc2; the
two were then merged into c2.  Now a pull request generated for cN may end
up being messy indeed, and developers often end up wondering why.

What is happening here is that there are no longer two clear endpoints for
the ``git diff`` operation to use.  The development culminating in cN
started in two different places; to generate the diffstat, ``git diff``
ends up having to pick one of them and hope for the best.  If the diffstat
starts at vN-rc1, it may end up including all of the changes between there
and the second starting point (vN-rc2), which is certainly not what our
maintainer had in mind.  With all of that extra junk in the diffstat, it
may be impossible to tell what actually happened in the changes leading up
to cN.
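
In Git's terms, the candidate starting points are usually called merge
bases.  As a minimal sketch, the two helper functions below (hypothetical
names, not part of Git) count the merge bases shared by two refs and show
the diffstat that results when only the first reported one is used.  The
refs passed in are assumptions about local branch naming, and this
illustrates the idea rather than the exact commands that
``git request-pull`` runs.

```shell
# Hypothetical helper: print how many merge bases two refs share.
# More than one means there is no single obvious starting point.
count_merge_bases() {
    git merge-base --all "$1" "$2" | wc -l | tr -d ' '
}

# Hypothetical helper: diffstat of ref $2 measured from the first merge
# base that Git reports for $1 and $2 (any others are ignored).
diffstat_from_first_base() {
    base=$(git merge-base "$1" "$2") || return 1
    git diff --stat --summary "$base" "$2"
}
```

For example, ``count_merge_bases linus for-linus`` (with ``for-linus``
standing in for the branch to be pulled) printing a number greater than one
suggests that the generated diffstat may contain unrelated work.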

Maintainers often try to resolve this problem by, for example, rebasing the
branch or performing another merge with the linus branch, then recreating
the pull request.  This approach tends not to lead to joy at the receiving
end of that pull request; rebasing and/or merging just before pushing
upstream is a well-known way to get a grumpy response.

So what is to be done?  The best response when confronted with this
situation is indeed to do a merge with the branch you intend your work
to be pulled into, but to do it privately, as if it were the source of
shame.  Create a new, throwaway branch and do the merge there::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN
                |         |                                      |
                |         +-- c1 --- c2 --- ... --- cN           |
                |                   /               |            |
                +-- x1 --- x2 --- x3                +------------+-- TEMP

The merge operation resolves all of the complications resulting from the
multiple beginning points, yielding a coherent result that contains only
the differences from the mainline branch.  Now it will be possible to
generate a diffstat with the desired information::

  $ git diff -C --stat --summary linus..TEMP

Save the output from this command, then simply delete the TEMP branch;
definitely do not expose it to the outside world.  Take the saved diffstat
output and edit it into the messy pull request, yielding a result that
shows what is really going on.  That request can then be sent upstream.
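
The whole procedure can be sketched as a small shell function.  This is an
illustration under stated assumptions, not an official tool: the function
name ``clean_diffstat`` and its argument order are hypothetical, the
working tree is assumed to be clean, and no branch named TEMP may already
exist.  It creates TEMP from the branch to be pulled, merges the mainline
branch into it, saves the diffstat to a file, and deletes TEMP again
whether or not the merge succeeded.

```shell
# Hypothetical helper: write a clean diffstat for a branch to a file.
#   $1  mainline branch to merge with (for example "linus")
#   $2  branch named in the pull request
#   $3  file that receives the diffstat
clean_diffstat() {
    upstream=$1
    branch=$2
    out=$3

    # Create the throwaway branch; this fails if TEMP already exists.
    git checkout -q -b TEMP "$branch" || return 1

    if git merge -q --no-edit "$upstream"; then
        # Same diff command as in the text, with the output saved.
        git diff -C --stat --summary "$upstream..TEMP" > "$out"
        status=0
    else
        # Conflicts would need manual resolution; give up cleanly here.
        git merge --abort
        status=1
    fi

    # Return to the original branch and discard the private merge.
    git checkout -q "$branch"
    git branch -q -D TEMP
    return "$status"
}
```

A call such as ``clean_diffstat linus for-linus diffstat.txt`` (where
``for-linus`` is an assumed branch name) leaves the diffstat in
``diffstat.txt``, ready to be edited into the pull request by hand.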