Development

NAVIGATION
CATEGORIES
REFERRENCE
LINKS
  • Show more information in diff labels in mailer.py

    11 answers - 1709 bytes - related search similar search Add To My Delicious Add To My Stumble Upon Add To My Google Mark Add To My Facebook Add To My Digg Add To My Reddit

    Malcolm Rowe wrote:
    Do the resulting patches still work the same with patch(1)?
    Daniel Rall wrote:
    I like the additional information, though I'm a little unsure about
    the format. Can you still paste this into most 'patch' tools? I
    think GNU patch should work. I'm not sure about Solaris patch, etc.
    First of all I don't think that anyone will take a message from
    mailer.py, extract the patch information and then apply it to some
    files. Why should someone do that? It's obviously much easier to get
    the diff with 'svn diff' - or just say 'svn up'. IMH we only use the
    unified diff format because people are used to read it.
    Second, if you look at today's output, copied files are shown as
    file1 (original)
    file2 Sun Sep 9 07:20:00 2001
    If you give this to patch, patch will patch file1 and not generate a
    new file file2! So in case of copied files it doesn't work today
    either.
    Third, I tested the new diffs.
    cygwin (patch 2.5.8) copied files didn't work (see above) and
    deleted files result in a 0 byte file. It's all the same behaviour
    compared to today's diff format.
    Solaris 5.9 (patch doesn't show any version number) copied files
    didn't work (see above), deleted files result in a 0 byte file and
    patch doesn't create the new file with the 'add diff', but it's also
    the same behaviour compared to today's diff format.
    Mathias
    To unsubscribe, e-mail: dev-unsubscribe (AT) subversion (DOT) tigris.org
    For additional commands, e-mail: dev-help (AT) subversion (DOT) tigris.org
  • No.1 | | 1061 bytes | |

    9/14/06, Mathias Weinert <wein (AT) mccw (DOT) dewrote:
    Malcolm Rowe wrote:
    Do the resulting patches still work the same with patch(1)?

    Daniel Rall wrote:
    I like the additional information, though I'm a little unsure about
    the format. Can you still paste this into most 'patch' tools? I
    think GNU patch should work. I'm not sure about Solaris patch, etc.

    First of all I don't think that anyone will take a message from
    mailer.py, extract the patch information and then apply it to some
    files. Why should someone do that? It's obviously much easier to get
    the diff with 'svn diff' - or just say 'svn up'. IMH we only use the
    unified diff format because people are used to read it.

    FWIW, I've seen people actually do this. I agree, they're insane, but
    it does happen.
    -garrett

    To unsubscribe, e-mail: dev-unsubscribe (AT) subversion (DOT) tigris.org
    For additional commands, e-mail: dev-help (AT) subversion (DOT) tigris.org
  • No.2 | | 999 bytes | |

    Thu, Sep 14, 2006 at 12:38:31PM +0200, Mathias Weinert wrote:
    Third, I tested the new diffs.
    cygwin (patch 2.5.8) copied files didn't work (see above) and
    deleted files result in a 0 byte file. It's all the same behaviour
    compared to today's diff format.
    Solaris 5.9 (patch doesn't show any version number) copied files
    didn't work (see above), deleted files result in a 0 byte file and
    patch doesn't create the new file with the 'add diff', but it's also
    the same behaviour compared to today's diff format.

    Excellent, thanks for checking. Like Garrett, I've seen people do
    this. (In one case, because they could access their mailbox - but not
    the Subversion server - from outside the company's firewall).

    Regards,
    Malcolm

    To unsubscribe, e-mail: dev-unsubscribe (AT) subversion (DOT) tigris.org
    For additional commands, e-mail: dev-help (AT) subversion (DOT) tigris.org
  • No.3 | | 617 bytes | |

    [Mathias Weinert]
    cygwin (patch 2.5.8) copied files didn't work (see above) and
    deleted files result in a 0 byte file. It's all the same behaviour
    compared to today's diff format.

    GNU patch, on Unix, will delete a file in two cases:

    The file has a timestamp of (time_t)0. This will look something
    like "1969-12-31 18:00:00.000000000 -0600".
    or
    The file is named "/dev/null".

    I think it's useful to do at least one of those things.

    PGP SIGNATURE
    Version: GnuPG v1.4.5 (GNU/Linux)

    VeI3+Dmxt2wwWmzPRGQ=
    =wfc2
    PGP SIGNATURE
  • No.4 | | 1381 bytes | |

    [Mathias Weinert]
    As the second one doesn't make it obvious that the file is missing
    (on the first sight there is a "normal" path).

    It doesn't seem confusing to me at all. Don't all Unix users (i.e.,
    most people who know how to read udiff format) know what "/dev/null"
    means?

    file1r5Sun Sep 9 15:40:00 2001 (original)
    ()00:00:00 1970 (empty, because file is deleted)

    Could you please confirm that this works with your GNU patch?

    That case works, yes. However, the extra "r5" field, in the general
    case, messes up the patch -T and -Z options. Since those options are
    the only reason I can think of to bother printing timestamps, you may
    as well remove the "Sun Sep 9 15:40:00 2001" entirely.

    I'm still not convinced that "()" is a clearer magic filename than
    "/dev/null". The advantage of "/dev/null" is that you can omit the
    timestamp in _all_ cases, and I find the lack of such clutter
    refreshing when reading udiffs, especially since this usually prevents
    the headers from wrapping past 80 columns. My own patch
    management scripts already remove timestamps from my diff headers, for
    readability, and I happen to know I'm not the only one who does this.

    PGP SIGNATURE
    Version: GnuPG v1.4.5 (GNU/Linux)

    XDGAijd137Jj987tQfD4Rv0=
    =
    PGP SIGNATURE
  • No.5 | | 2457 bytes | |

    Mon, 18 Sep 2006, Mathias Weinert wrote:

    Peter Samuelson wrote:

    I'm still not convinced that "()" is a clearer magic filename than
    "/dev/null". The advantage of "/dev/null" is that you can omit the
    timestamp in _all_ cases, and I find the lack of such clutter
    refreshing when reading udiffs, especially since this usually prevents
    the headers from wrapping past 80 columns. My own patch
    management scripts already remove timestamps from my diff headers, for
    readability, and I happen to know I'm not the only one who does this.

    As already said in an earlier post IMH the output of mailer.py is more
    a textual information than a base for any kind of automatic processing.
    the other hand there seem to be several people out there who use
    mailer.py's output to apply patches etc.

    While change notification email output is more intended for human
    consumption than machine consumption, occassionally you come across
    some changes to a project that you only have easy access to in that
    form. It's awfully convenient to be able to cut and paste from the
    patch in the notification email

    So why not defining an additional configuration parameter 'diff_style'
    with the possible values 'human' which prints the diff labels like
    suggested (may be using /dev/null for deleted paths) and 'patch' which
    prints no additional information?

    Two additional thoughts about this:
    1. Although we would introduce two types of diff output, both of them
    would work with patch if no additional flags are given with the
    patch command.

    The only reason I see for such an enhancement would be if the mailer
    produced truly human-friendly output, such as the diff views in Trac
    and ViewVC, which is not intended to be digested by programs like
    'patch'.

    IMH, adding this complexity and minor additional tool incompatibility
    doesn't seem like the best trade-off for the additional information it
    provides.

    2. I don't mind using /dev/null for deleted paths. The only additional
    argument that comes into my mind is that a Windows user will not know
    anything about /dev/null.

    While I agree, () doesn't strike me as particularly more informative.

    PGP SIGNATURE
    Version: GnuPG v1.4.2.2 (GNU/Linux)

    jHt+7W6sWxIYUhet0u0=
    =MP/c
    PGP SIGNATURE
  • No.6 | | 1631 bytes | |

    9/19/06, Mathias Weinert <wein (AT) mccw (DOT) dewrote:
    , so let's use
    file1 r5 Sun Sep 9 15:40:00 2001 (original)
    /dev/null 00:00:00 1970 (empty, because file is deleted)
    for deleted files.

    Putting all arguments together I suggest
    - to change the output of mailer.py's diff as already proposed in
    an earlier post
    - accept that options -T and -Z may not work properly (which doesn't
    make so much sense IMH because when patching a wc file it should get
    a new timestamp)

    I strongly disagree. -T should continue to work if the user wants it
    - especially when there's no legitimate reason for us to break the
    unified diff format here. Diff analysis tools will want to parse the
    date from the header info - this would break them for no particular
    reason.

    So, instead of:

    file2 r1 Sun Sep 9 01:46:40 2001

    (Patch seems to interpret 'r1' as 00:00 of the current day.)

    An acceptable compromise would be:

    file2 Sun Sep 9 01:46:40 2001 (r1)

    Put the revnum and anything else in parens after the datestamp; but it
    must be filename tab datestamp - whatever comes after the datestamp is
    free form text. But, we can not put random stuff in the middle of the
    unified diff format.

    FWIW, my suggestion works fine with GNU patch and -T - so it should be
    parsable by most diff analysis tools that work off mail archives. --
    justin

    To unsubscribe, e-mail: dev-unsubscribe (AT) subversion (DOT) tigris.org
    For additional commands, e-mail: dev-help (AT) subversion (DOT) tigris.org
  • No.7 | | 429 bytes | |

    Tue, 19 Sep 2006, Mathias Weinert wrote:
    , so let's use
    file1 r5 Sun Sep 9 15:40:00 2001 (original)
    /dev/null 00:00:00 1970 (empty, because file is deleted)

    Is the timestamp really necessary on the "/dev/null" line?

    (Alan Barrett)

    To unsubscribe, e-mail: dev-unsubscribe (AT) subversion (DOT) tigris.org
    For additional commands, e-mail: dev-help (AT) subversion (DOT) tigris.org
  • No.8 | | 1433 bytes | |

    I think the parenthesized text is much too verbose:

    [Mathias Weinert]
    /dev/null00:00:00 1970(empty, because file is newly added)
    dir1/file3Sun Sep 9 01:46:40 2001(r1)

    For added files, "empty" in place of "rNNN" already implies that the
    file is newly added.

    file2Sun Sep 9 15:40:00 2001(r5, original)
    /dev/null00:00:00 1970(empty, because file is deleted)

    I would abbreviate "(empty, because file is deleted)" as "(deleted)".

    dir2/file5Sun Sep 9 01:46:40 2001(r1, original)
    dir2/file5Sun Sep 9 04:33:20 2001(r2)

    For modified files, "original" is redundant. That is what the line
    _always_ means, and anyone who knows unified diff format _knows_ this.

    Now for file copies, we have a problem:

    file1Sun Sep 9 01:46:40 2001(r1, copy source)
    dir2/file7Sun Sep 9 07:20:00 2001(r3, unchanged copied file)

    Is this an empty diff (just a diff header and nothing else), or is it
    the full contents of the file? Either way, you cannot expect diff -R
    to do the right thing, unless file1 is labeled as /dev/null.

    file1Sun Sep 9 01:46:40 2001(r1, copy source)
    dir3/file8Sun Sep 9 10:06:40 2001(r4)

    This one is copy + modify. Is this supposed to show a diff between
    file1 and file8, or the full contents of file8?

    PGP SIGNATURE
    Version: GnuPG v1.4.5 (GNU/Linux)

    L0Y/TWLZCLz1L7h+MxWfvdY=
    =PBFS
    PGP SIGNATURE
  • No.9 | | 1501 bytes | |

    +1 on Justin's suggestion of moving the revision number into the
    parenthesized text. (I was going to suggest the same.)

    Tue, 19 Sep 2006, Peter Samuelson wrote:

    I think the parenthesized text is much too verbose:

    While there's certainly room to shorten the text, I don't dislike the
    explicitness of the messages. However, isn't this information
    redundant with the A/D/etc. tags which come in the meta data at the
    beginning of the change log notification email?

    Now for file copies, we have a problem:

    file1Sun Sep 9 01:46:40 2001(r1, copy source)
    dir2/file7Sun Sep 9 07:20:00 2001(r3, unchanged copied file)

    Is this an empty diff (just a diff header and nothing else), or is it
    the full contents of the file? Either way, you cannot expect diff -R
    to do the right thing, unless file1 is labeled as /dev/null.

    At first I thought that if we're printing this header, we should show
    the full content of the file in the diff. However, that wouldn't be
    orthogonal with what I'd want for the case below

    file1Sun Sep 9 01:46:40 2001(r1, copy source)
    dir3/file8Sun Sep 9 10:06:40 2001(r4)

    This one is copy + modify. Is this supposed to show a diff between
    file1 and file8, or the full contents of file8?

    I'd want to see file1 and file8.

    PGP SIGNATURE
    Version: GnuPG v1.4.2.2 (GNU/Linux)

    QfD20SkwN+HGamzKfG8mcaA=
    =D3zV
    PGP SIGNATURE
  • No.10 | | 2680 bytes | |

    [Mathias Weinert]
    I think the parenthesized text is much too verbose:

    The initial intention of my patch was exactly to show more verbose
    texts and information

    Right, but I can't believe anyone would think "(empty, because file
    is deleted)" is really clearer than "(deleted)".

    dir2/file5 Sun Sep 9 01:46:40 2001 (r1, original)
    dir2/file5 Sun Sep 9 04:33:20 2001 (r2)

    For modified files, "original" is redundant. That is what the line
    _always_ means, and anyone who knows unified diff format _knows_ this.

    Agreed. But as "original" is already part of the current mailer.py
    output I let it stand there.

    Well, you're changing the rest of the strings, I'd go ahead and change
    that one too.

    since revision 21310. Since this revision "copied" only reports
    files that were copied and not changed and shows the complete file
    (as if it was added). Before this revision such files weren't
    reported at all.

    I asked about this (the diff which is shown for a copy, and for a
    copy+modify) for two reasons: first, it is not obvious what the _right_
    behavior is ('patch' requires a whole file in both cases, but human
    readers would usually prefer an incremental diff, which is empty in the
    one case); second, the optimal format of the diff headers depends on
    exactly what is presented.

    This again makes me think about an option diff_style or something
    similar, althoguh I am not totally convinced that this is necessary.

    The copy, rename (copy+delete), copy+modify, rename+modify and delete
    cases all have different optimal behavior for someone proofreading the
    patch vs. someone applying it with 'patch'.

    This makes me wonder if it's finally time to formally specify a "udiff
    with metadata" format, to be generated and applied to working copies by
    some contrib/client-side/svndiff.{py,pl}. This format would:
    - represent all possible changes to a working copy, in a parseable way;
    - be as human-readable as possible;
    - be based on unified diff, which is the most readable line-based diff
    format I know of;
    - be usable by GNU 'patch' and compatibles, with the caveat that things
    that patch can't handle natively, like renaming, property changes,
    and their effects (svn:eol-style and svn:executable), will not work

    I don't know that binary files can be represented without compromising
    readability. base64 with variable line lengths?

    PGP SIGNATURE
    Version: GnuPG v1.4.5 (GNU/Linux)

    SBdm6XDaut8mI5SBk8HHDQ==
    =dfTI
    PGP SIGNATURE
  • No.11 | | 1482 bytes | |

    Wed, 20 Sep 2006, Mathias Weinert wrote:

    If you want to use the diffs as input for a patch program we should
    provide two diffs. is the copy as described above and one is the
    succeeding modify.

    Please, no

    the other hand as a human reader I will always prefer only one diff
    which only contains the changes after the copy action beacuse the
    information that this is a new file as a result of a copy I can find
    in the enhanced texts in brackets (or in the header of the diff or at
    the beginning of the message, as Daniel pointed out).

    +1

    So what do we have until now?
    - Some people like the more verbose texts and some don't.
    - Most people only use the diffs for human consumption while several
    use it as input for a patch program.

    This again makes me think about an option diff_style or something
    similar, althoguh I am not totally convinced that this is necessary.
    And BTW it won't help people who cannot set personal diff options as
    it is toady for recipients of *the* Subversion repository changes.

    The primary use cases for the change notification emails generated by
    this script are, in order of priority:

    1) Code review.
    2) Patch repository.

    #1 is the primary use case, with #2 being more of a bonus that we
    should try hard not to lose.

    PGP SIGNATURE
    Version: GnuPG v1.4.2.2 (GNU/Linux)

    3Bpf8g7InbkyBDdanic=
    =Q
    PGP SIGNATURE

Re: Show more information in diff labels in mailer.py


max 4000 letters.
Your nickname that display:
In order to stop the spam: 6 + 5 =
QUESTION ON "Development"

EMSDN.COM