Development

  • New: unformatted files from gfortran are incompatible with g77 unformatted files and solar

    8 answers - 603 bytes

    I ran into a problem with unformatted files with gfortran. It appears that it is
    padding the delimiters to 8-byte boundaries (or the delimiters are 64-bit
    longs) rather than using the normal 4-byte delimiter approach.
    For instance, I have a program that does this:
    write(1) 1
    end
    g77 and Solaris f95 return (ignoring the byte-swapping issues):
    od -x fort.1:
    0000000 0004 0000 0001 0000 0004 0000
    gfortran returns:
    0000000 0004 0000 0000 0000 0001 0000 0004 0000
    0000020 0000 0000
    Is there a flag to make the delimiters 4 bytes rather than 8 bytes?
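
The two dumps above can be compared with a minimal sketch like the following. It assumes a little-endian host (so `od -x` words such as `0004 0000` correspond to the bytes `04 00 00 00`), and the names `G77_FILE`, `GFORTRAN_FILE` and `read_records` are hypothetical, introduced only for illustration. It is not code from g77 or gfortran.

```python
import struct

# Byte images reconstructed from the `od -x` dumps above, assuming a
# little-endian host (od -x prints 16-bit words in host byte order).
G77_FILE = bytes.fromhex("04000000" "01000000" "04000000")
GFORTRAN_FILE = bytes.fromhex("0400000000000000" "01000000" "0400000000000000")


def read_records(data, marker_size, byteorder="<"):
    """Split a sequential unformatted file image into record payloads.

    marker_size is 4 (the g77 layout) or 8 (the gfortran layout in this report).
    """
    fmt = byteorder + {4: "i", 8: "q"}[marker_size]
    records = []
    pos = 0
    while pos < len(data):
        # Leading marker: payload length in bytes.
        (length,) = struct.unpack_from(fmt, data, pos)
        pos += marker_size
        payload = data[pos:pos + length]
        pos += length
        # Trailing marker: must repeat the same length.
        (trailer,) = struct.unpack_from(fmt, data, pos)
        pos += marker_size
        if trailer != length or len(payload) != length:
            raise ValueError("leading and trailing record markers disagree")
        records.append(payload)
    return records
```

Both images carry the same 4-byte payload (the integer 1); only the width of the length markers around it differs, which is why a reader built for one layout cannot parse the other.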
  • No.1 | | 1054 bytes | |

    Additional Comments From rrr6399 at futuretek dot com 2005-09-11 04:07
    I just found this discussion:

    From the docs, it doesn't look like it has been implemented in the main line yet.
    Is it available some other way?

    It seems to me that unformatted files on 32 bit machines should be compatible.
    I don't know of any other fortran compiler that assumes 64 bit record markers.
    We really need a flag to change the behavior if it is not available already.

    BTW, it is pretty typical that aero engineers involved in CFD (computational
    fluid dynamics) need to ship around large (50-1000 MB) binary files between
    various machines without having to reformat them. Typically they generate files
    that are all big-endian using a compiler switch to avoid having to byte swap as
    well. So, while we're at it, it'd be great to have a compiler switch that
    reversed the byte order of integers (2, 4, 8 byte) and floating point numbers
    (4, 8 byte) when they are read from or written to an unformatted file.
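
The byte-order reversal requested above can be sketched as follows. This is only an illustration of the idea, not a gfortran option or library routine; `swap_elements` is a hypothetical helper, and it assumes the caller already knows the element size of each record, since an unformatted file does not store type information.

```python
import struct


def swap_elements(payload, element_size):
    """Reverse the byte order of each fixed-size element in a record payload."""
    if element_size not in (2, 4, 8):
        raise ValueError("element_size must be 2, 4 or 8")
    if len(payload) % element_size:
        raise ValueError("payload length is not a multiple of element_size")
    return b"".join(
        payload[i:i + element_size][::-1]
        for i in range(0, len(payload), element_size)
    )


# Example: two little-endian doubles become two big-endian doubles.
little = struct.pack("<2d", 1.5, -2.0)
big = swap_elements(little, 8)
```

A complete converter would also have to swap the record markers themselves, because they are integers stored in the writer's byte order.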
  • No.2 | | 789 bytes | |

    Additional Comments From jblomqvi at cc dot hut dot fi 2005-09-11 12:09
    Bud Davis is back and working on the pluggable record markers patch. Expect it
    to be completed and committed within a few weeks.

    There is no simple solution that is right for all situations. Gfortran uses
    64-bit record markers by default since we want compatibility between LP32 and
    LP64 bit platforms (which incidentally g77 doesn't provide), and we want to
    support records bigger than 2 GB.

    There has been some discussion about a byteswapio patch, but nothing has been
    done. Patches are welcome, of course.

    And, I would hardly classify the bug as "critical".

    If you want portable binary io you're probably better off using a library such
    as netcdf anyway.
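
The 2 GB limit mentioned above follows from the marker width. A minimal sketch of the arithmetic, assuming the marker is stored as a signed integer (an assumption for illustration; the thread does not state the signedness), with hypothetical helper names:

```python
import struct


def max_record_bytes(marker_size):
    """Largest payload length a signed marker of marker_size bytes can hold."""
    return 2 ** (8 * marker_size - 1) - 1


def fits(length, marker_size):
    """Return True if `length` can be packed into a signed marker of this size."""
    code = {4: "i", 8: "q"}[marker_size]
    try:
        struct.pack("<" + code, length)
    except (struct.error, OverflowError):
        return False
    return True
```

Under that assumption a 4-byte marker tops out at 2**31 - 1 bytes, just under 2 GiB, while an 8-byte marker leaves room for far larger records.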
  • No.3 | | 972 bytes | |

    Additional Comments From rrr6399 at futuretek dot com 2005-09-11 13:24
    I believe it really is critical, since I and many others who may use
    gfortran need to interoperate with data generated by legacy codes on the same
    system that were compiled with g77, or on other systems (Sun, SGI) compiled with
    their native f77 or f90/f95 compilers. Some of the codes are proprietary and many
    are from third parties, so it isn't really feasible to force them to use
    another binary file library. Plus, I've been working with unformatted FORTRAN
    files for 15 years and this is the first time I've had this type of issue with
    the structure of an unformatted file.

    So given the current situation, I'll have to write a format converter for
    unformatted data from any gfortran code to the "standard" g77 format in order to
    interoperate.

    I would've thought that the FORTRAN spec would've covered this kind of thing.
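
A format converter of the kind described above could look roughly like this. It is a sketch under stated assumptions, not the poster's program: it assumes every record is framed by a leading and trailing length marker of the same width, that the whole file fits in memory, and that the byte order is little-endian unless told otherwise. `convert_markers` is a hypothetical name.

```python
import struct


def convert_markers(data, src_size=8, dst_size=4, byteorder="<"):
    """Rewrite record markers from src_size bytes to dst_size bytes."""
    codes = {4: "i", 8: "q"}
    src_fmt = byteorder + codes[src_size]
    dst_fmt = byteorder + codes[dst_size]
    out = bytearray()
    pos = 0
    while pos < len(data):
        (length,) = struct.unpack_from(src_fmt, data, pos)
        if length < 0:
            raise ValueError("negative record length at offset %d" % pos)
        start = pos + src_size
        payload = data[start:start + length]
        (trailer,) = struct.unpack_from(src_fmt, data, start + length)
        if trailer != length:
            raise ValueError("record markers disagree at offset %d" % pos)
        # struct.pack raises struct.error if the length does not fit dst_size.
        marker = struct.pack(dst_fmt, length)
        out += marker + payload + marker
        pos = start + length + src_size
    return bytes(out)
```

Payload bytes are copied unchanged; only the framing is rewritten. For the 50-1000 MB files mentioned earlier, a streaming version that reads one record at a time would be the more practical design.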
  • No.4 | | 328 bytes | |

    Additional Comments From pinskia at gcc dot gnu dot org 2005-09-11 14:27
    More than that, this is a dup of bug 19303. Unformatted I/O was never supposed to be used with different
    versions of the compiler, or across different targets. It is just like using
    write in C.

    This bug has been marked as a duplicate of 19303
  • No.5 | | 3353 bytes | |

    Additional Comments From rrr6399 at futuretek dot com 2005-09-11 20:47
    Well, just to warn you, you're going to have a lot of steamed engineers on your
    hands when they discover that they either have to recompile all of their FORTRAN
    codes on every platform with gfortran, write all of their data (50 - 1000MB
    binaries and larger) as ASCII, or convert their files to the traditional g77
    format to interoperate with the rest of their processes. What will actually
    happen is that users will get burnt once and then they'll drop gfortran like a
    hot potato and put in a request to purchase a commercial compiler from Intel or PGI.

    Another reason that this feature is so painful is that engineers tend to
    pipeline their unformatted files from one process to the next. There can
    literally be 10 or more programs in a process that will read from or write to a
    given unformatted file. Any program in the process that was compiled with
    gfortran will break the process and probably in such a subtle way that it'll
    take each user hours to figure out what went wrong. (I spent four hours
    on it yesterday discovering what the problem was in my process and that's with
    knowing how to read the output from "od".)

    Furthermore, when one writes binary in C, you get exactly what your variables
    are sized to in your code. If the platform is a 32-bit machine and is IEEE
    compliant, you pretty much know that a short is 16 bits, an int is 32 bits, a long
    is 32 bits and a long long is 64 bits. Developers often
    define macros or new types that guarantee that the variables are the same
    lengths independent of 32-bit or 64-bit architectures. There are often compiler
    switches that govern the length of the various primitive types. So if
    portability of the data is important to you, the resulting binary file is
    interoperable except for big-endian versus little-endian issues (which can be
    worked around with a flag at the top or by always writing in one endianness). With C binary
    files, of course, you don't have to worry about the silly record markers
    either that muck up the works.

    I think the goal of allowing record lengths >2GB is a good long-term target,
    but having been in the field for many years, I imagine that the current use
    cases for record lengths >2GB are very few compared to those involving
    interoperability with other compilers and platforms. The few users requiring
    >2GB record lengths can easily modify their write statements to output multiple
    records rather than one large one.

    Hopefully, once there is a compiler switch in place, everybody will be happy. :-)

    p.s. It is too bad the FORTRAN spec (even the 2003) threw in the towel
    on interoperable binary files. It forces everybody to deal with these issues
    in different ways using various third party libraries or ad hoc cobbled-together
    solutions. As shown by the CORBA standard and others, the specification of
    interoperable binary files is completely doable.

    I imagine the FORTRAN vendors will continue to ensure that their binary file
    format can be completely specified by the user just to meet the needs of their
    customers even though the spec doesn't force them to.
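
The suggestion above, writing several smaller records instead of one record larger than 2 GB, can be sketched as follows. This is an illustration only: `write_chunked` is a hypothetical helper, the framing assumes 4-byte little-endian markers, and a program reading the file would need a matching sequence of reads.

```python
import struct


def write_chunked(payload, max_record=2 ** 31 - 1, marker_size=4, byteorder="<"):
    """Frame `payload` as one or more records, each at most max_record bytes."""
    fmt = byteorder + {4: "i", 8: "q"}[marker_size]
    out = bytearray()
    for start in range(0, len(payload), max_record):
        chunk = payload[start:start + max_record]
        marker = struct.pack(fmt, len(chunk))
        # Each record is: leading marker, data, trailing marker.
        out += marker + chunk + marker
    return bytes(out)


# Small demonstration: 10 bytes split into records of at most 4 bytes.
demo = write_chunked(bytes(range(10)), max_record=4)
```

The trade-off is that the record boundaries become part of the file format, so writer and reader must agree on the chunk size or the reader must loop over records until it has all the data.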
  • No.6 | | 786 bytes | |

    Additional Comments From kargl at gcc dot gnu dot org 2005-09-11 22:05
    (In reply to comment #5)

    > Furthermore, when one writes binary in C, you get exactly what your variables
    > are sized to in your code. If the platform is a 32 bit machine and is IEEE
    > compliant,

    What happens when one or the other of these conditions isn't met?

    > I imagine the FORTRAN vendors

    The correct spelling of the name of the language is Fortran.

    Your comments #3 and #5 are nice little rants. Actual code to fix
    the problem speaks volumes over your rants. In particular, you've
    been told that Bud Davis is working on the problem. If there was
    an easy solution to the problem, Bud (or one of the others working on
    gfortran) would have fixed it long ago.
  • No.7 | | 1952 bytes | |

    Additional Comments From rrr6399 at futuretek dot com 2005-09-11 23:22
    I'm not sure why I'm getting so much pushback on this silly thing.

    I realize that disagreeing with the assumptions made during the design may be
    regarded by some as "rants", but what I was attempting to do (perhaps poorly) is
    illustrate why simple decisions that might seem fairly benign can have huge
    efficiency impacts on a large population of users. There has been a pattern of
    these decisions made over the years that have wasted thousands (if not millions)
    of hours of people's precious time. (Big Endian vs. Little Endian, \ versus /,
    CR vs CR/LF vs LF, 8 byte vs 4 byte markers, etc.)

    If you read some of the previous comments, you'll see that some don't think it's
    an issue. It really is a problem that should take high priority. I know Bud is
    going to apply a variation of the patch he wrote a few months ago soon and I'm
    happy about that. I hope there isn't any pushback from the rest of the
    developers. I think the default should actually be 4 byte markers, but that's
    just my humble opinion.

    BTW, I think both spellings of FORTRAN (FORmula TRANslation)
    are correct actually:

    (Not that it really matters in the big scheme of things.)

    I'll also post a small C program to convert to the g77 format soon as a
    temporary fix until the patch is in place. (I'm completely hammered with
    work right now, but I'll try to contribute more in the future. I've already
    sent in some code snippets on the little endian/big endian issue.)

    Also, if I wanted to be condescended to I'd go talk to my wife. :-)
    I hope that we can all keep this professional in the future and
    respect people's time (development, trouble shooting and bug reporting)
    that they put into this to help make a better product for
    everybody.
  • No.8 | | 3240 bytes | |

    Additional Comments From kargl at gcc dot gnu dot org 2005-09-12 00:35
    (In reply to comment #7)

    > I realize that disagreeing with the assumptions made during the design may be
    > regarded by some as "rants", but what I was attempting to do (perhaps poorly) is
    > illustrate why simple decisions that might seem fairly benign can have huge
    > efficiency impacts on a large population of users.

    Why do you think that this was a "simple decision" in the initial design?
    The world is moving to 64-bit CPUs, and a 32-bit record marker affects
    performance (think about alignment issues). Bud has thought about this
    problem for several months, produced a plausible patch, and then Real Life
    got into his way. A fix to this problem takes time. There is no simple solution.

    > If you read some of the previous comments, you'll see that some don't think it's
    > an issue. It really is a problem that should take high priority.

    This isn't pushback but reality. There are only a handful of
    volunteers hacking on the code. What is a high priority to you
    may not be very high on some hacker's lists. To me, fixing the
    known bugs in modules is much higher priority than changing a
    functioning portion of the compiler.

    > I know Bud is going to apply a variation of the patch he wrote a
    > few months ago soon and I'm happy about that. I hope there isn't
    > any pushback from the rest of the developers.

    I doubt that there will be pushback. Yes, we will review the code
    and make suggestions. But, most of the developers will welcome Bud's
    effort.

    > I think the default should actually be 4 byte markers, but that's
    > just my humble opinion.

    I only use Opteron-based systems, where a 64-bit marker is preferred.

    > BTW, I think both spellings of FORTRAN (FORmula TRANslation)
    > are correct actually:
    >
    > (Not that it really matters in the big scheme of things.)

    Read the Standard. It very carefully uses "FORTRAN 77" to identify
    specific references to IS 1539:1980. Indeed, the passage in 1.6
    says "Each Fortran International Standard since IS 1539:1980 (informally
    referred to as FORTRAN 77)". Note, "ORTRAN" actually appears in small
    caps. Everywhere else the Standard carefully uses Fortran.

    > I'll also post a small C program to convert to the g77 format soon as a
    > temporary fix until the patch is in place.

    Thanks.

    > (I'm completely hammered with
    > work right now, but I'll try to contribute more in the future. I've already
    > sent in some code snippets on the little endian/big endian issue.)

    So, you can appreciate the demands on the developers. :-)
    I would love to devote several hours a week to gfortran, but
    time is occupied by Real Life.

    > I hope that we can all keep this professional in the future and
    > respect people's time (development, trouble shooting and bug reporting)
    > that they put into this to help make a better product for
    > everybody.

    Sorry if my comment appeared to be too strong, but your Comment #3 and
    #5 appeared to be "preaching to the choir". We know there's a problem.
    Bud is working on it.
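
The alignment point raised in this comment can be illustrated with a little offset arithmetic. The sketch below only computes where each record's payload starts in the file for a given marker width; whether that placement matters for performance on a particular system is the commenter's claim and is not measured here. The helper names are hypothetical, and the example assumes record lengths that are multiples of 8 bytes.

```python
def payload_offsets(record_lengths, marker_size):
    """Return the file offset at which each record's payload begins."""
    offsets = []
    pos = 0
    for length in record_lengths:
        pos += marker_size            # skip the leading marker
        offsets.append(pos)
        pos += length + marker_size   # skip the payload and trailing marker
    return offsets


def all_aligned(offsets, alignment=8):
    """Return True if every offset is a multiple of `alignment`."""
    return all(off % alignment == 0 for off in offsets)


# Two records of 16 bytes each (for example, two 8-byte reals per record).
with_4_byte_markers = payload_offsets([16, 16], 4)
with_8_byte_markers = payload_offsets([16, 16], 8)
```

With 4-byte markers the payloads in this example start at offsets that are not multiples of 8, while with 8-byte markers they are.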
