C/C++

  • C Interpreter and sizeof operator

    12 answers - 746 bytes

    If one were writing a C interpreter, is there anything in the
    standard that requires the sizeof operator to yield the same value for
    two different variables of the same type?
    Let's assume that the interpreter does conform to the range values
    for, say, type int, but allocates storage for the variables based
    on their value. So, for two variables foo and bar
    int foo = 0; /* interpreter allocates two bytes */
    int bar = 200000000; /* interpreter allocates four bytes */
    Does the standard require that sizeof foo == sizeof bar thereby
    making this allocation scheme broken, unless hidden in some way?
    Or is it perfectly acceptable for the sizeof operator to yield different
    results?
    Regards,
  • No.1 | | 1409 bytes | |


    "ozbear" <ozbear@bigpond.com> wrote in message
    news:438cdcef.9338968@news-server
    If one were writing a C interpreter, is there anything in the
    standard that requires the sizeof operator to yield the same value for
    two different variables of the same type?

    Yes.

    Let's assume that the interpreter does conform to the range values
    for, say, type int, but allocates storage for the variables based
    on their value. So, for two variables foo and bar

    int foo = 0; /* interpreter allocates two bytes */
    int bar = 200000000; /* interpreter allocates four bytes */

    Note that type 'int' is only required to have a max value of 32767
    (but is allowed to be larger).

    Also note that the size of a byte (type 'char') must be at least 8
    bits (but is allowed to be larger).
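
    The two minimum limits above can be checked on any given implementation
    with <limits.h>. This is only a sketch; the function names
    meets_minimum_limits and print_limits are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Returns 1 if the implementation meets the minimum limits mentioned above:
   at least 8 bits per byte and an int that can hold at least 32767. */
int meets_minimum_limits(void)
{
    return CHAR_BIT >= 8 && INT_MAX >= 32767;
}

/* Prints the limits this particular implementation actually uses. */
void print_limits(void)
{
    assert(meets_minimum_limits());
    printf("CHAR_BIT    = %d\n", CHAR_BIT);
    printf("INT_MAX     = %d\n", INT_MAX);
    printf("sizeof(int) = %zu\n", sizeof(int));
}
```

    Calling print_limits() from main shows the values chosen by your
    compiler; they may be larger than the minimums, but never smaller.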

    Does the standard require that sizeof foo == sizeof bar

    Yes. Each type must have a specific size. In other words, in your
    example, the possible ranges of 'foo' and 'bar' must be
    the same.

    thereby
    making this allocation scheme broken, unless hidden in some way?
    Or is it perfectly acceptable for the sizeof operator to yield different
    results?

    No, not for objects of the same type. Why would you want to do this
    anyway? What if I later in the code wanted to write:
    foo = 10000;
    -Mike

  • No.2 | | 2045 bytes | |

    ozbear wrote:
    If one were writing a C interpreter, is there anything in the
    standard that requires the sizeof operator to yield the same value for
    two different variables of the same type?

    Yes. (I know this because I looked up the exact same thing lately.) If x and
    y have the same type, sizeof x = sizeof y must hold without exception.

    6.5.3.4: "The sizeof operator yields the size (in bytes) of its operand,
    which may be an expression or the parenthesized name of a type. The size is
    determined from the type of the operand. The result is an integer. If the
    type of the operand is a variable length array type, the operand is
    evaluated; otherwise, the operand is not evaluated and the result is an
    integer constant."

    Note that the *value* of the operand is irrelevant; only its *type* matters.
    Therefore the size of an object may *not* depend on its value, but only on
    its type. If x and y are of type 'int', then sizeof x = sizeof(int) and
    sizeof y = sizeof(int), therefore sizeof x = sizeof y. Bad Things happen if
    you violate this.
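
    A minimal sketch of what that paragraph of the standard implies, for
    operands that are not variable length arrays. The function names are
    invented for this example, and INT_MAX stands in for "some large value".

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Returns 1 if two ints holding very different values have the same size. */
int sizes_match(void)
{
    int foo = 0;
    int bar = INT_MAX;
    return sizeof foo == sizeof bar && sizeof foo == sizeof(int);
}

/* sizeof does not evaluate a non-VLA operand, so n keeps its value. */
int operand_not_evaluated(void)
{
    int n = 5;
    size_t size = sizeof(n++);   /* n++ is never executed */
    assert(size == sizeof(int));
    return n;                    /* still 5 */
}
```

    Since the operand is not even evaluated, there is no value for the
    result to depend on; only the type is available to sizeof.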

    Let's assume that the interpreter does conform to the range values
    for, say, type int, but allocates storage for the variables based
    on their value. So, for two variables foo and bar

    int foo = 0; /* interpreter allocates two bytes */
    int bar = 200000000; /* interpreter allocates four bytes */

    Does the standard require that sizeof foo == sizeof bar thereby
    making this allocation scheme broken, unless hidden in some way?

    Yes.

    You can try and fudge this in cases where the application cannot possibly
    trip over the wrong size, but it's tricky to do this without violating the
    standard, and in an interpreter it's very unlikely to be of any value. Just
    pick a constant size for integers (4 bytes happens to be a very common one).

    Or is it perfectly acceptable for the sizeof operator to yield different
    results?

    No, it's not.

    S.
  • No.3 | | 1449 bytes | |

    In article <438cdcef.9338968@news-server>, ozbear@bigpond.com (ozbear)
    wrote:

    If one were writing a C interpreter, is there anything in the
    standard that requires the sizeof operator to yield the same value for
    two different variables of the same type?

    Let's assume that the interpreter does conform to the range values
    for, say, type int, but allocates storage for the variables based
    on their value. So, for two variables foo and bar

    int foo = 0; /* interpreter allocates two bytes */
    int bar = 200000000; /* interpreter allocates four bytes */

    Does the standard require that sizeof foo == sizeof bar thereby
    making this allocation scheme broken, unless hidden in some way?
    Or is it perfectly acceptable for the sizeof operator to yield different
    results?

    sizeof (foo) must be equal to sizeof (bar).

    memcpy (&foo, &bar, sizeof (foo)) must have exactly the same effect as a
    simple assignment foo = bar. And that assignment must work, so your
    interpreter must change the memory allocated to foo from 2 bytes to 4
    bytes.

    If I write a function

    void f (int* p, int value) { *p = value; }

    then calling

    f (&foo, 2000000000)

    must work. Basically, everything must work as guaranteed by the C
    Standard. As long as the interpreter makes sure that everything works as
    it should, it is free to do whatever it likes.
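
    Both requirements can be written down as a small test that any
    conforming implementation, interpreter or not, has to pass. This is a
    sketch only: f is the function from the post, the two checking functions
    are hypothetical names, and INT_MAX is used so the value fits in int on
    every implementation.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* The function from the post: stores a value through a pointer to int. */
void f(int *p, int value) { *p = value; }

/* Returns 1 if a store through a pointer can put any int value into foo,
   even though foo started out holding a small value. */
int store_through_pointer(void)
{
    int foo = 0;
    f(&foo, INT_MAX);
    return foo == INT_MAX;
}

/* Returns 1 if copying sizeof foo bytes from bar gives foo bar's value. */
int copy_with_memcpy(void)
{
    int foo = 0;
    int bar = INT_MAX;
    assert(sizeof foo == sizeof bar);
    memcpy(&foo, &bar, sizeof foo);
    return foo == bar;
}
```

    The callee f cannot know how much storage the caller "really" gave foo,
    which is why a value-dependent size has to stay invisible to the program.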
  • No.4 | | 787 bytes | |

    In article <438ce9a7$0$11070$e4fe514c@news.xs4all.nl>,
    Skarmander <invalid@dontmailme.com> wrote:

    You can try and fudge this in cases where the application cannot possibly
    trip over the wrong size, but it's tricky to do this without violating the
    standard, and in an interpreter it's very unlikely to be of any value. Just
    pick a constant size for integers (4 bytes happens to be a very common one).

    So if I write

    long long i;
    char array [100];

    for (i = 0; i < 100; ++i) array [i] = i;

    the compiler is free to use 1, 2, 3, 4 or any other number of bytes for
    i, because it doesn't make any difference to the code. The compiler is
    allowed to cheat, as long as you cannot detect that it cheats.
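
    Here is the same loop wrapped in a function (the name fill is
    hypothetical) together with the things a program can actually observe.
    However the implementation stores i internally, these observations have
    to come out the same.

```c
#include <assert.h>
#include <stddef.h>

/* Fills the array as in the post and returns what sizeof reports for i. */
size_t fill(char array[100])
{
    long long i;

    for (i = 0; i < 100; ++i)
        array[i] = (char)i;   /* 0..99 fits in any char */

    assert(i == 100);         /* observable: final value of i */
    return sizeof i;          /* observable: always sizeof(long long) */
}
```

    The implementation may keep i in a one-byte register or remove it
    entirely, but sizeof i and the array contents must not reveal that.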
  • No.5 | | 1276 bytes | |

    Christian Bau wrote:
    In article <438ce9a7$0$11070$e4fe514c@news.xs4all.nl>,
    Skarmander <invalid@dontmailme.com> wrote:
    >
    >You can try and fudge this in cases where the application cannot possibly
    >trip over the wrong size, but it's tricky to do this without violating the
    >standard, and in an interpreter it's very unlikely to be of any value. Just
    >pick a constant size for integers (4 bytes happens to be a very common one).
    >

    So if I write

    long long i;
    char array [100];

    for (i = 0; i < 100; ++i) array [i] = i;

    the compiler is free to use 1, 2, 3, 4 or any other number of bytes for
    i, because it doesn't make any difference to the code. The compiler is
    allowed to cheat, as long as you cannot detect that it cheats.

    Yes. The compiler is even free not to use any bytes for i at all and emit a
    fully initialized array in the object file somewhere, if this is the only
    piece of code that assigns to 'array'.

    The burden is on the compiler writer to make sure these
    optimizations are always safe, but as long as programs have the required
    semantics, the standard couldn't care less.

    S.
  • No.6 | | 528 bytes | |

    Christian Bau wrote:

    memcpy (&foo, &bar, sizeof (foo)) must have exactly the same effect
    as a simple assignment foo = bar.

    Not exactly. The assignment may be done on an element by element basis,
    which would not copy the padding bytes.

    For the valid data itself, the two structs will be the same of course, but
    a bitwise comparison would not necessarily be the same.

    That is:

    memcmp(&foo, &bar, sizeof foo)

    need not evaluate to 0 following foo = bar.
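
    A sketch of the struct case, assuming a struct whose layout is likely
    (but not guaranteed) to contain padding after the char member. The type
    struct pair and both function names are invented for this example.

```c
#include <assert.h>
#include <string.h>

struct pair {
    char tag;     /* padding bytes typically follow this member */
    int  value;
};

/* Compares member by member, which ignores any padding bytes. */
int pair_equal(const struct pair *a, const struct pair *b)
{
    return a->tag == b->tag && a->value == b->value;
}

/* Assigns *src to *dst. The members are then guaranteed to match, but the
   returned memcmp result is not guaranteed to be 0, because the assignment
   is allowed to leave the padding bytes of *dst alone. */
int assign_and_compare(struct pair *dst, const struct pair *src)
{
    *dst = *src;
    assert(pair_equal(dst, src));
    return memcmp(dst, src, sizeof *dst);
}
```

    Most compilers copy the whole object, so the result is usually 0 in
    practice; the point is only that portable code cannot rely on it.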

    Brian

  • No.7 | | 808 bytes | |

    Note: Brian snipped the declaration of foo and bar.
    int foo = 0;
    int bar = 200000000;

    Brian (Default User) wrote:
    Christian Bau wrote:
    >>memcpy (&foo, &bar, sizeof (foo)) must have exactly the same effect
    >>as a simple assignment foo = bar.

    >

    Not exactly. The assignment may be done on an element by element basis,
    which would not copy the padding bytes.

    For the valid data itself, the two structs will be the same of course, but
    a bitwise comparison would not necessarily be the same.

    That is:

    memcmp(&foo, &bar, sizeof foo)

    need not evaluate to 0 following foo = bar.

    What if, as in this case, foo and bar are both ints? Must memcmp
    evaluate to zero following foo = bar?
  • No.8 | | 1281 bytes | |

    Simon Biber wrote:
    Note: Brian snipped the declaration of foo and bar.
    int foo = 0;
    int bar = 200000000;

    Brian (Default User) wrote:
    >Christian Bau wrote:

    memcpy (&foo, &bar, sizeof (foo)) must have exactly the same effect
    as a simple assignment foo = bar.
    >>

    >Not exactly. The assignment may be done on an element by element basis,
    >which would not copy the padding bytes.
    >>

    >For the valid data itself, the two structs will be the same of course, but
    >a bitwise comparison would not necessarily be the same.
    >>

    >That is:
    >>

    >memcmp(&foo, &bar, sizeof foo)
    >>

    >need not evaluate to 0 following foo = bar.
    >

    What if, as in this case, foo and bar are both ints? Must memcmp
    evaluate to zero following foo = bar?

    No. ints can have padding bits. Assignment need not make these padding bits
    equal, since all that is required is that the "value" of 'bar' is copied to
    'foo'. Values may have multiple representations.

    S.
  • No.9 | | 1439 bytes | |

    Skarmander wrote:

    Simon Biber wrote:
    Note: Brian snipped the declaration of foo and bar.
    int foo = 0;
    int bar = 200000000;

    Brian (Default User) wrote:
    >Christian Bau wrote:

    memcpy (&foo, &bar, sizeof (foo)) must have exactly the same effect
    as a simple assignment foo = bar.
    >>

    >Not exactly. The assignment may be done on an element by element basis,
    >which would not copy the padding bytes.
    >>

    >For the valid data itself, the two structs will be the same of course, but
    >a bitwise comparison would not necessarily be the same.
    >>

    >That is:
    >>

    >memcmp(&foo, &bar, sizeof foo)
    >>

    >need not evaluate to 0 following foo = bar.
    >

    What if, as in this case, foo and bar are both ints? Must memcmp
    evaluate to zero following foo = bar?
    --
    No. ints can have padding bits. Assignment need not make these padding bits
    equal, since all that is required is that the "value" of 'bar' is copied to
    'foo'. Values may have multiple representations.

    But, on systems that have padding bits, would copying them have any effect?
    Does overwriting any padding bytes in structs cause any "bad" behavior?
  • No.10 | | 1920 bytes | |

    Kenneth Brody wrote:
    Skarmander wrote:
    >Simon Biber wrote:

    Note: Brian snipped the declaration of foo and bar.
    int foo = 0;
    int bar = 200000000;

    Brian (Default User) wrote:
    Christian Bau wrote:
    memcpy (&foo, &bar, sizeof (foo)) must have exactly the same effect
    as a simple assignment foo = bar.
    Not exactly. The assignment may be done on an element by element basis,
    which would not copy the padding bytes.

    For the valid data itself, the two structs will be the same of course, but
    a bitwise comparison would not necessarily be the same.

    That is:

    memcmp(&foo, &bar, sizeof foo)

    need not evaluate to 0 following foo = bar.
    What if, as in this case, foo and bar are both ints? Must memcmp
    evaluate to zero following foo = bar?

    >No. ints can have padding bits. Assignment need not make these padding bits
    >equal, since all that is required is that the "value" of 'bar' is copied to
    >'foo'. Values may have multiple representations.
    >

    But, on systems that have padding bits, would copying them have any effect?
    Does overwriting any padding bytes in structs cause any "bad" behavior?

    That's a completely different matter. No, it would not, since padding bytes
    in structs are never used for anything (other than meeting alignment
    requirements) and can never cause trap representations like padding bits in
    integers can. And even in integers, if "foo = bar" copies the padding bits
    of 'bar' to 'foo' as well, that's not just perfectly legal, it's what you'd
    expect most implementations to do.

    The question was whether assignment and memcpy() did the same thing. They
    don't. memcpy() gives you stronger guarantees than assignment.
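
    The stronger guarantee can be sketched like this (the function name is
    hypothetical): memcpy copies every byte of the object representation, so
    a following memcmp over the same range has to report equality.

```c
#include <assert.h>
#include <string.h>

/* Copies the object representation of *src into *dst and returns the
   result of comparing the two objects byte by byte. */
int copy_bytes_and_compare(int *dst, const int *src)
{
    memcpy(dst, src, sizeof *dst);
    assert(*dst == *src);                   /* the value was copied too */
    return memcmp(dst, src, sizeof *dst);   /* 0: every byte matches */
}
```

    After plain assignment only the first of these two facts, equal values,
    is promised by the standard.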

    S.
  • No.11 | | 223 bytes | |

    Simon Biber wrote:
    Note: Brian snipped the declaration of foo and bar.
    int foo = 0;
    int bar = 200000000;
    Guess I did.
    It wasn't my intention to muddy the waters.
    Brian
  • No.12 | | 1834 bytes | |

    Kenneth Brody wrote:
    Skarmander wrote:
    >Simon Biber wrote:

    Note: Brian snipped the declaration of foo and bar.
    int foo = 0;
    int bar = 200000000;

    Brian (Default User) wrote:
    Christian Bau wrote:
    memcpy (&foo, &bar, sizeof (foo)) must have exactly the same effect
    as a simple assignment foo = bar.
    Not exactly. The assignment may be done on an element by element basis,
    which would not copy the padding bytes.

    For the valid data itself, the two structs will be the same of course, but
    a bitwise comparison would not necessarily be the same.

    That is:

    memcmp(&foo, &bar, sizeof foo)

    need not evaluate to 0 following foo = bar.
    What if, as in this case, foo and bar are both ints? Must memcmp
    evaluate to zero following foo = bar?

    >No. ints can have padding bits. Assignment need not make these padding bits
    >equal, since all that is required is that the "value" of 'bar' is copied to
    >'foo'. Values may have multiple representations.
    >

    But, on systems that have padding bits, would copying them have any effect?
    Does overwriting any padding bytes in structs cause any "bad" behavior?

    I believe that doing a memcpy between two objects of identical type is
    always OK whether or not there are padding bits.

    However, even in the absence of padding bits in an int, doing foo = bar
    *may* have a different effect from doing memcpy(&foo, &bar, sizeof foo)
    and lead to memcmp(&foo, &bar, sizeof foo) returning a non-zero result.
    The reason being that 1s complement and sign-magnitude implementations
    are allowed (but not required) to have -0 as valid as long as a -0
    compares equal to +0 (apart from using memcmp, obviously).
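
    Integer negative zero is hard to demonstrate, since two's complement
    machines do not have one. As an analogy only, floating-point zero shows
    the same "equal value, different bytes" effect, assuming IEEE 754
    doubles (which the C standard does not require). The function names
    below are made up for this sketch.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the two doubles compare equal as values. */
int same_value(double a, double b)
{
    return a == b;
}

/* Returns 1 if the two doubles have identical object representations. */
int same_bytes(double a, double b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}

/* +0.0 and -0.0 are equal as values. On IEEE 754 systems their sign bits
   differ, so same_bytes(pos, neg) would return 0 there. */
void check_zeros(void)
{
    double pos = 0.0;
    double neg = -0.0;
    assert(same_value(pos, neg));
    assert(same_bytes(pos, pos));
}
```

    An int on a ones' complement or sign-magnitude implementation could
    behave the same way for +0 and -0, which is the case described above.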
