Compilers

NAVIGATION
CATEGORIES
REFERRENCE
LINKS
  • Definition of basic blocks

    5 answers - 912 bytes - related search similar search Add To My Delicious Add To My Stumble Upon Add To My Google Mark Add To My Facebook Add To My Digg Add To My Reddit

    Hi all,
    for my master's thesis I'm looking for the definition of basic blocks that
    are used in the compiler back end.
    What I actually want to know, is, if call instructions are treated like
    any other instruction or if they cause the end of a basic block.
    I've encountered both versions. Some people use call instructions amid
    a basic block, other use them at the end of a basic block and continue
    with the subsequent instructions in a new block.
    Are both version correct?
    Thank you for your answer.
    I would also appreciate if you could give my any references (paper, url )
    concerning that issue.
    Best regards,
    Christian
    [It entirely depends on your language and the goals of your optimizations.
    If a call can change variables visible in the block, you'd best break
    the block there. If not, no need. -John]
  • No.1 | | 943 bytes | |

    Christian Christmann <plfriko@yahoo.dewrote:
    # for my master's thesis I'm looking for the definition of basic blocks that
    # are used in the compiler back end.
    #
    # What I actually want to know, is, if call instructions are treated like
    # any other instruction or if they cause the end of a basic block.

    Each instruction in a basic block is a state transformer. A call
    can transform a lot more of the state than most instructions, but
    the overall effect is the same. Basic blocks are minimal single
    entry/single exit regions.

    # I've encountered both versions. Some people use call instructions amid
    # a basic block, other use them at the end of a basic block and continue
    # with the subsequent instructions in a new block.

    That might have to do with multiple return calls, or perhaps
    throw/catch returns.

    # Are both version correct?

    Whatever works is correct.
  • No.2 | | 1104 bytes | |

    Christian Christmann <plfriko@yahoo.dewrote:
    for my master's thesis I'm looking for the definition of basic blocks that
    are used in the compiler back end.

    Some early references I know include:

    @Article{allen70cfg,
    author = "F. E. Allen",
    title = "Control Flow Analysis",
    journal = "{ACM SIGPLAN} Notices",
    volume = "5",
    number = "7",
    pages = "1",
    year = "1970",
    online = "no paper",
    }

    If I recall correctly, there's a problem in this CFG definition, as it
    doesn't define the starting basic block of a control flow graph. In a
    later survey, there's a definition that cures this:

    @inbook{kennedy81survey,
    editor="S. S. Muchnick and N. D. Jones",
    author="K. Kennedy",
    title={Program Flow Analysis: Theory and Applications},
    chapter="1. A Survey of Data Flow analysis Techniques",
    publisher="Prentice Hall",
    location="New Jersey",
    year="1981",
    }

    Algorithm for generating basic blocks is avilable e.g. on page 528 of
    the Dragon book.

    br,
    Pietu Pohjalainen
  • No.3 | | 699 bytes | |

    Christian Christmann wrote:
    Are both version correct?

    I'd say so - some reasons:

    If you want to define as "a sequence of instructions whose order is
    independent of the data" then you need not split at call places, because
    you can easily "emulate" this by letting defs(call-instr) =
    caller-save-regs. This would mean fewer blocks and thus potentially
    faster analyses.

    TH, if you want more sophisticated analyses, e.g. inter-procedure
    register allocation, you probably do better making it an extra block,
    especially in presence of indirect (i.e. data-dependent) calls.

    Just some random thoughts Haven't done the latter, so far.
  • No.4 | | 2470 bytes | |

    Christian Christmann wrote:

    What I actually want to know, is, if call instructions are treated like
    any other instruction or if they cause the end of a basic block.
    I've encountered both versions. Some people use call instructions amid
    a basic block, other use them at the end of a basic block and continue
    with the subsequent instructions in a new block.

    Are both version correct?

    Depends on the language and semantics. If your language is strictly
    call-by-value and there's no "funny stuff" going on (such as
    continuations or implicit side effects) then you can take a call as
    part of a basic block.

    If it's call-by-reference, things are a little murkier; I defer to
    someone with superior knowledge.

    As our moderator pointed out, if there are implicit side effects -
    variables in scope that aren't passed explicitly to the function, but
    the function can modify them - you need to break the block there.

    If there is a chance that a call will cause an "unusual execution
    path" to be taken (eg, the routine being called or some member of the
    transitive closure of the set of routines *it* calls can execute a
    throw past the calling code, or a call to a reified continuation), you
    need to break the block there.

    If the language is fully recursive and the calling function is a
    member of the set of functions that is the transitive closure of
    functions called by the called function, then you need to break the
    block with the call.

    If the continuation (the return from the call) can be captured and
    reified in your language so that it can be stored in a variable and
    called like a routine any number of times, as in some lisps, you
    definitely need to break the block there, because in that case a call
    to the continuation is a "jump" to the code after the call that can be
    executed from anywhere the variable holding the continuation as a
    value is in scope.

    Hmmm, I'm trying to think of any other "funny stuff" that causes you
    to need to break a basic block with a call. There's probably more,
    some of it quite esoteric. F'r example, I'm pretty sure that
    call-by-name semantics, which are ridiculously rare in computer
    languages and generally considered to be hazardous by those who fully
    understand them, would require you to break a basic block with a call.

    Bear
  • No.5 | | 1811 bytes | |

    Thanks for your detailed contribution :-)

    Ray Dillinger wrote:

    What I actually want to know, is, if call instructions are treated like
    any other instruction or if they cause the end of a basic block.
    I've encountered both versions. Some people use call instructions amid
    a basic block, other use them at the end of a basic block and continue
    with the subsequent instructions in a new block.

    Are both version correct?

    Depends on the language and semantics. If your language is strictly
    call-by-value and there's no "funny stuff" going on (such as
    continuations or implicit side effects) then you can take a call as
    part of a basic block.

    If it's call-by-reference, things are a little murkier; I defer to
    someone with superior knowledge.

    As our moderator pointed out, if there are implicit side effects -
    variables in scope that aren't passed explicitly to the function, but
    the function can modify them - you need to break the block there.

    As long as a BB reflects (strictly sequential) control flow, only
    modifications of the point of (no) return can break this flow after an
    call. Calling conventions (for themselves) cannot influence the return
    address, also side effects become effective only in known places
    (conditional or indirect jumps). When exceptions or longjmp's can occur,
    then there exist more instructions besides calls and jumps, which can
    break the control flow.

    When data flow is of interest, things can become much more complicated,
    because any analysis should know about all the possible modifications.
    IM it's quite useless to know, in which places weird things can happen,
    without a formal description of these effects.

    DoDi

Re: Definition of basic blocks


max 4000 letters.
Your nickname that display:
In order to stop the spam: 3 + 2 =
QUESTION ON "Compilers"

EMSDN.COM