Compilers

NAVIGATION
CATEGORIES
REFERRENCE
LINKS
  • Genetic programming and code evolution

    1 answers - 3084 bytes - related search similar search Add To My Delicious Add To My Stumble Upon Add To My Google Mark Add To My Facebook Add To My Digg Add To My Reddit

    23 Dec 2005 20:54:18 -0500, G Rosset wrote:
    I would like to create a genetic algorithm to modify C/C++ and improve
    it for either speed or space or a combination of both.
    This would be quite similar to the tool called Critticall, see
    http://www.critticall.com/
    I am skeptical about the page, if not to say more.
    From the machine learning perspective the way a genetic algorithm
    works is selection. If you want to optimize for some criterion (like
    speed) you need to measure it to be able to select the most promising
    mutations. Measuring program speed is a way too expensive for the
    incredible number of mutations you would need.
    The second problem is the genome language. You don't want to randomly
    modify hexadecimal code, because the number of outcomes is too
    big. You also don't want to randomly change the program source text,
    because most of mutations will not compile. So you believe that
    modification of an intermediate code would do better? It won't. Random
    mutations will not preserve program correctness and checking for
    correctness is absolutely unrealistic. So you need to limit all
    mutations to ones that guaranteed produce a semantically *equivalent*
    program.
    Now you have another problem, by applying such mutations one by one
    you will never be able to leave a local optimum. The tree distance
    between the current program and a somewhat better one is just too
    big. You need a large number of simultaneous mutations to reach it. It
    is statistically impossible. Furthermore, considering a program as a
    whole is hopeless. Programs are decomposed into smaller blocks, that
    actually lets their optimization while preserving their local
    semantics. But the decomposition itself is also a sufficient subject
    of optimization, and you have no chance to attack it using trees.
    This is not the way humans are optimizing programs. They make changes,
    which being translated into trees, are scattered all across the tree.
    People operate program semantics, which is not well captured by the
    tree.
    The bottom line is: you need a better genome language than syntactic
    trees, you need better selection criteria. Probably software patterns
    is the most close thing which comes in mind.
    q1) Do you know of any educational/free/open compiler which can compile
    to a intermediate code which can easily be represented in a tree
    structure and be modified by a genetic algorithm ?
    I have a parser which does semantic call backs (they can produce a
    tree or do anything else):
    I think you should definitely consider pre- and postconditions rather
    than raw code. You might also be interested in fuzzy methods, because
    it is close to the way humans take their decisions. As I said, direct
    measures are unrealistic, so you need some soft heuristic criteria
    instead. Fuzzy might be also a way to postpone a decision (which
    actually means that you keep both (or more) alternatives alive.)
    Good luck!
  • No.1 | | 758 bytes | |

    Dmitry A. Kazakov <mailbox@dmitry-kazakov.dewrote:
    >From the machine learning perspective the way a genetic algorithm
    >works is selection. If you want to optimize for some criterion (like
    >speed) you need to measure it to be able to select the most promising
    >mutations. Measuring program speed is a way too expensive for the
    >incredible number of mutations you would need.


    People *have* used genetic algorithms to do things like finding better
    image-processing techniques, evaluating results by measuring program
    speed. However, Dmitry is basically correct -- it's appallingly slow and
    expensive, verging on impractical, even with carefully-selected program
    representations.

Re: Genetic programming and code evolution


max 4000 letters.
Your nickname that display:
In order to stop the spam: 6 + 6 =
QUESTION ON "Compilers"

EMSDN.COM