Compression

  • Exponential Data Compression and Consciousness

    8 answers - 6509 bytes

    In February, George stated that a "data compression algorithm used
    today, consists of a number of sub-algorithms". I would agree, as
    algorithms of all forms consist of sub-algorithms. An algorithm
    ("al-Khwarizmi") is little more than a meticulous step-by-step way of
    boiling a complex process down until a silicon chip or other imbecile
    can understand it. This involves breaking the task into smaller ones.
    Matt Mahoney gave a very competent account of the various techniques
    used for data compression, and even showed Claude Elwood Shannon's
    logarithmic limit.
    Had Lempel, Ziv and Welch taken the agonisingly slow approach of
    Mohammed al-Khwarizmi ("Algebra"), where he discovered that "numbers are
    constructed out of parts, and can be broken down into their parts",
    etc, they might have found EXPONENTIAL compression.
    This discovery I have made. You can see it at:
    http://wehner.org/compress
    I cannot tell you what the compression rate will be. This is because it
    depends - as Shannon rightly said - upon the ENTROPY OF THE DATA.
    However, I can tell you that it is the BEST.
    That is because it is fundamental. I abided by the rule of Bertrand
    Russell that an open set cannot contain itself. I did not rush ahead,
    as with LZW in a GIF file, and arrange for the compression to begin at
    once - after the first byte. Instead, I arranged for TWO bytes (or in
    the language of Cantor, two "elements") to be transferred unchanged
    from input to output.
    Now the coding begins. These two bytes are coded under a new number
    each. That new number is its SET number. But each is just a set of ONE.
    It is therefore a PRIMITIVE set. However, each has the merit of being a
    CLOSED set.
    We now open a NEW set. We may have to make it primitive. On the other
    hand, if we are lucky, and the first pair of bytes repeat, we might
    make this new set into a COMPOUND set. A compound set always contains
    two sets.
    So we might get the Fibonacci sequence (1, 1, 2, 3, 5, 8, 13, 21, 34,
    55, 89), where each number represents the number of elements in a
    single coded set. For big numbers, this expands at Phi - the "GOLDEN
    SECTION", or 1.618 times per step.
    Or we might fall behind in our compression because nothing
    repeats. By the laws of probability, however, a non-repeat scenario
    cannot go on forever.
    If a large set of data suddenly repeats - as in a video frame - we find
    that the system now CATCHES UP by expanding at the BINARY rate. Two
    bytes become one set. Two sets of two become a superset of four. Two
    supersets become a super-duper-set of eight and so on.
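
    The two growth rates described above can be compared with a small
    sketch. It assumes that a "Fibonacci" step joins the two most recent
    sets and that a "binary" step joins two copies of the latest set; the
    function names are hypothetical and this is not the program from the
    author's page.

```python
def fibonacci_set_sizes(steps):
    """Sizes when each new compound set joins the two most recent sets."""
    sizes = [1, 1]  # two primitive sets of one element each
    while len(sizes) < steps:
        sizes.append(sizes[-1] + sizes[-2])
    return sizes[:steps]


def doubling_set_sizes(steps):
    """Sizes when each new compound set joins two copies of the latest set."""
    return [2 ** k for k in range(steps)]


fib = fibonacci_set_sizes(11)
dbl = doubling_set_sizes(11)
ratio = fib[-1] / fib[-2]  # tends towards Phi as the sequence grows

print(fib)
print(dbl)
print(round(ratio, 3))
```

    After the same number of steps the doubling sequence covers far more
    elements per code than the Fibonacci one, which is the "catching up"
    the post refers to.
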
    In a graphic, this new system - EXPONENTIAL COMPRESSION - does as GIF
    does, only better. It recognises repeating "poster paint" areas, such
    as a blue sky of constant colour, and compresses with exponentially
    improving efficiency. It thrives on abundant data. Unfortunately, CPUs
    do not. A search through gigabytes of source to match a string is an
    N-SQUARED algorithm. Special string-match chips will probably be needed
    if the best is to be achieved.
    When it comes to video, the SAME search for "newness" does what MPEG
    does. It does frame-by-frame comparison. So pixel-matching and
    frame-matching are both built-in to the system.
    In audio, we have more of the same. Repeating cycles of sound become
    coded better and better. However, the "entropy" of audio is detected
    only over long periods because sound is by nature a CHANGE in
    air-pressure. A tiny burst of sound will not compress well. A complete
    song of three minutes will.
    The minds of humans and animals "wake up" to alterations in the
    environment. So does the Exponential Compression algorithm. Indeed, it
    is "alert" to the tiniest change in the data.
    Polly the horse is a very wooden actress - because she is a wooden
    horse. You can see on my page how a tiny change in scale, brought about
    by the commencement of the zoom, causes the image (of "leading-edge
    differals") suddenly to portray the "nimbus" of a horse. Polly was
    "seen" because she moved.
    Chomsky's "Universal Grammar" (sets of data that have been labelled in
    readiness to receive a name) is also there. The machine detects a sine
    wave unaided, without knowing anything about sines - but the sines must
    be phase-coherent for this to happen.
    This then is the future.
    There are other aspects of compression. Degrading compression was
    touched upon, with the introduction of "laxity" of coding. This is
    because ALL sampling systems are lax. Sound, for example, is an
    "infinite-bit" data set with the Brownian motion of the atoms being
    part of the live sound. We cannot hear the Brownian motion, so eight,
    twelve, fourteen or sixteen bits might be used when the audio is
    mastered, and the Brownian motion is lost forever.
    Who cares? The "Golden Ear" self-styled experts may care, because they
    believe that they can hear the motion of the atoms as well as the noise
    caused by the mains plug not being gold-plated. Sensible people,
    however, know that there is a limit to human perception. So the system
    is made as lax as necessary to achieve good compression - but not lax
    enough for the mind to perceive any error.
    This brings in PSYCHOLOGY. Psychology is not psychiatry. It is the
    science of the working of the mind. As our minds generalise data, and
    discard the superfluous, our compression systems can do this. That is
    the basis of JPEG/MPEG. It is a psychological trick based upon the
    experience gained with coloured television.
    We never had colour television. It was always black-and-white that had
    been coloured-in with a smear of colour. The "Golden Ears" obviously
    had no golden eyes. I never heard anybody complain!
    I am still pondering the question of advances in the fields of
    degrading compression. For now, my Exponential Compression (nickname
    "Fibonacci") will have to suffice.
    I have not as yet finalised an Internet standard for this system. Part
    of that is due to the need for "streaming" and for the future provision
    of degrading extensions. I had also been working on a built-in language
    for animation.
    However, you will have learned that the future, if not yet here, is at
    least on its way.
    Charles Douglas Wehner
  • No.1 | | 1809 bytes | |

    Charles Douglas Wehner wrote:
    I cannot tell you what the compression rate will be. This is because it
    depends - as Shannon rightly said - upon the ENTROPY OF THE DATA.

    This is the sane bit. The INsane bits, however, make for very
    entertaining reading:

    (begin selective quoting)

    I abided by the rule of Bertrand Russell that an open set cannot
    contain itself. By the laws of probability, however, a non-repeat
    scenario cannot go on forever. Two bytes become one set. Two sets of
    two become a superset of four. Two supersets become a super-duper-set
    of eight and so on.

    The minds of humans and animals "wake up" to alterations in the
    environment. Polly the horse is a very wooden actress - because she is
    a wooden horse. Polly was "seen" because she moved.

    This then is the future. Who cares? The "Golden Ear" self-styled
    experts may care, because they believe that they can hear the motion of
    the atoms.

    This brings in PSYCHOLOGY. Psychology is not psychiatry. It is the
    science of the working of the mind. That is the basis of JPEG/MPEG. It
    is a psychological trick based upon the experience gained with coloured
    television. We never had colour television. It was always
    black-and-white that had been coloured-in with a smear of colour.

    The "Golden Ears" obviously had no golden eyes. I never heard anybody
    complain! However, you will have learned that the future, if not yet
    here, is at least on its way.

    (end selective quoting)

    It reads very much like the output of Racter. I conclude, therefore,
    that Charles Douglas Wehner is, in fact, a chunk of artificial
    intelligence that escaped from some underground government facility and
    is now running amok through USENET.

  • No.2 | | 738 bytes | |

    seems like another way of compressing repeated patterns, but using
    repeated symbol pairs rather than single symbols. If I understand
    correctly you are implying an index to a pair is smaller than two
    indexes to two single symbols. Probably.
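
    To make the "index to a pair" idea concrete, here is a minimal sketch
    of a greedy pair coder in which every dictionary entry joins two
    existing codes. It is an illustration under assumptions, not the
    algorithm from the original post: the names pair_encode and pair_expand
    are hypothetical, and the expander reuses the encoder's dictionary
    purely to check the round trip (a real decoder would have to rebuild
    it from the stream, which is not shown).

```python
def pair_encode(data):
    """Greedy pair coder: every dictionary entry is a pair of existing codes."""
    pairs = {}       # (left_code, right_code) -> compound code
    next_code = 256  # codes 0-255 stand for the raw byte values
    out = []
    for byte in data:
        out.append(byte)
        # Collapse the tail while its last two codes form a known pair.
        while len(out) >= 2 and (out[-2], out[-1]) in pairs:
            right = out.pop()
            left = out.pop()
            out.append(pairs[(left, right)])
        # The remaining tail pair is new: remember it for later repeats.
        if len(out) >= 2:
            pairs[(out[-2], out[-1])] = next_code
            next_code += 1
    return out, pairs


def pair_expand(codes, pairs):
    """Expand codes back to bytes using the encoder's own dictionary."""
    children = {code: pair for pair, code in pairs.items()}
    result = bytearray()
    stack = list(reversed(codes))
    while stack:
        code = stack.pop()
        if code < 256:
            result.append(code)
        else:
            left, right = children[code]
            stack.append(right)  # pushed first so that left is popped first
            stack.append(left)
    return bytes(result)


sample = b"ab" * 8
codes, pairs = pair_encode(sample)
print(len(sample), "bytes in,", len(codes), "codes out")
```

    Fewer codes does not automatically mean fewer bits, since each code is
    wider than a byte; whether the index to a pair is really cheaper
    depends on how the codes are packed.
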

    For streaming it might be better to interleave the stream of actual
    data values with the special index symbol 0-257 run through an order 1
    arithmetic coder, with the stream of relevant indexes permutationally
    compressed. a single multiplex bit located every 512 bytes in the
    stream decides insert of indexes with up to 256 counts of index. using
    current stream location with a move to front disymbol index strategy
    may be better.

    good luck.

  • No.3 | | 1768 bytes | |


    jackokring@yahoo.com schrieb:
    seems like another way of compressing repeated patterns, but using
    repeated symbol pairs rather than single symbols. If I understand
    correctly you are implying an index to a pair is smaller than two
    indexes to two single symbols. Probably.

    My system of exponential compression is essentially LZW with the error
    removed. That makes a world of difference.

    In the LZW system, one index is used to represent two - which is at the
    heart of the system. It WORKS! Not probably. It works, and is in use
    throughout the Internet.

    My own system works. The programs were written, worked, and were
    submitted to the patent office. Been there, done that.

    The working programs delivered two different coded data-streams from a
    256-colour bitmap image, which were displayed as compression strings
    for my system and for LZW in GIF.

    These images on the page http://wehner.org/compress show what is
    ACTUALLY happening - not theory.

    There were 1500 codes in my file, but 2130 in the LZW.

    For streaming it might be better to interleave the stream of actual
    data values with the special index symbol 0-257 run through an order 1
    arithmetic coder, with the stream of relevant indexes permutationally
    compressed. a single multiplex bit located every 512 bytes in the
    stream decides insert of indexes with up to 256 counts of index. using
    current stream location with a move to front disymbol index strategy
    may be better.

    I am simply spending time on data-streaming to ensure that the system
    becomes as "future-proof" as possible.

    good luck.

    I have luck. The system WORKS.

    Charles Douglas Wehner

  • No.4 | | 763 bytes | |

    On 12 May 2005 12:20:37 -0700, "Charles Douglas Wehner"
    <charleswehner@hotmail.com> wrote:
    -snip-

    >In the LZW system, one index is used to represent two - which is at the
    >heart of the system. It WORKS! Not probably. It works, and is in use
    >throughout the Internet.


    Could you give some examples of where it is used?

    >My own system works. The programs were written, worked, and were
    >submitted to the patent office. Been there, done that.

    -snip-
    Well, that being the case, could you post compression results on
    calgary corpus and www.maximumcompression.com reference files, so
    everybody can see exactly how well it performs?
    -SF

  • No.5 | | 1642 bytes | |


    Charles Douglas Wehner wrote:
    jackokring@yahoo.com schrieb:
    seems like another way of compressing repeated patterns, but using
    repeated symbol pairs rather than single symbols. If I understand
    correctly you are implying an index to a pair is smaller than two
    indexes to two single symbols. Probably.

    My system of exponential compression is essentially LZW with the error
    removed. That makes a world of difference.

    inefficiency yes, error no.

    In the LZW system, one index is used to represent two - which is at the
    heart of the system. It WORKS! Not probably. It works, and is in use
    throughout the Internet.

    it being lzw?

    My own system works. The programs were written, worked, and were
    submitted to the patent office. Been there, done that.

    ok. patent number might lead to text with less confusing description.

    The working programs delivered two different coded data-streams from a
    256-colour bitmap image, which were displayed as compression strings
    for my system and for LZW in GIF.

    ok.
    These images on the page http://wehner.org/compress show what is
    ACTUALLY happening - not theory.

    ok.
    There were 1500 codes in my file, but 2130 in the LZW.

    ok. how long per code out of interest?

    -snip-

    I am simply spending time on data-streaming to ensure that the system
    becomes as "future-proof" as possible.

    ok.
    --
    good luck.

    I have luck. The system WORKS.

    don't look a gift horse in the mouth.
    Charles Douglas Wehner

    Simon Jaxon.

  • No.6 | | 1348 bytes | |


    SuperFly wrote:
    >
    >In the LZW system, one index is used to represent two - which is at the
    >heart of the system. It WORKS! Not probably. It works, and is in use
    >throughout the Internet.
    >

    Could you give some examples of where it is used?

    I do not believe I have to give examples of where the GIF system works.

    There is nothing theoretical about a system that compresses by using
    pointers. GIF does that, so does my own system.


    >My own system works. The programs were written, worked, and were
    >submitted to the patent office. Been there, done that.
    >

    -snip-

    Well, that being the case, could you post compression results on
    calgary corpus and www.maximumcompression.com reference files, so
    everybody can see exactly how well it performs?

    Indeed, I would be DELIGHTED to. I think I tried already, but there was
    a problem at this Internet

    The EXPONENTIAL system I invented is fully described on my site at
    http://wehner.org/compress . It delivered the "Compression" image in
    70% of the space needed by GIF. However, as the object file follows
    Shannon's logarithmic law, it gets BETTER than that.

    Charles Douglas Wehner

  • No.7 | | 1448 bytes | |

    On 13 May 2005 10:39:24 -0700, "Charles Douglas Wehner"
    <charleswehner@hotmail.com> wrote:
    -snip

    >Could you give some examples of where it is used?
    >
    >I do not believe I have to give examples of where the GIF system works.


    My mistake, i thought you were talking about your own codec.
    -snip-

    >-snip-
    >>

    >Well, that being the case, could you post compression results on
    >calgary corpus and www.maximumcompression.com reference files, so
    >everybody can see exactly how well it performs?
    >>

    >
    >Indeed, I would be DELIGHTED to. I think I tried already, but there was
    >a problem at this Internet
    >
    >The EXPONENTIAL system I invented is fully described on my site at
    >http://wehner.org/compress . It delivered the "Compression" image in
    >70% of the space needed by GIF. However, as the object file follows
    >Shannon's logarithmic law, it gets BETTER than that.


    I don't think the 600 by 120 "compression" gif qualifies as a good
    test image. Try using "book1" or "pic" from calgary corpus or
    "rafale.bmp" from www.maximumcompression.com, and publish the
    compression results. That will give you (and the rest of the world) a
    good indication on how your codec performs.
    -SF

  • No.8 | | 4416 bytes | |

    SuperFly wrote:

    I don't think the 600 by 120 "compression" gif qualifies as a good
    test image. Try using "book1" or "pic" from calgary corpus or
    "rafale.bmp" from www.maximumcompression.com, and publish the
    compression results. That will give you (and the rest of the world) a
    good indication on how your codec performs.

    Fair comment.

    The problem is that I am very busy with another, equally important,
    project. I will keep the information on "book1", "pic" and "rafale.bmp"
    on file, and hope to be able to go back to it.

    The "compression drivers", however, are not finalised. It is too early.
    The invention has been made. Its properties have been evaluated by
    means of working programs. All that remains is for the industry
    standards to be laid down, so that all users will abide by the same
    system. GIF would not be GIF without the GIF 87 and 89a standards. JPEG
    would not be JPEG without the exact details being published for
    implementers to code into their drivers.

    For example, I quote 1500 codes for my compression with 2130 for GIF.
    If these are stored in 4-byte double-words they become 6000 and 8520
    bytes. However, with bit-packing they become smaller. My codes start at
    257, while GIF starts at 258. These are 9-bit numbers that continue to
    511. Then, 512 to 1023 are 10-bit numbers. 1024 to 2047 are 11-bit.

    My system, at 1500 codes, will stop at 11 bits. GIF will continue into
    the 12 bit region for a while. Add up all the bits, divide by eight and
    you get the bytes.
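
    The "add up all the bits" arithmetic can be sketched as follows. The
    sketch assumes that the dictionary gains one entry per emitted code, so
    that the k-th code is as wide as the index first_code + k, and that no
    clear codes are sent; real GIF streams may differ. The helper name
    packed_bytes is hypothetical.

```python
def packed_bytes(first_code, count):
    """Bytes needed if the k-th emitted code is as wide as index first_code + k."""
    total_bits = sum((first_code + k).bit_length() for k in range(count))
    return (total_bits + 7) // 8  # round up to whole bytes


for label, first_code, count in (("exponential", 257, 1500), ("GIF/LZW", 258, 2130)):
    unpacked = 4 * count  # 4-byte double-words, as in the post
    packed = packed_bytes(first_code, count)
    print(label, unpacked, "bytes unpacked,", packed, "bytes bit-packed")
```

    Under these assumptions the 1500-code stream never leaves the 11-bit
    range, while the 2130-code stream spends its last few hundred codes at
    12 bits.
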

    Bit-packing makes a lot of sense on the Internet with small files.
    However, for high-definition television with its megabytes of data the
    benefits are few. Firstly, the number of bits fades away
    logarithmically - eventually you are doubling up huge numbers before
    you ever get to the next bit-increase. Secondly, the CPU gets very busy
    shuffling the megabytes, and should not be wasted on bit-shifting.

    The generic exponential algorithm given at http://wehner.org/compress
    is to be seen not as a single product, but as the base algorithm for a
    series of new standards that are yet to come. These will be Internet
    GIF-style standards, high-definition television standards, and
    MP3-style audio standards. NEW standards - not imitations.

    The GIF-style graphics may use indexing for very small files. A palette
    of four colours costs 12 bytes. The image that follows might contain
    the bytes 0, 1, 2 and 3. These will compress well, and may save space.
    A huge 4-colour file might ignore the palette and simply use straight
    coding of R0G0B0 R1G1B1 R2G2B2 R3G3B3 as they appear - because of the
    automatic "concealed palette" of the system.

    The system is pointers to pointers to pointers &c. Indexing is pointers
    to a palette. Why close off options with a palette, when it costs very
    little to keep options open? You close off options, on a small file,
    when the "very little" is still a big PERCENTAGE of that file.

    The complications "to bit-pack or not to bit-pack", "to index or not to
    index" and so on mean that results cannot be considered to be
    representational of the performance of the final production-engineered
    DLL module.

    As I stated, the future is here - as a working prototype. However, it
    is not yet a DLL to slot into a browser, so it is not yet ready for a
    final test.

    I am very much in favour of standards. Sometimes they come from
    surprising sources. The "Flight Simulator" game was used for years as a
    test of CPU speed. The "minimum circle of least confusion" was used for
    lens-testing but was superseded by the "modulation transfer function".
    The efficiency of supercomputers was tested by a background program
    developed at Liverpool University for the Transputer.

    So when the DLLs arrive, the system will INDEED have to be tested
    against other DLLs that are doing the same job, on images that have the
    same ENTROPY (as described by Shannon). It is only when identical
    standard images are used that identical entropy can be assured.
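
    As a rough way to compare the entropy of two test files, the order-0
    Shannon entropy of their bytes can be computed as below. This is only a
    sketch: order-0 entropy ignores the repeated strings that dictionary
    coders exploit, so it is a crude proxy, and the function name
    byte_entropy is hypothetical.

```python
import math
from collections import Counter


def byte_entropy(data):
    """Order-0 Shannon entropy of a byte string, in bits per byte."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


flat_sky = b"\x10" * 1000      # one constant "poster paint" colour
mixed = bytes(range(256)) * 4  # every byte value equally often
print(byte_entropy(flat_sky), byte_entropy(mixed))
```

    Two images with very different scores on this measure would not make a
    fair like-for-like test of two codecs.
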

    My own "Compression" image is not an industry-standard test. It is my
    own. I accept the need for standard testing - and this will happen when
    the system is launched to the consumers.

    Charles Douglas Wehner
