Compression
 
Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
 
User Name:
Password:
Remember me
Go Back   Web Development Archives FAQs Compression

Reply
Add This Thread To:
  Del.icio.us   Digg   Google   Spurl   Blink   Furl   Simpy   Y! MyWeb 
Thread Tools Search this Thread Display Modes
 
Unread Web Development Archives Sponsor:
  #1  
Old July 4th, 2008, 06:10 PM
Einstein
Guest
Dev Archives Newbie (0 - 499 posts)
 
Posts: n/a  
Time spent in forums:
Reputation Power:
An analysis of the counting argument.

This is just to stir debate and talk, not to say 'eureka I solved it'
and post nonsense :P

The counting argument is used to justify the fact that it is
impossible to obtain recursive lossless compression. The counting
argument is simple in nature. Every possible outcome of 4 bits cannot
become every possible outcome of 3 bits, nor any other standard can be
applied. It is also known as the pigeonhole principle.

An example of this at work.
If you have 12 programs and this is the 'actual code' of those twelve
programs

000
001
010
100
011
101
110
111
00
01
10
11

Then you have issues if you are trying to compress downward these
programs. Under some systems some of our programs can compress, but do
to mathematical or statistical luck. Most however will usually be
uncompressible, or you need to incorporate loss into the system to be
able to continue.

There has been many efforts to bypass the counting argument. These all
require systems which add bits to something one could call 'a command
section' or 'identifying bits'. Ultimately each has failed.


However if there were a way, then these following to me would seem to
be the only ways possible to 'defeat' the counting system with one
'still out there' requirement.

1) Segments
You take a long section of code. You subdivide it into subsections.
Each section can be compressed or not compressed separately. Your
issue is effectively identifying the sections and what is happening
inside of them. Using the pigeonhole example, we have 16 pigeon holes.
If we make it 4 sets of four we can then possibly close several holes
in some sets and have say 12 holes in a 2,3,4,3 pattern but still have
a maximum possible 16 holes which may or may not have a pigeon inside
them. Now to those who study math effectively they may realize 16
holes is 65,536 outcomes. But if we could do this somehow properly
counting, this is 810,000 (19.627~ bits) outcomes possible provided
we leave at least 1 hole active per subgroup.

The problem is of course how do we keep track? In the above example I
added 3 bits (and change) to the size, but to count this I would need
to either know how many holes were closed, how many holes are present
per a section, or some other accounting of the whole. The lowest I can
devise costs 8 bits. There are two ways to accomplish the 8 bits that
I can find, and many ways that cost more.

2) Changing the 'make up' of the code
If you somehow could 'sort' the code via an algorithm and get a
certain percentage of 1's on one side, and 0's on another, you could
in theory compress. This was the main method of my earlier research,
and I though I had a magic bullet in it. I was wrong I fully admit it.
But it is an interesting line of thought.

You can effectively change the statistics of a section of code. It can
happen as my own research indicates. With hard-coded methods you can
get a deviation from norm as high as a 20%-22%-28%-30% (say this
example 11, 01, 10, 00) ratio of something 2 bits or higher instead of
a near typical 25%-25%-25%-25% (in approximation) ratio. However as
Huffman and Shannon correctly show, you need something a bit more to
be able to recursively compress. Not only that but code that is
subject to compression via standard methods can get compression from
those methods, but may result in a change that is not compressible via
our selected method. Therefore suddenly we need a small command
section, albeit a few bits. Then you need to add in what the changes
resulted in, which is more likely to compress (11 or 00?) and then you
again are past the amount which would lead to compression except in a
very low number of the outcomes (and on a scale mattering on the total
bits in the string being compressed) possible.


Now it seems certain lossless repeatable compression is not highly
probable, or even probably remotely possible to happen. However if it
were possible, then one of these two means (am I missing one or more
possible ways?) would seem to be the means to achieve this. However I
would note, even if it is achievable, there would be entropy
eventually, we cannot reach 1 bit from every possible formula in the
universe after all which is the final argument of the counting
argument :P

So thoughts?

Reply With Quote
  #2  
Old July 5th, 2008, 07:10 PM
Providence
Guest
Dev Archives Newbie (0 - 499 posts)
 
Posts: n/a  
Time spent in forums:
Reputation Power:
An analysis of the counting argument.

well you could utilize the fact that truly random data just doesn't
exist.

if you encoded the laws of physics, the designs of computer systems,
and human genomes you could definitely put a lower and upper bounds on
any data produced by a computer, and compress that way.



in a more diffuse way, what i aimed at was looking at the fact that
information quantizes at all.

take the string 100000000

in binary, it represents a high and eight lows.

in base ten, it's a hundred million.

in base twenty trillion and three, it quantizes the exact same, but
each of the digits has much more information inside it.


so, take a non-quantized unary ideal, ("one" in base 499534 for
example), cycle through all the different quantized representations
(499533 different examples) and you'll see that base 2 is actually the
worst case scenario.

you may find, for instance that in base 3240 the result of that unary
ideal is 57577757757775775. take out the 'entropy' which is actually
super, super easy to perceive in this method. it'd be
02022202202220220 with another information structure containing '5'.
each digit is now base 3 instead of base 3240, which is pretty vast.

extending this concept, *if* we could create custom molecules of the
exact quantization dimensions that relates to the found best scope, we
could avoid entropy entirely taking advantage of the lowest possible
information encoding energy state in the quantum domain. basically,
massaging the system so that entropy has no basis as a concept, as
there's just nothing left to optimize in the quantization of
information.



otherwise, you can simply count the probability of each possible scope
range occuring, and add them together as these are all the
probabilities for any given unary ideal.
seriously do it. you'll find that because binary is the worst-case
scenario, the likelyhood that the final representation will end up as
binary is less likely than any other result. then, it's just the
overflow proportion (say like 3^3 = 9 compared to 2^4 = 16) from the
compacted range + unary ideal + etc. that relates to whether or not
the full string is actually better to write as binary or as the
compacted range + unary ideal + etc.



as this problem is pretty much totally NP-Complete (save the highest
value digit comparisons, so you can have early outs from the scoping
algorithm), which means that in effect you have to examine each and
every scenario as the digital representation is so varied between even
individual scoped representation, the amount of sheer computing power,
even with GPU acceleration through CUDA and such, is inadequate. i
suspect that in about 10-20 years there will be enough compute
resources to apply this methodology.



chris.

Reply With Quote
  #3  
Old July 6th, 2008, 02:10 PM
Marco Al
Guest
Dev Archives Newbie (0 - 499 posts)
 
Posts: n/a  
Time spent in forums:
Reputation Power:
An analysis of the counting argument.

Einstein wrote:

There has been many efforts to bypass the counting argument. These all
require systems which add bits to something one could call 'a command
section' or 'identifying bits'.

No, they all require a fundamental lack of cohesive thinking. Unless you
believe basic math is internally inconsistent the only way to disbelieve
the counting argument is to be stupid or naive. Seeing as that you have
been here a while we can forget you are naive so either you are
stupid or you don't really disbelieve the counting argument.

My guess is the second.

So thoughts?

Welcome to my kill file.

Marco

Reply With Quote
Reply

Viewing: Web Development Archives FAQs Compression > An analysis of the counting argument.


Thread Tools  Search this Thread 
Search this Thread:

Advanced Search
Display Modes  Rate This Thread 
Rate This Thread:


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are Off
[IMG] code is On
HTML code is Off
View Your Warnings | New Posts | Latest Threads | Shoutbox
Forum Jump


Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
  
 





© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 2 hosted by Hostway