This page discusses how pictures are represented by a computer,
and how different representations can differ greatly in the amount
of data required.


They say a picture's worth 1000 words.

Well, it's also a few bits and bytes.

To the casual observer, a picture is just a picture.
You look at it, you laugh, cry, whatever ...

but for a computer, there's really no such thing as a picture.
Computers have to fake it.

To a computer, EVERYTHING is just data.
How it gets interpreted is up to some device (a screen
or an audio speaker, for example) and whoever
is using that device.

Let's consider pictures.

Your screen is divided into what are called "picture elements",
or pixels. They are tiny - just dots basically.

Each pixel is further subdivided into three parts:
red, green, and blue. Mixing these three in different strengths produces the other colors.
Hold a decent magnifying glass up to your screen and you can see them.

Then a picture consists of a lot of "pixel specs" -
three numbers that say (how much red, how much green, how much blue)
for each pixel.
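
As a rough sketch of the idea (not how any particular program stores things), here is a "pixel spec" in Python. The names Pixel, picture and count_numbers are made up for illustration, and the 0 to 255 range for each number is an assumption; the examples on this page use percentages instead.

```python
# A pixel spec: three numbers (how much red, how much green, how much blue).
# Assumption: each number is an integer from 0 (off) to 255 (full strength).
Pixel = tuple[int, int, int]

RED: Pixel = (255, 0, 0)
GREEN: Pixel = (0, 255, 0)
BLUE: Pixel = (0, 0, 255)
WHITE: Pixel = (255, 255, 255)

# A tiny 2 x 2 picture: a list of rows, each row a list of pixel specs.
picture: list[list[Pixel]] = [
    [RED, GREEN],
    [BLUE, WHITE],
]


def count_numbers(pic: list[list[Pixel]]) -> int:
    """How many numbers it takes to describe the picture pixel by pixel."""
    return sum(len(row) for row in pic) * 3


print(count_numbers(picture))  # 4 pixels x 3 numbers each = 12
```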

Let's look at some examples:

Here are some squares:
These are specified using just one of the three available base colors.
The percentages refer to how strong each of the 3 colors should be within each pixel.

 
100% red
100% green
100% blue

For darker versions, we can use 50% intensity:

 
50% red
50% green
50% blue

Let's try mixing some colors.
The numbers are percentages of red, green, blue:

 
50, 100, 50
100, 50, 50
50, 50, 100
50, 0, 100
50, 100, 0
0, 50, 100
0, 0, 0
50, 50, 50
100, 100, 100
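
To make the percentages concrete, here is a small Python sketch that turns a (red, green, blue) percentage triple into the kind of color code used on web pages. The function names percent_to_byte and to_hex are hypothetical, and the sketch assumes each color is stored as one number from 0 to 255.

```python
def percent_to_byte(percent: float) -> int:
    """Scale a 0..100 percentage to the assumed 0..255 range."""
    return round(percent * 255 / 100)


def to_hex(red_pct: float, green_pct: float, blue_pct: float) -> str:
    """Format three percentages as a web-style color code such as #80ff80."""
    values = (percent_to_byte(p) for p in (red_pct, green_pct, blue_pct))
    return "#{:02x}{:02x}{:02x}".format(*values)


print(to_hex(50, 100, 50))     # #80ff80
print(to_hex(0, 0, 0))         # #000000
print(to_hex(100, 100, 100))   # #ffffff
```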

In this way, by specifying the color of each pixel, you get a picture.

Let's say we want a small black square inside a white square.

If we represent a white pixel by "W" and a black one by "b", the "square" (as data) might be this:

WWWWWWW
WWWWWWW
WWbbbWW
WWbbbWW
WWbbbWW
WWWWWWW
WWWWWWW
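
The same grid can be turned into actual pixel specs with a few lines of Python. This is only a sketch: GRID, COLORS and the other names are made up for illustration, and it assumes one byte (0 to 255) per color.

```python
GRID = """\
WWWWWWW
WWWWWWW
WWbbbWW
WWbbbWW
WWbbbWW
WWWWWWW
WWWWWWW
"""

# Assumed pixel specs for the two letters: (red, green, blue), 0..255 each.
COLORS = {"W": (255, 255, 255), "b": (0, 0, 0)}

rows = GRID.split()
pixels = [[COLORS[letter] for letter in row] for row in rows]

pixel_count = sum(len(row) for row in pixels)
raw_bytes = pixel_count * 3  # three numbers per pixel, one byte each

print(pixel_count)  # 49
print(raw_bytes)    # 147
```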

However, we don't need to specify all 49 pixels individually.

We could say instead:
"Two lines of all white": 2 x (7 W)
"Three lines of white-black-white": 3 x (2 W 3 b 2 W)
"Two lines of all white": 2 x (7 W)

That would be a much more efficient way of doing it.
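
This trick of counting repeats is usually called run-length encoding. Here is a minimal Python sketch of it, applied first within each line and then to the repeated lines. The names run_lengths and decode are made up for illustration, and real file formats use more elaborate schemes than this.

```python
from itertools import groupby

ROWS = [
    "WWWWWWW",
    "WWWWWWW",
    "WWbbbWW",
    "WWbbbWW",
    "WWbbbWW",
    "WWWWWWW",
    "WWWWWWW",
]


def run_lengths(items):
    """Collapse consecutive repeats into (count, value) pairs."""
    return [(len(list(group)), value) for value, group in groupby(items)]


# Compress within each line, then compress the repeated lines.
encoded = run_lengths(tuple(run_lengths(row)) for row in ROWS)
for repeat, runs in encoded:
    print(repeat, "x", runs)


def decode(encoded_rows):
    """Rebuild the original lines to show that nothing was lost."""
    rows = []
    for repeat, runs in encoded_rows:
        row = "".join(symbol * count for count, symbol in runs)
        rows.extend([row] * repeat)
    return rows


assert decode(encoded) == ROWS
```

The encoded form holds 3 groups of lines and 5 runs in total, instead of 49 separate pixels.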

In practice, that's an oversimplification, and there are many different ways of doing the representation.

Here are a few of the most common.

BMP: (Bitmap)
This is the least efficient way - typically each pixel is specified one by one.
PNG: (Portable Network Graphics)
This format is good when there are just a few colors in the picture, for example large areas of a single color.
JPG: (also written JPEG, for Joint Photographic Experts Group - so the "P" stands for "Photographic")
This format is good when there are many colors in the picture, such as in a photograph.
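
To get a feel for the "one by one" cost, here is a Python sketch that estimates the size of an uncompressed BMP file. It assumes the common 24-bit layout: 3 bytes per pixel, a 54-byte header, and each row padded up to a multiple of 4 bytes. Other BMP variants exist, and the function name bmp_size is made up for illustration.

```python
def bmp_size(width: int, height: int,
             bytes_per_pixel: int = 3, header_bytes: int = 54) -> int:
    """Estimate the file size of an uncompressed 24-bit BMP (assumed layout)."""
    # Each row is padded up to the next multiple of 4 bytes.
    row_bytes = (width * bytes_per_pixel + 3) // 4 * 4
    return header_bytes + row_bytes * height


print(bmp_size(7, 7))        # 222 bytes for the little 7 x 7 square
print(bmp_size(1000, 1000))  # 3000054 bytes, whatever the picture shows
```

Notice that the answer depends only on the width and height, not on what is in the picture.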

The results for each - for the same "picture" - can be very different
in terms of the size of the file containing it.



Just a few colors:
PNG file: 6,168 bytes ... JPG file: 35,490 bytes



Many colors:
PNG file: 385,951 bytes ... JPG file: 46,601 bytes

In the first instance, the JPG holds about 6 times as much data as the PNG.
But in the second instance, it is the reverse: the PNG is about 8 times as large as the JPG.
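
The ratios can be checked with a few lines of Python, using the byte counts quoted above (the variable names are just for illustration):

```python
few_colors = {"PNG": 6_168, "JPG": 35_490}
many_colors = {"PNG": 385_951, "JPG": 46_601}

jpg_vs_png = few_colors["JPG"] / few_colors["PNG"]
png_vs_jpg = many_colors["PNG"] / many_colors["JPG"]

print(round(jpg_vs_png, 1))  # 5.8, so "about 6 times"
print(round(png_vs_jpg, 1))  # 8.3, so "about 8 times"
```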

These days, when we have terabyte disk drives and gigabit network speeds,
the difference usually isn't noticeable (except in extreme cases),
but for an old-timer like me, who came up in the days of scarce resources,
it's a painful waste of time and space.