Anatomy of a digital image

The concept of the digital image has been around since the early days of computing. Back in the 1960s, getting a computer to display even a small picture took incredible resources. Nowadays, we are so overwhelmed with digital images that we barely notice them. From pictures on websites and photos from digital cameras to the interfaces on mobile phones, digital imaging helps us to interact more intuitively with technology, as well as providing perfect copies of pictures that can be transmitted across the world almost instantly, or saved onto disks for future use.

A typical digital image

But how exactly do these images work? How are they put together, and what are the different types?

Picture Elements

All information, or data, that can be processed by computer technology is binary. This means that if you were to look at a piece of computer data in its simplest form, it could be described as a combination of ones and zeroes, for example “10110010”. Given the correct context, the computer knows how to interpret this stream of numbers, whether it is a digital image, a spreadsheet, or even a piece of software code. The combinations form the building blocks of any type of data.
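A quick sketch in Python illustrates the point about context: the same eight bits can be read as a number or as a character, depending on how they are interpreted.

```python
# The byte "10110010" from the text, interpreted two ways.
bits = "10110010"

value = int(bits, 2)   # read as an unsigned number
print(value)           # 178

# Read as a character code (Latin-1), the same bits mean something else.
print(chr(value))
```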

Digital images are made up of a number of building blocks known as “pixels”, which is short for “picture elements”. These are tiny squares which are a solid colour. Put enough of them together, and you have a picture, in the same way a detailed mosaic is made from many small ceramic tiles.
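The mosaic idea can be sketched in Python: a tiny, hypothetical black-and-white image is just a grid of numbers, one per pixel.

```python
# A 4x4 "image" of pixels: 0 = black, 1 = white.
# Rows of pixels stack together to form the picture, like mosaic tiles.
image = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]

# Draw it in the terminal: '#' for black pixels, '.' for white.
for row in image:
    print("".join("#" if pixel == 0 else "." for pixel in row))
```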

Close-up, individual pixels can be seen.

Even photography works in a similar way. Look at a photograph under a microscope, and you can see the individual elements, or “grains”, that form the bigger picture.

Because the grains in photographs are randomly shaped, they are harder to distinguish than pixels of the same size. Pixels have to be regularly shaped so that they are easy to display. Not only do images have square pixels, but most viewing devices, such as computer monitors, do as well. For a digital image to be displayed correctly, one pixel of the image should exactly fit one pixel on the display.


The more pixels there are in an image, the higher its resolution, and the more data is stored in the image. The advantage is that you can have more detail in the image. The disadvantage is that you end up with a bigger file, or chunk of data, which takes longer to transmit, is slower to display and work with, and uses more disk space when you save it.

If the resolution of an image is too low, aliasing can happen.

Aliasing can be seen along a curve

Aliasing is where curved or diagonal edges in an image appear jagged. Because pixels are just squares, it is easy to accurately draw horizontal or vertical straight lines just by stacking them next to each other, but when you try to draw a curved or diagonal line using too few pixels, the corners stick out, making the curve appear to have steps. One way to reduce this effect is to have a larger resolution, so that more pixels can be used to represent the edge.

This is the same image at a higher resolution. Note that the edge appears smoother

An alternative method is known as anti-aliasing. Using this method, pixels on edges have their colours altered to give the impression of a smoother line.

The lower-resolution image with anti-aliasing
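One common way of computing those in-between edge colours is supersampling: each display pixel is sampled at several sub-positions, and the fraction that falls inside the shape sets its grey level. This is a simplified one-dimensional sketch, with the edge position chosen arbitrarily for illustration.

```python
# Anti-aliasing by supersampling a vertical edge at x = 2.5.
# Each pixel takes a grey level equal to its coverage by the shape.
def coverage(px, samples=4):
    # Fraction of sub-pixel sample positions that fall inside the shape.
    inside = 0
    for s in range(samples):
        x = px + (s + 0.5) / samples   # evenly spaced sub-samples
        if x <= 2.5:
            inside += 1
    return inside / samples

# Pixels well inside are fully on, the edge pixel gets a half tone.
print([coverage(px) for px in range(5)])   # [1.0, 1.0, 0.5, 0.0, 0.0]
```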


Every single pixel can be given a different colour. The simplest image would only allow a choice of black or white. This may be necessary, for example for LCD displays on mobile phones, which can only display pixels as black or white. While this may not seem like much, with an image of a high resolution, it is possible to get a fairly detailed black-and-white image.

This image uses a combination of about 27,500 black and white pixels

Most displays, for example computer monitors, can have a much higher colour range. Most black-and-white digital images (also called greyscale images) are capable of 256 shades of grey, including black and white. This number comes from binary maths, using 8 binary digits (bits) per pixel to store colour information. This is referred to as the bit-depth of a digital image. A higher bit depth allows for a greater tonal range, but again at the cost of a larger file. An image with a 10-bit per pixel depth would allow 1024 shades of grey, but would result in a file that is 25% bigger.
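The numbers in the paragraph above come straight from binary arithmetic, and can be checked with a couple of lines of Python.

```python
# Number of distinct levels available at a given bit depth: 2 ** bits.
for bits in (1, 8, 10):
    print(bits, "bits per pixel ->", 2 ** bits, "levels")

# Relative file size: 10 bits per pixel stores 10/8 = 1.25 times the
# data of 8 bits per pixel, i.e. a file 25% bigger.
print(10 / 8)   # 1.25
```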

This image has a bit depth of 8 bits per pixel

If the bit-depth of an image is too low, quantisation, or “banding”, can happen.

Banding occurs in this image with a low bit-depth

Banding is where “steps” can be seen between areas of smooth shading. The steps are caused by the colour difference between two adjacent shades of grey (or “greyscale levels”) being too far apart. Having a larger bit-depth can reduce this effect.
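Banding is easy to reproduce in code. This sketch quantises a smooth 256-level grey ramp down to just 4 levels, collapsing the gradual shading into a handful of flat bands.

```python
# Quantise an 8-bit grey level (0-255) down to a small number of levels.
def quantise(level, levels=4):
    step = 256 // levels          # width of each band
    return (level // step) * step

# A smooth ramp of 256 shades collapses into only 4 distinct values,
# with a visible "step" at each boundary.
ramp = [quantise(x) for x in range(256)]
print(sorted(set(ramp)))   # [0, 64, 128, 192]
```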

An alternative method is to use a process called “dithering”. Dithering is where pixels are scattered slightly to soften the apparent step between levels.

Image showing how dithering can reduce the apparent effect of banding
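A rough sketch of the idea, assuming an 8-bit input quantised down to 4 levels: adding a little random noise before quantising scatters pixels near a level boundary between the two adjacent shades, so the hard step is broken up.

```python
import random

# Random dithering: jitter each level slightly before quantising, so
# pixels near a band boundary land randomly on either side of it.
def dither_quantise(level, levels=4, rng=random.Random(0)):
    step = 256 // levels
    noisy = level + rng.uniform(-step / 2, step / 2)
    noisy = min(255, max(0, noisy))           # keep within 8-bit range
    return (int(noisy) // step) * step

# A mid-grey right on a boundary comes out as a mix of the two
# neighbouring levels rather than a single hard band.
print({dither_quantise(128) for _ in range(20)})
```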


In painting, you can make new colours by mixing different primary colours together. For example, mixing yellow and blue paint will make green. This is known as the subtractive method of colour mixing, as the more colours you mix in, the closer you get to black. To produce colour images on a monitor, different amounts of red, green and blue light are mixed together. This is based on the additive system of colour mixing, where the more colours you mix together, the closer you get to white. Any colour can be made by mixing different quantities of red, green, and blue.

In a colour digital image, there are a number of “channels”. Typically, there is a red, green, and blue channel. Within each channel there is a range of possible colour values, just as there is for a greyscale image. In a sense, there are three separate greyscale images which are mixed together to make a colour image.

A red, green, and blue channel are combined to make a colour image
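The idea of combining three greyscale channels can be sketched directly: each channel holds one value per pixel, and zipping them together yields one (R, G, B) triple per pixel.

```python
# Three single-channel "images", each holding one value per pixel.
red_channel   = [255, 0,   0]
green_channel = [0,   255, 0]
blue_channel  = [0,   0,   255]

# Combine them into one colour image of (R, G, B) triples per pixel.
colour_image = list(zip(red_channel, green_channel, blue_channel))
print(colour_image)   # [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
```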


It is also possible for an image to have more channels. The most common fourth channel is known as the “alpha” channel, and is usually used for saving transparency information. This may be useful in an image with several “layers”.

A layered image is one which contains several separate images combined together. This is to allow for superimposing pictures, animation, or for special effects. Whilst each layer is a separate image, they typically share qualities such as having the same resolution, bit-depth and number of channels.


So far, we’ve mainly used computer monitors to explain how digital images work. However, there are many different ways of using a digital image other than displaying it on a monitor. Digital images can be printed onto paper, put on video, recorded by lasers onto photographic film, or even just analysed by machines without ever being looked at.

Each of these different methods understands colour in a different way. For example, there are many colours that can be seen on projected film that cannot be displayed on video. The range of colours a system can represent is known as its “colour space” or “gamut”. Most of the time, the gamuts of different systems will overlap, so that they can all show the same colours. An image displayed on a monitor, for example, will look exactly like the colour printout. However, where areas of colour do not match up, the colours are said to be “out of gamut” and may look wrong.

Even the human eye has its own gamut. For example, some video cameras can see infra-red or ultra-violet, which is outside of the gamut of the human eye.

In order to cope with this, digital images are often optimised for a specific colour space. This means that the images may be interpreted a certain way, depending on their application. For example, the common 3-channel RGB colour image discussed above aims to be suitable for viewing on most computer monitors. It uses a “linear” scale, meaning that a pixel greyscale level of 100 will be twice as bright as a level of 50. Printing houses may adopt the 4-channel CMYK colour space, which is interpreted in a similar way to the cyan, magenta, yellow, and black inks used for printing on paper.

Cyan, magenta, yellow and black channels combine to make a CMYK colour-space image

Images for photographic film typically use a 3-channel RGB colour space, but using a logarithmic scale. This is because film responds to light differently at the extremes of colour.

There are even more colour spaces available. Graphic artists may use HLS, which has a channel for the hue (colour), one for lightness (brightness of the colour), and one for saturation (purity of the colour). Colour scientists use LAB, where there is a channel for the luminance, and two for colour.


A further complication arises when you consider that viewing devices and viewing conditions vary. The way an image is displayed on your monitor may not match exactly how it is displayed on someone else’s. Usually, the difference is subtle, but for accurate reproduction of an image, it must be viewed on a calibrated device.

The most common way of doing this is using gamma calibration. Gamma is a specific scale that combines both contrast and brightness. You calibrate your viewing device for a specific point on the scale. Once this is done, you are able to view any digital image with accurate brightness and contrast levels. Many digital images also have additional gamma information encoded into them, to help make the translation more accurate.
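A simplified sketch of the maths involved: gamma encoding raises each normalised pixel level to a power. Real calibration pipelines are more involved, but the 2.2 used here is a common display gamma value.

```python
# Gamma-encode an 8-bit grey level: normalise to 0.0-1.0, raise to the
# power 1/gamma, and scale back to 0-255. A gamma of 2.2 is a common
# value for computer displays.
def apply_gamma(level, gamma=2.2):
    normalised = level / 255
    corrected = normalised ** (1 / gamma)
    return round(corrected * 255)

print(apply_gamma(0))     # 0: black stays black
print(apply_gamma(255))   # 255: white stays white
print(apply_gamma(64))    # mid-tones are lifted
```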

Further to this, many displays also have an option to set the colour temperature. This effectively sets the “white point”, which is the level where the RGB levels on the display mix to make pure white. This can help to ensure that the actual colour balance of an image is viewed correctly. The most common use of colour temperature calibration is to compensate for different lighting conditions. For example, under fluorescent lighting, the colour white may appear slightly blue-green.


An alternative to using pixels in digital images is to use “vectors”. Vectors are basically mathematical representations of shapes, such as rectangles and circles. They can be filled or transparent, and can overlap each other. The advantage of vectors is that they are “resolution-independent”, meaning that you can zoom into or out of them without problems such as aliasing. They typically require less information to store, resulting in smaller file sizes. The disadvantage is that vectors cannot easily represent the complex detail of a photograph or pixel-based image. You couldn’t, for example, make a vector image by scanning a photo. It is important to remember that even a vector image must be “rasterised” (converted to pixels) before it can be displayed. There are several vector-based imaging programs available, such as CorelDRAW.

A vector image

Even zoomed into the vector image, the sharp edge is retained
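Rasterisation can be sketched in a few lines: a circle is stored only as a centre and radius (the vector description), and pixels are generated from that formula at whatever resolution is requested.

```python
# Rasterise a vector circle: the shape is stored as a formula
# (centre and radius) and only converted to pixels on demand,
# at any resolution we like.
def rasterise_circle(radius, size):
    centre = size / 2
    grid = []
    for y in range(size):
        row = ""
        for x in range(size):
            inside = (x - centre) ** 2 + (y - centre) ** 2 <= radius ** 2
            row += "#" if inside else "."
        grid.append(row)
    return grid

for row in rasterise_circle(radius=4, size=10):
    print(row)
```

Calling `rasterise_circle(radius=40, size=100)` would redraw the same shape at ten times the resolution with no aliasing artefacts inherited from a smaller version.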

Similar to this is the idea of fractal images. Fractals are complex formulae that generate infinitely complex patterns that are resolution-independent. However, it is very difficult to generate anything other than patterns with them, and so they are not as common as vectors.

A fractal image

Digital Image Operations

One of the main advantages of digital technology is that manipulation and analysis of the underlying data is very simple. Because digital images are just numbers, it is possible to affect underlying parameters just by doing simple maths. For example, to increase the brightness of a pixel, you just increase the level of the pixel. To increase the overall brightness of an image, you increase the level of all pixels.

Image with increased brightness

However, pixels that are already at the maximum level will remain at the maximum. These pixels are said to be “clipped”. If too many pixels are clipped, the effect on the image can be destructive.

A clipped image
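Both ideas, the brightness increase and the clipping at the maximum level, can be shown together in a short sketch, assuming 8-bit pixels.

```python
# Increase brightness by adding a constant to every pixel, clamping
# ("clipping") at the 8-bit maximum of 255.
def brighten(pixels, amount):
    return [min(255, p + amount) for p in pixels]

row = [10, 100, 200, 250]
print(brighten(row, 60))   # [70, 160, 255, 255] - the last two clip
```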

Convolution is a process where a digital image is transformed using a mathematical matrix of numbers. One of the more common convolution matrices is for sharpening an image. Changing the size or values of the convolution matrix will increase, reduce, or alter the effect.

A convolution matrix applied to a selected area, in this case, has a sharpening effect on the image
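This is a simplified one-dimensional sketch of convolution; real image sharpening uses a 2D matrix (for example a 3x3 kernel), but the principle of weighting each pixel against its neighbours is the same. The kernel values here are illustrative.

```python
# 1D convolution with a sharpening kernel: each output pixel is a
# weighted sum of the pixel and its neighbours, clamped to 0-255.
def convolve(pixels, kernel):
    half = len(kernel) // 2
    out = []
    for i in range(len(pixels)):
        total = 0
        for k, weight in enumerate(kernel):
            # Clamp indices at the edges of the image.
            j = min(len(pixels) - 1, max(0, i + k - half))
            total += pixels[j] * weight
        out.append(min(255, max(0, total)))
    return out

# Weights sum to 1, boosting each pixel relative to its neighbours.
sharpen = [-1, 3, -1]
print(convolve([100, 100, 200, 200], sharpen))   # [100, 0, 255, 200]
```

Note how the edge between the 100s and 200s is exaggerated in the output, which is what makes the image look sharper.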

Many other operations are available, usually referred to as filters, which can be used for a variety of artistic, analytical, or special effects. Image processing applications such as Adobe Photoshop allow a vast number of different operations to be applied across an image, or even just to a specific area.

File Formats

As there are so many different ways to represent a digital image, there are hundreds of different formats for different purposes. They can vary based upon the number of channels and layers, the bit-depth, and the colour space. There are even formats that are more flexible, allowing any number of options; however, these are normally limited to being readable by a specific application.

One of the main problems with digital images is the file size. The size of an image is roughly:

number_of_pixels x number_of_channels x bit_depth x number_of_layers

It is very easy to quickly produce large files. For example, a photographic-quality image is typically 35-40 MB. This is too large to put onto a website or to email to people. One way to solve this is to reduce the amount of data. For example, you can lower the resolution of an image by a process known as resampling, which merges or discards a proportion of pixels, typically involving some form of anti-aliasing to preserve detail. Another method is to limit the colour palette to a much smaller number. For example, “GIF” files allow a palette of up to 256 colours, pre-selected from a range of about 16.8 million, thereby reducing the file size to a third of the original.
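The rough formula above can be checked against the 35-40 MB figure with a quick calculation. The dimensions here (a hypothetical 4096 x 3112 pixel scan) are an illustrative assumption, not taken from the article.

```python
# size = number_of_pixels x number_of_channels x bit_depth x layers
# Assumed example: a 4096 x 3112 image, 3 channels, 8 bits (1 byte)
# per channel, single layer.
width, height = 4096, 3112
channels, bytes_per_channel, layers = 3, 1, 1

size_bytes = width * height * channels * bytes_per_channel * layers
print(size_bytes / 2 ** 20)   # roughly 36 MB, in the 35-40 MB range
```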


The most common method to reduce file size is by using some form of compression. There are two main ways of compressing a digital image. The first, known as “lossless” compression, simply rearranges the saved data in a more efficient way, using less disk space. Using this method, it is possible to reduce the file size by half, but it is very dependent on the content of the image.
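Lossless compression can be demonstrated with Python's standard `zlib` module (the same compression scheme used in PNG files): the data shrinks, and decompressing recovers it exactly, bit for bit.

```python
import zlib

# A flat grey area of 1000 identical pixels compresses extremely well.
pixels = bytes([128] * 1000)
compressed = zlib.compress(pixels)

print(len(pixels), "->", len(compressed), "bytes")

# Lossless means nothing is discarded: the original comes back exactly.
assert zlib.decompress(compressed) == pixels
```

A noisy, detailed image would compress far less, which is why the achievable saving depends so heavily on the content.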

The other method is known as “lossy” compression. This method reduces the file size by discarding some of the data. Ideally, this will be data that is not necessary. For example, “JPEG” compression works by reducing the number of colours based on the colour response of the human eye. In theory, it discards colours the eye wouldn’t detect easily anyway. Other methods are even more esoteric, for example converting areas to equivalent fractals.

It’s also possible to encrypt images, meaning that you have to supply a password or other “digital key” to be able to view the image. Further to this, there are also options available to digitally “watermark” an image, so that ownership and copyright information remains embedded in the image.

Jack James has been working with digital imaging technology for 10 years. He has worked within a number of digital intermediate environments since joining Cinesite (Europe) Ltd.’s Digital Lab in 2001 to work on HBO’s Band of Brothers.  He has a number of film credits, and has published the book "Digital Intermediates for Film & Video" with Focal Press.

Posted: January 22nd, 2005
