HOME | ICEFIELDS | ICEFIELDS FAQs | LIBRARY | ABOUT
Is color hard or soft data?
Tuesday, July 27, 2010
My last two blogs described soft data as data described without a precise real-world measurement. Describing soft data with a measurement system makes the soft data hard without actually changing the data. See Part 1 for my definitions of hard and soft data.
I described Adobe’s InDesign software as a device to harden data. It does it by two methods:
1. InDesign adds a measurement system. In fact you cannot even create a page with InDesign without selecting a measurement system – inches, picas, cm, etc. The selected measurement system hardens the data added to the page.
2. InDesign changes the data to conform more tightly to a real-world measurement system. InDesign can adjust the raster image’s ppi to better match the requirements of the halftone.
An image can be described in rows and columns of pixels. A pixel is hard data. Each pixel is described with three integers – tristimulus data. Three integers describe coordinates in three-dimensional space – a color space. The coordinates allow an observer to compare a pixel with other pixels in the same color space.
The image’s columns and rows allow the comparison of the number of pixels in the image to the number of pixels in any other image. Counting rows and columns is an insufficient metric to determine the size of either the pixel or the image, hence this measurement does not provide enough information to compare the size of one image to another. In Part 2, I said that counting makes data hard. I should have said that counting makes soft pixel data harder.
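A minimal Python sketch of why a pixel count alone cannot give a physical size: the same rows and columns yield different dimensions once a resolution is attached. The function name `physical_size_inches` and the ppi values are assumptions chosen for illustration, not anything InDesign exposes.

```python
def physical_size_inches(columns, rows, ppi):
    """Return (width, height) in inches for a pixel grid at a given ppi."""
    if ppi <= 0:
        raise ValueError("ppi must be positive")
    return columns / ppi, rows / ppi


# The same 3000 x 2400 pixel grid at two assumed resolutions.
at_300_ppi = physical_size_inches(3000, 2400, 300)  # 10 x 8 inches
at_100_ppi = physical_size_inches(3000, 2400, 100)  # 30 x 24 inches
```

The pixel count is identical in both calls. Only the added measurement system, pixels per inch, makes the two sizes comparable.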
Counting alone is not enough. A value system must also be used. If a person has 200 coins and another person has 50 coins, which person has the greater worth? Another measurement system must be used to further harden the data, determine worth, and allow a comparison of worth. A base-10 counting system is used to count the coins and a base-100 monetary system is used to describe the value of the coins. Now back to the image.
Counting pixels doesn’t say much about the image – like the total number of coins. The real world contains a great number of details and colors. The many colors in nature suggest that the image with the greatest number of pixels is more representative of the real world. However, the number of pixels may be the result of scaling. There isn’t yet a scaling method that adds detail. The number of pixels alone doesn’t provide InDesign with enough metrics to ensure scene accuracy or provide a method to harden the data.
The previous paragraphs added new parameters for determining hardness – comparison, number of measurement systems and degree of hardness. Data isn’t only hard or soft. All data has a degree of hardness. Soft data is made harder by applying various measuring systems. I demonstrated that counting coins makes soft money data harder, and a monetary system hardens the data even more. Comparison and degree of hardness are useful descriptors of image data.
InDesign knows a few things about the image:
1. The number of rows and columns of the image.
2. The dimensions of the image in a measurement system.
3. The page measurement system.
4. The user’s desired physical size of the image.
5. The tristimulus coordinates of each pixel.
6. The source color space of each pixel.
7. The output device color space.
This is enough information for InDesign to make the image as hard as current technology allows. There isn’t enough to determine the accuracy of the image to the scene or anything about its content. InDesign makes hard data, but to many photographers, not hard enough.
The distance between pixel coordinates in a color space is also a measurement system. Color science has standard formulas to determine the distance from the coordinates of one color to another in a three-dimensional plot. The distance is described as a number -- a Delta-E. The greater the Delta-E value the longer the distance and the greater the difference in color and the softer the data. This distance measurement system is used to help harden data.
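As a sketch of the distance measurement described above, here is the simplest of the standard formulas, the CIE76 Delta-E, which is the straight-line distance between two colors in CIELAB. Choosing CIE76 is an assumption made for brevity; later formulas such as CIE94 and CIEDE2000 weight the terms differently. The function name and sample values are illustrative.

```python
import math


def delta_e_cie76(lab1, lab2):
    """Euclidean distance between two CIELAB colors (the CIE76 Delta-E)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(lab1, lab2)))


# Two colors with the same lightness that differ only in a* and b*.
reference = (50.0, 0.0, 0.0)
sample = (50.0, 3.0, 4.0)
difference = delta_e_cie76(reference, sample)
```

A Delta-E of zero means the two coordinates coincide, and larger values mean a longer distance between the colors.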
Why does the Delta-E value indicate amount of hardness? Transform the pixel from the source color space to the output color space and back to the source. Mapping pixels described in a large color space to a smaller color space, called gamut mapping, is just one of many factors that add to the complexity and determine hardness. The distance between the position of the pixel color before and after the transformations is an indicator of the softness of the pixel. If a pixel does not change its position then the pixel data is hard. In this case the measure of color change is also a measure of hardness.
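The round trip can be sketched with a toy model. Everything here is an assumption for illustration: the "output space" is just the source space with each channel clipped to a narrower range, and the distance is measured in that toy space rather than as a true Delta-E in CIELAB. Real ICC gamut mapping is far more involved.

```python
import math

# Hypothetical output gamut: each channel limited to the range 0.1 to 0.9.
GAMUT_MIN, GAMUT_MAX = 0.1, 0.9


def to_output(pixel):
    """Toy gamut mapping: clip each channel into the smaller output range."""
    return tuple(min(max(c, GAMUT_MIN), GAMUT_MAX) for c in pixel)


def to_source(pixel):
    """Toy inverse: values pass back unchanged, so clipped values stay lost."""
    return tuple(pixel)


def round_trip_distance(pixel):
    """Distance between a pixel and its source -> output -> source copy."""
    returned = to_source(to_output(pixel))
    return math.dist(pixel, returned)


in_gamut = round_trip_distance((0.5, 0.5, 0.5))      # pixel does not move
out_of_gamut = round_trip_distance((1.0, 0.5, 0.5))  # pixel is pulled inward
```

The in-gamut pixel returns to its original position, so in the terms used above it is hard. The out-of-gamut pixel comes back displaced, and the size of that displacement indicates its softness.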
The International Color Consortium, ICC, has provided enough direction to allow a determination of the Delta-E. The ICC is a data hardening organization.
At the end of Part 2 of my soft and hard data blog I asked, does color management harden data? An image generator, e.g. a camera, provides semi-hard data. The output raster of this device contains pixel values, a color space and a number of rows and columns.
(Photographers are left out of the hardening process. It may be impossible to ever quantify the relationship of the colors in an image to the colors in the scene. I was involved in a philosophical conversation when I was at Xerox many years ago about this pixel/scene relationship. Would it be possible to signify pixel accuracy to the scene? We decided that the complications were too great for today’s technology. The standard observer’s visual characteristics, the spectral power distribution, the illuminance, the surround, the mental image processing, etc. are not all available, and therefore there will always be an amount of softness in the pixel data.)
InDesign manages color-coordinate changes from the source color space to the output color space. The camera provides the source color space. The output color space is that of the monitor or printer. There are other types of color spaces, but in this blog I will stick to only device color spaces.
I make the assumption that all ICC measurements, distance formulas, and color space transformations are accurate. There is one important exception -- the tristimulus color-space transformation to a CMYK color space. Black is the problem. Tristimulus color space transformations are invertible functions. Hard data doesn’t soften after the transformation. In Part 1, I described that any measurement system can be used to harden data. Likewise a color space is a sufficient measuring system. The problem is, where does black fit within the three-dimensional color space?
The CMY color space is a tristimulus coordinate system. Cyan, magenta and yellow are the names of the three axes. Grayscale colors lie on the neutral axis and are described with three coordinates. Black is described with one number. A gray is described in three dimensions when C=M=Y, and ranges from 0, 0, 0 to 255, 255, 255. A mid-range gray is described as 128, 128, 128. Black and its shades of gray are described on a one-dimensional curve from 0 to 255. Mid-tone gray is 128. So there are two ways to describe gray within the same image. Since any measurement will harden soft data, either the one-dimensional black system or the three-dimensional neutral system will suffice. The one-dimensional black could replace the three-dimensional neutral. The problem is when to use black and when to use C=M=Y. There are only vague guidelines based on the Standard Observer’s determination of gray, hence CMYK is a data softener.
Black could be substituted for the neutral axis in a CMY space. The neutral axis is the line that connects all C=M=Y. Black could be a fourth dimension. Neutral gray: 128, 128, 128, 0 is equal to black: 0, 0, 0, 128. (It is impossible for me to imagine what a four-dimensional space would look like. I assume that the neutral gray and black points intersect!)
There are three generally accepted methods to perform RGB color transformations to coordinate positions in CMYK color space.
1. A three to four coordinate matrix with black as the fourth dimension.
2. A three to four position database to find black as described as the fourth number in a four-number system.
3. A three to three matrix conversion plus a one-dimensional curve to replace a portion of the C=M=Y curve.
The last method is frequently used to generate black using under color removal (UCR) or gray component replacement (GCR) methods. All three methods result in a softening of hard data because any conversion back to RGB results in a large Delta-E.
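A rough sketch of the third method, under idealized assumptions: CMY is taken as the simple complement of 8-bit RGB, and black generation just moves a chosen fraction of the shared C=M=Y component onto the K channel. The function names and the `amount` parameter are hypothetical, and real inks and ICC profiles do not behave this neatly.

```python
def rgb_to_cmy(rgb):
    """Idealized three-to-three conversion: CMY as the complement of RGB."""
    return tuple(255 - v for v in rgb)


def gray_component_replacement(cmy, amount):
    """Move a fraction (0.0 to 1.0) of the C=M=Y gray component onto K."""
    if not 0.0 <= amount <= 1.0:
        raise ValueError("amount must be between 0.0 and 1.0")
    k = round(min(cmy) * amount)       # gray shared by all three inks
    c, m, y = (v - k for v in cmy)     # remove that gray from C, M and Y
    return c, m, y, k


mid_gray = rgb_to_cmy((127, 127, 127))                 # (128, 128, 128)
no_black = gray_component_replacement(mid_gray, 0.0)   # (128, 128, 128, 0)
all_black = gray_component_replacement(mid_gray, 1.0)  # (0, 0, 0, 128)
```

The same gray ends up as 128, 128, 128, 0 or as 0, 0, 0, 128 depending only on the chosen amount. That is the ambiguity about when to use black and when to use C=M=Y. The sketch does not model the ink and profile behavior that produces the Delta-E on the trip back to RGB.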
To demonstrate the hardness of a typical color management workflow, I arbitrarily assign a number from 1 to 10, with 10 indicating hard data. The number I assign is at best anecdotal and represents only a general guideline of the relationship of one color space transformation to another.
The source image: Range: 6 to 7.
Device sensing inaccuracy and lack of image geometry are primary contributors keeping the data soft.
The output image for a web page: Range: 8 to 9.
Scale changes and inaccuracies in gamut compression soften data.
The output image for a printed page: Range: 4 to 6.
Black generation is inaccurate. Different spectral values based on subtractive and additive color transmission soften data.
The color workflow changes the hardness of data. Source image data is fairly hard, but various transformations soften and then harden the data. The conversion of RGB to CMYK and back to RGB results in softer data. Color management texts advise the user to start with the largest tristimulus space possible and change to the CMYK device space only when the data is at the output device in the workflow. Now we know why.