Giving Color Vision to a Machine
Scaling our Previous Approach
We ran KMeans on one of the images in our dataset to test the viability of our intended approach. The process we followed is outlined below (a sketch of the full pipeline appears after the list):
- Load the image from Airtable via its URL, using the Python Imaging Library (PIL) and the HTTP requests library
- Convert the image into a NumPy array and reshape it into a format that can train a machine learning model, i.e. flatten the image to (rows * cols, 3)
- Initialize the number of clusters
- Run clustering on the reshaped array of the image
- Extract the centroids and use them as the dominant colors in RGB representation
- Convert the RGB representation into color names using our palette of 865 colors (described in the next section)
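The sketch below strings these steps together for a single image. It assumes scikit-learn's KMeans, a requests/PIL loading step, and an illustrative URL and cluster count; the exact values we used may differ.

```python
# A minimal sketch of the single-image pipeline described above.
import io

import numpy as np
import requests
from PIL import Image
from sklearn.cluster import KMeans

IMAGE_URL = "https://example.com/artwork.jpg"  # hypothetical Airtable attachment URL
N_CLUSTERS = 5  # assumed number of dominant colors to extract

# Load the image over HTTP with requests and open it with Pillow
response = requests.get(IMAGE_URL)
image = Image.open(io.BytesIO(response.content))

# Convert to a NumPy array and flatten to (rows * cols, 3)
pixels = np.asarray(image).reshape(-1, 3)

# Cluster the pixels; the centroids are the dominant colors
kmeans = KMeans(n_clusters=N_CLUSTERS, random_state=0).fit(pixels)
dominant_rgb = kmeans.cluster_centers_.astype(int)
print(dominant_rgb)  # one RGB triple per cluster
```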
The process worked seamlessly on one image, so we next needed to replicate it across the entire dataset. We used a loop to iterate through all the images and repeat the above process. This, however, proved to be a challenge for the following reasons (a sketch of both fixes follows the list):
- The images were too large in pixel dimensions, so the algorithm consumed too many resources and took a long time to run (10 images ran for about 300 minutes). To resolve this, we resized the images using the OpenCV library in Python. This significantly improved performance, and we were able to run the algorithm on the entire dataset in 25 minutes.
- Some of the images were not in RGB format but in RGBA format, so we ran into errors when reshaping, since an RGBA image requires a (rows * cols, 4) reshape. Because most of the images were RGB and our final color representation was to be in RGB, we converted the RGBA images to RGB, which cleared the errors.
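Here is a minimal sketch of both fixes, assuming OpenCV for the downscaling and Pillow for the RGBA-to-RGB conversion; the target size is an illustrative choice, not necessarily the one we used.

```python
# A sketch of the two preprocessing fixes described above.
import cv2
import numpy as np
from PIL import Image

def prepare_image(image: Image.Image, max_side: int = 200) -> np.ndarray:
    # Drop the alpha channel so every image reshapes to (rows * cols, 3)
    if image.mode == "RGBA":
        image = image.convert("RGB")

    pixels = np.asarray(image)

    # Downscale large images so KMeans runs in minutes, not hours
    rows, cols = pixels.shape[:2]
    scale = max_side / max(rows, cols)
    if scale < 1:
        pixels = cv2.resize(pixels, None, fx=scale, fy=scale,
                            interpolation=cv2.INTER_AREA)
    return pixels.reshape(-1, 3)
```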
Demonstration of the Algorithm
Teaching a Machine to Compute Similar Colors
Our first challenge was: how do we teach a machine to recognize which colors are similar? Most humans can recognize shades of a color with ease; we can easily tell red from blue. But with just six alphanumeric characters (an HTML color code), how does the machine know whether #98403d is similar in color to #933d41? With over 16,777,216 possible RGB values, we first reduced our color palette to just 865 colors. This color system gave fair and equal representation to the different color shades (red, orange, yellow, green, blue, white, brown, black).
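For illustration, here is a small sketch of how those six characters decode into an RGB triple the machine can compare numerically:

```python
# Decode a six-character HTML color code into an RGB triple
def hex_to_rgb(code: str) -> tuple[int, int, int]:
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in range(0, 6, 2))

print(hex_to_rgb("#98403d"))  # (152, 64, 61)
print(hex_to_rgb("#933d41"))  # (147, 61, 65)
```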
Our first approach was simply to calculate the distance between the RGB values of the image's color and the RGB values of every color in our palette. This was the simplest approach, but in some cases the matched colors turned out to be visibly far from each other. We double-checked our formula and everything was correct; we had also applied a square root to the distance, yet we still found cases of inaccurate matching. Upon further research, we learned that the human eye perceives some colors more strongly than others; in the more extreme case known as "color blindness," some people cannot distinguish certain colors at all. We therefore improved the algorithm by adding weights to the RGB channels (30 for red, 59 for green, and 11 for blue) to account for luminance.
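A sketch of this weighted distance, with the palette represented as a hypothetical name-to-RGB mapping:

```python
# Weighted Euclidean distance in RGB space, using the luminance
# weights from the text (30 for red, 59 for green, 11 for blue)
import math

WEIGHTS = (30, 59, 11)

def weighted_rgb_distance(c1, c2):
    return math.sqrt(sum(w * (a - b) ** 2
                         for w, a, b in zip(WEIGHTS, c1, c2)))

def nearest_palette_color(color, palette):
    # palette: a mapping of color name -> RGB triple (an assumed structure)
    return min(palette, key=lambda name: weighted_rgb_distance(color, palette[name]))

# Example: the two hex codes above should match each other closely
print(weighted_rgb_distance((152, 64, 61), (147, 61, 65)))
```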
Even after applying the weights, we found that we got yellows when the colors were actually brown, and vice versa. We decided to switch to a color system that better represents human perception: CIELAB. To convert colors from their original color space to CIELAB, we used the Python package scikit-image. Although not perfectly accurate, it produced much better results than our original approach (see the sketch below).
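A minimal sketch of the CIELAB matching step, assuming scikit-image's rgb2lab (which expects floats in [0, 1]); the Euclidean distance in Lab space corresponds to the CIE76 delta-E:

```python
# Match colors in CIELAB space instead of raw RGB
import numpy as np
from skimage.color import rgb2lab

def to_lab(rgb):
    # rgb: (r, g, b) with channel values in 0-255
    return rgb2lab(np.array([[rgb]]) / 255.0)[0, 0]

def nearest_in_lab(color, palette):
    # palette: a mapping of color name -> RGB triple (an assumed structure)
    target = to_lab(color)
    return min(palette,
               key=lambda name: np.linalg.norm(target - to_lab(palette[name])))
```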
In the final output (shown in the image above), we were not only able to teach the machine to recognize similar colors according to our assigned color palette, but also to assign names to the colors used in the artwork, such as "Ash Grey," "Jet," and "Smokey Topaz," which give humans a better idea of the exact color shade.