This project is brought to you by the Data Science and Artificial Intelligence Division, GovTech Singapore.
Andrew Tan, Quantitative Analyst
Preston Lim, Software Engineer
Tan Kai Wei, Data Scientist
Have you ever looked at an old black and white photo and wondered: what did the person taking this photo actually see?
Was there something about the arrangement of colours that compelled the photographer to capture this very moment? And if so, did the photographer see something that we — modern day viewers of this black and white photo — are not privy to?
While it is impossible to replicate the exact conditions in which the original photo was taken, it is possible to add colour to the photo to help us imagine what the photographer could have seen in that instant. It is incredible — almost magical — how a little bit of colour can bring us that much closer to that specific moment in time.
So, for our hackathon in January, our team decided to build a deep learning colouriser tool trained specifically on old Singaporean photos.
An important note here: the point of colourisation is to generate an image with colours that are plausible. It by no means guarantees that the colourised image is an accurate representation of the actual snapshot in time.
Another note: colourisation is a field of active research and our model is by no means perfect — it works well on some images but not others.
Existing colourisation tools are typically trained on general-purpose datasets like ImageNet, which is unlikely to contain images relevant to Singapore. What this means is that such a model is unlikely to have learnt what the colours of an old Singaporean schoolyard scene could plausibly be.
We hypothesise that a tool trained on Singapore-specific historical images will produce more believable colourised old Singaporean photos than existing tools.
How does one colourise a black and white image?
Before we jump into how colourisation can be done by a computer programme, let’s first consider how colourisation is done by a human colourist.
Colourisation is an extremely time- and skill-intensive endeavour. In order to create an appropriately colourised photo, an experienced human colourist has to do two tasks:
(1) do significant research on the historical, geographic, and cultural context of the photo in order to derive appropriate colours, and
(2) colour the black and white image using software tools like Photoshop.
Similarly, a computer programme needs to perform the same two tasks, albeit in a slightly different manner. A programme needs to:
(1) identify objects in a black and white photo, and figure out a plausible colour for the objects given images that it has seen in the past, and
(2) colour the black and white image.
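One common way of framing this task (a general description of the technique, not necessarily this specific model's internals) is as channel prediction: the black and white input is treated as the luminance channel of the image, and the model must predict the missing per-pixel colour information. A minimal numpy sketch of the setup:

```python
import numpy as np

# Toy 2x2 RGB image, values in [0, 1].
rgb = np.array([
    [[0.8, 0.2, 0.1], [0.1, 0.7, 0.3]],
    [[0.2, 0.3, 0.9], [0.9, 0.9, 0.9]],
])

# Luminance approximation (Rec. 601 weights). This grey image is all the
# model gets to see -- the colour information has been discarded.
grey = rgb @ np.array([0.299, 0.587, 0.114])

print(grey.shape)  # one value per pixel
print(rgb.shape)   # three values per pixel to recover
```

The model's job is to map the single-channel input back to a plausible three-channel output, which is why it must first recognise *what* each region of the image is before it can guess its colour.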
Colourisation using Generative Adversarial Networks (GANs) — a deep learning technique
To colourise black and white images, we employed a technique in deep learning known as Generative Adversarial Networks (GANs). This comprises:
A first neural network — a ‘generator’ — with many mathematical parameters (> 20 million) that tries to predict the colour values at different pixels in a black and white image, based on features in the image, and
A second neural network — the ‘discriminator’ — that tries to identify if the generated colours are photo-realistic compared to the original coloured image.
The model is trained until the generator can predict colours that the discriminator cannot effectively distinguish as fake. A simplified view of the architecture used for training is shown below:
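The adversarial objective behind this setup can be illustrated with the standard binary cross-entropy GAN losses. This is a toy sketch, not our training code (the gradient updates, network definitions, and any additional loss terms are omitted): the discriminator is rewarded for scoring real colour images as 1 and generated ones as 0, while the generator is rewarded when its output fools the discriminator into scoring it as real.

```python
import numpy as np

def bce(pred, target):
    # Binary cross-entropy between discriminator scores in (0, 1) and labels.
    eps = 1e-7
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

def discriminator_loss(d_real, d_fake):
    # Discriminator wants real colourisations scored 1, generated ones 0.
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def generator_loss(d_fake):
    # Generator wants its colourisations to be scored as real (label 1).
    return bce(d_fake, np.ones_like(d_fake))
```

Training alternates between minimising these two losses; it stops being productive once the discriminator's scores hover around 0.5, i.e. it can no longer tell generated colours from real ones.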
Other steps we took to improve our model included adding images from Google’s Open Images V4, especially for body parts that our model did not seem to do too well on (e.g. hands, legs, and arms which were hard for the model to identify), and modifying learning rates and batch sizes for better results.
Deploying our deep learning model as a web application
At this point, our deep learning model lived in our office’s local GPU cluster — which meant that only our team had access to the colouriser model. In order for the colouriser to be useful to anyone outside our team, we had to deploy it on the internet.
We went with Google Cloud Platform as our cloud provider for the colouriser service. The architecture is fairly simple, with:
(1) a CDN offering DDoS protection and caching of static content,
(2) an NGINX frontend proxy and static content server,
(3) a load balancer that distributes traffic, and
(4) backend colouriser services with NVIDIA Tesla K80 GPUs that perform the actual colourisation.
The colourisation step is compute intensive and takes approximately 3 seconds to complete per image. As such, we decided to shield the backend colouriser services by using an NGINX server to queue requests to the backend. If the rate of incoming requests far exceeds the rate that our backend services can handle, the NGINX server immediately returns a status response to the client asking the user to try again later.
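As a rough sketch of this kind of request shedding (directive values are illustrative, not our production configuration), NGINX's `limit_req` can queue a bounded burst of requests and immediately return a 503 status for anything beyond it:

```nginx
# Illustrative only: allow 1 request/s per client, queue a burst of up
# to 20, and reject further excess with a 503 so the GPU backends are
# never overwhelmed.
limit_req_zone $binary_remote_addr zone=colourise:10m rate=1r/s;

server {
    location /colourise {
        limit_req zone=colourise burst=20;
        limit_req_status 503;
        proxy_pass http://colouriser_backend;
    }
}
```

The client sees the 503 as a "try again later" response, while requests inside the burst window simply wait their turn in the queue.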
The key highlight of this architecture is that the colouriser service virtual machines (VMs) are autoscaled in response to how much traffic each VM has to service. This saves on cost because additional VMs are only switched on when there is demand for them.
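On Google Cloud Platform, this kind of autoscaling is configured on a managed instance group. A sketch of the command involved, with made-up group name, zone, and thresholds rather than our actual settings:

```shell
# Illustrative sketch -- names and thresholds are not our real values.
# Scale the colouriser backend between 1 and 5 GPU VMs based on load.
gcloud compute instance-groups managed set-autoscaling colouriser-group \
    --zone=asia-southeast1-b \
    --min-num-replicas=1 \
    --max-num-replicas=5 \
    --target-cpu-utilization=0.6 \
    --cool-down-period=120
```

The cool-down period matters here because GPU VMs take a while to boot; it stops the autoscaler from reacting to load measured on instances that are still starting up.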
Here are some of our favourite results using photos obtained with permission from the New York Public Library and the National Archives of Singapore. We would like to note that our sources only provided us with the black and white photos and are not in any way responsible for the colourised output created by us.
Our model performs well on high resolution images that prominently feature human subjects (images where people occupy a large portion of the image) and natural scenery.
The following images look believable (at least to us) because they contain objects that appear in sufficient numbers in the training dataset, so the model is able to identify the correct objects in the image and colour them believably.
Funky things happen when the model does not recognise objects in the photo.
Take for example the following image, “Japanese Surrender at Singapore”. The colouriser colours one — and only one — of the soldiers’ fists red, but gets the rest of the soldiers’ fists correct. This happens because the model is unable to tell, from the angle the photo was taken, that the clenched fist is actually a fist. The colouriser makes its best guess but doesn’t quite get it right.
This happens again in the following image, “Minister of Finance Dr. Goh Keng Swee arrives at opening of Bata shoe factory in Telok Blangah”. The face of the man on the right of the photo is coloured a ghastly grey because half of the man’s face is hidden from view, and so the model is unable to identify the object in the photo.
This phenomenon is known as occlusion — one of the major challenges in computer vision, where object recognition algorithms have trouble identifying objects that are partially covered.
More cool results
Here are more cool results from the colouriser. Because, why not?
10 Feb 2019, one week after the release of Colourise.sg