Wouldn’t it have been great to hire Pablo Picasso, Gustav Klimt, Suzuki Harunobu, Camille Corot or Jackson Pollock as art consultants for a game? Now, thanks to deep neural networks, you can borrow some of their skills to help with things like concept art.
Less than two weeks ago, a new research paper entitled “A Neural Algorithm of Artistic Style” was published; it lets you transfer aspects of one image onto another. Generally speaking, it works by leveraging a neural network to find patterns in the images at multiple levels (e.g. grain, texture, strokes, elements, composition), then using an optimization process to generate a new image from scratch. Using this technique, it’s possible to combine the “style” (as understood by the neural network) of one image with the “content” of another. Neural networks have become very good at recognizing patterns in images, so the results are higher quality than you might expect.
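To make the “content plus style” idea concrete, here’s a minimal sketch of the paper’s two loss terms in plain Python, using nested lists in place of real convolutional feature maps. The function names and the toy data structures are illustrative assumptions, not the paper’s actual code; the real algorithm applies these losses to features extracted by a deep network.

```python
def gram_matrix(features):
    """Correlations between feature channels; this is the paper's
    representation of 'style'. `features` is a list of channels,
    each a flat list of activations."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]

def content_loss(generated, content):
    """Squared difference between raw activations: keeps the generated
    image close to the *content* image."""
    return sum((g - c) ** 2
               for gc, cc in zip(generated, content)
               for g, c in zip(gc, cc))

def style_loss(generated, style):
    """Squared difference between Gram matrices: keeps the generated
    image close to the *style* image's texture statistics."""
    G, S = gram_matrix(generated), gram_matrix(style)
    return sum((g - s) ** 2 for gr, sr in zip(G, S) for g, s in zip(gr, sr))

def total_loss(generated, content, style, alpha=1.0, beta=1.0):
    # The optimizer repeatedly adjusts the generated image's pixels
    # to minimize this weighted sum.
    return (alpha * content_loss(generated, content)
            + beta * style_loss(generated, style))
```

The key insight is that the Gram matrix discards *where* patterns appear and keeps only *how* channels co-occur, which is why it captures style rather than layout.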
The rest of this article shows how screenshots from Quake I were combined with paintings from famous artists, along with an analysis and answers to common questions at the bottom. Why Quake? Because it’s (in)famous for its brownish colors, and maybe a little creative input could help improve that without going too far outside the color limits of the time… Let’s see how it turned out!
Frequently Asked Questions
- How did you pick the reference paintings?
- What’s the implementation you used to generate these images?
- Can this be made into a real-time effect via a shader?
- How did you adjust the parameters for the algorithm?
- This looks pretty great! Can I do it myself?
- These look like a terrible Photoshop filter. What’s going on?
- How is this going to affect the artistic process in the future?
Q: How did you pick the reference paintings?
A: Given a specific painter’s name, the paintings were chosen automatically by a piece of code written for a bot called @DeepForger, which applies a “style” to photos you submit via Twitter. The code searches through a large collection of paintings and ranks them by similarity (using various metrics) with the submitted image. Thanks to this code, there was no iteration on the painting selection; the first hit was the one I chose!
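As a rough illustration of what such automatic selection might look like, here is a hedged sketch that scores each candidate painting against the submitted image with one simple similarity metric (a coarse brightness histogram) and takes the closest match. The @DeepForger code itself isn’t shown in this article, so the metric, function names, and data layout here are assumptions for illustration only.

```python
def histogram(pixels, bins=4):
    """Coarse brightness histogram of a grayscale image (values 0-255),
    normalized so the bins sum to 1."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]

def distance(h1, h2):
    # L1 distance between two normalized histograms.
    return sum(abs(a - b) for a, b in zip(h1, h2))

def pick_painting(photo_pixels, paintings):
    """`paintings` maps a title to its pixel list; return the title
    whose histogram is closest to the submitted photo's."""
    target = histogram(photo_pixels)
    return min(paintings,
               key=lambda title: distance(histogram(paintings[title]), target))
```

A real system would combine several such metrics (color, texture, composition), but the principle is the same: rank candidates and take the best match automatically.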
The process of finding paintings is very important for getting good results from the original algorithm. If you browse the bot’s replies on Twitter, you’ll see that images with extreme mismatches don’t always work out without a few iterations and many parameter tweaks.
Q: What’s the implementation you used to generate these images?
A: Even though the algorithm was only announced a few weeks ago, there are already multiple implementations: the first one available, the most widely used, and one in Python (no doubt with more to come). All of these are powered by CUDA for additional performance, though a CPU-based fallback is available when there’s not enough GPU memory (e.g. for high-resolution rendering).
Q: Can this be made into a real-time effect via a shader?
A: The images above are generated in two steps. First, 400 optimization steps run on the GPU, starting from a randomly initialized image, at a resolution of roughly 720×486. (The exact resolution depends on the size of the reference painting, because it all has to fit in memory.) The GPU is a GTX 970 with 4GB of RAM, and it renders those images in about six minutes. Second, for the benefit of this article, a CPU post-process at a resolution around 1140×770 was also used to make the images look better. (Again, the exact resolution depends on memory, this time a 32GB system memory limit.) Some images were rendered at the full 1280×720 while others were 1024×692 and scaled up. The CPU post-process is seeded with the results from the GPU and runs for only 200 iterations, but takes just over an hour!
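The two-step workflow just described can be sketched as a small driver function. Here `optimize` and `upscale` stand in for the actual style-transfer solver and image scaler; the names, signatures, and default resolutions are assumptions taken from the numbers above, not a real API.

```python
def two_pass_render(optimize, upscale,
                    low_res=(720, 486), high_res=(1140, 770)):
    """Coarse-to-fine rendering: a fast low-resolution GPU pass from
    random noise, then a slower high-resolution CPU pass seeded with
    the upscaled first result."""
    # Pass 1: 400 GPU iterations from a randomly initialized image.
    draft = optimize(image=None, resolution=low_res, steps=400)
    # Pass 2: 200 CPU iterations, seeded so the optimizer refines
    # the draft instead of starting over from noise.
    seed = upscale(draft, high_res)
    return optimize(image=seed, resolution=high_res, steps=200)
```

Seeding the second pass is what keeps it down to 200 iterations; starting the high-resolution pass from scratch would take far longer.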
As it is now, this technology is not ready for real time, and it’s likely a new algorithm will be needed before this runs at 60 FPS anytime soon. Alternatively, you can just wait for Moore’s law to catch up!
Q: How did you adjust the parameters for the algorithm?
A: The default values in the code repositories work well for their test images, but often require customizing to get specific results (portraits in particular take a couple of iterations). Watching the output from the bot for a week provides a good sense of what works and what doesn’t!
In this case, all the images were generated with the same parameters. The goal was to bias the generation to feature more content from Quake and de-emphasize the often extreme style of the chosen artists. If you’ve used the bot, this is equivalent to the parameter ratio=2/1, which weights the content (from Quake) twice as much as the style (from the painter).
Q: This looks pretty great! Can I do it myself?
A: You can either submit images to the bot and have it process them for you (NOTE: each image takes a while to compute, so there’s a queue), or you can set up the same open source projects and run them on your own GPU if it has at least 4GB of memory. The setup itself is a bit challenging, both because of the use of the GPU (and CUDA) and because of libraries in out-of-the-ordinary languages that require downloading and compiling.
The other challenges involve finding reference images and tuning parameters, as mentioned above. With a bit of practice and watching the bot in action, you’ll get there relatively quickly. Having a large library of reference art certainly helps, and that’s partly what the bot is there for!
Q: These look like a terrible Photoshop filter. What’s going on?
A: If you’ve seen a Photoshop filter that can output the images above with the exact same parameters each time, and provide results in six minutes or less, then we want to know! Of course it’s possible to improve each of these images, but they’re still useful as inspiration and as an insight into where the technology is going.
As for the algorithm itself, it has a few deficiencies. It’s the first known general “style transfer” algorithm and a novel application of neural networks, so it’s safe to expect many improvements in the future. In particular, the various levels of patterns found by the neural network (e.g. grain, strokes, elements, composition) are optimized separately from each other, when in fact the strokes should depend on the higher-level patterns too. In short, the semantic information already present in the neural network should be put to better use.
Q: How is this going to affect the artistic process in the future?
A: It’s still very early, so it’s hard to say. However, the quality of the output, given that results take only minutes to compute, already makes this extremely valuable as a source of ideas. In the near future these techniques may be able to re-purpose and restyle existing textures, so switching from a photo-realistic to a cartoon style could be a matter of spawning a few cloud instances and generating new textures.
The future powered by machine learning looks bright, and any tool that can improve the creativity and productivity of artists is (mostly) very welcome!