Recent advances in machine learning have opened Pandora’s box for creative applications! Find out about the technology behind #NeuralDoodle and how it could change the future of content creation. (This research was possible thanks to nucl.ai.)
Last year, with Deep Dream (June 2015) and Style Networks (August 2015), the idea that deep learning may become a tool for Art entered the public consciousness. Generative algorithms based on Neural Networks so far haven’t been the most predictable or easiest to understand, but when they work, through some combination of skill and luck, the quality of the output is second to none!
It took until 2016 for those techniques to be turned into tools that are useful for artists, starting with the paper we call Neural Patches (January 2016), which lets the algorithm process images in a context-sensitive manner. Now, when style transfer techniques are extended with controls and annotations, they can process images in a meaningful way: reducing glitches and increasing user control. This is our work on Semantic Style Transfer (March 2016), which can be used for the same applications as before, as well as for generating images from rough annotations, a.k.a. doodles.
A significant amount of work in deep learning today is spent on command lines and editing text files. That’s how the tools operate and where we spend most of our time too! Now with algorithms like #NeuralDoodle becoming useful as tools, it’s time to start thinking more broadly to make them more widely accessible.
This prototype video shows how such algorithms could be integrated into common image editing tools, in this case GIMP. This tool doesn’t exist (yet); the video is there to help you visualize how these tools could evolve.
See for yourself in this hybrid system, where the human makes doodles and the machine paints a high-quality version at regular intervals on request.
This workflow mockup was created by a human working in GIMP using reference art from one of Renoir’s paintings, as you see in the video. However, the image was regularly saved then processed with our implementation of semantic style transfer. The resulting output from the tool was then edited back into the screen capture.
It’s not real yet, but students are already looking into this integration! With machine learning, the future is always closer than you think ;-)
ITERATION THAT WORKS (SLOWLY)
As always with advanced tools, things may not work out as expected the first time. Thanks to the annotations in the semantic maps, however, it becomes possible to iterate to get the desired results. Currently it takes 3-5 minutes to generate a single HD 720p image from scratch, depending on the combination of images.
There’s certainly a lot of room for improvement in performance—after all, the underlying algorithm is a brute force matching of neural patches—but the basic workflow is in place. When reusing previous images, and only repairing select parts of the image, things may speed up to the point of being almost realtime. Also, expect advances in machine learning to speed up the computations and reduce the workload required over the next year or two.
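To make "brute force matching" concrete, here is a minimal sketch in plain Python. It is not the neural-doodle implementation: it assumes each patch has already been flattened into a feature vector and that cosine similarity is the matching score, and the function names and toy values are hypothetical. It mainly shows why the cost adds up, since every target patch is compared against every style patch.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def match_patches(target_patches, style_patches):
    """For each target patch, return the index of the most similar style patch.

    Brute force: every target patch is scored against every style patch,
    so the work grows as len(target_patches) * len(style_patches).
    """
    matches = []
    for t in target_patches:
        scores = [cosine_similarity(t, s) for s in style_patches]
        matches.append(max(range(len(scores)), key=scores.__getitem__))
    return matches


# Toy data (hypothetical): each "patch" is a flattened feature vector.
style = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
target = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.9]]
print(match_patches(target, style))  # prints [0, 2]
```

With real images the vectors would come from a network's feature maps and there would be many thousands of patches on each side, which is where the minutes go.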
As for the iteration that’s already possible (slowly), here’s an example based on a Monet painting from Étretat.
At each stage the algorithm is doing the same thing: transferring style from one image to another on demand using annotations. Only the inputs to the algorithm change, in this case the doodles. The result gets better as the human addresses problems with the previous iteration, and the synthesized image converges to something that can be painted successfully based on the reference material.
You may notice a few things in these prototype images, which reveal fascinating insights about the algorithm and how it operates:
- The first image (top left) was generated from incorrect annotations: sky patches incorrectly render above every patch of sand. The second image tries to remove the left sandy ledge.
- In the third image (top row), the top arch is blurred by sky texture. This can happen if the source painting doesn’t have any reference material showing how to paint this.
- The fifth image (bottom row) removes the arch for better results, but the left cliff looks rather bland with a repeating texture—similar to the original painting.
- The last image fixes this by adding some darker rock patterns, and also removes the sand at the base of the arch in the sea to increase the feeling of depth.
Here’s the final set of images for this particular synthesized image: (left) the annotations for the painting, (middle) Monet’s original painting, and (right) the doodle for the desired image.
QUALITY & CONSISTENCY
Beyond the workflow, it’s important to emphasize the benefits of having a tool that can consistently generate images at this level of quality from so little input. It may not match the original Renoir, but as placeholder art for many games, simulations, and visualizations, this is more than acceptable. It may even be good enough to ship ;-)
This is a 720p rendering generated by the implementation behind @DeepForger, the first online service to offer both Style Networks and now Neural Patches to end-users on social media. It’s quickly become one of our favorite renderings of all time!
Again, when errors occur, it’s no longer a problem: it’s possible to extend the source material so the Neural Patches algorithm can find a match while painting, or use some annotations and manually fix the results thanks to semantic style transfer.
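As a rough illustration of how annotations can steer the matching, here is a minimal sketch that assumes each patch carries a label, that the label is appended to the patch features as a scaled one-hot vector, and that the nearest style patch is chosen by squared distance. The `semantic_weight` parameter, the labels, and the feature values are all hypothetical and chosen only for this example; it is not the code from the repository.

```python
def augment(features, label, num_labels, semantic_weight):
    """Append a scaled one-hot annotation vector to a patch's features."""
    one_hot = [semantic_weight if i == label else 0.0 for i in range(num_labels)]
    return features + one_hot


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def semantic_match(target, style, num_labels, semantic_weight):
    """Index of the style patch closest to the target, annotations included.

    `target` is a (features, label) pair; `style` is a list of such pairs.
    """
    t = augment(target[0], target[1], num_labels, semantic_weight)
    candidates = [augment(f, l, num_labels, semantic_weight) for f, l in style]
    return min(range(len(candidates)),
               key=lambda i: squared_distance(t, candidates[i]))


SKY, SAND = 0, 1
# Hypothetical style patches: two similar-looking textures, different labels.
style = [([0.8, 0.2], SKY), ([0.7, 0.3], SAND)]
# The target looks slightly more like the sky patch but is annotated as sand.
target = ([0.79, 0.21], SAND)

print(semantic_match(target, style, 2, semantic_weight=0.0))   # prints 0
print(semantic_match(target, style, 2, semantic_weight=10.0))  # prints 1
```

With the weight at zero the annotations are ignored and the sky patch wins on appearance alone; with a large weight the label mismatch dominates and the sand patch is chosen. Under these assumptions, that is the kind of control a doodle gives over the result.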
Deep learning and neural networks are going to fundamentally change the way we create, and the tools we interact with will become smarter too. It’s a guessing game to predict exactly how, but here you saw a mockup of a tool that could be built within the next year alone!
In the meantime, here are some great places to continue learning about the topic:
- Find and star the neural-doodle repository on GitHub. The code is well commented ;-)
- Go and read our research paper on arXiv and dig into the technology in more depth.
- Visit us in Vienna for the nucl.ai Conference on July 18-20! First speakers will be announced soon.
This research was funded out of the marketing budget for the nucl.ai Conference 2016, our event dedicated to Artificial Intelligence in Creative Industries. It’s better this way, right? ;-)