Minecraft, ENHANCE!
Neural Networks to Upscale & Stylize Pixel Art

How about taking pixelated graphics and using a neural network to increase their resolution, guided by example photos or textures? We attempted it for Minecraft with an open source project… (This research project was possible thanks to nucl.ai [1].)

Just over a month ago, we released Neural Doodle: a deep learning project to transfer the style from one image onto another. The script allows anyone to reuse existing Art and Photos to improve their two-bit doodles. There’s now a growing number of developers experimenting and posting their results — which inspired the work in this article!

Neural Doodle is in fact a very simple project with 550 lines of code, powered by deep neural network libraries like Lasagne and Theano. However, the algorithm can be used in a variety of different ways: texture synthesis, style transfer, image analogy, and now another: example-based upscaling.


Let's start with some examples! The following 512x512 textures were generated by running Neural Doodle on the GPU. Here’s the core command line:

python3 doodle.py --style Example_Stone.jpg --seed Minecraft_Stone.jpg \
                  --iterations=100 --phases=1 --variety=0.5

The input examples are a variety of textures collected by Image Search. They are not shown in full due to Copyright questions, but you can find them again yourself easily!

After about five to ten minutes, depending on the style and the speed of your GPU, the script will output the following images.

Using the exact same code, you can do the same for dirt textures. Here are the example photos, which were also found via Image Search.

Again, after running during your lunch break or overnight, you’ll end up with synthesized textures like this.

Note that these dirt textures are more organic than the stones, and it’s harder to see the original pixels in the final images. This is done on purpose; see the alternative images below, which have less variety but more visible pixels.


Under the hood, Neural Doodle is an iterative algorithm: it performs many incremental refinements to an image, which we call frames, and each one gets a step closer to the final desired output. At each step, the algorithm matches “neural patches” from the desired style image and nudges the current image in that direction (i.e. gradient descent, for those of you familiar with optimization). You can stop the process at any stage if you’re happy with the quality, but it usually takes around 100 steps.
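To make the match-and-nudge loop concrete, here is a minimal sketch of a single refinement step. It is not the code from doodle.py: it works on raw grayscale pixels rather than neural network features, and the names extract_patches, refine_step and rate are made up for this illustration.

```python
import numpy as np

def extract_patches(img, size=3):
    """Return every size x size patch of a 2D array, flattened into rows."""
    h, w = img.shape
    rows = [img[y:y + size, x:x + size].ravel()
            for y in range(h - size + 1)
            for x in range(w - size + 1)]
    return np.array(rows)

def refine_step(current, style, size=3, rate=0.5):
    """One iteration: match each patch to its nearest style patch, then nudge."""
    cur = extract_patches(current, size)
    sty = extract_patches(style, size)
    # Squared distance between every current patch and every style patch.
    dists = ((cur[:, None, :] - sty[None, :, :]) ** 2).sum(axis=2)
    best = sty[dists.argmin(axis=1)]
    # Average the overlapping matched patches into an image-sized target.
    target = np.zeros_like(current, dtype=float)
    count = np.zeros_like(current, dtype=float)
    h, w = current.shape
    i = 0
    for y in range(h - size + 1):
        for x in range(w - size + 1):
            target[y:y + size, x:x + size] += best[i].reshape(size, size)
            count[y:y + size, x:x + size] += 1
            i += 1
    target /= count
    # Move part of the way towards the target (a gradient-descent-like step).
    return current + rate * (target - current)
```

Calling refine_step repeatedly on its own output gives the sequence of frames described above. The brute-force distance matrix is only practical for tiny images, so treat this as an explanation aid rather than something to run at 512x512.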

Depending on how you use the script, it starts with different types of seed images: random noise, the target image, or a hand-crafted seed. In the case of Minecraft, the pixelated art is taken as the seed at the target resolution (e.g. 512x512) and the optimization adjusts each pixel towards the target style, neural patch by neural patch.
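As an illustration of what "the seed at the target resolution" can look like, the sketch below blows up a small texture with nearest-neighbour scaling so every original pixel becomes a square block. The 16x16 input size and the upscale_nearest helper are assumptions for this example, and the article does not say which resampling method was actually used to prepare the seeds.

```python
import numpy as np

def upscale_nearest(pixels, factor):
    """Repeat each pixel factor x factor times (nearest-neighbour upscaling)."""
    return np.repeat(np.repeat(pixels, factor, axis=0), factor, axis=1)

# A random 16x16 RGB array stands in for a pixel-art texture (assumed size).
rng = np.random.default_rng(0)
texture = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)

# 16 * 32 = 512 pixels per side, matching the 512x512 outputs above.
seed = upscale_nearest(texture, 32)
```

Saving an array like seed as an image would give a file that can be passed to the --seed option shown earlier.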

Neural networks have the advantage of better understanding image patterns, gained from learning to classify images. In this case, we use a convolutional network called VGG, from the University of Oxford. It was trained on millions of images and, thanks to this, can blend patches better than an algorithm operating on individual pixels.

» The code is open source and available on GitHub; the main script is around 550 lines. It’s well commented too to help you figure out what’s going on!


The algorithm is not perfect and obviously has some failures too… Here are three of the main ones!

1) Reasonable Textures, Unsuitable Results

Some of the failures depend entirely on the input images: if the style is inappropriate for the problem at hand, nothing will fix it. These particular textures look good on their own, but the chosen style doesn’t match the original structure of the pixelated input texture. In this case, the only way to fix it is to go back and find better reference textures!

2) Repeated Patterns (Before Fix)

The problem of repeated patterns is often cited as a flaw with the original algorithm that we call Neural Patches. In the process of rendering these images, in particular the grass, we fixed this in neural-doodle: you can now use the --variety parameter to encourage the code to use a wider diversity of patches.

All of the images rendered here had some additional variety, typically 0.5 or even 1.0. Usually, the optimization is seeded with random noise, which helps encourage using a wider variety of patches. When using pixelized Minecraft textures as the seed image there’s less randomness, so you need this extra parameter for the results to shine!

3) Physically Implausible Sections

The original patch-based image processing algorithms (see this overview) tend to exhibit glitches either when patches don’t match very well or patches are blended awkwardly. Using Neural Networks helps blend the patches in a more sensible fashion, but doesn’t help (yet) when patches are missing and the match quality is low.

An answer to this may be using a pair of neural networks that are called generative adversarial networks. One network would learn more about the image patches (trying to predict plausible image sections) and the other is used to detect if those patches are plausible enough. See this very recent paper on the topic!

Visualizing Patch Variety

The top images are the original images, generated by matching only the nearest neural patches and then generating the image based on those. The bottom images are generated by forcing the algorithm to pick a wider variety of patches, which means the results are more organic and creative, but at the cost of the image looking different.

The patch diversity code works by measuring how similar the style patches are to the current image, then giving the worst-matching patches a boost while the best-matching patches are punished. This levels the playing field so that a bigger variety of patches is selected, by an amount that depends on the user-specified parameter. We think it looks good!
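The boost-and-punish idea can be sketched in a few lines. This is an illustrative reconstruction from the description above, not the exact formula used in neural-doodle; the function name match_with_variety and the choice to centre the scores on their mean are assumptions.

```python
import numpy as np

def match_with_variety(sims, variety=0.5):
    """Pick a style patch per image patch after rebalancing the scores.

    sims[i, j] is the similarity of image patch i to style patch j.
    """
    # How well each style patch already matches the current image overall.
    best_per_style = sims.max(axis=0)
    # Positive for popular style patches, negative for neglected ones.
    offset = best_per_style - best_per_style.mean()
    # Punish the best-matching style patches and boost the worst-matching.
    adjusted = sims - variety * offset
    return adjusted.argmax(axis=1)

sims = np.array([[1.0, 0.2],
                 [0.6, 0.5]])
plain = match_with_variety(sims, variety=0.0)   # both rows pick style patch 0
varied = match_with_variety(sims, variety=0.5)  # second row switches to patch 1
```

With variety set to zero this reduces to plain nearest-patch matching, and larger values let less popular style patches win more often.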


The applications for these techniques are already very promising! But as these algorithms improve, you’ll be able to apply upscaling and stylization to entire screens or world maps that mix a variety of different source textures. For this, the implementation needs a few changes to split up the image and patches into chunks, especially for those with 1GB or 2GB GPUs. This would also allow scaling up to larger textures efficiently without requiring top-of-the-range 12GB cards! (Watch this GitHub Issue for progress.)
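One possible way to split a large image into chunks is to cover it with overlapping tiles, so each tile can be processed separately and blended at the seams. The sketch below only computes the tile coordinates; tile_boxes, the 256-pixel tile size and the 32-pixel overlap are hypothetical choices, not part of the current script.

```python
def tile_boxes(width, height, tile=256, overlap=32):
    """Yield (left, top, right, bottom) boxes covering the image with overlap."""
    step = tile - overlap
    for top in range(0, max(height - overlap, 1), step):
        for left in range(0, max(width - overlap, 1), step):
            # Clamp tiles on the right and bottom edges to the image bounds.
            yield (left, top, min(left + tile, width), min(top + tile, height))

boxes = list(tile_boxes(512, 512))
```

For a 512x512 image this yields a 3x3 grid of boxes, with narrower tiles along the right and bottom edges. How to blend neural patches across tile borders is the harder part, and it is left open here.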

I’m sure you’ll agree it’s an incredible time to be involved in Creative AI. The core algorithm that generated all these images was published in January 2016, our version has been open source and generally usable for over a month, and significant improvements we made to the output are dated just yesterday! There’s so much low hanging fruit, things are moving incredibly fast and it’s inspiring.

» Want to learn more or join the community? See you in Vienna at the nucl.ai Conference, July 18-20. Also feel free to post a comment or question on CreativeAI.net!

Alex J. Champandard

[1] This research was funded out of the marketing budget for the nucl.ai Conference 2016, our event dedicated to Artificial Intelligence in Creative Industries. It’s more constructive this way, right? ;-)