


Hybrid Encoding For Generating Large Scale Game Level Patterns With Local Variations Using a GAN

Jacob Schrum, Benjamin Capps, Kirby Steckel, Vanessa Volz, and Sebastian Risi

Abstract—Generative Adversarial Networks (GANs) are a powerful indirect genotype-to-phenotype mapping for evolutionary search, but they have limitations. In particular, GAN output does not scale to arbitrary dimensions, and there is no obvious way to combine GAN outputs into a cohesive whole, which would be useful in many areas, such as video game level generation. Game levels often consist of several segments, sometimes repeated directly or with variation, organized into an engaging pattern. Such patterns can be produced with Compositional Pattern Producing Networks (CPPNs). Specifically, a CPPN can define latent vector GAN inputs as a function of geometry, which provides a way to organize level segments output by a GAN into a complete level. However, a collection of latent vectors can also be evolved directly, to produce more chaotic levels. Here, we propose a new hybrid approach that evolves CPPNs first, but allows the latent vectors to evolve later, and combines the benefits of both approaches. These approaches are evaluated in Super Mario Bros. and The Legend of Zelda. We previously demonstrated via divergent search (MAP-Elites) that CPPNs better cover the space of possible levels than directly evolved levels. Here, we show that the hybrid approach can cover areas that neither of the other methods can and achieves comparable or superior QD scores.

Index Terms—Indirect Encoding, Generative Adversarial Network, Compositional Pattern Producing Network, Procedural Content Generation via Machine Learning.

I. INTRODUCTION

GENERATIVE Adversarial Networks (GANs [1]), a type of generative neural network trained in an unsupervised way, are capable of reproducing certain aspects of a given training set. For example, they can generate diverse high-resolution samples of a variety of different image classes [2].

Several recent works have shown that it is possible to learn the structure of video game levels using GANs [3], [4], [5], [6], but these approaches only generate small level segments. In contrast, complete game levels can consist of many segments, sometimes repeated, often with variation. A valid question regarding such GAN-based approaches is whether they can scale to generate arbitrarily large artefacts that have a modular structure, such as these complete game levels.

To scale up GAN-based generation techniques from generating level segments to generating complete levels, we combine Compositional Pattern Producing Networks (CPPNs [7]) with GANs. A CPPN is a special type of neural network that generates patterns with regularities such as symmetry, repetition,

J. Schrum, B. Capps, and K. Steckel are from Southwestern University in Georgetown, TX USA. J. Schrum is an Associate Professor, B. Capps is a current undergraduate student, and K. Steckel is a recent graduate. V. Volz is an AI researcher at modl.ai and an Honorary Lecturer at Queen Mary University London, UK, and S. Risi is both a co-founder of modl.ai and a Full Professor at ITU Copenhagen in Denmark.

and repetition with variation. CPPNs have succeeded in many domains [8], [9], [10], [11], [12], [13].

Our approach, CPPN2GAN, was first introduced in 2020 [14]. It is a doubly-indirect encoding. An evolved CPPN represents a pattern on the geometry of a level. The CPPN maps coordinates of level segments to vectors in a latent space, where the next level of indirection occurs. These vectors are fed as inputs to a pre-trained GAN, which outputs the level segment that belongs at the specified location (Fig. 1).

This paper goes further by hybridizing CPPN2GAN with directly evolved latent vector inputs, a method we call Direct2GAN. The new approach takes inspiration from HybrID [15]: it begins evolution with CPPN genomes, but lets them transition into latent vector genomes to gain the benefits of both approaches. In fact, starting with a CPPN-based focus on global patterns, and only later switching to a vector encoding that allows for local variations, can discover solutions that would not be discovered by either approach in isolation. This combination approach is called CPPNThenDirect2GAN.

We evaluate the three methods using several different evaluation metrics proposed in previous work, and consider both performance as well as their ability to cover the space of possible solutions. We thus consider the ability of the different methods to optimize for specific characteristics (controllability), as well as to generate diverse levels, both important qualities of level generators [16]. The Quality Diversity algorithm MAP-Elites [17] is used to conduct these evaluations. Our experiments reaffirm our previous results [14] with CPPN2GAN and Direct2GAN in Super Mario Bros. and The Legend of Zelda1, and we add results both with an additional diversity characterization for each game, and with the newly proposed hybrid CPPNThenDirect2GAN approach. Results show that Direct2GAN is usually inferior to CPPN2GAN, and always inferior to the new CPPNThenDirect2GAN approach in terms of level quality and coverage of design space. CPPNThenDirect2GAN is as good or better than CPPN2GAN depending on the game and diversity characterization.

Ultimately, CPPN2GAN and CPPNThenDirect2GAN could be relevant not only for game levels, but for other domains requiring large-scale pattern generators, e.g. texture generation, neural architecture search, and computer-aided design.

II. PREVIOUS AND RELATED WORK

This paper combines Generative Adversarial Networks (GANs) and Compositional Pattern Producing Networks (CPPNs) into a new form of Latent Variable Evolution (LVE).

1 Code available at https://github.com/schrum2/GameGAN

arXiv:2105.12960v1 [cs.NE] 27 May 2021


[Figure 1: three-panel diagram. (a) CPPN2GAN: a CPPN evolved by NEAT takes inputs x, y, and r and outputs a latent vector z plus room presence and other values, which feed the GAN generator (latent z → 4 × 4 × 256 → 8 × 8 × 128 → 16 × 16 × 64 → 32 × 32 × 10 via conv layers; the discriminator is used only during training). (b) Direct2GAN: an evolved real-valued vector is chopped into the same per-room values. (c) CPPNThenDirect2GAN: a mutation converts a CPPN-generated vector into a directly evolved vector.]

Fig. 1: Three Genotype Encodings Applied to Zelda. (a) In the CPPN2GAN approach, the CPPN takes as input the Cartesian coordinates of a level segment (x and y) and its distance from the center r, and for each segment produces a different latent vector z that is then fed into the generator of a GAN pre-trained on existing level content. The CPPN also outputs additional information determining whether the room should be placed, how its doors connect to other rooms, and other miscellaneous information. The approach captures patterns in the individual level segments, but also creates complete maps with global structure, i.e. imperfect radial symmetry as above. (b) In the Direct2GAN approach, one real-valued genome vector is evolved, which is chopped into individual vectors to generate each tile. A part of the vector (of size 10) is input into the GAN, while additional values specify the presence of a room and additional room-specific properties. (c) Finally, in the CPPNThenDirect2GAN approach, vectors are first generated by evolving CPPNs (CPPN2GAN), facilitating the discovery of global patterns, and then later switched to evolving a vector encoding (Direct2GAN) that allows for local variations.

A. CPPNs

Compositional Pattern Producing Networks (CPPNs [7]) are artificial neural networks with varying activation functions per node. They are repeatedly queried across a geometric space of possible inputs and are thus well suited to generate geometric patterns. For example, a CPPN can generate a 2D image by taking pixel coordinates (x, y) as input and outputting intensity values for each corresponding pixel.

CPPNs typically include many different activation functions biased towards specific patterns and regularities. For example, a Gaussian function allows a CPPN output pattern to be symmetric, and a periodic function such as sine creates repeating patterns. Other patterns, such as repetition with variation (e.g. fingers of a hand), can be created by combining functions (e.g. sine and Gaussian). CPPNs have been adapted to produce a variety of patterns in domains such as 2D images [8], musical accompaniments [9], 3D objects [10], animations [13], physical robots [11], particle effects [12], and flowers [18].
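The idea of composing activation functions over a coordinate grid can be illustrated with a toy, hand-wired CPPN. This is a minimal sketch, not the evolved networks used in the paper: the weights and function choices (`toy_cppn`, `render`) are hypothetical, chosen only to show how a Gaussian yields symmetry and a sine yields repetition.

```python
import math

def toy_cppn(x, y):
    """Hand-wired stand-in for an evolved CPPN (hypothetical weights).

    Gaussian of x gives left/right symmetry; sine of y gives vertical
    repetition; multiplying them combines both regularities.
    """
    symmetric = math.exp(-(x * x))          # Gaussian: symmetric about x = 0
    repeating = math.sin(4 * math.pi * y)   # sine: repeats along y
    return symmetric * repeating

def render(width, height):
    """Query the CPPN at every pixel, coordinates scaled to [-1, 1]."""
    image = []
    for row in range(height):
        y = 2 * row / (height - 1) - 1
        image.append([toy_cppn(2 * col / (width - 1) - 1, y)
                      for col in range(width)])
    return image

img = render(8, 8)
# Symmetry check: intensity at mirrored x positions is identical.
assert all(abs(img[r][c] - img[r][7 - c]) < 1e-9
           for r in range(8) for c in range(8))
```

Replacing the Gaussian with another function would break the symmetry, which is exactly the kind of structural bias the choice of activation functions controls.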

CPPNs are traditionally optimized through NeuroEvolution of Augmenting Topologies (NEAT [19]). NEAT starts with a population of simple neural networks: inputs are directly connected to outputs. Throughout evolution, mutations add nodes and connections. NEAT also allows for efficient crossover between structural components with a shared origin. The benefit of NEAT is that it optimizes both the neural architecture and the weights of the network at the same time. More recently, CPPN-inspired neural networks have also been optimized through gradient descent-based approaches [20], [21].

While CPPNs can create patterns with complex regularities, training CPPNs to recreate particular images is difficult [22]. However, GANs do not share this weakness.

B. Generative Adversarial Networks

The training process of Generative Adversarial Networks (GANs [1]) is like a two-player adversarial game in which a generator G and a discriminator D are trained at the same time by playing against each other. The discriminator D's job is to classify samples as being generated (by G) or sampled from the training data. The discriminator aims to minimize classification error, but the generator tries to maximize it. Thus, the generator is trained to deceive the discriminator by generating samples that are indistinguishable from the training data. After training, the discriminator D is discarded, and the generator G is used to produce new, novel outputs that capture the fundamental properties present in the training examples. Input to G is some fixed-length vector from a latent space.
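The two-player game described above corresponds to the minimax objective of the original GAN paper [1], in which G and D optimize the same value function in opposite directions:

```latex
\min_G \max_D \;
\mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

Here p_z is the latent distribution from which the fixed-length input vectors to G are sampled; it is this latent space that evolution later searches.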

Generating content by parts, as is done by our CPPN2GAN approach, has also been investigated with conditional GANs (COCO-GAN [23]). However, in contrast to our approach, in which the CPPN can learn how to map coordinates of segments to latent vectors (facilitating the generation of patterns with regularities), COCO-GAN is conditioned on fixed coordinates with a common latent vector to produce segments of an image.

For a properly trained GAN, randomly sampling latent vectors produces outputs that could pass as real images or levels. However, to find content with certain properties (such as a specific game difficulty, or number of enemies), the latent space needs to be searched, as described in the next section.


C. Latent Variable Evolution

The first latent variable evolution (LVE) approach was by Bontrager et al. [24], who trained a GAN on real fingerprint images and used evolutionary search to find latent vectors that match with subjects in the dataset. Our previous work [14] introduced the first indirectly encoded LVE approach. Instead of searching directly for latent vectors, parameters for CPPNs are sought. These CPPNs can generate a variety of different latent vectors, conditioned on the locations of level segments.

III. VIDEO GAME DOMAINS

The games in this paper rely on data from the Video Game Level Corpus (VGLC [25]). Specifically, GAN models for Super Mario Bros. and The Legend of Zelda are trained.

A. Super Mario Bros.

Super Mario Bros. (1985) is a platform game that involves moving left to right while running and jumping. Levels are visualized with the Mario AI framework2.

The tile-based level representation from VGLC uses a particular character symbol to represent each possible tile type. The encoding is extended to more accurately reflect the data in the original game, for example by adding different enemy types. The representation of pipes is adjusted to avoid the broken pipes seen in previous work [3]. Instead of using four different tile types for a pipe, a single tile is used as an indicator for the presence of a pipe and extended automatically downward as required. A detailed explanation of all modifications made to the encoding can be found in work by Volz [26, Chap. 4.3.3.2].

B. The Legend of Zelda

The Legend of Zelda (1986) is an action-adventure dungeon crawler. The main character, Link, explores dungeons full of enemies, traps, and puzzles. In this paper, the game is visualized with an ASCII-based Rogue-like game engine used in previous work [6]. Details on the mapping between original VGLC game tiles and Rogue-like tiles can also be found there.

Previous work [6] reduced the large set of tiles inherent to Zelda to a smaller set based on functional requirements. Some Zelda tiles differ in purely aesthetic ways, and others rely on complicated mechanics not implemented in the Rogue-like. The reduced set of tile types is as follows: regular floor tiles, impassable tiles, and tiles that enemies can pass, but Link requires a special item to pass (the raft item to cross water tiles). Enemies are not represented in the VGLC data because its authors did not include them3.

IV. APPROACH

The novel approach introduced in this paper is a hybrid of CPPN2GAN and Direct2GAN, so each is explained in detail before explaining CPPNThenDirect2GAN. All approaches depend on a GAN trained on data for the target game.

2 https://github.com/amidos2006/Mario-AI-Framework

3 VGLC erroneously refers to statues that occupy some rooms as enemies, but they are simply impassable objects. Other enemies are absent from VGLC.

A. GAN Training Details

The Mario and Zelda models are the same as used in our previous work [14], but details of their training are repeated here. Both models are Wasserstein GANs [27] differing only in the size of their latent vector inputs (10 for Zelda, 30 for Mario), and the depth of the final output layer (3 for Zelda, 13 for Mario). Their architecture otherwise matches that shown in Fig. 1. Output depth corresponds to the number of possible tiles for the game. The other output dimensions are 32 × 32, which is larger than the 2D region needed to render a level segment. The upper left corner of the output region is treated as a generated level, and the rest is ignored.

To encode levels for training, each tile type is represented by a distinct integer, which is converted to a one-hot encoded vector before being input into the discriminator. The generator also outputs levels in the one-hot encoded format, which is then converted back to integers. Mario or Zelda levels in this integer-based format are sent to the Mario AI framework or Rogue-like engine for rendering.
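The integer/one-hot round trip described above can be sketched as follows. The helper names (`to_onehot`, `from_onehot`) are hypothetical; the actual codebase may organize this differently.

```python
def to_onehot(level, num_tile_types):
    """Convert an integer tile grid to per-tile one-hot vectors
    (the format the discriminator sees and the generator emits)."""
    return [[[1.0 if tile == t else 0.0 for t in range(num_tile_types)]
             for tile in row] for row in level]

def from_onehot(onehot):
    """Invert: argmax over each tile's channel vector back to an integer."""
    return [[max(range(len(vec)), key=vec.__getitem__) for vec in row]
            for row in onehot]

# Round trip on a tiny 2 x 2 grid with 3 tile types.
grid = [[0, 2], [1, 0]]
assert from_onehot(to_onehot(grid, 3)) == grid
```

Since generator outputs are real-valued rather than exact one-hot vectors, the argmax step is what collapses them back to a discrete tile choice.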

GAN input files for Mario were created by processing all 12 overworld level files from VGLC for Super Mario Bros. The GAN expects to always see a rectangular input of the same size, so each input was generated by sliding a 28 (wide) × 14 (high) window over the level from left to right, one tile at a time, where 28 tiles is the width of the screen in Mario.
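The sliding-window extraction just described can be sketched as below; this is a minimal illustration, assuming the level is stored as a list of rows of tile integers whose height matches the window.

```python
def sliding_windows(level, width=28, height=14):
    """Slide a width x height window across a level one tile at a time,
    producing same-sized rectangular training samples."""
    level_width = len(level[0])
    samples = []
    for x in range(level_width - width + 1):
        samples.append([row[x:x + width] for row in level[:height]])
    return samples

# A toy 14-row level that is 30 tiles wide yields 3 training samples.
toy = [[0] * 30 for _ in range(14)]
assert len(sliding_windows(toy)) == 3
assert len(sliding_windows(toy)[0][0]) == 28
```

Stepping one tile at a time (rather than one screen at a time) multiplies the number of training samples, which matters given how few source levels VGLC provides.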

GAN input for Zelda was created from the 18 dungeons in VGLC for The Legend of Zelda, but the training samples are the individual rooms in the dungeons, which are 16 (wide) × 11 (high) tiles. Many rooms are repeated within and across dungeons, so only unique rooms were included in the training set (only 38 samples). Training samples are simpler than the raw VGLC rooms because the various tile types are reduced to a set of just three as described in Section III-B. Doors were transformed into walls because door placement is not handled by the GAN, but rather by evolution, as described next.

B. Level Generation: CPPN2GAN

To generate a level using CPPNs and GANs, the CPPN is given responsibility for generating latent vector inputs for the GAN as a function of segment position within the larger level.

For Mario, the only input is the x-coordinate of the segment, scaled to [−1, 1]. For a level of three segments, the CPPN inputs would be −1, 0, and 1. CPPN output is an entire latent vector. Each latent vector is fed to the GAN to generate the segment at that position in the level.
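The coordinate scaling for Mario segments is simple enough to state as a one-liner; the helper name `segment_x_inputs` is hypothetical, used only to show the mapping from segment index to CPPN input.

```python
def segment_x_inputs(num_segments):
    """Scale segment indices 0..S-1 linearly onto [-1, 1], the x input
    the CPPN is queried with for each segment."""
    if num_segments == 1:
        return [0.0]
    return [2 * i / (num_segments - 1) - 1 for i in range(num_segments)]

# Matches the three-segment example in the text: -1, 0, and 1.
assert segment_x_inputs(3) == [-1.0, 0.0, 1.0]
```

Each of these scalar inputs is then mapped by the CPPN to a full latent vector for the GAN, so neighboring segments get similar latent vectors whenever the CPPN is smooth in x.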

Zelda’s 2D arrangement of rooms is more complicated. For the overall dungeon shape to be interesting, some rooms in the 2D grid need to be missing. Also, dungeons are typically more interesting if they are maze-like, so simply connecting all adjacent rooms would be boring. How maze-like a dungeon is also depends on its start and end points. These additional issues are global design issues, and therefore are handled by the CPPN (Fig. 1), which defines global patterns, rather than the GAN, which generates individual rooms.

Thus, CPPNs for Zelda generate latent vector inputs and additional values that determine the layout and connectivity of the rooms. Zelda CPPNs take inputs of x and y coordinates scaled to [−1, 1]. A radial distance input is also included to encourage radial patterns, which is common in CPPNs [8]. For each set of CPPN inputs, the output is a latent vector along with seven additional numbers: room presence, right door presence, down door presence, right door type, down door type, raft preference, and start/end preference.

Room presence determines the presence/absence of a room based on whether the number is positive. Similarly, if a room is present and has a neighboring room in the given direction, then positive right/down door presence values place a door in the wall heading right/down. Whenever a door is placed, a door is also placed in the opposite direction within the connecting room, which is why top/left door outputs are not needed. For variety, the right/down door type determines the types of doors, based on different number ranges for each door type: [−1, 0] for plain, (0, 0.25] for puzzle-locked, (0.25, 0.5] for soft-locked, (0.5, 0.75] for bomb-able passage, and (0.75, 1.0] for locked. Puzzle-locked doors are a new addition to this paper which were absent in the initial paper on CPPN2GAN [14]. A puzzle-locked door can only be opened by finding and pushing a special block in the room in a certain direction. The remaining door types were present in previous work [14]: soft-locked doors only open when all enemies in the room are killed, bomb-able passages are secret walls that can be bombed to create a door, and locked doors need a key. Enough keys to pair with all locked doors are placed at random locations in rooms of the dungeon. However, to assure that the genotype-to-phenotype mapping is deterministic, the pseudo-random generator responsible for placing keys is initialized using the bit representation of the corresponding right or down door type output as a seed. When present, puzzle blocks are placed pseudo-randomly in the same manner.
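The door-type ranges above amount to a simple threshold lookup. A minimal sketch (the function name and string labels are hypothetical; the boundary values are exactly those in the text):

```python
def door_type(value):
    """Map a CPPN door-type output in [-1, 1] to a door category,
    using the ranges given in the text."""
    if value <= 0:
        return "plain"          # [-1, 0]
    if value <= 0.25:
        return "puzzle-locked"  # (0, 0.25]
    if value <= 0.5:
        return "soft-locked"    # (0.25, 0.5]
    if value <= 0.75:
        return "bomb-able"      # (0.5, 0.75]
    return "locked"             # (0.75, 1.0]

assert door_type(-0.3) == "plain"
assert door_type(0.6) == "bomb-able"
```

Because half of the output range maps to plain doors, special door types appear only where the CPPN produces distinctly positive outputs, keeping them relatively rare.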

Raft preference is another addition absent in the previous CPPN2GAN paper [14], though rafts were included in the Graph GAN paper [6] that initially introduced this Rogue-like Zelda domain. A raft is a special item that allows Link to traverse a single water tile. This item is only available in one room in each dungeon, and must be retrieved before it can be used. The generated room with the highest raft preference value is the one in which the raft is pseudo-randomly placed. The addition of the raft item can greatly increase the amount of back-tracking required in some dungeons.

The final output for start/end preference determines which rooms are the start/end rooms of the dungeon. Across all rooms in the dungeon, the one whose start/end preference is smallest is the player’s starting room, and the one with the largest output is the final goal room, designated by the presence of a Triforce item.

This approach to generating complete levels is compared with the control approach described next.

C. Level Generation: Direct2GAN

Direct2GAN evolves levels consisting of S segments for a GAN expecting latent inputs of size Z by evolving real-valued genome vectors of length S × Z. Each genome is chopped into individual GAN inputs at level generation.

This approach requires a convention as to how different segments are combined into one level. For Mario’s linear levels, adjacent GAN inputs from the combined vector correspond to adjacent segments in the generated level. The combined vector is processed left to right to produce segments left to right.

To generate 2D Zelda dungeons, individual segments of the linear genome are mapped to a 2D grid in row-major order: processing the genome from left to right generates the top row from left to right, then moves to the next row down, and so on. For fair comparison with CPPN2GAN, each portion of a genome corresponding to a single room contains not only the latent vector inputs, but also the seven additional numbers for controlling global structure and connectivity: room presence, right door presence, down door presence, right door type, down door type, raft preference, and start/end preference. Therefore, an M × N room grid requires genomes of length M × N × (Z + 7).
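The row-major chopping of a Direct2GAN genome can be sketched as follows; the function name and the dictionary layout are hypothetical conveniences, but the chunk size Z + 7 and the ordering follow the text.

```python
def split_zelda_genome(genome, rooms_wide, rooms_high, z_size):
    """Chop a flat Direct2GAN genome into per-room chunks in row-major
    order. Each chunk holds a latent vector of length z_size plus the
    seven extra layout/connectivity values described in the text."""
    chunk = z_size + 7
    assert len(genome) == rooms_wide * rooms_high * chunk
    grid = []
    for row in range(rooms_high):
        grid.append([])
        for col in range(rooms_wide):
            start = (row * rooms_wide + col) * chunk
            part = genome[start:start + chunk]
            grid[row].append({"latent": part[:z_size],
                              "extra": part[z_size:]})
    return grid

# A 2 x 2 room grid with Z = 10 needs a genome of length 2*2*(10+7) = 68.
genome = list(range(2 * 2 * (10 + 7)))
grid = split_zelda_genome(genome, 2, 2, 10)
assert len(grid[0][0]["latent"]) == 10 and len(grid[1][1]["extra"]) == 7
```

The length arithmetic makes the scaling problem concrete: doubling the dungeon in both dimensions quadruples the genome, with every added value directly searched by evolution.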

Such massive genomes induce large search spaces that are difficult to search, but they have the benefit of easily allowing arbitrary variation in any area of the genome. Therefore, CPPNThenDirect2GAN was developed to take advantage of the benefits of both CPPN2GAN and Direct2GAN.

D. Level Generation: CPPNThenDirect2GAN

This hybrid approach is inspired by the HybrID algorithm [15]. HybrID was used with HyperNEAT [28], an indirect encoding that evolves CPPNs, which in turn define the weights of a pre-defined neural network architecture. The benefit of evolving with CPPNs is that they easily impose global patterns of symmetry and repetition. However, localized variation is harder for a CPPN to represent. In contrast, localized variation is easy to produce given a directly encoded genome that simply represents each network weight individually. HybrID begins evolution using CPPNs, so that useful global patterns can be easily found. However, at some point during evolution the CPPNs are discarded, leaving only the directly encoded collection of weights they produced to evolve further.

CPPNThenDirect2GAN works in a similar fashion. At initialization, all individuals in the population are CPPNs, and they are evolved as in CPPN2GAN. An additional mutation operator is introduced that switches a given individual to the Direct2GAN representation, which is just the output of the CPPN queried at all coordinates (as described above). This individual is then further evolved as described in the Direct2GAN approach. This operation has a low probability, since genomes can never switch back to an indirectly encoded CPPN once they switch to a direct vector format.
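The one-way switching mutation can be sketched as below. This is an illustrative approximation, not the paper's implementation: the genome representation, the `maybe_switch` and `query_cppn` names, and the 0.1 probability are all assumptions (the paper only says the probability is low); the random draw is passed in explicitly to keep the sketch deterministic.

```python
SWITCH_PROBABILITY = 0.1  # assumed value; the paper only says "low"

def maybe_switch(genome, query_cppn, coords, rand_value):
    """With probability SWITCH_PROBABILITY, flatten a CPPN genome into
    the direct vector genome formed by its outputs at every segment
    coordinate. The switch is one-way: vector genomes pass through."""
    if genome["kind"] == "cppn" and rand_value < SWITCH_PROBABILITY:
        flat = []
        for xy in coords:
            flat.extend(query_cppn(genome, xy))
        return {"kind": "vector", "values": flat}
    return genome

# Stand-in CPPN: ignores its genome, emits two values per coordinate.
toy_query = lambda g, xy: [float(xy[0]), float(xy[1])]
g = {"kind": "cppn"}
assert maybe_switch(g, toy_query, [(0, 0), (1, 0)], 0.01)["kind"] == "vector"
assert maybe_switch(g, toy_query, [(0, 0)], 0.9)["kind"] == "cppn"
```

Because the switched genome is exactly the vector the CPPN would have produced, the phenotype is unchanged at the moment of switching; only the space of reachable mutations changes.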

In HybrID [15], the transition from indirect to direct encoding occurred at a particular generation. In contrast, our hybrid approach switches individuals based on random chance and thus allows for mixed populations, which makes more sense when using MAP-Elites to evolve a diverse population.

V. EXPERIMENTS

Demonstrating the expressive range of new game level encodings is important. Therefore, the Quality Diversity algorithm MAP-Elites [17], which divides the search space into phenotypically distinct bins, is used with each approach.


A. MAP-Elites

Instead of only optimizing towards an objective, as in standard evolutionary algorithms, MAP-Elites (Multi-dimensional Archive of Phenotypic Elites [17]) collects a diversity of quality artefacts that differ along N predefined dimensions. MAP-Elites discretizes the space of artefacts into bins and, given some objective, maintains the highest performing individual for each bin in the N-dimensional behavior space.

Our implementation starts by generating an initial population of 100 random individuals that are placed in bins based on their attributes. Each bin only holds one individual, so individuals with higher fitness replace less fit individuals. Once the initial population is generated, solutions are uniformly sampled from the bins and undergo crossover and/or mutation to generate new individuals. These newly created individuals also replace less fit individuals as appropriate, or end up occupying new bins, filling out the range of possible designs. Our experiments generate 100,000 individuals per run after the initial population is generated.
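The loop just described can be condensed into a minimal MAP-Elites sketch. This is a simplified stand-in for the actual implementation: the function arguments are hypothetical, crossover is omitted (mutation only), and the toy fitness/binning at the bottom exists purely to exercise the loop.

```python
import random

def map_elites(random_genome, mutate, evaluate, bin_of,
               initial=100, evaluations=100000, rng=random):
    """Minimal MAP-Elites sketch: seed with random individuals, then
    repeatedly sample an occupied bin uniformly, mutate its elite, and
    keep the fitter of the old and new individual in each bin."""
    archive = {}  # bin key -> (fitness, genome)

    def consider(genome):
        fitness = evaluate(genome)
        key = bin_of(genome)
        if key not in archive or fitness > archive[key][0]:
            archive[key] = (fitness, genome)

    for _ in range(initial):
        consider(random_genome())
    for _ in range(evaluations):
        _, parent = archive[rng.choice(list(archive))]
        consider(mutate(parent))
    return archive

# Toy run: genomes are floats, fitness is the value, bins its integer part.
rng = random.Random(1)
archive = map_elites(lambda: rng.uniform(0, 5),
                     lambda g: g + rng.uniform(-1, 1),
                     evaluate=lambda g: g,
                     bin_of=lambda g: int(g),
                     initial=20, evaluations=200, rng=rng)
assert len(archive) >= 1
```

The archive doubles as both the population and the result: its coverage (number of occupied bins) and the sum of elite fitnesses (the QD score) are exactly the quantities compared between encodings in the experiments.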

There are many ways of defining a binning scheme to characterize diversity in a domain. The initial study introducing CPPN2GAN [14] used one binning scheme with each game, but because the performance of a given encoding depends on the binning scheme, this paper uses two different binning schemes for each game to better explore the trade-offs between the encodings being studied. A fitness/quality measure is also required, but is kept consistent in each of the games studied.

B. Dimensions of Variation Within Levels

Two binning schemes were used on Mario levels: one from previous work [14] and one new to this article. Both are based on measurements of three quantities: decoration frequency, space coverage, and leniency. These measures were inspired by a study on evaluation measures for platformer games [29]. Each measure expresses different characteristics of a level:

• decoration frequency: Percentage of non-standard tiles4

• space coverage: Percentage of tiles Mario can stand on5

• leniency: Average of leniency values6 across all tiles. Enemies/gaps are negative, power-ups are positive.

All measures focus on visual characteristics, but also relate to how a player can navigate through a level. The previous binning scheme [14] calculated scores for individual segments (10 per level) and then summed across the segments.

Preliminary experiments uncovered reasonable ranges for binning, allowing each dimension to be discretized into 10 equally sized intervals. Leniency has both negative and positive values, so its scores are evenly divided into 5 negative bins and 5 non-negative bins. Negative bins correspond to greater challenges, and non-negative bins correspond to easier levels. This binning scheme is referred to as Sum DSL.
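Discretizing a summed score into 10 equally sized intervals can be sketched as follows. The concrete ranges found in the preliminary experiments are not reproduced here, so `lo` and `hi` are assumed bounds:

```python
def bin_index(score, lo, hi, num_bins=10):
    """Map a score in the assumed range [lo, hi] to one of num_bins
    equal-width bins, clamping scores that fall outside the range."""
    if score <= lo:
        return 0
    if score >= hi:
        return num_bins - 1
    return int((score - lo) / (hi - lo) * num_bins)
```

For leniency, the same idea applies separately to the negative and non-negative halves of the observed range, yielding 5 bins on each side.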

However, one way to hit a target bin in Sum DSL is to repeat a segment with appropriate properties, without variation, 10 times. This type of level might be boring to play, and

4: breakable tiles, question blocks, pipes, all enemies.
5: solid and breakable tiles, question blocks, pipes, and bullet bills.
6: 1: question blocks; -0.5: pipes, bullet bills, gaps in the ground; -1: moving enemies; 0: remaining tiles.

CPPNs have an advantage at generating this type of level, so an alternate binning scheme encouraging more segment diversity within levels was developed: Distinct ASAD.

Distinct ASAD uses alternating space coverage, alternating decoration frequency, and the number of distinct segments, which is a count of segments that skips repeats. Distinctness directly encourages variation, though two segments differing by even a single tile are considered distinct. The alternating dimensions measure how much their quantities fluctuate from segment to segment. Specifically, if S(i) calculates a given score value (e.g. space coverage) for the i-th segment in the level, then the alternating version of that score is:

AS = \sum_{i=1}^{9} |S(i-1) - S(i)|    (1)

Using this formula to distinguish levels from each other encourages more variation in the segments within each level. Both alternating scores are discretized into 10 intervals.
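Equation 1 can be computed directly from the per-segment scores. This sketch assumes the scores S(0)..S(9) for a 10-segment level are already available as a list:

```python
def alternating_score(segment_scores):
    """Alternating version of a per-segment score (Eq. 1): the sum of
    absolute differences between consecutive segment scores."""
    return sum(abs(segment_scores[i - 1] - segment_scores[i])
               for i in range(1, len(segment_scores)))
```

A level built from one repeated segment scores 0, while a level whose segments fluctuate strongly from one to the next scores high, which is exactly what this dimension is meant to reward.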

For both binning schemes, fitness is the length of the shortest path to beat the level. Maximizing path length favors levels that require jumps, the main mechanic of the game. If no path can be found, the level is deemed unsolvable and receives a fitness of 0. To determine the path, A* search is performed on the tile-based representation of the level, with a heuristic encouraging movement to the right.

Zelda also uses a binning scheme from previous work [14] and a new one. The old scheme, WWR, is based on water tile percentage, wall tile percentage, and the number of reachable rooms. A room is reachable if it is the start room, or if a door connects it to a reachable room. This definition is cheap to compute, but ignores how single rooms can be impassable. Water and wall tile percentages are calculated only with respect to reachable rooms, and only for the 12×7 floor regions of rooms (surrounding walls ignored). Bins for these dimensions are divided into 10% ranges (10 bins per dimension). Some bins are impossible to fill, because the sum of water and wall percentages must be less than 100%, since floor tiles occupy additional space. For the number of reachable rooms, there is a bin for each number out of 25 (the maximum possible in a 5×5 grid). Note that in the previous paper [14], experiments were conducted with 100-room dungeons in a 10×10 grid. The experiments in this paper were repeated due to changes in the Zelda level-generation approach mentioned in Section IV-B, but were done with smaller dungeons to reduce computational cost. However, all results presented here are consistent with those from the previous paper.
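The reachability definition (the start room, plus any room connected by a door to a reachable room) amounts to a flood fill over the door graph. A sketch, assuming doors are represented as an adjacency map from each room to its connected rooms:

```python
from collections import deque

def reachable_rooms(doors, start):
    """Rooms reachable from the start room via doors (breadth-first
    flood fill). doors: dict mapping room -> iterable of connected
    rooms (an assumed representation, not the paper's data structure)."""
    seen = {start}
    queue = deque([start])
    while queue:
        room = queue.popleft()
        for neighbor in doors.get(room, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen
```

As the text notes, this check is cheap because it never simulates movement within a room, which is why an individually impassable room can still count as reachable.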

The new binning scheme for Zelda is Distinct BTR. It encourages more variety in the rooms within each dungeon, and encourages different types of paths through dungeons. The specific bin dimensions are the number of distinct rooms, the number of backtracked rooms in the A* solution path, and the number of reachable rooms (as before). The backtracking dimension deserves some elaboration.

In commercial games, some dungeons can be easily traversed in a single pass, but others require the player to return to previously visited areas multiple times. For example, the player may need to find a key before backtracking to a room


with a locked door, or find a special item (like the raft) before traversing a particular obstacle. Some players enjoy this type of backtracking, while others can find it frustrating, which makes it a good dimension of variation.

The backtracking score is calculated by following the A* path and marking each room the player exits. Exiting a room means another room is being entered. Whenever a newly entered room exists in the set of previously exited rooms, a counter is incremented to measure the amount of backtracking required by the optimal path. If the path revisits a room multiple times, each revisit increments the backtracking count.
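The backtracking counter described above can be sketched as follows, assuming the A* solution path is available as an ordered sequence of room identifiers:

```python
def backtracking_score(path):
    """Count re-entries into previously exited rooms along a solution
    path. path: ordered sequence of room identifiers (assumed format)."""
    exited = set()
    count = 0
    for prev, cur in zip(path, path[1:]):
        exited.add(prev)       # leaving prev means entering cur
        if cur in exited:      # re-entering a previously exited room
            count += 1
    return count
```

A single-pass dungeon scores 0; a path that shuttles back and forth between two rooms increments the count on every revisit, matching the description above.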

The fitness for Zelda dungeons is the percentage of reachable rooms traversed by the A* path from start to goal. The objective is to maximize the number of rooms visited, as exploring is one of the main mechanics of the game. If no path can be found, the dungeon is deemed unsolvable and receives a fitness of 0. The A* heuristic used is Manhattan distance to the goal. Since the inclusion of keys makes the state space very large, there is a computation budget of 100,000 states.

C. Evolution Details

CPPN2GAN levels are evolved with a variant of NEAT [19] (Section II-A), specifically MM-NEAT7. Because CPPNs are being evolved, every neuron in each network can have a different activation function from the following list: sawtooth wave, linear piecewise, identity, square wave, cosine, sine, sigmoid, Gaussian, triangle wave, and absolute value.

Whenever a new network is generated, it has a 50% chance of being the offspring of two parents rather than a clone. The resulting network then has a 20% chance of having a new node spliced in, a 40% chance of creating a new link, and a 30% chance of randomly replacing one neuron's activation function. There is a per-link perturbation rate of 5%.

For Direct2GAN, real-valued vectors are initialized with random values in the range [−1, 1]. When offspring are produced, there is a 50% chance of single-point crossover. Otherwise, the offspring is a clone of one parent. Either way, each real number in the vector then has an independent 30% chance of polynomial mutation [30].
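A sketch of the Direct2GAN mutation step, using the standard polynomial mutation operator [30]. The distribution index `eta` is an assumed value, since the paper does not report it:

```python
import random

def polynomial_mutation(x, lo=-1.0, hi=1.0, eta=20.0):
    """One polynomial mutation step [30] for a single gene; eta is an
    assumed distribution index, not a value from the paper."""
    u = random.random()
    if u < 0.5:
        delta = (2 * u) ** (1 / (eta + 1)) - 1
    else:
        delta = 1 - (2 * (1 - u)) ** (1 / (eta + 1))
    # Scale the perturbation by the gene range and clamp to bounds.
    return min(hi, max(lo, x + delta * (hi - lo)))

def mutate_vector(vec, rate=0.3):
    """Each gene is independently mutated with 30% probability."""
    return [polynomial_mutation(x) if random.random() < rate else x
            for x in vec]
```

Larger `eta` values concentrate offspring near the parent, so the choice of index trades off local refinement against exploration.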

When using CPPNThenDirect2GAN, all genomes start as CPPNs. When bins are randomly sampled to generate offspring, either a CPPN or a directly encoded vector could be selected. CPPNs have a 30% chance of being converted into directly encoded vectors. This procedure generates a directly encoded genome that represents the exact same level previously encoded by the CPPN. However, the newly generated vector genome immediately undergoes the mutation for real-valued vectors described above, so it will only persist in the archive if the resulting level is an elite. Genomes that are not converted are exposed to the standard mutation probabilities for their encoding as described above. Whenever crossover occurs, parents mate in the usual fashion if they are of the same type, but if two parents of different types are selected, then the crossover operation is cancelled and the first parent is simply mutated to create a new offspring.
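The hybrid offspring logic can be summarized as follows. This is schematic only: the helper functions are assumptions standing in for MM-NEAT internals, and crossover handling is omitted:

```python
import random

def hybrid_offspring(parent, cppn_to_vector, mutate_cppn, mutate_vector,
                     convert_prob=0.3):
    """Offspring generation in the hybrid encoding (illustrative).

    parent: ('cppn', genome) or ('vector', genome).
    cppn_to_vector: assumed helper that queries the CPPN at each segment
    coordinate, flattening it into the latent vectors it would output,
    so the converted genome encodes the exact same level."""
    kind, genome = parent
    if kind == 'cppn' and random.random() < convert_prob:
        # One-way conversion: the new vector genome is mutated
        # immediately, so it only persists if the result is an elite.
        return ('vector', mutate_vector(cppn_to_vector(genome)))
    if kind == 'cppn':
        return ('cppn', mutate_cppn(genome))
    return ('vector', mutate_vector(genome))
```

Note the asymmetry: conversion only goes from CPPN to vector, which is why all genomes must start as CPPNs for the indirect encoding to contribute its global structure first.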

7: https://github.com/schrum2/MM-NEAT

VI. RESULTS

This section highlights our most relevant results, but an online appendix contains additional result figures and sample evolved levels: https://southwestern.edu/~schrum2/SCOPE/cppn-then-direct-to-gan.php

Fig. 2 shows the average QD score across 30 runs of each genome encoding for each possible binning scheme in the two games. QD score [31] is the sum of the fitness scores of all elites in the archive, and gives an indication of both the coverage and quality of solutions. Fig. 8 (appendix) shows the average number of bins filled by each encoding, and is qualitatively similar. In Mario, CPPN2GAN and CPPNThenDirect2GAN are statistically tied in terms of filled bins and QD score, but are significantly better than Direct2GAN (p < 0.05) for both binning schemes. Performance in Zelda depends more on the binning scheme. In the WWR scheme from previous work, CPPN2GAN is slightly better than CPPNThenDirect2GAN in terms of filled bins and QD score (p < 0.05), though CPPNThenDirect2GAN almost catches up in terms of QD score. Both are far better than Direct2GAN in both metrics (p < 0.05). For the new Distinct BTR scheme, CPPNThenDirect2GAN is significantly better than both CPPN2GAN and Direct2GAN in terms of filled bins and QD score (p < 0.05). CPPN2GAN and Direct2GAN are statistically tied in terms of filled bins, but Direct2GAN actually has the better QD score (p < 0.05).
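QD score as defined here is simple to compute from a final archive. A sketch, assuming the archive maps bin keys to (fitness, genome) pairs:

```python
def qd_score(archive):
    """QD score [31]: the sum of the fitness of every elite in the
    archive, rewarding both coverage (more bins) and quality."""
    return sum(fitness for fitness, _genome in archive.values())
```

Because every occupied bin contributes its elite's fitness, an encoding can raise its QD score either by filling more bins or by improving the elites in bins it already occupies.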

Fig. 3 analyzes the final archives for Sum DSL. These results reaffirm the dominance of CPPN2GAN over Direct2GAN [14], and demonstrate the qualitative similarity between CPPN2GAN and the new CPPNThenDirect2GAN. For this binning scheme, CPPNs are generally superior, but the new hybrid CPPNThenDirect2GAN approach is not hindered by also producing directly encoded genomes. Average archive heat maps for each method are in Fig. 9 (appendix).

Fig. 4 shows results for Distinct ASAD. Direct2GAN performs even worse in terms of coverage, and there is no difference in bin occupancy between CPPN2GAN and CPPNThenDirect2GAN. However, CPPNThenDirect2GAN and plain Direct2GAN more frequently produce the highest quality level as the number of distinct segments increases. Average archive heat maps are in Fig. 10 (appendix).

Fig. 5 shows results for WWR. Although these dungeons are smaller than those from previous work [14], the results are consistent with the previous paper: CPPN2GAN is better than Direct2GAN. CPPNThenDirect2GAN is comparable to CPPN2GAN, but there are bins with many reachable rooms where CPPN2GAN is represented and CPPNThenDirect2GAN is not. Furthermore, in some bins with few reachable rooms, CPPNThenDirect2GAN is not represented. Direct2GAN is the method most underrepresented across all bins, but sometimes has the highest quality solutions. However, CPPNThenDirect2GAN most often has the best solutions in bins with a large number of reachable rooms. Average heat maps are in Fig. 11 (appendix).

Finally, Fig. 6 compares methods using Distinct BTR. This comparison shows that although CPPNThenDirect2GAN occupies more bins than either Direct2GAN or CPPN2GAN,


Fig. 2: Average QD Score Across 30 Runs of MAP-Elites. Panels: (a) Mario: Sum DSL; (b) Mario: Distinct ASAD; (c) Zelda: WWR; (d) Zelda: Distinct BTR. For the two MAP-Elites binning schemes used in Mario and Zelda, plots of the average QD Score with 95% confidence intervals demonstrate the comparative performance of the three encoding schemes. CPPNThenDirect2GAN is always the best or statistically tied for best (eventually catching up to CPPN2GAN in WWR). Direct2GAN is always worst, except in Distinct BTR, where it beats CPPN2GAN but is still inferior to CPPNThenDirect2GAN.

Fig. 3: MAP-Elites Archive Comparisons Across 30 Runs of Evolution in Mario Using Sum DSL. Panels: (a) Difference in Bin Occupancy; (b) Best Occupant. Each sub-grid represents a different leniency score (bins −5 to 4). Within each sub-grid, summed decoration frequency increases to the right, and summed space coverage increases moving up. (a) Color coding shows whether any of the 30 runs of each method produced an occupant for each bin. Direct2GAN leaves many bins completely absent, but CPPN2GAN and CPPNThenDirect2GAN have similar coverage. (b) The average bin quality scores across 30 runs of each method were calculated, and the method with the best average quality is indicated for each bin. Direct2GAN was never the best, though CPPN2GAN and CPPNThenDirect2GAN have a comparable number and spread of best bins, and tie for best in some cases (indicated by "CPPN Tie").

each method occupies some bins that the others do not. However, CPPNThenDirect2GAN quality is superior to CPPN2GAN in nearly every bin where both are present, and superior in quality to Direct2GAN in some cases as well. Average heat maps are in Fig. 12 (appendix).

Among the generated levels, the global patterns in Zelda dungeons from CPPN2GAN and CPPNThenDirect2GAN stand out immediately. There is often a symmetrical or repeating motif in these dungeons that is missing in results from Direct2GAN. Even when all the rooms in a CPPN-generated dungeon are distinct, there is often a theme throughout the dungeon, such as a water theme, or reuse of maze-like wall obstacles. Some examples are in Fig. 7, and many more are in the appendix (Fig. 13).

Mario levels produced by CPPNs tend to have too much repetition, except when the number of distinct segments is an explicit dimension of variation. When encouraged to be distinct, CPPN levels have more variety, yet maintain a theme of similar elements. An example of a repeated theme is an arrangement of blocks that presents a particular jumping challenge, but requires a slightly different approach on each occurrence due to small variations. See appendix (Fig. 14).

VII. CONCLUSIONS AND FUTURE WORK

Generative Adversarial Networks (GANs) have shown impressive results as generators for high-quality images. However, combining multiple GAN-generated patterns into a cohesive whole, which is especially relevant for level generation in games, was until recently an unexplored area of research. Our previous work [14] presented the first method that can create large-scale game levels through the combination of a Compositional Pattern Producing Network and a GAN. One of the main insights is that there is a functional relationship between the latent vectors of different game segments, which the CPPN is able to exploit. In comparison to a direct representation of multiple latent vectors, the CPPN2GAN approach can generate a larger variety of different complete game levels. CPPNThenDirect2GAN goes one step further to incorporate strengths of both approaches.

Our original work showed that CPPN2GAN results often contained repeated segments, and those findings are replicated here. However, the additional experiments in this paper demonstrate that CPPN2GAN can generate diverse segments within a single level if distinct segments are a dimension of variation for MAP-Elites. CPPN2GAN levels can easily


Fig. 4: MAP-Elites Archive Comparisons Across 30 Runs of Evolution in Mario Using Distinct ASAD. Panels: (a) Difference in Bin Occupancy; (b) Best Occupant. Methods are compared as in Fig. 3, but with Distinct ASAD. Each sub-grid now represents the count of distinct segments (1 to 10). In each sub-grid, alternating decoration score increases to the right, and alternating space coverage score increases moving up. The upper-left grid is mostly empty, since it corresponds to levels with only one repeated segment. When the same segment is repeating, decoration and space coverage scores cannot alternate. (a) Direct2GAN is missing from many bins, but CPPN2GAN and CPPNThenDirect2GAN have identical coverage. (b) The highest scoring levels for fewer distinct segments mostly come from CPPN2GAN, though some CPPNThenDirect2GAN results are mixed in. As the number of distinct segments increases, CPPNThenDirect2GAN and then Direct2GAN become more prominent.

Fig. 5: MAP-Elites Archive Comparisons Across 30 Runs of Evolution in Zelda Using WWR. Panels: (a) Difference in Bin Occupancy; (b) Best Occupant. The three methods are compared the same way as in Mario. Each triangular grid corresponds to levels with a particular number of reachable rooms (1 to 25, labeled at the top-right of the grid). Wall percentage increases upwards and water percentage increases to the right. Grids are triangular because there is a trade-off between water percentage and wall percentage (their sum cannot exceed 100%). (a) Direct2GAN cannot produce levels with only one reachable room, but otherwise has high representation when the number of reachable rooms is smaller and less representation when the number is larger. CPPN2GAN is mostly comparable to CPPNThenDirect2GAN. However, CPPNThenDirect2GAN fails to reach certain bins that CPPN2GAN reaches when the number of reachable rooms is high, and is even sometimes beaten by Direct2GAN when the number of reachable rooms is small. (b) CPPNThenDirect2GAN makes up for less coverage with higher quality scores in many bins, particularly as the number of reachable rooms grows. Bins with a "Not Hybrid Tie" represent a tie between CPPN2GAN and Direct2GAN. For certain small numbers of reachable rooms there are bins where all three methods tie.

Page 9: Hybrid Encoding For Generating Large Scale Game Level

9

Fig. 6: MAP-Elites Archive Comparisons Across 30 Runs of Evolution in Zelda Using Distinct BTR. Panels: (a) Difference in Bin Occupancy; (b) Best Occupant. Methods are compared as in Fig. 5, but with Distinct BTR. The number by each sub-grid (1 to 25) still represents the number of reachable rooms. Backtracking increases to the right and distinct rooms increases moving up. When there are fewer reachable rooms, less backtracking is possible, hence the changing sub-grid width. (a) Dark blue regions show bins that only CPPNThenDirect2GAN reaches, though each method has some bins to itself. Direct2GAN backtracks well in dungeons with many distinct rooms, but CPPN2GAN and CPPNThenDirect2GAN dominate bins for smaller numbers of distinct rooms. (b) CPPN2GAN almost never produces the best result for a bin. There are regions where Direct2GAN does best. However, CPPNThenDirect2GAN is a clear winner in terms of the number of best bin occupants, especially as the number of reachable rooms increases. In some rare bins with a "Direct Tie", Direct2GAN and CPPNThenDirect2GAN are both the best.

Fig. 7: Evolved Zelda Levels. (a) Direct2GAN generates levels with unique rooms and structure, but struggles to form a cohesive pattern. (b) CPPN2GAN generates levels with symmetry or other global patterns. (c) CPPNThenDirect2GAN generates levels with interesting overall patterns like CPPN2GAN, but can tweak individual rooms like Direct2GAN to increase overall fitness.

evolve any number of distinct segments within a level, but Direct2GAN has trouble repeating segments within a level.

The new hybrid approach introduced in this paper is CPPNThenDirect2GAN. First, CPPN2GAN introduces global structure and imposes regular patterns in the level. After mutation, Direct2GAN can take over to fine-tune levels and/or introduce more local variety. For the binning schemes in Mario, this approach is comparable to CPPN2GAN, which already fills many bins in the archive. However, CPPNThenDirect2GAN produced interesting results in Zelda, particularly

in the new Distinct BTR scheme, where it outperformed both Direct2GAN and CPPN2GAN. Specifically, CPPNThenDirect2GAN filled many bins that neither of the other methods filled, and had the highest quality levels in many bins as well.

However, Direct2GAN still managed to reach some bins that CPPNThenDirect2GAN could not. These bins could perhaps be reached if CPPNThenDirect2GAN genomes were not forced to start out as CPPNs. Starting as a CPPN may introduce a bias towards patterns that is so strong that subsequent direct vector manipulation has trouble breaking the patterns.


Therefore, the initial population could simply be a combination of CPPN and Direct genomes. However, there would always be a risk of one type replacing the other in any given bin. This is problematic if it eliminates useful stepping stones. Certain bins might only be easily reachable via a Direct or CPPN genome. Our results further show that it is the hybridization that makes some levels discoverable, as they are not present in the combined archives of either CPPN2GAN or Direct2GAN.

Unfortunately, there is no easy way to convert a Direct genome back into a CPPN genome. Furthermore, Direct genomes lose the ability to scale to arbitrary sizes, a major benefit of CPPNs. In fact, CPPN2GAN could enable levels in which components are lazily generated as needed, in levels that never stop growing. This would be especially useful for exploration games, as new segments (e.g. dungeon rooms) can be served by CPPN2GAN whenever new areas of the map are discovered by the player. Evaluating this special scenario, and finding a way to incorporate the benefits of directly encoded components, is an interesting area for future work. Further work is also planned on characterizing different binning schemes and their relationship to the performance of different level generators.

Besides different combination approaches, modifications to the training process are also possible. Here, the GAN was pre-trained and only the CPPNs were evolved. However, it should be possible to make use of a discriminator to also decide whether global patterns (such as Zelda room structure) are similar to the original levels. This would mean adversarial training of the complete CPPN2GAN network against a discriminator. Training could be end-to-end, or could alternate between the CPPN and the generator. The resulting samples should be able to reproduce both global and local patterns in complete game levels rather than just in individual segments.

ACKNOWLEDGMENT

The authors thank Schloss Dagstuhl and the organisers of Dagstuhl Seminars 17471 and 19511 for hosting productive seminars. The authors also thank the SCOPE program at Southwestern University for supporting continuation of this research with undergraduate researchers.

REFERENCES

[1] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative Adversarial Nets," in Advances in Neural Information Processing Systems, 2014.

[2] A. Brock, J. Donahue, and K. Simonyan, "Large Scale GAN Training For High Fidelity Natural Image Synthesis," arXiv:1809.11096, 2018.

[3] V. Volz, J. Schrum, J. Liu, S. M. Lucas, A. M. Smith, and S. Risi, "Evolving Mario Levels in the Latent Space of a Deep Convolutional Generative Adversarial Network," in Genetic and Evolutionary Computation Conference. ACM, 2018.

[4] K. Park, B. W. Mott, W. Min, K. E. Boyer, E. N. Wiebe, and J. C. Lester, "Generating Educational Game Levels with Multistep Deep Convolutional Generative Adversarial Networks," in Conference on Games. IEEE, 2019, pp. 1–8.

[5] R. R. Torrado, A. Khalifa, M. C. Green, N. Justesen, S. Risi, and J. Togelius, "Bootstrapping Conditional GANs for Video Game Level Generation," in Conference on Games. IEEE, 2020.

[6] J. Gutierrez and J. Schrum, "Generative Adversarial Network Rooms in Generative Graph Grammar Dungeons for The Legend of Zelda," in Congress on Evolutionary Computation. IEEE, 2020.

[7] K. O. Stanley, "Compositional Pattern Producing Networks: A Novel Abstraction of Development," Genetic Programming and Evolvable Machines, vol. 8, no. 2, pp. 131–162, 2007.

[8] J. Secretan, N. Beato, D. B. D'Ambrosio, A. Rodriguez, A. Campbell, J. T. Folsom-Kovarik, and K. O. Stanley, "Picbreeder: A Case Study in Collaborative Evolutionary Exploration of Design Space," Evolutionary Computation, vol. 19, no. 3, pp. 373–403, 2011.

[9] A. K. Hoover, P. A. Szerlip, M. E. Norton, T. A. Brindle, Z. Merritt, and K. O. Stanley, "Generating a Complete Multipart Musical Composition from a Single Monophonic Melody with Functional Scaffolding," in International Conference on Computational Creativity, 2012.

[10] J. Clune and H. Lipson, "Evolving Three-dimensional Objects with a Generative Encoding Inspired by Developmental Biology," in European Conference on Artificial Life, 2011, pp. 141–148.

[11] D. Cellucci, R. MacCurdy, H. Lipson, and S. Risi, "1D Printing of Recyclable Robots," IEEE Robotics and Automation Letters, vol. 2, no. 4, 2017.

[12] E. J. Hastings, R. K. Guha, and K. O. Stanley, "Automatic Content Generation in the Galactic Arms Race Video Game," IEEE Transactions on Computational Intelligence and AI in Games, vol. 1, no. 4, 2009.

[13] I. Tweraser, L. E. Gillespie, and J. Schrum, "Querying Across Time to Interactively Evolve Animations," in Genetic and Evolutionary Computation Conference. ACM, 2018.

[14] J. Schrum, V. Volz, and S. Risi, "CPPN2GAN: Combining Compositional Pattern Producing Networks and GANs for Large-scale Pattern Generation," in Genetic and Evolutionary Computation Conference. ACM, 2020.

[15] J. Clune, B. E. Beckmann, R. T. Pennock, and C. Ofria, "HybrID: A Hybridization of Indirect and Direct Encodings for Evolutionary Computation," in European Conference on Artificial Life, 2009.

[16] N. Shaker, G. Smith, and G. N. Yannakakis, "Evaluating Content Generators," in Procedural Content Generation in Games: A Textbook and an Overview of Current Research, N. Shaker, J. Togelius, and M. J. Nelson, Eds. Springer, 2016, pp. 215–224.

[17] J. Mouret and J. Clune, "Illuminating Search Spaces by Mapping Elites," arXiv:1504.04909, 2015.

[18] S. Risi, J. Lehman, D. B. D'Ambrosio, R. Hall, and K. O. Stanley, "Combining Search-Based Procedural Content Generation and Social Gaming in the Petalz Video Game," in Artificial Intelligence and Interactive Digital Entertainment, 2012.

[19] K. O. Stanley and R. Miikkulainen, "Evolving Neural Networks Through Augmenting Topologies," Evolutionary Computation, 2002.

[20] D. Ha, A. Dai, and Q. V. Le, "HyperNetworks," in International Conference on Learning Representations, 2017.

[21] C. Fernando, D. Banarse, M. Reynolds, F. Besse, D. Pfau, M. Jaderberg, M. Lanctot, and D. Wierstra, "Convolution by Evolution: Differentiable Pattern Producing Networks," in Genetic and Evolutionary Computation Conference, 2016, pp. 109–116.

[22] B. G. Woolley and K. O. Stanley, "On the Deleterious Effects of A Priori Objectives on Evolution and Representation," in Genetic and Evolutionary Computation Conference. ACM, 2011, pp. 957–964.

[23] C. H. Lin, C.-C. Chang, Y.-S. Chen, D.-C. Juan, W. Wei, and H.-T. Chen, "COCO-GAN: Generation by Parts via Conditional Coordinating," in IEEE/CVF International Conference on Computer Vision, 2019.

[24] P. Bontrager, J. Togelius, and N. Memon, "DeepMasterPrint: Generating Fingerprints for Presentation Attacks," arXiv:1705.07386, 2017.

[25] A. J. Summerville, S. Snodgrass, M. Mateas, and S. Ontanon, "The VGLC: The Video Game Level Corpus," in Procedural Content Generation in Games. ACM, 2016.

[26] V. Volz, "Uncertainty Handling in Surrogate Assisted Optimisation of Games," Ph.D. dissertation, TU Dortmund University, Germany, 2019.

[27] M. Arjovsky, S. Chintala, and L. Bottou, "Wasserstein Generative Adversarial Networks," in International Conference on Machine Learning, 2017.

[28] K. O. Stanley, D. B. D'Ambrosio, and J. Gauci, "A Hypercube-based Encoding for Evolving Large-scale Neural Networks," Artificial Life, 2009.

[29] A. Summerville, J. R. H. Marino, S. Snodgrass, S. Ontanon, and L. H. S. Lelis, "Understanding Mario: An Evaluation of Design Metrics for Platformers," in Foundations of Digital Games. ACM, 2017.

[30] K. Deb and R. B. Agrawal, "Simulated Binary Crossover for Continuous Search Space," Complex Systems, vol. 9, no. 2, pp. 115–148, 1995.

[31] J. K. Pugh, L. B. Soros, P. A. Szerlip, and K. O. Stanley, "Confronting the Challenge of Quality Diversity," in Genetic and Evolutionary Computation Conference. ACM, 2015, pp. 967–974.


APPENDIX

Fig. 8: Average Filled Bins Across 30 Runs of MAP-Elites. Panels: (a) Mario: Sum DSL; (b) Mario: Distinct ASAD; (c) Zelda: WWR; (d) Zelda: Distinct BTR. For the two distinct MAP-Elites binning schemes used in both Mario and Zelda, plots of the average number of bins filled, with 95% confidence intervals, demonstrate the comparative performance of the three encoding schemes. For both binning schemes used in Mario, CPPN-based approaches are vastly superior to Direct2GAN. CPPN2GAN and CPPNThenDirect2GAN are statistically tied, though CPPNThenDirect2GAN has larger confidence intervals for the Distinct ASAD binning scheme. There is a starker contrast in the Zelda results. Plain CPPN2GAN is slightly superior to CPPNThenDirect2GAN for the WWR binning scheme, though both approaches are vastly superior to Direct2GAN. For the Distinct BTR scheme, CPPNThenDirect2GAN is the best, with plain CPPN2GAN and Direct2GAN tied for worst. Although the different approaches have their strengths and weaknesses, which are dependent on the binning scheme, it is clear that using some form of CPPN-based encoding leads to the best results.


[Heatmap: Solution Path Length (0–500) over Space Coverage Bin vs. Decoration Frequency Bin, one sub-grid per Leniency Bin (−5 to 4)]

(a) Direct2GAN

[Heatmap: Solution Path Length (0–500) over Space Coverage Bin vs. Decoration Frequency Bin, one sub-grid per Leniency Bin (−5 to 4)]

(b) CPPN2GAN

[Heatmap: Solution Path Length (0–500) over Space Coverage Bin vs. Decoration Frequency Bin, one sub-grid per Leniency Bin (−5 to 4)]

(c) CPPNThenDirect2GAN

[Heatmap: Solution Path Length (0–500) over Space Coverage Bin vs. Decoration Frequency Bin, one sub-grid per Leniency Bin (−5 to 4)]

(d) Max of Direct2GAN and CPPN2GAN

Fig. 9: Average MAP-Elites Bin Fitness Across 30 Runs of Evolution in Mario Using Sum DSL. The first three figures average fitness scores across 30 runs within each bin. Each sub-grid represents a different leniency score. Within each sub-grid, summed decoration frequency increases to the right, and summed space coverage increases moving up. (a) Direct2GAN leaves many bins completely absent. (b) CPPN2GAN and (c) CPPNThenDirect2GAN are comparable to each other, and fill in many of the bins left empty by Direct2GAN. (d) The final figure shows the result of combining Direct2GAN with CPPN2GAN and taking the best average result for each bin. The combined results mostly depend on CPPN2GAN, which is very similar to CPPNThenDirect2GAN.
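The combination in panel (d) simply keeps, for each bin, the better of the two encodings' scores. A minimal sketch, assuming each archive is a dictionary from bin keys to average fitness (a hypothetical representation, not the authors' actual code):

```python
def combine_archives(a, b):
    """Bin-wise max of two MAP-Elites archives: a bin present in only
    one archive is kept as-is; a bin present in both keeps the higher
    score. The result therefore covers the union of the filled bins."""
    combined = dict(a)
    for key, score in b.items():
        if key not in combined or score > combined[key]:
            combined[key] = score
    return combined
```

Applied to the per-bin averages of Direct2GAN and CPPN2GAN, this yields the combined grid shown in (d).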


[Heatmap: Solution Path Length (0–500) over Alternating Space Coverage Bin vs. Alternating Decoration Bin, one sub-grid per Distinct Segments count (1 to 10)]

(a) Direct2GAN

[Heatmap: Solution Path Length (0–500) over Alternating Space Coverage Bin vs. Alternating Decoration Bin, one sub-grid per Distinct Segments count (1 to 10)]

(b) CPPN2GAN

[Heatmap: Solution Path Length (0–500) over Alternating Space Coverage Bin vs. Alternating Decoration Bin, one sub-grid per Distinct Segments count (1 to 10)]

(c) CPPNThenDirect2GAN

[Heatmap: Solution Path Length (0–500) over Alternating Space Coverage Bin vs. Alternating Decoration Bin, one sub-grid per Distinct Segments count (1 to 10)]

(d) Max of Direct2GAN and CPPN2GAN

Fig. 10: Average MAP-Elites Bin Fitness Across 30 Runs of Evolution in Mario Using Distinct ASAD. All three methods are compared as in Fig. 9, but using Distinct ASAD. Each sub-grid now represents the count of distinct segments in the level. Within each sub-grid, alternating decoration score increases to the right, and alternating space coverage score increases moving up. The upper-left grid is mostly empty, since it corresponds to levels with only one repeated segment. When the exact same segment is repeated, it is impossible for decoration or space coverage scores to alternate from segment to segment. (a) Direct2GAN leaves even more bins unoccupied with this scheme, particularly those with a small number of distinct segments. Quality scores also decrease as the number of distinct segments decreases. (b) CPPN2GAN and (c) CPPNThenDirect2GAN are once again nearly identical, but CPPNThenDirect2GAN has some brighter areas for larger numbers of distinct segments. (d) The combination of Direct2GAN and CPPN2GAN has the coverage of CPPN2GAN, but includes the brighter cells from Direct2GAN when there are 9 or 10 distinct segments.


[Heatmap: Percent Rooms Traversed (0.00–1.00) over Wall Percentage Bin vs. Water Percentage Bin, one triangular sub-grid per number of reachable rooms (1 to 25)]

(a) Direct2GAN

[Heatmap: Percent Rooms Traversed (0.00–1.00) over Wall Percentage Bin vs. Water Percentage Bin, one triangular sub-grid per number of reachable rooms (1 to 25)]

(b) CPPN2GAN

[Heatmap: Percent Rooms Traversed (0.00–1.00) over Wall Percentage Bin vs. Water Percentage Bin, one triangular sub-grid per number of reachable rooms (1 to 25)]

(c) CPPNThenDirect2GAN

[Heatmap: Percent Rooms Traversed (0.00–1.00) over Wall Percentage Bin vs. Water Percentage Bin, one triangular sub-grid per number of reachable rooms (1 to 25)]

(d) Max of Direct2GAN and CPPN2GAN

Fig. 11: Average MAP-Elites Bin Fitness Across 30 Runs of Evolution in Zelda Using WWR. The three methods are compared the same way as in Mario. Each distinct triangular grid corresponds to levels with a particular number of reachable rooms (top-right of the grid). Grids are triangular because there is a trade-off between water percentage and wall percentage: this sum cannot exceed 100%, and is likely much less, because some percentage of each room must be dedicated to empty floor tiles. (a) Direct2GAN has trouble filling bins with just one reachable room, and generally has trouble discovering levels in bins near the upper-right edge of each triangle, especially as the number of reachable rooms increases. (b) CPPN2GAN and (c) CPPNThenDirect2GAN are both better in this regard, though even for these methods it becomes harder to reach all rooms as the number of reachable rooms increases (bin brightness fades). (d) The combined archive is closer to CPPN2GAN than Direct2GAN, and has slightly more coverage than CPPNThenDirect2GAN.


[Heatmap: Percent Rooms Traversed (0.00–1.00) over Distinct Rooms vs. Backtracked Rooms, one sub-grid per number of reachable rooms (1 to 25)]

(a) Direct2GAN

[Heatmap: Percent Rooms Traversed (0.00–1.00) over Distinct Rooms vs. Backtracked Rooms, one sub-grid per number of reachable rooms (1 to 25)]

(b) CPPN2GAN

[Heatmap: Percent Rooms Traversed (0.00–1.00) over Distinct Rooms vs. Backtracked Rooms, one sub-grid per number of reachable rooms (1 to 25)]

(c) CPPNThenDirect2GAN

[Heatmap: Percent Rooms Traversed (0.00–1.00) over Distinct Rooms vs. Backtracked Rooms, one sub-grid per number of reachable rooms (1 to 25)]

(d) Max of Direct2GAN and CPPN2GAN

Fig. 12: Average MAP-Elites Bin Fitness Across 30 Runs of Evolution in Zelda Using Distinct BTR. Methods are compared as in Fig. 11, but with Distinct BTR. The number beside each sub-grid still represents the number of reachable rooms. Backtracking increases to the right and distinct rooms increase moving up. When there are fewer reachable rooms, the number of rooms through which one can backtrack is reduced, hence the changing width of each sub-grid. (a) Direct2GAN has trouble creating levels with a low percentage of distinct rooms, especially as the number of rooms increases. (b) CPPN2GAN has trouble creating lots of backtracking. (c) CPPNThenDirect2GAN is better than CPPN2GAN at backtracking for all numbers of distinct rooms, though Direct2GAN is still best at backtracking for dungeons with many distinct rooms. (d) Combining Direct2GAN and CPPN2GAN fills bins in the upper-right of some sub-grids that CPPNThenDirect2GAN does not reach, but CPPNThenDirect2GAN generally fills out more bins in the lower-right of sub-grids in the bottom two rows.


[Images of generated Zelda dungeons]

(a) Direct2GAN with Distinct BTR
(b) Direct2GAN with WWR
(c) CPPN2GAN with Distinct BTR
(d) CPPN2GAN with WWR
(e) CPPNThenDirect2GAN with Distinct BTR
(f) CPPNThenDirect2GAN with WWR

Fig. 13: Dungeons From Each Binning Scheme and Encoding for Zelda.


[Images of generated Mario levels]

(a) Direct2GAN with Distinct ASAD
(b) Direct2GAN with Sum DSL
(c) CPPN2GAN with Distinct ASAD
(d) CPPN2GAN with Sum DSL
(e) CPPNThenDirect2GAN with Distinct ASAD
(f) CPPNThenDirect2GAN with Sum DSL

Fig. 14: Levels From Each Binning Scheme and Encoding for Mario.