How to Create Gigantic Universes

Introduction

In Infiniverse, the whole initial universe is saved in one integer number. How is this achieved? This article answers the question without going into details about how each different feature is implemented.

Pseudorandom Number Generator

The most important thing in procedural generation is a (pseudo) random number generator, RNG. It is an algorithm that produces seemingly random numbers from a seed value. However, closer observation reveals that the number sequence is always the same when seeding again with the same seed.

That is a very useful feature as one can build an algorithm that creates, for example, a cityplan and all that is needed to reproduce that generated feature is the seed value. So, instead of storing, say, a 1024*1024 map (depending on the complexity, taking multiple bytes per cell), only one number needs to be saved - that is extremely memory efficient. On the downside, if the algorithm is very complex and/or the area/size of thing to generate very large, it might take longer time than to load the data from a more conventional storage format.

Generating Seeds

Okay, so we can generate cities, planet surfaces or anything, but we need unique seeds for them, so it would seem that if we have billions of areas, the storing of all those seeds is going to require quite a bit of hard-drive.

Solution is simple, let's just generate the seeds also. Perhaps the most simple and efficient method is to construct the seed from the position of the area. So if we have a city in planet's surface coordinates (123,456) we could seed the RNG with 123456, for instance. Perhaps we would also like to append the planets seed to that so that two places in the same coordinates but in different planets doesn't look the same.

Another method for "zooming in" is to generate a new area by interpolating between the features in the higher level and adding some noise. For example, if we had a general view of an area and wanted to create a more detailed close-up, we pick the place we want to zoom into and check its surroundings. If there is, say, hills in the east and forrest in the west, we make a gradient of those to so that in the middle there are both types of terrain. That in its own would look quite bad, but then we apply noise(s) (e.g. white noise for roughness and perlin noise for curved lines) and some extra detail features like rocks and other stuff. That starts to look a lot better, but there are still some complications: some types of terrains need to be handled differently as for example it might be realistic for a forrest/hill transition to have scattered lone trees in the middle, but it wouldn't look good if the trees were water.

Real-time Generation

When it comes to huge continous areas that we cannot generate to memory (too big) we can do two things. It is possible to build an algorithm that is inputed with coordinates to a point and it returns a value (graphics tile perhaps) calculated by using the inputs as sort of seeds. Noise functions (e.g. Perlin noise) come very handy. In this way, we don't need to generate anything in advance, just pull out values as needed and discard them when they become obsolete. (This is a little like the shaders in modern graphics cards work; e.g. take the position of the pixel-to-be-drawn and transform/color it).

Other method, that can be used in conjunction with the first one, is to create a buffering system, where a reasonable area around the player is generated to a memory buffer and when the player comes closer to the edges, a new buffer, with better suiting tiles are created.

The problem with the first method is that if the algorithms are complex, its going to require a lot of computation power to display large playing screen. The second runs smoother, but when generating the next area to the buffer, there can be a small irritating loading break. I personally ended up using (with large areas) continuous buffering, meaning I calculate the visible area into a buffer and as the player moves, update the buffer by calculating the new tiles that come into view. That way I only need to calculate a fraction of the tiles visible each frame, but I don't need stop for loading breaks.

Conclusion

Generating stuff randomly (but deterministically) saves a lot of storage space, but can be computationally expensive. In addition, you can have practically infinite amount of whatever you are generating, but although for large quantities it is far easier than doing everything manually, coming up with algorithms that create interesting and varying things is non-trivial.

Update 2009-12-20: Added subheadings and conclusion.