2D Map Representation as 1D array

mattstyles · March 13, 2019

Is representing a 2D map using an array of arrays `[][]` really that nasty?

My knee-jerk reaction is that it is pretty nasty and I tend to always represent maps as one-dimensional arrays, however, whilst there are some advantages there are also some disadvantages, mostly around access to the structure as to get an [x, y] location involves knowing the dimensions of the structure i.e.

var size = [2, 2]
var data = [1, 2, 3, 4]

var get = (x, y) => (y + size[0]) + x

This is fine, but any time I want to use the map I need to also know the intended dimensions, which is just another thing to carry and pass around.

So... what things are nasty using nested arrays, my requirements seem to suggest it would be fine:

* I am not converting the map i.e. I am not rotating, shearing or skewing the map in any way

* My map dimensions are known ahead-of-time and never change during the life of the map

* I am generating the data myself, so working with a pre-existing structure/representation is not a concern

Using nested arrays mean I can access via [x, y] trivially and dimensions of the map are easy to ascertain by querying on array.length. So, should I get over my disgust at using `[[],[]]` nested arrays and just get on with using the right tool for the job?

grelf · March 13, 2019

I think there are two main issues:

- Maintainability: using multiple indices, as in map [x][y] will be much easier to comprehend if you come back to the program years later to correct or enhance it.

- Performance. There is no doubt that allocating a single large array is faster than allocating hundreds or perhaps thousands of smaller ones, which is what happens in the 2-index case*. Likewise the garbage collector has much more work to do in the latter case but if your arrays remain permanently through your program (do not go out of scope) that is not a consideration.

In drawing the scenes in my forest I do allocate and release a 2D array every time (up to 400 x 400 so it can comprise up to 401 1D arrays). The time taken to redraw the same scene in my program is very variable and think this is probably due to garbage collection. I have been aware from the outset that performance might be improved by redesigning this but it is not my highest priority.

* According to the JavaScript standard because there is no way to say new Array (m, n) to indicate 2D (instead you would get the 1D array [m, n] in this case).

Edited March 13, 2019 by grelf
Numerical error

mattstyles · March 13, 2019

I hadn't considered the recreating of arrays, I will effectively be recreating the array each operation, although I may be able to optimise that if it becomes a problem.

For my current project, like yours, it isn't a top priority, but might be in subsequent ones.

b10b · March 13, 2019

Yeh, I think the disgust is warranted Sweep it under the carpet - I prefer Object Oriented for map data structures.

Adding the interface and hiding the 2D array adds value soon enough. A simple example getCell(x,y) avoids that common gotcha of cells[y][x]. Another example might be something like getCellAbove(x,y) where all the range checking can be neatly concealed. More elaborate examples might return a Cell object with such relationships prebaked, or allow partitioning or redimensioning of the data set. And that's the upside, the interface can evolve with the requirements. Whether the underlying data structure is a 2D array, vs a 1D array, vs a string, vs another structure can be changed or upgraded later based on performance tests (these could even be determined per device).

All this being said I prefer to save myself minutes of coding rather than a ms at runtime, so I favour patterns over performance for the majority of tasks - I understand that's not everyone's preference or appropriate for certain projects.

grelf · March 14, 2019

Matt: you prompted me to reexamine my scene drawing code with a view to possibly improving its performance.

It is even more surprising that I get the performance that I do, because the following is the gist of what happens with an array called "about" which is created every time a scene is drawn. Its purpose is to hold information gathered about the ground immediately around the observer (object called "me") out to a distance which is a user-alterable range (up to 400 metres, 1 ground unit per metre).

var mex = Math.round (me.x), mey = Math.round (me.y);
// Fractional array indices would not work

var xyLen = 2 * this.range_m + 1; // max 801
this.around = new Array (xyLen);
// "this" is the scene object, we are in its draw() method

for (var xScan = mex - this.range_m; xScan <= mex + this.range_m; xScan++)
{
    this.around [xScan] = new Array (xyLen);
   // ... start filling this.around by scanning both x & y around me

My point is that this.around [xScan][yScan] is being indexed by ground coordinates which can be huge. I recently discovered that the width and height of my map are each about 1.8E15 metres (it could not be a planet in our universe). The code fails beyond that range but performance does not fall off before that!

So I get away with it because of JavaScript's very flexible way of indexing arrays (and evidently very efficient in current browser implementations).

If I were instead to allocate (the largest possible version of) the array just once when the scene object is constructed I would have to calculate indices in my own code so that they range from 0 upwards. I expect performance could easily be worse then.

mattstyles · March 18, 2019

On 3/13/2019 at 7:05 PM, b10b said:

I prefer Object Oriented for map data structures

Yeah, only issue I have with that is that I only pass data around, not classes, and I'll have to sacrifice quite a lot of good stuff to be able to do it in a different way.

I kind-of end-up in a similar place though with utility functions, but they generally run to the signature of:

getCell(data, meta, position)

// Could be curried:
getCell(data, meta)(position)

// Data could include meta (height, width etc for example)
getCell(data, position)
getCell(data)(position)

The only difference is that I have to specifically pass in the data and meta for the map, rather than rely on `this` and grab it from the instance.

It's a good point though, and there are multiple advantages for the class-based approach. I _think_ a good half-way house for me would be to expand the `data` object shape to include any meta-data about the map structure (I currently end up passing the raw data array—or array of arrays, whatever—and also supply meta, which is, to be honest, a bit annoying).

On 3/14/2019 at 11:55 AM, grelf said:

My point is that this.around [xScan][yScan] is being indexed by ground coordinates which can be huge.

My current patch code is reliant on being a 2d array, but I could get rid of that with a few more helpers and passing more information in via the `data` variable:

/**
 * Gets a patch of tiles based on an initial [x, y] location
 * @param data <Array<Array[Tile]>> the 2d raw map
 * @param position <Array[Number, Number]> [x, y] position to center on
 * @param dim <Array[Number, Number]> patch size. Defaults to [-1, 1], can be
 * used to alter the center.
 * @returns <Array<Array[PositionTile]>>
 */
export const getPatch = (data, position, dim = [-1, 1]) => {
  const [w, h] = [data[0].length, data.length]
  const [x, y] = position
  const [min, max] = dim

  const patch = []
  for (let i = y + min; i <= y + max; i++) {
    for (let j = x + min; j <= x + max; j++) {
      const tile = getPatchTile(data, [j, i], [w, h])
      if (tile) {
        patch.push({
          ...tile,
          x: j,
          y: i
        })
      }
    }
  }

  return patch
}

So I think it works pretty similar to your method, albeit without classes and instances being involved.

There are a couple of improvements I can see immediately, I'm effectively passing too much to `getPatchTile`, which also does bounds checking, and I don't like imperatively using nested for loops (which litter the codebase) so they could be generalised in to a helper function also.

In any case, I'm happy with scrappy code as this is mostly a POC but it is interesting as it exposes that there are indeed many different ways to skin this cat, and many of those ways aren't clearly better or worse than the other ways. Trade-offs and decisions.

b10b · March 18, 2019

@mattstyles thanks for explaining, is this because of stateless / functional reactive? Interesting, to me it feels almost like turning the same problem inside out ... with data needing to become another function (to abstract away from array directly)?

mattstyles · March 19, 2019

@b10b yep, thats it exactly. I have a central 'store', which is just a big object, technically you could pass class instances (or functions) through as well but I'm not sure the exact downsides of that (easy serialisation is one, but could be worked around easily enough).

Re-rendering is a side effect of changes to that central object and will only occur when changes are detected (standard React rendering, which is the same as many true FP languages).

You could have both systems side-by-side and have a more manual system to 'tell' certain UI elements to that they need to re-render, I've tried that too with varying success.

I think, in my case at least, I could get a lot of benefits from memoizing some transform functions as I'm effectively running the same thing several times each render tick.

grelf · April 16, 2019

I have managed to rewrite my scene drawing (in www. myforest.uk) so that it does not reallocate two huge arrays and fill them with freshly created objects every time. This rewriting reduces the scene drawing time by about 30%. There is also less variability in the time taken from one scene to another which tends to confirm that garbage collection and reallocation were taking a lot of the time. Reallocation would be variable because it depends so much on what contiguous chunks of memory are available.
I have written the following to show some things that developers may need to consider in their own code.
My terrain is generated by a function of (x, y) position (in fact a method of an object of type Terra). It is a complicated function so it does not want to be done repeatedly for each position; instead its results must be held in arrays for subsequent reference during scene building. Two big arrays hold the results of that around the (moving) observer's current position to enable the scene to be drawn. For each position the arrays refer to an object of type ScenePoint which contains the distance and bearing of the point from the observer and other things (such as terrain details and amount of fogging for distant points). Each ScenePoint object has 10 properties so requires at least 80 bytes, probably more like 100 bytes allowing for the Object structure itself (I think that varies from browser to browser).
One of the big arrays, around [x][y], goes out to the current view range which is user-selectable and can be up to 400 meters, so the array may have to be 801 x 801 to surround the observer. This is now allocated at the start of the program. Each array value will be a reference to a ScenePoint object so the array itself takes 8 x 801 x 801 = 5.1 megabytes. Then the 801 x 801 ScenePoint objects, also now allocated once at the start instead of freshly in each call to Scene.draw (), occupy about 100 x 801 x 801 = 64 megabytes.
Another array, ahead , contains references to the same ScenePoint objects but is sorted during scene drawing so that the most distant points come first. This array does not contain references to objects behind the observer and so it is allocated as new Array (400 x 801). It requires a mere 8 x 400 x 801 = 2.6 megabytes.
So my latest version of The Forest, v19.4.10 (see www.myforest.uk ) allocates just over 70 Mbytes when it starts and then no longer has to reallocate this (in many pieces) every time a scene is drawn.
I still find it remarkable that my scene drawing is done in a couple of seconds even for the 400 metre horizon range. During that time the 'ahead' array is sorted too!
When starting to draw a scene 'ahead' has to have all its elements set to undefined because as the observer moves and turns a varying number of points can lie ahead. I rely on the specification for Array.sort() which says that undefined elements get sorted to the end of an array.
To support all this I have written a new JavaScript file to create one object of type Around. People may be interested to see it (below) but first
Matt: Having made this change it was then easy to experiment with making 'around' a 1D array instead of 2D, calculating the index from x and y myself. I found no difference in performance (in FireFox).

// Part of The Forest www.grelf.net (www.myforest.uk)
// Copyright (c) Graham Relf, UK, 2019

/** One object of this type is constructed at the start to avoid reallocating
* big arrays every time a scene is drawn */
function Around ()
{
var el = document.getElementById ("range");
this.aMid = 0; // The middle index of each x/y array

// Find largest range user may select:
for (var i = 0; i < el.options.length; i++)
{
var r = parseInt (el.options .value);
if (r > this.aMid) this.aMid = r;
}

var wd = 2 * this.aMid + 1;
this.aheadChange (this.aMid);
this.around = new Array (wd);

for (var x = 0; x < wd; x++)
{
this.around [x] = new Array (wd);

    for (var y = 0; y < wd; y++)
    {
      this.around [x][y] = new ScenePoint (0, 0, 0, 0, 0);
    }
}
}

/** Use at the start of drawing a new scene
* NB: mex, mey are rounded observer coordinates */
Around.prototype.init = function (mex, mey)
{
this.xOffset = this.aMid - mex;
this.yOffset = this.aMid - mey;

// Clear previous scene:
for (var i = 0; i < this.nAhead; i++) this.ahead = undefined;

this.nAhead = 0;
};

/** Only used if user changes the range (and in Around constructor) */
Around.prototype.aheadChange = function (range)
{
this.ahead = new Array (2 * range * range);
this.nAhead = 0;
};

/** Add a ScenePoint object reference to the scene ahead */
Around.prototype.aheadPush = function (sp)
{
this.ahead [this.nAhead] = sp;
this.nAhead++;
};

/** How many active ScenePoint objects are currently ahead */
Around.prototype.aheadLength = function ()
{
return this.nAhead;
};

/** Get a reference to the ith ScenePoint object ahead */
Around.prototype.aheadGet = function (i)
{
return this.ahead ;
};

/** Sort the ahead array in descending order of distance */
Around.prototype.aheadSort = function ()
{
this.ahead.sort (ScenePoint.prototype.sort);
};

/** Get a reference to the ScenePoint object at (x, y).
* This is so that extra properties can be added to it. */
Around.prototype.aroundGet = function (x, y)
{
return this.around [x + this.xOffset][y + this.yOffset];
};

/** Set the fields of the ScenePoint at (x, y) as if freshly constructed */
Around.prototype.aroundSet = function (distance, bearing, x, y, odd)
{
var sp = this.around[x + this.xOffset][y + this.yOffset];
sp.fogNo = 0;
sp.tr = undefined;
sp.building = undefined;
sp.drawn = undefined;
sp.clear = undefined;
sp.d = distance;
sp.b = bearing;
sp.x = x;
sp.y = y;
sp.o = odd;
};

Note that in my previous version of Scene.draw () each ScenePoint object was freshly constructed and subsequently might or might not get extra properties added (such as .tr or .drawn). That was poor practice really. For easier maintenance it should be clear in the constructor as to what properties an object can have.

Sign In

2D Map Representation as 1D array

Recommended Posts

mattstyles

Link to comment

Share on other sites

grelf

Link to comment

Share on other sites

mattstyles

Link to comment

Share on other sites

b10b

Link to comment

Share on other sites

grelf

Link to comment

Share on other sites

mattstyles

Link to comment

Share on other sites

b10b

Link to comment

Share on other sites

mattstyles

Link to comment

Share on other sites

grelf

Link to comment

Share on other sites

Join the conversation

Recently Browsing 0 members

Browse

Activity