Comparison to similar systems

Ervin Ruci edited this page Aug 27, 2018 · 54 revisions

While we were busy re-inventing the wheel, we did have a look around for other geocode systems and even took a few for a spin - Revisited from [PlusCodes].

An Evaluation of Location Encoding Systems

Let’s go over some existing solutions and compare them to our geocode system.

What should a geolocation code (geocode) look like?

Generally the requirements for a geocode system are:

  • Geocodes must be as short as possible. (to be memorized, or save space when stored in a database.)

  • A geocode must preserve the location proximity information of its corresponding latitude,longitude. (To give context: latitude,longitude pairs that are geographically close should correspond to geocodes with many significant digits in common, or with a significant name in common in the case of name-based systems.)

  • A latitude,longitude pair should have only one geocode, and vice versa. (Obviously)

  • A geocode should represent a unique latitude,longitude pair up to the 5th decimal point. (for our purposes)

  • Computing a geocode should not cause loss of information contained in the original latitude,longitude representation.

  • A geocode should represent a point up to an acceptable max error contained in the latitude,longitude representation. (1 meter in our case)

  • There are roughly 648 trillion latitude,longitude points limited to the 5th decimal (18 million latitude steps times 36 million longitude steps), if we allow the latitude range to be [90.00000,-90.00000] and the longitude range [180.00000,-180.00000]. That is the problem space. In theory we cannot represent each such point with an alphabet of 36 alphanumeric characters in fewer than 10 characters, since 36^9 is only about 102 trillion. So, that’s what we are aiming for.

  • A geocode should be deterministically generated offline via a simple algorithm in the public domain that can be widely adopted.

  • We will aim for the optimal implementation in terms of speed.
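The length bound in the list above can be checked with a few lines of arithmetic. This is only a sketch of the counting argument; the function name `min_code_length` is ours and is not part of the Geocode source.

```python
# Count every latitude,longitude pair at 5 decimal places.
STEPS_PER_DEGREE = 10**5
lat_values = 180 * STEPS_PER_DEGREE + 1   # -90.00000 .. 90.00000 inclusive
lon_values = 360 * STEPS_PER_DEGREE + 1   # -180.00000 .. 180.00000 inclusive
total_points = lat_values * lon_values


def min_code_length(points, alphabet_size):
    """Smallest code length whose code space covers `points` distinct values."""
    length, capacity = 0, 1
    while capacity < points:
        capacity *= alphabet_size
        length += 1
    return length


print(total_points)
print(min_code_length(total_points, 36))  # 36 = digits 0-9 plus letters A-Z
```

With a 36-character alphabet, nine characters are not enough to cover the problem space and ten are.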

Some implementations also work around requirements such as: "Codes should not contain easily confused characters or generate dirty words in any language"[PlusCodes]. We do not take such requirements into serious account. Easily confused characters can be easily distinguished in the online world, and as to the dirty words - we do not care. The three-geoname code is composed of words that are already used as location names worldwide - if some of those words also happen to be swear words in some language, that’s not our problem either. One cannot easily account for all the different nuances of the thousands of human languages and dialects, and removing vowels from the alphabet does not remove the possibility of spelling such a word. So, we’ll just let them be.

After all, dirty is not such a dirty word.

Latitude and Longitude or Longitude and Latitude

Latitude and Longitude are signed decimal numbers, generally displayed up to the 6th or 5th decimal place, depending on the application.

According to [geoaccuracy]:

The sixth decimal place is worth up to 0.11 m: you can use this for laying out structures in detail, for designing landscapes, building roads. It should be more than good enough for tracking movements of glaciers and rivers. This can be achieved by taking painstaking measures with GPS, such as differentially corrected GPS.

The fifth decimal place is worth up to 1.1 m: it distinguishes small trees from each other. Accuracy to this level with commercial GPS units can only be achieved with differential correction.

For geocodes, a latitude,longitude pair of numbers up to the 5th decimal place will suffice.
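As a rough check of the figures quoted above, the ground distance covered by one step in a given decimal place can be estimated from the length of a degree along the equator. This is a back-of-the-envelope sketch; the circumference constant is an approximation and the function name is ours.

```python
EARTH_EQUATORIAL_CIRCUMFERENCE_M = 40_075_017  # approximate, in meters


def meters_per_step(decimals):
    """Upper bound on the ground length of one step in the given decimal place."""
    meters_per_degree = EARTH_EQUATORIAL_CIRCUMFERENCE_M / 360
    return meters_per_degree * 10 ** -decimals


for decimals in (5, 6):
    print(decimals, round(meters_per_step(decimals), 3))
```

Away from the equator a step in longitude covers less ground, which is why the quoted values are stated as "up to".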

Geohash

Geohash codes represent areas. Truncating a code reduces the precision and expands the area, and the expanded area still contains the original point. This is a nice feature of the geohash algorithm which geocode also has.

There are many geohash shortcomings, however. Edge cases will cause points within a meter of each other to have completely different Geohashes.

For example, the Geohashes of (45.00001,-64.36000) and (44.99999,-64.36000) are f840p2n2p3 and dxfpzryrzq respectively, although the points are only about 2 meters apart. (see http://geohash.org/f840p2n2p3 and http://geohash.org/dxfpzryrzq )
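The edge case is easy to reproduce. Below is a minimal sketch of the standard Geohash encoding, written from the public description of the algorithm (it is not code from the Geocode repository, and the function name is ours).

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, length=10):
    """Encode a point by repeatedly halving the lon/lat ranges, 5 bits per character."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    refine_lon = True  # bits alternate: longitude first, then latitude
    while len(chars) < length:
        if refine_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        refine_lon = not refine_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


north = geohash_encode(45.00001, -64.36000)
south = geohash_encode(44.99999, -64.36000)
print(north, south)
```

The two points fall on opposite sides of the 45° latitude split, which is decided by the fourth bit of the code, so the hashes already differ in their first character.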

Our algorithm solves this problem with Geocodes at borderline areas sharing most of the significant digits.

  • (45.00001,-64.36000) → EHB105754C → HALIFAX-GAZAH-DOMOU

  • (44.99999,-64.36000) → EHB1056SH4 → HALIFAX-GAZAH-NDITI

Another problem with Geohashes is that the hash code depends on the number of decimal places of the coordinates. This leads to long geohashes such as: http://geohash.org/eyckqv01zy2v. Geocodes on the other hand are always exactly 10 bytes long.

Geohashes do not guarantee a one-to-one mapping, one of the requirements mentioned above. For example, "c216ne4" and "c216new" (and others) all decode to (45.37, -121.7).

Of all the systems we have evaluated the Geohash system is the most similar to our alphanumeric Geocode system.

Geohashes may spell out dirty words and contain easily confused characters (the digits "0" and "1" look similar to the letters "O" and "I") - and the Geocode system has the same "problem".

Geohash uses an algorithm based on the Z-order curve, and Geocode does too. However, our algorithm does not apply binary interleaving directly to the latitude,longitude values, but rather to two linear curves generated from them. This solves the Geohash problem of discontinuities at latitude and longitude 0 as well as other edge cases like the one above.
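To see where the discontinuities come from, here is a small sketch of plain Z-order (Morton) bit interleaving on integer grid coordinates. It illustrates the direct interleaving that Geohash relies on, not the Geocode variant described above; the function name is ours.

```python
def interleave(x, y, bits):
    """Z-order value: bits of x go to even positions, bits of y to odd positions."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


# Walk along one row of a 16x16 grid and watch the Z-order value.
for x in (6, 7, 8):
    print(x, interleave(x, 0, bits=4))
```

Cells 6 and 7 are neighbours and get the consecutive values 20 and 21, but the equally close cells 7 and 8 sit on either side of the grid's midline and get 21 and 64. The same jump, at a larger scale, is what separates the two Geohashes on either side of latitude 45.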

In short, a Geocode is the same length as or shorter than a Geohash and avoids many of the Geohash pitfalls, while still allowing for truncation (with a twist: the untruncated Geocode is a point instead of an area, while truncated Geocodes represent polygons that are not perfect squares). The main difference, therefore, is that Geohashes define rectangular areas, whereas Geocodes define single points with a higher precision than a Geohash of similar length, and truncated Geocodes define areas that are simple polygons.

There are other versions of the Geohash algorithm such as Geohash-36. They have the same shortcomings as the main algorithm, but they also allow for encoding elevation. This is a feature we are looking to add in the next version of Geocode.

MapCode

MapCode codes are similar to Geocodes in the sense that they are similar in length, represent points and cannot be truncated.

The MapCode system has accuracy problems though.

An example MapCode looks like this: NB VXWL.5Y84

In many cases though, the MapCode identifies the wrong province/country name causing ambiguities.

For example, (45.00001,-64.36000) on http://www.mapcode.com/getcoords.html?Submit=Interactive+map gives the following:

  • Context: New Brunswick

  • MapCode: V81.2LD

  • Alternative mapcodes in New Brunswick: NB 8VGR.S6V, NB VXWL.5Y84

  • Note: The location also has mapcodes for other territories.

This is clearly a problem, not to mention the fact that the location in question is not in New Brunswick, but in the middle of Nova Scotia.

MapCode also supports non-ASCII character sets, but that is more a bug than a feature, as this can further increase the ambiguity of MapCodes.

What3words

There are a number of companies that attempt to build a business model around licensing a grid system, with what3words being one of the most successful ones. According to public sources they have raised around 20 million dollars in seed funding to develop their system. (see https://en.wikipedia.org/wiki/What3words )

Quite impressive, considering the fact that you can’t really copyright a grid system, since grid systems have been around for a long time. According to their website they use around 40,000 English words to assign a triple word code to grid areas of 3 by 3 meters anywhere in the world (they also name locations in oceans this way).
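The word count is not arbitrary: three words drawn from a list of that size give just enough combinations to label every 3 by 3 meter cell on the planet. A quick sanity check, assuming an approximate Earth surface area of 510.1 million square kilometers (our figure, not theirs):

```python
EARTH_SURFACE_KM2 = 510.1e6   # approximate total surface area, land and ocean
CELL_SIDE_M = 3               # side of one grid cell in meters
WORD_LIST_SIZE = 40_000       # approximate size of the word list

cells_needed = EARTH_SURFACE_KM2 * 1_000_000 / CELL_SIDE_M**2
codes_available = WORD_LIST_SIZE**3

print(f"cells needed:    {cells_needed:.3e}")
print(f"codes available: {codes_available:.3e}")
```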

"Addressing the world" is their slogan, but a random triple word name can hardly qualify as an address. An address is a textual description of a place in context, such as 34 Squire st, Sackville NB. What3words offers no context whatsoever with their random assignment of words to coordinates.

They seem to apply some sort of on-demand coding: a three word code does not exist until someone submits a coordinate that happens to be in a new grid cell.

Their system is closed because they plan to make money by telling you that, for example, 44.99999,-64.36 is useless.blog.livery, so it is doubtful it will ever gain wide use.

There is no geographic proximity for what3words codes: the grid square next to useless.blog.livery is lock.accounted.ambulances, and livery.blog.useless is 2000 km away.

Triple Name Geocode solves this problem, with locations close together sharing the first two names, with the first name being the most prominent geoname in the area. And unlike what3words, it is an open system: you do not have to pay to translate between latitude,longitude and triple geoname and back.

What3words' commercial appeal is still puzzling to me.

It is probably due to the limited commercial success of what3words that Google decided to throw its hat in the ring with Plus Codes.

Plus Codes

Plus Codes (also known as Open Location Codes) is a Google-supported [plcg] grid system.

Plus Codes are 10 to 11 characters long for a grid accuracy of 14x14 meters, or 12 to 14 characters long for a higher accuracy of 3x3 meters. They can be shortened by appending the name of a city within 50 km of the place, in which case the code becomes 7 characters long plus the length of the city name.

They claim plus codes to be easy to memorize due to the break in the code provided by the plus sign, e.g. 42852FW2+RG or 852FW2+RG Chatham Islands, New Zealand for (-43.95296,-176.54867). They also use a 20-character alphabet avoiding easily confused letters/numbers and letters that could spell out swear words.
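For reference, the full-length code in the example can be reproduced offline. The following is a minimal sketch of the 10-digit encoding, written from the public Open Location Code specification; it skips shortening, decoding and the extra grid-refinement digits, and the official libraries should be preferred for real use. The function name `plus_code` is ours.

```python
OLC_ALPHABET = "23456789CFGHJMPQRVWX"  # the 20-character Open Location Code alphabet
STEPS_PER_DEGREE = 8000                # the fifth digit pair resolves 1/8000 of a degree


def plus_code(lat, lon):
    """Encode a point as a 10-digit plus code (5 base-20 digit pairs, lat then lon)."""
    # Shift to non-negative ranges and convert to integer steps of 1/8000 degree.
    lat_units = min(int((lat + 90) * STEPS_PER_DEGREE), 180 * STEPS_PER_DEGREE - 1)
    lon_units = int((lon + 180) * STEPS_PER_DEGREE) % (360 * STEPS_PER_DEGREE)
    pairs = []
    for _ in range(5):
        lat_units, lat_digit = divmod(lat_units, 20)
        lon_units, lon_digit = divmod(lon_units, 20)
        pairs.append(OLC_ALPHABET[lat_digit] + OLC_ALPHABET[lon_digit])
    code = "".join(reversed(pairs))  # most significant pair first
    return code[:8] + "+" + code[8:]


print(plus_code(-43.95296, -176.54867))
```

At 1/8000 of a degree per step, one cell is just under 14 meters on a side at the equator, which is where the 14x14 meter figure comes from.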

Overall it is a good system, but at 11 digits with 14x14 meter accuracy it does leave some big gaps. While it might be good for identifying soccer fields and large houses, it might get tricky to tell tiny shacks in the slums from each other.

After all, such location encoding systems are most useful in the developing world, where most dwellings have less than 14 meters of space between them.

Plus codes do address this issue by adding a few more digits to the code, but then the code loses some of its easy-to-remember attributes.

Another issue, which they also point out, is the non-contiguous character set they use. Manually comparing codes is difficult, as one has to remember whether there are characters between 9 and C in order to tell if 8FV9 is next to 8FVC.
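A machine has no such trouble, of course. A tiny sketch of the check, assuming the 20-character Open Location Code alphabet; the helper name is ours.

```python
OLC_ALPHABET = "23456789CFGHJMPQRVWX"


def are_neighbouring_digits(a, b):
    """True if two code characters are consecutive in the plus code alphabet."""
    return abs(OLC_ALPHABET.index(a) - OLC_ALPHABET.index(b)) == 1


print(are_neighbouring_digits("9", "C"))  # last characters of 8FV9 and 8FVC
print(are_neighbouring_digits("C", "G"))
```

There is nothing between 9 and C in the alphabet, so 8FV9 and 8FVC differ by one step in their last digit, while C and G are two steps apart because F sits between them.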

The nice thing about plus codes is that they can be encoded and decoded offline and their algorithm is in the public domain.

Zippr

Zippr is a personal location short-code which can be shared with people you know. It is a four character, four number unique identifier such as XYZR1234 for physical locations, used by both merchants and friends alike to deliver a service or find you. That’s what they say. Now let’s have a close look.

This is yet another commercial grid system, which unlike many others that have gone bankrupt in the last few years, is still very much alive.

Apparently Zippr codes are created on-demand - so it is safe to assume that once they run out of 8 digit codes to assign, they will start adding a few more and call it Zippr+ or something like that. That might not happen though, because if it does, that would mean Zippr being a smashing commercial success used by hundreds of billions of people.

They aim to make money by licensing their API, which just like what3words, is a non-deterministic mapping of latitude,longitude to an alphanumeric code list, hence you need to use their API since such systems can not run offline. They also sell what they call a premium Zippr code on their website, which is a string you can pick yourself unless it has already been assigned.

According to their FAQ, a premium Zippr is a custom location code that you can request such as VJAY8080. It can be personalized to your name, or consist of letters and numbers of personal significance. However, you cannot create Zipprs which may be four letter dictionary words, too common, or conflict with stock symbols. A code currently being used by another user will not be available. You own your custom code and you can always reuse it to point to another location.

So, there you have it. A DNS registry for locations.

Both alphanumeric geocodes and triple name geocodes have little in common with Zippr, other than that they map to the same latitude,longitude points.

The list goes on

There is a large number of other location code systems. Some are mentioned in the bibliography at the bottom of the wiki.

Geocode is an open source system for mapping latitude,longitude points to 10 byte alphanumeric strings or 3 geo names. [README](https://github.com/eruci/geocode/blob/master/README.md)

References