Brian Lovin
/
Hacker News
Daily Digest email

Get the top HN stories in your inbox every day.

Stratoscope

One task where GeoJSON falls down is simplification of a group of polygons with common boundaries, e.g. the 48 conterminous US states. If you start with a highly detailed set of polygons, you need to simplify them for practical display in an online map.

GeoJSON doesn't encode the fact that the boundary points are common between adjacent polygons. When you simplify those polygons, each one is handled separately, and you end up with "slivers" where the boundaries are misaligned:

https://www.bing.com/images/search?q=map+slivers+betwen+poly...

TopoJSON solves this by encoding each such boundary only once. So when you simplify the polygons, they are all done together, and the same simplification applies to adjacent polygons. No more slivers!

https://github.com/topojson/topojson

https://github.com/topojson/topojson-simplify
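
A rough Python sketch of the shared-arc idea, not how the TopoJSON tools are actually implemented. The `arcs` and `regions` layout, the helper names, and the tolerance are made up for illustration. The one detail borrowed from TopoJSON is the convention that a negative reference `~i` means "arc `i`, reversed". Real TopoJSON also quantizes and delta-encodes its arcs, which is left out here.

```python
import math

def simplify(points, eps):
    """Douglas-Peucker on an open polyline; a stand-in for any simplifier."""
    if len(points) < 3:
        return list(points)
    (ax, ay), (bx, by) = points[0], points[-1]
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    best_i, best_d = 0, -1.0
    for i in range(1, len(points) - 1):
        px, py = points[i]
        if length == 0:
            d = math.hypot(px - ax, py - ay)
        else:
            d = abs(dx * (py - ay) - dy * (px - ax)) / length
        if d > best_d:
            best_i, best_d = i, d
    if best_d <= eps:
        return [points[0], points[-1]]
    head = simplify(points[:best_i + 1], eps)
    tail = simplify(points[best_i:], eps)
    return head[:-1] + tail

# Each boundary is stored exactly once (hypothetical toy data).
arcs = [
    [(10, 0), (12, 10), (12, 20), (10, 30)],  # 0: border shared by both regions
    [(10, 30), (0, 30), (0, 0), (10, 0)],     # 1: rest of the left region
    [(10, 0), (20, 0), (20, 30), (10, 30)],   # 2: rest of the right region
]
# Regions reference arcs by index; ~0 means "arc 0, walked backwards".
regions = {"left": [0, 1], "right": [2, ~0]}

def ring_from_arcs(refs, arcs):
    """Stitch arc references back into a closed ring of coordinates."""
    ring = []
    for ref in refs:
        pts = arcs[ref] if ref >= 0 else arcs[~ref][::-1]
        ring.extend(pts if not ring else pts[1:])  # skip the duplicated joint
    return ring

# Simplify every arc once, then rebuild both rings from the same arcs.
simplified_arcs = [simplify(arc, eps=1.5) for arc in arcs]
rings = {name: ring_from_arcs(refs, simplified_arcs)
         for name, refs in regions.items()}
```

Because the shared border is simplified once and both rings are rebuilt from that single result, the two regions cannot disagree about where the border runs.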

Demiurge

Is this actually GeoJSON falling down, or decades of convention extended to JSON? Topology is great, but it is sidestepped by Shapefile/WKT/WKB/etc, in favor of independent primitives like POINT, LINE, POLYGON. If GeoJSON did not exist as a new JSON GIS data format encoding these primitives, TopoJSON would not have "replaced" it, due to the added mismatch with other non-topological formats.

From what I can tell, the top criticisms of GeoJSON are the under-enforced winding order specification and the handling of geometries that cross the antimeridian.
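
For reference, RFC 7946 asks for exterior rings to be counterclockwise and holes clockwise. A minimal Python sketch of checking and fixing that with the shoelace formula follows. The function names are made up here, and it treats longitude/latitude as planar x/y, so it is only an approximation and will misjudge rings that cross the antimeridian or enclose a pole.

```python
def signed_area(ring):
    """Shoelace formula: positive for counterclockwise, negative for clockwise."""
    total = 0.0
    for p, q in zip(ring, ring[1:]):
        total += p[0] * q[1] - q[0] * p[1]
    return total / 2.0

def enforce_right_hand_rule(polygon_coords):
    """Return Polygon coordinates with a CCW exterior and CW holes."""
    fixed = []
    for i, ring in enumerate(polygon_coords):
        is_ccw = signed_area(ring) > 0
        want_ccw = (i == 0)  # first ring is the exterior, the rest are holes
        fixed.append(ring if is_ccw == want_ccw else ring[::-1])
    return fixed

# A clockwise exterior with a counterclockwise hole: both are the wrong way round.
polygon = [
    [[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]],
    [[0.2, 0.2], [0.8, 0.2], [0.8, 0.8], [0.2, 0.8], [0.2, 0.2]],
]
fixed = enforce_right_hand_rule(polygon)
```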

jvanderbot

Right. Encoding a union algorithm into the data structure just introduces the reverse problem: Selecting a subset now requires extra logic beyond jq.

Stratoscope

Similarly, typical map APIs like the Google Maps API accept GeoJSON and not TopoJSON. I was not suggesting TopoJSON as a replacement for GeoJSON, but as a complement to it. With the tools on the TopoJSON GitHub, you can have GeoJSON input and output, but convert to TopoJSON for the simplification step to avoid the "slivers" problem.

NelsonMinar

I like TopoJSON and have used it in projects. But it's weird to set it up as opposition to GeoJSON. It's a complement. GeoJSON is a general data format meant to replace uses of ESRI Shapefiles and other complex formats. TopoJSON is more of a solution for a particular application need.

Is there much work developing or using TopoJSON these days? I haven't seen much about it in a few years.

Stratoscope

To be clear, I'm not suggesting TopoJSON as an alternative to GeoJSON. I like GeoJSON and was loosely involved with the working group that created and updated its spec.

I'm just saying that for the specific task I mentioned, GeoJSON, or any format such as shapefiles that stores polygons individually, naturally leads to the "sliver" problem.

A nice processing pipeline is:

1. Convert GeoJSON to TopoJSON.

2. Run the simplification on the TopoJSON.

3. Convert the resulting TopoJSON back to GeoJSON.

The TopoJSON GitHub has tools for each of these steps.

pramsey

GeoJSON is not TopoJSON. Saying that it is "falling down" is like criticizing a zebra for not being a giraffe. GeoJSON is a mapping of the (non-topological) "simple features" model into JSON, full stop. It does that fine.

Stratoscope

Yes, the same "slivers" problem occurs when you try to simplify features in any format that uses individual polygons, such as shapefiles or whatnot. That's the only case I was referring to.

I don't think I would trust a zebra or a giraffe for this task either.

echoangle

How is that a geojson problem? If your dataset is correct, adjacent borders will just use the same points and will match exactly.

sdenton4

The problem is simplification. Suppose two regions share a border that runs through some non-collinear points a, b, c, d. Simplifying the polygon for the first region might yield a, b, d, while simplifying the second might yield a, c, d. This creates gaps or overlaps between the two regions.
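
A small Python sketch that reproduces exactly this a, b, d versus a, c, d outcome. It uses a plain Douglas-Peucker simplifier and made-up coordinates. The two rings walk the shared border in opposite directions, and in this deliberately symmetric toy case b and c are equally far from the line a-d, so the simplifier keeps whichever one it meets first. Real data rarely has exact ties, but the same kind of mismatch shows up because each ring is simplified with its own start point, direction, and surrounding vertices.

```python
import math

def douglas_peucker(points, eps):
    """Classic Douglas-Peucker; a closed ring is passed with first == last."""
    if len(points) < 3:
        return list(points)
    (ax, ay), (bx, by) = points[0], points[-1]
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    best_i, best_d = 0, -1.0
    for i in range(1, len(points) - 1):
        px, py = points[i]
        if length == 0:  # ring case: start and end coincide
            d = math.hypot(px - ax, py - ay)
        else:            # perpendicular distance to the start-end line
            d = abs(dx * (py - ay) - dy * (px - ax)) / length
        if d > best_d:   # strict '>' keeps the first of any tied points
            best_i, best_d = i, d
    if best_d <= eps:
        return [points[0], points[-1]]
    head = douglas_peucker(points[:best_i + 1], eps)
    tail = douglas_peucker(points[best_i:], eps)
    return head[:-1] + tail

a, b, c, d = (10, 0), (12, 10), (12, 20), (10, 30)

# Two adjacent regions stored as independent rings, GeoJSON style.
left_ring = [(0, 0), a, b, c, d, (0, 30), (0, 0)]
right_ring = [(20, 0), (20, 30), d, c, b, a, (20, 0)]

left_simplified = douglas_peucker(left_ring, eps=1.5)
right_simplified = douglas_peucker(right_ring, eps=1.5)

# The left region keeps b and drops c; the right region does the opposite.
print(left_simplified)
print(right_simplified)
```

The left border now runs a, b, d and the right border runs d, c, a, so the two regions overlap in one place and leave a gap in another.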

qurren

But what is the border? Set the border to what it actually is, not a simplification of it. The state of Colorado is formally a 697-sided polygon; don't simplify it to a rectangle.

echoangle

So don’t simplify the shapes on their own. GeoJSON is a storage and exchange format; you can still convert it to other formats if you want to modify it.

Waterluvian

I’ve applied GeoJSON (among many other GIS tech) for mapping and monitoring tens of thousands of warehouse robots. It works great as long as you squint just a bit, ignoring that it generally calls for long,lat and is designed with the assumption of a world CRS.

The dangerous part is that some tools fully assume this and will completely screw with calculations if you’re assuming a flatland CRS. So you’ve got to be careful in checking and setting those parameters.
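
To make the danger concrete, here is a small Python sketch with made-up robot positions. The same pair of coordinates is 5 m apart if read as meters on a warehouse floor, and hundreds of kilometers apart if a tool reads them as WGS 84 degrees. The haversine formula and the 6,371 km mean Earth radius are standard; the function names are only for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

def planar_distance_m(p, q):
    """Distance if the coordinates are meters in a flat local CRS."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def haversine_distance_m(p, q):
    """Distance if the same numbers are read as (longitude, latitude) degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

robot_a = (0.0, 0.0)
robot_b = (3.0, 4.0)

flat = planar_distance_m(robot_a, robot_b)         # 5.0 (meters)
geodesic = haversine_distance_m(robot_a, robot_b)  # roughly 556 km
```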

One nice thing is that the structure of GeoJSON works incredibly well in typescript. It has discriminated unions built in so you can walk entire geodatasets in a pretty comfortable way.
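
The `type` member is what makes that work, and the same tag can be dispatched on in any language. A minimal Python sketch of walking a whole dataset this way follows. The function name is made up, and TypeScript would additionally narrow the types for you, which this does not.

```python
def iter_positions(obj):
    """Yield every coordinate position in any GeoJSON object, keyed on "type"."""
    kind = obj["type"]
    if kind == "FeatureCollection":
        for feature in obj["features"]:
            yield from iter_positions(feature)
    elif kind == "Feature":
        if obj["geometry"] is not None:  # the spec allows a null geometry
            yield from iter_positions(obj["geometry"])
    elif kind == "GeometryCollection":
        for geometry in obj["geometries"]:
            yield from iter_positions(geometry)
    elif kind == "Point":
        yield tuple(obj["coordinates"])
    elif kind in ("MultiPoint", "LineString"):
        for position in obj["coordinates"]:
            yield tuple(position)
    elif kind in ("MultiLineString", "Polygon"):
        for line in obj["coordinates"]:
            for position in line:
                yield tuple(position)
    elif kind == "MultiPolygon":
        for polygon in obj["coordinates"]:
            for ring in polygon:
                for position in ring:
                    yield tuple(position)
    else:
        raise ValueError(f"unknown GeoJSON type: {kind!r}")

dataset = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "dock"},
         "geometry": {"type": "Point", "coordinates": [1.0, 2.0]}},
        {"type": "Feature", "properties": {"name": "aisle"},
         "geometry": {"type": "LineString",
                      "coordinates": [[0.0, 0.0], [5.0, 0.0]]}},
        {"type": "Feature", "properties": {}, "geometry": None},
    ],
}
positions = list(iter_positions(dataset))
```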

papercrane

> It works great as long as you squint just a bit, ignoring that it generally calls for long,lat and is designed with the assumption of a world CRS.

I thought the spec allowed you to specify the CRS, but I just checked the RFC and they removed that from the 2016 specification and WGS84 is specified. It does allow for alternative CRS with prior arrangement, but like you said that does require a lot of care.

drewda

Yup, technically speaking if the coordinates aren't in WGS84, it isn't GeoJSON

matt-p

OK, I had not considered just using GeoJSON for my flatland CRS (indoor routing). Quite obvious in hindsight, thank you.

sam_lowry_

> tens of thousands of warehouse robots

Sounds like Amazon

Waterluvian

Definitely not Amazon. Yuck.

DarkNova6

I’ve had nothing but problems using GeoJson. The specification has limitations everywhere and doesn’t even support z + m values at the same time.

But thankfully there is also the SQLite-backed GeoPackage, which is not only more flexible but also much smaller. It takes some extra steps to get testing teams working due to its binary nature, but other than that it is the best format in geospatial data analysis.

Long live SQLite!

cr125rider

Made by Sean Gillies and a few others. Back when mapbox was doing all sorts of great open source stuff. Legends

https://github.com/sgillies

nobleach

We used this extensively when I worked in this space (2010 - 2014). My favorite addition was using https://github.com/topojson/topojson to add arcs. That cut down on quite a few of the points needed to represent curves.

jtbaker

Dang, fun memories of when I was first getting into geo/data stuff and doing a lot of web mapping with D3, Leaflet and friends. Seems like tools such as vector tiles/PMTiles have supplanted TopoJSON for a lot of visualization-oriented use cases.

nobleach

I'm gonna have to dive into a rabbit-hole! I was working on an ESRI Shapefile to GeoJson converter back in those days. But D3 and Leaflet were such cool tech! MapBox too. Linking SagaGIS with PostGIS to do pre/post wildfire analysis was my jam.

ragebol

Have been using GeoJSON, very handy and human-readable, but we recently switched to GeoPackage files, as it allows for different layers, each with a different schema for additional data.

GeoPackages also allow you to set a proper CRS, which is not as easy in GeoJSON IIRC.

Getting your CRSes wrong is fun...

jackconsidine

GeoJSON is super useful. At Getcho (delivery, logistics) we use zip code GeoJSON encodings to draw polygons on zone maps and quickly generate rates. This has been a persistently annoying thing to do until we discovered this format. If you're curious, someone made a repo with all the 2010 census zips a while back [0].

[0] https://github.com/OpenDataDE/State-zip-code-GeoJSON/blob/ma... although you can generate newer versions from the last census.

korkoros

About 25% of ZIP codes don't have a corresponding Census Bureau ZCTA, for example 10118. Do you end up needing special handling for those cases? Or has it not yet come up in practice?

jackconsidine

Excellent question, it certainly does come up. Practically speaking, the more populous zip codes are all accounted for, and that’s where the vast majority of deliveries go. For example, I took the census zip code data 150 miles (as the crow flies) outside Philly and found virtually 100% coverage.

For missing ones you have to fall back to distance-based estimates, and in my business that means your quote may be off and you’re exposed.

ryandrake

No shade whatsoever at you or your business: I'll say upfront that you certainly made the right practical decision for the goal of running a business.

That said, this is a textbook example of what I have always found so infuriating, personally, about working on commercial software, and one of the many reasons I ultimately moved into a non-software-writing role. The (very sensible and practical) shortcuts and tradeoffs that are commonly made due to time and cost constraints. The attitude of "well the vast majority of our use cases work, so we're done." I've always thought edge cases must be addressed. Something in my brain hurts when I knowingly release something where only 99% of cases work.

I can imagine this is probably the same thing some artists feel when they are commissioned to produce (in their view rushed, flawed, or incomplete) artwork for business purposes.

I only write software at home, as a hobby now, and this gives me the outlet to follow my heart around edge cases!

michaeljhg

thibautg

And with PostgREST [0], you can automatically convert any PostGIS table (with geometry or geography column) to GeoJSON by using an "Accept: application/geo+json" header in the request.

[0] https://docs.postgrest.org/en/v14/how-tos/working-with-postg...

pramsey

At the SQL level, the ST_AsGeoJSON(record) variant will convert a tuple that includes a geometry and any combination of other columns into a GeoJSON output.

steve-chavez

Many thanks for your work pramsey. We use that exact function [1], do you have any plans for a similar function for TopoJSON? One that also has a record parameter? [2].

[1]: https://github.com/PostgREST/postgrest/blob/f1d0e8ea2266077d...

[2]: PostGIS has https://postgis.net/docs/AsTopoJSON.html but it doesn't take a record.

Zambyte

Also https://github.com/timescale/timescaledb

I've found it very useful for storing geospatial data over time.

pramsey

MobilityDB might also be of interest, for people handling trajectories.

layer8

Somewhat related: Falsehoods Programmers Believe About Map Coordinates: https://news.ycombinator.com/item?id=24659039

cogman10

Interesting but, IMO, probably one of the worst uses of JSON. The data you would want to consume is already not "human readable" so it instead introduces a lot of bloat for really no benefit.

If you have a significant number of data points to track, this is going to eat a ton of memory while also being pretty slow to encode/decode.

Imagine, for example, if we encoded this as binary: the first 2 bytes for the feature type, the next 2 bytes for the geometry type, 3 bytes for a fixed-point x, 3 bytes for a fixed-point y, and optionally the properties as a JSON blob in a trailing string. That's 10 bytes for all the coordinate stuff, fewer bytes than the `"type": "Feature"` string alone takes up.
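
A minimal Python sketch of that 10-byte layout, to make the arithmetic concrete. The numeric type codes, the big-endian byte order, and the `SCALE` factor are all assumptions for illustration; the comment above does not specify them. With 24 bits per coordinate and this scale, the resolution is 0.0001 degrees, which is roughly 11 m at the equator, so the compactness comes at a cost.

```python
SCALE = 10_000  # hypothetical fixed-point scale: 1 unit = 0.0001 degrees

def encode_point(feature_type, geometry_type, lon, lat):
    """Pack one point as 2 + 2 + 3 + 3 = 10 bytes (big-endian, assumed)."""
    return b"".join([
        feature_type.to_bytes(2, "big"),
        geometry_type.to_bytes(2, "big"),
        round(lon * SCALE).to_bytes(3, "big", signed=True),
        round(lat * SCALE).to_bytes(3, "big", signed=True),
    ])

def decode_point(buf):
    """Inverse of encode_point; coordinates come back rounded to 1/SCALE."""
    feature_type = int.from_bytes(buf[0:2], "big")
    geometry_type = int.from_bytes(buf[2:4], "big")
    lon = int.from_bytes(buf[4:7], "big", signed=True) / SCALE
    lat = int.from_bytes(buf[7:10], "big", signed=True) / SCALE
    return feature_type, geometry_type, lon, lat

# Type codes 1 and 1 are placeholders, not part of any real format.
record = encode_point(1, 1, -0.13003, 51.50593)
decoded = decode_point(record)

json_fragment = '"type": "Feature"'
print(len(record), len(json_fragment.encode("utf-8")))
```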

doginasuit

Do you mean geocoordinates when you say not human readable? Those are obviously at the heart of geospatial information, but there is quite a bit more to the spec that does benefit from being human readable, and I'd include longitude/latitude among them. There are also solutions like CBOR which allow them to be transferred and decoded/encoded from binary. For performance-critical data you can also use something like protobuf, but it would be a huge pain to handle everything that way. JSON is a great choice as a general spec.


kitd

There's a map facility not linked here that allows you to build GeoJSON graphically:

https://geojson.io/#map=12.42/51.50593/-0.13003

phillc73

GeoJSON is not just for geographical features! Shapes of any kind work just as well.

QuPath[1], a tool for digital pathology whole slide image analysis, can export annotations in GeoJSON format (and import too I suppose).[2] This makes it really very easy to make annotations transportable between tooling.

[1] https://qupath.github.io/

[2] https://github.com/qupath/qupath-docs/blob/main/docs/advance...

tomtomtom777

The spec states:

> The coordinate reference system for all GeoJSON coordinates is a geographic coordinate reference system, using the World Geodetic System 1984 (WGS 84) [WGS84] datum, with longitude and latitude units of decimal degrees.

So that seems to be a misuse of the format. Using a geojson library for this may get you into trouble with ranges or antimeridian cutting.
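
A quick Python sketch of the "ranges" part of that concern: a check that flags positions which cannot be WGS 84 longitude/latitude. The function name and the sample pixel coordinates are made up. Passing the check proves nothing, since small pixel or meter values fall inside the valid range too.

```python
def out_of_range_positions(positions):
    """Return positions that cannot be (longitude, latitude) in degrees."""
    return [
        p for p in positions
        if not (-180 <= p[0] <= 180 and -90 <= p[1] <= 90)
    ]

# A hypothetical image annotation in pixel coordinates.
annotation = [(0, 0), (2048, 0), (2048, 1536), (0, 1536), (0, 0)]
# A real-world position as (longitude, latitude).
london = [(-0.13003, 51.50593)]

bad_pixels = out_of_range_positions(annotation)
bad_london = out_of_range_positions(london)
```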
