Archive for July, 2007

Well Known Vs. Rational Vector Delivery

July 17, 2007

I am in love with MapShaper. Go play with it right now. The “Simplification Level” slider on the bottom lets us get a great feel for what fast vector simplification algorithms can do. Watching Brazil drop from 33,000 vertices to a few hundred in n*log(n) time is simply sexy.

For those of us concerned with thin-client vector rendering speeds, its a short mental jump to realize the utility of simplified geometries. Simply put, non-optimized geometries make for slow or dead thin clients. Traditional web mapping has addressed this by not addressing it — simplifying vectors to the ultimate level of screen pixels. However, with rasterization we lose the ability to dynamically interact with vectors on the client — often a choice we’d prefer not to make.

Given a bird’s eye zoom-level, its perfectly rational to cull vertices when rendering vectors. However, rendering performance is only one part of the equation. MapShaper pushes ~10 bytes per vertex. With a 500×500 pixel map, 5 vector layers, and 2% on-screen data-density per layer, I can imagine a reasonably performant ~250kB vector map. The problem, then, are those 1mB+ vector datasets, where MapShaper must first push the entire dataset to the client before simplification. Practical use of bandwidth is an issue here.

One casual solution, then, is to have a server-side process simplify the geometry. Yes! Go ahead and load that 1mB+ dataset into your webserver’s RAM and crunch the hell out of it! Enh… no. Follow the pipeline up, and you’ll find your WKB/WKT database sitting there, trying to look innocent. This is a data storage question; and raster tile-caches can offer some guidance.

This is also a data storage question with well defined solutions. Vector quad-trees make huge sense, especially if coordinated with the levels-of-detail in an accompanying raster tile-cache. An end-to-end quadtree approach is intriguing not only for display, but because it intrinically optimizes various spatial operations. The downside to quad-trees, however, is a lack of robust existing solutions, and my lack of precision education on the subject. Those who can, do; those who can’t blog (troll?) about it.

Lacking the formal knowledge, I have devised a primitive test for my database theories. Instead of using a quad-tree approach, I’ll assign scale-levels via Douglas-Peucker line simplification. The following brain-dead data structure will be used to house a polygon dataset: { feature_id, point_order, x, y, scale_lvl }. Using plain SQL, I should be able to query into my dataset ala “SELECT … WHERE x between(…) and y between(…) and scale_lvl … ORDER BY point_order”. I’ll plan on using MonoGIS and NetTopologySuite to read in a shapefile and perform the line simplification. We can expect this match-off to last at least a couple of rounds.

Winners: MapShaper, if only for its good looks.

Losers: Your everyday spatial database. To my knowledge, they’re just not built to auto-magically return simplified geometries.