Radius limited searching with the ORM

A handy abstraction inside GeoDjango is the D object. It converts from more units than you knew existed into other units. (complete list) We’ll be using it quite a bit as we do various queries.

>>> from django.contrib.gis.measure import D
>>> D(mi=5).km
8.0467200000000005

Another type of search (and the one you’ve most likely seen all over the internet) is the radius search. If we want to search in a given radius – say the user has requested only results within 20 miles, we can do it a couple ways.

>>> Campground.objects.filter(point__distance_lte=(lawrence, D(mi=20)))
[<Campground: Bloomington Pubilc Use Area - Clinton Lake>, <Campground: Clinton State Park>, <Campground: Rockhaven - Clinton Lake>, <Campground: Lone Star Lake Park>, <Campground: Perry State Park>, <Campground: Slough Creek - Perry Lake>]

[This query took 5.529ms.]

But if you know the area you’ll notice this isn’t in any particular order, and doesn’t provide us with the distance figure. All we know is it’s less than 20 miles. So we go back to the method from yesterday, .distance() on the queryset to ask for that information, and order by it.

>>> Campground.objects.filter(point__distance_lte=(lawrence, D(mi=20))).distance(lawrence).order_by('distance')
[<Campground: Clinton State Park>, <Campground: Bloomington Pubilc Use Area - Clinton Lake>, <Campground: Rockhaven - Clinton Lake>, <Campground: Lone Star Lake Park>, <Campground: Slough Creek - Perry Lake>, <Campground: Perry State Park>]

[Adding distance took us to 6.3ms, and then ordering we stayed at 6.3ms]

But again at larger data sets this becomes slow. A better method is instead to create a geometry and go back to using the __within query.

Take our origin point and let’s transform it into a different coordinate system, 900913. This is the “Google Maps mercator” projection but the important thing is unlike WGS84 and its degree unit of measure that’s inconsistent, a degree one way does not equal a degree in the other, this mercator is in meters and completely consistent. Next we buffer (increase, and in the case of a point turn into a circle) the geometry into a polygon with a 20 mile radius.

>>> lawrence.transform(900913)
>>> lawrence_poly = lawrence.buffer(20 * 2172.344)

Now pass that polygon into our queryset, and we get our results back [0.125ms!] – again not in any order, so we apply distance and ordering to them.

>>> Campground.objects.filter(point__within=lawrence_poly)
[<Campground: Bloomington Pubilc Use Area - Clinton Lake>, <Campground: Clinton State Park>, <Campground: Rockhaven - Clinton Lake>, <Campground: Lone Star Lake Park>, <Campground: Old Town - Perry Lake>, <Campground: Perry State Park>, <Campground: Slough Creek - Perry Lake>]
>>> Campground.objects.filter(point__within=lawrence_poly).distance(lawrence).order_by('distance')
[<Campground: Clinton State Park>, <Campground: Bloomington Pubilc Use Area - Clinton Lake>, <Campground: Rockhaven - Clinton Lake>, <Campground: Lone Star Lake Park>, <Campground: Slough Creek - Perry Lake>, <Campground: Perry State Park>, <Campground: Old Town - Perry Lake>]

And yikes, is this method fast. 6ish ms to 0.125.

There’s a problem though. Fans of the metric system are gathering outside my door with pitchforks and torches – “a mile is 1609 kilometers! what’s with this 2172.344 bullcrap?”

I wish I had an answer for them. The truth of the matter is 900913 isn’t all that accurate at some things, but we need a coordinate system in meters. Doing it the RIGHT way with D(mi=20).m gives me back TWO fewer results than I know I should have, one at 15.9 and another at 17.9 mi. Those two are returned (plus another record, at 20.7 mi) by the 2172.344 number.

Google searching around the number doesn’t bring up much, but I have a gut feeling it came from a conversation a long time ago in the GeoDjango IRC room (#geodjango on irc.freenode.net if you’re interested) and has been in my notes ever since. It’s close enough for me, and many others.

EDIT: Marc has sent in a good comment with a good ESRI post on the problem (thanks Marc!). While I knew it had to do with optimization for things OTHER than measuring in the coordinate system it goes into much more depth. The bad part is since we’re constructing and not measuring here transforming into a different coordinate system after construction doesn’t help (tried it many times). The REAL answer is to use a better coordinate system but I try to trade off absolute accuracy for being more capable across the globe.

And that’s what neogeography is all about. Things that would make a surveyor have a fit but work for us, and our users. (as long as it’s not intentionally used to mislead. My degree’s in Journalism, so I take that element of telling the best truth with the numbers I can seriously)

Leave a Reply

Your email address will not be published. Required fields are marked *