Getting the data in – Shapefiles with LayerMapping

There’s a lot of fun geospatial data out there once you start looking, and the most common format you’ll find (particularly when dealing with government sources) is the Shapefile.

Shapefiles are a proprietary but documented standard created by ESRI, the giant of geospatial software. Since they’re so common, they’re well supported in just about everything and Django is no exception here.

So let’s take another model from geodjango-tigerline, the US state model, and populate it with data from the US Census Bureau’s TIGER/Line product. You can find TIGER/Line at http://www.census.gov/geo/www/tiger/shp.html and the state file we need is named tl_2010_us_state10.zip, available at ftp://ftp2.census.gov/geo/tiger/TIGER2010/STATE/2010/.
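The download is a zip archive, and a shapefile is really several sibling files rather than one. Here is a minimal sketch of unpacking it with the standard library and confirming the pieces are there before handing the path to GeoDjango. The function name extract_shapefile and the checks it performs are my own illustration, not part of geodjango-tigerline.

```python
import os
import zipfile

# A shapefile is a bundle of sibling files; these three are mandatory.
REQUIRED_EXTENSIONS = ('.shp', '.shx', '.dbf')


def extract_shapefile(zip_path, dest_dir):
    """Unzip zip_path into dest_dir and return the path to the .shp file."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        names = archive.namelist()

    shp_names = [n for n in names if n.lower().endswith('.shp')]
    if len(shp_names) != 1:
        raise ValueError('expected exactly one .shp, found %d' % len(shp_names))

    # Every required sibling must share the .shp file's base name.
    base = os.path.splitext(shp_names[0])[0]
    lowered = set(n.lower() for n in names)
    for ext in REQUIRED_EXTENSIONS:
        if (base + ext).lower() not in lowered:
            raise ValueError('archive is missing %s%s' % (base, ext))

    return os.path.join(dest_dir, shp_names[0])
```

Something like extract_shapefile('tl_2010_us_state10.zip', '/root/tiger-line/') would leave the files where the import code further down expects them.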

To do the import we can use my reusable app https://github.com/adamfast/geodjango-tigerline.

In models.py:

from django.contrib.gis.db import models


class State(models.Model):
    fips_code = models.CharField('FIPS Code', max_length=2)
    usps_code = models.CharField('USPS state abbreviation', max_length=2)
    name = models.CharField(max_length=100)
    area_description_code = models.CharField(max_length=2)
    feature_class_code = models.CharField(max_length=5)
    functional_status = models.CharField(max_length=1)
    mpoly = models.MultiPolygonField()

    # GeoManager enables spatial queries on this model
    objects = models.GeoManager()

    def __unicode__(self):
        return self.name

In load.py:

import os

from django.contrib.gis.utils import LayerMapping

from tigerline.models import State  # assumed import path; adjust to your app


def state_import(path='/root/tiger-line/'):
    # model field name -> field name in the shapefile
    state_mapping = {
        'fips_code': 'STATEFP10',
        'usps_code': 'STUSPS10',
        'name': 'NAME10',
        'area_description_code': 'LSAD10',
        'feature_class_code': 'MTFCC10',
        'functional_status': 'FUNCSTAT10',
        'mpoly': 'POLYGON',
    }
    state_shp = os.path.join(path, 'tl_2010_us_state10.shp')
    lm = LayerMapping(State, state_shp, state_mapping)
    lm.save(verbose=True)

The dictionary maps the field name on the model (on the left side) to the field name inside the shapefile’s metadata (on the right side). If you don’t want to import a field, leave it out. If you need to calculate or process a value based on incoming data, set a field that doesn’t come from the shapefile, or do anything else advanced, you need to define a pre-save signal handler, as there’s no way to do this inside a layer mapping: https://docs.djangoproject.com/en/dev/ref/signals/#pre-save

If you run into invalid characters, LayerMapping takes an encoding kwarg when you construct it (it is not an argument to save()). Passing verbose=True to save(), as here, will print a line for each object it creates (and any failures).
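To make the pre-save idea concrete, here is a rough sketch of a receiver that tidies values after LayerMapping has copied them onto the instance but before the row is written. The clean-up rules, the names normalize_state and connect_handlers, and the tigerline.models import path are all my own assumptions for illustration.

```python
def normalize_state(sender, instance, **kwargs):
    """pre_save receiver: tidy values copied over from the shapefile."""
    # Hypothetical clean-up rules, chosen only for illustration.
    instance.name = instance.name.strip()
    instance.usps_code = instance.usps_code.strip().upper()


def connect_handlers():
    """Hook the receiver up; call this once before running the import."""
    # Imported here so the receiver above can be exercised without Django.
    from django.db.models.signals import pre_save
    from tigerline.models import State  # assumed import path for the app

    pre_save.connect(normalize_state, sender=State)
```

Because the receiver only touches attributes on the instance, you can try it against any plain object before wiring it to the real model.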

Now let’s look at creating a new layer mapping from scratch. The relevant Django docs are quite good, so a lot of this is copy/paste/adapt: https://docs.djangoproject.com/en/1.3/ref/contrib/gis/layermapping/#example

This is all inside of a shell.

from django.contrib.gis.gdal import DataSource

ds = DataSource('tl_2010_us_state10.shp')

print(ds[0].fields)

I’ll get back this:

['REGION10', 'DIVISION10', 'STATEFP10', 'STATENS10', 'GEOID10', 'STUSPS10', 'NAME10', 'LSAD10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10']

Based on the documentation the Census Bureau publishes, we can figure out what each element is and decide on a case-by-case basis what we need. I’m personally not very consistent in how I name fields, unfortunately: in some cases it’s by what the file calls it, in others by what’s actually there. Don’t be like me. Decide one way and stick to it.
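When deciding which fields to keep, I find it handy to compare a draft mapping against the field list the shapefile reports. The helper below is a small sketch of that check in plain Python. check_mapping is a made-up name, and it assumes every value in the mapping is a plain string.

```python
# OGC geometry names that may appear on the right-hand side of a mapping
# in place of an attribute field name.
GEOMETRY_NAMES = frozenset([
    'POINT', 'LINESTRING', 'POLYGON',
    'MULTIPOINT', 'MULTILINESTRING', 'MULTIPOLYGON', 'GEOMETRYCOLLECTION',
])


def check_mapping(mapping, layer_fields):
    """Return (missing, unused) field names, each as a sorted list.

    missing: fields the mapping asks for that the layer does not have.
    unused:  fields the layer has that the mapping leaves out.
    """
    available = set(layer_fields)
    wanted = set(v for v in mapping.values()
                 if v.upper() not in GEOMETRY_NAMES)
    return sorted(wanted - available), sorted(available - wanted)


layer_fields = ['REGION10', 'DIVISION10', 'STATEFP10', 'STATENS10',
                'GEOID10', 'STUSPS10', 'NAME10', 'LSAD10', 'MTFCC10',
                'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10',
                'INTPTLON10']
state_mapping = {
    'fips_code': 'STATEFP10',
    'usps_code': 'STUSPS10',
    'name': 'NAME10',
    'area_description_code': 'LSAD10',
    'feature_class_code': 'MTFCC10',
    'functional_status': 'FUNCSTAT10',
    'mpoly': 'POLYGON',
}
missing, unused = check_mapping(state_mapping, layer_fields)
print(missing)
print(unused)
```

An empty first list means every name in the mapping exists in the file. The second list is simply the fields you chose not to import.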

As the docs say, if you’re unsure of the geometry type, ds[0].geom_type will tell you what the shapefile provides. The value for the geometry field in your layer mapping dictionary will always be the OGC name: POLYGON, POINT, etc. (There are other geometries, such as LINESTRING, that we haven’t discussed.)
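As a rough illustration of how the geometry type relates to both the mapping value and the model field, here is a small lookup in plain Python. It assumes the type name is spelled the way GDAL usually reports it (for example 'Polygon'). The describe_geometry helper is hypothetical, though the field class names are the ones GeoDjango provides.

```python
# Geometry type names paired with the GeoDjango model field for each one.
FIELD_FOR_GEOMETRY = {
    'Point': 'PointField',
    'LineString': 'LineStringField',
    'Polygon': 'PolygonField',
    'MultiPoint': 'MultiPointField',
    'MultiLineString': 'MultiLineStringField',
    'MultiPolygon': 'MultiPolygonField',
    'GeometryCollection': 'GeometryCollectionField',
}


def describe_geometry(geom_type_name):
    """Return (mapping_value, field_class_name) for a geometry type name."""
    field = FIELD_FOR_GEOMETRY.get(geom_type_name)
    if field is None:
        raise ValueError('unrecognised geometry type: %r' % geom_type_name)
    # The layer mapping dictionary takes the upper-case OGC name.
    return geom_type_name.upper(), field


print(describe_geometry('Polygon'))
```

Note that the State model above uses a MultiPolygonField even though its mapping value is POLYGON, since a state can be made of several separate pieces.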

But what’s OGC? It’s the Open Geospatial Consortium, an international standards organization for the geospatial field. It develops and publishes the open standards, such as these geometry type names, that geospatial software shares.

 
