From Cory Eicher

Saturday, May 27, 2006

Lessons learned from building an SDE data loader

It's been a while since my last post. What can I say, I've been busy. I had some time today, though, so I dug 'Pet Sounds' out of my CD collection and decided to write a post.

I do actually have some useful content (at least I think so), so that's another good reason for posting.

Lessons Learned - architecting a system

I recently wrapped up a project where I helped a client build an ArcSDE data loader. The basic goal was to take as input some 'feature data' (geometry and attributes). This data is stored in a database in a non-ESRI format. There was already code in place to take the data out of the database and put it into a proprietary dataset format. My code was to read data from this dataset, translate the geometry into ArcGIS geometry, handle coordinate reprojection and datum transformations, and load the data into ArcSDE feature classes.

At a high level, the solution I architected for this was: an SDE loader object to do the actual data loading, a coordinate transformer object to handle projection/datum transformations, and a geometry converter object.

Conclusions:

  1. The coordinate transformer object involved the most work. Part of the job of this object was to interpret coordinate information for the incoming data (call this the 'from coordinate system'). This was defined in a non-ESRI format. I needed it in an ESRI format so I could use ArcObjects to reproject coordinates from the 'from coord. system' to the 'to coord. system' desired for the output SDE feature classes. This was a lot of work.
  2. The SDE loading logic wasn't too difficult. The trickiest thing here was to decide on the 'optimum' method to do things like: load data efficiently to a feature class, and delete data efficiently from a feature class. AO provides multiple ways to do things. See my previous post for more discussion about this part of the architecture.
  3. The geometry translator object's job was simply to translate geometry from the non-ESRI format to the ESRI format. This was many, many times more straightforward than handling coordinate transformation. This makes sense, I suppose. Even accounting for multipart geometries, polygons with holes, etc., the domain of what you need to handle for 2D geometry is just so much smaller than all of the potential projected coordinate systems, non-projected coordinate systems (aka geographic coordinate systems), and all of the 'junk' beneath the GCS level. Also, anyone that's dug into coordinate system 'stuff' in AO knows that there are many ways to skin a cat there, and also to skin yourself! In the end, things in that part of the AO system are very logically put together, but because of the topic, it's still, well, 'non trivial', as a univ. professor of mine used to like to say.

Lessons Learned - exception handling

I understand exception/error handling, and I've tried various strategies over the years to use it in my code... but I had always been a bit frustrated/confused about how best to go about it. Should every sub handle errors? What should it do with an error? Message box? Bubble the error up to the calling sub? Some of you might be laughing a bit about this, but I think that people coming from the 'GIS programmer' mold might find this interesting and useful.

In the above project I was introduced to an exception handling design pattern that I really liked. It's logical, consistent, and does exactly what I want it to do. I was working in C#, but it could work just as easily in other languages.

There are two parts to this: 1) how a sub handles exceptions, and 2) how you call that sub.

It's pretty simple, and I ended up following this pattern with all of my methods.

First, how a sub handles exceptions:

bool mySub(out string statusMessage, out ILayer myOutLayer, int InParam1, string InParam2)
{
    bool bStatus = true;   // return value: true on success, false on failure
    statusMessage = "";    // out: error text if something goes wrong
    myOutLayer = null;     // out: the actual result of the sub

    try
    {
        // do stuff
        // set out param
    }
    catch (Exception E)
    {
        // E.ToString() includes the message and the stack trace
        statusMessage = E.ToString();
        bStatus = false;
    }
    finally
    {
        // clean up
    }
    return bStatus;
}

Second, how to call a sub:

if (!mySub(out statusMessage, out myOutLayer, 1, "brian wilson"))
    throw new ApplicationException(statusMessage);

So, by following this pattern, every sub returns true if it runs successfully, or false if not. If it fails, the statusMessage tells us what the error is. We throw a new exception, which is caught by the current sub's exception handler, and the process continues up the chain. The end result is that the calling sub always gets a useful error message, and a call stack, if there's an error.
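To make the chaining concrete, here's a minimal sketch of two subs wired together with this pattern. The names (StatusPatternDemo, ParseCount, LoadFeatures) are made up for illustration and have nothing to do with the real loader; the "work" is just parsing a number so the example stays self-contained.

```csharp
using System;

// Hypothetical example: a low-level sub and a higher-level sub that calls it.
public static class StatusPatternDemo
{
    // Low-level sub: parses a feature count from text.
    public static bool ParseCount(out string statusMessage, out int count, string text)
    {
        bool bStatus = true;
        statusMessage = "";
        count = 0;

        try
        {
            count = int.Parse(text); // throws FormatException on bad input
        }
        catch (Exception e)
        {
            statusMessage = e.ToString();
            bStatus = false;
        }
        return bStatus;
    }

    // Higher-level sub: calls ParseCount and rethrows on failure so that
    // its own catch block records the error for *its* caller.
    public static bool LoadFeatures(out string statusMessage, string countText)
    {
        bool bStatus = true;
        statusMessage = "";

        try
        {
            string innerMessage;
            int count;
            if (!ParseCount(out innerMessage, out count, countText))
                throw new ApplicationException(innerMessage);

            // a real loader would do its work with 'count' here
        }
        catch (Exception e)
        {
            statusMessage = e.ToString();
            bStatus = false;
        }
        return bStatus;
    }
}
```

Calling LoadFeatures(out message, "not a number") returns false, and message holds the ApplicationException text wrapped around the original FormatException text, including the stack traces from both levels. Calling it with "3" returns true and leaves message empty.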

The statusMessage can also be used to return information about successes. For me, only my 'top level' subs returned non-empty successful string statusMessages. If you want to 'bubble up' successful statusMessages from lower level subs, you'll probably want to modify the above pattern, e.g. not automatically initializing statusMessage to "" at the beginning of every sub.
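One possible way to do that, sketched below, is to have the calling sub receive each lower-level message in a separate local variable and append it to its own statusMessage on success. This is just one variation, not what I did in the project, and the names (SuccessMessageDemo, LoadOne, LoadAll) are hypothetical.

```csharp
using System;

// Hypothetical example: bubbling up success messages from lower-level subs.
public static class SuccessMessageDemo
{
    // Lower-level sub: reports what it did in statusMessage when it succeeds.
    public static bool LoadOne(out string statusMessage, string name)
    {
        bool bStatus = true;
        statusMessage = "";

        try
        {
            if (string.IsNullOrEmpty(name))
                throw new ArgumentException("Feature class name is empty.");

            statusMessage = "Loaded " + name + ".";
        }
        catch (Exception e)
        {
            statusMessage = e.ToString();
            bStatus = false;
        }
        return bStatus;
    }

    // Top-level sub: collects the success messages of every call it makes.
    public static bool LoadAll(out string statusMessage, string[] names)
    {
        bool bStatus = true;
        statusMessage = "";

        try
        {
            foreach (string name in names)
            {
                string innerMessage; // separate local, so nothing gets overwritten
                if (!LoadOne(out innerMessage, name))
                    throw new ApplicationException(innerMessage);

                statusMessage += innerMessage + Environment.NewLine;
            }
        }
        catch (Exception e)
        {
            // on failure the error replaces any success text gathered so far
            statusMessage = e.ToString();
            bStatus = false;
        }
        return bStatus;
    }
}
```

Note the trade-off in the catch block: once something fails, the earlier success messages are discarded in favor of the error. If you want both, you'd need to append instead of assign.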

Talk to you soon,

-Cory