Your data is important and XSD is your friend

Whether you have just started building web services or whether it is old hat by now, it is important to take a step back and examine how you have been doing things. Although it is initially cool to quickly generate a service and accompanying WSDL, maybe some client code for testing, and expose your business logic to remote systems, there are more important things involved, like data integrity. It doesn’t matter how quickly you can interact with remote systems if you are getting nothing but junk from the interaction. The old adage “Garbage In, Garbage Out” certainly applies.

How do you know if you are building a robust web service that delivers real value while maintaining data integrity? The answer to this can be found by examining how you handle remote input. Do you assume that everything is OK if an initial test run doesn’t throw an exception? Do you validate input? Do you throw away part of your data and continue processing in spite of it?

Building a robust web service isn’t all that complicated. Here are a few tips:

  • Be specific about your inputs. Is the data element really an xsd:string, or is it actually an xsd:positiveInteger? If it is a numeric type, are you specifying ranges? If it is a string type, are you specifying an enumeration of valid values?
  • Fail early. The job of a web service is to enable easy integration between systems, not to try frantically to make bad data into good data. Make your interface contract clear, and leave client code no choice but to follow the rules.
  • Version your web services. Don’t let your past mistakes haunt you forever. Create new namespaces and URLs for your new, more savvy web services. Then announce an End of Life schedule for the old ones. Well-defined web services that communicate in clear messages are best for everyone.
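To make the first and third tips concrete, here is a minimal sketch using the validation API that ships with the JDK (javax.xml.validation). The namespace urn:example:orders:v2, the quantity element, and the upper bound of 100 are all made up for illustration; they are not taken from any real service.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class OrderContractDemo {

    // Hypothetical contract: the namespace, element name, and range are assumptions.
    // The "v2" in the target namespace is how this sketch versions the contract.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='urn:example:orders:v2'"
        + " elementFormDefault='qualified'>"
        + "<xs:element name='quantity'>"
        + "<xs:simpleType>"
        // Not just a string: a positive integer with an explicit upper bound.
        + "<xs:restriction base='xs:positiveInteger'>"
        + "<xs:maxInclusive value='100'/>"
        + "</xs:restriction>"
        + "</xs:simpleType>"
        + "</xs:element>"
        + "</xs:schema>";

    // Returns true only if the document satisfies the contract above.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no custom ErrorHandler, validation errors surface as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String v2 = "urn:example:orders:v2";
        String v1 = "urn:example:orders:v1";
        System.out.println(isValid("<quantity xmlns='" + v2 + "'>5</quantity>"));
        System.out.println(isValid("<quantity xmlns='" + v2 + "'>0</quantity>"));
        System.out.println(isValid("<quantity xmlns='" + v2 + "'>101</quantity>"));
        System.out.println(isValid("<quantity xmlns='" + v2 + "'>five</quantity>"));
        System.out.println(isValid("<quantity xmlns='" + v1 + "'>5</quantity>"));
    }
}
```

Only the first document passes. Zero, 101, and "five" break the type and range rules, and the last document is rejected because it targets a namespace this version of the contract does not declare.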

Why am I on this SOAP box? Imagine a web service that accepts every possible value imaginable for a state or province. For example, Connecticut might be Connecticut, connecticut, CT, or C.T. I saw this today. Unless the purpose of your web service is to provide mailing address normalization, this is completely out of line. The S in SOAP used to stand for simple, but if you are building this kind of logic into your web services, you missed the boat.

Services should be contract-driven: validate every incoming message against strict XSD rules before it is ever passed into the business logic that makes the service worth exposing remotely in the first place. Clean data gets dirty a lot faster than dirty data gets clean. So, take the steps early on to maintain clean data in your system.
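As a rough sketch of what validating before the business logic could look like for the state example, the following again uses the JDK's javax.xml.validation API. The state element, the two-entry enumeration, and the handle and businessLogic methods are hypothetical stand-ins, not code from a real service.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class StateGateDemo {

    // Hypothetical contract: only two-letter codes are legal. A real schema
    // would enumerate every state and province the service supports.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='state'>"
        + "<xs:simpleType>"
        + "<xs:restriction base='xs:string'>"
        + "<xs:enumeration value='CT'/>"
        + "<xs:enumeration value='NY'/>"
        + "</xs:restriction>"
        + "</xs:simpleType>"
        + "</xs:element>"
        + "</xs:schema>";

    // Returns true only if the message satisfies the schema above.
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Entry point of the hypothetical service: fail early, never normalize.
    static String handle(String xml) {
        if (!isValid(xml)) {
            throw new IllegalArgumentException("Message violates the contract");
        }
        return businessLogic(xml);
    }

    // Stand-in for the real work; it only ever sees validated input.
    static String businessLogic(String xml) {
        return "processed";
    }

    public static void main(String[] args) {
        System.out.println(handle("<state>CT</state>"));
        for (String bad : new String[] {"Connecticut", "connecticut", "C.T."}) {
            System.out.println(bad + " valid? " + isValid("<state>" + bad + "</state>"));
        }
    }
}
```

The point of the design is that businessLogic contains no code for guessing what "C.T." might mean. The client gets an immediate, unambiguous failure and has to send CT.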
