
The Good, The Bad and the Ugly of Scrum

Monday, August 13th, 2007

We all hear about it and we all love it: the Rugby-inspired software development methodology known as Scrum. It’s fast becoming an industry buzzword and causing many project managers to question their Gantt charts. For all the hype, what is the reality of Scrum?

Scrum is an agile software development methodology for project management. It is characterized by a prioritized product backlog that lists new features. Work is completed and delivered in time-boxed iterations known as sprints (e.g., two-week iterations). Scrum teams are cross-functional and typically number 3-7 people each. Iterations begin with an iteration planning meeting and end with a retrospective to review what worked and what didn’t.

During a sprint each Scrum team gathers for a daily stand-up, a short meeting where each person describes what they did since the previous meeting, what they’re planning to do now, and any impediments. The team is self-organizing, leveraging our instinctive behavior to work in small groups. The Scrum process is facilitated by a Scrum Master. That title is a bit of a misnomer since the Scrum Master carries no authority and is instead responsible for blocking any distracting influences that could disrupt the team’s progress.

The principles of Scrum are well defined in the Wikipedia article as well as in the book Agile Project Management with Scrum by Ken Schwaber. You can also shell out ten grand for an in-person experience with Ken. There is nothing like an expert talking about the work that you should be doing. As some of my friends and co-workers like to hear me say: Get back to work!

What’s so Good about Scrum?

Delivering working software. Working software is where Scrum really shines. It’s proving to be an excellent implementation of Agile Software Development with core values such as customer satisfaction and individual interaction.

There are four core values to the Agile Manifesto:

  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan

These are all valid principles that are easy to ignore and have proven to be hard learned lessons despite how obvious they may seem. Projects typically fail when they ignore these principles. Good documentation has never compensated for crap software. Try telling the upset customers that you gave them exactly what they contractually signed in the Service Level Agreement. The principles of the Agile Manifesto should be held as software engineering law.

Scrum provides a very effective methodology that ensures these principles through an empirical approach to software development that embraces and encourages change.

If it’s so great, what’s the Bad news?

No one likes to admit this, especially not Scrum advocates like myself, but Scrum fundamentally conflicts with traditional PMO. It’s an interesting round of cognitive dissonance to watch a PMI-certified project manager attempt to rationalize Scrum. There are of course several ways to deal effectively with this dissonance but understand: these are fundamental differences which are not philosophically compatible.

Scrum will go far in delivering working software, but what about managing roadmaps? Better yet, what about resource allocation and, God forbid, the budget forecasts that are needed before a project starts? We need money and people to start a project and we’d like to know roughly how much something will cost before we agree to invest. In a perfect world we would know these answers and we wouldn’t need Scrum. But Scrum exists out of necessity, born from the failures of so many software development projects, and reminds us that the entire enterprise of software engineering is oftentimes more like scientific discovery than building construction (which is where PMI originated, and rightfully so).

The Ugly

Scrum delivers working software in chaotic environments. At the same time, Scrum is a symptom of a larger problem in software engineering: many software projects cannot be managed like construction projects without accumulating technical debt, unhappy customers, declining quality, budget overruns, and missed deadlines.

Migrating to a Scrum methodology typically has the effect of providing early visibility into problems. The wisdom is that it’s better to know that you’re failing one or two months into a project rather than years in. While this makes sense, and this transparency is a very valuable aspect of Scrum, the reality is very ugly.

Transparency into critical problems oftentimes stems from the fundamental conflict between traditional PMO and Scrum. These are problems that are unfortunately outside of the realm of Scrum or software engineering. In this situation it is easier to attack the symptom (Scrum) than it is to address the underlying issue, which is unifying project management (roadmaps, budgets, resource allocation) with the software development.

Visibility? Be careful what you ask for! If a project is going to fail, maybe it’s better to let it fail naturally than to induce ulcers in the Scrum Masters and the development staff. Remember: a good Scrum Master will be a burned out Scrum Master in most environments.

This isn’t an easy problem to solve, but done wrong Scrum can create an emergent failure. Take the following anecdotal quote from Brad Wilson:
Scrummerfall. n. The practice of combining Scrum and Waterfall so as to ensure failure at a much faster rate than you had with Waterfall alone.

On the other hand: Scrum, done right, has the potential for an emergent success given iterative and continuous improvement. A potential method for solving the real problem is to exploit the emergent behavior of a system.

Emergent behaviors are difficult to track, but analyze the existing software, processes, and development and determine whether or not they are evolving appropriately. A successful development process should continually improve the same way the code itself should continually improve. The process itself should be agile, responding to change to better produce working software.

Better yet, the individuals and interactions should be agile. It is the people that must respond to change.

Evolving Database Schemas

Monday, January 8th, 2007

Master your Domain

It amazes me that while database schema evolution is one of the most critical factors in software development it’s also one of the most ignored and least understood aspects. Pretty much every interesting software development project requires an evolving database schema. From renaming tables to modifying relationships it’s as critical as any piece of source code but is often the least cared for part of any project. Your schema is what maintains order in your data, arguably this is the most important part of any enterprise (organizing your data)!

There are mountains of excellent (and practical) resources on how to manage software development projects. I can get certified in Scrum or nearly any Agile-based development methodology but I can barely find any useful websites on managing my database in an Agile environment. When your development process encourages change, this applies to your data modeling and schema design and not just application development.

There are some excellent articles from Fowler and Ambler that provide a good starting point. However, I found a definite lack of details and practical advice in those articles. It goes without saying that I want to run regression tests and version my work, but a database is distinctly different in that I have to deal with all that existing data and can’t exactly redeploy a schema while preserving the old data (not easily anyhow).

Existing persistence strategies often fail to accommodate Agile-based development leading to poor design choices in data modeling. In this article I’d like to explore practical solutions to Agile database development. Like anything Agile, there’s no perfect solution, but this is a pervasive problem across all interesting software development projects and I hope the discussion alone will yield better solutions.

To start, there is one important observation that I’ve found to be true across various software development projects: If the code smells it’s likely that the database smells worse! Let’s examine some common database smells:

  • Inconsistent relationship strategies; when your ER diagram starts looking like spaghetti and every piece of business logic introduces a different convention you’ve got a problem. You have object tables and three possible types of relationships (1-1, 1-N, N-M), your data model is only as complicated as you make it; pick a strategy for each of the three types of relationships and stick to it. I liken this to using GOTO statements in software, it’s unacceptable.
  • Inconsistent object model strategies; this is often the impetus to change your relationship strategies, that is, when I have inconsistent strategies for object tables it leads to very confusing relationships. You’ll see one table with a varchar(20) NAME and another with a varchar(16) NAME, is this the name of the object or does the object contain a “name”? Use a consistent strategy for tables as well as concepts like status, timestamp, IDs and alternate keys (such as name).
  • Inconsistent naming conventions; STAT_DATE, STATUS_DT, or STATDATE? Pick a convention and stick with it!
  • Inconsistent usage of the same column; a common example is a varchar column named TYPE that means different things to different applications. I’ve noticed that even good data modellers make this mistake.
  • Overloading fields; things like a comma-separated list of values where only the application knows what each value means. Don’t use a relational database if this is how you model – text files may work better!
  • F normal form; Johnny just took a class on database design and learned about normalizing a database and now you have 175 tables in what he claims is 6NF! There are appropriate times to denormalize just as often as there are to normalize.
  • Know when to OLAP; why are there summary tables attached to each of my transaction tables?
  • The Cauldron of Data; this is the crux of the problem, everyone is so scared of the data that they lose control of the schema and treat it like a bubbling cauldron too paranoid to make any significant changes out of fear of breaking a legacy application. This is the end result in any application where they didn’t manage their evolving database schema.

Let’s talk about some solutions!

First of all, this is a developer problem! Don’t expect your DBA or Hibernate to fix this for you – if you’re a developer this is your problem. This leads to the central theme of how I propose database evolution to be solved: Your development methodology must cover application and database development.

If your application depends on a database, then your development methodology better cover both application and database development! I know, you like building the code and leaving the responsibility of the database to someone else. But that brings you back to the Cauldron of Data scenario where you can’t make any significant changes to a schema because it got out of your control. And if you can’t control the schema you can hardly control the application that depends on that schema!

We tend to ignore database development as a way to simplify our application development – I suggest you make the application development suffer by adhering to a development methodology that works with database development! Think of it like this: you’re going to be the database’s bitch if you don’t take this responsibility.

I know, this seems like more work from the application side but like anything done right it’s hard to imagine doing it differently once you get your development methodology to cover applications and databases. That said, how does one integrate their database development into a unified development methodology?

Let’s look at the differences between application development and database development (and what needs to change in the traditional Agile-based development methodologies):

  • Databases contain data that cannot be lost; this means you have to migrate production rather than reinstall
  • Data is easy to migrate when your data is not controlling you (see the Cauldron above)
  • Rebuilding your database is like compiling and deploying your code (this sounds like a maven target)
  • Databases should have unit tests, and not just for the stored procedures (more on this later)
  • Map your database development to your project lifecycle goals exactly like you would with application development (say, in Maven 2) but introduce the migrate step in the deploy target.

If I compile and build my application why not build the database schemas at the same time, just like I would with any other dependent artifact? So, let’s get practical and talk about things you can actually do to accomplish Agile-based evolutionary database design:

Create a Database Schema Change Policy

Keep it simple and make sure you answer how you plan to address schema migrations planned and unplanned. Your process should lend itself to an emergent property of better schema design. This by itself requires you to not only support planned and unplanned schema changes, but to encourage them. Either do a big design up front (not-agile) or encourage change in all aspects of your development (including your data model). I recommend you clearly define a process for planned migrations (migrating from one version of the schema to another) and unplanned patches (critical fixes, the kind you get in the middle of the night).

Bring DBAs in Early

You’ll need their help, and you know it, best to get friendly with them early on – give them a chance to know what you’re trying to do on their database. I argue that you’ll find more resistance to agile development from software developers than you will from DBAs. Most DBAs have been on the front-line fixing smelly database code and are likely your strongest ally. Not only can they help with the development process they can (and should) assist with design.

Use Stored Procedures

Read up (separately) on “End-to-End Architecture”. If your schema is going to change then you better clearly define your endpoints and provide an API-like package to abstract the schema completely. What’s great about stored procedures is that they can (and should) be treated like application code. I recommend two types of packages; consider using a suffix of _PKG and _API. All of your object tables will likely have GET, PUT, and DELETE procedures. These should be autogenerated; if not, write yourself a script or invest in some software to autogenerate CRUDL stored procedures. Each schema should have a _PKG with CRUDL procedures for all object tables. You also have business logic, from complicated transactions to simple procedures like authenticate(user, pass). Procedures that encapsulate business logic should be in packages with an _API suffix and follow the same rigorous design that would be employed for any application API.

The naming convention of an _API and _PKG suffix is unimportant (any convention here would be fine), but what is important is distinguishing between your CRUDL procedures and your APIs that encapsulate your business logic. Once you have a convention that cleanly separates these concepts you now have a mechanism which can completely abstract your schema from your application and best of all, you’ve likely imposed some constraints and standardization on your object tables that lend themselves to easy autogeneration of the CRUDL procedures.

Version your Schema just like you would an Application

Versioning is a given for application code, why should database code be any different? Schemas, default data, packages, grants, everything should be versioned along with ALL other application code. Applications depend on a versioned database, just like any other versioned artifact – I would expect my build to fail if the dependent database for my application doesn’t exist.

Each of your schemas is like an application, and all of the DDLs should be checked into source control and managed as applications! Check in your test data (sql inserts) and you’ll easily be able to define a database-specific unit test environment!

Use the Right Tools

You’ll need more than a modeling tool, modeling tools are great at helping you to visualize your schema, but don’t get carried away. You need to track schema AND default data! Use tools that fit your process not the other way around – write your own scripts if you need, they’re not that hard once you have a working process. Between Maven and some sqlplus scripts we’ve gotten plenty of mileage at my current job with the following scripts:

  • drop_objects.sql; loops through all of the schemas and drops everything, there’s also a delete user approach but with the drop script you don’t have to redefine your tablespace; this script is never run in production
  • create_objects.sql; loops through all of the schemas and creates all of the tables and default data; this script is never run in production
  • create_pkg_spec.sql, create_pkg_body.sql; loops through all schemas and compiles the package specs and separately the package bodies
  • run_tests.sql; loops through all schemas and runs database unit tests, it uses stored functions with setUp and tearDown procedures similar to JUnit; this script is never run in production
  • migrate_objects.sql; loops through all objects and runs a per-schema migrate script which is created based on the delta between two different versions of the same schema
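To make the "delta between two versions" idea concrete, here is a minimal sketch of picking which migrate scripts to run for one schema. It is not the migrate_objects.sql described above: the class name MigrationPlanSketch, the integer version numbers, and the migrate_X_to_Y.sql file names are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

final class MigrationPlanSketch {

    // Hypothetical layout: each schema version maps to the script that upgrades
    // the previous version to it (for example 3 -> "migrate_2_to_3.sql").
    static List<String> plan(TreeMap<Integer, String> scripts, int current, int target) {
        if (target <= current) {
            return Collections.emptyList(); // nothing to migrate
        }
        // Scripts for versions in (current, target], in ascending version order.
        return new ArrayList<>(scripts.subMap(current, false, target, true).values());
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> scripts = new TreeMap<>();
        scripts.put(2, "migrate_1_to_2.sql");
        scripts.put(3, "migrate_2_to_3.sql");
        scripts.put(4, "migrate_3_to_4.sql");
        // A database at version 1 that should end up at version 3.
        System.out.println(plan(scripts, 1, 3));
    }
}
```

Keeping the scripts keyed by version means the same plan works for a planned migration across several versions and for a single unplanned patch.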

Localhost Development

Why else do we have fancy development workstations? Stop assuming Eclipse is allowed to eat up all of your resources – let Oracle do it! The only way to empower your developers to be agile is to give them an environment where they can easily change the database schema!

We’ve gone so far at my current job to support localhost Oracle instances where we checked Oracle into our software version control (along with Tomcat, Java, etc). We tried using the Express Edition but it didn’t support all of the PL/SQL code we were developing so we’ve got the full bloated 10g running on all of the developer workstations (takes about 30 minutes to install on a new workstation). So don’t tell me you can’t run MySQL locally!!

REFERENCES

http://www.martinfowler.com/articles/evodb.html
http://www.agiledata.org/essays/databaseRefactoring.html
http://www.agiledata.org/essays/databaseRefactoringSmells.html
http://web.mit.edu/Saltzer/www/publications/endtoend/endtoend.pdf

Making Java less Difficult

Wednesday, May 3rd, 2006

I’m sure that you’ve heard this before, but I’ll say it anyway: Why do Java developers make simple problems difficult? There are plenty of people who have written on that topic, and I have no desire to further expound the point. I think we all get it. What I do want to talk about is how to make things less difficult, in particular how to make web application development as easy in Java as it is in Perl, PHP, Python, or Ruby.

I challenge that the answer is not to build Yet Another Crappy Web Framework ™. I for one, am terribly sick of complicated frameworks that are more work to configure than a web application is to create manually.

What I’ve noticed, is that developing in Perl is often faster than Java mostly due to the built-in regular expression handling Perl offers; and this is despite the fact Java offers a comparable regular expression engine.

Two things that slow me down as a developer:

1) Staring at code that involves a StringTokenizer, while loops, and several nested substring statements where I’m counting characters on my fingers. Why do Java programmers continue to write code like this? I suspect the reason is because of the next issue.

2) Wanting to use a regular expression, you first create a Pattern object, followed by a Matcher object, remember to compile your pattern, and then finally match the pattern using the Matcher object. Your matched object can use the group method, which is implemented from the MatchResult interface. I’m going to repeat myself, as this bears repeating, why do Java programmers make simple problems difficult?

These two issues cover 90% of the cause for Java development to be slower than Perl. Solving these two problems should then speed up my development in Java to be on par with my development in Perl!

Let’s look at some sample Perl code:

if (/a(.*)b/) {
  print "$1 is between a and b";
}
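For comparison, here is a minimal sketch of the same match written against the standard java.util.regex classes. The class name BetweenDemo and the helper between are made up for illustration, and the input string is just sample data.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BetweenDemo {

    // Compiled once; the pattern mirrors the Perl /a(.*)b/ above.
    private static final Pattern BETWEEN = Pattern.compile("a(.*)b");

    // Returns the text captured by group 1, or null when the input does not match.
    static String between(String input) {
        Matcher m = BETWEEN.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String found = between("xxa123byy");
        if (found != null) {
            System.out.println(found + " is between a and b");
        }
    }
}
```

It works, but the Pattern and Matcher ceremony is exactly the overhead described in issue 2.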

We could do this fairly easily in Java, but let’s be honest, the real strength of Perl is when I start doing things like:

 @bar = split(/:/, $_);
 @foo = grep(/^#/, @bar);
 print join(':', @foo)

While that example is simple to any Perl programmer, it is admittedly less intuitive and less clean than a Java version, although it is only three lines rather than the twenty or more it would take in Java. But what about something like this:

XYZ bar = XYZ.split(":", arg);
XYZ foo = bar.match("^#");
System.out.print( foo.join(":") );

This seems easy, and perhaps clean enough not to bewilder the Java developer the way Perl code tends to. Perhaps we can create a utility Java class that supports the needed data structure and methods to write code like that. What other methods would we need to empower Java developers to stop writing the usual tokenizer/substring mess? Consider just the basics:

  join - join the values of an array into one string
  split - split a string into parts
  match - match the parts
  matchAll - match repeated patterns
  trim - trim the ends of a string

This much is already provided in Java in one class or another. The String class already lets me match and split, but it’s not enough. I need more than just a String class, I want an array of Strings. No, I want a dynamically sized array of Strings that I can easily get and put from. While I’m at it, I want it to be associative array, where I can treat it like a normal array and it will retain index order, or I can add my own keys (similar to PHP arrays). Given a data structure like that, add in the above methods to that data structure, and I think we’d be in business.

The scripting languages provide a default context that includes regular expressions, hashtables, dynamically sized arrays and various string manipulation features. I can do all of that in Java, but it’s not convenient creating a different object for multiple data structures and methods, especially since they’re almost always part of the same context.

Let’s create one class that contains the data structure and the methods. This should solve both of my issues, making it easy to do what once was difficult and speeding up development for most all applications.

Even better, we can leverage what Java is good at to get this done. Let’s extend a Java Hashtable which solves most of the data structure part.

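import java.util.Hashtable;

// A Hashtable that can also be used like a resizable array of values.
public class Preg extends Hashtable<Object, Object> {

 // Look up a value by its integer index (boxed into an Integer key).
 public Object get( int index ) {
  return this.get( Integer.valueOf(index) );
 }

 // Append a value, using the current size as the next integer key.
 public void put( Object value ) {
  this.put( Integer.valueOf( this.size() ), value );
 }
}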

I’ve added the ability to put values without providing a key; this will allow us to use this object as both a Hashtable and a Vector (i.e. a resizable array). For those of you who are upset that this is inefficient, you’re absolutely right. Although I would argue that it’s a negligible difference. But since we’re speeding up development, this will more than compensate since I’ll actually have time to optimize my code and do some refactoring before the deadline (a luxury I rarely see Java developers having time for).

The next step is to create the methods, we should also provide static versions of the methods to gain maximum convenience. For example, I may only implement one “join” method, but I’ll provide several prototypes:

 public String join() {
  return Preg.join("", this);
 }
 public String join( String glue ) {
  return Preg.join(glue, this);
 }
 public static String join( Preg src ) {
  return Preg.join("", src);
 }
 public static String join( String glue, Preg src ) {
  ... the actual implementation
 }

After I do this for all of the methods, I’m left with a very powerful object that allows me to accomplish the previous Perl example:

Preg bar = Preg.split(":", arg);
Preg foo = bar.match("^#");
System.out.print( foo.join(":") );

Three lines of Java that for once are equal to three lines of Perl!

All the fancy trickery that we do in Perl or PHP can be done in Java with just the right mashup of Hashtable, Pattern, and Matcher. You can split a string into parts, filter, trim, and arrange to your heart’s desire without ever having to count a character offset for a substring!!
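To show how little code this takes, here is a rough sketch of split, match, and join on one class. It is not the Preg source mentioned in this post: the name PregSketch and the add method are invented for the example, and it assumes a LinkedHashMap instead of a Hashtable so that values keep their insertion order.

```java
import java.util.LinkedHashMap;
import java.util.regex.Pattern;

final class PregSketch extends LinkedHashMap<Object, Object> {

    // Append a value under the next integer index, like a resizable array.
    void add(Object value) {
        put(Integer.valueOf(size()), value);
    }

    // Split src on a regular expression and collect the parts in index order.
    static PregSketch split(String regex, String src) {
        PregSketch out = new PregSketch();
        for (String part : src.split(regex)) {
            out.add(part);
        }
        return out;
    }

    // Keep only the values that contain a match for regex (like Perl's grep).
    PregSketch match(String regex) {
        Pattern pattern = Pattern.compile(regex);
        PregSketch out = new PregSketch();
        for (Object value : values()) {
            if (pattern.matcher(String.valueOf(value)).find()) {
                out.add(value);
            }
        }
        return out;
    }

    // Join all values with glue, in insertion order.
    String join(String glue) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (Object value : values()) {
            if (!first) {
                sb.append(glue);
            }
            sb.append(value);
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        PregSketch bar = PregSketch.split(":", "#one:two:#three");
        PregSketch foo = bar.match("^#");
        System.out.println(foo.join(":"));
    }
}
```

The static split plus instance match and join is enough to reproduce the three-line example above; the other prototypes would only delegate to these.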


Note: I have all of this working for one of my projects, which will hopefully move as fast as it would in Perl or PHP (it’s a web application, so I suspect this will in fact compensate for that 90% slowdown I mentioned earlier). If there’s interest, I can provide the source and/or Javadoc.

Even with my extremely verbose Javadoc it’s surprisingly not that much code.

Creating a RESTful SOA

Thursday, April 6th, 2006

I’m going to keep this discussion purposely language agnostic, however, I currently have two projects where I’m implementing a complete RESTful SOA (Service Oriented Architecture). One in J2EE, with a Model 2 architecture using Servlets, JSP, Oracle, and XSLT for the presentation-layer transforms. The other project is LAMP-based (Linux, Apache, MySQL, and PHP) using mod_rewrite and a custom front-controller.

It’s worth pointing out that a service-oriented design is different from object oriented design; and further, these two concepts are orthogonal to RESTful web applications. The PHP application I’m working on is not object-oriented (OO) while the J2EE project is rigorously OO. Both projects follow the same RESTful SOA pattern that I am presenting here.

Both projects have elements of aspect-oriented design (AOD), such as the authorization and logging aspects. I’m going to avoid this topic, and instead focus on the specifics of a RESTful SOA, but if you’re interested it’s worth understanding AOD as a way to complement the traditional object-oriented design principles that don’t often lend themselves to things like REST.

On with the architecture!

Context of Use

The first step in building a RESTful SOA is to understand, and even better, document your context of use. An easy way to do this is to create a spreadsheet with three columns: Roles, Environment, and Goals. Create a list answering the following questions:

Roles: Who is going to use this site?
Environment: Where are they working?
Goals: What are they trying to do?

This is a simple exercise that is often overlooked. Ignoring this will just give you more work and a crappy product. The beauty of a RESTful SOA is that it provides resources that map to the different types of users and their goals. Don’t add anything beyond what your users need to do. You’ll find yourself making simple pages that provide a positive user experience.

Resource List

Walk through your context of use and create a list of content categories that meet your users’ goals. Even if you’re dealing with advanced applications, don’t complicate this process; tackle each of the users’ goals separately with different content sections.

Create a simple file hierarchy representation of your content, mapping closely to your users and their goals. Here’s an example snippet:

/
/support
/admin
/it
/support/cases
/support/cases/num *
/support/request
/support/metrics

The REST is easy

You’ll find that in this process you’ve listed all of the resources that are needed to fulfil the needs of your users. Don’t invest your time in a CMS or a portal; the next step is to build your own application. You’ll find that a RESTful SOA will be quicker to build than configuring “Hello World” in a JSR-168 portlet!

Front Controller

This is critical, avoid JSF, and stop inserting random code into your html via JSP or PHP or whatever. You’ll want one small program that intercepts all traffic. Every language has a different way of doing this, whether you have to use mod_rewrite or servlet-mapping I don’t care; but funnel everything into one program and grab the path:

  class FrontController {
    path = getPathInfo();
    ...
  }

The Method Matters

The beauty of REST is that we leverage HTTP, URI, and our own web-server to do most of our work. Your front controller needs to process each RESTful HTTP method (GET, POST, PUT, DELETE) separately. Also make sure a request object gets passed in (query string, post variables, cookies, etc). You should have something like this:

  class FrontController {
    path = getPathInfo();
    ...

    function handleGet(path, request) {
      ...
    }

    function handlePost(path, request) {
      ...
    }

    function handlePut(path, request) {
      ...
    }

    function handleDelete(path, request) {
      ...
    }

  }
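To keep things language agnostic I have used pseudocode, but for the Java-minded here is a small sketch of the same dispatch. It deliberately avoids the Servlet API so it runs on its own; the class name FrontControllerSketch, the plain Map standing in for the request object, and the string return values are assumptions for illustration only.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;

final class FrontControllerSketch {

    // Single entry point: every request is funneled here with its method and path.
    static String dispatch(String method, String path, Map<String, String> request) {
        switch (method.toUpperCase(Locale.ROOT)) {
            case "GET":    return handleGet(path, request);
            case "POST":   return handlePost(path, request);
            case "PUT":    return handlePut(path, request);
            case "DELETE": return handleDelete(path, request);
            default:       return "405 Method Not Allowed";
        }
    }

    // Placeholder handlers: each one only reports what it was asked to do.
    static String handleGet(String path, Map<String, String> request)    { return "GET " + path; }
    static String handlePost(String path, Map<String, String> request)   { return "POST " + path; }
    static String handlePut(String path, Map<String, String> request)    { return "PUT " + path; }
    static String handleDelete(String path, Map<String, String> request) { return "DELETE " + path; }

    public static void main(String[] args) {
        Map<String, String> noParams = Collections.emptyMap();
        System.out.println(dispatch("GET", "/support/cases", noParams));
        System.out.println(dispatch("PATCH", "/support/cases", noParams));
    }
}
```

In a servlet container the same split would come from doGet, doPost, doPut, and doDelete; the point is only that each method gets its own handler.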

KISS your CRUD Goodbye!

I should say CRUDL (Create, Read, Update, Delete, List), but that didn’t sound as cool. The next step is to parse your path, this varies from one application to another, but generally your path will contain what I like to call a “resource” followed by “input”. For example:

/support/cases/50

50 is not a resource! However, /support/cases most certainly is. Also, if the user goes to /support/cases, I should be showing them a list of cases rather than details about a specific case.

The simple way to solve this is that when you process your GET request and you haven’t been given any “input”, just a resource name, you’ll want to call the “List” method rather than “Read”. This keeps it simple and very, very RESTful.

Put the DAO to REST!

The astute reader may have noticed I glossed over how you’re supposed to derive a resource and an input from a single path. Well, REST assured, this is trivial. Create a simple data model with a Resource entity that maps to itself (one-to-many) to allow for multiple children resources. So, “/support/cases/50” would involve two entries:

+---------+---------+
| NAME    | PARENT  |
+---------+---------+
| support | NULL    |
| cases   | support |
+---------+---------+

Obviously, you’ll want to use unique numeric IDs so you don’t have name collisions, but you get the idea. This will allow you to look up unique resources and an optional input parameter. The “resource” entity should also contain things like “Title”.

The children resources represent the sub-elements that you may want to appear on a menu navigation. So you’ll want to keep these handy also.
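Here is one way the lookup could work, as a sketch only. It assumes the Resource rows have already been loaded into an in-memory set of full paths (a real application would query the table), and the names PathResolverSketch, resolve, and operationFor are invented for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class PathResolverSketch {

    // Known resources; stands in for the rows of the Resource table.
    private static final Set<String> RESOURCES = new HashSet<>(Arrays.asList(
        "/support", "/support/cases", "/support/request", "/support/metrics",
        "/admin", "/it"));

    // Returns {resource, input}: the longest known resource prefix of the path
    // and whatever follows it ("" when the path names only a resource).
    static String[] resolve(String path) {
        String trimmed = path.replaceAll("^/+|/+$", "");
        String[] segments = trimmed.isEmpty() ? new String[0] : trimmed.split("/");
        String resource = "/";
        String candidate = "";
        int matched = 0;
        for (int i = 0; i < segments.length; i++) {
            candidate = candidate + "/" + segments[i];
            if (!RESOURCES.contains(candidate)) {
                break; // the remaining segments are input, not resources
            }
            resource = candidate;
            matched = i + 1;
        }
        String input = String.join("/", Arrays.copyOfRange(segments, matched, segments.length));
        return new String[] { resource, input };
    }

    // A GET with no input lists the resource; a GET with input reads one item.
    static String operationFor(String path) {
        return resolve(path)[1].isEmpty() ? "list" : "read";
    }

    public static void main(String[] args) {
        String[] parts = resolve("/support/cases/50");
        System.out.println(parts[0] + " with input " + parts[1]);
        System.out.println(operationFor("/support/cases"));
    }
}
```

Walking the path one segment at a time mirrors the parent and child rows in the table: each step down is one more lookup, and the first segment that is not a child becomes the input.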

Finally, you now have a unique resource, with optional input and one of four HTTP methods. This all maps nicely into one pretty package, which should be something like:

  class RESTController {
    path = getPathInfo();
    resource = getResource( path );
    input = getPathInput( path );

    function handleGet(resource, input, request) {
      if ( empty(input) )
        resource.list( request );
      else
        resource.read( input, request );
    }

    function handlePost(resource, input, request) {
      resource.update(input, request);
    }

    function handlePut(resource, input, request) {
      resource.create(input, request);
    }

    function handleDelete(resource, input, request) {
      resource.delete(input, request);
    }

  }

Presentation and RESTful Content

Probably one of the biggest architectural mistakes you’ll find with web applications is an incorrect MVC abstraction. It’s easy to say “Model, View, and Controller”, but rarely is this done right. Typically the architecture looks pretty, but on close scrutiny calls for tightly coupled templates with functions that belong in the model. Do yourself a favor, don’t let your view talk to your model.

You’re already using XML

Whether you like it or not, you’re on the web: you’re producing html for your content and, hopefully, keeping your presentation in css. The great thing about XML isn’t that it’s easy to parse or to read. I hate XML; it’s ugly, stupid, and slow to parse. What it is good at is being transformed from one schema to another (via things like XSLT).

When you ask your model for a “read” or a “list” operation, you should be getting back pure content: content in the form of XML. This is content without a presentation, and it belongs in your model, not in your view!
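To make “pure content” a little more concrete, here is a small Python sketch of a model that answers “read” and “list” with XML and nothing else. The `read_case` and `list_cases` functions, the element names, and the in-memory `store` are all made up for illustration; the only library used is the standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET


def read_case(case_id, store):
    """Return one record as an XML element with no presentation attached."""
    record = store[case_id]
    case = ET.Element("case", id=str(case_id))
    ET.SubElement(case, "title").text = record["title"]
    ET.SubElement(case, "status").text = record["status"]
    return case


def list_cases(store):
    """Return every record wrapped in a single <cases> element."""
    cases = ET.Element("cases")
    for case_id in sorted(store):
        cases.append(read_case(case_id, store))
    return cases


# Hypothetical data standing in for a real database.
store = {50: {"title": "Login fails", "status": "open"}}

print(ET.tostring(read_case(50, store), encoding="unicode"))
# <case id="50"><title>Login fails</title><status>open</status></case>
```

Nothing in this output says how a case should look on screen, which is the point: that decision is left to whatever transforms it later.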

Putting the Representational into REST

If your model is dumping out content in the form of XML, the only thing left to do is for your fancy front controller to transform this XML into whatever representation that the user asked for! It’s a good idea to make the default representation html. The code should now be looking like this:

  class RESTController {
    path = getPathInfo();
    resource = getResource( path );
    input = getPathInput( path );
    view = new ViewDispatcher( request.getType() );

    function handleGet(resource, input, request) {
      if ( empty(input) )
        model = resource.list( request );
      else
        model = resource.read( input, request );

      output = view.transform( model );
      print output;
    }

    ...

That’s it, you can take your presentation-less content and transform it into anything you want (html+css, flash, pdf, plain text). You could even make it a SOAP envelope, autogenerate a WSDL and now you have REST and SOAP without doing any extra work!
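Here is one way the `ViewDispatcher` from the pseudocode might look, sketched in Python. This is an assumption-heavy illustration: the text suggests XSLT for the transformation, but Python’s standard library has no XSLT processor, so plain functions stand in for stylesheets here. The `to_html` and `to_text` functions and the `transforms` table are hypothetical names for this example.

```python
import xml.etree.ElementTree as ET
from html import escape


def to_html(model):
    """Render each child element as a definition-list entry."""
    rows = "".join(
        f"<dt>{escape(child.tag)}</dt><dd>{escape(child.text or '')}</dd>"
        for child in model
    )
    return f'<dl class="{escape(model.tag)}">{rows}</dl>'


def to_text(model):
    """Render each child element as a 'name: value' line."""
    return "\n".join(f"{child.tag}: {child.text or ''}" for child in model)


def to_xml(model):
    """Pass the content through untouched."""
    return ET.tostring(model, encoding="unicode")


class ViewDispatcher:
    # Requested representation -> transform function (stand-ins for stylesheets).
    transforms = {"html": to_html, "text": to_text, "xml": to_xml}

    def __init__(self, content_type="html"):
        # Unknown types fall back to html, the default representation.
        self.render = self.transforms.get(content_type, to_html)

    def transform(self, model):
        return self.render(model)


model = ET.fromstring(
    "<case><title>Login fails</title><status>open</status></case>"
)
print(ViewDispatcher("text").transform(model))
print(ViewDispatcher("html").transform(model))
```

Adding another representation means adding one more entry to `transforms`; the model and the controller do not change.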

So there you have it: this is the basis for a RESTful SOA which, if done right, will allow you to build really cool web applications. Since it’s RESTful, every resource is accessible in a browser, or as a service to an API, or whatever REpresentation you want!

Funny thing, it seems that the whole point of REST is to do MVC the right way, and de-couple everything. I’m not sure why all the big CMSs and portals claim to follow an MVC (or Model 2) architecture but fail to provide a decoupled view.

Good Design, Bad Code

Friday, February 3rd, 2006

I’ve been reading PHP 5 Objects, Patterns, and Practice, a great book exploring modern day design practices and the conventional wisdom of good software design. For serious PHP developers it’s a great update to the “gang of four” in a PHP context and really shows how PHP stands up as a serious development environment for the enterprise.

Unfortunately, the book also serendipitously stumbles upon some serious misgivings about conventional software engineering methods. One of the code excerpts reads:

  class Army extends Unit {
    private $units = array();

    function addUnit( Unit $unit ) {
      foreach ( $this->units as $thisunit ) {
        if ( $unit === $thisunit ) {
          return;
        }
      }
      $this->units[] = $unit;
    }
   ...

I’m sure to the hot shit J2EE/EJB software architect this is perfectly good code in the context of a composite pattern. The above case represents a composite class that holds Unit objects, and the addUnit method simply adds Unit objects to the composite. The great part is this class will rarely, if ever, need to be updated since the business logic is separated into individual leaf classes and will be easier to maintain. Java programmers would be proud to see PHP like this.

On the other hand, every 15 year old hacker will now have something in common with old-school C programmers. They will recognize that this code is shit. It’s utter crap. It’s inefficient, sloppy, and scales poorly while being difficult to debug.

Have we as software engineers become so pig-headed that we no longer care about code efficiency and happily let an O(1) operation become an O(N) operation just because it uses the right pattern? When did software engineering mean you have to be stupid? Can’t I be a software engineer who follows good, disciplined development methodologies while still favoring good code? We’re assuming that the intelligence of our design patterns will make up for the bad code.

For the J2EE crowd: PHP arrays are all associative (ordered hash maps), closer to a LinkedHashMap than to a Vector. In this case you’re adding individual leaf objects to a composite class, and every time you do, you’re looping through the entire array to determine whether the object already exists in it, even though a keyed lookup would answer that in constant time. Each time you build this composite, which in a web environment is every page load, you’re doing a mountain of extra work: adding N units costs on the order of N² comparisons. This is fine for a small number of objects, but what if you have thousands, or even millions, of objects at hand? This is what scalability is all about, right? When did our design patterns make us stupid?
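For comparison, here is a sketch of the same composite with a constant-time duplicate check. It is written in Python rather than PHP and is not taken from the book; the class and method names simply mirror the excerpt. The units are kept in a dictionary keyed by object identity, which matches the `===` comparison in the original. In PHP, keying the array by `spl_object_hash()` or using `SplObjectStorage` would play a similar role.

```python
class Unit:
    """Leaf placeholder; the real class would carry the actual unit logic."""


class Army(Unit):
    def __init__(self):
        # Keyed by object identity, mirroring the === check in the PHP excerpt.
        # Python dicts keep insertion order, so units stay in the order added.
        self._units = {}

    def add_unit(self, unit):
        # A hash lookup (O(1) on average) replaces the loop over every unit.
        # Adding the same object twice leaves the collection unchanged.
        self._units.setdefault(id(unit), unit)

    def units(self):
        return list(self._units.values())


army = Army()
archer = Unit()
army.add_unit(archer)
army.add_unit(archer)  # same object again: ignored
army.add_unit(Unit())
print(len(army.units()))  # 2
```

The pattern is untouched: `Army` is still a composite of `Unit` objects, and only the bookkeeping behind `add_unit` has changed.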

The goal of software engineering is to produce good software, and good software requires good code. Disciplined software engineering may have something to learn from the hackers who would recognize the bad code excerpts in this book.

Walking in the Foreign Lands of Web 2.0

Wednesday, October 12th, 2005

You ever wonder how to monetize the long tail? Or how to realize cyberinfrastructure? Hopefully you have no idea what I’m talking about, which means you’re thankfully naive to the politics of academic supercomputing and the hyper-fluff of over-paid business leaders “revving the web” at this year’s Web 2.0 conference in San Francisco.

I realized from day one I was out of my element; talk of business models and ad placement left me wondering whether academic politics aren’t such a bad thing after all. But are these worlds really that far apart?

Amidst the pretenders and the venture capital monkeys the Web 2.0 conference brought together a collection of like-minded individuals with one cohesive idea: the web as an application platform.

Academics have been pushing this idea for years, albeit in a backwards and borderline retarded method of grid portals and high-latency web services. User experience is an almost alien concept in academia, leaving would-be web portals in a state of chaos and such poor usability that they’re… well… unusable.

Based on my completely unscientific and haphazard estimate, the private sector has been pushing so much further and faster than academia that it’s at least a few years ahead of academic research projects (especially when it comes to deploying web services and web applications). Ideas such as REST, RSS, Atom, AJAX, or even CSS are strangely missing from academic projects, which are currently pushing such hot new technologies as SOAP, WSDL, and the ever-successful JSR-168.

Hopefully, you’re spending your thoughts on more important topics such as the flying spaghetti monster, but I’ve been up at night wondering why academic web applications are so disparate from their private sector counterparts.

The industry leaders at Web 2.0 may be motivated by money, but they’re inventing their way to successful business models based on technological innovation: building and integrating web services in novel applications, and creating new levels of connectedness and information sharing, something sorely lacking in mainstream academia.

So what do we do about it? We do what everyone (including Microsoft) is doing: we watch Google and copy everything they do!