Saturday, May 15, 2010

Chasing a moving target

A situation where you tell your customer "… I am not sure when the next version is expected or what features will be included in it, but let me assure you, sir, it is going to be good…" is a situation you probably want to avoid. Engineers can actually say that and feel good about themselves, but product management, marketing and sales will not tolerate it, and unfortunately they can be loud... really loud.

If you can relate to the above, you probably understand what we were trying to avoid in our porting project.
We were committed to moving to Java and, in the meantime, releasing a C# version with new features.

In my last post I described the different options we considered for porting from C# to Java and concluded with the path we chose: automatic conversion.
This post is about the technical process of automatic porting and the difficulties in chasing a moving target.

One of the objectives was reducing entropy: we could not commit to how long it would take us to stabilize an automatically converted project. It is actually more than stabilizing; it is also filling the infrastructure gaps and fixing bugs that originate in the behavioral differences between Java and C# (e.g. transaction management). We couldn't wait to finish the next C# version and only then start porting to Java (the product management guys, remember?).
So we got into a situation where we were working on the Java port in parallel with developing the next C# version. In addition, we obviously couldn't drop features between the two versions: if we introduce a feature in the next C# version, it must be in the Java version as well (the product managers can explain why, and believe me, they can be very persuasive). That's how we found ourselves chasing a moving target.





We tried to figure out how we could utilize the automatic porting and still keep up with the ongoing C# version. We considered two options. The first was to do a one-time conversion followed by a double commit (one for C# and one for Java) for each feature/fix in the product. The other was to create a recurring automatic process through which we would get all the changes committed in C# for "free".

The one-time conversion seemed technically easier: after an initial conversion to Java we would maintain two code bases (one in C# and one in Java), and the team would double commit each feature/bug fix in the product. By taking this approach we could write features for the C# version and fill in the gaps in the Java version. The problem was that the feature set and the bug fixes were too massive. Taking this path might have ended with us missing the deadline for the C# version (the double commit is cumbersome: writing the code twice, reviewing it twice, testing it twice, etc.) and probably not stabilizing the Java version either. Another problem was that this manual process is error prone, since you do the same thing twice. On top of that, at least at the beginning, commits to the Java code base could not be tested (the Java code did not run immediately after the conversion; there was some manual work to be done first).

We had to engineer a recurring process with the following goals in mind:
1. Develop features in C#
2. (Magically) get them in the Java version
3. Have parallel work on the Java version: filling in the missing infrastructure, fixing conversion bugs, etc.
4. Stabilize the Java version.

Our application is logically divided into Frontend and Backend (no surprise there).
This post focuses on the Backend side. We wanted to have the Backend converted to Java and working with the C# Frontend (as a stepping stone): that is, to get the C# client, which is written in WPF and communicates with the C# backend using WCF, to work with a Java backend running on JBoss AS and communicating using web services.
In parallel there was another effort to port the Frontend to Java technologies.

The process we came up with was:
We had a repository for the ongoing C# development. We created a branch from this repository in which we dumbed down the code so that the automatic converter could handle it, for example by commenting out the usage of LINQ, which the converter did not process well.
Here is a simplistic example to demonstrate the process. This is NOT what our RHEV-M code looks like (public data members and such).
C# using LINQ:

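    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class Person
    {
        public Guid id;
        public String name;
        public List<Person> friends;

        public Person(String name)
        {
            this.name = name;
            id = Guid.NewGuid();
            friends = new List<Person>();
        }

        public void addFriend(Person p)
        {
            friends.Add(p);
        }

        public List<String> getFriendNames()
        {
            // LINQ projection: this is the part the converter could not handle
            List<String> names = friends.Select(a => a.name).ToList();
            return names;
        }
    }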
C# after dumb down:

    using System;
    using System.Collections.Generic;

    public class Person
    {
        public Guid id;
        public String name;
        public List<Person> friends;

        public Person(String name)
        {
            this.name = name;
            id = Guid.NewGuid();
            friends = new List<Person>();
        }

        public void addFriend(Person p)
        {
            friends.Add(p);
        }

        public List<String> getFriendNames()
        {
            // Plain loop instead of LINQ, so the converter can translate it
            List<String> names = new List<String>();
            foreach (Person p in friends)
            {
                names.Add(p.name);
            }
            return names;
        }
    }
Java:

    public class Person
    {
        public Guid id;
        public String name;
        public java.util.ArrayList<Person> friends;

        public Person(String name)
        {
            this.name = name;
            id = Guid.NewGuid();
            friends = new java.util.ArrayList<Person>();
        }

        public final void addFriend(Person p)
        {
            friends.add(p);
        }

        public final java.util.ArrayList<String> getFriendNames()
        {
            java.util.ArrayList<String> names = new java.util.ArrayList<String>();
            for (Person p : friends)
            {
                names.add(p.name);
            }
            return names;
        }
    }

The dumbed-down branch was the input for the converter, and the output was committed to a third branch, which was in Java but did not compile (it was the raw output of the converter; as you can see in the example, Guid is not part of the JDK).
On top of it we created a post processing phase:
1. File manipulation: some sed scripts to change the packages of the files, add the import statements, etc.
2. Creation of a compatibility package, in which we created placeholders for the System classes that are used in C# and that the converter did not translate into Java, for example 'Guid', 'DateTime', etc.
3. Generating the DB layer.
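
To make the compatibility package from step 2 a bit more concrete, here is a minimal sketch of what a 'Guid' placeholder might look like. It is an illustration only, not the actual RHEV-M class: wrapping java.util.UUID, and the exact members shown (NewGuid, Empty, a String constructor), are assumptions. The C#-style member names are kept on purpose so that converted code such as `id = Guid.NewGuid();` compiles without being touched.

```java
import java.util.UUID;

// Hypothetical stand-in for C#'s System.Guid (an assumption, not the real class).
// Package-private here so the sketch fits in one file; a real compatibility
// class would be public and live in its own compatibility package.
final class Guid implements Comparable<Guid> {

    // Mirrors Guid.Empty in C#: the all-zero identifier.
    public static final Guid Empty = new Guid(new UUID(0L, 0L));

    private final UUID value;

    private Guid(UUID value) {
        this.value = value;
    }

    // Mirrors "new Guid(string)" in C#; throws IllegalArgumentException on bad input.
    public Guid(String text) {
        this(UUID.fromString(text));
    }

    // Mirrors Guid.NewGuid() in C#; the C# naming is kept deliberately.
    public static Guid NewGuid() {
        return new Guid(UUID.randomUUID());
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Guid && value.equals(((Guid) other).value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }

    @Override
    public int compareTo(Guid other) {
        return value.compareTo(other.value);
    }

    @Override
    public String toString() {
        return value.toString();
    }
}
```

Starting from a thin wrapper like this keeps the placeholder small, and the real behavior can be filled in later without changing the converted callers.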

With the data access layer we had to be a little more creative. In C# we have a DBFacade class which executes stored procedures; this was not translated into Java. We used the automatic conversion to generate the DBFacade class in Java, but it contained only the method signatures. We wrote a Java utility which filled in the methods by generating code that uses JDBC to execute the stored procedures. The utility used the method names to "guess" the stored procedures we wanted to execute, and the method parameters to populate the stored procedure parameters.
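
The idea behind that utility can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual tool: the class name StoredProcedureCallGenerator, the sample interface DbFacadeSignatures with its two methods, and the `connection` and `argN` names in the generated text are all hypothetical. The sketch reads method signatures through reflection, takes the method name as the stored procedure name, and emits Java source that calls the procedure through a JDBC CallableStatement with one placeholder per parameter.

```java
import java.lang.reflect.Method;
import java.util.Collections;

// Hypothetical generator: turns a method signature into JDBC calling code.
final class StoredProcedureCallGenerator {

    // Builds the JDBC escape syntax for a procedure call, using the method
    // name as the "guessed" procedure name and one '?' per method parameter.
    static String callSyntax(Method method) {
        String placeholders = String.join(", ",
                Collections.nCopies(method.getParameterCount(), "?"));
        return "{call " + method.getName() + "(" + placeholders + ")}";
    }

    // Generates the source text of a method body. The generated code assumes
    // a java.sql.Connection named "connection" is in scope and that the
    // parameters are named arg0, arg1, ... (both are assumptions of this sketch).
    static String generateBody(Method method) {
        StringBuilder body = new StringBuilder();
        body.append("try (java.sql.CallableStatement stmt = connection.prepareCall(\"")
            .append(callSyntax(method))
            .append("\")) {\n");
        for (int i = 1; i <= method.getParameterCount(); i++) {
            // JDBC parameter indexes start at 1, argument names at 0.
            body.append("    stmt.setObject(").append(i)
                .append(", arg").append(i - 1).append(");\n");
        }
        body.append("    stmt.execute();\n");
        body.append("}\n");
        return body.toString();
    }
}

// Hypothetical signatures, standing in for the converted DBFacade methods.
interface DbFacadeSignatures {
    void GetVmById(String id);
    void UpdateVmName(String id, String name);
}
```

Calling `generateBody` for each method returned by `DbFacadeSignatures.class.getDeclaredMethods()` would give one method body per signature. A real generator would also have to map result sets back to objects, which this sketch leaves out.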

And then the code was committed to the last/"clean" branch.

We ran this conversion process almost every week; the idea was to have an ongoing 'parity' Java version.
The process is very technical and takes only a few hours from start to end. Now we had a process for getting all the code written in the C# version into Java, and in parallel we could work on the gaps in the Java version.
There were some ground rules for working on the Java version. One of them was to change ONLY what must be changed. For example, the naming convention was inherited from C# and is not consistent with the familiar Java naming convention, but the idea was not to change that at this point, because every change we made in Java might cause a conflict in the last step of the merging process, and that was very painful. Another rule was NO optimizations/cleanups in Java while we were converting, again to minimize conflicts.

There were some infrastructure gaps we worked on, like Active Directory, the XML-RPC layer we used in C#, transaction management, etc., and there were some technical gaps, like implementing the compatibility placeholders, filling in the LINQ code, etc.

The nice part came when we actually got the Java version running and found bugs which had not been caught in C#. We fixed them in the C# version, and after about a week, when we ran the conversion process again, we had the fixes in Java.


Cheers,
Livnat