March 2011 Blog Posts
Reflections on a 6 month Software Development Project

Now that I’ve successfully completed a 6+ month project, I thought I’d write down some of the things I learned from it (or some of the things that happened anyway), as much as for future reference as anything else.

Project Overview

In part because of NDA blah blah, and in part because it isn’t relevant, I’ll leave out some of the specifics.

Having said that, here’s the skinny.  The ultimate end client is a trading team, buying and selling blah blah for the people who have hired them to manage their portfolios.  In the grand scheme of things, the volume of the trading is not insignificant.  It isn’t real-time hedge fund type volume, but on a daily basis, it’s the sort of money that matters to people.  The reason why this is important to bring up is that it sets a level of risk. 

Not that your standard CRM tool isn't important, but if a production outage brings down your system for a day, not having a CRM tool for a day matters less than not being able to trade on the market for a day.  There's a real risk there, not only in terms of market exposure, but also in the more general sense that people aren't going to want you to manage their money if you can't trade it.  So, there is a built-in nervousness about changing anything, even if the current system sucks, because a sucky system that allows people to trade (however inefficient and painful it might be) is better than a 'better' system that fails.

Briefly stated, the overall goal of the project was a backend change, one of those things that is totally unglamorous, but important.  In this case, it was a change from using SQL Server as a backend store to using Oracle.  The overall system that this project was a part of actively managed trading, but the particular piece this project was trying to change involved taking data from multiple sources, manipulating that data in certain ways, and then, ultimately, passing that data along to an external vendor.  Because it was part of a trading system, it had two main components, one that sent the important data to the external vendor overnight before the market opened, and one that sent data to the external vendor during the day while the market was open.

Besides the backend change, the project also involved changing the existing legacy system to one using more recent technology.  Now, 'legacy' is a term that often simply refers to any existing system that you've never worked on before, but in this case, it was a true legacy system in the sense that it was cobbled together over a long period of time (not sure how many years, but at least two or three) with no central architecture or guiding principles.  It was put together by a wide range of people, including subject matter experts who were very good business analysts, but not quite so good software developers.

In short, we needed to change a system that included a mix of C# 1, SQL Server DTS (yes, I said DTS), and VBScript (yes, I said VBScript) to C# 3. 

Don’t fix anything

Since backend change projects are of a special sort, there is a key question that needs to be addressed up front.  Should the project try to fix any of the known issues?  Every project has its own parameters, but, in general, the correct answer here is no.  In our case, the ultimate output of the project was a set of data that was being sent to an external vendor, with all the constraints that entails.  At the end of the day, you know that your backend changes are successful if you can validate the data you used to send against the data that the new system is sending.  If you begin to introduce 'fixes' as part of the project, then you have the problem of determining whether the data is different because you successfully introduced a fix or because you incorrectly changed the backend.

It is important to note that it is generally impossible to make the data you send 100% identical to what you sent before.  There will almost always be differences (in this case, the fact that SQL Server and Oracle act differently in certain cases introduces unavoidable differences).  But it is important to narrow down the range of differences so that they are explainable and acceptable, and so avoiding changes to the behavior of the system in question is almost always the best path to take.
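To make the 'validate what you used to send against what the new system sends' idea concrete, the comparison harness amounts to something like the sketch below.  This is purely illustrative and not the project's code; the TradeRecord shape, its fields, and the notion of a price tolerance are all made up to stand in for whatever the real feed contains.

```
// A minimal sketch of a feed-validation harness, not the actual project code.
// Assumes each backend's vendor feed can be flattened into keyed records.
using System;
using System.Collections.Generic;
using System.Linq;

public class TradeRecord
{
    public string TradeId { get; set; }
    public decimal Quantity { get; set; }
    public decimal Price { get; set; }
}

public static class FeedComparer
{
    // Returns human-readable differences between the legacy (SQL Server) output
    // and the new (Oracle) output, keyed by trade id.
    public static IEnumerable<string> Compare(
        IEnumerable<TradeRecord> legacy,
        IEnumerable<TradeRecord> replacement,
        decimal priceTolerance)
    {
        var oldById = legacy.ToDictionary(r => r.TradeId);
        var newById = replacement.ToDictionary(r => r.TradeId);

        foreach (var id in oldById.Keys.Except(newById.Keys))
            yield return string.Format("Missing from the new feed: {0}", id);

        foreach (var id in newById.Keys.Except(oldById.Keys))
            yield return string.Format("Unexpected in the new feed: {0}", id);

        foreach (var id in oldById.Keys.Intersect(newById.Keys))
        {
            var oldRecord = oldById[id];
            var newRecord = newById[id];

            if (oldRecord.Quantity != newRecord.Quantity)
                yield return string.Format("{0}: quantity {1} -> {2}", id, oldRecord.Quantity, newRecord.Quantity);

            // SQL Server and Oracle behave differently in places (rounding, for one),
            // so prices only have to agree within an agreed, documented tolerance.
            if (Math.Abs(oldRecord.Price - newRecord.Price) > priceTolerance)
                yield return string.Format("{0}: price {1} -> {2}", id, oldRecord.Price, newRecord.Price);
        }
    }
}
```

The point isn't the code, it's that every remaining difference has to end up either in an explained bucket or in a bug report.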

Accept that which you cannot change

I am a very firm believer in a small number of things when it comes to software development.  Some of them involve how code is written, some involve how projects should be managed, and some involve any number of other things.  Beyond that small number of things, I am also a very firm believer that when you are involved in a project like this, you need to determine the things you cannot change, even if you don't like them, and learn to live with them and work with them the best that you can.

Generally speaking, I believe that certain agile practices, especially project management practices like Kanban, are best.  But, with this client, and in this situation, that wasn’t going to happen.  Waterfall and BDUF prevailed (although I found ways around that at times).

Generally speaking, I believe that as a software developer, you write the best code when you have full and free rein to use any and all software tools, libraries and techniques that you deem proper.  But, with this client, and in this situation, that wasn't going to happen.  Certain software tools, libraries and techniques were pretty much mandated.  More importantly, certain things were excluded.

Generally speaking, I believe that certain deployment practices are crucial to successful software projects.  But, with this client, and in this situation, that wasn’t going to happen.  Certain deployment practices were already in place and not going to change (although I found ways to cheat).

For some people, these sorts of restrictions are unacceptable.  I can understand that perspective, but I can also see refusing to work within them as unprofessional.  Regardless, as the consultant with this client, in this situation, I learned to discover that which I had to accept and let it go.

Detailed Design, to a point

Every experienced software developer knows that detailed designs are only as good as the paper that they are written on, which is to say, not worth much.  Unless you have an infinite amount of time, you know that no matter what you put down on paper as to the design you are going to implement, the minute you start to do actual coding, you will find out that you missed a requirement, or forgot an assumption, or what have you.

At the same time, when you are working on the sort of project that requires capital expense justification, you have to do it.  There’s no way around it.

So what you learn to do, is cheat.

I had the luxury of needing to do a 'spike' solution for what was going to be implemented.  As with all spike solutions, you cut corners to test an end result.  Since I knew I had to produce a detailed design to present to the (then) development PM and the rest of the team, I produced a set of documents based entirely on the spike solution, knowing full well that it would change.  I produced a nice PowerPoint presentation, created pretty class diagrams based on the spike solution, and packaged it all up in a neat package.

I’ve often made a joke that I have the design skills of a drunken ferret, but this worked.  The (then) development PM actually called it ‘a work of art.’ 

It wasn’t that the spike solution was totally off-base.  I fully intended to take that as my starting point, since that’s the entire point of doing a spike solution.  But, I also knew that once coding started, there would be, oh, a bit of drift.

I don’t recommend this as a practice unless you know you can pull it off.

Once coding starts, all bets are off

Shortly after the detailed design was approved, the (then) development PM left the company.  Regardless, once you start coding a real implementation, you have to go back and fill in all of the corners you knowingly cut, and you also find all of the areas where you simply got it wrong.

As a consultant, you are often working with systems that you have little to no knowledge of.  In a perfect world, you have some sort of training or ramp up on them, but often times, you really don’t.  This is part of what being a consultant involves.  What that means more often than not is that once you start coding the actual implementation, you find out all sorts of things that you never would have even thought to include in your detailed design.

You also tend to find out all of the dependencies that you didn't anticipate.  To give one example, since this project involved working with Oracle, I had a dependency on other teams that managed the various Oracle installations.  Sounds fine, until you discover that if you need a change to an Oracle installation while you are working off-hours, and the Oracle team isn't tasked to work those same off-hours, you are dead in the water until they get back to work.  If your code relies on access to data from another company system, and that system goes offline because of other projects they happen to be working on at the same time, you are dead in the water.

So, you find yourself in a position where you can’t get the work you need to get done in the time frame you promised/estimated it would get done in, over and above what you end up having to rework because your initial design was wrong in the first place.

Deployment matters

Related to the above, it is important to keep in mind that deployment of your code, especially across multiple teams, matters, and matters a great deal.

Almost nothing can block progress more than a failed deployment, except maybe a deployment that doesn't get done because the team you need to do the deployment either isn't kept up to date on what you need to deploy and when, or is too busy with deployments for other projects that take priority over what you happen to be doing at the time.

You really have to keep them involved, informed and up to speed.

Burning bridges doesn’t usually help

I’ve never met a bridge I couldn’t burn, and have often done so, but it doesn’t help.

Sometimes you come across a blocking issue that is due to sheer incompetence.  The people on that other team really haven't a clue what they are doing.  But often, that isn't really the issue.  They could be overworked.  They could have been directed by their superiors not to pay the slightest bit of attention to your project, because that other project they are working on has been given priority above everything else.  It's really important not to take the 'escalate to their managers' route unless you really need to, because you will find that the next time you really need something, the escalation route won't work as well.

Love your PM

When all else fails, hope that you are working on a project that has a good PM (yes, there is such a thing), or at least a PM who is listened to and respected by upper management.  As a developer, there is next to nothing you can't work around if you can explain it to the PM, because they can then explain it to the people up above who, at the end of the day, really matter.  This is really, really important.  The (then) development PM left at the start of this project, and I was lucky enough to work with the next PM, whom I'd known previously, and so was able to use him as a filter with the higher-ups to explain and remove roadblocks (amongst other things).

Summary

I've talked very little about code-specific things here, and there's a reason for that.

I couldn’t use NHibernate or even Entity Framework, so I figured out a different way to do data access.

Though I don't use mocking much anymore, I couldn't use Rhino Mocks, but had to use NMock instead.

I couldn’t use StructureMap, but had to use Unity instead.
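Not that it mattered much in practice: the swap was mostly a matter of registration syntax rather than design.  A minimal Unity registration looks roughly like the sketch below; the repository interface and class are made-up names for illustration, not anything from the project.

```
// A minimal Unity registration, purely for illustration.
// IOrderRepository and OrderRepository are invented names.
using Microsoft.Practices.Unity;

public interface IOrderRepository { }
public class OrderRepository : IOrderRepository { }

public static class ContainerBootstrapper
{
    public static IUnityContainer Build()
    {
        var container = new UnityContainer();
        container.RegisterType<IOrderRepository, OrderRepository>();
        return container;
    }
}

// Usage: var repository = ContainerBootstrapper.Build().Resolve<IOrderRepository>();
```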

The reason why I haven't talked much about those things is that they don't really matter.  The actual writing of code is about the least important part of a major software project, and as a senior developer, you should be able to write good code regardless of the constraints.  What matters more is determining what you can't change, cultivating relationships with the external groups you have dependencies on, and working with the PM to resolve blocking issues.

That’s how a 6 month software development project can end up being a success.

posted @ Sunday, March 27, 2011 2:07 AM | Feedback (1)
Road Trip

The last road trip that I took this February took me through Dallas, Albuquerque, Phoenix, Denver and Nashville.  It was highly enjoyable (especially seeing my cousin and his family in Denver, complete with kick-ass 5th row seats to watch the Pens beat the Avs), but the enjoyment was tempered by the fact that I knew when coming back, I had to launch to production a six month project that was already delayed, and with less than perfect confidence that it would be successful.  There was also the outside chance that if the production launch failed, I’d eventually get fired.

As it turned out, it was successful.  The development team was responsible for 'Tier 1 support' for the first two weeks after launch (which for all intents and purposes meant I was personally on the line for any production issues 24/7 during that time), and I put the over/under count on the number of fixes we would need to promote at 7.  We actually passed that the first day, but they were all very minor configuration-level issues that didn't cause any harm to the end users (and which weren't that surprising given that the overall system of which my project launch was a part involved 6-8 other teams, depending on how you count it).  On more than one occasion during the two-week warranty period, I asked the PM if he was sure the code was migrated.

This time around, I'm heading out on an even longer road trip.  Though there is a lot of important work that awaits my return, I had the added bonus of resolving (with assistance) a significant production issue that threatened to put a damper on the whole thing (short story: a query went from taking 50 minutes down to 1 second.  And it was not because it wasn't safe by default…).

This trip will take me to, in order: Vegas, LA, San Jose, Seattle, Vancouver, Edmonton, and Calgary.

I’m hoping my new MacBook Air proves to be a good machine.  I intend to do some development work along the way (there’s a phrase I can’t remember that is relevant here.  Something about “snowball” and “hell.”  I’m blanking on it), so I hope it stands up to it.

posted @ Saturday, March 26, 2011 10:20 PM | Feedback (0)
Initial Impressions of UberProf, part 4, or “Ending with a whimper”

In the previous three posts (1, 2, 3), I talked about some of the good and bad experiences I had with the Linq to SQL Profiler.  As it turns out, the end was pretty anti-climactic.  It's hard to give an overall impression of the tool suite based on that, but I'll try.

You can’t use what doesn’t work

As I had mentioned before, there was a particularly annoying problem with my attempt to use the profiler.  In essence, the profiler would simply stop profiling.  The only workaround was to essentially rebuild the solution every time I wanted to set up a new profiler session, as well as doing an iisreset and recycling the relevant app pools.

This was painful but it worked.  Until I sat down to start to optimize the pages I had prioritized.

One really annoying thing about UberProf is that it automatically tries to update itself to whatever the most recent version is when it launches.  This is bad.  The various dlls referenced in the project I was working with were of specific builds, which meant that every time a new build came along, I had to delete and re-reference the updated dlls.

[screenshot: the profiler's automatic update prompt at launch]

“Well, just choose the ‘no, I don’t want to update’ option when you launch the profiler.”  That’s great, except it doesn’t work.  It won’t try to update the profiler while you are using it for that session, but once you close the profiler, you get this:

[screenshot: the update download that starts anyway once the profiler is closed]

You have to be quick enough to kill the download, otherwise, you lose your build version.

This is really bad.  In just about any environment, you want to be able to control the version of the software you use, not be forced to upgrade.  I understand from Ayende’s standpoint why he wants the user to be using the most recent version, but I don’t care what Ayende wants when it relates to my software.

That’s bad enough.  But, what made it even worse was that the painful workaround no longer worked.  The profiler, of whatever version I was using at the time, wouldn’t profile at all.  It was simply unusable.

One way to get around the issue, otherwise known as the ‘whimper’

The inventory for the site that I was using the profiler for was sold, and the site was decommissioned.  That certainly ‘solves’ the issue, since I no longer need to use it.

Next Test of UberProf, EFProf

However, I am working on something that will use EFProf.  I don’t know if I will experience the same painful workaround situations.  I hope not.

Current Judgement on UberProf : FAIL

It was nice that I was able to use the online forum to get Ayende to Skype into my machine.  That's good.  But spending an hour unable to fix the issue and then giving up really isn't good enough.

To be fair, I didn’t press the issue after that.  But I shouldn’t have to. 

Combine that with the forced build version updates that you can’t block without manually killing a download, and I can’t say much other than that the tool is not only not enterprise ready, but generally unusable.  I’m hoping the EFProf experience is better.

Do you think this at all encourages the use of RavenDB?

posted @ Saturday, March 26, 2011 10:04 PM | Feedback (0)
Nice Job Mozilla…Not

I am something of a ‘power user’ when it comes to web browsing.  I tend to have up to ten separate windows open, each one of which could have as many as 10-15 tabs open at any one time.

I used to use separate browsers depending on the content, Maxthon for certain things, Opera for others.  I tried IE and Chrome and Safari, but finally ended up with Firefox.  Opera had been my favorite due to its ability to explicitly save multi-window sessions, but its memory usage wasn't great (though when you have up to 150 tabs open at once, you are definitely going to have memory problems generally anyway), and Firefox would easily (enough) remember your past session so you didn't have to worry about losing it (something that, last time I checked, still plagued IE and Chrome).  Memory usage still sucked, but I had to restart a session to clear it less frequently than with Opera.

So, when Firefox finally released version 4.0, after 17 years in development, I was hopeful it would provide a better experience.

Perhaps, eventually, it will.  But, once I upgraded to the latest and greatest, Firefox started crashing, about every 10-15 minutes.  After 7 or 8 of these crashes in a row, I figured I’d roll back to the previous 3.6.whatever.  First of all, it was really hard to find, but eventually I did.

And, great, now I got the same behavior, crashes every 10-15 minutes, or if I was actively browsing, 3-5 minutes.

So now I'm starting from scratch: I killed off the 10-window session and am trying to recreate the windows one by one.  Very productive.

posted @ Friday, March 25, 2011 6:50 PM | Feedback (1)
Crossing over to the dark side–MacBook Air 13”

Every single member of my immediate family is a Mac user of some sort.  Some more fanatical than others.  I have not been.  Way way way back when, I remember playing LodeRunner on a Mac of a family friend for what seemed like 24 hours, but since then, I’ve used Windows machines pretty much exclusively (ignoring my first computers, which were a Sinclair ZX80 and the Timex equivalent).

On the recent road trip, my cousin asked me what was wrong with my laptop.  It's one of those things I've always noticed but not paid much attention to, the fact that it was a 5+ year old laptop with a fan that made it sound like a passenger jet set for takeoff.

So, since I have another 2 week trip coming up, I decided to get a new laptop.  I decided early on that I wanted a solid state drive machine, and had my eye on a high end Sony Vaio, but it was out of stock.  After some more (limited) research, I ended up with this MacBook Air.

It’s cute, it’s tiny, it’s an expensive fucker.

Since I am still almost exclusively a .NET developer, I'm running Windows 7 Ultimate on Parallels on it.  Since I intend to do some iPad/iPhone dev this year, this machine kills that bird with that stone.

I’m not happy that you can’t (apparently) upgrade the memory on a MacBook once you get one.  I’m also not happy that the auto web registration for my AppleCare protection package was rejected, so I have to wait to see if my scanned and uploaded receipts are good enough.

That said, it’s cute and it’s tiny and it seems to work okay.  We’ll see how it goes once I start developing stuff.

Needless to say, this post is sent from the MacBook.  Cute, tiny and expensive fucker post.

posted @ Tuesday, March 22, 2011 9:08 PM | Feedback (0)
Code Reuse is Bad

One of the 'heretical' notions that I have been pressing with a client is something Udi Dahan wrote about some time ago.

In a particular case I experienced recently, we had a production bug based on a change to a stored procedure that wasn’t migrated from UAT to Production.

The particular example is with a stored procedure, but I don’t want to get into the whole “are stored procedures evil” debate here.  The problem could have been with a shared data access component, inline SQL, or even with something that had nothing to do with data access.  I will say that the answer to whether stored procedures are evil is, definitively, yes, unless they aren’t.

So, we had a fix that we needed to migrate to Production to update that stored procedure.  Great.  Except the immediate question arose: what other systems besides the system we were fixing used the same stored procedure?

As it happened, the two people who could verify that our Java codebase didn’t use it (as opposed to the C# codebase that needed the fix) were available, and were able to answer within minutes that we were safe.  But what if they weren’t available?  What if the fix was needed at 3 AM? 

Every piece of code should be designed to be used only by one system.  This is SRP in its essence.

BTW, this is also why SOA, unless very rigorously defined, is doomed to failure or pain or suffering. 

BTW (again), this is why optional parameters in stored procedures or managed code methods are almost always bad.  They are usually optional because they are being reused by multiple systems, which means every fricking time you need to change them for one system, you are at risk of breaking other systems.
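To illustrate the managed-code version of that trap (the PositionService and Position types below are invented, not code from any actual system): one method quietly serving two systems through an optional parameter.

```
// Invented example of the reuse trap: one 'shared' method serving two systems.
using System.Collections.Generic;

public class Position { }

public class PositionService
{
    // System A calls GetPositions(accountId).
    // When System B later needed intraday data, someone added an optional flag
    // instead of giving System B its own method.
    public IList<Position> GetPositions(string accountId, bool includeIntraday = false)
    {
        // Any change to the default value, or to how includeIntraday is interpreted,
        // now silently changes System A's behavior along with System B's.
        var positions = new List<Position>();
        // ... load end-of-day positions; conditionally add intraday ones ...
        return positions;
    }
}
```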

Reuse of framework code, fine.  Reuse of business logic code, bad.

posted @ Friday, March 18, 2011 10:12 PM | Feedback (3)
BDD is to YAGNI as TDD is to BDUF

I’m going to give this analogy a 43% rating in terms of how strongly I feel about it, but for what it is worth…..

BDD/Context Specification encourages you to write the code, and only the code, that your application actually needs right now.  TDD encourages you to write the code that you need right now as well as the code you think that you are going to need.

YAGNI is a concept that is simple to state, but can be difficult to understand in its implications.  This is actually true about the concept of BDUF as well.

YAGNI (“you ain’t gonna need it” is one of its colloquial forms) at the least suggests you should not attempt to solve the problems that you think you will need to solve, but instead to focus on the problems that you actually do need to solve right now.

BDUF (“big design up front”) can, at the least, be seen as the opposite of YAGNI, where you attempt to construct some grand(er) design that encompasses a wide range of future possibilities ahead of when they are actually relevant.

It's easy to explain why these concepts can be abused or misused, and that's always a lot of fun, but just a couple of points here.  BDUF in its truly negative form is best seen with an example.  On more than one occasion, I've been in situations where people have a need for logging.  Ignoring the fact that logging is a solved problem, sometimes people just think they need to create a custom logging solution.  That's already bad enough, but off-topic.  Suppose the immediate need is that you need to log error messages relating to database calls.  An urge that some people can't resist is to take this immediate need, and then decide that while, obviously, the custom logging system should log error messages related to database calls, it should also log error messages related to, oh, email calls, web server calls, calls to your Mom, you get the idea.  So, some grand design of an uber-logging system is, well, designed, without any actual specific cases or scenarios that are needed right now.

In the middle of one such disastrous meeting related to such a logging system design, I asked the ‘architect’ if he had ever heard of YAGNI.  He said no.  I then explained it.  He said, “Yes, I know what that is.”  And then proceeded on.  Fantastic.

Ignoring the fact that you really shouldn't be designing a custom logging system at all (unless you actually need it, which you might), YAGNI counsels you to stick with the exact need, the need to log error messages related to database calls, and build the system that does that (and, implicitly, does it well).  Then, later on, if you need to add other types of logging, you refactor the existing system to handle it.
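To put a rough shape on that, the YAGNI-sized version of the logging need looks something like the sketch below, and nothing more until a second concrete need actually shows up.  The DatabaseErrorLogger name and its members are invented for illustration.

```
// A deliberately narrow, YAGNI-sized logger: it logs database-call errors and nothing else.
using System;
using System.IO;

public class DatabaseErrorLogger
{
    private readonly string _logPath;

    public DatabaseErrorLogger(string logPath)
    {
        _logPath = logPath;
    }

    // The one scenario we actually have today: a database call failed.
    public void LogDatabaseError(string procedureName, Exception ex)
    {
        var line = string.Format("{0:u}\t{1}\t{2}", DateTime.UtcNow, procedureName, ex.Message);
        File.AppendAllText(_logPath, line + Environment.NewLine);
    }

    // No LogEmailError, no LogWebServerError, no generic Log(level, category, ...)
    // until there is an actual requirement that forces the refactoring.
}
```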

If you are nuts enough to try to design a custom logging system, and apply BDUF to it, you end up with an API that makes no sense, because you end up trying to come up with generic terms that apply to all situations, including the not-yet-present, we-think-we-will-really-need-them scenarios of which we have as of yet no concrete examples, so we are pissing in the wind.

TDD encourages BDUF

Sort of.  TDD starts at the class level (generally), so you start to design a class outside of specific requirements.  That is, you might have specific requirements that suggest to you that you need a specific class, but once you start working on that class, it is very easy to start writing the specific design of that class outside of very specific requirements.

It’s hard to explain this well without a specific example, and it is hard to use an example that is guaranteed to be widely understood, so I’ll use one that I’ve worked on previously that isn’t completely obscure.  Suppose you are a retailer that deals with various suppliers in terms of how inventory should be managed, and you are working on a new system for a client.  If you’ve ever dealt with multiple suppliers before, you will have experience of the types of scenarios that might possibly come up.  This experience shouldn’t be ignored, but a problem that can arise is trying to design a class in a new system that handles all of those scenarios up front.  It’s a natural inclination, and, again, if you have previous experience in the domain, it makes a certain amount of sense to consider those scenarios.

The problem is that, since TDD doesn’t really give you an exact starting point besides the need to write the class(es) in question, you don’t have a guard to tell you when you are doing too much or too little.

BDD encourages YAGNI

With BDD, you start with what the system will do, from the start, with the supplier (or suppliers) you have at the time.  What are the requirements for what I need right now?  What are the requirements that my existing suppliers have for my current need?  Since BDD-style processes are supposed to involve the end user from the start, you are much more likely to have very specific requirements than if you have developer-centric TDD requirements around a need to have some inventory management classes.
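To make the contrast concrete, a context/specification-style spec is stated in terms of an actual, current requirement for an actual supplier.  The sketch below is invented: the class names, the reorder rule, and the use of plain NUnit attributes are all just for illustration, not anyone's real design.

```
// A context/specification-style spec written against a real, current requirement,
// rather than against a speculative uber-class. All names and rules are invented.
using NUnit.Framework;

[TestFixture]
public class When_supplier_A_stock_falls_below_the_reorder_point
{
    private SupplierAInventoryManagement _inventory;
    private int _reorderQuantity;

    [SetUp]
    public void Context()
    {
        _inventory = new SupplierAInventoryManagement(10, 50); // reorder point, standard order size
        _reorderQuantity = _inventory.QuantityToReorder(4);    // current stock
    }

    [Test]
    public void It_orders_the_supplier_A_standard_order_size()
    {
        Assert.AreEqual(50, _reorderQuantity);
    }
}

// Just enough implementation for the specification above to pass.
public class SupplierAInventoryManagement
{
    private readonly int _reorderPoint;
    private readonly int _standardOrderSize;

    public SupplierAInventoryManagement(int reorderPoint, int standardOrderSize)
    {
        _reorderPoint = reorderPoint;
        _standardOrderSize = standardOrderSize;
    }

    public int QuantityToReorder(int currentStock)
    {
        return currentStock < _reorderPoint ? _standardOrderSize : 0;
    }
}
```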

Moreover, BDD is more likely to give you hard starting and stopping points in terms of requirements and knowing when you are done.

Summary

YAGNI can be done horribly wrong, where you build systems that are so locked into current requirements that you can’t handle new ones.  And BDUF can be done right, where you reasonably design for future requirements that are not quite yet present, but reasonably expected. 

Knowing where to draw the lines involves both a certain level of experience and knowledge of the domain in question. 

I'm more and more convinced that the 'proliferation of classes' practice, where, instead of designing uber inventory management classes, you design SupplierAInventoryManagement classes and then, when the requirement arises, SupplierBInventoryManagement classes, etc., and then integrate them, is the right way to go.
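Continuing the same made-up supplier example from above, the 'proliferation of classes' outline looks roughly like this: each supplier gets its own small class with its own rule, behind a common interface that lets you integrate them, instead of one uber class full of supplier-specific branches.

```
// Invented outline of the 'proliferation of classes' approach.
using System.Collections.Generic;

public interface IInventoryManagement
{
    int QuantityToReorder(int currentStock);
}

public class SupplierAInventoryManagement : IInventoryManagement
{
    // Supplier A's actual, current rule and nothing else.
    public int QuantityToReorder(int currentStock)
    {
        return currentStock < 10 ? 50 : 0;
    }
}

public class SupplierBInventoryManagement : IInventoryManagement
{
    // Added only when supplier B's requirement actually arrived,
    // without touching supplier A's class.
    public int QuantityToReorder(int currentStock)
    {
        return currentStock < 25 ? 100 : 0;
    }
}

// The integration point: pick the right implementation per supplier.
public class InventoryManagementRegistry
{
    private readonly IDictionary<string, IInventoryManagement> _bySupplier =
        new Dictionary<string, IInventoryManagement>
        {
            { "SupplierA", new SupplierAInventoryManagement() },
            { "SupplierB", new SupplierBInventoryManagement() }
        };

    public IInventoryManagement For(string supplierId)
    {
        return _bySupplier[supplierId];
    }
}
```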

Regardless, a fundamentally fatal flaw of TDD has always been that it doesn’t give necessary guidance to developers of where to start and stop, whereas BDD gives you exactly that.

posted @ Sunday, March 06, 2011 11:32 PM | Feedback (2)
Total Nerd Post: Upgrading through every version of Windows

From ArsTechnica, this is a YouTube video of someone upgrading a virtual machine from Windows 1.0 to Windows 7.

No, I don't know why either, other than to do it.  I leave it as an exercise for the reader to determine if it is cool or scary that Windows 7 maintains backwards compatibility with early Windows programs.

posted @ Friday, March 04, 2011 5:37 PM | Feedback (0)