Effects-based programming considered harmful (Or: How I learned to stop worrying and love functional programming), part III

Part I and Part II are also available.

As I mentioned in part I, flagging a method as static in C# isn’t sufficient to eliminate all side effects.  It eliminates a single (albeit common) class of side effects, mutation of instance variables, but other classes of side effects (mutation of static variables, any sort of IO like database access or graphics updates) are still completely possible.  Effects are not a first-class concern in C#, and so the language as it stands today has no way to guarantee purity.  Thus, while it’s helpful to indicate intent and minimize your potential effect “surface area” by flagging a method as static, it’s not a guarantee that your function is pure.  Certainly, one can code in a functional style in C#, deriving many of the benefits just by indicating intent and minimizing potential effect “surface area” via flagging methods as static, but it’s still not the same as having a compiler guarantee that every method you think is pure actually is.  It’s easy for someone who doesn’t understand that a particular method was intended to be a pure function to change your static method to make a database call, and the compiler will be happy to comply.

Should C# be changed to raise effects to first-class concerns?  I don’t think most developers are used to differentiating between effects-laden code and code without effects, and may balk at any changes to the language that force them to.  Further, I fear backwards-compatibility concerns may limit what one could change about C#; instead of flagging a method as having effects (think the IO flag seen in some functional languages), it may be easier to invert the idea and flag methods as strictly pure.  Methods with this flag would be unable to access variables not passed in as input parameters and restricted in what kinds of calls they could make.

I don’t know yet what the right way to take is, and I don’t think anyone else does either.  I do know people like Joe Duffy and Anders Hejlsberg are contemplating it on a daily basis.  Something will need to be done; as developers will deal more and more with massively parallel hardware, segregating or eliminating effects will go from being a “nice to have” aspect of one’s programming style into an explicitly controlled “must have” just to get software that works.

Advertisements
Posted in Uncategorized. Tags: , , , . Comments Off on Effects-based programming considered harmful (Or: How I learned to stop worrying and love functional programming), part III

Effects-based programming considered harmful (Or: How I learned to stop worrying and love functional programming), part II

Part I of this series can be found here.

My code didn’t improve just because I was giving up on OOP principles. I was simultaneously adopting functional principles in their place, whether I realized it at first or not. I found several benefits in doing so:

Testability

Consider a unit test for a pure function. You determine the appropriate range of input values, you pass the values in to the function, you check that the return values match what you expect. Nothing could be simpler.

Now consider testing a method that does the same job, but as an instance method. You’re not just considering the input and output values of a function with a simple unit test anymore. You either have to go through every step of the method and make sure you’re not touching any variables you’re not supposed to, or change the unit test to check every variable that is in scope (i.e. at least every method on that object), if you’re being comprehensive. And if you take a shortcut and don’t test every in-scope variable, anyone can add a line to your method that updates another variable, and your tests will never complain. Methods that perform external IO, such as reading values from a database to get their relevant state, are even worse — you have to insert and compare the appropriate values in the database out of band of the actual test, not to mention making sure the database is available in the same environment as the unit tests in the first place.

When software is designed in such a way that side effects are minimized, test suites are easy to write and maintain. When side effects occur everywhere in code, automated testing is impractical at best and impossible at worst.

Limited coupling of systems

When I take the idea of eliminating side effects to their logical conclusion, I end up with a lot of data objects that don’t do anything except hold state, a lot of pure functions that produce output values from input parameters, and a mere handful of methods that set state on the data objects based on the aggregate results from the pure functions. This, not coincidentally, works extremely well with loosely coupled and distributed systems architectures. With modern web services communicating between systems that were often developed independently, it’s usually not even possible to send instance methods along with the data. It just doesn’t make sense to transfer anything except state. So why bundle methods and state together internally if you will have to split them up again when dealing with the outside world? Simplify it and keep it decoupled everywhere.

Decoupling internally means that the line between ‘internal’ code and ‘external’ code is malleable, as well. This leads me to another benefit of eliminating side effects:

Scalability (scaling out)

If all you’re doing is passing state from one internal pure function to another internal pure function, what’s stopping you from moving one of those functions to another computer altogether? Maybe your software has a huge bottleneck in the processing that occurs in just one function that gets repeatedly applied to different inputs. In this case, it may be reasonable to farm that function out to a dedicated server, or a whole load balanced/distributed server farm, by simply dropping in a service boundary.

Concurrency (scaling up)

One of the keys to maximizing how well your software scales up on multi-CPU systems is minimizing shared state. When methods have side effects, the developer must perform rigorous manual analysis to ensure that all the state touched by those methods is threadsafe. Further, just because it’s threadsafe doesn’t mean it will perform well; as the number of threads increases, lock contention for that state can increase dramatically.

When a method is totally pure, on the other hand, you have a guarantee that that method, at least, will scale linearly with your hardware and the number of threads. In other words, as long as your hardware is up to snuff, running 8 threads of your method should be 8 times faster; you simply don’t have to worry about it. On top of that, pure methods are inherently threadsafe; no side effects means no shared state in the method, so threads won’t interfere with each other. While you probably won’t be able to avoid shared state completely, keeping as many functions pure as possible means that the few places that you do maintain shared state will be well-known, well-understood, and much easier to analyze for thread safety and performance.

For all these reasons, I found adopting a functional style to be a huge win. However, not all is wine and roses for C# developers who have learned to love functional programming…

(to be continued in part III)

Posted in Uncategorized. Tags: , , , . Comments Off on Effects-based programming considered harmful (Or: How I learned to stop worrying and love functional programming), part II

Effects-based programming considered harmful (Or: How I learned to stop worrying and love functional programming), part I

A while back I was refactoring some C# code, and I noticed that, in the process, I was blindly converting as many instance methods to static as I feasibly could. My process was this:

  1. Stumble across instance method.  This method would run an algorithm using values stored in the object before setting a different value on the same object; it usually returned void or some sort of success flag. For example:
    instance method
  2. Convert to static method that accepts the necessary ‘before’ state as parameters and returns the results.
    static method
  3. Alter the calling method to both pass in the ‘before’ state and set the ‘after’ state of the object according to the value returned from the newly-staticized method.

    From
    old constructor
    to
    new constructor

  4. Feel sense of satisfaction that I’ve “fixed” the software somehow.
  5. Rinse and repeat, sometimes moving the actual setting of particular object state up the call tree several times by the time I’m done.

But is my sense of satisfaction justified? Certainly in an example as simplistic as the one I pasted above one could argue that I actually made the program harder to read. I had just been doing it because it felt right, not because I had logical reasons for it. And that’s no way to write code.

Surely, I needed to sit back and think about what I was doing.

My first realization was that I wasn’t preferring static methods, per se; I was preferring methods that had no side effects.  I was merely converting instance methods to statics in order to guarantee that no instance variables were being set, a common side effect in day-to-day imperative programming.  External IOs (e.g. accessing a database) and setting static variables (e.g. in rarely-used singletons) are still possible side effects in static methods.  I considered both of those with the same distaste, or at least wariness.  However, I had already ghettoized external IO and minimized singletons as much as possible, each for their own reasons; as those had already been addressed before I went on my latest refactoring rampage, only the instance methods remained to draw my ire.

Eliminating or minimizing side effects via converting instance to static methods had a corollary:  I had to separate object state from the methods operating on that state.  No longer would I have objects that simply knew what do with themselves.  Before, I would have a void method call that did calculations and set state on its object (e.g. the original CalculateTotalPrice() method); now I had some functions that only performed calculations on input values (the new CalculateTotalPrice), and other methods that only set/retained state (the new CoolClass() constructor).  I didn’t even need the two kinds of methods to exist on the same object.  I was giving up on one of the most basic principles of object-oriented programming.

And you know what?  It made the code better.

(to be continued in part II…)