Leveling Up: Exploring Immutability
Continuing from the last couple of months, we’re going to be looking more into objects. Objects allow us to combine state and behavior into a single construct. In objects, state is held in variables called properties, while behaviors are associated functions we call methods. This month’s topic is immutability, which in this case means making objects whose state cannot change.
The concept of a variable, by its very nature and name, indicates something that changes, or varies. So why would we want to consider making variables that cannot change, and how does that differ from constants?
If you followed the PHP-FIG when they were designing PSR-7, the topic of immutability was common, often heated, and sometimes contentious. Immutability in the context of objects means that internal properties cannot be changed once the object is created.
Trade-offs of Immutability
Some languages provide language-enforced immutability. Swift, for example, allows the declaration of a variable using let, which indicates that the variable will not change during its lifetime. In PHP, we have to work around this limitation. In order to implement immutability in PHP, we need to be able to control access to the properties so they cannot be changed.
We also need a way to set the initial values of those properties. This can be done through a constructor combined with private or protected properties.
class ImmutableThing
{
    private $a;
    private $b;
    private $c;

    public function __construct($a, $b, $c)
    {
        $this->a = $a;
        $this->b = $b;
        $this->c = $c;
    }
}
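As a rough sketch of how that access control plays out, the variation below adds a read-only accessor and then tries to write to a private property from outside the class. The getA() method and the final keyword are additions for illustration, not part of the listing above, and the try/catch assumes PHP 7 or later, where the illegal write raises an Error instead of a fatal error.

```php
<?php
// Minimal sketch: state is set once in the constructor and exposed read-only.
final class ImmutableThing
{
    private $a;
    private $b;

    public function __construct($a, $b)
    {
        $this->a = $a;
        $this->b = $b;
    }

    // Hypothetical accessor, added for illustration only.
    public function getA()
    {
        return $this->a;
    }
}

$thing = new ImmutableThing('first', 'second');

$blocked = false;
try {
    $thing->a = 'changed'; // outside code may not touch a private property
} catch (Error $e) {
    $blocked = true;       // assumes PHP 7+, which throws Error here
}

echo $thing->getA(), "\n";
var_dump($blocked);
```

The value passed to the constructor is still the one the accessor returns, because no code outside the class can write to the property.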
Mutability via Leaking Internal Object References
For an object with dependencies to be immutable, those dependencies must be immutable as well. Think about it: if an object is used in a function and its dependencies play a part in the result, then a change to those dependencies could change the result of the function. This is certainly not desirable in an immutable system. Let’s see how leaking internal object references could lead to a problem.
class ImmutableThing
{
    private $dep;

    public function __construct(MyDependency $dep)
    {
        $this->dep = $dep;
    }
}
In order to create our ImmutableThing, we need to create a MyDependency object and pass it in.
$dep = new MyDependency();
$immutable = new ImmutableThing($dep);
At this point, we still have a reference to $dep, which is the same exact object that is stored and used inside our object.
$dep->setValue('foo');
By calling the setter on $dep, we’ve changed the object that is inside of $immutable. If $dep were instead an immutable object, changing the value inside of $immutable would not be possible:
$dep = $dep->withNewValue('foo');
At this point, $dep would be a different object than the one in $immutable so its immutability is preserved.
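To make the difference concrete, here is one possible shape for an immutable MyDependency. The class body, the getValue() method and the getDepValue() accessor are assumptions made for this sketch; only the class names and withNewValue() come from the text above.

```php
<?php
// Hypothetical immutable dependency: no setter, only a cloning "with" method.
final class MyDependency
{
    private $value;

    public function __construct($value = null)
    {
        $this->value = $value;
    }

    public function getValue()
    {
        return $this->value;
    }

    public function withNewValue($value)
    {
        $new = clone $this;   // copy first, then change only the copy
        $new->value = $value;
        return $new;
    }
}

final class ImmutableThing
{
    private $dep;

    public function __construct(MyDependency $dep)
    {
        $this->dep = $dep;
    }

    // Hypothetical accessor so we can observe the internal dependency.
    public function getDepValue()
    {
        return $this->dep->getValue();
    }
}

$dep = new MyDependency('original');
$immutable = new ImmutableThing($dep);

// Rebinding $dep to a modified copy leaves the object inside $immutable alone.
$dep = $dep->withNewValue('foo');

echo $immutable->getDepValue(), "\n"; // original
echo $dep->getValue(), "\n";          // foo
```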
DateTimeImmutable
PHP 5.2 introduced a new class for dealing with dates and times, the DateTime class. It provides convenient methods for dealing with almost every aspect of these concepts. You can create new objects from many different strings that could represent a date, including things like 3 weeks from now or last Tuesday. You can convert from one timezone to another, format a date and time in countless ways, or determine the difference between two different DateTime objects in terms of years, days, hours, minutes and seconds. In short, it’s an awesome and powerful class. However, there are some problems with it. Suppose you’d like to determine when a library book is due back.
$checkedOut = new DateTime('2017-03-01');
$dueBack = $checkedOut->add(new DateInterval('P3W'));
$now = new DateTime();
$timeOut = $now->diff($checkedOut);

if ($now < $dueBack) {
    echo 'Book has been out for ' . $timeOut->d . ' days';
} else {
    echo "Book is overdue!\n";
    echo 'Book has been out for ' . $timeOut->d . ' days';
}
On the surface, this looks simple, and possibly even correct. We’ve got a checkout date of March 1st. The book is due back three weeks after it is checked out which is the date we store in $dueBack. The date right now is stored in $now. The default value for creating a new DateTime object is ’now’. We can then calculate how many days the book has been checked out. If $now is before the due date, we can echo how long it has been out. Otherwise, it’s overdue and we’ll indicate that as well.
There’s a problem, though, and if you’re not already aware of it, it will probably make you want to tear your hair out. Go ahead and try it out and see if you can determine the issue. I’ll wait.
The issue becomes obvious if we compare $checkedOut and $dueBack. The problem is that the DateTime class creates mutable objects. What this means is that when we were calculating the due date, we were also simultaneously updating the $checkedOut date to be three weeks ahead as well. In fact, if you tried a strict comparison between $checkedOut and $dueBack, you’d find they both refer to exactly the same object. By calling add(), we’ve mutated the state of the DateTime object.
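A quick way to see this for yourself is to compare the two variables directly. The snippet below is only a small check and uses nothing beyond the DateTime and DateInterval classes already shown:

```php
<?php
$checkedOut = new DateTime('2017-03-01');
$dueBack = $checkedOut->add(new DateInterval('P3W'));

// add() modified $checkedOut in place and returned that same object.
var_dump($checkedOut === $dueBack);       // bool(true)
echo $checkedOut->format('Y-m-d'), "\n";  // 2017-03-22
echo $dueBack->format('Y-m-d'), "\n";     // 2017-03-22
```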
Let’s try again, but with a different class for representing dates and times. The DateTimeImmutable class was introduced in PHP 5.5. Using it changes the behavior of just one line of our code compared to the original, but it makes all the difference in the world:
$checkedOut = new DateTimeImmutable('2017-03-01');
$dueBack = $checkedOut->add(new DateInterval('P3W'));
$now = new DateTimeImmutable();
$timeOut = $now->diff($checkedOut);

if ($now < $dueBack) {
    echo 'Book has been out for ' . $timeOut->d . ' days';
} else {
    echo "Book is overdue!\n";
    echo 'Book has been out for ' . $timeOut->d . ' days';
}
Now, what happens is that when we use the add method to calculate the new due date, we get a brand new object with the values we expect. This leaves the $checkedOut date at March 1st, and the $dueBack date at March 22nd. Since I don’t know when you might be running this, I cannot predict the output you’ll get, but for me, the mutable version told me that I had the book checked out for 27 days (but not overdue) and the immutable version says the book has been checked out for only six.
There are other issues, of course, but we’ve covered the important one. In my case, I’m running the code before March 1st, so my six days and 27 days really represent the system telling me it is six days and 27 days until the checkout date, because I haven’t bothered to check the flag (the invert property of the DateInterval) that indicates whether the difference is positive or negative. For more information on the DateTimeImmutable class, please check out my Getting a Date with PHP article in the June 2015 issue of PHP Architect.
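For the curious, here is one way those remaining issues might be handled. This is a sketch rather than a drop-in fix: the describeLoan() function and its messages are made up for illustration, the "now" values are fixed so the results are reproducible, and an explicit UTC time zone is assumed to keep daylight-saving changes out of the picture. It reads the days property (the total day count) instead of d, which holds only the day component left over after whole months.

```php
<?php
// Hypothetical helper: report on a loan using the sign and total day count.
function describeLoan(DateTimeImmutable $now, DateTimeImmutable $checkedOut, DateTimeImmutable $dueBack)
{
    $diff = $now->diff($checkedOut); // measured from $now to $checkedOut

    // invert is 1 when $checkedOut lies before $now, and 0 otherwise.
    if ($diff->invert === 0 && $diff->days > 0) {
        return "Checkout is still {$diff->days} days away";
    }

    $message = "Book has been out for {$diff->days} days";

    return $now < $dueBack ? $message : "Book is overdue! $message";
}

$utc = new DateTimeZone('UTC');
$checkedOut = new DateTimeImmutable('2017-03-01', $utc);
$dueBack = $checkedOut->add(new DateInterval('P3W'));

$early   = describeLoan(new DateTimeImmutable('2017-02-23', $utc), $checkedOut, $dueBack);
$onLoan  = describeLoan(new DateTimeImmutable('2017-03-10', $utc), $checkedOut, $dueBack);
$overdue = describeLoan(new DateTimeImmutable('2017-04-05', $utc), $checkedOut, $dueBack);

echo $early, "\n";   // Checkout is still 6 days away
echo $onLoan, "\n";  // Book has been out for 9 days
echo $overdue, "\n"; // Book is overdue! Book has been out for 35 days
```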
PSR-7 And Immutability
If you’ve built so-called value objects, you’ve likely either created them with all their attributes with public visibility (not recommended) or you’ve built “getters” and “setters”. These allow you to mutate state by changing the properties of an object. With immutable objects, setters are not good since they allow someone to change state after the object is instantiated. However, we may want an object that is very close to one we already have but with something changed. If state can only be injected via the constructor, this can be very inconvenient if an object has a lot of properties.
With the DateTimeImmutable class, the add and sub methods return a new DateTimeImmutable object with a different internally represented date and time. This lets us take a starting point, perhaps “now”, and build a new object from it. Calculating the due date of the library book from a starting date and an interval is one example.
PSR-7 is about HTTP message interfaces. It defines interfaces for objects dealing with HTTP requests and responses, and almost all of it is immutable. But there are a lot of things that need to be considered in an HTTP request or response. For a message, we have protocol versions, headers and a body. For a server request, we have a target, a method, a request URI, server parameters, cookie parameters, query parameters, potentially uploaded files, attributes, etc. In short there’s a lot that goes into the object. The designers of PSR-7 decided to use something that looks similar to a setter, but instead the method names start with “with”. So you have methods like “withQueryParams” or “withCookieParams”. These methods use all the values of the object they are called on and then return a new object with only the specified changes in place. This makes it so you can easily get a new object with only a few things changed without needing to start from a constructor, and you can effectively “chain” these calls together allowing you to create the object you need with just a few calls.
$modifiedRequest = $request->withCookieParams($newCookies)
    ->withQueryParams($newQueryParams)
    ->withAttribute('foo', 'bar');
Starting with whatever $request has, the code above leaves us with a completely new object in $modifiedRequest, with new cookies, new query parameters and a new attribute named foo. However, since $request is immutable, the calls above create three new objects, and we never get access to two of them.
New State Means New Object
In the example above, we are creating a new object with each call to a with* method. We get a new object that is different from $request when the withCookieParams call is made, another new object with the call to withQueryParams, and finally another new object with the call to withAttribute. This final object is stored in the $modifiedRequest variable, but what happens to the others? We have no real way of getting access to them. As it turns out, PHP keeps track of how many references there are to an object, and when there aren’t any left, the object’s memory can be reclaimed.
Garbage collection is the mechanism PHP uses to reclaim memory that is no longer needed by your code. If it’s not referenced by anything, it is no longer needed. Because of this, I feel that in most cases, there’s no reason for concern over the intermediate objects that are created and discarded by following this pattern.
Let’s look at how one of these “with” methods might work.
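public function withSamoflange($samoflange)
{
    $new = clone $this;
    $new->samoflange = $samoflange;

    // Return the modified copy; without this the method yields null
    // and the calls cannot be chained.
    return $new;
}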
That’s it. There’s really not a lot to it, but it does allow for “chaining” or fluent calling of these mutators in order to build a new object based on some base object plus some changes. With these methods in place, and the original dependencies coming in from the constructor, and no “setters”, the effect is that you’ve got an object that is effectively immutable.
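To see the pattern end to end, here is a small self-contained sketch. The Widget class, its color property and the accessor names are invented for illustration; only the withSamoflange() idea comes from the article.

```php
<?php
// Hypothetical value object with two cloning mutators.
final class Widget
{
    private $samoflange;
    private $color;

    public function __construct($samoflange, $color)
    {
        $this->samoflange = $samoflange;
        $this->color = $color;
    }

    public function getSamoflange()
    {
        return $this->samoflange;
    }

    public function getColor()
    {
        return $this->color;
    }

    public function withSamoflange($samoflange)
    {
        $new = clone $this;
        $new->samoflange = $samoflange;
        return $new; // returning the copy is what makes chaining possible
    }

    public function withColor($color)
    {
        $new = clone $this;
        $new->color = $color;
        return $new;
    }
}

$base = new Widget('small', 'red');
$changed = $base->withSamoflange('large')->withColor('blue');

echo $base->getSamoflange(), ' ', $base->getColor(), "\n";       // small red
echo $changed->getSamoflange(), ' ', $changed->getColor(), "\n"; // large blue
```

One thing to keep in mind is that clone makes a shallow copy: any property that holds an object is shared between the original and the copy, which is another reason the dependencies themselves should be immutable.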
Garbage Collection in PHP
After PHP detects there are no remaining references to an object or value, PHP can garbage collect or reclaim the memory for use elsewhere. If your immutable code is built similarly to PSR-7, and you make new objects with new state often, especially when you may need to make multiple intermediate objects, there is often a concern that PHP will be making dozens, hundreds or even thousands of extra objects in order to achieve immutability. In fact, this was one of the bigger concerns brought up on the PHP-FIG mailing list with regard to PSR-7 when immutability was announced. However, PHP handles garbage collection well, and these intermediate objects will be dealt with quickly.
So, even though PHP doesn’t support immutability directly at the language level, and an implementation of it either means object creation only via constructor or via cloning mutators, immutability is still a good idea.
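If you want to watch the intermediate objects disappear, a destructor makes it visible. In the common case with no reference cycles, PHP’s reference counting destroys an object as soon as its last reference goes away. The Tracked class and its static counter below are purely illustrative assumptions for this sketch.

```php
<?php
// Hypothetical class that counts how many of its instances were destroyed.
final class Tracked
{
    public static $destroyed = 0;

    private $value;

    public function __construct($value)
    {
        $this->value = $value;
    }

    public function withValue($value)
    {
        $new = clone $this;
        $new->value = $value;
        return $new;
    }

    public function __destruct()
    {
        self::$destroyed++;
    }
}

$base = new Tracked(0);

// Three new objects are created; only the last one is kept in a variable.
$final = $base->withValue(1)->withValue(2)->withValue(3);

// The two unreferenced intermediates have already been destroyed.
echo Tracked::$destroyed, "\n"; // 2
```

Both $base and $final are still alive at this point, so the counter only reflects the two throwaway objects in the middle of the chain.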
Implementing and Ensuring Immutability
We’ve talked already about building the object by passing in values and dependencies via the constructor, and providing “cloning mutators” which provide a new object that contains most of the same state. We also talked about the importance of ensuring that any dependencies are also immutable. However, there are other considerations as well.
First we have the visibility modifiers of the dependencies. Public is clearly right out since it doesn’t provide any means of ensuring a dependency cannot be changed. Protected is better, but it still would allow someone to extend our immutable object and write code that could change the dependencies. Therefore, we should make the dependencies private. Additionally, to ensure inheritance cannot be used to break immutability, consider marking immutable classes as final so they cannot be extended.
However, there’s still more, even if you do all of this. An object’s dependency’s visibility can be modified via Reflection. Objects could be serialized, changed via their string representation and unserialized. Magic methods could be used in order to change dependencies of immutable objects. It’s also possible to build an anonymous function that can access and change dependencies of immutable objects. It is possible to disable serialization and deserialization by overriding __sleep and __wakeup, but all of these additional ways of breaking immutability really require someone to go out of their way to do so. Unless or until PHP provides language-level ways of creating immutable values, this is probably as good as we’ll be able to do.
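As an illustration of the anonymous function route, the sketch below binds a closure to an object and to its class scope using Closure::bind. The Money class is a made-up example; the point is only that a final class with private properties and no setters can still be changed by code that deliberately goes around the public API.

```php
<?php
// Hypothetical "immutable" class: final, private state, no setters.
final class Money
{
    private $amount;

    public function __construct($amount)
    {
        $this->amount = $amount;
    }

    public function getAmount()
    {
        return $this->amount;
    }
}

$money = new Money(100);

// A closure bound to the object and to the class scope can reach private state.
$tamper = Closure::bind(function ($amount) {
    $this->amount = $amount;
}, $money, Money::class);

$tamper(5);

echo $money->getAmount(), "\n"; // 5
```

Nobody writes this by accident, which is the practical argument for treating the object as immutable anyway.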
Functional Programming Roots
Much of the argument towards development with immutable objects can be traced to functional programming. Immutable objects greatly reduce the chances of errors in a multi-threaded environment. Since PHP is (generally) not threaded and typically written in a “share-nothing” way, threading issues don’t come up as much as they might in other languages. The general idea with a threading issue is that if multiple threads run the same bit of code that changes the same bit of memory, the order they run can affect the results of the code. The common example is a banking transaction and is often used to explain why database transactions are important.
// Thread 1 (withdraw)
$balance1 = getBalance('12345');
$newBalance = $balance1 - 100;
setBalance('12345', $newBalance);

// Thread 2 (deposit)
$balance2 = getBalance('12345');
$newBalance = $balance2 + 100;
setBalance('12345', $newBalance);
Suppose these two pieces of code are running simultaneously. Within each thread, the lines run in order, each one after the line above it. Across the two threads, however, the lines can be interleaved in any order. This means thread 2 could run all of its code before thread 1, or vice versa. It also means some of thread 1 could run, then some of thread 2, and then back to thread 1. These interleavings can leave account 12345 with different final balances. Suppose both threads run the getBalance line, and they both receive a value of $100. Then each thread figures out the new balance, that is, $0 for thread 1 and $200 for thread 2. The final balance at this point could be either $0 or $200, depending on which thread runs its setBalance call last.
If thread 1 runs all the code and then thread 2 runs its code, we start and end with a final balance of $100.
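Since PHP code usually doesn’t run in threads, the interleavings can be simulated deterministically instead. The sketch below is an assumption-laden stand-in for real threading: finalBalance() is a hypothetical helper that replays the three steps of each “thread” (read, compute, write) in whatever order the schedule dictates, starting from a balance of $100.

```php
<?php
// Replay the withdraw (thread 1) and deposit (thread 2) steps in a given order.
// $schedule lists which thread takes its next step, e.g. [1, 2, 1, 2, 1, 2].
function finalBalance(array $schedule)
{
    $account = 100;                    // shared state
    $change  = [1 => -100, 2 => 100];  // thread 1 withdraws, thread 2 deposits
    $local   = [];                     // each thread's private working copy
    $step    = [1 => 0, 2 => 0];       // next step to run, per thread

    foreach ($schedule as $thread) {
        switch ($step[$thread]++) {
            case 0: // getBalance
                $local[$thread] = $account;
                break;
            case 1: // compute the new balance
                $local[$thread] += $change[$thread];
                break;
            case 2: // setBalance
                $account = $local[$thread];
                break;
        }
    }

    return $account;
}

$sequential   = finalBalance([1, 1, 1, 2, 2, 2]); // thread 1, then thread 2
$depositLast  = finalBalance([1, 2, 1, 2, 1, 2]); // thread 2 writes last
$withdrawLast = finalBalance([1, 2, 1, 2, 2, 1]); // thread 1 writes last

printf("%d %d %d\n", $sequential, $depositLast, $withdrawLast); // 100 200 0
```

Same code, same starting balance, three different answers: the only thing that changed is the order in which shared state was read and written.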
Functional programming encourages immutable structures. Typically variables are created immutably unless you otherwise declare them mutable. In OOP languages like PHP, the opposite is true. You have to jump through hoops in order to make things immutable, assuming you can do it at all.
Here’s the key piece with functional programming – if you call the same function with the same arguments you get the same result. If a function works on the state of an object and that state can change, then calling the same function with the same object (with a different state) will have a different result.
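A small example may help here. The checkedOutLabel() function is a hypothetical helper invented for this sketch; it is called twice with the same variable, first with a mutable DateTime and then with a DateTimeImmutable.

```php
<?php
// Hypothetical function whose result depends on the state of its argument.
function checkedOutLabel(DateTimeInterface $checkedOut)
{
    return 'Checked out on ' . $checkedOut->format('Y-m-d');
}

// Mutable: the same variable gives a different answer after add().
$mutable = new DateTime('2017-03-01');
$first = checkedOutLabel($mutable);
$mutable->add(new DateInterval('P3W'));
$second = checkedOutLabel($mutable);
var_dump($first === $second); // bool(false)

// Immutable: add() returns a new object, so the original answer holds.
$immutable = new DateTimeImmutable('2017-03-01');
$third = checkedOutLabel($immutable);
$later = $immutable->add(new DateInterval('P3W'));
$fourth = checkedOutLabel($immutable);
var_dump($third === $fourth); // bool(true)
```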
Bidirectional Cannot Be Immutable
One limitation of immutable state is that it is not possible if you have bidirectional linking of objects. Since immutability requires that the underlying objects be immutable, it’s not possible to create a bidirectional dependency tree. The problem is this: An object could be created with a reference to another immutable object. However, once that object is created, the reference needs to be injected into an already existing immutable object. That cannot be done without changing the state of the original object which means we cannot do it.
$a = new Immutable();
$b = new Immutable($a);
$a = $a->withReferenceTo($b);
// $a is a new object now and not the one referenced in $b.
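The same situation can be shown with a runnable sketch. The Node class below, including its optional constructor argument and getOther() accessor, is an assumption made for illustration, and the nullable type hint requires PHP 7.1 or later.

```php
<?php
// Hypothetical immutable node that may hold a reference to one other node.
final class Node
{
    private $other;

    public function __construct(?Node $other = null)
    {
        $this->other = $other;
    }

    public function getOther()
    {
        return $this->other;
    }

    public function withReferenceTo(Node $other)
    {
        $new = clone $this;
        $new->other = $other;
        return $new;
    }
}

$a = new Node();
$b = new Node($a);
$originalA = $a;              // keep a handle on the first object for comparison
$a = $a->withReferenceTo($b);

var_dump($a->getOther() === $b);          // bool(true): the new $a points at $b
var_dump($b->getOther() === $a);          // bool(false): $b does not point back
var_dump($b->getOther() === $originalA);  // bool(true): it still holds the old $a
```

The link from the new $a to $b exists, but the link back from $b still goes to the old object, so the pair is never truly bidirectional.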
Conclusion
Immutability, while it may seem weird at first, can provide for simpler code that is easier to understand. It greatly reduces issues in threaded code, which most of us dealing in PHP don’t need to be concerned with, but it is still useful and important to be aware of. It can help us make code more functional, and it reduces side effects and hard-to-find bugs. PSR-7 defines immutable structures for nearly every aspect of the request/response cycle and provides a great way to build reusable code for dealing with incoming or outgoing requests. It is not possible to make all code use immutable variables and objects, but I recommend that you look into it and consider using it wherever it makes sense for your application.