Leveling Up: New Features in PHP 7, Part 2
This article was originally published in PHP|Architect magazine in the November 2015 issue. These articles are all copyright by David Stockton. You may be able to purchase the issue here.
Requirements:
- PHP 7
- https://github.com/rlerdorf/php7dev
- Vagrant
It’s getting even closer now. As of this writing, PHP 7 RC 5 is now out. By the time you read this, PHP 7 RC 6 will have been released. PHP 7 may literally be just around the corner. Last month we covered a number of new and exciting features coming in PHP 7. There are a few more I wanted to cover this month as well as diving a bit deeper into a couple that we talked about last time.
Introduction
If I haven’t mentioned it before (I have, though), I’m pretty excited about the upcoming release of PHP 7. I will reiterate from last time, if you’re running PHP older than 5.5, you really need to upgrade. PHP 5.5 is the oldest version still supported and it’s currently only receiving security upgrades. PHP 5.6 is the current version and will receive updates until August 28, 2016 and security fixes for a year following. In addition to massive speed improvements and security fixes, PHP continues to improve all the time. Writing code that is able to take advantage of these new features is not only fun, but it helps to expand your mind in how you think of code. Continuous learning is an important aspect of leveling up.
Playing Along
In the sidebar, you’ll find links to Vagrant and Rasmus Lerdorf’s PHP 7 vagrant machine. He’s made this incredible Vagrant box that will allow you to run PHP 7 as well as every version of PHP since 5.3. It’s a great way to play with new PHP features that you may not have tried as well as testing out your codebase against PHP 7 so you can know if you’re likely to have any issues when upgrading. I highly encourage you to download the VM and play around with it. If you run into bugs or issues, open tickets with PHP 7. You’ll literally be helping make the language better for yourself and everyone else. With that, let’s get into the features. I already covered most of the deprecations and backward compatibility breaks in the previous article so this one should be all about the fun stuff.
Constant Arrays with Define
In PHP 5.6 it was possible to create constant arrays using the const keyword. PHP 7 brings this ability to define as well:
define ('PHP7_EXCITEMENT_LEVEL', [
'Not at all',
'A little',
'Some',
'Lots',
11 => 'Ermahgerd, I want this now!',
]);
echo PHP7_EXCITEMENT_LEVEL[11];
This means you can do a few more complex things that you cannot do with const. For instance, suppose you have this function:
function add2($x) {
return $x+2;
}
You cannot define a constant array using the result of that function call using const:
const FOO = [add2(2), add2(4), add2(6)];
If you try that, PHP will complain with Fatal error: Constant expression contains invalid operations. However, you can create that constant array using define:
define ('FOO', [add2(2), add2(4), add2(6)]);
So you can now create constants that have values that may not be known at development time and put them in a constant array using define. Hooray!
Unicode Codepoint Escape Syntax
Coming soon to PHP 7, rather than trying to remember how to output special Unicode characters in HTML, we now have a standard syntax for outputting Unicode characters. Inside double quotes (or heredoc syntax) you can put \u followed by the hex value of the Unicode code point in curly brackets:
echo "\u{1F4A9}\u{1F550}"; // outputs 💩🕐
Never before has it been so easy to output emoji from our PHP code. I’m sure there are other important uses as well.
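To see what the escape actually produces, here is a small sketch (the variable names are mine, not part of any API). It assumes the usual behavior that PHP expands the escape into the UTF-8 bytes of the code point, so a single emoji ends up four bytes long, and that single-quoted strings leave the escape untouched:

```php
<?php
// "\u{...}" is expanded into the UTF-8 bytes of the code point.
$poo = "\u{1F4A9}";

echo strlen($poo), "\n";   // 4 (bytes, not characters)
echo bin2hex($poo), "\n";  // f09f92a9

// Single quotes do not interpret the escape; the text stays literal.
$literal = '\u{1F4A9}';
echo strlen($literal), "\n"; // 9
```

Because the result is plain UTF-8, the page or terminal you echo it to needs to be using UTF-8 as well.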
Closure Binding at Call Time with Closure::call()
In PHP prior to 7, it was necessary to first bind a closure to an object before running it. This means that we could start with a simple class like so:
class TheLanguage
{
private $language = 'PHP';
}
In PHP before 7, we could do the following:
$getLanguageCB = function() { return $this->language; };
$getLanguage = $getLanguageCB->bindTo(new TheLanguage, 'TheLanguage');
echo $getLanguage();
With PHP 7, we can shorten the code (and it’s faster to boot):
$getLanguage = function() { return $this->language; };
echo $getLanguage->call(new TheLanguage);
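Closure::call() also forwards any extra arguments to the closure. Here is a minimal sketch of that; the TheLanguage class is repeated so the snippet runs on its own, and the $describe closure and its parameters are made up for illustration:

```php
<?php
class TheLanguage
{
    private $language = 'PHP';
}

// The closure is not bound to anything when it is created.
$describe = function ($version, $mood) {
    return $this->language . ' ' . $version . ' is ' . $mood;
};

// call() binds $this (and the class scope) to the object for this one
// invocation; the remaining arguments are passed on to the closure.
$result = $describe->call(new TheLanguage, 7, 'fast');
echo $result, "\n"; // PHP 7 is fast
```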
Security Updates for Unserialize
PHP’s serialize and unserialize have been around since the PHP 4 days. They allow you to convert PHP objects and structures into plain text that can be stored and retrieved later. If you’ve ever looked at the session files PHP generates, you’ve seen what serialized PHP looks like. Let’s take a look at what serializing an instance of the TheLanguage class above might look like:
$lang = new TheLanguage;
echo serialize($lang);
If you were to run this in PHP, you’ll see something like this:
O:11:"TheLanguage":1:{s:21:"TheLanguagelanguage";s:3:"PHP";}
This only tells part of the story though, as there are things in there that we can’t see and that cannot be printed in this magazine – not because it is explicit, but because it literally has no visible representation. But we can see the hidden bits with a slightly different bit of code:
var_export(serialize($lang));
The output from this is a bit longer:
'O:11:"TheLanguage":1:{s:21:"' . "\0" . 'TheLanguage' . "\0" . 'language";s:3:"PHP";}'
The difference is those “\0” or null characters that are now explicit in the output. The PHP serialize format isn’t terribly hard to understand though. The first character ‘O’ indicates that we’re dealing with an object. This is followed by a colon and then 11, which lets PHP know that the name of the class is 11 characters long. Following the next colon, we get the name of the class, ‘TheLanguage’. After that, one more colon and a 1 indicating that the object contains a single property. If this were an array it would tell us how many elements were in the array.
Inside the curly brackets, we get the internal representation, or state, of the object. The s:21 indicates we’ve got a string with 21 bytes. It consists of the null byte, the name of the class, another null byte and then the name of our property, in this case, ‘language’. The semicolon delimiter is then followed by another s:3 for a string of 3 bytes and a value of PHP. PHP’s serialize function uses the null bytes to ‘hide’ private and protected values.
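If you want to convince yourself of the byte count, a quick sketch like this one rebuilds the mangled property name by hand and checks that it appears in the serialized string. The $mangled variable is just my name for it:

```php
<?php
class TheLanguage
{
    private $language = 'PHP';
}

$serialized = serialize(new TheLanguage);

// A private property is stored as "\0" . class name . "\0" . property name.
$mangled = "\0TheLanguage\0language";

echo strlen($mangled), "\n";                        // 21
var_dump(strpos($serialized, $mangled) !== false);  // bool(true)
```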
If I changed the class definition so that the language property was protected, we’d get a slightly different output (again with var_export):
'O:11:"TheLanguage":1:{s:11:"' . "\0" . '*' . "\0" . 'language";s:3:"PHP";}'
In this case, notice that the length of the string for the property is now 11 bytes instead of 21, and we’ve now got null * null and then the name of the property. Let’s look at the serialized value if the language property were public (this time with echo):
O:11:"TheLanguage":1:{s:8:"language";s:3:"PHP";}
Since nothing is private or protected, serialize doesn’t need to obfuscate the name of the property in any way. So far, nothing I’ve talked about in this section is new. But you can see that it’s not terribly difficult to manipulate the serialized string in a way that could result in different or undesirable values or objects being created. For instance, if the string above were stored in a database (or session) and retrieved and unserialized, we’d get back out an object that looked just like the one we serialized. However, if we changed that string to something like this:
O:12:"SomeLanguage":1:{s:8:"language";s:6:"Python";}
Our code that expects to get back an object of TheLanguage will be very sad. If ‘SomeLanguage’ is a class that does not exist or cannot be autoloaded, PHP will substitute __PHP_Incomplete_Class and related values. This means if you try to access or use anything from this object, you’ll get a notice about how you need to ensure that the ‘SomeLanguage’ class is defined or can be loaded. However, if ‘SomeLanguage’ does exist and can be loaded, then the code that restored this object may be expecting one thing and getting something else entirely. It is a potentially dangerous security vulnerability. And with that, we’re finally on to the new functionality with PHP 7 and unserialize.
The unserialize function can now accept a second parameter that tells PHP what classes to allow to be unserialized. Let’s take a look:
$serialized = 'O:11:"TheLanguage":1:{s:8:"language";s:3:"PHP";}';
$object = unserialize($serialized);
var_dump($object);
The output we receive is:
object(TheLanguage)#1 (1) {
["language"]=>
string(3) "PHP"
}
That’s the standard behavior right now as well. Suppose we don’t want to allow any classes out (perhaps we’ve serialized an array and expect to receive an array):
$serialized = 'O:11:"TheLanguage":1:{s:8:"language";s:3:"PHP";}';
$object = unserialize($serialized, ['allowed_classes' => false]);
var_dump($object);
In this case, instead of loading or trying to load TheLanguage, we get this structure:
object(__PHP_Incomplete_Class)#3 (2) {
["__PHP_Incomplete_Class_Name"]=>
string(11) "TheLanguage"
["language"]=>
string(3) "PHP"
}
If we wanted to allow ‘TheLanguage’ but prevent hackery of potentially unserializing a ‘SomeLanguage’ object, we can do this:
$serialized = 'O:12:"SomeLanguage":1:{s:8:"language";s:6:"Python";}';
$object = unserialize($serialized, ['allowed_classes' => ['TheLanguage']]);
var_dump($object);
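In this case, the output has the same shape as the example above: because ‘SomeLanguage’ is not on the list, we get a __PHP_Incomplete_Class object (with ‘SomeLanguage’ recorded as its class name and ‘Python’ as its language) instead of a live SomeLanguage instance. By limiting the classes that we allow unserialize to use to only those that are expected, we can avoid a whole slew of potential issues. Of course, if you want to allow any serialized class while still using the new ‘allowed_classes’ key, you can pass in true to get the same behavior as leaving off that parameter entirely.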
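One way to put the whitelist to work is to wrap it in a small helper that refuses anything other than the class you expect. This is only a sketch: unserializeAs() is a hypothetical function of my own, not something PHP provides, and the choice of UnexpectedValueException is arbitrary.

```php
<?php
class TheLanguage
{
    public $language = 'PHP';
}

/**
 * Hypothetical helper: unserialize $data, allowing only $expectedClass,
 * and fail loudly if anything else comes back.
 */
function unserializeAs($data, $expectedClass)
{
    $object = unserialize($data, ['allowed_classes' => [$expectedClass]]);

    // Disallowed classes come back as __PHP_Incomplete_Class, so this
    // check rejects them along with arrays, scalars and failed parses.
    if (!$object instanceof $expectedClass) {
        throw new UnexpectedValueException("Expected a $expectedClass");
    }

    return $object;
}

$good = unserializeAs('O:11:"TheLanguage":1:{s:8:"language";s:3:"PHP";}', 'TheLanguage');
echo $good->language, "\n"; // PHP

$rejected = false;
try {
    unserializeAs('O:12:"SomeLanguage":1:{s:8:"language";s:6:"Python";}', 'TheLanguage');
} catch (UnexpectedValueException $e) {
    $rejected = true;
    echo 'Rejected: ', $e->getMessage(), "\n";
}
```

The instanceof check matters because unserialize itself does not complain about a class that isn’t on the list; it quietly hands back the incomplete-class placeholder.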
Expectations
If you have written unit tests, you’re no doubt familiar with the concept of assertions. If you’ve been reading this column for a bit and following along, you’ve probably written some unit tests. Assertions are a way to ensure that something in our code is the way we expect it to be. PHP has had an assert function since the PHP 4 days, but I haven’t seen it used much in the wild.
In PHP 7, assert is a language construct that can behave differently based on configuration. The idea being that you can add some tests to code that are inside the actual code. In development, we want these assertions to run and tell us if there’s a problem. In production, however, we don’t want the slow-down that comes with the extra checks, so PHP allows them to be turned off completely leading to a zero cost for production code. Assertions are not something you’d use in order to validate inputs, however.
PHP 7 has two new ini directives that control the behavior of assertions.
php.ini directive | Default Value | Possible Values |
---|---|---|
zend.assertions | 1 | 1: Generate and evaluate code |
 | | 0: Generate but skip code |
 | | -1: Do not generate code (production mode) |
assert.exception | 0 | 1: Throw an exception on failure |
 | | 0: Use or generate a Throwable, but warning only |
You could use these assertions to establish pre-conditions and post-conditions in your functions and methods. For instance, suppose you wanted to start taking advantage of PHP 7’s scalar type hints but you didn’t know if every place you called the function from actually sent in the expected type:
function add($x, $y) {
return $x + $y;
}
Suppose the function above should be dealing only with integers. You could add in the type-hints and turn on strict mode, but in that case if any code called this function with a string or a float, it would blow up and stop working. We could do something like this though:
function add($x, $y) {
assert('is_int($x)', '$x is not an int');
assert('is_int($y)', '$y is not an int');
return $x + $y;
}
If I turn on assert.exception (and zend.assertions) and run my test suite (you have a test suite, right?) or let this run in QA for a while, any code that calls this add function without the proper type will generate an exception that will (hopefully) be logged with a stack trace showing where the call came from. I could then use that information to fix those calls, allowing me to move from non-strict-type code to strictly typed code in a safe and controlled way. I should be able to avoid breaking production this way. The code above would be deployed to production as is, but the ini setting would be zend.assertions = 0 or zend.assertions = -1, which means that even if we missed a call, production would still continue to work.
The first parameter to assert can be a string like above, which is then evaluated, or it can be the actual expression itself. If you use a string, you’ll get a message about the assertion that failed. Your second parameter to assert can be either a description of the failure, or a Throwable (like an Exception).
I recommend using assertions as a development tool. Use them to show situations that should clearly never happen. If an assertion fails, then it will let you know that there’s a serious problem that needs to be addressed.
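Here is a sketch of the same add function using the expression form of assert, with a custom Throwable for the first check. NotAnIntegerException is a hypothetical class invented for this example. I’d lean toward the expression form in new code, since string assertions were later deprecated in PHP 7.2 and removed in PHP 8.

```php
<?php
// Hypothetical exception type, used only for this illustration.
class NotAnIntegerException extends InvalidArgumentException
{
}

function add($x, $y)
{
    // Expression form: no string to evaluate, and the whole statement
    // is compiled out when zend.assertions = -1.
    assert(is_int($x), new NotAnIntegerException('$x is not an int'));
    assert(is_int($y), '$y is not an int');

    return $x + $y;
}

echo add(2, 3), "\n"; // 5
```

What a failing call such as add('2', 3) does depends entirely on the two ini settings: it may throw, warn, or do nothing at all, which is exactly why assertions are no substitute for input validation.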
Group Use Declarations
If you’ve adopted namespaces (and you should have), the top of many of your class files may be littered with lots of different use statements. These could include any classes that you’re relying on, and since 5.6, you can also use use to import constants or functions from namespaces as well. In PHP 7, you can now group classes from the same namespace into a single use statement.
In PHP 5.3+:
use MyVendor\Namespaced\ClassA;
use MyVendor\Namespaced\ClassB;
use MyVendor\Namespaced\ClassC;
In PHP 7, we can combine these:
use MyVendor\Namespaced\{ClassA, ClassB, ClassC};
The same syntax works for constants and functions but they’ll still need their own line, even if they came from the same namespace as your classes:
use function MyVendor\Namespaced\{functionA, functionB, functionC as C};
use const MyVendor\Namespaced\{CONST_A, CONST_B, CONST_C};
It’s a pretty nice way to reduce the overall number of lines of code and potentially increase legibility by grouping common imports into a single line.
Generator Return Statements
I like generators. I’ve got a few in production that help to keep large file or dataset processing down to essentially constant memory. Generators are special functions that use the keyword yield to send back a value to code that is iterating over the generator. They can be used to create infinite sequences or handle large datasets one item at a time without needing to store everything in memory. Prior to PHP 7, the only way you could use the return keyword in a generator was to use it with no value. With PHP 7, you can use return with a value and you can retrieve that returned value with the ->getReturn() call.
function firstFewPiDigits()
{
yield 3;
yield 1;
yield 4;
return 'mmm Pi!';
}
$piDigits = firstFewPiDigits();
foreach ($piDigits as $digit) {
echo $digit;
}
echo "\n\n" . $piDigits->getReturn();
The code above will output:
314
mmm Pi!
If you try to call ->getReturn() on a generator that hasn’t returned, you’ll get an Exception with the message Cannot get return value of a generator that hasn't returned.
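A slightly more practical use is letting the generator hand back a summary of the work it did once iteration is over. In this sketch, lineLengths() is a made-up function that yields the length of each line and returns the running total:

```php
<?php
// Hypothetical example: yield each line's length, return the total.
function lineLengths(array $lines)
{
    $total = 0;
    foreach ($lines as $line) {
        $length = strlen($line);
        $total += $length;
        yield $length;
    }

    return $total; // available via getReturn() once iteration finishes
}

$lengths = lineLengths(['php', 'seven', 'rocks']);
foreach ($lengths as $length) {
    echo $length, ' ';
}
echo "\nTotal: ", $lengths->getReturn(), "\n"; // Total: 13
```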
Generator Delegation
Speaking of generators, as of PHP 7, generators can now delegate to another generator, Traversable or array without needing to do anything fancy in your outermost generator. Take a look:
function someDigitsOfE()
{
yield 2;
yield '.';
yield 7;
yield from andrewJacksonElectionYear();
yield from andrewJacksonElectionYear();
yield from halfRightAngle();
yield from rightAngle();
yield from halfRightAngle();
}
function andrewJacksonElectionYear()
{
yield from [1, 8, 2, 8];
}
function halfRightAngle()
{
yield 45;
}
function rightAngle()
{
yield 90;
}
We can now iterate over the someDigitsOfE generator, like so:
foreach (someDigitsOfE() as $digit) {
echo $digit;
}
The output will be 2.718281828459045. I think that’s pretty neat. But, as I said, I like generators.
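Delegation and return values work together: yield from evaluates to whatever the inner generator returned. The sketch below, with made-up inner() and outer() functions, shows that, along with one gotcha worth knowing. The keys from the inner generator are kept, so if you collect everything with iterator_to_array you’ll want to pass false as the second argument to stop duplicate keys from overwriting each other.

```php
<?php
function inner()
{
    yield 1;
    yield 2;

    return 'inner done';
}

function outer()
{
    // yield from evaluates to the inner generator's return value.
    $result = yield from inner();
    yield 3;

    return $result;
}

$gen = outer();

// Pass false so keys are not preserved: inner() and outer() both start
// their keys at 0, and preserved keys would overwrite one another.
$values = iterator_to_array($gen, false);

echo implode(',', $values), "\n"; // 1,2,3
echo $gen->getReturn(), "\n";     // inner done
```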
Integer Division
In many other languages (mostly strictly typed), dividing an integer by another integer results in an integer, even when the divisor doesn’t divide evenly; the fractional part is simply discarded. In PHP, we end up with a floating point value (a number that contains a decimal and fractional part). Sometimes this is desirable, but other times we really want just the integer portion without needing to jump through hoops of calling floor or doing silly string manipulations to blow up the value and remove the decimal. PHP 7 has you covered now with the new intdiv function.
echo intdiv(100, 30); // echoes 3
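One detail to keep in mind if you’re replacing floor calls: intdiv truncates toward zero, so the two disagree for negative results. A quick sketch of the difference, plus what happens when you divide by zero:

```php
<?php
// intdiv() truncates toward zero, which differs from floor() for negatives.
var_dump(intdiv(7, 2));        // int(3)
var_dump(intdiv(-7, 2));       // int(-3)
var_dump((int) floor(-7 / 2)); // int(-4)

// Dividing by zero with intdiv() throws a DivisionByZeroError.
$caught = null;
try {
    intdiv(1, 0);
} catch (DivisionByZeroError $e) {
    $caught = get_class($e);
    echo $caught, "\n";        // DivisionByZeroError
}
```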
A Little More on Strict Types
I mentioned in the previous article that I thought that PHP getting scalar typehints and the ability to enable strict typehinting is a good thing. It is a change from how we’re all used to dealing with PHP, but it’s one that I recommend you look into and practice. If you’ve coded in C or C++ or Java, or Swift or any other of the numerous strongly typed languages, using strict types may not seem so weird. PHP has been historically extremely forgiving and accommodating of mixing and matching different types. Sometimes this is helpful, but it also leads to problems that can be extremely hard to track down and can be very frustrating.
When we retrieve values from $_GET, $_REQUEST or $_POST, they are strings. Even if the values should represent numbers, they come across as strings. PHP does its best to let us as developers forget about whether we’re dealing with the number 2 or the string ‘2’. If you add the string ‘2’ and the number 2 in PHP, you’ll get 4. If you do the same in JavaScript, it adds them the way a two year old learning about math might. Instead of 4, you get ‘22’. PHP assumes with strings that could be interpreted as numbers that we mean to treat them like numbers, whereas JavaScript assumes that if you are trying to add something to a string you meant it as a string and so it coerces the number 2 to the string ‘2’ and concatenates the values. Both of these situations are ones we deal with all the time, but they arguably result in unexpected behavior.
In languages that are strictly typed, we must explicitly define what we expect when we want to combine values that are not of similar types. When we can rely on the language enforcing variable types, rather than limit us, it opens up new possibilities and removes the need for a lot of the error checking we’d normally have to do. Let’s revisit the add function from before:
function add($x, $y)
{
return $x + $y;
}
As it stands right now, we could pass in anything at all to the parameters of this function. If I pass in numbers, I get the sum of them. If I pass in two objects, I’ll get a result of int(2) (along with a notice). If I pass in strings… well, it depends on what’s in them. If the strings are something like a couple of names, then I’ll get a return value of 0. However, the following may be a bit unexpected.
echo add(9, '1591 Pennsylvania Ave.');
In this case, PHP will output 1600 as it feels that I’m clearly trying to add 9 and the 1591 portion of the address. Personally, I cannot think of a reason I’d need to legitimately do this, so I’d much prefer that PHP or my editor could tell me when I’m about to do The Dumb™.
If I take add and use scalar type hints, I can get some more predictable behavior:
function add(int $x, int $y) : int
{
return $x + $y;
}
In this case, without adding anything more, I get “coercive” type hints meaning that PHP will send in integers to my function as long as whatever parameters were used originally can be converted to an integer cleanly. This means I’ll get results like the following:
echo add(1, 2); // 3
echo add('2', '3'); //5
echo add(1.1, 9.9); // 10, not 11 (numbers converted to int before passed in)
So far this makes sense. I’ve got a different result for the last example than I would have without the typehint, but it’s a reasonable change. The next example I’m less happy with but at least the notice gives me a clue that I’m attempting The Dumb™.
echo add(1, '1600 Pennsylvania Ave'); // 1601, but with a notice
// Notice: A non well formed numeric value encountered
And if I try with something even more ridiculous, like trying to add objects, I get the following.
echo add(new stdClass, new stdClass);
Warning: Uncaught TypeError: Argument 1 passed to add() must be of the type integer, object given, called in php shell code on line 1 and defined in php shell code:1
Stack trace:
#0 php shell code(1): add(Object(stdClass), Object(stdClass))
#1 {main}
thrown in php shell code on line 1
Inside my function, I know that I’ll be receiving integers as long as the function is actually called. It allows me to include the return type as well since I know that adding two integers will result in an integer. However, I’d like to take it one step further and make it fail outright for anything coming in that’s not an integer. To do this, I must add declare(strict_types=1); at the top of the file.
The biggest misunderstanding I’ve seen expressed so far is what is affected by this declaration. So I’ll wrap up this column by trying to explain how it works. The ‘strict_types’ statement only affects the file that contains it. Specifically, it only affects the function calls made in the files where that directive exists. This means that if I write declare(strict_types=1); at the top of the file and then define my add function with int type hints, and then I call it later (with non-integer arguments) in the same file, I’ll see an error. However, if I include this file in another file but do not include that directive, I’ll have coercive typehints for the calls I make in there.
In fact, if your file doesn’t contain any function or method calls (including method calls into other methods defined in that file) then that directive really isn’t needed. You’d still likely want to include your type hints and return types. It will be up to the user or caller of your functions and methods to decide whether they want strict types or coercive types. If you are that developer, it becomes your choice on a file-by-file basis whether you want strict types or not. The additional scalar type hints and return types allow for benefits such as additional IDE and static code analysis that can help prevent errors before the code is checked into source control.
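To make the calling-file rule concrete, here is a single-file sketch with strict mode switched on. Because the calls to add are made from this same strict file, the numeric strings are rejected rather than coerced. The declare line has to be the very first statement in the file for this to run.

```php
<?php
declare(strict_types=1); // must be the first statement in the file

function add(int $x, int $y): int
{
    return $x + $y;
}

echo add(2, 3), "\n"; // 5

$caught = false;
try {
    // This call is made from a strict file, so the strings are not coerced.
    echo add('2', '3'), "\n";
} catch (TypeError $e) {
    $caught = true;
    echo 'TypeError: ', $e->getMessage(), "\n";
}
```

If the function lived in this file but the add('2', '3') call were made from another file without the declare line, that call would fall back to coercive mode and return 5.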
Conclusion
I’ve said it before and I’ll say it again: PHP 7 is looking great. Between this article and the previous one, you should have a pretty good idea about what to expect in PHP. I’d definitely recommend grabbing the PHP 7 VM and working with PHP 7. Play around with the features we’ve talked about and start thinking about how you can take advantage. Personally, I’ve tried to see if I could monkey patch a method into an existing object, but so far, my experiments were not successful. Perhaps you’ll have better luck than I did. If you find problems, be sure to open bugs and help improve PHP 7. If you figure out whether patching a method onto an object is possible, please let me know. I’ll see you next month.
David Stockton is a husband, father and Software Engineer. He builds software in Colorado, leading a few teams of software developers creating a very diverse array of web applications. His two daughters, age 11 and 9, are learning to code JavaScript, Python, Scratch, a bit of Java and PHP as well as building electrical circuits. His 4 year old son has been seen studying calculus, practicing linguistics with regards to human biology, and is excelling at annoying his sisters. David is a conference speaker and an active proponent of TDD, APIs and elegant PHP. He's on twitter as @dstockto and can be reached by email at levelingup@davidstockton.com.
2015-11-01 20:00 -0700