Best way to test for a variable's existence in PHP; isset() is clearly broken


Question

From the isset() docs:

isset() will return FALSE if testing a variable that has been set to NULL.

Basically, isset() doesn't check for whether the variable is set at all, but whether it's set to anything but NULL.

Given that, what's the best way to actually check for the existence of a variable? I tried something like:

if(isset($v) || @is_null($v))

(the @ is necessary to avoid the warning when $v is not set) but is_null() has a similar problem to isset(): it returns TRUE on unset variables! It also appears that:

@($v === NULL)

works exactly like @is_null($v), so that's out, too.

How are we supposed to reliably check for the existence of a variable in PHP?


Edit: there is clearly a difference in PHP between variables that are not set, and variables that are set to NULL:

<?php
$a = array('b' => NULL);
var_dump($a);

PHP shows that $a['b'] exists, and has a NULL value. If you add:

var_dump(isset($a['b']));
var_dump(isset($a['c']));

you can see the ambiguity I'm talking about with the isset() function. Here's the output of all three of these var_dump()s:

array(1) {
  ["b"]=>
  NULL
}
bool(false)
bool(false)

Further edit: two things.

One, a use case. An array being turned into the data of an SQL UPDATE statement, where the array's keys are the table's columns, and the array's values are the values to be applied to each column. Any of the table's columns can hold a NULL value, signified by passing a NULL value in the array. You need a way to differentiate between an array key not existing, and an array's value being set to NULL; that's the difference between not updating the column's value and updating the column's value to NULL.

Second, Zoredache's answer, array_key_exists() works correctly, for my above use case and for any global variables:

<?php
$a = NULL;
var_dump(array_key_exists('a', $GLOBALS));
var_dump(array_key_exists('b', $GLOBALS));

outputs:

bool(true)
bool(false)

Since that properly handles just about everywhere I can see there being any ambiguity between variables that don't exist and variables that are set to NULL, I'm calling array_key_exists() the official easiest way in PHP to truly check for the existence of a variable.

(Only other case I can think of is for class properties, for which there's property_exists(), which, according to its docs, works similarly to array_key_exists() in that it properly distinguishes between not being set and being set to NULL.)

1
184
5/23/2017 11:54:54 AM

Accepted Answer

If the variable you are checking would be in the global scope you could do:

array_key_exists('v', $GLOBALS) 
95
4/24/2014 10:11:54 PM

Attempting to give an overview of the various discussions and answers:

There is no single answer to the question which can replace all the ways isset can be used. Some use cases are addressed by other functions, while others do not stand up to scrutiny, or have dubious value beyond code golf. Far from being "broken" or "inconsistent", other use cases demonstrate why isset's reaction to null is the logical behaviour.

Real use cases (with solutions)

1. Array keys

Arrays can be treated like collections of variables, with unset and isset treating them as though they were. However, since they can be iterated, counted, etc, a missing value is not the same as one whose value is null.

The answer in this case, is to use array_key_exists() instead of isset().

Since this is takes the array to check as a function argument, PHP will still raise "notices" if the array itself doesn't exist. In some cases, it can validly be argued that each dimension should have been initialised first, so the notice is doing its job. For other cases, a "recursive" array_key_exists function, which checked each dimension of the array in turn, would avoid this, but would basically be the same as @array_key_exists. It is also somewhat tangential to the handling of null values.

2. Object properties

In the traditional theory of "Object-Oriented Programming", encapsulation and polymorphism are key properties of objects; in a class-based OOP implementation like PHP's, the encapsulated properties are declared as part of the class definition, and given access levels (public, protected, or private).

However, PHP also allows you to dynamically add properties to an object, like you would keys to an array, and some people use class-less objects (technically, instances of the built in stdClass, which has no methods or private functionality) in a similar way to associative arrays. This leads to situations where a function may want to know if a particular property has been added to the object given to it.

As with array keys, a solution for checking object properties is included in the language, called, reasonably enough, property_exists.

Non-justifiable use cases, with discussion

3. register_globals, and other pollution of the global namespace

The register_globals feature added variables to the global scope whose names were determined by aspects of the HTTP request (GET and POST parameters, and cookies). This can lead to buggy and insecure code, which is why it has been disabled by default since PHP 4.2, released Aug 2000 and removed completely in PHP 5.4, released Mar 2012. However, it's possible that some systems are still running with this feature enabled or emulated. It's also possible to "pollute" the global namespace in other ways, using the global keyword, or $GLOBALS array.

Firstly, register_globals itself is unlikely to unexpectedly produce a null variable, since the GET, POST, and cookie values will always be strings (with '' still returning true from isset), and variables in the session should be entirely under the programmer's control.

Secondly, pollution of a variable with the value null is only an issue if this over-writes some previous initialization. "Over-writing" an uninitialized variable with null would only be problematic if code somewhere else was distinguishing between the two states, so on its own this possibility is an argument against making such a distinction.

4. get_defined_vars and compact

A few rarely-used functions in PHP, such as get_defined_vars and compact, allow you to treat variable names as though they were keys in an array. For global variables, the super-global array $GLOBALS allows similar access, and is more common. These methods of access will behave differently if a variable is not defined in the relevant scope.

Once you've decided to treat a set of variables as an array using one of these mechanisms, you can do all the same operations on it as on any normal array. Consequently, see 1.

Functionality that existed only to predict how these functions are about to behave (e.g. "will there be a key 'foo' in the array returned by get_defined_vars?") is superfluous, since you can simply run the function and find out with no ill effects.

4a. Variable variables ($$foo)

While not quite the same as functions which turn a set of variables into an associative array, most cases using "variable variables" ("assign to a variable named based on this other variable") can and should be changed to use an associative array instead.

A variable name, fundamentally, is the label given to a value by the programmer; if you're determining it at run-time, it's not really a label but a key in some key-value store. More practically, by not using an array, you are losing the ability to count, iterate, etc; it can also become impossible to have a variable "outside" the key-value store, since it might be over-written by $$foo.

Once changed to use an associative array, the code will be amenable to solution 1. Indirect object property access (e.g. $foo->$property_name) can be addressed with solution 2.

5. isset is so much easier to type than array_key_exists

I'm not sure this is really relevant, but yes, PHP's function names can be pretty long-winded and inconsistent sometimes. Apparently, pre-historic versions of PHP used a function name's length as a hash key, so Rasmus deliberately made up function names like htmlspecialchars so they would have an unusual number of characters...

Still, at least we're not writing Java, eh? ;)

6. Uninitialized variables have a type

The manual page on variable basics includes this statement:

Uninitialized variables have a default value of their type depending on the context in which they are used

I'm not sure whether there is some notion in the Zend Engine of "uninitialized but known type" or whether this is reading too much into the statement.

What is clear is that it makes no practical difference to their behaviour, since the behaviours described on that page for uninitialized variables are identical to the behaviour of a variable whose value is null. To pick one example, both $a and $b in this code will end up as the integer 42:

unset($a);
$a += 42;

$b = null;
$b += 42;

(The first will raise a notice about an undeclared variable, in an attempt to make you write better code, but it won't make any difference to how the code actually runs.)

99. Detecting if a function has run

(Keeping this one last, as it's much longer than the others. Maybe I'll edit it down later...)

Consider the following code:

$test_value = 'hello';
foreach ( $list_of_things as $thing ) {
    if ( some_test($thing, $test_value) ) {
        $result = some_function($thing);
    }
}
if ( isset($result) ) {
    echo 'The test passed at least once!';
}

If some_function can return null, there's a possibility that the echo won't be reached even though some_test returned true. The programmer's intention was to detect when $result had never been set, but PHP does not allow them to do so.

However, there are other problems with this approach, which become clear if you add an outer loop:

foreach ( $list_of_tests as $test_value ) {
    // something's missing here...
    foreach ( $list_of_things as $thing ) {
        if ( some_test($thing, $test_value) ) {
            $result = some_function($thing);
        }
    }
    if ( isset($result) ) {
        echo 'The test passed at least once!';
    }
}

Because $result is never initialized explicitly, it will take on a value when the very first test passes, making it impossible to tell whether subsequent tests passed or not. This is actually an extremely common bug when variables aren't initialised properly.

To fix this, we need to do something on the line where I've commented that something's missing. The most obvious solution is to set $result to a "terminal value" that some_function can never return; if this is null, then the rest of the code will work fine. If there is no natural candidate for a terminal value because some_function has an extremely unpredictable return type (which is probably a bad sign in itself), then an additional boolean value, e.g. $found, could be used instead.

Thought experiment one: the very_null constant

PHP could theoretically provide a special constant - as well as null - for use as a terminal value here; presumably, it would be illegal to return this from a function, or it would be coerced to null, and the same would probably apply to passing it in as a function argument. That would make this very specific case slightly simpler, but as soon as you decided to re-factor the code - for instance, to put the inner loop into a separate function - it would become useless. If the constant could be passed between functions, you could not guarantee that some_function would not return it, so it would no longer be useful as a universal terminal value.

The argument for detecting uninitialised variables in this case boils down to the argument for that special constant: if you replace the comment with unset($result), and treat that differently from $result = null, you are introducing a "value" for $result that cannot be passed around, and can only be detected by specific built-in functions.

Thought experiment two: assignment counter

Another way of thinking about what the last if is asking is "has anything made an assignment to $result?" Rather than considering it to be a special value of $result, you could maybe think of this as "metadata" about the variable, a bit like Perl's "variable tainting". So rather than isset you might call it has_been_assigned_to, and rather than unset, reset_assignment_state.

But if so, why stop at a boolean? What if you want to know how many times the test passed; you could simply extend your metadata to an integer and have get_assignment_count and reset_assignment_count...

Obviously, adding such a feature would have a trade-off in complexity and performance of the language, so it would need to be carefully weighed against its expected usefulness. As with a very_null constant, it would be useful only in very narrow circumstances, and would be similarly resistant to re-factoring.

The hopefully-obvious question is why the PHP runtime engine should assume in advance that you want to keep track of such things, rather than leaving you to do it explicitly, using normal code.


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Icon