Etienne Kneuss

SplObjectStorage for a fast and secure object dictionary
The 8th of January 2009 @ 03:03

There has been some activity recently around the different ways to uniquely identify objects in PHP. In fact, a function has been sitting in SPL unnoticed for quite some time, and some of the people who came across it got frustrated. I'm of course talking about spl_object_hash(). To summarize: in PHP, you basically need two things to safely identify an object: an object index (the handle) and the class handlers, which determine how the object behaves internally. This set of handlers is actually a pointer, and since disclosing valid pointers is not something that should be done, spl_object_hash() simply provides an MD5 hash of those two values concatenated. Two problems come from this MD5 hash:

  • It's quite slow
  • It may generate collisions
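As a minimal sketch of what the function gives you (the variable names are only illustrative, and the exact hash contents depend on the PHP version), the following checks that the same instance always yields the same 32-character string while two live instances yield different ones:

```php
<?php
$a = new stdClass;
$b = new stdClass;

$hashA = spl_object_hash($a);
$hashB = spl_object_hash($b);

var_dump(strlen($hashA));                  // int(32)
var_dump($hashA === spl_object_hash($a));  // bool(true): same instance, same hash
var_dump($hashA === $hashB);               // bool(false): both objects are still alive
```

Keep in mind that the hash only identifies an object while that object exists; once it is destroyed, the same hash may be handed out again.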

One use of this hash that comes to mind is an object dictionary (or map) that attaches information to instances, for example:

<?php
$dict = array();
$dict[spl_object_hash($obj1)] = $info1;
$dict[spl_object_hash($obj2)] = $info2;
// and so on.
?>

Sadly, since PHP arrays are themselves hash tables, the hash gets hashed one more time, which is a waste of time.

Another example could be to mark nodes in a graph traversal algorithm, using a set of visited nodes.

SPL thankfully provides a class that can quite easily implement both examples without the problems stated above: SplObjectStorage (usable with array syntax as of PHP 5.3).
Since an example is better than a thousand words, here is a demonstration:

<?php
// Map
$dict = new SplObjectStorage;
$dict[$obj1] = $info1;
$dict[$obj2] = $info2;
var_dump($dict[$obj1]); // $info1

// Set
$set = new SplObjectStorage;
$set->attach($obj1);
var_dump($set->contains($obj1)); // true
?>

SplObjectStorage directly uses the unique identifier without pre-hashing it, so you save time and you are safe from collisions!
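The graph traversal case mentioned earlier could look roughly like the sketch below. The Node class and the traverse() function are hypothetical names made up for this illustration, and the depth-first order is just one possible choice; the point is only that the set of visited nodes is an SplObjectStorage keyed by the node objects themselves.

```php
<?php
// Hypothetical node type, for illustration only.
class Node
{
    public $name;
    public $neighbors = array();

    public function __construct($name)
    {
        $this->name = $name;
    }
}

// Depth-first traversal that returns node names in visit order.
function traverse(Node $start)
{
    $visited = new SplObjectStorage; // set of nodes already seen
    $stack   = array($start);
    $order   = array();

    while ($stack) {
        $node = array_pop($stack);
        if (isset($visited[$node])) {
            continue; // already handled, which also breaks cycles
        }
        $visited[$node] = true; // mark the node as visited
        $order[] = $node->name;
        foreach ($node->neighbors as $next) {
            $stack[] = $next;
        }
    }

    return $order;
}

$a = new Node('a');
$b = new Node('b');
$c = new Node('c');
$a->neighbors = array($b, $c);
$b->neighbors = array($a, $c); // cycle back to $a
$c->neighbors = array($a);

print_r(traverse($a)); // a, c, b
```

Even though the graph contains cycles, each node is reported once, because the membership test is done on the object itself rather than on a string derived from it.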

Comments

08.01.2009 #1 scott

Your code is nice and all but what practical use does the SplObjectStorage class have? Can you give me a real world example of when you would put something into $set and need to check if it is there later?

08.01.2009 #2 Jaik

The first example that comes to my mind is when using the Active Record pattern to ensure you only ever load one instance of a record.

08.01.2009 #3 Federico

It's a very useful class indeed. I remember in 2007 when I created Zend_Di_Storage_Object people kept telling me to use SplObjectStorage instead, thinking it was part of PHP 5.2, although SplObjectStorage wasn't added until 5.3.

08.01.2009 #4 Aaron

SplObjectStorage is indeed part of PHP 5.2, but it doesn't implement ArrayAccess and so lacks the capabilities described in the "Map" section of the last code block. Those capabilities are only available starting with PHP 5.3.

08.01.2009 #5 Elizabeth Smith

The name of this class is really unfortunate.

Because it's not "object storage" in the way many people think. It is useful, but for a case where you want to map an identifier to an object (think string or integer), toss the object in a container, and somewhere else get the object back out (using the key alone because you don't have the original object) this doesn't work.

Once you understand the object is the key for this implementation, not the data being stored, it's a lot easier to understand how it's intended to be used.

08.01.2009 #6 colder

The naming is a bit unfortunate and that's probably why it's often not considered even when it fits the task. I'd have loved this to be split into two classes, a Map and a Set.

It might be considered when SPL datastructures grow a bit.

21.01.2009 #7 David Grudl

SplObjectStorage has even been in PHP since 5.1.0 ;)

I hope spl_object_hash() will be replaced with spl_object_id(), without the mentioned disadvantages.

10.02.2009 #8 colder

spl_object_hash() has been improved to produce a 32-character hash that is really unique, without using MD5.
