Paul M. Jones

Testivus (for the Rest of Us) and the Testing Gene

The esteemed Sebastian Bergmann, author of PHPUnit, makes a great comment in my earlier post on testing and ravioli code.

In the comment, he links to the Testivus Manifestivus from Alberto Savoia, which is about as non-dogmatic about testing as you can get, while still highlighting the importance of testing. You should read the whole thing, but here are some of the highlights.

Less testing dogma, more testing karma

Dogma can be informally described as authoritative principles, beliefs, often considered to be absolutely true. Testivus tries to keep testing dogma to a minimum. What works for some people sometimes, may not work for some other people, or even the same people at some other time.

Karma, on the other hand, can be informally explained as: "Do good things, and good things will happen to you." We believe that writing and running tests is a good thing so if you write and run tests good things will happen to you ... well perhaps just to your code.

We'd like to say that this is the central tenet of Testivus, but calling something a tenet would be too dogmatic.

Any tests are better than no tests

Self-explanatory and inspired by Martin Fowler, who once wrote "Imperfect tests, run frequently, are much better than perfect tests that are never written at all".

Savoia follows with "Testing beats debugging", "Test first, during, or after -- whatever works best for you", "If a method, technique, or tool, gives you more or better tests use it". While I may have issues with test-first and TDD, I am fully in support of Testivus.

What's interesting about Testivus is that it is the result of an earlier Savoia post about susceptibility to test-infection. That entry from Savoia is good too; he seems to approach things from a "how do people actually work" point of view, rather than "what would a perfect mode of working be". I strongly suggest you read it, if only to determine if you are a T1, T2, or T3 (I think of myself as in the T2 camp). TDD dogmatists (hi Noel Darlow!) take note: for whatever reason, some people are highly resistant to test-first, and airs of moral superiority and/or condescension do little to help your cause among the T2 and T3.

Update (2007-08-19): Alberto Savoia in the comments below notes that there are new extended versions of the Testivus Manifestivus.

Thanks for the new links.

TDD, Test-First, and Ravioli Code

I know that test-first and test-driven development (TDD) are popular methodologies these days, but something about those processes has always met with a level of mental resistance from me. Even though it sounds good in theory, I have been intuitively wary of the "test-first" mentality for some reason. My own approach is closer to "remember to code so you can test it later" and then "test-last" after the API is mostly stable. This comment from Slashdot is a good summary of my feelings on the matter.

Thinking about testing early -- good. Writing unit tests -- good. The test driven development mentality (write tests instead of design, write unit tests before coding) -- bad. ... Thinking about testing early is useful, it may cause you to think about corner cases. But writing them first causes 2 problems -- you end up writing the code to solve the tests (rather than solving the problem) and/or you end up throwing away half the test suite in the middle when you refactor the design.

-- AuMatar

Merely having tests does not guarantee that you're solving the right problem, or solving it well, or that your solution is comprehensible to others.

In the same conversation, someone brings up the term "ravioli code", which I had never heard before. The idea is that instead of long strings of procedural "spaghetti" that are difficult to untangle, there are lumps of methods jumbled together:

The problem is that it [Ravioli Code] tends to lead to functions (methods, etc.) without true coherence, and it often leaves the code to implement even something fairly simple scattered over a very large number of functions. Anyone having to maintain the code has to understand how all the calls between all the bits work, recreating almost all the badness of Spaghetti Code except with function calls instead of GOTO. It is far better to ensure that each function has a strong consistent description ... rather than splitting it up into smaller pieces ("stage 1 of preparing to frobnicate the foo part of the foobar", etc.) with less coherence. The principal reason why this is better is precisely that it makes the code easier overall to understand.

...

People who deliberately split code up into ravioli, and especially those who advocate that others do so, are "dangerous idiots" precisely because they've lost sight of the fundamental need for comprehensibility ...

-- dkf

I think that a lot of TDD and test-first idealists and evangelists* end up with ravioli code that is well-tested but still difficult to comprehend.

Some people will read this and think that I am against unit testing; that would be an incorrect interpretation. TDD and test-first dogmatism are what bother me, not testing per se.

(* Note that I said "idealists and evangelists" -- not all of the test-firsties are like this.)

Speaking at PHP Works 2007

I got stiff-armed for ZendCon 2007 (apparently they don't want presentations on competing frameworks ;-).

However, the PHP Architect folks graciously accepted two of my talk proposals for php|works 2007 in Atlanta:

Organizing Your PHP Projects, an updated version of the same talk I gave last year, and
Framework and Application Benchmarking, a presentation based on my popular New Year's Benchmarks research, including why (and how) to apply the same methods to your own work.

In a way, the two talks complement each other, but you'll have to attend both to see why. ;-)

Hope to see you at the conference!

Solar 0.28 Alpha Released

Last Friday I released Solar 0.28 alpha. (As usual, the guys on the mailing list got notification of this on the same day.)

This is the first release in four months. The last time I delayed so long between releases I gave the change notes inline, but I won't punish readers that way again. ;-) If you really want to, you can see the very very long list of change notes here.

Over the next several days, I'll post more about individual developments in the framework, but here's a little to pique your interest:

New HTTP package with request and response classes
New Mail and SMTP packages
New page-controller magic
Incorporation of the Stenhouse CSS Framework into Solar_App_Base

In other news, it looks like Enygma at phpdeveloper.org gave Solar a try and liked it. Head over there and give him some comment-love! :-)

Solar Views and Layouts

Looks like the Zend Framework project doesn’t have “complex views” settled just yet. I’m sure they’ll hit on a solution soon. In the meantime, let me show you how easy it is to work with views and layouts in Solar, including automatic format discovery and inherited layouts.

Basic Directory Structure

By way of introduction, here is the directory structure for an example application. We’ll call the top-level namespace “Vendor”, and the application itself “Example”. (These would live in the PEAR directory next to the Solar installation.)

 Vendor/
     App/
         Example.php
         Example/
            View/
                hello.php
            Layout/
                default.php

Page Controller, View, and Layout

The example application is a simple “Hello World” page controller:

/* Vendor/App/Example.php */

class Vendor_App_Example extends Solar_Controller_Page {

    protected $_layout = 'default';

    public $foo;

    public $zim;

    public function actionHello()
    {
        // let's set some properties
        $this->foo = 'bar';
        $this->zim = 'gir';

        // Solar_Controller_Page automatically finds and renders the
        // 'hello.php' view, then takes that output and automatically
        // injects it into the 'default.php' layout.
    }
}

The view script in this case is dirt-simple, but you can use Solar_View helpers to jazz it up.

        <!-- Vendor/App/Example/View/hello.php -->
        <p>Hello, world!</p>
        <p>Foo is <?php echo $this->escape($this->foo) ?>.</p>

As with most 2-step view implementations, the view output is “injected” into the layout script. In this case, let’s use a bare-bones HTML layout.

<!-- Vendor/App/Example/Layout/default.php -->
<html>
    <head>
        <title>Example</title>
    </head>
    <body>
        <?php echo $this->layout_content ?>
    </body>
</html>

(The $layout_content property is automatically populated by the page-controller with the output of the rendered view.)

When you browse to http://example.com/example/hello, you should see this output from the application:

<!-- Vendor/App/Example/Layout/default.php -->
<html>
    <head>
        <title>Example</title>
    </head>
    <body>
        <!-- Vendor/App/Example/View/hello.php -->
        <p>Hello, world!</p>
        <p>Foo is bar.</p>
    </body>
</html>

Variable Assignment

Wait, how did $foo get into the view? The page-controller automatically assigns all public properties of the controller to the view object, so you don’t have to think about what gets set and what doesn’t. If a controller property is public, the view can use it.

Likewise, the page-controller assigns the same variables to the layout, so you have full access to them in your layouts as well. For example, we could change the layout script to use $zim as the title …

<!-- Vendor/App/Example/Layout/default.php -->
<html>
    <head>
        <title><?php echo $this->escape($this->zim) ?></title>
    </head>
    <body>
        <?php echo $this->layout_content ?>
    </body>
</html>

… and the output would become:

<!-- Vendor/App/Example/Layout/default.php -->
<html>
    <head>
        <title>gir</title>
    </head>
    <body>
        <!-- Vendor/App/Example/View/hello.php -->
        <p>Hello, world!</p>
        <p>Foo is bar.</p>
    </body>
</html>
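
The automatic assignment described above can be imitated in a few lines of plain PHP. This is only a sketch under stated assumptions, not how Solar itself does it: SketchController, SketchView, and sketch_public_vars() are hypothetical names invented for this illustration. It relies on the fact that get_object_vars(), when called from outside the class, returns only the public properties.

```php
<?php
// Hypothetical stand-in for a page controller; not a Solar class.
class SketchController
{
    protected $_layout = 'default'; // not public, so never handed to the view
    public $foo = 'bar';
    public $zim = 'gir';
}

// Hypothetical stand-in for a view object.
class SketchView
{
    private $vars = array();

    // Copy a set of name => value pairs into the view.
    public function assign(array $vars)
    {
        $this->vars = array_merge($this->vars, $vars);
    }

    // Expose assigned values as $view->name; unknown names give null.
    public function __get($name)
    {
        return isset($this->vars[$name]) ? $this->vars[$name] : null;
    }

    public function escape($value)
    {
        return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    }
}

// Called from outside the class, get_object_vars() sees only the
// public properties; protected and private ones stay hidden.
function sketch_public_vars($object)
{
    return get_object_vars($object);
}

$view = new SketchView();
$view->assign(sketch_public_vars(new SketchController()));

echo '<p>Foo is ' . $view->escape($view->foo) . ".</p>\n";
```

Because the protected $_layout property is never copied, the view sees $foo and $zim but nothing internal to the controller.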

Other Layouts, or No Layout

If you want to use a layout other than the default one, just change $this->_layout to the one you want to use. First, add the layout script:

 Vendor/
     App/
         Example.php
         Example/
            View/
                hello.php
            Layout/
                default.php
                other.php

Then ask for it in your action:

/* Vendor/App/Example.php */

class Vendor_App_Example extends Solar_Controller_Page {

    protected $_layout = 'default';

    protected $_action_default = 'hello';

    public $foo;

    public $zim;

    public function actionHello()
    {
        // let's set some properties
        $this->foo = 'bar';
        $this->zim = 'gir';

        // let's use some other layout
        $this->_layout = 'other';
    }
}

If you don’t want to use a layout at all, set $this->_layout = null.

You can do the same thing for views; by default, the page controller looks for a view that matches the action name, but you can set $this->_view to the name of any view you like.

Multiple Formats

Now let’s say that we want to expose an XML version of our view. The Solar page-controller can look at the format-extension on a request and render the right view for you automatically. All you need to do is provide the view script for it – you do not have to change your controller logic at all.

Let’s add the XML view for our “hello” action (“hello.xml.php” below).

 Vendor/
     App/
         Example.php
         Example/
            View/
                hello.php
                hello.xml.php
            Layout/
                default.php
                other.php

The hello.xml.php view script looks like this:

<hello>
    <foo><?php echo $this->escape($this->foo) ?></foo>
    <zim><?php echo $this->escape($this->zim) ?></zim>
</hello>

Now when you browse to http://example.com/example/hello.xml (notice the added “.xml” at the end), you will get this output:

<hello>
    <foo>bar</foo>
    <zim>gir</zim>
</hello>

You can do this for any output format you like: .atom, .rss, and so on – and not have to change your controller logic at all.

Wait a minute, what happened to the layout? The Solar page-controller knows that if it receives a non-default format request, it should turn off the layout and use only the view.
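
To make the format rule concrete, here is a rough sketch of the decision in plain PHP. It is an illustration only: sketch_resolve() is a hypothetical function, not part of Solar, and it assumes the format is simply the extension on the last URI segment.

```php
<?php
// Hypothetical helper: given the last URI segment ("hello" or "hello.xml")
// and the controller's layout name, decide which view script to render
// and whether a layout applies.
function sketch_resolve($segment, $layout = 'default')
{
    // pathinfo() gives '' for the extension when the segment has none
    $format = pathinfo($segment, PATHINFO_EXTENSION);
    $action = pathinfo($segment, PATHINFO_FILENAME);

    if ($format === '') {
        // default format: plain view script, wrapped in the layout
        return array('view' => $action . '.php', 'layout' => $layout);
    }

    // non-default format: format-specific view script, no layout
    return array('view' => $action . '.' . $format . '.php', 'layout' => null);
}

$html = sketch_resolve('hello');
$xml  = sketch_resolve('hello.xml');
```

With these inputs, $html asks for hello.php inside the default layout, while $xml asks for hello.xml.php with the layout turned off.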

Shared Layouts

Now, what if you have a layout or view that you want to share among multiple page controllers? This is pretty easy, too. First, define a “base” controller from which other controllers will extend, then put the shared layouts there.

 Vendor/
     App/
         Base.php
         Base/
            Layout/
                default.php
                other.php

The base controller might look like this:

/* Vendor/App/Base.php */
class Vendor_App_Base extends Solar_Controller_Page {
    // nothing really needed here, unless you want
    // shared methods and properties too
}

Now the “example” controller extends the base controller:

/* Vendor/App/Example.php */
class Vendor_App_Example extends Vendor_App_Base {
    // ...
}

And you can remove the layouts from the example controller; it will automatically look up through the class-inheritance hierarchy to find the requested layout.

 Vendor/
     App/
         Base.php
         Base/
            Layout/
                default.php
                other.php
         Example.php
         Example/
            View/
                hello.php

You can override the shared layouts with local ones if you want to. If you have Example/Layout/default.php the page-controller will use it instead of the shared one.

This works with views too. Put any views you want to share in Base/View, and the page-controller will find them if they don’t exist in the Example/View directory.
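
The lookup through the class hierarchy can be sketched like this. The class names and sketch_find_template() are hypothetical, and a plain array of paths stands in for the filesystem so the example is self-contained; Solar's own lookup may differ in detail.

```php
<?php
// Hypothetical classes mirroring the Base/Example relationship above.
class Sketch_Page {}
class Sketch_App_Base extends Sketch_Page {}
class Sketch_App_Example extends Sketch_App_Base {}

// Walk from the given class up through its parents and return the
// first template path that exists. $available stands in for file_exists().
function sketch_find_template($class, $type, $name, array $available)
{
    while ($class !== false) {
        // Sketch_App_Example => Sketch/App/Example/Layout/default.php
        $file = str_replace('_', '/', $class) . "/$type/$name.php";
        if (in_array($file, $available, true)) {
            return $file;
        }
        // get_parent_class() returns false at the top of the hierarchy
        $class = get_parent_class($class);
    }
    return null;
}

$shared_only = array(
    'Sketch/App/Base/Layout/default.php',
    'Sketch/App/Example/View/hello.php',
);
$shared = sketch_find_template('Sketch_App_Example', 'Layout', 'default', $shared_only);

// Adding a local layout makes it win over the shared one.
$with_local = array_merge($shared_only, array('Sketch/App/Example/Layout/default.php'));
$local = sketch_find_template('Sketch_App_Example', 'Layout', 'default', $with_local);
```

Searching the child class first is what lets a local Example/Layout/default.php override the shared one without any extra configuration.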

That’s All For Now

Questions? Comments? Leave a message below, or join the Solar mailing list and chime in there – we’d be happy to have you around.

Zend Devzone Podcast: Solar Overview

Cal Evans at Zend has posted my Solar Overview podcast. Thanks Cal! Click through to see the transcript ("script", really ;-) of the audio.

---

In this episode, I'm going to give a brief overview of the Solar project and how it helps with the mundane aspects of building applications.

Solar is an open-source library and framework for PHP 5; you can read more about it at solarphp.com.

Some early versions of Solar formed the basis of some parts of the Zend Framework, in particular the database and view components (this was around late 2005 and early 2006). Since that time, the two projects have continued to mature along separate paths, but the structure and organization of the two projects are still very similar, with one class per file using PEAR coding-style standards and E_STRICT compliance.

Solar originated from my attempts to build a PHP 4 framework out of PEAR components, a project I called Yawp. However, I encountered a number of difficulties in doing so. One of the biggest problems was how to have each component automatically configure itself at construction time. It turns out that even though PEAR has a great set of coding-style standards, the different packages all have different construction and configuration mechanisms. This meant that for each package I wanted to include in Yawp, I would have to build a wrapper specifically for it.

As a result, Solar uses a unified construction and configuration mechanism. Solar uses a single configuration file that returns a PHP array, so there's no parsing of ini, yaml, or xml files. The configuration values are all keyed on the related class name. All constructors have a single parameter, a config array, and the base constructor merges those values with the default values from the config file automatically. This means that if you write a class for Solar, it will configure itself automatically at the moment you instantiate it.

Additionally, and not to go on too long about configuration issues, but child classes inherit their parents' default configuration values. This means that if you want a set of common configurations for a particular hierarchy, you can set them once in the config file, and all the child classes will use those -- although you can override those in the child class, of course.

So that's one thing that Solar automates for you, construction and configuration.
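
For readers of the transcript, the configuration idea can be sketched in plain PHP. This is an illustration under assumptions, not Solar's actual base class: Sketch_Base, its $config_file property, and the config keys are all hypothetical names made up for the example.

```php
<?php
class Sketch_Base
{
    // Stand-in for the array a config file would return, keyed on class name.
    public static $config_file = array();

    public $config = array();

    // Every constructor takes exactly one parameter: a config array.
    public function __construct(array $config = array())
    {
        // Build the class chain from the most distant parent down to this class.
        $chain = array();
        for ($class = get_class($this); $class !== false; $class = get_parent_class($class)) {
            array_unshift($chain, $class);
        }

        // Parent values first, so child classes override them.
        $merged = array();
        foreach ($chain as $class) {
            if (isset(self::$config_file[$class])) {
                $merged = array_merge($merged, self::$config_file[$class]);
            }
        }

        // Values passed at construction time win over everything else.
        $this->config = array_merge($merged, $config);
    }
}

class Sketch_Sql extends Sketch_Base {}
class Sketch_Sql_Mysql extends Sketch_Sql {}

Sketch_Base::$config_file = array(
    'Sketch_Sql'       => array('host' => 'db.example.com', 'port' => 3306),
    'Sketch_Sql_Mysql' => array('port' => 3307),
);

$sql = new Sketch_Sql_Mysql(array('name' => 'blog'));
```

Here $sql->config picks up 'host' from the parent's entry, 'port' from its own entry, and 'name' from the constructor argument.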

Something else Solar has is built-in localization. Each class that needs localized text has a subfolder called "Locale", with a file for each country-and-language code. Like the config file, the locale file just returns a PHP array, with translation keys and string values. The Solar base class provides a method called locale() that automatically loads the right file for the class and returns singular or plural translations from that file. As with everything else in Solar, locale files are inherited along class hierarchies, so a child class uses its parent locale file by default, and can override any or all of the parent translations if it needs to.
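
The same inheritance idea applied to locale files might look like the following. It is a hedged sketch: sketch_locale(), the class names, and the translation keys are invented for illustration, and an in-memory array stands in for the per-class "Locale" files.

```php
<?php
// Hypothetical classes; only the parent/child relationship matters here.
class Sketch_Parent {}
class Sketch_Child extends Sketch_Parent {}

// Stand-in for the per-class locale files, each of which would return
// an array like these. Keys and strings are made up for the example.
$sketch_locale_files = array(
    'Sketch_Parent' => array(
        'en_US' => array(
            'TEXT_GREETING' => 'Hello',
            'TEXT_APPLE'    => array('apple', 'apples'),
        ),
    ),
    'Sketch_Child' => array(
        'en_US' => array(
            'TEXT_GREETING' => 'Howdy',
        ),
    ),
);

// Look for a translation key in the class's own strings first, then in
// each parent's; return the singular or plural form based on $num.
function sketch_locale(array $files, $class, $key, $num = 1, $code = 'en_US')
{
    for (; $class !== false; $class = get_parent_class($class)) {
        if (isset($files[$class][$code][$key])) {
            $string = $files[$class][$code][$key];
            if (is_array($string)) {
                return $num == 1 ? $string[0] : $string[1];
            }
            return $string;
        }
    }
    // no translation anywhere in the hierarchy: fall back to the key
    return $key;
}

$greeting = sketch_locale($sketch_locale_files, 'Sketch_Child', 'TEXT_GREETING');
$apples   = sketch_locale($sketch_locale_files, 'Sketch_Child', 'TEXT_APPLE', 3);
```

The child overrides the greeting but inherits the apple strings from its parent, which is the behavior described above.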

The Solar base class comes with another method that helps in throwing exceptions. If you need to throw an exception in Solar, you call that method and give it a string error code. It then looks for the right exception class file for that code, gets a localized exception message from the locale file if one exists, and returns it for throwing. By now, you may have guessed that child classes inherit their parents' exceptions, so you can start with very generic exception classes, and add specific ones as you need them, all without having to change your call for throwing the exception.

As you can tell, Solar makes a lot of use of class inheritance hierarchies as an organization and automation tool. This is only one example of Solar's conceptual integrity, which is one of Solar's greatest strengths. Anywhere you look in Solar, you will see things being done almost exactly the same way every time. We even go so far as to have standard names for methods. In some projects, the words "get" and "fetch" are interchangeable; in Solar, they have well-defined separate meanings. (As an aside, "fetch" means the method reads information from some external source and returns it, while "get" reads from a property or other internal value.)

Finally, Solar is built from the ground up with name-spacing in mind. While PHP doesn't have real name-spaces, they are easy to emulate through a naming convention. Solar classes are fully name-spaced, and Solar expects that developers will also name-space the code they build to work with it. Few if any PHP frameworks are fully name-spaced this way; some are name-spaced themselves, but do not allow for developers to extend into a different name-space if needed.

For example, a Zend Framework controller for "users" has to be called UserController; the moment you try to combine two separate projects that each have a UserController, you will get name conflicts. In Solar, you are expected to pick a top-level name-space for yourself and then extend into it; for example, you might have MyProject_App_Users. Because Solar already makes such extensive use of class hierarchies, it can tell how to address the controller automatically, even though it's in a different name-space.

This has been just a brief overview of how the Solar Framework for PHP is organized; if you want to learn more, please visit solarphp.com.

A Bit About Benchmarks

As the author of a relatively popular benchmarking article, I feel compelled to respond to this bit of misguided analysis from the Symfony camp about benchmarks.

Full disclosure: I am the lead developer on the Solar framework, and was a founding contributor to the Zend framework.

M. Zaninotto sets up a number of straw-man arguments regarding comparative benchmarks in general, although he does not link to any specific research. In doing so, he misses the point of comparative benchmarking almost entirely. Herein I will address some of M. Zaninotto’s arguments individually in reference to my previous benchmarking series.

All of the following commentary regards benchmarking and its usefulness in decision-making, and should not be construed as a general-purpose endorsement or indictment of any particular framework. Some frameworks are slower than others, and some are faster, and I think knowing “how fast is the framework?” is an important consideration when allocating scarce resources like time, money, servers, etc.

And now, on to a point-by-point response!

Symfony is not slow in the meaning of “not optimized”

But it *is* slow in the meaning of “relative to other frameworks.”

Regarding the title of M. Zaninotto’s article, I don’t know of any reputable benchmark projects that conclude Symfony is “too slow for real-world usage” in general. (Perhaps M. Zaninotto would link to such a statement?) Of course, the definition of “real-world” is subjective; the requirements of some applications are not necessarily the same as others.

What is not subjective is the responsiveness of Symfony when compared to other frameworks in a controlled scenario: for a target of 500 req/sec, you are likely to need more servers to balance across with Symfony than with Cake, Solar or Zend. This is implied by my earlier benchmarking article.

If some benchmarks show that symfony is slower, jumping to the conclusion that symfony is not optimized is a big mistake.

I don’t know of any comparative benchmark research that concludes “Symfony is not optimized.” M. Zaninotto is arguing against a point that no benchmark project seems to be making. (Note that the benchmarks I generated explicitly attempt to use each framework in its most-optimized dynamic state, including opcode caching. You can even download the source of the benchmarking code to see what the optimizations are.)

I'd say that people who take this shortcut are either way smarter than us, or they don't know what profiling is, they didn't look at the symfony code, and they can't make the difference between efficient code and a bottle of beer.

Profiling to optimize a subset of code lines is not the same as benchmarking the responsiveness of a dynamic controller/action/view dispatch sequence. (The combined speed of the code blocks is taken into account by the nature of the benchmark.)

So for instance, you will not find this code in symfony:

for ($i = 0; $i<count($my_array); $i++)

instead, we try to always do:

for ($i = 0, $count = count($my_array); $i<$count; $i++)

This is because we know it makes a difference.

How do we know it? Because we measured. We do use profiling tools a lot, on our own applications as well as on symfony itself. In fact, if you look at the symfony code, you can see that there are numerous optimizations already in place.

I agree that the second code-block is much better than the first speedwise. (N.b.: the first one calls count() on each cycle of the loop, whereas the second one calls count() only once.)

But if that faster piece is called only once or twice, and another much-slower piece is called two or three times, the overall effect is to slow down the system as a whole. Optimizing individual blocks of code does not necessarily result in a fast system as a whole.
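
If you want to see the difference between the two loop forms for yourself, a rough timing harness like this one will do. It is only a sketch: the array size is arbitrary, and the timings it prints will vary by machine and PHP version, so no particular result is claimed here.

```php
<?php
// An arbitrary test array; the size is chosen only for illustration.
$my_array = range(1, 100000);

// First form: count() is evaluated on every pass through the loop.
$start = microtime(true);
$sum_a = 0;
for ($i = 0; $i < count($my_array); $i++) {
    $sum_a += $my_array[$i];
}
$time_a = microtime(true) - $start;

// Second form: count() is evaluated once, before the loop starts.
$start = microtime(true);
$sum_b = 0;
for ($i = 0, $count = count($my_array); $i < $count; $i++) {
    $sum_b += $my_array[$i];
}
$time_b = microtime(true) - $start;

printf("count() on each pass: %.6f sec\n", $time_a);
printf("count() once:         %.6f sec\n", $time_b);
```

Both loops compute the same sum, so whatever difference shows up is attributable to the loop condition. Note that this measures one block in isolation, which is exactly the kind of measurement that says little about the dispatch cycle as a whole.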

And if you use a profiling tool yourself on a symfony application, you will probably see that there is no way to significantly optimize symfony without cutting through its features.

… at least, not without rewriting the system as a whole using a different and more-responsive architecture.

Of course, there might still be a lot of small optimizations possible here and there.

I think one would need a lot of “small optimizations” to make the 41 percentage-point gain necessary to equal the next faster dispatch cycle of Cake (per my benchmarking article; your mileage may vary).

Symfony results from a vision of what would the perfect tool for developers, based on our experience. For instance, we decided that output escaping should be included by default, and that configuration should be written preferably in YAML. This is because output escaping protects applications from cross-site scripting (XSS) attacks, and because YAML files are much easier to read and write than XML. I could name similar arguments about security, validation, multiple environments, and all the other features of symfony. We didn't add them to symfony because it was fun. We added them because you need them in almost every web application.

I don’t see how this is different from how Cake, Solar, or Zend approached their development process. Each of those frameworks has output escaping, configuration (either by YAML or by much-faster native PHP arrays), security, validation, multiple environment support, etc. (Those frameworks still perform a dynamic controller/action/view dispatch faster than Symfony does.)

It is very easy to add a new server to boost the performance of a website, it is very hard to add a new developer to an existing project to make it complete faster. Benchmarking the speed of a “Hello, world” script makes little sense

M. Zaninotto completely misses the point here.

At least for my own benchmarking series, the purpose is not merely to say “this one is faster!” but to say “you can only get so much responsiveness from any particular framework, I wonder how each compares to other frameworks?”

A “hello world” application is the simplest possible thing you can do with a dynamic controller/action/view dispatch, and so it marks the most-responsive point of the framework. Your application code cannot get faster than the framework it’s based on, and the “hello world” app tells you how fast the framework is.

Based on that information, you can get an idea how many servers you will need to handle a particular requests-per-second load. Based on my benchmarking, you are likely to need more servers with a Symfony-based app than with a comparable application in Cake, Solar, or Zend. This is about resource usage prediction, not speed for its own sake.
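
The resource-prediction arithmetic is simple enough to sketch. The per-server rates below are made-up placeholder numbers, not results from my benchmarks; sketch_servers_needed() is a hypothetical helper, and it treats the "hello world" rate as an upper bound per server, so a real application would need at least this many.

```php
<?php
// Minimum number of servers needed to reach a target request rate,
// given the best-case requests/second a single server can deliver.
function sketch_servers_needed($target_rps, $per_server_rps)
{
    return (int) ceil($target_rps / $per_server_rps);
}

$target = 500; // requests per second the site must handle

// Placeholder rates, invented for illustration only.
$servers_faster = sketch_servers_needed($target, 120);
$servers_slower = sketch_servers_needed($target, 60);

printf("Faster framework: at least %d servers\n", $servers_faster);
printf("Slower framework: at least %d servers\n", $servers_slower);
```

With these placeholder rates the faster framework needs at least 5 servers and the slower one at least 9; plug in your own measured numbers to get an estimate for your setup.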

Using plain PHP will make your application way faster than using a framework. Nevertheless, none of the framework benchmarks actually compare frameworks to a naked language.

Incorrect; my benchmarking series specifically compares all the frameworks to a plain PHP “echo ‘hello world’” so you can see what the responsiveness limits are for PHP itself. I also compare the responsiveness of serving a plain-text ‘hello world’ file without PHP, to see what the limits are for the web server. These numbers become important for caching static and semi-static pages.

... none of the framework benchmarks actually compare frameworks to a naked language. This is because it doesn't make sense.

Incorrect again. It does make sense to do so, because you can use a framework to cache a static or semi-static page. Caching lets you avoid the dynamic controller/action/view dispatch cycle and improve responsiveness dramatically. However, if your requests-per-second requirements are higher even than that provided by caching, you’ll still need more servers to handle the load. Again, this is about resource usage, not speed per se.

If frameworks exist, it is not for the purpose of speed, it is for ease of programming, and to decrease the cost of a line of code. This cost not only consists of the time to write it, but also the time to test it, to refactor it, to deploy it, to host it, and to maintain it over several years.

Ease of programming is a valid concern … and so is resource usage. If you can get comparable ease-of-use in a different framework, and it’s also more responsive, it would seem to make sense to use the less resource-intensive one. (Of course, measuring ease-of-use and programmer productivity is much harder than measuring responsiveness – the plural of “anecdote” is not “data”. ;-)

It doesn't make much more sense to compare frameworks that don't propose the same number of features. The next time you see symfony compared with another framework on a “hello, world”, try to check if the other framework has i18n, output escaping, Ajax helpers, validation, and ORM turned on. It will probably not be the case, so it's like comparing pears and apples.

I completely agree: one must compare like with like. And my benchmarking series attempts exactly that: all features that can be turned off are turned off: no ORM, no helpers, no validation, etc. Only the speed of the controller/action/view dispatch cycle is benchmarked, and Symfony still came out as the least-responsive with all those features turned off.

Also, how often do you see pages displaying “hello, world” in real life web applications? I never saw one. Web applications nowadays rely on very dynamic pages, with a large amount of code dedicated to the hip features dealing with communities, mashups, Ajax and rich UI. Surprisingly, such pages are never tested in framework benchmarks. And besides, even if you had a rough idea of the difference in performance between two frameworks on a complex page, you would have to balance this result with the time necessary to develop this very page with each framework.

M. Zaninotto is again missing the point; the idea is not to generate “hello world” but to see what the fastest response time for the framework is. You can’t do much less than “hello world”, so generating that kind of page measures the responsiveness of the framework itself, not the application built on top of the framework.

In a way, the above is M. Zaninotto’s strongest point. Any Ajax, rich UI, and other features you add after “hello world” will only reduce responsiveness, but it is difficult to measure how much they reduce responsiveness in a controlled manner (especially when comparing frameworks). It may be that some frameworks will degrade at a faster rate than others as these features are added. Having said that, Symfony starts at a much lower point on the responsiveness scale than other frameworks, so it doesn’t have as much leeway as other frameworks do.

The speed of a framework is not the most important argument

While not the most important argument, it is *an* important argument. And it is one we can reliably measure if we are careful – at least in comparison to other frameworks. Ignoring it is to ignore one of many important considerations.

And between two framework alternatives with comparable speed, a company will look at other factors to make a good decision.

Agreed – when the speeds are comparable, other factors will have stronger weight. This was the point of benchmarking a “hello world” implementation: to compare speed/responsiveness in a controlled fashion.

And if you need a second opinion, because you can't believe what the creator of a framework says about his own framework, perhaps you could listen to other companies who did choose symfony. Yahoo! picked symfony for a 20 Million users application, and I bet they measured the speed of a “hello, world” and other factors before making that decision. Many other large companies picked the symfony framework for applications that are crucial to their business, and they continue to trust us.

M. Zaninotto “bets” they measured it, but does not say “they did” measure it. I would be interested to hear what Yahoo themselves have to say about that experience. All public references to this seem to be from Symfony developers and user sites, not the Yahoo project team. (Yahoo folks, please feel free to contact me directly, pmjones88 -at- gmail -dot- com, or to leave a comment on this page.)

This page from the Symfony developers says that documentation, not speed, was Yahoo’s “first reason” to choose Symfony. It also says that Yahoo “extended and modified symfony to fit their needs,” which is plenty possible with Cake, Solar, and Zend.

Perhaps this is an example of a developer at Yahoo who used Symfony not because he compared it to other frameworks, but because he was already familiar with it or liked the way it looked. That would be perfectly fair, I think; we all pick what we like and then try to popularize it. But did Yahoo actually do a cost-benefit study (or even a simple “hello world” implementation comparison) ?

While we’re at it, how much hardware does it take for Yahoo to serve up the bookmarks application? Yahoo can afford to throw more servers at an application than most of us – a framework with better responsiveness (and thus needing fewer servers to balance across) is sure to become an important factor.

Solar 0.27.0 and 0.27.1 Released

Yesterday, I released Solar 0.27.0, then quick-fixed two minor bugs and released 0.27.1 an hour later. It feels so good to be back doing releases on a monthly basis. :)

There are a few highlights in this release:

We're using spl_autoload now to auto-load Solar classes as requested. One nice thing about this is that SPL uses a stack of autoloading functions, so Solar doesn't override any autoload you already have set up.
The locale translation functions have been split out to their own class, Solar_Locale, and you can now configure your own replacement localization class if you need custom behaviors.
It appears we now have the fastest and most-compliant JSON encoder/decoder in the PHP universe, thanks to Clay Loveless. It uses ext/json but does a little pre-checking to make sure the strings to be decoded are actually JSON payloads.
Our SQL adapter adds a bit of convenience to get around stricter binding behaviors in the PHP 5.2.1 version of PDO.
The Solar_Uri class now determines the '.ext' filename extension in a URI automatically; this bit of magic helps when determining what format is being requested from a page-controller, and helps when constructing alternative links for a single page that supports multiple formats.
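
To illustrate the autoloader-stack point from the first highlight above, here is a minimal sketch. The two loaders and the class name `Example_Missing_Thing` are hypothetical stand-ins, not Solar's actual autoloader, and the closures assume a newer PHP than the 5.2 series discussed here.

```php
<?php
// Hypothetical loaders for illustration only; neither is Solar's real one.
$calls = array();

// A loader the application already had registered.
spl_autoload_register(function ($class) use (&$calls) {
    $calls[] = "app:$class";
});

// A loader registered later by a library. It is appended to the
// stack and does not replace the loader registered before it.
spl_autoload_register(function ($class) use (&$calls) {
    $calls[] = "lib:$class";
});

// Asking for an undefined class triggers autoloading. Each loader
// on the stack is tried in registration order; since neither one
// defines the class here, both get called.
$found = class_exists('Example_Missing_Thing');
```

After this runs, `$calls` holds one entry from each loader, in registration order, which is the "doesn't override" behavior in action.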

New PDO Behavior In PHP 5.2.1

UPDATE (2016-05-30): Rasmus Schultz comments that "this does work -- it was fixed after this article was published." So apparently the issue described herein has been fixed. I'll leave the article in place for archival purposes.

Prior to PHP 5.2.1, you could do this with PDO ...

<?php
// assume $pdo is PDO connection
$sql = "SELECT * FROM some_table
        WHERE col1 = :foo
        OR col2 = :foo
        OR col3 = :foo";

$sth = $pdo->prepare($sql);

$sth->bindValue('foo', 'bar');

$sth->execute();
?>

... and PDO would happily bind the value 'bar' to every ':foo' placeholder in the statement.

Sadly, this is no longer the case in PHP 5.2.1. For valid reasons of security and stability in memory handling, as noted to me by Wez Furlong, the above behavior is no longer supported. That is, you cannot bind a single parameter or value to multiple identical placeholders in a statement. If you try it, PDO will throw an exception or raise an error, and will not execute the query. In short, you now need to match exactly the number of bound parameters or values with the number of placeholders.

In most cases, I'm sure that's not a problem. However, in Solar, we can build queries piecemeal, so we can't necessarily know in advance how many placeholders there are going to be in the final query.

Also, it's often convenient to throw an array of data against a statement with placeholders, and only bind to the placeholders that have elements in the data array. Alas, this too is no longer allowed in PDO under PHP 5.2.1, because the number of bound values might not match the number of placeholders.

As a result, the newest Solar_Sql_Adapter::query() method includes some code to examine the statement and try to extract the named placeholders that PDO expects to see. Given the above example statement, PDO will expect placeholders for :foo, :foo2, and :foo3 (PDO auto-numbers repeated placeholder names). While a bit brain-dead, it does seem to do its job tolerably well ... at least well enough to get around this newly-implemented (but apparently always-planned) behavior.

The code in the query() method looks something like this; note that we call it by sending along an array of $data to bind as values into the statement.

// prepare the SQL command and get a statement handle
$sth = $this->_pdo->prepare($sql);

// find all :placeholder matches.  note that this will
// find placeholders in literal text, which will cause
// errors later.  so in general, you should *either*
// bind at query time *or* bind as you go, not both.
preg_match_all(
    "/\W:([a-zA-Z_][a-zA-Z0-9_]+?)\W/m",
    $sql . "\n",
    $matches
);

// bind values to placeholders, adding numbers as needed
// in the way that PDO renames repeated placeholders.
$repeat = array();
foreach ($matches[1] as $key) {

    // only attempt to bind if the data key exists.
    // this allows for nulls and empty strings.
    if (! array_key_exists($key, $data)) {
        // skip it
        continue;
    }

    // what does PDO expect as the placeholder name?
    if (empty($repeat[$key])) {
        // first time is ":foo"
        $repeat[$key] = 1;
        $name = $key;
    } else {
        // repeated times of ":foo" are treated by PDO as
        // ":foo2", ":foo3", etc.
        $repeat[$key] ++;
        $name = $key . $repeat[$key];
    }

    // bind the $data value to the placeholder name
    $sth->bindValue($name, $data[$key]);
}

// now we can execute, even if we had multiple identical
// placeholders in the statement.
$sth->execute();

With this code in place, we can now bind one 'foo' value to many identical ':foo' placeholders.
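
To try the renaming logic without a database connection, the same steps can be pulled into a standalone function. This is only a sketch: `placeholder_bindings()` is a hypothetical helper, not part of Solar, and the `:foo`, `:foo2`, `:foo3` numbering follows the description above of how PDO in PHP 5.2.1 renames repeated placeholders.

```php
<?php
/**
 * Hypothetical helper (not part of Solar): given SQL text and a data
 * array, return a map of placeholder name => value to bind, numbering
 * repeats as "foo", "foo2", "foo3" per the behavior described above.
 */
function placeholder_bindings($sql, array $data)
{
    // Same pattern as the query() excerpt. The appended newline lets a
    // placeholder at the very end of the statement match the final \W.
    preg_match_all(
        '/\W:([a-zA-Z_][a-zA-Z0-9_]+?)\W/m',
        $sql . "\n",
        $matches
    );

    $repeat = array();
    $bind = array();
    foreach ($matches[1] as $key) {
        // skip placeholders that have no matching data element
        if (! array_key_exists($key, $data)) {
            continue;
        }
        if (empty($repeat[$key])) {
            // first occurrence keeps its plain name
            $repeat[$key] = 1;
            $name = $key;
        } else {
            // later occurrences get a numeric suffix
            $repeat[$key]++;
            $name = $key . $repeat[$key];
        }
        $bind[$name] = $data[$key];
    }
    return $bind;
}

$sql = "SELECT * FROM some_table
        WHERE col1 = :foo
        OR col2 = :foo
        OR col3 = :foo
        AND col4 = :bar";

// ':bar' has no data element, so it is left unbound.
$bind = placeholder_bindings($sql, array('foo' => 'x'));
```

Here `$bind` ends up with the keys `foo`, `foo2`, and `foo3`, all carrying the same value. Note that the pattern needs at least two characters in a placeholder name, so a single-letter placeholder such as `:a` would not be picked up.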

NB: Do not try doing this with bound parameters, or you are likely to run into memory problems.

TypeKey and Big-Number Math: Yay Wez!

Wez Furlong gives us good news about implementing the math functions needed to support TypeKey and OpenID more directly within PHP.

Solar users have had integrated TypeKey support via Solar_Auth_Adapter_Typekey for almost 6 months now. This is in addition to all our other auth adapters (SQL database, LDAP, .htpasswd, even .ini file).

The internals of our TypeKey adapter use big-number math functions implemented in userland by Daiji Hirata. These are very useful and serve a purpose unfulfilled by anything else in PHP, but simply cannot compare in speed or simplicity to the big-number functions presented by Wez.
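
For a sense of what userland big-number math involves, here is a minimal sketch of adding two non-negative integers held as decimal strings. The function `big_add()` is a hypothetical illustration of the general technique; it is not Daiji Hirata's code and not what the TypeKey adapter actually runs.

```php
<?php
// Hypothetical illustration of userland big-number math: schoolbook
// addition on decimal strings, one digit at a time with a carry.
function big_add($a, $b)
{
    // Reverse both operands so index 0 is the least significant digit.
    $a = strrev($a);
    $b = strrev($b);
    $len = max(strlen($a), strlen($b));

    $carry = 0;
    $out = '';
    for ($i = 0; $i < $len; $i++) {
        // Treat missing digits of the shorter operand as zero.
        $da = $i < strlen($a) ? (int) $a[$i] : 0;
        $db = $i < strlen($b) ? (int) $b[$i] : 0;
        $sum = $da + $db + $carry;
        $out .= $sum % 10;
        $carry = (int) ($sum / 10);
    }
    if ($carry > 0) {
        $out .= $carry;
    }

    // Undo the reversal to get the most significant digit first.
    return strrev($out);
}

// Twenty nines plus one: too large for a native integer.
$sum = big_add(str_repeat('9', 20), '1');
```

Every digit costs several PHP-level operations, which gives some idea of why a native implementation would be so much faster for the large values that TypeKey signatures involve.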

I for one welcome our new big-number overlord, and I hope it arrives sooner rather than later. It will make our TypeKey adapter that much faster, and open the way for an easier-to-implement OpenID adapter for Solar.