Paul M. Jones

Don't listen to the crowd, they say "jump."

The Framework as Franchise

My PHP Advent article is up; therein I try to describe the parallels between public frameworks and business franchises. However, the PHP Advent site doesn't support comments; if you would like to comment, please do so on this blog post instead. Thanks!


Patterns of Intellectual Bullies

This post is in response to http://terrychay.com/blog/article/challenges-and-choices.shtml, specifically this part:

When people put "design patterns" on their resume, I like to ask them a particular question -- especially when their background is J2EE or they say they know design patterns. The question I like to ask is define design patterns -- what does that term mean? I’d say about 90% of the people who put that on their resume bomb that question. It’s actually not an easy question. As soon as they answer it -- they give me some sort of pseudo-book definition -- I tear into them. I’ll give you an example:

The typical thing that they’ll say is, "Oh! A design pattern is this code thing that solves...umm...a problem."

And I’ll go, "Well, shit." laughter "Quicksort, right? That must be a fucking design pattern then." laughter

And then they’ll say, "Well no. Quicksort isn’t a design pattern."

Then I’m like, "Well, explain to me how it isn’t a design pattern. Your definition is that is solves a problem -- which I agree, design patterns do solve a problem -- but obviously that’s not a sufficient definition for design patterns."

You get where I’m coming from? And the reason isn’t...

And then they’ll say something like, "Well, you know. It doesn’t have like... It’s not an algorithm!"

"Umm...Yeah. So then design problems are something that solves a problem but isn’t an algorithm. So, code versioning! The practice of code versioning solves a problem and it’s not an algorithm clearly! (In fact this is what’s called a "best practice.") So how is a best practice not a design pattern?"

See no matter what they do they fall in a fucking trap. laughter

So I’ll give you my definition of design patterns. Well my honest-to-goodness definition of design patterns is to quote a famous Supreme Court justice when he was talking about it: He said that he’ll know it when he sees it.

Actually, he was talking about porn. laughter But there is pretty much no difference between design patterns and porn so we are all okay with that.


For Terry to say "design patterns are like porn, you know it when you see it" is funny and entertaining, but careless and unhelpful.

When a web developer talks about design patterns, it seems likely he means patterns of the type described by Martin Fowler in "Patterns of Enterprise Application Architecture". Regarding the definition of patterns, Fowler has this to say on page 9:

There's no generally accepted definition of a pattern, but perhaps the best place to start is Christopher Alexander ... "Each pattern describes a problem which occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use this solution a million times over, without ever doing it the same way twice."

Fowler then goes on for several paragraphs refining and explaining the concept. So while nobody has a rigorous definition of "design patterns", there does appear to be a rough outline of how to discover them, and then to agree on instances of patterns by naming and describing them. (Whereas the definition of porn cannot ever be agreed on, because it is in the eye of the beholder. I'd prefer not to take the analogy much further. ;-)

Patterns as Domain-Specific Vocabulary

Fowler (page 11) says "... the value of the pattern is not that it gives you a new idea; the value lies in helping you communicate your idea." That is, patterns are a common vocabulary to aid communication. Application design patterns are a vocabulary to aid communication about application design.

There are many kinds of patterns in the software world. To use Terry's examples, quicksort could easily be called a pattern of some kind, perhaps a sorting pattern. Code versioning could also be called a pattern of some kind, perhaps an organizing pattern. Best practices might be patterns of management. But they're not application design patterns.

Intellectual Bullying

As the interviewer, Terry does not appear to be seeking to tease out what the applicant thinks he means when he says "design patterns". Terry uses the term "design patterns" in a generic way, instead of in the way the applicant most likely intends -- "application design patterns". It sounds like Terry is attempting to trap the interviewee by subtly and purposely misleading him.

I have to wonder if that kind of questioning technique is appropriate behavior for someone in a position of power (and the interviewer does have a measure of power over the applicant). It sounds like an intentionally negative experience, one that is unnecessarily humiliating.

In fact, it sounds like bullying; intellectual bullying, to be sure, but bullying nonetheless. It reminds me of passages from the chapter on "Homo Logicus" in Alan Cooper's "The Inmates Are Running The Asylum". Cooper (101-104) compares and contrasts the physical/athletic jock and the mental/intellectual jock, both of whom exhibit immature bullying behavior.

The athlete bully, with great physical prowess, begins with the idea that "If I can beat you in a physical contest, then I am your master and I am better than you," but eventually is conditioned to accept that physical domination is not socially acceptable. He grows up when he realizes he can't get along with other adults by bullying them.

The intellectual bully, with great mental prowess, begins with the idea that "If I can beat you in a mental contest, then I am your master and I am better than you." However, the intellectual bully rarely learns that mental domination is similarly unacceptable in civil, adult discourse. "There is no maturation process to temper their exercise of that power." (Cooper, 104)

Closing Thought

When in a competition, physical or mental, try to win! But civil discourse is not competitive; you don't "win" a conversation. Mature adults attempt to work with each other to clarify meaning; they are both truthful and helpful when speaking to each other. They try to "find out what is right." Bullies and the immature, on the other hand, want to "be right" period, even if (maybe even because) that means knocking the other person around. Beware the mental bully in yourself, and point it out when you see it in others.



Escape from Namespaces

I admit that I am an unproductive whiner on this issue. I don't care if namespaces go into PHP or not; at this point, I'd almost rather they not. But here are some of my feelings, as expressed on IM this morning:


09:13:08  pmjones: yay\for\new\namespace\separator
09:13:12  pmjones: hmmmm
09:13:19  nate: oh geez
09:13:27  nate: I can't believe they picked *that* one
09:13:33  pmjones: does that mean there are two newlines in that phrase?
09:13:36  pmjones: who knows.
09:13:53  nate: you should really post something like that
09:14:01  pmjones: maybe PHP really *is* getting bought by Microsoft
09:14:12  nate: yeah
09:14:17  nate: you'd have at least thought they'd go with /
09:14:31  pmjones: no, that's division
09:14:38  pmjones: which might make sense, now that i think about it
09:14:47  pmjones: for all the divisiveness we have over it
09:14:51  nate: heh ;-)
09:14:59  nate: you took the words out of my mouth
09:15:03  nate: er, fingers
09:15:06  pmjones: indeed
09:15:16  pmjones: i know you want namespaces very badly ...
09:15:23  pmjones: ... but do you want them *this* badly?
09:15:45  nate: still undecided
09:16:00  pmjones: if you want them badly, badly is what you've got ;-)

To explain the jokes:

The "\n" sequences in the namespace string are escaped newlines; thus, "yay\for\new\namespace\separator" might well be translated as "yay\for[newline]ew[newline]amespace\separator". ASCII gurus will know what \f and \s translate to.
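To make the joke concrete, here is a minimal sketch (in Python, purely for illustration) that imitates how a PHP double-quoted string treats single-character backslash escapes. It assumes the phrase was typed with backslash separators, as in yay\for\new\namespace\separator. The function name php_double_quoted and the escape table are my own illustration, not anything from PHP's source, and the octal, hex, and Unicode escape forms are deliberately left out.

```python
# Single-character escapes recognized inside PHP double-quoted strings.
# Assumption: this table is a simplified illustration; octal, hex, and
# Unicode escapes are omitted from the sketch.
PHP_ESCAPES = {
    "n": "\n",    # newline
    "r": "\r",    # carriage return
    "t": "\t",    # horizontal tab
    "v": "\v",    # vertical tab
    "e": "\x1b",  # escape
    "f": "\f",    # form feed
    "\\": "\\",   # literal backslash
    "$": "$",     # literal dollar sign
    '"': '"',     # literal double quote
}


def php_double_quoted(source):
    """Expand recognized escapes; leave unknown ones (like \\s) untouched."""
    out = []
    i = 0
    while i < len(source):
        ch = source[i]
        nxt = source[i + 1] if i + 1 < len(source) else ""
        if ch == "\\" and nxt in PHP_ESCAPES:
            out.append(PHP_ESCAPES[nxt])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# A raw string keeps the backslashes exactly as typed.
name = r"yay\for\new\namespace\separator"
expanded = php_double_quoted(name)
print(repr(expanded))
```

Under these assumptions, \f becomes a form feed and both \n sequences become newlines, while \s is not a recognized escape and survives as a backslash followed by "s".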

Zend Is Not PHP, so Microsoft can't buy "PHP". But the backslashes are very DOS-ish.

Here ends the unproductive whining, at least for now.


... But Some Suck Less Than Others

(N.b.: This is a post I've had in the queue for several months now, and while I still don't feel like it's "finished", it's time to just publish the thing and be done with it.)

Laura Thomson says that all frameworks suck -- and she's right! But maybe not for the reasons you think.

Before we get started, let me give her a big public thank you for her praise of my benchmarking methodology: thanks, Laura. :-)

Also, let me point out that I am the author of a framework, Solar, and so I am as much an example of the behaviors I describe below as anyone else.

I don't mean to put words in her mouth, but I'd prefer to extend Laura's phrasing a bit. I'd argue that "all frameworks from other people suck". (Cf. Rule Number One from my "obsessive-compulsive sociopath" blog post.)

The "other people" part is important here. It sucks to have to learn how someone else wants you to work, and that's a big part of what a framework needs from you: to learn how to use it. Learning someone else's code is much less rewarding in the short term than writing your own code. I think there's a kind of subjective relativistic effect: time spent learning and analyzing drags out, but code-writing time flies by -- even if it's the same amount of objective time spent. Time-drag sucks.

By definition, this means that the framework you write for yourself sucks less than anything else out there -- it feels more rewarding. Jeffrey Palermo points out another factor: the framework author is his own customer, and has to satisfy only himself (or his team) when writing it.

Even if you are a responsible developer, perhaps because you are one, you probably will build your own framework, and pretty early on at that. You would be a fool not to; if you face the same set of problems over and over, eventually you will settle on a preferred series of solutions. If you write the same code over and over again, from scratch, on each project that solves similar problems, then you're probably not getting the "code reuse" thing yet.

That collection of solutions-in-code is your framework. It may be highly formalized or very loose, highly consistent (or not), and so on. But it is a framework.

And I guarantee there will be things you don't like about that first framework -- so you'll write another one. Maybe even a third, as you continuously internalize the problem sets, because there's no substitute for front-line experience (do all the testing you like, but real-world use will be the truest critic of your process).

Finally, after all your work extracting that solution-in-code, you will want to share your wonderful creation with the world, the True Path that is clearly useful if only others are wise enough to recognize it. And to those great unwashed, who do not recognize all your effort and genius, your framework will suck.

This is because there are quirks and workarounds and hacks that you have internalized and accepted and are so familiar with that you no longer pay attention to them, and they don't make sense to other developers. Even working-style similarities among framework developers and adopters will only reduce, not eliminate, framework suckage. There's always something that could have been done differently -- and many prospective adopters will see that as a reason to build an entire new framework, from scratch, to address those points, because (by definition) their own work sucks less.

Sturgeon's law says 90% of everything sucks, and the development world is no different. Almost nothing is perfect for every developer: there's always significant room for valid criticism on any project, and even the best projects are lacking in at least one vital area (and that area is different for each project).

It's all about tradeoffs between what you want to do and what you are willing to put up with in order to do it -- and at no point will you get everything exactly precisely the way you want, either with a framework or without one. There's no silver bullet. This means that you have to put up with suckage no matter what -- some frameworks suck less than others, is all.

(Personally, I think Solar sucks least; but then, I would say that, wouldn't I? ;-)



Solar 1.0.0alpha2 released

After a long delay (almost a year) Solar has a new release: version 1.0.0alpha2. You can read more about it on Solar's new blog, which is where I'll be trying to keep all the Solar-specific stuff from now on. (You may see cross-posting between there and here from time to time.) Thanks to all who made this new release possible!


Rasmus Lerdorf's Laconic(a) Performance

As many of you know, I maintain a series of web framework benchmarks. The project codebase is here and the most recent report is here.

It was with some interest, then, that I viewed Rasmus Lerdorf's slides on the subject of performance benchmarking. I'm beginning to think there's something unexpected or unexamined in his testing methodology.

Note: see the update at the end of this entry.

On this slide, Rasmus notes the following:

  • Static HTML nets 611.78 trans/sec
  • Trivial PHP nets 606.77 trans/sec

This would seem to indicate that the mere invocation of PHP, on Rasmus' setup, reduces Apache's performance from serving static pages by less than 1%.

In my testing on Amazon EC2 small instances, I note somewhat different results:

  • Static HTML nets 2309.14 req/sec
  • Trivial PHP nets 1320.47 req/sec

The net reduction there is about 43%. Yes, this is with opcode caching turned on.
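For anyone who wants to check the arithmetic, here is a small sketch (in Python, purely for illustration) of how the two reductions are derived from the figures quoted above. The helper name overhead_pct is my own; the four rates are the ones listed in the two bullet lists.

```python
def overhead_pct(static_rate, php_rate):
    """Percent of static-file throughput lost when PHP serves the page."""
    return (1 - php_rate / static_rate) * 100


# Figures from Rasmus' slide (trans/sec) and from the EC2 run (req/sec).
rasmus = overhead_pct(611.78, 606.77)
ec2 = overhead_pct(2309.14, 1320.47)

print(f"Rasmus' setup: {rasmus:.1f}% reduction")
print(f"EC2 small instance: {ec2:.1f}% reduction")
```

Because each result is a ratio within one setup, it does not matter that one source reports transactions/second and the other requests/second.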

I then became really curious as to how Rasmus might have his system set up, to see only a 1% bit of "added overhead" from PHP being invoked. It would be nice if I could set up my own systems the same way.

When I asked about that at work, my colleague Elizabeth Smith opined that maybe Rasmus' web server is running all requests through the PHP interpreter, not just PHP files. That sounded like a good intuition to me, so I set up an EC2 instance to try it out.

Per the setup instructions on this page I built an EC2 server the same as I've done for my own benchmarking reports. I didn't check out the whole project, though; this time we just need the "bench", "baseline-html", and "baseline-php" directories.

As a reminder, the baseline index.html page is just the following plain text ...

Hello World!

... and the baseline index.php page is the following single line of PHP code:

<?php echo 'Hello World!'; ?>

The php5.conf file for Apache looks like this by default ...

<IfModule mod_php5.c>
  AddType application/x-httpd-php .php .phtml .php3
  AddType application/x-httpd-php-source .phps
</IfModule>

... and we're going to leave it alone for a bit.

Using ab -c 10 -t 60 to benchmark baseline-html and baseline-php with the default php5.conf gives us these results (an average over 5 runs each):

framework      |      avg
-------------- | --------
baseline-html  |  2367.02
baseline-php   |  1270.15

That's about a 46% drop for invoking PHP. (That is itself about 3 points different from the numbers I show above, so it appears there are some variables I have not controlled for, or maybe I just need to let this run longer than 5 minutes to smooth out the deviations.)
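The averaging step can be sketched roughly as follows (in Python, purely for illustration). This assumes each ab run's text output has been captured, and relies on the "Requests per second" summary line that ab prints. The function names are hypothetical, and the five sample rates are made-up round numbers, not measurements from this report.

```python
import re
from statistics import mean

# Matches ab's summary line, e.g. "Requests per second:    1000.00 [#/sec] (mean)"
RPS_LINE = re.compile(r"^Requests per second:\s+([\d.]+)", re.MULTILINE)


def requests_per_second(ab_output):
    """Pull the requests/second figure out of one ab run's text output."""
    match = RPS_LINE.search(ab_output)
    if match is None:
        raise ValueError("no 'Requests per second' line found")
    return float(match.group(1))


def average_rate(outputs):
    """Average the requests/second figure over several captured runs."""
    return mean(requests_per_second(text) for text in outputs)


# Made-up sample output for five runs (NOT real benchmark data).
sample_runs = [
    f"Requests per second:    {rate:.2f} [#/sec] (mean)\n"
    for rate in (1000.0, 1010.0, 990.0, 1005.0, 995.0)
]
avg = average_rate(sample_runs)
print(f"average over {len(sample_runs)} runs: {avg:.2f} req/sec")
```

Keeping the per-run figures around, rather than only the average, would also make it easier to see how much the runs deviate from each other.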

To test our hypothesis, we modify the php5.conf file to add .html to the list of files that get passed through PHP ...

<IfModule mod_php5.c>
  AddType application/x-httpd-php .html .php .phtml .php3
  AddType application/x-httpd-php-source .phps
</IfModule>

... restart Apache, and run the same ab tests again:

framework      |      avg
-------------- | --------
baseline-html  |  1348.80
baseline-php   |  1341.31

That's less than a 1% drop -- close enough to make me think that Rasmus might be pushing everything through the PHP interpreter, regardless of whether or not it's a PHP file.

If that is true (and it's a big "if"), then merely invoking PHP does appear to cause about a 45% drop (give or take) in Apache's responsiveness, which is contrary to the point Rasmus makes on this slide about PHP "rarely" being a bottleneck -- and I say that as someone who works with PHP almost exclusively. In fairness, I am depending only on the text of his slides here, so he may have said something to that effect in the presentation itself.

Failure modes on this analysis:

  • I am using XCache and not APC for the opcode cache. (Why? Because it works with both Lighttpd+FCGI and Apache+modphp, at least the last time I checked, and I'm interested in the differences between those two setups.)

  • I am using an EC2 server, which is more production-ish than Rasmus' laptop.

  • I am using ab to benchmark with, not siege. I tried using Siege and did not notice any significant differences, so I'm sticking with the ab tools I've built for now.

I can't imagine those three differences would lead to the kind of disparity in performance that I'm seeing, but it's possible.

Has anyone else tried doing this?

Rasmus, if you have the time and inclination, would you care to shed some light on these prognostications?

Update: Is It EC2?

Rasmus replies below that he did not, in fact, have PHP running for all .html files.

For me, the next question was to see what the real difference on EC2 is between no cache, XCache, and APC.

No cache:


framework      |      avg
-------------- | --------
baseline-html  |  2339.25
baseline-php   |  1197.28

XCache (copied from above)


framework      |      avg
-------------- | --------
baseline-html  |  2367.02
baseline-php   |  1270.15

APC:


framework      |      avg
-------------- | --------
baseline-html  |  2315.83
baseline-php   |  1433.91

So on EC2, you get about 1200 req/sec without caching, about 1300 req/sec with XCache, and about 1400 req/sec with APC, in the "hello world" baseline scenarios.
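To put those three baseline-php figures side by side, here is a quick sketch (in Python, purely for illustration) of the gain each opcode cache gives over the no-cache run. The dictionary keys and the helper name gain_pct are my own labels; the rates come from the three tables above.

```python
# baseline-php averages (req/sec) from the three tables above.
php_rates = {
    "no cache": 1197.28,
    "XCache": 1270.15,
    "APC": 1433.91,
}


def gain_pct(rate, reference):
    """Percent improvement of `rate` over `reference`."""
    return (rate / reference - 1) * 100


gains = {
    name: gain_pct(rate, php_rates["no cache"])
    for name, rate in php_rates.items()
    if name != "no cache"
}
for name, gain in gains.items():
    print(f"{name}: {gain:+.1f}% vs. no cache")
```

This only compares the "hello world" baselines; it says nothing about how the caches behave with larger codebases.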

Maybe this is all an artifact of how EC2 works, then? I have no idea. Next step is to test on non EC2 systems, if I can find one that others can reasonably build on themselves (since one of the goals here is for others to be able to duplicate the results).


Labor Day Benchmarks

By popular request, here is an update of my web framework benchmarks report. You can see previous result sets here:

Before you comment on this post, please have the courtesy to read at least the first two articles above; I am tired of refuting the same old invalid arguments about "hello world makes no sense", "if you cache, it goes faster", "the ORM systems are different", and "speed isn't everything" with people who have no understanding of what these reports actually say.

Full disclosure: I am the lead developer on the Solar Framework for PHP 5, and I was an original contributor to the Zend Framework.

In the interest of putting to rest any accusations of bias or favoritism, the entire project codebase is available for public review and criticism here.

Flattered By Imitators

They say that imitation is the sincerest form of flattery. As such, I am sincerely flattered that the following articles and authors have adopted methodologies strikingly similar to the methodology I outlined in Nov 2006.

  • SellersRank here and here.
  • AVNet Labs here.
  • Rasmus Lerdorf here. I am considering writing a separate post about this talk by Rasmus.

Methodology, Setup, and Source Code

The methodology in this report is nearly identical to that in previous reports. I won't duplicate that narrative here; please see this page for the full methodology.

The only difference from previous reports regards the server setup. Although I'm still using an Amazon EC2 instance, I now provide the full setup instructions so you can replicate the server setup as well as the framework setup. See this page for server setup instructions.

Finally, you can see all the code used for the benchmarking here.

Results, Part 1

Update: FYI, opcode caching is turned on for these results.

The "avg" column is the number of requests/second the framework itself can deliver, with no application code, averaged over 5 one-minute runs with 10 concurrent users. That is, the framework dispatch cycle of "bootstrap, front controller, page controller, action method, view" will never go any faster than this.

The "rel" column is the framework's rate as a fraction of what PHP itself delivers (the baseline-php row). Thus, if you see "0.1000" that means the framework delivers 10% of the maximum requests/second that PHP itself can deliver.

framework          |      avg |    rel
------------------ | -------- | ------
baseline-html      |  2309.14 | 1.7487
baseline-php       |  1320.47 | 1.0000
cake-1.1.19        |   118.30 | 0.0896
cake-1.2.0-rc2     |    46.42 | 0.0352
solar-1.0.0alpha1  |   154.29 | 0.1168
symfony-1.0.17     |    67.35 | 0.0510
symfony-1.1.0      |    67.41 | 0.0511
zend-1.0.1         |   112.36 | 0.0851
zend-1.5.2         |    86.23 | 0.0653
zend-1.6.0-rc1     |    77.85 | 0.0590
We see that the Apache server can deliver 2300 static "hello world" requests/second. If you use PHP to echo "Hello World!" you get 1300 requests/second; that is the best PHP will get on this particular server setup.
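As a sanity check, the "rel" column can be recomputed from the "avg" column. The sketch below (in Python, purely for illustration) does that for a few rows of the table; the function name relative_to is hypothetical, and the averages are copied from the table.

```python
# A few "avg" values (req/sec) copied from the results table.
averages = {
    "baseline-html": 2309.14,
    "baseline-php": 1320.47,
    "solar-1.0.0alpha1": 154.29,
    "zend-1.6.0-rc1": 77.85,
}


def relative_to(baseline, results):
    """Express every rate as a fraction of the baseline's rate."""
    base = results[baseline]
    return {name: round(avg / base, 4) for name, avg in results.items()}


rel = relative_to("baseline-php", averages)
for name, value in rel.items():
    print(f"{name:<18} {value:.4f}")
```

Note that baseline-html comes out above 1.0 because static files are served faster than even the most trivial PHP script.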

Cake: After conferring with the Cake lead developers, it looks like the 1.2 release has some serious performance issues (more than 50% drop in responsiveness from the 1.1 release line). They are aware of this and are fixing the bugs for a 1.2.0-rc3 release.

Solar: The 1.0.0-alpha1 release is almost a year old, and while the unreleased Subversion code is in production use, I make it a point not to benchmark unreleased code. I might do a followup report just on Solar to show the decline in responsiveness as features have been added.

Symfony: Symfony remains the least-responsive of the tested frameworks (aside from the known-buggy Cake 1.2.0-rc2 release). No matter what they may say about Symfony being "fast at its core", it does not appear to be true, at least not in comparison to the other frameworks here. But to their credit, they are not losing performance. (Could it be there's not much left to lose? ;-) In addition, I continue to find Symfony to be the hardest to set up for these reports -- more than half my setup time was spent on Symfony alone.

Zend: The difference between the 1.0 release and the 1.5 release is quite dramatic: roughly a 23% drop in responsiveness. And then another 10% drop between 1.5 and 1.6.

To sum up, my point from earlier posts that "every additional line of code will reduce responsiveness" is illustrated here. Each of the newer framework releases has added features, and has slowed down as a result. This is neither good nor bad in itself; it is an engineering and economic tradeoff.
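The release-over-release changes can be worked out directly from the "avg" column. Here is a rough sketch (in Python, purely for illustration); the pct_change helper and the data layout are my own, and the rates are copied from the results table.

```python
def pct_change(old, new):
    """Percent change in requests/second going from one release to the next."""
    return (new - old) / old * 100


# (release, avg req/sec) pairs copied from the results table, oldest first.
releases = {
    "cake": [("1.1.19", 118.30), ("1.2.0-rc2", 46.42)],
    "symfony": [("1.0.17", 67.35), ("1.1.0", 67.41)],
    "zend": [("1.0.1", 112.36), ("1.5.2", 86.23), ("1.6.0-rc1", 77.85)],
}

changes = {}
for framework, series in releases.items():
    # Pair each release with the one that follows it.
    for (old_ver, old_rate), (new_ver, new_rate) in zip(series, series[1:]):
        changes[(framework, old_ver, new_ver)] = pct_change(old_rate, new_rate)

for (framework, old_ver, new_ver), change in changes.items():
    print(f"{framework} {old_ver} -> {new_ver}: {change:+.1f}%")
```

A negative number means the newer release handles fewer requests/second than the one before it; Symfony is the one case here where the figure is essentially flat.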

Results, Part 2

I have stated before that I don't think it's fair to compare CodeIgniter and Prado to Cake, Solar, Symfony, and Zend, because they are (in my opinion) not of the same class. Prado especially is entirely unlike the others.

Even so, I keep getting requests to benchmark them, so here are the results; the testing conditions are identical to those from the main benchmarking.

framework      |      avg |    rel
-------------- | -------- | ------
baseline-html  |  2318.89 | 1.7710
baseline-php   |  1309.39 | 1.0000
ci-1.5.4       |   229.29 | 0.1751
ci-1.6.2       |   189.89 | 0.1450
prado-3.1.0    |    39.86 | 0.0304

CodeIgniter: Even the CI folks are not immune to the rule that "there is no such thing as a free feature"; between 1.5.4 and 1.6.2 releases they lost about 17% of their requests/second. However, they are still running at 14.5% of PHP's maximum, compared with the 11.68% of Solar-1.0.0-alpha1 (the most-responsive of the frameworks benchmarked above), so it's clearly the fastest of the bunch.

Prado: Prado works in a completely different way than the other frameworks listed here. Even though it is the slowest of the bunch, it's simply not fair to compare it in terms of requests/second. If the Prado way of working is what you need, then the requests/second comparison will be of little value to you.

This Might Be The Last Time

Although I get regular requests to update these benchmark reports, it's very time-consuming and tedious. It took five days to prepare everything, add new framework releases, make the benchmark runs, do additional research, and then write this report. As such, I don't know when (if ever) I will perform public comparative benchmarks again; my thanks to everyone who provided encouragement, appreciation, and positive feedback.


Solar System

In the spirit of some other framework projects, the Solar Framework for PHP 5 now offers a ready-to-use Solar system to get new users off to a quick start. It's not prepared as a tarball just yet, but it is available for checkout or export using Subversion from http://svn.solarphp.com/system/trunk.

For example, if you make a checkout in your document root ...

$ cd /var/www/html
$ svn checkout http://svn.solarphp.com/system/trunk solar

... and follow the README instructions, you will have a fully-operational installation in very short order, including an SQLite database, authentication, and three example applications:

http://example.com/solar/index.php
A simple "hello world"
http://example.com/solar/index.php/hello-app
A complex "hello world" with authentication and localization
http://example.com/solar/index.php/bookmarks
A "bookmarks" application.

(Note that the "index.php" is only in the evaluation deployment; when you create a virtual host and point it at the Solar system document root, a .htaccess file makes the "index.php" unnecessary.)

You can read more about the structure and principles of the Solar system here.


BREAD, not CRUD

Several developers have asked me what "BREAD" means in web applications. Most everyone knows that CRUD is "create, read, update, delete," but I think that misses an important aspect of web apps: the listing of records to select from.

I don't recall where I first heard the term BREAD; it stands for "browse, read, edit, add, delete". That covers more of what common web apps do, including the record listings. It even sounds nicer: "crud" is something icky, but "bread" is warm and fulfilling. That's why I tend to use the term BREAD instead of CRUD, especially when it comes to Solar and action-method names in the application logic.
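To show what BREAD-style action-method names might look like, here is a minimal sketch (in Python, purely for illustration). The BookmarksController class, its in-memory storage, and the method signatures are all hypothetical; this is not Solar's actual API, just one way the five verbs could map onto methods.

```python
class BookmarksController:
    """Hypothetical page controller with one action method per BREAD verb."""

    def __init__(self):
        self._records = {}   # record id -> dict of fields
        self._next_id = 1

    def browse(self):
        """List every record, in id order, so the user can pick one."""
        return [dict(id=rid, **fields)
                for rid, fields in sorted(self._records.items())]

    def read(self, record_id):
        """Show a single record."""
        return dict(id=record_id, **self._records[record_id])

    def edit(self, record_id, **changes):
        """Change fields on an existing record."""
        self._records[record_id].update(changes)
        return self.read(record_id)

    def add(self, **fields):
        """Create a new record and return it with its assigned id."""
        record_id = self._next_id
        self._next_id += 1
        self._records[record_id] = dict(fields)
        return self.read(record_id)

    def delete(self, record_id):
        """Remove a record entirely."""
        del self._records[record_id]


bookmarks = BookmarksController()
first = bookmarks.add(title="Example", url="http://example.com/")
bookmarks.add(title="Another", url="http://example.com/another")
bookmarks.edit(first["id"], title="Example (edited)")
bookmarks.delete(2)
print(bookmarks.browse())
```

The point of the sketch is that browse gets its own method rather than being folded into read, which is the part plain CRUD leaves out.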

Update 1 (2008-08-21): Wow, lot of traffic from Reddit and Y-Combinator on this one. Be sure to check out my post on Web Framework Benchmarking, and of course the Solar Framework for PHP 5.

I see a couple of comments saying that "browse is the same thing as read, it's just a special-case of read." I can see where that would be true, in a limited way. Using similar logic, one could argue that "add" is a special case of "edit", it just happens that the record isn't there yet; and then "delete" is another special case of "edit", you're just editing it out of existence. So that leaves you with just Read (one/many) and Edit (existing/non-existing/out-of-existence).

I think that takes things way too far. ;-) The special cases of "edit" are *so* special that they deserve their own logic. I think the same thing applies to "browse" -- it might be a special case of "read", but it's different-enough to deserve its own place.

Update 2: Matthew Weier O'Phinney refreshes my memory -- he mentioned the term to me years ago in a discussion about his PHP port of CGI::App. Thanks, Matthew!

Update 3: I said above that you could reduce all operations to "read" (with 2 cases) and "edit" (with 3 cases). It occurs to me now that those correspond to the way GET and POST are most-widely used. So maybe it wasn't such a silly argument after all. ;-)
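That correspondence can be written down as a simple lookup, sketched here in Python purely for illustration. The mapping reflects the common form-based convention described above (reads via GET, changes via POST); the names BREAD_TO_HTTP and actions_for are my own.

```python
# Assumed convention: the two "read" cases use GET, the three "edit" cases use POST.
BREAD_TO_HTTP = {
    "browse": "GET",
    "read": "GET",
    "edit": "POST",
    "add": "POST",
    "delete": "POST",
}


def actions_for(method):
    """Return the BREAD actions that are typically sent with this HTTP method."""
    return [action for action, verb in BREAD_TO_HTTP.items() if verb == method]


print("GET: ", actions_for("GET"))
print("POST:", actions_for("POST"))
```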