Paul M. Jones

Don't listen to the crowd, they say "jump."

On Preferring Spaces Over Tabs in PHP

The best lack all conviction, while the worst are full of passionate intensity. — “The Second Coming”, William Butler Yeats

Keep the above in mind when considering either side of the debate. ;-)

tl;dr

Herein I assert and discuss the following:

  • Using spaces has subtle advantages over using tabs in collaborative environments.

  • The “tabs reduce file size” argument is factually true, but is a case of optimizing on the wrong resource.

  • The “tabs allow each developer to set his own indent widths” argument sounds good in theory but leads to problems in practice regarding line length recognition and inter-line alignment.

Introduction

In the PHP world, there are effectively two competing indenting practices: “4-spaces” and “tab.” (There are some in the 2-space camp as well but they are very few.)

I want to point out a couple of things about why spaces might be considered preferable in a collaborative environment when style is important on a PHP project. And yes, it turns out this discussion does matter.

I used to use tabs, and slowly migrated over to spaces. Over the course of several years, I have found there is a slight but useful advantage to using spaces for indentation when working with other developers, and I want to discuss that one advantage in this essay.

Note that I am not asserting an overwhelming, absolutely obvious, infallible moral rule that clearly favors spaces over tabs as the One True Path. It is merely a noticeable improvement regarding more sophisticated rules of style.

Do I expect this essay to change anybody’s mind on using tabs? No; but, I do hope it will give some food for thought.

Regarding Tabs

When making an argument, it is important to state the alternative viewpoint in such a way that the people who hold it would agree with the description.

What are the reasons for preferring a tab indent? As far as I can tell, they are:

  • The tab is a single character, so files are smaller.

  • Using a tab character allows each developer to change the level of indent
    that he sees, without actually modifying the on-disk file.

If there are other reasons I have missed, please let me know.

File Size

In general, I assert that the “file size” argument is a case of “optimizing on the wrong resource.”

By way of example, let’s take one file from a real project that uses 4-space indenting, Zend_Db_Abstract, and use wc -c to count the number of bytes in the file.

$ wc -c Abstract.php
40953 Abstract.php

Now, let’s convert each 4-space indent to a tab.

$ unexpand -t 4 Abstract.php > Abstract-tabs.php
$ wc -c Abstract-tabs.php
34632 Abstract-tabs.php

We save 6K of space on a 40K file, or roughly 15%, by using a tab character for indents instead of a 4-space indent.

Now, to get an idea of how that compares to another way to reduce size, let’s remove all the comments from the original 4-space file and see what that does. We’ll use a tool I found after two minutes of Googling (you may need to change the hashbang line of remccoms3.sed to point to your sed):

$ wget http://sed.sourceforge.net/grabbag/scripts/remccoms3.sed
$ chmod +x remccoms3.sed
$ ./remccoms3.sed Abstract.php > Abstract-no-comments.php
$ wc -c Abstract-no-comments.php
21022 Abstract-no-comments.php

That’s about a 50% reduction. If disk storage is really a concern, we’d be much better off to remove comments than to convert spaces to tabs. Of course, we could do both.
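To put the two reductions side by side, here is a minimal sketch of the arithmetic, using only the byte counts reported by wc -c above. The helper name percentSaved is my own, introduced for illustration.

```php
<?php
// Byte counts reported by wc -c in the text above.
$original   = 40953; // 4-space indents, comments intact
$withTabs   = 34632; // after unexpand -t 4
$noComments = 21022; // after remccoms3.sed

// Hypothetical helper: percentage of bytes saved relative to the original size.
function percentSaved(int $before, int $after): float
{
    return round(($before - $after) / $before * 100, 1);
}

printf(
    "tabs:        %d bytes saved (%.1f%%)\n",
    $original - $withTabs,
    percentSaved($original, $withTabs)
);
printf(
    "no comments: %d bytes saved (%.1f%%)\n",
    $original - $noComments,
    percentSaved($original, $noComments)
);
// tabs:        6321 bytes saved (15.4%)
// no comments: 19931 bytes saved (48.7%)
```

Stripping comments saves roughly three times as many bytes as converting indents to tabs on this particular file; other files will of course differ.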

This example makes me believe that the “file size” argument, while factually correct, is a case of “optimizing on the wrong resource.” That is, the argument gives strong consideration to a low-value item. Disk space is pretty cheap, after all.

A followup argument about this is usually, “Even so, it’s less for the PHP interpreter to deal with. Fewer characters means faster code.” Well, not exactly. The lexer turns each run of whitespace into a single token whether it holds one tab or four spaces, so the parser sees the same token stream either way.
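One way to check the tokenizing claim is with token_get_all() from PHP's tokenizer extension. The sketch below is not a benchmark; it only shows that a tab indent and a 4-space indent produce the same number of tokens. The sample function and the countTokens helper are made up for illustration.

```php
<?php
// The same function, indented once with a tab and once with four spaces.
$withTab    = "<?php\nfunction f()\n{\n\treturn 1;\n}\n";
$withSpaces = "<?php\nfunction f()\n{\n    return 1;\n}\n";

// Hypothetical helper: count all tokens, and the T_WHITESPACE tokens among them.
function countTokens(string $source): array
{
    $total = 0;
    $whitespace = 0;
    foreach (token_get_all($source) as $token) {
        $total++;
        // Multi-character tokens are arrays of [id, text, line];
        // single-character tokens such as "{" are plain strings.
        if (is_array($token) && $token[0] === T_WHITESPACE) {
            $whitespace++;
        }
    }
    return ['total' => $total, 'whitespace' => $whitespace];
}

// Compare the two counts; they should be identical.
var_dump(countTokens($withTab) === countTokens($withSpaces));
```

The lexer still has to read every byte, so this does not prove the two files cost exactly the same to process, only that the parser is handed the same number of tokens.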

Developer Tab Stop Preferences

This, to me, seems to be the primary argument for preferring tabs over spaces for indenting. Essentially, the idea is to allow each individual developer on a project to make the code look the way that individual developer prefers.

This is a non-trivial argument. It’s very appealing for the individual developers to be able to work on a project where Developer A sees a tab stop every 4 characters, and Developer B sees a tab stop every 2 or 8 or whatever characters, without changing the actual bytes on disk.

I have two arguments against this; they seem to be minor, until we examine them in practice:

  • It becomes difficult to recognize line-length violations with over-wide tab stop settings.

  • Under sophisticated style guides, inter-line alignment for readability becomes inconsistent between developers using different tab stops.

These arguments require a little exposition.

Line Length Recognition

Because of limitations of this blog, let’s say that our coding style guide has a line length limit of 40 characters. (I know, that’s half or less of what it should be, but it serves as an easy illustration.)

The following code, with 4-character tab stops, shows what that line length limit looks like:

         1         2         3         4
1234567890123456789012345678901234567890
function funcFoo()
{
    $varname = '12' . funcBar() . '34';
}

It’s clearly within the line length limit. But it looks like this under an 8-character tab stop:

         1         2         3         4
1234567890123456789012345678901234567890123
function funcFoo()
{
        $varname = '12' . funcBar() . '34';
}

A developer who sees this code under 8-character stops will think the line is past the limit, and attempt to reformat it in some way. After that reformatting, the developer working with 4-character tab stops will think the line is too short, and reformat it back to being longer. This is not particularly productive.
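The difference in apparent line length can be worked out mechanically. Here is a minimal sketch, assuming single-byte characters and an indent of one tab; displayWidth is a name I made up for illustration.

```php
<?php
// Hypothetical helper: the on-screen width of a line for a given tab stop.
// Assumes single-byte characters; a tab advances to the next multiple of $tabStop.
function displayWidth(string $line, int $tabStop): int
{
    $column = 0;
    for ($i = 0, $n = strlen($line); $i < $n; $i++) {
        if ($line[$i] === "\t") {
            $column += $tabStop - ($column % $tabStop);
        } else {
            $column++;
        }
    }
    return $column;
}

// The indented line from the example above, stored with one leading tab.
$line = "\t\$varname = '12' . funcBar() . '34';";

printf("4-character tab stops: %d columns\n", displayWidth($line, 4)); // 39
printf("8-character tab stops: %d columns\n", displayWidth($line, 8)); // 43
```

The bytes on disk are identical in both cases; only the tab stop decides whether the line falls inside or outside the 40-character limit.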

Some will say this just shows that line length limits are dumb. I disagree.

Inter-Line Alignment

By “inter-line alignment” I mean the practice where, if we have several lines of code that are similar, we align the corresponding parts of each line in columns. To be clear, it’s not that unaligned code is impossible to read; it’s just noticeably easier to read when it’s aligned.

Typically, inter-line alignment is applied to variable assignment. For example, the following unaligned code …

$foo = 'bar';
$bazdib = 'gir';
$zim = 'irk';

… is easier to scan in columns aligned on the = sign:

$foo    = 'bar';
$bazdib = 'gir';
$zim    = 'irk';

We can see clearly what the variables are in the one column, and what the assigned values are in the next column.

Alternatively, we may need to break an over-long line across several lines, and make it glaringly obvious during even a cursory scan that it’s all one statement.

Now, let’s say we have a bit of code that should be aligned across two or more lines, whether for readability or to adhere to a line length limit. We begin with this contrived example using 4-space indents (the spaces are indicated by • characters):

function funcName()
{
••••$varname = '1234' . aVeryLongFunctionName() . 'foo' . otherFunction();
}

Under a style guide where we align on = to keep within a line length limit, we can do so regardless of tab stops:

function funcName()
{
••••$varname = '1234' . aVeryLongFunctionName()
•••••••••••• . 'foo' . otherFunction();
}

Under a guide where we use tabs, and Developer A uses 4-character tab stops, we need to push the alignment out to the tab stops to line things up (tabs are indicated by → characters):

function funcName()
{
→   $varname→   = '1234' . aVeryLongFunctionName()
→   →   →   →   . 'foo' . otherFunction();
}

However, if a Developer B uses an 8-character tab stop, the same code looks like this on Developer B’s terminal:

function funcName()
{
→       $varname→       = '1234' . aVeryLongFunctionName()
→       →       →       →       . 'foo' . otherFunction();
}

The second example has the same tabbing as in the first example, but the alignment looks broken under 8-character tab stops. Developers who prefer the 8-character stop are likely to try to reformat that code to make it look right on their terminal. That, in turn, will make it look broken for those developers who prefer a 4-character stop.

Thus, the argument that “each developer can set tab stops wherever he likes” is fine in theory, but is flawed in practice.
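The misalignment can be reproduced without a terminal. This is a minimal sketch, assuming single-byte characters; expandTabs is a helper name of my own, and the two strings are the tab-aligned lines from the example above.

```php
<?php
// Hypothetical helper: replace each tab with spaces up to the next tab stop.
function expandTabs(string $line, int $tabStop): string
{
    $out = '';
    for ($i = 0, $n = strlen($line); $i < $n; $i++) {
        if ($line[$i] === "\t") {
            $out .= str_repeat(' ', $tabStop - (strlen($out) % $tabStop));
        } else {
            $out .= $line[$i];
        }
    }
    return $out;
}

// The two lines as stored on disk: tabs for both indentation and alignment.
$first  = "\t\$varname\t= '1234' . aVeryLongFunctionName()";
$second = "\t\t\t\t. 'foo' . otherFunction();";

foreach ([4, 8] as $tabStop) {
    // Zero-based columns of the "=" on the first line and the leading "." on the second.
    $equalsColumn = strpos(expandTabs($first, $tabStop), '=');
    $dotColumn    = strpos(expandTabs($second, $tabStop), '.');
    printf("tab stop %d: '=' at column %d, '.' at column %d\n", $tabStop, $equalsColumn, $dotColumn);
}
// tab stop 4: '=' at column 16, '.' at column 16
// tab stop 8: '=' at column 24, '.' at column 32
```

With 4-character stops the two columns coincide; with 8-character stops they drift 8 columns apart, even though the file has not changed.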

The first response to alignment arguments is generally: “Use tabs for indenting and spaces for alignment.” Let’s try that.

First, a 4-character tab stop indent, followed by spaces for alignment:

function funcName()
{
→   $varname = '1234' . aVeryLongFunctionName()
→   •••••••• . 'foo' . otherFunction();
}

Now, an 8-character tab stop indent, followed by spaces for alignment:

function funcName()
{
→       $varname = '1234' . aVeryLongFunctionName()
→       •••••••• . 'foo' . otherFunction();
}

That looks OK, right? Sure … until a developer, through habit (and we are creatures of habit), hits the tab key for alignment when he should have used spaces. They are both invisible, so the developer won’t notice on his own terminal; it will only be noticed by developers with other tab stop preferences. It is the same problem as before: misalignment under the different tab stop preferences of different developers.

The general response at this point is to modify the tab-oriented style guide to disallow that kind of inter-line alignment. I suppose that is reasonable if we are committed to using tabs, but I find code of that sort to be less readable overall.

Solution: Use Spaces Instead

The solution to these subtle and sophisticated issues, for me and for lots of other PHP developers, is to use spaces for indentation and alignment. All professional text editor software allows what are called “soft tabs” where pressing the tab key inserts a user-defined number of spaces. When using spaces for indentation and alignment, all code looks the same everywhere, and does not mess up alignment under different tab stop preferences of different developers.
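For anyone who has not used soft tabs, here is a rough sketch of the idea. It assumes the editor pads with spaces up to the next indent stop, which is one common behavior; editors differ, and softTab is a hypothetical name rather than any editor's API.

```php
<?php
// Hypothetical helper: the spaces a soft-tab keypress would insert at a given
// zero-based cursor column, assuming the editor pads to the next indent stop.
function softTab(int $column, int $indentWidth = 4): string
{
    return str_repeat(' ', $indentWidth - ($column % $indentWidth));
}

var_dump(strlen(softTab(0))); // int(4): a full indent from the start of the line
var_dump(strlen(softTab(6))); // int(2): just enough to reach column 8
```

Because only spaces ever reach the file, every developer sees the same columns no matter how the tab key is configured.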

Conclusion

I realize this is a point of religious fervor among developers. Even though I have a preference for spaces, I am not a spaces zealot. This post is not evangelism; it is a dissection of the subtle and long-term issues related to tabs-vs-spaces discovered only after years of professional collaboration.

Please feel free to leave comments, criticism, etc. Because this is such a touchy subject, please be especially careful to be civil and maintain a respectful tone in the comments. If you have a very long comment, please consider pinging/tracking this post with a blog entry of your own, instead of commenting directly. I reserve the right to do as I wish with uncivil commentary.

Thanks for reading, all!


Newt Gingrich, How Will You Lead The Nation Back To God?

Gingrich did *not* say the following:

... it is not the job of the President of the United States or in the skill set of the President of the United States to lead the nation back to or toward God. For starters, a lot of Americans don’t believe in God and don’t want to hear about God. I don’t want a President to lead the nation toward meat-eating or vegetarianism or any of the myriad of other choices we make privately in our day-to-day lives.

More importantly, the President doesn’t know how to lead the nation back to God. He is just as likely to lead the nation away from God. There are people who specialize in Godliness–they are called “the clergy.” Having the President get involved in religion or belief in God inevitably crowds out private efforts or has unintended consequences.

In other words, the right answer is that the President of the United States is not the Messiah or a prophet or even a member of the clergy. Leading the nation back to God is the job of private voluntary acts. The President’s job is to leave those alone.

Via Not the Messiah.



What Do Men Want In A Woman?

The obvious Rodney Dangerfield joke aside:

In fact, there is one thing above all that men want from their women: that they are pleasant.

A girlfriend or wife doesn’t have to have the looks of Giselle Bundchen, the homemaking skills of Martha Stewart or the bedroom skills of a professional call girl to make a man happy. All of these would be nice bonuses, but they are not nearly as important as the ability to make a man feel relaxed, content and appreciated. A woman who is mediocre in all of the former attributes can easily make up for it by being a sweet, pleasant person who takes the edge off at home.

via What do Men Want in a Wife, and Why Won’t Wives Oblige? - The Spearhead.


Differences in Packaging Approaches: Aura, Symfony2, and ZF2

Looking from outside both Symfony2 and ZF2 is full of standalone components. But the reality is not the same. Though Symfony2 components are split into each components in github, you cannot give a pull request to that component. The tests for all the components still resides in the core. The same will be applying to ZF2 too. I wrote my concern on the topic in mailing list of Symfony2, if you are interested you can read on it here.

Let's leave the contributions or pull requests, if I / you are trying to integrate Symfony form component to your project or library, we want to bring the tests some how make the necessary changes to the core if I want to make use of Event Dispatcher / Signal or another like Aura or Zend. Try to make how you can render in your view if I am using Aura View or Zend or even in phly_mustache.

So let me tell you the design principle to make them as standalone have some failure. Coming back to Aura, as a small contributor I can see Paul M Jones right decision to make the component library has more to speak. The good thing is Aura has all the tests in one place for the component. For eg: consider Aura Router the routing component. The source lies in src folder , the tests in tests folder.

My biases in posting this should be clear. ;-) Via Is there a design flaw for the Components or Packages made by Symfony2 and ZF2 | harikt.com.


A Civil Rights Victory in Canada

The end of Canada's long gun registry:

Despite spending a whopping $2.7 billion on creating and running a long-gun registry, Canadians never reaped any benefits from the project. The legislation to end the program finally passed the Parliament on Wednesday. Even though the country started registering long guns in 1998, the registry never solved a single murder. Instead it has been an enormous waste of police officers’ time, diverting their efforts from patrolling Canadian streets and doing traditional policing activities.

Keeping and bearing arms is a civil right in a civilized society. Here in the US, our Constitution recognizes that right in the 2nd Amendment. Via Death of a Long-Gun Registry - John R. Lott Jr. & Gary Mauser - National Review Online.



Science and Pseudoscience: "Newton Was An Alchemist"

We can all be both [scientist and pseudoscientist]. Newton was an alchemist.

...

[I]ndeed, the more you know, the more you fall for confirmation bias. Expertise gives you the tools to seek out the confirmations you need to buttress your beliefs.

...

"Science is the belief in the ignorance of the experts", said Richard Feynman. Never rely on the consensus of experts about the future. Experts are worth listening to about the past, but not the future. Futurology is pseudoscience.

...

A theory so flexible it can rationalize any outcome is a pseudoscientific theory.

[S]cience as an institution is and always has been plagued by the temptations of confirmation bias. With alarming ease it morphs into pseudoscience even – perhaps especially – in the hands of elite experts and especially when predicting the future and when there’s lavish funding at stake. It needs heretics.

Via Matt Ridley: Scientific Heresy.


Trials and Errors: Why Science Is Failing Us

This assumption--that understanding a system’s constituent parts means we also understand the causes within the system--is not limited to the pharmaceutical industry or even to biology. It defines modern science.

...

The problem with this assumption, however, is that causes are a strange kind of knowledge. This was first pointed out by David Hume, the 18th-century Scottish philosopher. Hume realized that, although people talk about causes as if they are real facts--tangible things that can be discovered--they’re actually not at all factual. Instead, Hume said, every cause is just a slippery story, a catchy conjecture, a “lively conception produced by habit.” When an apple falls from a tree, the cause is obvious: gravity. Hume’s skeptical insight was that we don’t see gravity--we see only an object tugged toward the earth. We look at X and then at Y, and invent a story about what happened in between. We can measure facts, but a cause is not a fact--it’s a fiction that helps us make sense of facts.

The truth is, our stories about causation are shadowed by all sorts of mental shortcuts. Most of the time, these shortcuts work well enough. They allow us to hit fastballs, discover the law of gravity, and design wondrous technologies. However, when it comes to reasoning about complex systems--say, the human body--these shortcuts go from being slickly efficient to outright misleading.

...

While correlations help us track the relationship between independent measurements, such as the link between smoking and cancer, they are much less effective at making sense of systems in which the variables cannot be isolated. Such situations require that we understand every interaction before we can reliably understand any of them.

The trouble with science is that people are the ones doing it. Any time anyone tells you "it's science!" you need to replace that, mentally, with "it's *scientists*" -- especially when political policy is involved. Via Trials and Errors: Why Science Is Failing Us | Wired Magazine | Wired.com.


Return-on-Investment of Lobbying Greater Than Entrepreneurship

Wall Street can do math, and the math looks like this: Wall Street + Washington = Wild Profitability. Free enterprise? Entrepreneurship? Starting a business making and selling stuff behind some grimy little storefront? You’d have to be a fool. Better to invest in political favors.

...

Wall Street wants an administration and a Congress -- and a country -- that believes what is good for Wall Street is good for America, whether that is true or isn’t. Wall Street doesn’t want free markets -- it wants friends, favors, and fealty.

...

If you don’t think that the government can just arbitrarily rewrite the bankruptcy rules to suit its political preferences, revisit the General Motors bailout, when it did just that, shortchanging bondholders in favor of the union goons who act as Democratic footsoldiers and dues-collectors.

There's a word for this; it's called "rent-seeking." Also, replace "Wall Street" with "Hollywood" (or big media companies) and you have the same thing. Via Repo Men - National Review Online.