Is PHP here to stay?

As a LAMP developer, I am starting to question the long-term viability of PHP. PHP was born during an era when knowing HTML was a valid and valuable resume bullet. Because of this, most of the “advanced” aspects of PHP, which relate to its OOP functionality, were introduced only after PHP 4/5, and weakly at that. Additionally, new languages have since become popular that expose the weaknesses of PHP. Don’t get me wrong, I am very supportive of PHP. I just believe that it’s important that people understand both the strengths and weaknesses of the tools they use. There are two main points I want to cover:

  1. PHP thread support is weak
  2. PHP OOP = Broken

The second point is rather technical, but it closely relates to another strength and weakness of PHP: it is loosely typed. More on that later.

Thread Support is Weak

True threading support in PHP does not exist. The closest thing is the pcntl_fork() function, which copies the current process rather than creating a thread. This means asynchronous processing within a single process is not supported. Threading is useful in event-driven architectures (common in JavaScript) or when doing blocking operations such as network calls.

Because the forked process is a clone of the original, it inherits all of the original’s open resources, including database connections and file handles. This means that the forked process must be aware of whether it is a child or not, and must be careful not to modify or close these resources. This encourages spaghetti code that contains large logic forks (“if I am not a clone, else…”). Because of this, forking is messy and error prone. This gets further complicated when PHP is executed by Apache in a web environment. In fact, the PHP manual advises avoiding forking with web servers:

Process Control should not be enabled within a web server environment and unexpected results may happen if any Process Control functions are used within a web server environment.

Not to mention the method is incredibly C-like in that it is very “raw” (unlike other native PHP methods/classes). This increases the barrier to entry significantly, which ultimately serves to have the feature ignored by most shops.

Why is all of this important? Well, at most companies, one language is selected for all in-house development. This is because cross training and hiring is simplified if everybody speaks the same language. There are a few common tasks that are unnecessarily difficult to do in PHP:

  • Asynchronous work: handing off work, such as connecting to a remote server, to a child and waiting for a response
  • Managing thread pools: this sort of work requires significant “by-hand” management of any processes spawned by the parent via pcntl_fork

The threading issue is only a pain point for processes that need to be parallelized. Most big shops either live with that pain or introduce other languages to help solve it.
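
For contrast, here is a minimal sketch of those two tasks in Java, the language used for comparison later in this post. It is only an illustration under assumptions: fetch() is a hypothetical stand-in for a blocking network call (the sleep simulates latency), the host names are made up, and the pool size of 4 is arbitrary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class AsyncDemo {
    // Hypothetical stand-in for a blocking remote call; the sleep simulates latency.
    static String fetch(String host) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "response from " + host;
    }

    // Hands each host to a worker thread, then waits for every response in order.
    static List<String> fetchAll(List<String> hosts) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // the pool manages its own threads
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String host : hosts) {
                Callable<String> task = () -> fetch(host);
                pending.add(pool.submit(task)); // returns immediately
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                try {
                    results.add(future.get()); // blocks until that task is done
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException("task failed", e);
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(Arrays.asList("a.example", "b.example")));
    }
}
```

The workers share the parent's memory and are cleaned up by the pool, so there is no "am I the child?" branching of the kind that pcntl_fork requires.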

PHP OOP = Broken

Because of the loosely typed nature of PHP, true, well-formed object oriented programming is broken. I know that for many PHP programmers, “Object Oriented” means putting together classes and reusing code as objects. However, that is truly, sincerely, only a portion of the point of OOP. Some of the most powerful aspects of OOP are lost in PHP’s implementation of the concept. Don’t get me wrong: these decisions were probably the right fit for the niche PHP was filling, but I don’t believe most PHP programmers are fully aware of what they are missing.

While the language, thankfully, has interfaces and abstract classes, they are woefully underused. This is, in part, due to the developer community being largely self-taught. This creates a misconception about the nature of OOP, which ultimately leads to the devaluation of the most important feature of OOP: interfaces.

I can go into why they are so important in another article, but the point is: without interfaces, true polymorphic code is impossible. Or, rather, extremely susceptible to spaghetti code and fatal errors.

In other languages (Java), code might look like this:

interface Animal { void makeSound(); }
void farm(Animal cat, Animal dog, Animal parrot) {
  cat.makeSound();
  dog.makeSound();
  // Note: Parrot class *DOES* have a method called moveAround()
  parrot.moveAround(); // ERROR! Animal does not declare moveAround()
}

The interface in this example defines a uniform way to access a class through a standardized API (thus the name, application programming interface). In a strongly typed language where all variables must have a declared type, the cat variable is declared as an implementation of Animal. This both permits and enforces the makeSound() call. If Cat has a meow() method and Dog has a woof() method, neither can be called here without a compiler error. This is because, in this function, the cat, dog, and parrot variables are each declared as Animal (rather than as Cat, Dog, or Parrot). As such, only Animal methods work here.

More importantly, because the compiler does this type checking, any invalid call, such as the last one, would fail to compile. Even if the Parrot class has a moveAround() method, it cannot be called in the code above. This is an extremely important aspect of OOP since, as the definer of the Animal interface, I want to make it very specific how Animals should be treated (you can only makeSound!). If a programmer tries to do something to an Animal that I haven’t defined, they get an error. If they wanted to make that last line work, they would need to use object typecasting:

void farm(Animal cat, Animal dog, Animal parrot) {
  ...
  ((Parrot) parrot).moveAround();
}

Or by changing the function definition:

void farm(Animal cat, Animal dog, Parrot parrot) {
  ...
  parrot.moveAround();
}

But note that in this case, the developer had to make an explicit choice to stop using Animal’s interface. Yes, parrot is still an Animal, but it doesn’t have to be. This, in short, helps prevent spaghetti code because it forces developers to think about whether or not they want to deviate from a particular interface. Realistically, if presented with these alternatives, a Java programmer would probably use other abstraction techniques (e.g., dependency injection) to avoid needing either one. However, this example was necessary to illustrate how things are done in PHP.
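
Pulled together, the Java fragments above might look like the following complete program. This is a sketch under assumptions: the Cat, Dog, and Parrot bodies are invented for illustration, makeSound() returns a String instead of void purely so the result is easy to inspect, and the cast is guarded with instanceof because an unguarded cast of a non-Parrot would throw a ClassCastException at runtime.

```java
class FarmDemo {
    // The only contract callers of farm() may rely on.
    interface Animal {
        String makeSound();
    }

    static class Cat implements Animal {
        public String makeSound() { return "Meow"; }
    }

    static class Dog implements Animal {
        public String makeSound() { return "Woof"; }
    }

    static class Parrot implements Animal {
        public String makeSound() { return "Squawk"; }
        // Not part of Animal, so invisible through an Animal reference.
        public String moveAround() { return "Parrot hops along the perch"; }
    }

    static String farm(Animal cat, Animal dog, Animal parrot) {
        StringBuilder out = new StringBuilder();
        out.append(cat.makeSound()).append('\n');
        out.append(dog.makeSound()).append('\n');
        // parrot.moveAround();  // would not compile: Animal has no moveAround()
        if (parrot instanceof Parrot) {
            // Explicit, visible decision to step outside the Animal interface.
            out.append(((Parrot) parrot).moveAround()).append('\n');
        } else {
            out.append(parrot.makeSound()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(farm(new Cat(), new Dog(), new Parrot()));
    }
}
```

Uncommenting the parrot.moveAround() line is enough to make the compiler reject the whole file, which is the behavior the PHP version below does not have.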

So how would this look in PHP? Why isn’t this the same there? Well, take a look at the following code that, unlike the Java example, works perfectly fine and raises no red flags.

interface Animal { function makeSound(); }
function farm(Animal $cat, Animal $dog, Animal $parrot) {
  $cat->makeSound();
  $dog->makeSound();
  $parrot->moveAround(); // WORKS FINE
}

This code works great. We have three arguments all forced to implement the Animal interface. Great. To a casual observer, there is really, truly, nothing wrong with this code. It’s a little strange, but if it’s commonly known that Parrots can moveAround(), there is no problem. In fact, in most PHP shops, I will bet money that type hinting is NOT used. This will further illustrate how bad the spaghetti is about to get (read on).

Now imagine in six months if we decide we wanted to group up this code so that it uses a single array/collection as an argument. This is where things would look like traditional polymorphic code. I mentioned spaghetti above. Let me show you why:

interface Animal { function makeSound(); }
function farm(array $animals) { // note, we can't guarantee what's inside of this array
  foreach($animals as $animal) {
    if($animal instanceof Parrot) { // or maybe a method_exists() call?
      $animal->moveAround(); // SPAGHETTI
    }
    else {
      $animal->makeSound(); // Hope for no fatal errors!
    }
  }
}

Wow, look at what we just did. A harmless piece of code in PHP six months ago completely breaks when you try to refactor it to use a fairly typical design pattern. More importantly, unless I put in even MORE code to do type checking, there’s a chance that the makeSound() line will actually die in a fatal error if, for example, a string is passed in as an element of the argument array! See example without Parrots:

interface Animal { function makeSound(); }
function farm(array $animals) { // note, we can't guarantee what's inside of this array
  foreach($animals as $animal) {
    $animal->makeSound(); // Hope for no fatal errors!?
  }
}
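
For comparison, here is a rough Java sketch of the same "pass a collection" refactor. The Cat and Parrot classes are hypothetical, and makeSound() returns a String only to make the result easy to inspect. Because the parameter is declared as List<Animal>, passing a list of strings is a compile error (barring raw types or unchecked casts), so the loop body needs no is_object()-style guard.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TypedFarmDemo {
    interface Animal {
        String makeSound();
    }

    static class Cat implements Animal {
        public String makeSound() { return "Meow"; }
    }

    static class Parrot implements Animal {
        public String makeSound() { return "Squawk"; }
    }

    // The element type is part of the signature, so every element is known
    // to have makeSound() before the program ever runs.
    static List<String> farm(List<Animal> animals) {
        List<String> sounds = new ArrayList<>();
        for (Animal animal : animals) {
            sounds.add(animal.makeSound());
        }
        return sounds;
    }

    public static void main(String[] args) {
        List<Animal> animals = Arrays.asList(new Cat(), new Parrot());
        System.out.println(farm(animals)); // prints [Meow, Squawk]
        // farm(Arrays.asList("not an animal")); // would not compile
    }
}
```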

PHP is extremely flexible when it comes to hacking out a page, but when it comes to OOP, it’s about as brittle as you get. Refactoring is painful and error prone, and elegant design patterns like the ones you might see in a message-passing language such as Objective-C, Scala, or Erlang don’t work. Remember that by using functions such as method_exists() and is_object(), I can emulate the desired behavior; however, the extra code means more places for bugs and less time spent making the program do what you want it to do. The point is that the OOP constructs in PHP don’t fully work. As a result, certain very important aspects of OOP don’t translate very well to PHP.

Some people may still cling to the notion that “ultimately, you can still do it, it just requires more code!” But I argue that preventing “more code” is the exact reason why OOP was invented. By writing more boilerplate error checking code, we are wasting time. The issue is exacerbated by the fact that the error checking code isn’t required, unlike, say, if you were throwing exceptions. It isn’t immediately obvious in that last example that you need to run an is_object() check on the $animal variable. It’s these types of oversights that really damage PHP as the code base gets larger.

Conclusion

What I’m realizing is that PHP isn’t meant to scale. Yes, it can take a lot of web traffic, but that’s not what I mean. I’m talking about scaling in the sense of growing team size and code base. The design of the language promotes coding paradigms that ultimately damage the code base. This is because PHP makes it harder to use good OOP practices on legacy code. To illustrate:

  • PHP became popular because it is easy to hack things out, even if that something required doing it the “wrong” way. These problems come back and bite you when the code base grows.
  • PHP can’t support a large development team as effectively because its weak typing allows for sidestepping certain core OOP principles (see above)
  • PHP allows for invisible future bugs (see above) to be inserted without any immediate cause for alarm
  • As applications get complex and require threading or distributing of processes, PHP fails to keep up (so other languages get used)
  • Because PHP checks method calls only at runtime, and offers no message-passing style fallback unless a class opts in through __call(), calling a missing method causes runtime FATAL ERRORS (unacceptable and very hard to debug!)

All of this makes me rethink the popularity of PHP. There are some new languages, still in their infancy, that pose a threat to PHP’s current dominance. I believe that in the next few years, as today’s systems become “legacy,” today’s newcomers will finally be production ready. At that point, we might see companies adopt the newer languages, which will support more modern programming paradigms. We are seeing this today with Ruby, for example.

Of course, I could be wrong. I once told people that PHP was “C of the web.” It’s possible it’s here to stay forever, despite all of its flaws. And, for the record: I do not believe Python or Ruby will be the language that will overtake PHP, but that’s for another post.

I just want everybody to know that I am a PHP developer, so I speak from experience. We should recognize that technology changes and evolves, and it is important that we constantly update our skills to ensure they don’t become obsolete. I’m just pointing out that perhaps PHP isn’t as timeless as C (or, possibly, Java).

Lastly, I will plug my personal belief that being “religious” about a language because it is “the best” is shortsighted. New languages are born, literally, every week. It’s only a matter of time before a language comes along that does what your language does more elegantly, faster, and with less code.

Only time will tell. :)

  • http://pulse.yahoo.com/_3ZB5ZOAWTDHDDFELGTUN23CVUI Avery Monkisno

    Sounds like idiot. Before copying those nonsense Java example code from a dummy book, try to write a few thousand lines yourself. You can’t be a good programmer by just talking about it. I don’t think you know what you are talking about.

  • http://www.facebook.com/people/Danny-Lieberman/630534688 Danny Lieberman

    On the polymorphism thing – I know it’s a standard interview question…but Read Steve Yegge on the topic
    http://sites.google.com/site/steveyegge2/when-polymorphism-fails

  • http://www.facebook.com/people/Danny-Lieberman/630534688 Danny Lieberman

    Michi

    Interesting and thoughtful article – you are mistaken (imho) on some of your points but I definitely feel your “pain” and I know it is real

    Let’s start with the points I disagree with:

    1) Operating system threads are a bad thing. They are very, very slow and not very stable.

    2) OOP – you admit yourself that most PHP programmers are self-taught and don’t understand object orientation. If you do – you can develop OO in PHP with the same concepts in Java and indeed all the big frameworks (cake, yii, elgg all use OOP)

    3) Polymorphism – this is another bad idea, good in textbooks and poor in practice.

    4) Frameworks – the article does not consider the possibility that just like in Microsoft Visual Studio, PHP developers use frameworks like Symfony, Cake, yii and igniter.

    Now for the “pain”.

    The current rich Web 2.0 application development and execution model is broken.

    Yes it is.
    Consider that a Web 2.0 application has to serve browsers and smart phones. It’s based on a heterogeneous server stack with 5-7 layers (database, database connectors, middleware, scripting languages like PHP, Java and C#, application servers, web servers, caching servers and proxy servers). On the client-side there is an additional heterogeneous stack of HTML, XML, Javascript, CSS and Flash.

    On the server-side, we have

    2-5 languages (PHP, SQL, tcsh, Java, C/C++, PL/SQL)
    Lots of interface methods (hidden fields, query strings, JSON)
    Server-side database management (MySQL, MS SQL Server, Oracle, PostgreSQL)

    On the client side, we have

    2-5 languages (Javascript, XML, HTML, CSS, Java, ActionScript)
    Lots of interface methods (hidden fields, query strings, JSON)
    Local data storage – often duplicating session and application data stored on the server data tier.

    A minimum of 2 languages on the server side (PHP, SQL) and 3 on the client side (Javascript, HTML, CSS) turns developers into frequent searchers for answers on the Internet (many of which are incorrect), driving up the frequency of software defects relative to a single language development platform where the development team has a better chance of attaining maturity and proficiency. More bugs means more security vulnerabilities.

    Back end database servers interfaced to front end scripting languages like C# and PHP come with built-in vulnerabilities to attacks on the data tier via the interface.

    But the biggest vulnerability of rich Web 2.0 applications is that message passing is performed in the UI in clear text – literally inviting exploits and data leakage.

    The multiple interfaces, clear text message passing and the lack of a solid understanding of how the application will actually work in the wild guarantee that SQL injection, Web server exploits, JSON exploits, CSS exploits and application design flaws that enable attackers to steal data will continue to star in today’s headlines.

    Passing messages between remote processes on the UI is a really bad idea, but the entire rich Web 2.0 execution model is based on this really bad idea.

    More on my post – http://www.software.co.il/wordpress/2010/12/why-rich-web-2-0-may-break-the-cloud/

  • http://www.michikono.com Michi

    Sid, good point. On the subject of Symfony, a much better example is Yahoo. They’re pretty much all-in when it comes to PHP.

  • Sid

    Well, Facebook is written in PHP.

    But setting that aside

    “I’m talking about scaling in the sense of growing team size and code base.”

    Using something like Symfony Framework, it’s definitely scalable in terms of growing team size and code base.

  • http://www.michikono.com Michi

    Thanks for the plug! ;)

  • http://www.sharpdeveloper.net Sameer Alibhai