Is PHP here to stay?

As a LAMP developer, I am starting to question the long-term viability of PHP. PHP was born during an era when knowing HTML was a valid and valuable resume bullet. Because of this, most of the “advanced” aspects of PHP, which relate to its OOP functionality, were introduced only with PHP 4/5, and weakly at that. Additionally, newer languages have since become popular that expose the weaknesses of PHP. Don’t get me wrong, I am very supportive of PHP. I just believe that it’s important that people understand both the strengths and weaknesses of the tools they use. There are two main points I want to cover:

  1. PHP thread support is weak
  2. PHP OOP = Broken

The second point is rather technical, but it closely relates to another strength and weakness of PHP: it is loosely typed. More on that later.

Thread Support is Weak

True threading support in PHP does not exist. The closest thing is the pcntl_fork() function, which copies the current process rather than creating a thread. This means asynchronous processing within a single process is not supported. Threading is useful in event-driven architectures (common in JavaScript) or when doing blocking operations such as network calls.

Because the forked process is a clone of the original, it shares all of the original resources, including database and file resources. This means that the forked process must be self-aware of whether it is a child or not, and must be careful not to modify or close these resources. This encourages spaghetti code that contains large logic forks (“if I am not a clone, else…”). Because of this, forking is messy and error prone. This gets further complicated when PHP is executed by Apache in a web environment. In fact, the PHP manual advises avoiding forking with web servers:

Process Control should not be enabled within a web server environment and unexpected results may happen if any Process Control functions are used within a web server environment.

Not to mention that the function is incredibly C-like in that it is very “raw” (unlike other native PHP functions and classes). This raises the barrier to entry significantly, which ultimately leads most shops to ignore the feature.
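To give a feel for how raw this is, here is a minimal sketch of the fork-and-wait dance. It assumes the pcntl extension is loaded and that the script runs from the command line; runInChild() is a name I made up for illustration, not something PHP provides.

```php
<?php
// Sketch only: needs the pcntl extension and the command-line SAPI.
// runInChild() is a hypothetical helper, not part of PHP.
function runInChild(callable $work): int
{
    $pid = pcntl_fork();

    if ($pid === -1) {
        throw new RuntimeException('Could not fork');
    }

    if ($pid === 0) {
        // Child branch: a copy of the parent, including its open handles.
        // Do the work, then exit right away so the child never falls
        // through into the parent's code path.
        exit((int) $work());
    }

    // Parent branch: block until this child finishes, then read its exit code.
    pcntl_waitpid($pid, $status);

    return pcntl_wexitstatus($status);
}
```

Everything after the pcntl_fork() call runs in two processes at once, which is exactly where the “am I the child?” branching comes from. The only thing that comes back to the parent here is an exit code; anything richer needs pipes, sockets, or shared storage that you wire up by hand.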

Why is all of this important? Well, at most companies, one language is selected for all in-house development. This is because cross training and hiring is simplified if everybody speaks the same language. There are a few common tasks that are unnecessarily difficult to do in PHP:

  • Asynchronous work: handing off work, such as connecting to a remote server, to a child and waiting for a response
  • Managing thread pools: this sort of work requires significant “by-hand” management of any processes spawned by the parent via pcntl_fork()

The threading issue is only a pain point that impacts processes that need to become parallelized. It is a pain most big shops live with, or, alternatively, introduce other languages to help solve.

PHP OOP = Broken

Because of the loosely typed nature of PHP, true, well-formed object oriented programming is broken. I know that for many PHP programmers, “Object Oriented” means putting together classes and reusing code as objects. However, that is truly, sincerely, only a portion of the point of OOP. Some of the most powerful aspects of OOP are lost in PHP’s implementation of the concept. Don’t get me wrong: these decisions were probably the right fit for the niche PHP was filling, but I don’t believe most PHP programmers are fully aware of what they are missing.

While the language, thankfully, has interfaces and abstract classes, they are woefully underused. This is, in part, due to the developer community being largely self-taught. This creates a misconception about the nature of OOP, which ultimately leads to the devaluation of the most important feature of OOP: interfaces.

I can go into why they are so important in another article, but the point is: without interfaces, true polymorphic code is impossible. Or, rather, extremely susceptible to spaghetti code and fatal errors.

In other languages (Java), code might look like this:

interface Animal { void makeSound(); }
void farm(Animal cat, Animal dog, Animal parrot) {
  cat.makeSound();
  dog.makeSound();
  // Note: Parrot class *DOES* have a method called moveAround()
  parrot.moveAround(); // ERROR!
}

The interface in this example defines a uniform way to access a class through a standardized API (thus the name, application programming interface). In a statically typed language, where every variable has a declared type, the cat variable is declared as an implementation of Animal. This both allows and enforces the makeSound() method call. If cat has a meow() method and dog has a woof() method, they cannot be called here without a compiler error. This is because, inside this function, every argument (parrot included) is declared as an Animal rather than as a Dog, Cat, or Parrot. As such, only Animal methods work here.

More importantly, because the compiler does this type checking, any invalid calls, such as the last one, would error and never compile. Even if the Parrot class has a moveAround() method, it can not be called in the code above. This is an extremely important aspect of OOP since, as a definer of the Animal class, I want to make it very specific how Animals should be treated (you can only makeSound!). If a programmer tries to do something to an Animal that I haven’t defined, they get an error. If they wanted to make that last line work, they would need to use object typecasting:

void farm(Animal cat, Animal dog, Animal parrot) {
  ...
  ((Parrot) parrot).moveAround();
}

Or by changing the function definition:

void farm(Animal cat, Animal dog, Parrot parrot) {
  ...
  parrot.moveAround();
}

But note that in this case, the user had to make an explicit choice to stop using Animal’s interface. Yes, parrot is still an Animal, but it doesn’t have to be. This, in short, helps prevent spaghetti code because it forces the developers to think about whether or not they want to deviate from a particular interface. Realistically, if presented with these alternatives, a Java programmer would probably use other types of abstraction techniques (e.g., dependency injection)  to keep this method from needing to be used. However, this example was necessary to illustrate how things are done in PHP.

So how would this look in PHP? Why isn’t this the same there? Well, take a look at the following code that, unlike the Java example, works perfectly fine and raises no red flags.

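interface Animal { function makeSound(); }
function farm(Animal $cat, Animal $dog, Animal $parrot) {
  // PHP calls methods with ->; the . operator is string concatenation
  $cat->makeSound();
  $dog->makeSound();
  $parrot->moveAround(); // WORKS FINE (as long as a Parrot is passed in)
}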

This code works great. We have three arguments all forced to use the Animal interface. Great. To a casual observer, there is really, truly, nothing wrong with this code. It’s a little strange, but if it’s commonly known that Parrots can moveAround(), there is no problem. In fact, I will bet money that most PHP shops do NOT use type hinting at all. This will further illustrate how bad the spaghetti is about to get (read on).
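Here is a runnable sketch of the same point, assuming PHP 7 or later. The Cat and Parrot classes and the describe() function are hypothetical, and the methods return strings instead of printing so the result is easy to inspect.

```php
<?php
interface Animal
{
    public function makeSound(): string;
}

class Cat implements Animal
{
    public function makeSound(): string
    {
        return 'Meow';
    }
}

class Parrot implements Animal
{
    public function makeSound(): string
    {
        return 'Squawk';
    }

    // Not part of the Animal interface.
    public function moveAround(): string
    {
        return 'Flap';
    }
}

function describe(Animal $animal): string
{
    // The Animal hint is only checked when the function is entered.
    // Inside the body, PHP will attempt any method call on the object.
    return $animal->makeSound() . ' ' . $animal->moveAround();
}
```

Passing a Parrot works. Passing a Cat satisfies the Animal hint just the same, and only fails once moveAround() is reached at runtime.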

Now imagine in six months if we decide we wanted to group up this code so that it uses a single array/collection as an argument. This is where things would look like traditional polymorphic code. I mentioned spaghetti above. Let me show you why:

interface Animal { function makeSound(); }
function farm(array $animals) { // note, we can't guarantee what's inside of this array
  foreach ($animals as $animal) {
    if ($animal instanceof Parrot) { // or maybe a method_exists() call?
      $animal->moveAround(); // SPAGHETTI
    }
    else {
      $animal->makeSound(); // Hope for no fatal errors!
    }
  }
}

Wow, look at what we just did. A harmless piece of code in PHP six months ago completely breaks when you try to refactor it to use a fairly typical design pattern. More importantly, unless I put in even MORE code to do type checking, there’s a chance that the makeSound() line will actually die in a fatal error if, for example, a string is passed in as an element of the argument array! See example without Parrots:

interface Animal { function makeSound(); }
function farm(array $animals) { // note, we can't guarantee what's inside of this array
  foreach ($animals as $animal) {
    $animal->makeSound(); // Hope for no fatal errors!?
  }
}

PHP is extremely flexible when it comes to hacking out a page, but when it comes to OOP, it’s about as brittle as you get. Refactoring is painful and error prone, and elegant design patterns like the ones you might see in a message-passing language such as Objective-C, Scala, or Erlang don’t work. Remember that by using functions such as method_exists() and is_object(), I can emulate the desired behavior; however, the extra code means more places for bugs and less time spent making the program do what you want it to do. The point is that the OOP constructs in PHP don’t fully work. As a result, certain very important aspects of OOP don’t translate very well to PHP.

Some people may still cling on to the notion that “ultimately, you can still do it, it just requires more code!” But I argue that preventing “more code” is the exact reason why OOP was invented. By writing more boiler plate error checking code, we are wasting time. The issue is exacerbated by the fact that the error checking code isn’t required, unlike say, if you were throwing exceptions. It isn’t immediately obvious in that last example that you need to do error checking for is_object() on the $animal variable. It’s these types of oversights that really damage PHP as the code base gets larger.
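Here is roughly what that optional checking looks like. This is a sketch assuming PHP 7 or later; Dog is a hypothetical class, and farm() returns the sounds as an array so the behavior is easy to verify.

```php
<?php
interface Animal
{
    public function makeSound(): string;
}

class Dog implements Animal
{
    public function makeSound(): string
    {
        return 'Woof';
    }
}

function farm(array $animals): array
{
    $sounds = [];

    foreach ($animals as $animal) {
        // Nothing forces this check. Leave it out, and a stray string in
        // the array turns into a failed method call at runtime.
        if (!($animal instanceof Animal)) {
            continue;
        }

        $sounds[] = $animal->makeSound();
    }

    return $sounds;
}
```

With the guard in place, farm([new Dog(), 'not an animal', 42]) quietly skips the bad elements. Whether skipping is even the right behavior is yet another decision the language leaves to whoever remembers to make it.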

Conclusion

What I’m realizing is that PHP isn’t meant to scale. Yes, it can take a lot of web traffic, but that’s not what I mean. I’m talking about scaling in the sense of growing team size and code base. The design of the language promotes coding paradigms that ultimately damage the code base. This is because PHP makes it harder to use good OOP practices on legacy code. To illustrate:

  • PHP became popular because it is easy to hack things out, even if that something required doing it the “wrong” way. These problems come back and bite you when the code base grows.
  • PHP can’t support a large development team as effectively because its weak typing allows for sidestepping certain core OOP principles (see above)
  • PHP  allows for invisible future-bugs (see above) to be inserted without any immediate cause for alarm
  • As applications get complex and require threading or distributing of processes, PHP fails to keep up (so other languages get used)
  • Because PHP only resolves method calls at runtime, with nothing checking them beforehand, calling a method that does not exist causes a runtime FATAL ERROR (unacceptable and very hard to debug!)

All of this makes me rethink the popularity of PHP. There are some new languages, still in their infancy, that pose a threat to PHP’s current dominance. I believe that in the next few years, as today’s systems become “legacy,” today’s newcomers will finally be production ready. At that point, we might see companies adopt the newer languages, which will support more modern programming paradigms. We are seeing this today with Ruby, for example.

Of course, I could be wrong. I once told people that PHP was “C of the web.” It’s possible it’s here to stay forever, despite all of its flaws. And, for the record: I do not believe Python or Ruby will be the language that will overtake PHP, but that’s for another post.

I just want everybody to know that I am a PHP developer, so I speak from experience. We should recognize that technology changes and evolves, and it is important that we constantly update our skills to ensure they don’t become obsolete. I’m just pointing out that perhaps PHP isn’t as timeless as C (or, possibly, Java).

Lastly, I will plug my personal belief that being “religious” about a language because it is “the best” is short sighted. New languages are born, literally, every week. It’s only a matter of time before a language comes along that does what your language does more elegantly, faster, and with less code.

Only time will tell. :)

The Destruction of the Head Hunting Industry

This is a random thought that just popped in my head.

With information becoming increasingly available, I’ve been thinking that the head hunting business will go through a major destructive phase in the next few years. There are two things the Internet changed:

  • Better distribution of information on job openings
  • Better distribution of information on candidates

Definition: For those of you who are unaware, head hunters are professionals that search for employees and pair them up with open positions in companies. In a typical scenario, a company will pay a recruiter (head hunter) a fee that equates to 2-3 months of that employee’s yearly salary. Companies pay this because recruiting employees is expensive. I’ve done a lot of hiring in the last few years, and I know how time consuming it is to review hundreds of resumes and then interview. A head hunter is basically an outsourced HR department. Additionally, candidates often approach head hunters who re-post job openings in various job boards.

And there’s a third trend that will come based on increasing information available to the public:

  • Automation of job and candidate pairing

A long time ago, I was business partners with a man who was formerly a head hunter. I remember him telling me how wonderful the internet made his job. He told me that when he was my age, recruiting meant shaking a lot of hands, memorizing every face and name you ever met, and storing large piles of business cards. For him, recruiting was now about posting jobs on Craigslist and Monster and referring the candidates. To him, he was still the gatekeeper. These days, anybody can be a headhunter with a little Internet know how.

Figure: head hunter productivity chart. Productivity goes up first, then down (we are in the middle stage now).

However, sites like LinkedIn can change all that. The one true value proposition that head hunters provide is that they serve as match maker. But as more information is available and technology improves, this process should become more and more automated. For example, right now, LinkedIn has job postings. On its own, it’s just a new competitor to Craigslist, but what makes things interesting is that LinkedIn also has the data points to find all of the candidates out there that might fit the job requirements — without anybody lifting a finger.

Right now, the information stream is mono-directional: job postings (and recruiters) broadcast information. The goal is a bi-directional system where seekers fill out their requirements (a.k.a. their resumes) and both sides let the system do the matching. This can only work if both sides have maximum information about the other. Think of it like a dating site for job seekers. It’s a hard problem to solve given the time-sensitive nature of job searches, but it’s an inevitable outcome as more and more information centralizes onto the Internet.

5AM thought of the day.

Q: Hiding JS Files? A: Impossible

In my popular post about hiding your WordPress folder, a reader asked:

Hi Michi, can you help me with this, in the head section i wrote this:

<script src="/style/js/somescripts3.js" type="text/javascript" charset="utf-8">

and when we go to the webpage then right click, it will show:

<script src="content/themes/exampletheme/exampletheme/style/js/somescripts3.js"
  type="text/javascript" charset="utf-8">

can you teach me or show me how to do that, any help highly appreciated, And im so sorry if my english not good.

This question was complicated enough that I thought a new post might make sense.

For clarification, I believe he is asking if it’s possible to put one thing in the source and have the browser see another. This is impossible. Anything that the browser can see, the user can see. There is no way to “show” something different in the source of an HTML file versus what the browser sees (except through obfuscation); however, you can forward things along behind the scenes. You want to create an .htaccess rule that rewrites your requests internally. Note that inside an .htaccess file, the pattern is matched against the path without its leading slash:

RewriteRule ^path/to/thejsfileyouwanttoshow\.js$ /path/to/real/js/file.js [L]

Let me reiterate that you *cannot* hide the content of the JS file. However, you *can* hide the true folder structure of the web server. If you desire to hide your JS contents, the better solution is a JS minifier.

Alternatively, if your goal is to somehow make it harder for somebody to steal your code and you don’t want to use a JS minifier, you could write the JavaScript tag dynamically using another piece of JavaScript. However, ultimately, that level of weak obfuscation won’t protect you from anything since Firebug will quickly expose what’s really going on.

I hope that answers your question.

Rainbow Google and Annoying Google

I launched two more Google parodies: Rainbow Google and Annoying Google!

Rainbow Google is just a pretty demonstration of dynamic stylesheet modification. It was actually extremely hard to code — it took me about 8 hours of JavaScript hell. On another day, I’ll go over how I did it. This site adds a colorful spray of colors to the text on the page (see screenshot).

RainbowGoogle.com

Annoying Google was about 10 minutes of work since it was just a super simple version of Rainbow Google’s code. :) Search queries and results are jumbled so that their letters are in random capitalized states… LiKe ThiS.

AnnoyingGoogle.com

If you have suggestions or ideas, please let me know!

Make Your Blog iPhone Friendly with WPtouch

I installed a plugin today called WPtouch, which provides a custom layout for your blog’s iPhone visitors. I was pleased by how simple the plugin was to install and how gracefully it “just worked.” Normally, I don’t bump plugins, but this one was just too well executed to ignore. It even integrates with AdSense and supports YouTube embeds correctly!

If you have a blog, try it out.

What this site looks like on an iPhone.

Epic Google and Weenie Google

(EDIT: also check out this post for information on Annoying Google and Rainbow Google!)

Hello, I’m here to announce two new websites of mine: Epic Google and Weenie Google. They’re extraordinarily simple ideas. Feel free to mischievously make them your friends’ home pages. :)

Unlike the last idea of mine (Google Loco), I made sure these pointed to domains that didn’t have “Google” in them. I did not enjoy the fact that Google blacklisted my Loco domain (a tip for the rest of you parody makers out there)!

Down But Not Out… Sun on the Other Hand…

For a brief period, the site was down. I was moving to a more permanent host. Special thanks to Brian for hosting my sites all these years. =)

Anyway, yes, I do keep this site in mind. And for any of you paying attention, I hope it’s not the end of the (open source database) world that Oracle bought Sun. I think it’s funny that Oracle just bought Sun for a price that puts MySQL’s value at 1/7th Sun’s value. Maybe instead of buying up MySQL, Sun should have been focusing on their own business strategy. And they did it during the hardest possible economic times. Moronic.

Oh well. As they say, “when the tide goes out, you can see who’s not wearing shorts,” right? I do feel bad for MySQL though. They dodged the Oracle Bullet only to get caught under the Oracle Steamroller.

What Drug Does a Programmer Do?

My coworker just made this one up and I had to share because I love corny humor:

Q: What drug does a programmer do?
A: Hextacy.

On the Web 2.0 Bubble

Everybody, listen. There’s a Web 2.0 bubble right now. I know it’s difficult for some people to acknowledge, and many people may even casually agree with me without actually believing the statement in full. But it’s true, and the quicker you realize this, the better it will be for your pocket books.

Lately, I’ve been doing stock trading, and have come to learn first hand about the energy and commodities bubbles that were slamming the market. And when that thing was going crazy, it helped deflate the banking bubble, which was a direct result of the housing bubble. And in many ways, the housing bubble was a result of the dot-com bubble bursting, as people exited the stock market in search of a new investment. And everybody in the web industry likes to think they are wise to bubbles because they learned their lesson in the dot-com boom. But it is increasingly evident that this is not the case.

An Example Exercise

The problem here is that people are approaching this with the mind set of “this will be somebody else’s problem after I sell it.” It’s important we try to figure out what happens to the eventual owner of the startup.

  1. Take your favorite Web 2.0 company. Decide how much you think that company is worth. $5 million? $10 million? $50 million? $500 million? The sky is the limit!
  2. Imagine now that you are going to trade your life savings for a current minority chunk in the company. If the company doubles up, so does your savings, but if it goes under, your savings are wiped out.
  3. Remember that number you threw up there in step #1? You aren’t allowed to cash out ANY PROFITS until the company is sold to a buyer.
  4. Your startup may not sell until it has reported a yearly net revenue of 10% of your purchase price.

That last part is the key because it effectively stops the hot potato game and forces you to examine if the company is truly viable. Some people would accuse that of being an unfair restriction, but I will show you why this is the key part in understanding why there is a 2.0 bubble.

Defining the Bubble

Let’s take a second to define a bubble:

An investment yields a return, much like a chicken can produce eggs, a savings account produces a yield, and a farm produces crop. This return is not always immediate, and is not always in the same terms as the input. Also, it is almost a law that returns are proportional to risk (some investments have negative returns). But ultimately, it is called an investment because it will (usually and) eventually generate more value than what you put in.

Now consider an investment that does not create a return. Such an example would be the web stocks of the dot-com boom. Back then, fundamentals like earnings, operating margins, and profitability were ignored when evaluating a stock. Companies that bled millions of dollars a year saw their stocks rising at record levels. This is because the investment – the stock – was being traded to somebody else for a profit because that next person believed they could trade it for an even higher profit.

A bubble is defined as a trend where merely owning something long enough to sell it is profitable. It is a giant game of hot potato. Everybody is essentially a middle man between the original owner and the eventual owner — adding to the price tag at every step. Eventually, people wise up and no longer want to trade the hot potato, causing the bubble to burst.

And most importantly, let’s define a bursting bubble:

A bubble is defined as bursting when the value of the traded item reverts to its true market value.

Understanding the Exercise

So let’s talk about the exercise again: if you thought the company was worth a paltry $50M, then your assets are stuck inside that stock until the startup can earn $5M in revenue AND be profitable while doing so. Why did I pick such a restriction? Because those are reasonable things to assume when buying any other type of company. Why would another company offer to buy the startup if it failed to produce respectable revenues?

Given this extremely reasonable reality-check requirement, would you want to tie your personal investment to the startup being able to produce a profit? If the startup you chose has revenues and is profitable, then this article doesn’t apply to you. =)

Speculation Should Still be Grounded on Fundamentals

People aren’t investing for what 2.0 companies are worth today; it’s all about tomorrow. I agree that it is important that tomorrow’s profits are factored into today’s valuations, BUT isn’t this reasoning eerily similar to the reasons people gave for buying over-priced houses and profitless dot-com stocks? Both were purchases made while completely disregarding the fact that the *current* valuation of the item was negative.

But since that day of profitability is so far away into the future, you end up playing a giant game of corporate hot potato. Most people would agree that a profit of 10% is far better than a loss of 90%. So as soon as you find a sucker to pay 10% more than you paid, you bail. And of course that guy who bought your stake is thinking the exact same thing — sell this to somebody else for 10% profit before something bad happens. That’s a bubble, my friends.

In Conclusion

In 2001, the bubble was all about going IPO so that the general public could hold the hot potato.

Today, the bubble is all about selling to a big corporate entity that will hold the hot potato.

There is no difference.

If you are currently thinking about entering the 2.0 scene, think carefully about what your end goal is. If it isn’t “to be profitable”, then it’s likely just another bubble startup that will become completely worthless once the bubble pops. And believe me: given our current economy, that bubble is going to pop in the next year or two.

Finally, an extremely interesting speech about bubbles given in 2006 (gets good around part 2):

Part 1

Part 2

Part 3

Part 4

Part 5

Part 6

The Basics on Using Models and Controllers in PHP

Today I want to talk about passing objects as arguments to PHP methods. Many PHP developers do not have the patience for this, which is obvious when you compare libraries written for Java with those written for PHP. It is a horridly underused programming style in PHP, and since PHP supports type hinting on method arguments, I thought it would be good to go over this particular style of development.

First of all, let’s start with the typical “PHP” way of doing a few mundane tasks (this will seem extremely familiar):

function login($username, $password, $remember);

function processMail($to, $from, $subject, $body);

function editPost($id, $title, $body, $newTime);

Of course, good developers would make sure these types of examples are part of a class:

$user = new User;
$user->username = $_POST['username'];
$user->password = $_POST['password'];
$user->login();

The examples can continue, and I hope that you good developers use this type of basic OOP development when using PHP. :) But there are cases where the standard “newbie” OOP model seemingly falls apart. It quickly breaks down when new requirements are added to the system. For example, what if we want to do a “remember me” option in the login? What about logging in as an administrator versus a regular user? Okay, now what if we have different login session lengths depending on the user type? How long do you think that login function will be? Think about how it will accommodate IP bans, login logging, banned users, suspended users, max failed attempt lockouts, etc. The list goes on, and depending on your implementation, things can get very ugly. Your login function might become a huge bloated monster sitting in your User class.

The problem is that what you’re actually doing is mixing a data model [user data] with controller logic [how to login using the user data]. The solution is to separate these two entities into two classes, which is what you would see in most modern MVC frameworks.

Here is the seemingly more complex, but far more elegant solution:

$authenticationController = new UserAuthenticationController;
$user = new User;
$user->username = $_POST['username'];
$user->password = $_POST['password'];
$user->rememberMe = true;
$authenticationController->login($user);

The prototype for the login method would look like this:

function login(User $userObject);
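For concreteness, here is a minimal runnable sketch of that split, assuming PHP 7 or later. The credential check is a placeholder of mine so the example runs; a real controller would look the user up and verify a password hash.

```php
<?php
// Data model: holds user fields and nothing else.
class User
{
    public $username = '';
    public $password = '';
    public $rememberMe = false;
}

// Controller: knows how to log a User in.
class UserAuthenticationController
{
    public function login(User $userObject): bool
    {
        // Placeholder check (an assumption for this sketch only).
        return $userObject->username !== '' && $userObject->password !== '';
    }
}
```

Because the argument is hinted as User, handing login() a bare string or an array fails immediately with a TypeError instead of wandering deeper into the method.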

In this example, I am hoping to show you possibilities. First, notice that the arguments for login() are down to one. But the more interesting part of the implementation comes with the proper abstraction between the User data and the authentication process. In my example, I just logged in a regular user. So if I had to map out my class structure, it might look like this:

(Abstract class) AuthenticationController
=> GuestAuthenticationController
  => UserAuthenticationController
    => AdminAuthenticationController

The old way of doing things would look something like this:

function adminLogin($username, $password, $remember);

$user->adminLogin();

But using the Java-esque model, we’d end up with something like this:

$authenticationController = new AdminAuthenticationController;
$user = new User;
$user->username = $_POST['username'];
$user->password = $_POST['password'];
$user->rememberMe = true;
$authenticationController->login($user);

This means the login method is likely broken up into a few pieces inside the AuthenticationController. The Guest user’s login() method would always return false. The UserAuthenticationController would piggyback on AuthenticationController::login(), taking the User::rememberMe variable into account. But the AdminAuthenticationController doesn’t allow people’s logins to be remembered for security reasons, so it ignores that variable. And in that crazy case where there is nothing different about the admin login, the method would remain untouched (inherited from the parent), but any other changes (such as session length) would still kick in for the admin user with no additional coding.
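One way that hierarchy could be laid out, as a sketch under my own assumptions (PHP 7 or later). The sessionLength() method, the durations, and the placeholder credential check are all hypothetical; only the class names and the rememberMe field come from the examples above.

```php
<?php
class User
{
    public $username = '';
    public $password = '';
    public $rememberMe = false;
}

abstract class AuthenticationController
{
    // Shared login flow; subclasses override only the pieces that differ.
    public function login(User $user): bool
    {
        return $user->username !== '' && $user->password !== ''; // placeholder
    }

    // Hypothetical hook: session lifetime in seconds.
    public function sessionLength(User $user): int
    {
        return 1800;
    }
}

class GuestAuthenticationController extends AuthenticationController
{
    public function login(User $user): bool
    {
        return false; // guests never log in
    }
}

class UserAuthenticationController extends GuestAuthenticationController
{
    public function login(User $user): bool
    {
        // Skip the guest override and reuse the base login flow.
        return AuthenticationController::login($user);
    }

    public function sessionLength(User $user): int
    {
        // Remembered logins last 30 days instead of 30 minutes.
        return $user->rememberMe ? 2592000 : parent::sessionLength($user);
    }
}

class AdminAuthenticationController extends UserAuthenticationController
{
    public function sessionLength(User $user): int
    {
        return 900; // admin logins are never remembered
    }
}
```

The admin controller inherits login() untouched and overrides only the session policy, and the User class never had to change.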

All of this is done without modifying the core User class. The user class remains clean for its own further abstraction possibilities. New fields such as profile, name, DOB, etc., could be added with no modifications to the controller.

Yes, my version requires the most lines of code, but it is also the easiest to maintain and understand. Why? Because it isn’t cluttering up the User data class with methods that have nothing to do with the user. If you’ve ever written a generic “user” class, you know how large and cluttered such a class can become when you start piling in the methods for login, logout, preferences, session management, lookups, and other needs. I haven’t even talked about the fact that virtually all “operations” that involve a user also involve the database, which adds its own headaches. Being able to keep the hard work in other more data-manipulation-oriented classes is for your own good.

If, down the road, it is determined that logging in should also require pre-approved IP addresses, what will your code need? Will your login method need an IP address passed in too? Or will the IP address be generated inside the login method? In my example, I would update the core AuthenticationController class and be done. What happens when a new requirement is added that requires that the login also passes a CAPTCHA test? What about when logins need to be logged in a flat file? What happens when we change logins to use the email field instead of the user field?
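As a rough sketch of the IP case (assuming PHP 7 or later): the allowlist property, the clientIp() helper, and the sample address are hypothetical, and the credential check is again a placeholder.

```php
<?php
class User
{
    public $username = '';
    public $password = '';
}

abstract class AuthenticationController
{
    // Hypothetical allowlist; a real one would come from configuration.
    protected $allowedIps = [];

    public function login(User $user): bool
    {
        // The new requirement lives here, in one place. The login(User)
        // signature and every caller stay exactly as they were.
        if (!in_array($this->clientIp(), $this->allowedIps, true)) {
            return false;
        }

        return $user->username !== '' && $user->password !== ''; // placeholder
    }

    protected function clientIp(): string
    {
        return $_SERVER['REMOTE_ADDR'] ?? '';
    }
}

class UserAuthenticationController extends AuthenticationController
{
    protected $allowedIps = ['203.0.113.10'];
}
```

The IP address is looked up inside the controller, so nothing that calls login($user) needs to know the rule exists.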

Today, I only talked about logins. The ideas I propose here are not new; they are simply good ideas that get ignored by web developers. Remember, you’re application developers that happen to work in a browser. Don’t think that regular application design principles don’t apply to you: they do, more than ever.