Archive for the ‘geeky’ Category.

Adding a Centralized Event Dispatcher on Backbone.js

This article contains my solution to adding a simple global event dispatcher to Backbone. It should also help noobies (myself included) understand how Backbone events work (which has one big gotcha).

Sharing Events = Global Dispatcher

Today, I came across a fairly mundane – and probably common – problem: I have two views that need to talk to each other. An example for this would be when you have a status bar in one corner that gets updated when a user completes an action in another area. It turns out that Backbone (as far as I can tell) does not support this use case very well out of the box. Well, it does: it expects you to build that yourself.

Backbone supports a model where you publish and subscribe to things happening in your views/models (commonly known as an “event dispatcher“). For example, you can make your application do stuff when a user clicks on something or if a model attribute changes. The native solution localizes these events (probably a good thing) to the model/view that you’re working with. This works for simple stuff, but when you need it in other parts of your application, things break down. The short answer is that you need to build a global event dispatcher A.K.A. global event “hub.” Everything publishes (“trigger”) and listens (“bind”) to events from this object.

This is a common pattern. In fact, I found two solid articles on this topic. The first explains in detail why you need an event dispatcher. The second shows you how to make the dispatcher natively accessible by all of your Backbone classes.

Event Dispatcher Gotcha: Native Events

However, I felt both solutions weren’t quite there. The first forces you to pass around an Event object everywhere you need it. This is nice from a decoupling standpoint, but highly error prone. The second solution is nice, but makes the caller oblivious to the global event namespace they are triggering/binding to. This is dangerous because Backbone has native generic events that are fired when certain things happen. For example, when you edit a Model, it automatically fires a “change” event into its event dispatcher (which bubbles up to its Collection too). This means if that dispatcher was global, every object would see a “change” firing every time every Model changed. Not good.

Goals

I came up with my own solution that accomplishes three main goals:

  1. Retain native Backbone event triggering/binding
  2. Allow the developer to trigger/bind to global events
  3. Not require the developer to pass things around

Solution Code

(If you can’t read CoffeeScript, just use this converter to convert it back to JavaScript)

The following (very simple) code modifies your Backbone definition to attach the global dispatcher to all objects. The dispatcher is easily accessible from any Model, Route, View, or  Collection through the global_dispatcher parameter. I thought about having it trigger events to the local dispatcher as well, but I decided that keeping them fully separate was in everybody’s best interest (to prevent events from accidentally colliding):

# centralized global_dispatcher object added to all Backbone Collection, Model, View, and Router classes
(->
  return if this.isExtended
  # attaching the Events object to the dispatcher variable
  dispatcher = _.extend({}, Backbone.Events, cid: "dispatcher")
  _.each [ Backbone.Collection::, Backbone.Model::, Backbone.View::, Backbone.Router:: ], (proto) ->
    # attaching a global dispatcher instance
    _.extend proto, global_dispatcher: dispatcher
)()

Example Code

The next section is a simple class I wrote that will demonstrate how these events work. I am defining a simple Collection and a Model. Note that I named the global and local events the SAME. This was to help demonstrate that the namespaces are in fact completely separate (as in, just because they are named the same doesn’t mean a local event will ever trigger a global event with the same name).

# See http://www.michikono.com/2012/01/11/adding-a-centralized-event-dispatcher-on-backbone-js/ for an explanation

class Test extends sBackbone.Model

  initialize: () ->
    this.global_dispatcher.bind('model_custom_action', () ->
        console.log('GLOBAL TRIGGERED: test model custom action')
    )
    this.bind('model_custom_action', () ->
        console.log('TRIGGERED: test model custom action')
    )

  trigger_stuff: () ->
    console.log('triggering model')
    this.trigger('model_custom_action')
    this.global_dispatcher.trigger('model_custom_action')


class TestCollection extends sBackbone.Collection
  model: Test

  initialize: () ->
    this.global_dispatcher.bind('collection_custom_action', () ->
        console.log('GLOBAL TRIGGERED: test collection custom action')
    )
    this.bind('collection_custom_action', () ->
        console.log('TRIGGERED: test collection custom action')
    )

    this.global_dispatcher.bind('model_custom_action', () ->
        console.log('GLOBAL TRIGGERED inside Collection: test model custom action')
    )

    this.bind('model_custom_action', () ->
        console.log('TRIGGERED inside Collection: test model custom action')
    )

  trigger_stuff: () ->
    console.log('triggering collection')
    this.trigger('collection_custom_action')
    this.global_dispatcher.trigger('collection_custom_action')

The following snippet illustrates triggering the events attached to the Collection. Note that we add a Model into this collection (which has no impact on the output). The trigger_stuff method triggers both a local and global event (in that order). The output shows that the events were picked up in the order fired (local, then global). Note that the Collection also listens to the “model_custom_action” event, which coincides with an event the Model triggers. This is very important.

# Trigger both the local and global event bindings for the collection

# Outputs:
# triggering collection
# TRIGGERED: test collection custom action
# GLOBAL TRIGGERED: test collection custom action

collection = new TestCollection()
# adding an empty, but fully valid Model instance
collection.add()
collection.trigger_stuff()

The order of operations looks like this:

  1. Fire a local event
  2. The Collection picks it up
  3. END OF FIRST EVENT
  4. Fire global event
  5. The Collection’s globally attached event handler picks it up
  6. END OF SECOND EVENT

The next snippet shows what happens when you fire an event on a Model inside a Collection. It is also why I decided to keep things separate. This is a little hairy, so pay special attention.

# Trigger both the local and global event bindings for the model

# Outputs:
# triggering model
# TRIGGERED: test model custom action
# TRIGGERED inside Collection: test model custom action
# GLOBAL TRIGGERED inside Collection: test model custom action
# GLOBAL TRIGGERED: test model custom action

collection = new TestCollection()
# adding an empty, but fully valid Model instance
collection.add()
# getting the just-added instance
instance = collection.at(0)
# trigger an event on the Model
instance.trigger_stuff()

The order of operations looks like this:

  1. Fire a local event
  2. The Model picks it up
  3. The Collection picks it up since all events in children bubble up
  4. END OF FIRST EVENT
  5. Fire global event
  6. The Collection’s globally attached event handler picks it up
  7. The Model’s globally attached event handler picks it up
  8. END OF SECOND EVENT

Notice that in step #6, the Collection’s binding fires BEFORE the Model’s. This is key.

Step #3 fires because there is a LOCAL binding to the “model_custom_action” event in the Collection. The local binding is reacting to an event triggered in the child Model’s local event dispatcher. In other words, any event triggered from the local event dispatcher in a Model will bubble up to the Collection’s local event dispatcher and be otherwise indistinguishable from events triggered directly from that Collection.

Step #6, however, is different. That event did NOT originate from the child model’s local event dispatcher. Instead, it is reacting to the global bindings, which happen to share the same name as the event we saw earlier in step #3. Because it didn’t bubble up, the events are being processed in the order they were bound (via the bind() function). The Collection was defined before the Model, thus, the Collection’s global binding fires first.

In this last snippet, we demonstrate how Models behave when not inside a Collection.

# Trigger event bindings for a model not in a collection

# Outputs:
# triggering model
# TRIGGERED: test model custom action
# GLOBAL TRIGGERED: test model custom action

instance = new Test
instance.trigger_stuff()

  1. Fire a local event
  2. The Model picks it up
  3. END OF FIRST EVENT
  4. Fire global event
  5. The Model’s globally attached event handler picks it up
  6. END OF SECOND EVENT

This is very straight forward if you managed to follow the last example. To get additional clarity, you may want to try renaming the events in the above class and re-run the examples.

By having the global dispatcher separate, you can consciously decide when an event should be “public,” as well as not clobber any existing Backbone functionality. Backbone is still really young (pre 1.0!), so I wanted to avoid using any solution that might break if they changed the internals. Also, completely preserving the behavior of event bubbling for Model-Collections is important to future proof my hack.

I hope this is useful for all of your Google-visitors!

Browser Wars… Wait, That’s Still Going on Right?

Rewind 5 years. Ask any self-proclaimed nerd what the browser market shares were. Market share stats were like the stock market ticker of the Internet Nerds. Everybody knew about it, and everybody cared.

But what’s the market shares today? Did you know that IE is below 50% by most measures? Did you know that Chrome ate up Firefox’s market share? And what’s Safari’s market share if all the iPhones and iPads use it?

You probably don’t know.

Because, who cares.

5-10 years ago, it mattered that IE had 70+% of the browser market because it directly influenced what was possible as an application developer. But mobile changed all that.

Mobile browsers ended up being the adoption wedge for HTML5 and alternatives to Flash thanks in large to the fact users — and developers — treated mobile as separate from regular browser apps. What a blessing in disguise: it let everybody start over. And once the mobile stuff got popular and apps broke, people blamed the bad mobile browser (“My ghetto Blackberry won’t load Facebook right!”) instead of the website. It was the perfect storm to force everybody to start adopting HTML5. Add in CSS/JavaScript standardizing tools (Modernizr,  jQuery, GWT, etc.) and developers didn’t even have to do cross-browser testing for simple stuff.

Maybe this is a bad thing to admit, but I haven’t bothered testing in all browsers for a year or two now. Stuff just breaks less often. IE7 is “good enough,” and the other browsers work 99% of the time. Thus, the only time I bother checking browser compatibility is if I’m doing something super complex or a user complains.

Good job, Internet. Ya, the evil Microsoft IE empire is still around, but we won the war and nobody even noticed.

Ruby: Time Comparisons Seem Backwards Due to Asian Culture

I found a weird quick in Ruby best explained by the fact it’s written by a Japanese dude. Nerd post below. In Ruby, when you compare two time signatures, you use the operator:

<=>

It compares two Time objects and then returns -1, 0, +1, or nil. So this is how it looks in practice:

comparison_result = today_time <=> yesterday_time

What’s the value of “comparison_result?” The answer is 1. 1? I stared at this for a long time and my inclination was to think it should return whichever side was bigger. There was very little documentation I could find on this topic, but I finally figured out why.

The reason WHY is that time flows down or backwards in Asian culture, and Ruby is written by a Japanese dude. Confirmation in this wiki article.

So the conclusion is that this operator should be read as, “Which side is smaller?” As in: left side = -1, right side = +1, 0 = same, and nil = invalid.

PHP Tip: Always Put Constants on the Left in Boolean Comparisons

This was a standard I enforced at my last company:

Whenever you are doing a boolean check (such as an IF or WHILE), always place constants on the left side of the comparison.

The following is BAD:

// BAD
if($user == LOGGED_IN) {

The following is good:

// GOOD
if(LOGGED_IN == $user) {

Why is this such a big deal? Imagine the typo where you forget the second equals sign:

// Oops! This always evaluates to true!
if($user = LOGGED_IN) {

This sort of bug is fairly common. C# went as far as to say boolean conditions must always have boolean return values, thereby eliminating the possibility of accidental assignments. Well, since PHP can’t do that, this is the next best thing. Notice how this convention will save your butt:

// Fatal error. Bug caught immediately.
if(LOGGED_IN = $user) {

Think about it. :)

Is Your Blog Not Receiving Pingbacks? I Fixed Mine.

I recently noticed that my blog was no longer registering pingbacks (the automatic in-comment notification that occurs when somebody else blogs about your post). I like these because they help me understand which of my articles are gaining traction.

My symptoms

  • My other blogs hosted on the same server seem to be pinging fine; however, those have far less posts and plugins
  • I am able to send pingbacks, apparently
  • But ping backs TO my content were dropped (even when I am self-pinging)

The fix

I figured the issue was somehow related to my recent upgrades of WordPress. After scouring the web, I found that the issue was due to a poorly designed timeout setting in WordPress.

  1. Open wp-includes/cron.php in your blog folder
  2. Go to the line that starts with: wp_remote_post( in the spawn_cron function
  3. Change ‘timeout’ => 0.01 to ‘timeout’ => 1 (or any other far more reasonable value)

This will fix blogs that are plagued by this bug.

Autocast Variables Whitepaper: What I Want to See in PHP 6

Edit: I’ve moved the autocast white paper to its own page. Let me know what you think!

How would you answer the following question:

Imagine you are now in charge of PHP. What do you cut/add/change in PHP 6.0?

I ask this during interviews, and as you can imagine, I get all sorts of answers. The best answers are pulling in features from other languages, particularly OOP concepts. These answers aren’t bad, but they almost always try to “fix” PHP’s broken OOP while also crippling the strength of a loosely typed language. I have my own unique answer to this question, and I wanted to share.

Maybe somebody else has already thought of this, but if not, I’m going to coin it right here:

Autocast variables. An autocast variable is like a container for data — everything going into an autocast variable type will always be converted to the current type of that variable. As in, if you assign a string into an integer variable, the variable will become the integer representation of the string (via implicit and immediate typecasting).

The idea is a hybrid of limited type safety – where only some variables are type safe – and operator overloading of the equals sign – on native datatypes. To help explain the idea: it would act almost like somebody following around your cursor and typing (int), (string), etc. all over your code before all variable assignments.

The goal is to allow a developer to be – when desired – 100% certain they are working with a specific data type.

Introduction to Autocasting

To declare a variable as an autocast, simply place a colon after the dollar sign in a variable name. Then, everything assigned to that variable is now automatically typecast to the datatype of the variable. For example

// This variable is now a container for integers
$:orderTotal = 0;
// assign a float value
$:orderTotal = 1.01;
// outputs 1; 1.01 was typecast to an integer
echo $:orderTotal;

NOTE: Why the new syntax? I toyed with the idea of an autocast keyword, but the paradigm broke down when you started assigning objects. The problem is that objects are pass-by-reference. This meant a programmer could change the datatype of an autocast variable by altering its reference. The other problem was that by not having a visual marker, it would make things very confusing  since one could never tell if they were working with an autocast until runtime. Lastly, why the dollar-colon? I would have prefered straight colon, but most of the good single-character syntax would conflict with existing PHP systems (# is a comment, : is used in ternary operators, % is modulus, ^ is a bitwise operator, etc.). A dollar sign is universally understood as a variable, so I thought the next best thing was to alter the variable in a way that today’s PHP would recognize as invalid (and thus introducing the syntax would not conflict with legacy code).

The concept is simple, but gets more complicated as you introduce objects, magic methods, and method signatures into the equation. Don’t worry, I’ve thought about all of those scenarios. Key summary of benefits:

  • New coding paradigms allow for simpler interaction between different data types (see first Practical Example)
  • Refactoring can be done in a way never before possible (see second Practical Example)
  • Code is now more “reliable” because unintended data types aren’t used (such as during boolean checks)
  • Many fatal errors can now be avoided
  • Potential use in the realm of dependency injection
  • Possibilities for true function overloading since expected datatypes are known (although, this is possible today, to be honest)

Edit: read the rest here.

A PHP/MySQL Bug Most People Have But Don’t Realize

I’ve seen this over and over in my career and thought I should save others from the horror. Part of me feels like I blogged about this years ago, but I couldn’t find a post referencing it (EDIT: found it!). The bug is simple:

  1. Create a database table with a decimal value, such as order_total
  2. Write some code that retrieves the row
  3. Do an implicit boolean check on order_total to see if it has a value

Here’s some actual code:

$results = mysql_query("SELECT * FROM orders");
while($row = mysql_fetch_assoc($results)) {
    if($row['order_total']) {
        echo "Order total is clearly not zero!";
    }
    else {
        log_bad_order($row['id']);
    }
}

This code has a serious bug in it. The problem is the line pertaining to checking if the order_total has been set. Pop quiz:

What is the value of the following:
(bool) “0.00″

The answer is TRUE! 0.00 may evaluate to zero, but “0.00″ is not the same thing! As soon as PHP sees more than just a single “0″ in a string, it assumes it’s a regular string and treats it as a non-zero string. A more obvious way to ask the same question:

What is the value of the following:
(bool) “0.0000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000″

Or what about:

What is the value of the following: (bool) “0.”

The point is that as soon as you go beyond a single zero, PHP just assumes the rest is real data and will not discard it. Thus:

if(0.00) {
    echo "THIS NEVER EXECUTES";
}
if("0.00") {
    echo "THIS ALWAYS EVALUATES TO TRUE";
}

SO going back to the original problem, the way to solve is is by fixing the code to either explicitly type cast the variable or use a “greater than” check:

$results = mysql_query("SELECT * FROM orders");
while($row = mysql_fetch_assoc($results)) {
    if((float) $row['order_total'] > 0) {
        echo "Order total is clearly not zero!";
    }
    else {
        log_bad_order($row['id']);
    }
}

If you don’t do either of these things, I’d suggest you go and double check some of your code.

Representing heirarchical data in MySQL

I’ve always wondered if there was a better way to manage nested data structures (such as product categories) in MySQL. Today I stumbled across a solution called the Nested Set Model.

The only addition I made to the solution is rename what they call the “category_id” and call that a “sort_id”. Then I added a primary key called “id” to the table. This way, I have an immutable ID I can use in the application (such as for URL deep linking). For example:

CREATE TABLE nested_category (
  id INT AUTO_INCREMENT PRIMARY KEY,
  sort_id INT NOT NULL,
  name VARCHAR(20) NOT NULL,
  left_sort INT NOT NULL,
  right_sort INT NOT NULL
);

Is PHP here to stay?

As a LAMP developer, I am starting to question the long term viability of PHP. PHP was born during an era when knowing HTML was a valid and valuable resume bullet. Because of this, most of the “advanced” aspects of PHP — which relate to the OOP functionality — were introduced only after PHP 4/5, and weakly at that. Additionally, new languages have since become popularized that show the weakness of PHP. Don’t get me wrong, I am very supportive of PHP. I just believe that it’s important that people understand both the strengths and weaknesses of the tools they use.  There are two main points I want to cover:

  1. PHP thread support is weak
  2. PHP OOP = Broken

The second point is rather technical, but it closely relates to another strength and weakness of PHP: it is loosely typed. More on that later.

Thread Support is Weak

True threading support in PHP does not exist. The closest thing is the pcntl_fork method, which copies the current process, rather than create a thread. This means asynchronous processing within a single process is not supported. Threading is useful in event-driven architectures (common in JavaScript) or when doing blocking operations such as network calls.

Because the forked process is a clone of the original, it shares all of the original resources, including database and file resources. This means that the forked process must be self-aware of whether it is a child or not, and must be careful not to modify or close these resources. This encourages spaghetti code that contains large logic forks (“if I am not a clone, else…”). Because of this, forking is messy and error prone. This gets further complicated when PHP is executed by Apache in a web environment. In fact, the PHP manual advises avoiding forking with web servers:

Process Control should not be enabled within a web server environment and unexpected results may happen if any Process Control functions are used within a web server environment.

Not to mention the method is incredibly C-like in that it is very “raw” (unlike other native PHP methods/classes). This increases the barrier to entry significantly, which ultimately serves to have the feature ignored by most shops.

Why is all of this important? Well, at most companies, one language is selected for all in-house development. This is because cross training and hiring is simplified if everybody speaks the same language. There are a few common tasks that are unnecessarily difficult to do in PHP:

  • Asynchronous work — handing off work such as connecting to a remote server to a child and wait for a response
  • Manage thread pools — this sort of work requires significant “by-hand” management of any processes spawned by the parent via pcntl_fork

The threading issue is only a pain point that impacts processes that need to become parallelized. It is a pain most big shops live with, or, alternatively, introduce other languages to help solve.

PHP OOP = Broken

Because of the loosely typed nature of PHP, true, well-formed object oriented programming is broken. I know that for many PHP programmers, “Object Oriented” means putting together classes and reusing code as objects. However, that is truly, sincerely, only a portion of the point of OOP. Some of the most powerful aspects of OOP are lost in PHP’s implementation of the concept. Don’t get me wrong: these decisions were probably the right fit for the niche PHP was filling, but I don’t believe most PHP programmers are fully aware of what they are missing.

While the language, thankfully, has interfaces and abstract classes, they are woefully underused. This is, in part, due to to the developer community being largely self-taught. This creates a misconception about the nature of OOP, which ultimately leads to the devaluation of the most important feature of OOP: interfaces.

I can go into why they are so important in another article, but the point is: without interfaces, true polymorphic code is impossible. Or, rather, extremely susceptible to spaghetti code and fatal errors.

In other languages (Java), code might look like this:

interface Animal { void makeSound(); }
void farm(Animal cat, Animal dog, Animal parrot) {
  cat.makeSound();
  dog.makeSound();
  // Note: Parrot class *DOES* have a method called moveAround()
  parrot.moveAround(); // ERROR!

The interface in this example defines a uniform way to access a class through a standardized API (thus the name, application programming interface). In a strongly typed language where all variables must have a type, the cat variable is defined as an implementation of Animal. This enforces and allows the method call makeSound(). If cat has a meow() and dog has a woof() method, they can not be called here without a compiler error. This is because in this function call, the parrot variable is defined as being an instance of Animal (versus being a Dog, Cat, or Parrot). As such, only Animal methods work here.

More importantly, because the compiler does this type checking, any invalid calls, such as the last one, would error and never compile. Even if the Parrot class has a moveAround() method, it can not be called in the code above. This is an extremely important aspect of OOP since, as a definer of the Animal class, I want to make it very specific how Animals should be treated (you can only makeSound!). If a programmer tries to do something to an Animal that I haven’t defined, they get an error. If they wanted to make that last line work, they would need to use object typecasting:

void farm(Animal cat, Animal dog, Animal parrot) {
  ...
  ((Parrot) parrot).moveAround();

Or by changing the function definition:

void farm(Animal cat, Animal dog, Parrot parrot) {
  ...
  parrot.moveAround();

But note that in this case, the user had to make an explicit choice to stop using Animal’s interface. Yes, parrot is still an Animal, but it doesn’t have to be. This, in short, helps prevent spaghetti code because it forces the developers to think about whether or not they want to deviate from a particular interface. Realistically, if presented with these alternatives, a Java programmer would probably use other types of abstraction techniques (e.g., dependency injection)  to keep this method from needing to be used. However, this example was necessary to illustrate how things are done in PHP.

So how would this look in PHP? Why isn’t this the same there? Well, take a look at the following code that, unlike the Java example, works perfectly fine and raises no red flags.

interface Animal { function makeSound(); }
function farm(Animal $cat, Animal $dog, Animal $parrot) {
  $cat.makeSound();
  $dog.makeSound();
  $parrot.moveAround(); // WORKS FINE 
}

This code works great. We have three arguments all forced to use the Animal interface. Great. As a casual observer, there is really, truly, nothing wrong with this code. It’s a little strange, but if it’s commonly known that Birds can moveAround(), there is no problem. In fact, in most PHP shops, I will bet money that type hinting is NOT used. This will further illustrate how bad the spaghetti is about to get (read on).

Now imagine in six months if we decide we wanted to group up this code so that it uses a single array/collection as an argument. This is where things would look like traditional polymorphic code. I mentioned spaghetti above. Let me show you why:

interface Animal { function makeSound(); }
function farm(array $animals) { // note, we can't guarantee what's inside of this array
  foreach($animals as $animal) {
    if($animal instanceof Parrot) { // or maybe a method_exists() call?
      $animal.moveAround(); // SPAGHETTI
    }
    else {
      $animal.makeSound(); // Hope for no fatal errors!
    }
  }
}

Wow, look at what we just did. A harmless piece of code in PHP six months ago completely breaks when you try to refactor it to use a fairly typical design pattern. More importantly, unless I put in even MORE code to do type checking, there’s a chance that the makeSound() line will actually die in a fatal error if, for example, a string is passed in as an element of the argument array! See example without Parrots:

interface Animal { function makeSound(); }
function farm(Array $animals) { // note, we can't guarantee what's inside of this array
  foreach($animals as $animal) {
    $animal.makeSound(); // Hope for no fatal errors!?
  }
}

PHP is extremely flexible when it comes to hacking out a page, but when it comes to OOP, it’s about as brittle as you get. Refactoring is painful and error prone, and elegant design patterns like the ones you might see in a message-passing language such as Objective-C, Scala, or Erlang don’t work. Remember that by using functions such as method_exists() and is_object(), I can emulate the desired behavior; however, the extra code means more places for bugs and less time spent making the program do what you want it to do. The point is that the OOP constructs in PHP don’t fully work. As a result, certain very important aspects of OOP don’t translate very well to PHP.

Some people may still cling on to the notion that “ultimately, you can still do it, it just requires more code!” But I argue that preventing “more code” is the exact reason why OOP was invented. By writing more boiler plate error checking code, we are wasting time. The issue is exacerbated by the fact that the error checking code isn’t required, unlike say, if you were throwing exceptions. It isn’t immediately obvious in that last example that you need to do error checking for is_object() on the $animal variable. It’s these types of oversights that really damage PHP as the code base gets larger.

Conclusion

What I’m realizing is that PHP isn’t meant to scale. Yes, it can take a lot of web traffic, but that’s not what I mean. I’m talking about scaling in the sense of growing team size and code base. The design of the language promotes coding paradigms that ultimately damage the code base. This is because PHP makes it harder use good OOP practices on legacy code. To illustrate:

  • PHP became popular because it is easy to hack things out, even if that something required doing it the “wrong” way. These problems come back and bite you when the code base grows.
  • PHP can’t support a large development team as effectively because its weak typing allows for sidestepping certain core OOP principles (see above)
  • PHP  allows for invisible future-bugs (see above) to be inserted without any immediate cause for alarm
  • As applications get complex and require threading or distributing of processes, PHP fails to keep up (so other languages get used)
  • Because PHP does not use dynamic dispatching (message passing), calling a method can cause runtime FATAL ERRORS (unacceptable and very hard to debug!)

All of this makes me rethink the popularity of PHP. There are some new languages, still in their infancy, that pose a threat to PHP’s current dominance. I believe that in the next few years, as today’s systems become “legacy,” today’s newcomers will finally be production ready. At that point, we might see companies adopt the newer languages, which will support more modern programming paradigms. We are seeing this today with Ruby, for example.

Of course, I could be wrong. I once told people that PHP was “C of the web.” It’s possible it’s here to stay forever, despite all of its flaws. And, for the record: I do not believe Python or Ruby will be the language that will overtake PHP, but that’s for another post.

I just want everybody to know that I am a PHP developer, so I speak from experience. We should recognize that technology changes and evolves, and it is important that we constantly update our skill to ensure they don’t become obsolete. I’m just pointing out that perhaps PHP isn’t as timeless as C (or, possibly, Java).

Lastly, I will plug my personal belief that being “religious” about a language because it is “the best” is short sighted. New languages are born, literally, every week. It’s only a matter of time before a language comes along that does what your language does more elegantly, faster, and with less code.

Only time will tell. :)

The Basics on Using Models and Controllers in PHP

Today I want to talk about passing in objects as arguments in PHP methods. Many PHP developers do not have this patience. This is obvious when studying libraries written for Java versus those in PHP. It is a horridly underused programming style in PHP, and since PHP supports argument type prototyping in methods, I thought it would be good to go over this particular style of development.

First of all, let’s start with a few “PHP” way of doing a mundane task (this will seem extremely familiar):

function login($username, $password, $remember);

function processMail($to, $from, $subject, $body);

function editPost($id, $title, $body, $newTime);

Of course, good developers would make sure these types of examples are part of a class:

$user = new User;
$user->username = $_POST["username"];
$user->password = $_POST['password'];
$user->login();

The examples can continue, and I hope that you good developers use this type of basic OOP development when doing using PHP. :) But there are examples where the standard “newbie” OOP model seemingly falls apart. The system quickly breaks down when new requirements are added to the system. For example, what if we want to do a “remember me” option in the login? What about logging in as an administrator versus a regular user? Okay, now what if we have different login session lengths depending on the user type? How long do you think that login function will be? Think about how it will accommodate IP bans, login logging, banned users, suspended users, max failed attempt lockouts, etc. The list goes on, and depending on your implementation, things can get very ugly. Your login function might become a huge bloated monster sitting in your User class.

The problem is that what you’re actually doing is mixing a data model [user data] with controller logic [how to login using the user data]. The solution is to separate these two entities into two classes, which is what you would see in most modern MVC frameworks.

Here is the seemingly more complex, but far more elegant solution:

$authenticationController = new UserAuthenticationController;
$user = new User;
$user->username = $_POST['username'];
$user->password = $_POST['password'];
$user->rememberMe = true;
$authenticationController->login($user);

The prototype for the login method would look like this:

function login(User $userObject);

In this example, I am hoping to show you possibilities. First, notice that the arguments for login() are down to one. But the more interesting part of the implementation comes with the proper abstraction between the User data and the authentication process. In my example, I Just logged in a regular user. So if I had to map out my class structure, it might look like this:

(Abstract class) AuthenticationController
=> GuestAuthenticationController
  => UserAuthenticationController
    => AdminAuthenticationController

The old way of doing things would look something like these:

function adminLogin($username, $password, $remember);

$user->adminLogin();

But using the Java-esk model, we’d end up with something like this:

$authenticationController = new AdminAuthenticationController;
$user = new User;
$user->username = $_POST['username'];
$user->password = $_POST['password'];
$user->rememberMe = true;
$authenticationController->login($user);

This means the login method is likely broken up in a few pieces inside the AuthenticationController. The Guest user’s login() method would always return false. The UserAuthenticationController would piggy back on AuthenticationController::login() by looking at the User::rememberMe variable and take it into account. But the AdminAuthenticationController doesn’t allow people’s logins to be remembered due to security reasons, so it doesn’t take that variable into account. And in that crazy case where there is nothing different about the admin login, the method would remain untouched (inherited from the parent), but any other changes (such as session length) would still kick in for the admin user with no additional coding.

All of this is done without modifying the core User class. The user class remains clean for its own further abstraction possibilities. New fields such as profile, name, DOB, etc., could be added with no modifications to the controller.

Yes, my version requires the most lines of code, but it is also the easiest to maintain and understand. Why? Because it isn’t cluttering up the User data class with methods that have nothing to do with the user. If you’ve ever written a generic “user” class, you know how large and cluttered such a class can become when you start piling in the methods for login, logout, preferences, session management, lookups, and other needs. I haven’t even talked about the fact that virtually all “operations” that involve a user also involve the database, which adds its own headaches. Being able to keep the hard work in other more data-manipulation-oriented classes is for your own good.

If down the road, it is determined that logging in should also require pre-approved IP addresses, what will your code need? Will your login method need an IP address passed in too? Or will the IP address be generated inside the login method? In my example, I would update the core Authentication class and be done. What happens when a new requirement is added that requires that the login also passes a CAPTCHA test? What about when logins need to be logged in a flat file? What happens when we change logins to use the email field instead of the user field?

Today, I only talked about logins. The ideas I propose here are not new; they are simply good ideas that get ignored by web developers. Remember, you’re application developers that happen to work in a browser. Don’t think that regular application design principles don’t apply to you: they do, more than ever.