July 26, 2008

On the Web 2.0 Bubble

Everybody, listen. There’s a Web 2.0 bubble right now. I know it’s difficult for some people to acknowledge, and many people may even casually agree with me without actually believing the statement in full. But it’s true, and the quicker you realize this, the better it will be for your pocketbooks.

Lately, I’ve been doing stock trading, and have come to learn firsthand about the energy and commodities bubbles that were slamming the market. When those bubbles went crazy, they helped deflate the banking bubble, which was a direct result of the housing bubble. And in many ways, the housing bubble was a result of the dot-com bubble bursting, as people exited the stock market in search of a new investment. Everybody in the web industry likes to think they are wise to bubbles because they learned their lesson in the dot-com boom. But it is increasingly evident that this is not the case.

As we know, Facebook has a supposed valuation in the double digit billions, thanks to Microsoft’s minority stake investment in that company. And just the other day, it was news that Google was seriously considering buying Digg for $200M (the deal ultimately fell through).

There is a clear bubble here, but it’s sometimes hard to see because most startups are at least producing some revenues.

An Example Exercise

The problem here is that people are approaching this with the mindset of “this will be somebody else’s problem after I sell it.” It’s important we try to figure out what happens to the eventual owner of the startup.

  1. Take your favorite Internet 2.0 “company”: Digg, Myspace, Twitter, etc. Decide how much you think that company is worth. $5 million? $10 million? $50 million? $500 million? The sky is the limit!
  2. Take half your life savings. Take all your discretionary spending money for the next 5 years and take it out of your budget.
  3. Now imagine there is a company that exists that has some money to throw around into a startup. The company is, at least for now, profitable. You will now put the funds from step #2 into the stock of a subsidiary of this company. This subsidiary will be exposed to 100% of the gains/losses that will arise from this takeover.
  4. Remember that number you threw up there in step #1? That’s how much your new company is going to pay to acquire that 2.0 startup. Your stock’s valuation is now 100% tied to the income generated by this newly acquired startup. If the startup loses money, your stock goes down and if the startup makes money, your stock goes up.
  5. Nobody will accept an offer to buy the startup from your company at a break-even or higher price until it has reported a yearly net revenue of 10% of the purchase price and is profitable.

That last part is the key because it effectively stops the hot potato game and forces you to examine if the company is truly viable. Some people would accuse that of being an unfair restriction, but I will show you why this is the key part in understanding why there is a 2.0 bubble.

Defining the Bubble

Let’s take a second to define a bubble:

An investment yields a return, much like a chicken produces eggs, a savings account produces a yield, and a farm produces crops. This return is not always immediate, and is not always in the same terms as the input. Also, it is almost a law that returns are proportional to risk (some investments have negative returns). But ultimately, it is called an investment because it will usually, and eventually, generate more value than what you put in.

Now consider an investment that does not create a return. Such an example would be the web stocks of the dot-com boom. Back then, fundamentals like earnings, operating margins, and profitability were ignored when evaluating a stock. Companies that bled millions of dollars a year saw their stocks rising at record levels. This is because the investment - the stock - was being traded to somebody else for a profit because that next person believed they could trade it for an even higher profit.

A bubble is defined as a trend where merely owning something long enough to sell it is profitable. It is a giant game of hot potato. Everybody is essentially a middle man between the original owner and the eventual owner — adding to the price tag at every step. Eventually, people wise up and no longer want to trade the hot potato, causing the bubble to burst.

And most importantly, let’s define a bursting bubble:

A bubble is defined as bursting when the value of the traded item reverts to its true market value.

Understanding the Exercise

So let’s talk about the exercise again: if you thought the company was worth a paltry $50M, then your assets are stuck inside that stock until the startup can earn $5M in revenue AND be profitable while doing so. Why did I pick such a restriction? Because those are reasonable things to assume when buying any other type of company. Why would another company offer to buy the startup if it failed to produce respectable profit margins or revenues?

Given this extremely reasonable reality-check requirement, would you want to tie your personal investment to the startup being able to produce a profit? Would you end up selling your stocks in fear that they would go to zero? I know some of you would still blindly accept this gamble, but I am sure it helps you see the absurdity of at least some of the valuations that are going on right now. How can Facebook be worth over $10B if it isn’t producing revenues somewhere in the $500M-$1B range? Reality check: Facebook has revenues of $150M a year.

Speculation Should Still be Grounded on Fundamentals

These types of companies are “worth” that much because the valuation is highly speculative. People aren’t investing for what Facebook and other 2.0 companies are worth today; it’s all about tomorrow. I agree that it is important that tomorrow’s profits are taken into today’s valuations, BUT isn’t this reasoning eerily similar to the reason people listed as to why they bought over-priced houses and profitless dot-com stocks? Both were purchases made while completely disregarding the fact that the *current* valuation of the items was negative.

But since that day of profitability is so far away into the future, you end up playing a giant game of corporate hot potato. Most people would agree that a profit of 10% is far better than a loss of 90%. So as soon as you find a sucker to pay 10% more than you paid, you bail. And of course that guy who bought your stake is thinking the exact same thing — sell this to somebody else for 10% profit before something bad happens. That’s a bubble, my friends.

In Conclusion

In 2001, the bubble was all about going IPO so that the general public could hold the hot potato.

Today, the bubble is all about selling to a big corporate entity that will hold the hot potato.

There is no difference.

If you are currently thinking about entering the 2.0 scene, think carefully about what your end goal is. If it isn’t “to be profitable and retire”, then it’s likely just another bubble startup that will become completely worthless (and your invested time and money completely wasted) once the bubble pops. And believe me: given our current economy, that bubble is going to pop in the next year or two.

Finally, an extremely interesting speech about bubbles given in 2006 (gets good around part 2):

Part 1

Part 2

Part 3

Part 4

Part 5

Part 6

Filed under: Business, Predictions — Michi @ 5:58 pm

July 23, 2008

The Basics on Using Models and Controllers in PHP

Today I want to talk about passing objects as arguments to PHP methods. Many PHP developers do not have the patience for this, which is obvious when comparing libraries written for Java with those written for PHP. It is a horridly underused programming style in PHP, and since PHP supports type hinting for method arguments, I thought it would be good to go over this particular style of development.

First of all, let’s start with a few examples of the “PHP” way of doing a mundane task (this will seem extremely familiar):

function login($username, $password, $remember);

function processMail($to, $from, $subject, $body);

function editPost($id, $title, $body, $newTime);

Of course, good developers would make sure these types of examples are part of a class:

$user = new User;
$user->username = $_POST["username"];
$user->password = $_POST['password'];
$user->login();

The examples can continue, and I hope that you good developers use this type of basic OOP development when using PHP. :) But there are examples where the standard “newbie” OOP model seemingly falls apart. The system quickly breaks down when new requirements are added. For example, what if we want to do a “remember me” option in the login? What about logging in as an administrator versus a regular user? Okay, now what if we have different login session lengths depending on the user type? How long do you think that login function will be? Think about how it will accommodate IP bans, login logging, banned users, suspended users, max failed attempt lockouts, etc. The list goes on, and depending on your implementation, things can get very ugly. Your login function might become a huge bloated monster sitting in your User class.

The problem is that what you’re actually doing is mixing a data model [user data] with controller logic [how to login using the user data]. The solution is to separate these two entities into two classes, which is what you would see in most modern MVC frameworks.

Here is the seemingly more complex, but far more elegant solution:

$authenticationController = new UserAuthenticationController;
$user = new User;
$user->username = $_POST['username'];
$user->password = $_POST['password'];
$user->rememberMe = true;
$authenticationController->login($user);

The prototype for the login method would look like this:

function login(User $userObject);

In this example, I am hoping to show you possibilities. First, notice that the arguments for login() are down to one. But the more interesting part of the implementation comes with the proper abstraction between the User data and the authentication process. In my example, I just logged in a regular user. So if I had to map out my class structure, it might look like this:

(Abstract class) AuthenticationController
=> GuestAuthenticationController
  => UserAuthenticationController
    => AdminAuthenticationController

The old way of doing things would look something like this:

function adminLogin($username, $password, $remember);

$user->adminLogin();

But using the Java-esque model, we’d end up with something like this:

$authenticationController = new AdminAuthenticationController;
$user = new User;
$user->username = $_POST['username'];
$user->password = $_POST['password'];
$user->rememberMe = true;
$authenticationController->login($user);

This means the login method is likely broken up in a few pieces inside the AuthenticationController. The Guest user’s login() method would always return false. The UserAuthenticationController would piggy back on AuthenticationController::login() by looking at the User::rememberMe variable and take it into account. But the AdminAuthenticationController doesn’t allow people’s logins to be remembered due to security reasons, so it doesn’t take that variable into account. And in that crazy case where there is nothing different about the admin login, the method would remain untouched (inherited from the parent), but any other changes (such as session length) would still kick in for the admin user with no additional coding.
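To make that hierarchy a little more concrete, here is a rough sketch of how the classes might fit together. The hard-coded account list, the sessionLength() method, and the session lengths themselves are assumptions made up for illustration (a real system would look users up in a database and compare password hashes), so treat it as one possible shape rather than the definitive implementation.

```php
<?php
// Plain data model: it only carries user data, no login logic.
class User {
    public $username;
    public $password;
    public $rememberMe = false;
}

abstract class AuthenticationController {
    // Hypothetical credential store, for illustration only.
    protected $accounts = array('alice' => 'secret');

    // Shared credential check that subclasses can piggyback on.
    public function login(User $user) {
        return isset($this->accounts[$user->username])
            && $this->accounts[$user->username] === $user->password;
    }

    // Session length in seconds (assumed value).
    public function sessionLength(User $user) {
        return 3600;
    }
}

class GuestAuthenticationController extends AuthenticationController {
    // Guests can never log in.
    public function login(User $user) {
        return false;
    }
}

class UserAuthenticationController extends GuestAuthenticationController {
    // Skip the guest override and reuse the base class check.
    public function login(User $user) {
        return AuthenticationController::login($user);
    }

    // Regular users may ask to be remembered for two weeks (assumed value).
    public function sessionLength(User $user) {
        return $user->rememberMe ? 1209600 : 3600;
    }
}

class AdminAuthenticationController extends UserAuthenticationController {
    // Admin logins are never remembered, so rememberMe is ignored here.
    public function sessionLength(User $user) {
        return 900;
    }
}
```

Notice that the admin controller inherits login() untouched and only overrides the piece that differs, while the User class never changes.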

All of this is done without modifying the core User class. The user class remains clean for its own further abstraction possibilities. New fields such as profile, name, DOB, etc., could be added with no modifications to the controller.

Yes, my version requires the most lines of code, but it is also the easiest to maintain and understand. Why? Because it isn’t cluttering up the User data class with methods that have nothing to do with the user. If you’ve ever written a generic “user” class, you know how large and cluttered such a class can become when you start piling in the methods for login, logout, preferences, session management, lookups, and other needs. I haven’t even talked about the fact that virtually all “operations” that involve a user also involve the database, which adds its own headaches. Being able to keep the hard work in other more data-manipulation-oriented classes is for your own good.

If down the road, it is determined that logging in should also require pre-approved IP addresses, what will your code need? Will your login method need an IP address passed in too? Or will the IP address be generated inside the login method? In my example, I would update the core Authentication class and be done. What happens when a new requirement is added that requires that the login also passes a CAPTCHA test? What about when logins need to be logged in a flat file? What happens when we change logins to use the email field instead of the user field?

Today, I only talked about logins. The ideas I propose here are not new; they are simply good ideas that get ignored by web developers. Remember, you’re application developers that happen to work in a browser. Don’t think that regular application design principles don’t apply to you: they do, more than ever.

Filed under: PHP — Michi @ 3:19 am

July 22, 2008

Google’s Real Goal Behind All Their Free APIs

Ever wonder why Google gives away so many web-developer tools? Tools that otherwise seem like complete money-and-bandwidth-pissing schemes (notice how most of these don’t directly show ads):

This is all about obtaining browsing behavior in a long term bid to increase ad efficiency. Nothing else.

  1. It is not about making things more “open”
  2. It is not about making web development easier
  3. It is not about making an online operating system
  4. It is not about competing with Microsoft
  5. It is not about making the Google brand more ubiquitous
  6. It is not about showing ads in new places

If any of these above things happen, they are a (likely planned) side effect. For example, if a particular API makes something easier, that is good because it will encourage other developers to adopt it as well. But as I will explain shortly, the commonly held beliefs about Google doing Good or Google making the web more open are simply not the reason for these initiatives.

If you notice, all of their APIs use JavaScript. This means all of their APIs have the ability to note what computer a given request is coming from. This means that on top of your search preferences, they can eventually begin to correlate your browsing habits based on the sites that you visit that use Google APIs.

For example, if my blog were to use a YouTube embed, it would be possible for Google to read a cookie originally placed on your machine by YouTube and correlate it as traffic coming from this site. This means they can uniquely track every YouTube video your computer has watched since the last time you cleared your cookies. YouTube is just an example because most of Google’s APIs are far less obvious to the end user. For example, the unified AJAX libraries could be used by a good half of the “2.0” web sites out there without impacting performance (and in many cases would make the sites load faster for the end user). But because everything is going through Google, it’s possible (although I’m not saying that they are) for them to track which sites you visit.

If this isn’t extremely valuable information, I don’t know what is. Don’t forget that the AdSense API is, in itself, a means for Google to track every website you’ve ever been to that uses AdSense, and a way for Google to know exactly which type of ads interested you in the past. Once they know what sites you visit, they can surmise what a given site is about, and then determine, for example, what sort of products would interest you.

It’s the classic advertising chicken and egg problem: If I knew what my customers wanted, I could sell it to them, but they won’t tell me.

…And Google found the chicken. For the time being, they haven’t started using this information (at least noticeably), but I am sure they will as market forces move to make competition in that area more necessary.

Say goodbye to privacy. =( Oh wait, I’ve been saying that for quite some time now.

Filed under: Business, Predictions — Michi @ 3:02 am

May 31, 2008

Getting Around Overwriting form.submit()

Since my dear reader Sameer requested it, I’m here making an update. I’ve got a cool JavaScript fix for everybody! I mentioned this in a post a long time ago, but JavaScript has this semi-unexpected “feature” where you can accidentally overwrite the submit() function from a form. As in:

<form id="myform">
<input name="submit" value="submit me" type="submit" />
</form>
<script>
document.getElementById('myform').submit(); // THIS FAILS - Object not a method
</script>

Apparently, by creating a form element called “submit” you overwrite the native function that exists in every form element in JavaScript. Because it’s native, it also means you can’t just willy-nilly redefine it. And to make things worse, you cannot (at least not in a cross browser manner), successfully re-assign the submit() method because some browsers will disregard any attempt to reassign its value. As in:

<script>
document.getElementById('myform').submit = 'This gets ignored';
</script>

Fortunately, there is a fix. This fix requires modifying the actual DOM. Because this tends to be inconsistent across browsers, I’m doing this fix in MooTools (which is my JS library of choice). However, the fix is fairly straightforward and can easily be done with (or without) any JS framework, as you will see. The steps are:

  1. REMOVE the form element in question. This is an absolute requirement to make the solution cross browser compatible. This can be skipped, but it will cause quirks. However, the good news is that we can assume that 99.99% of all form elements named “submit” are due to designers being ignorant — thus, such cases are exclusive to submit buttons. Luckily, these are almost NEVER needed in the server side code and really just act as wall flowers.
  2. Check if step #1 completed successfully
  3. If it did not, create a new Form element and copy its submit function over
  4. Submit

The code looks like this:

<script>
var formObject = document.getElementById('myform');
// Removes the node
formObject.submit.remove();
// Functions don’t have tagName defined

if ('undefined' == (typeof formObject.submit.tagName)) {
    // create a form and assign its submit function
    formObject.submit = new Element('form').submit;
}
formObject.submit();
</script>

Let me know if you encounter any problems.

Filed under: Javascript — Michi @ 1:54 am

February 17, 2008

Neat Idea: Creating Alphanumeric IDs

UPDATE: For those of you looking for a great way to generate a highly unique ID that is shorter than what you might get using a hex number, try this (it will generate a ~17 character ID):

list($hex, $dec) = explode('.', uniqid('', true));
$id = (base_convert($hex, 16, 36) . base_convert($dec, 10, 36));

Ever needed to create an ID hash that looks something like f39a2xm91? You might not have, but some day you’ll want to. The easy way out is to use the native md5() function, but that creates a long 32 character hash which may be a total waste of (database) space. These types of IDs are often used to mask integer IDs so that your users can’t just type in user_id=10001, user_id=10002, user_id=10003, and so forth to look at your records. Some might even call it security through obscurity. Well, let’s be clear: this sort of activity does not add security, but it does make “browsing” behavior more difficult.

Either way, if you desire to move away from the classic integer format IDs, I have a different solution for you:

base_convert($someId, 10, 36);

This will convert the number 10,001 into 7pt, 10,002 into 7pu, and 100,000,000 into 1njchs. As you can see, you can store a heck of a lot of numeric information in a very tiny amount of (character) space. I am not saying this will save you database storage space, but it will make your URLs shorter.
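As a quick illustration of the round trip, here is a minimal sketch. The helper names encodeId() and decodeId() are made up for this example; base_convert() is the only built-in doing the work.

```php
<?php
// Hypothetical helpers wrapping PHP's built-in base_convert().
function encodeId($id) {
    // Decimal integer -> base 36 string (digits 0-9, then a-z).
    return base_convert((string) $id, 10, 36);
}

function decodeId($encoded) {
    // Base 36 string -> decimal integer.
    return (int) base_convert($encoded, 36, 10);
}

echo encodeId(10001), "\n";     // 7pt
echo encodeId(100000000), "\n"; // 1njchs
echo decodeId('7pt'), "\n";     // 10001
```

Decoding on the way in means the database can keep using plain integer keys while only the URLs carry the shorter form.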

One of the main benefits of this is that you can store more data in a smaller human-readable space, thereby allowing you to create smaller unique IDs. So for example, in our logging system at work, I use this method to generate issue IDs that end-users can send to us when they have a problem. This issue ID is based on the last eight digits of the current time with microseconds, concatenated (using “.”) with a random seven digit number in front. I then base-36 encode the resulting number (stripping out the decimal point).

Note: my solution is specific to the problem I was facing. It’s not necessarily a foolproof way to generate unique values, but it’s what you would call “good enough”. Do not use it if your application hinges on absolutely unique values.
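For the curious, a sketch along those lines might look like the following. This is a reconstruction from the description above, not the actual code from work: the function name makeIssueId() and the exact way the time digits are extracted are assumptions.

```php
<?php
// Hypothetical reconstruction of the issue ID scheme described above.
function makeIssueId() {
    // Current time in microseconds, rendered as a string of digits.
    $micro = sprintf('%.0f', microtime(true) * 1000000);
    // Keep only the last eight digits of the timestamp.
    $timePart = substr($micro, -8);
    // Random seven digit number that goes in front.
    $randomPart = (string) mt_rand(1000000, 9999999);
    // 15 decimal digits is below 2^53, so no float precision is lost here.
    return base_convert($randomPart . $timePart, 10, 36);
}

echo makeIssueId(), "\n";
```

The result is nine or ten characters long, which is short enough for a user to read out over the phone.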

A warning about base_convert is that it breaks down on large numbers in PHP, so be careful (we’re talking very large numbers). This means pasting together the current timestamp, the user ID, the session ID, and a fourteen digit number into one 50 character long number will probably result in some precision errors (not a huge deal for most implementations, but be warned). From the PHP manual:

This is related to the fact that it is impossible to exactly express some fractions in decimal notation with a finite number of digits. For instance, 1/3 in decimal form becomes 0.3333333...

So never trust floating number results to the last digit and never compare floating point numbers for equality.

The “base” refers to the numbering system used to represent the number. In a base 11 system, the digits run from 0 to 9, then the letter A (which stands for ten), and then the count rolls over to 10 (which stands for eleven). In other words: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, 10, 11… In a base 16 system, you would go all the way to F before rolling over. So the larger the base, the more “compressed” a number can become.

I used a base 36 scheme (0 - 9, A - Z), but you can use smaller bases to come up with longer conversions. For example, a base 21 conversion (0 - 9, A - K) will convert 10,001 into 11e5, 10,002 into 11e6 and 100,000,000 into 13a3k7g. So in short, if you have a database where your record IDs start at above 7 or 8 digits, maybe you can think about base encoding them into shorter IDs.

Just a neat idea I wanted to share.

Filed under: PHP — Michi @ 4:41 pm

February 3, 2008

My Thoughts on Microsoft Buying Yahoo for $44.6B

The big news of Friday morning was that Microsoft offered Yahoo $44.6 billion for the company. On a financial level, this is a sweet deal for Yahoo. It’s not the most financially sound investment Microsoft has offered, which is why their stocks dipped 6% on Friday. No reply has been made from Yahoo, but I can definitely see them taking this offer seriously. My thoughts are summed up in three bullet points:

  • Yahoo’s management will possibly accept the offer since it is so lucrative.
  • The purchase will piss off some of Yahoo’s top talent and cause them to defect, probably to Google.
  • The purchase will help Google gain a greater lead during one of the most crucial eras since the Internet began: the rise of mobile computing.

The internal culture of Yahoo is not exactly friendly to Microsoft. Yahoo is seen as an ally to the open source community while Microsoft is exactly the opposite. Yahoo is a major contributor to open source (ex. PHP’s lead developer is on Yahoo’s payroll), has an open philosophy which has shown itself in their JS frameworks, Flickr, Pipes, and various other projects, and is a major user/contributor to the open source stack in general. Microsoft is clearly not on the same page.

I’ve read speculation that the looming recession will cause developers to stick around despite a takeover by a boss they don’t like. However, my belief is that great developers aren’t scared to leave since they are in high demand no matter what is going on in the economy. Some of the very best and brightest at Yahoo will leave. Any sort of exodus of major talent would destroy the current internal direction. Worse, some of these great minds would likely go knocking on Google’s doors, which is straight up ironic considering Microsoft’s intentions. What remains would be gutted, possibly begrudging or de-motivated teams, which is a recipe for not producing innovation.

Which leads to my final point: Microsoft’s goal is to beat Google by merging with Yahoo’s resources. It is my belief that this move could ultimately prove counterproductive. The integration process of merging departments, axing un-needed employees, changing internal processes, shifting internal priorities, introducing new management, and replacing fleeing key talent will cause major stalls over at Yahoo… to Google’s benefit. Microsoft is no stranger to mergers and acquisitions, but Yahoo would be a major, major purchase with a sizeable employee count. Microsoft will have its hands full for months.

All this is going to happen during a period I consider to be a key moment in the rise of mobile computing. A large chunk of search traffic will begin to come from mobile browsers, and the web will shift to the mobile platform. During such a crucial stage of computing, this sort of disruptive purchase may help Microsoft and Yahoo miss the bus.

So while I wouldn’t be surprised if the floundering leadership at Yahoo took the offer, I also expect this to work out as the most counterproductive and costly purchase in Microsoft’s history.

Filed under: Business — Michi @ 5:58 pm

January 28, 2008

Debugging Tips for Database Abstraction

Today I want to talk about database script debugging in large systems. The main problem is that in large applications, it becomes difficult to find the source of rogue queries that, for example, broke in a recent system update. This may not readily apply to most of you, but bear with me: some day it will.

Pretend for a moment you have a database architecture where you have 2 masters (dual replication) and 2 read-only slaves. Now pretend that you have a large application with 100 different pages/scripts. You have 5 web servers with mirror copies of the application. This would be a fairly typical setup for a small, but growing company.

One day, you come into work and find out that you had a bad transaction lock that caused your system to hang all weekend. So you look at the process list and you know what query is causing the problem (because it’s still stuck). The problem is that it looks suspiciously like the queries you’d find on virtually every page in your application. How do you fix this problem? A different (but related) problem is when an update initially executed on one master database server replicated to a slave and got stuck on the slave but executed fine elsewhere. What happened? Which master server got the initial query? This sort of problem is very difficult to track down without more information such as where the query was initially sent and from what page it originated.

The primary challenge is figuring out which query came from what page in your application. The solution is to add logging straight into your queries. The implementation looks something like this:

// Get the current page or script file
$source = !empty($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : $_SERVER['SCRIPT_FILENAME'];
// Break up any comment tags and add in the database being connected to
$metaData = str_replace(array('/*', '*/'), array('/ *', '* /'), $source) . " ($databaseHost)";
// Escape the metadata so the URI can't be used to inject data
$metaData = mysql_real_escape_string($metaData);
// Execute the query with the metadata prepended as a comment
$result = mysql_query("/* $metaData */ " . $query, $connection);

This solution inserts a comment into your query that gives you useful information that can be seen when looking at the raw query. MySQL supports C-style comment blocks (the /* */), which are ignored by the parsing engine. This means you can pass data to the engine which can be useful for debugging. These comments are also replicated down to the slaves, which can be useful when you find a slave having problems with a query that came from a master server. For those of you unaware, the “URI” refers to the path and query string that were requested to access a page.

But make sure that you correctly sanitize the URI so that somebody can’t arbitrarily end your comment block (with a */) and inject their own nonsense into your query. Also, considering issues like multi-byte character attacks, I don’t even want to take the risk of not further escaping the data with a call to mysql_real_escape_string.
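To show just the sanitizing step in isolation, here is a small sketch that builds the tagged query as a string without touching the database. The function name tagQuery() is made up for this example, and stripping control characters is an extra precaution I am assuming is desirable; it is not part of the snippet above.

```php
<?php
// Hypothetical helper: prefixes a query with a debugging comment.
function tagQuery($query, $source, $databaseHost) {
    $metaData = $source . ' (' . $databaseHost . ')';
    // Drop control characters (such as newlines) so the tag stays on one line.
    $metaData = preg_replace('/[\x00-\x1F\x7F]/', '', $metaData);
    // Break up comment delimiters so the tag cannot end the comment early.
    $metaData = str_replace(array('/*', '*/'), array('/ *', '* /'), $metaData);
    return '/* ' . $metaData . ' */ ' . $query;
}

// A request URI that tries to break out of the comment block.
$hostile = '/page.php?x=*/ DROP TABLE users; /*';
echo tagQuery('SELECT 1', $hostile, 'db1'), "\n";
```

Whatever the URI contains, the only comment terminator left in the result is the one the helper adds itself, so the original query always starts after it.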

The solution we use at my work logs the web server IP, database server IP, and script path/URI. Other potential ideas are local timestamps, version information, user IDs, and session IDs.

In conclusion, this solution will help you identify the source (and sometimes the destination) of queries that are causing problems. This has been used in our production environment at work often when trying to determine what pages are producing extremely slow queries. This solution should work with any database, although my example is written for MySQL.

Happy debugging!

Filed under: MySQL, PHP — Michi @ 1:24 pm

January 26, 2008

The Wonders of Makeup (non-geeky post)

This is totally un-techy, but I came across a very interesting post about putting on makeup.

To summarize the article… Take a look at the “after” picture.

Now look at the “before” picture. I still can’t believe it’s the same person.

Being a guy, I was never consciously aware that makeup could change someone’s looks so dramatically (aside from the professional jobs on movie sets). Amazing!

Filed under: Off-beat — Michi @ 11:17 pm

January 23, 2008

PHP Best Practice: Don’t use INC extensions

I have been bad about updating, and this goes back to an old habit that probably has to do with human nature: as time between updates increases, there’s a desire to write a “big” update, which is increasingly difficult as news-worthy events happen and are ignored. There are so many things I could touch on, such as the iPhone SDK update, news on IE8 passing the ACID2 test, my predictions from a year ago that were spot on (until about a week ago when all stocks tanked), and Sun buying MySQL. But I won’t. Perhaps next time. So this post is small, but serves as a feeler post to help me get back into the routine. The truth is that I have several programming post drafts sitting on my machine that could have been posted a long time ago if I had given them a final read-over. Those things take a lot longer than they look to the casual observer.

Today’s post is a best practices post. The tip is simple: When creating a naming convention, never rely on the .inc extension. The .inc is used in some shops to denote files that serve as libraries. This is a terrible practice for a number of reasons.

First, it means deploying your library ANYWHERE requires adding the extension to your server’s configurations so that it knows these files are for PHP executables. This isn’t a deal breaker in most cases, but beware that if you use shared hosting environments, this sort of thing can be annoying and stall development.

The second far more practical reason is for security. When these library files are moved to a new server which has yet to be configured, they are wide open for public viewing. Because the server doesn’t know they are PHP files, they are served up as text files, essentially exposing your code base for the world to see. I’ve seen this issue pop up in production environments where a new web server was brought online without being fully configured, causing pages to become exposed. This is the sort of business that helps cause source code leaks (remember the Facebook code leak late last year?).

Of course, this points to the greater issue that library files shouldn’t be web accessible, but I have also seen this paradigm used in common CMS applications where you have a .php file include a .inc file that contains the bulk of the page logic. Here, again, you would be exposing highly sensitive application logic to the world.

If you really want to denote files differently, I prefer to use file prefixes. As in, classes might get a prefix like “class.[rest-of-filename]”. Or perhaps “function.[rest-of-filename]”. There’s even “include.[rest-of-filename]”. The point is, a prefix can’t kill you because the files retain the .php extension. :) Happy coding!

Filed under: PHP — Michi @ 1:38 pm

December 10, 2007

BUG: Constructors, Interfaces, and Abstracts Don’t Mix Well

I just discovered a bug today in PHP 5.1 (haven’t confirmed if it was fixed in newer versions). When trying to enforce interface arguments on constructors, PHP behaves unexpectedly. Normally, interfaces allow you to enforce argument counts or types in child class methods, but not with the constructor (and probably destructor).

Crash course on interfaces: An interface lets you as a developer dictate a standard for a class. For example, you might write an interface for interacting with your class. Then other people who want to interact with your class would “implement” your interface. This would force their classes to have a certain set of methods, of which you dictate their names and argument counts (and types). This way, your class is always guaranteed these implementer classes have certain key methods. As a real life example, it’s like saying an interface for a Car would have methods like brake($amount), gas($amount), steer($direction), etc, and the User class would be able to have a guaranteed way of interacting with the Car object (i.e., $user->getCar('Ferrari')->steer('left')). Abstract methods exist in abstract classes and are essentially the same thing. Read more about these here and here.

First, here is an example of a typical interface:

class ExampleClass {}

interface TestInterface {
	public function output(ExampleClass $var);
}

class Test implements TestInterface {
	// error, no output() method was defined
}

The following fails too:

class ExampleClass {}

interface TestInterface {
	public function output(ExampleClass $var);
}

class Test implements TestInterface {
	public function output($var) {} // error, wrong argument type
}

Here is the same example but with the __construct method instead:

class ExampleClass {}

interface TestInterface {
	public function __construct(ExampleClass $var);
}

class Test implements TestInterface {
	// error, no __construct() method was defined
}

Up to here, it works as expected. However, if you define the constructor, the __construct method argument datatype/count checks go out the window:

class ExampleClass {}

interface TestInterface {
	public function __construct(ExampleClass $var);
}

class Test implements TestInterface {
	public function __construct() {} // NO ERROR
}

Despite the data types and argument count being off, PHP doesn’t care. Even if I define an argument in the constructor, the datatype check is ignored. So the best you can do is force a __construct() definition to be required, but you can’t dictate its arguments (i.e., interfaces for constructor methods are useless). And finally, for those of you really astute readers:

class ExampleClass {}

abstract class AbstractTest {
	abstract public function __construct(ExampleClass $var);
}

class Test extends AbstractTest {
	public function __construct() {} // NO ERROR
}

This problem produces the SAME results if instead of an interface, abstract methods in an abstract parent class are used.
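If you need the guarantee anyway, one possible workaround is to check the constructor yourself at runtime using reflection. The sketch below is only an illustration under a few assumptions: it needs PHP 7.1 or newer (for ReflectionNamedType), the helper name constructorAccepts() is made up, and it only looks at the first constructor parameter.

```php
<?php
class ExampleClass {}

class LooseTest {
    public function __construct() {}
}

class StrictTest {
    public function __construct(ExampleClass $var) {}
}

// Hypothetical helper: true if the first constructor parameter of
// $className is declared with the type $expectedType.
function constructorAccepts($className, $expectedType) {
    $constructor = (new ReflectionClass($className))->getConstructor();
    if ($constructor === null || $constructor->getNumberOfParameters() < 1) {
        return false;
    }
    $parameters = $constructor->getParameters();
    $type = $parameters[0]->getType();
    return $type instanceof ReflectionNamedType
        && $type->getName() === $expectedType;
}

var_dump(constructorAccepts('LooseTest', 'ExampleClass'));  // bool(false)
var_dump(constructorAccepts('StrictTest', 'ExampleClass')); // bool(true)
```

A check like this could run in a factory or a unit test, so a class with the wrong constructor is caught before it is ever instantiated.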

Filed under: PHP — Michi @ 4:04 pm