6 Predictions of 2007 – More Spam, Less Paper, Bigger Google

I thought it would be cool to look back on this post in 2008 and see how I did. These are 6 predictions I believe may come true by the end of this year.

Google will grow 20% to $600 a share.

I’ve already explained this in depth. In short, Google Checkout, the radio ad agency they purchased, YouTube, and an entry into the CPA ad market will fuel this growth. Of course, this growth won’t be until the end of the year when they report their Q3 and Q4 earnings. Q1 earnings may disappoint due to the new costs of running YouTube. But these costs will be offset with the sponsorship of YouTube by various content owners in the remaining quarters.

The single biggest stock spike will come when Google formally begins public CPA ad network trials.

Ruby remains the new Python and does not surpass any of ASP, VB, C, C++, C#, or PHP, and does not enter the enterprise market in any significant way.

Sure, a few startups such as Digg may start out on Ruby and make it, but I predict now that no major entrenched corporation that goes online, nor one that is already online, will switch to Ruby. The only enterprise Ruby applications that will exist will be small startups that grew large. Ruby will replace Python as PHP replaced Perl (in the mind share sense).

Ruby on Rails made headway while there was no competition, but now there is plenty in .NET, Java, and PHP. When the Rails hype dies down, people will have to compare apples to apples again: Ruby as a language compared to others. While many have discussed its beauty and elegance, comments like that certainly didn’t help Python much either.

Dell rebounds, Apple grows more, Microsoft grows for once, and Vista makes it to laptops.

Now that there is a new operating system out, we will see some renewed spending on computers. Pay particular attention to Q4, when Dell should report large earnings on its laptops, right around when Microsoft gets out its first service pack. Microsoft’s Zune will continue to flounder while its operating system will make major inroads, on laptops at least. Luckily for them, the Xbox division will finally turn in some profits, offsetting the cost of the Zune. The corporate desktop scene will hardly change at all this year for Microsoft, as nobody can justify the huge cost of top of the line hardware for a new operating system when you can buy great XP machines from Dell for $300.

Meanwhile, Apple will release two major new products in the next three months. One will be “iTV”, and the other will be a new top of the line iPod. This may be the “iPhone” or just a new video iPod. Either way, this will continue to boost Apple’s stellar iPod sales, keeping its revenue strong and the halo effect stronger. We should see continued growth in the Apple market share as new people decide to give Apple a try.

Electronic paper will see its first true mainstream applications in the US, but it won’t catch on for another year.

Why a prediction about electronic paper? Because I think it will become hugely prevalent within the decade, creeping into virtually everything that touches electricity.

There is one product that could appear this year that will make e-paper big (and invalidate half my prediction): e-photo frames. Right now, there are those annoying “plug-in” photo frames. An e-paper version would mean the photo could sit without a power source, only requiring it during uploading. Since so many photos are digital these days, this would be a huge plus for people looking for an easy way to frame their photos themselves.

Otherwise, e-paper made an appearance in India late last year on a cell phone, and I think we’ll see production here in the US. But, it probably won’t sell very well due to an over-emphasis of the feature. I further predict that it will NOT appear in the old-media publishing industry (newspapers) because of its new-technology-averse nature and large trial cost (distributing readers). I give it a 50/50 chance that the credit industry picks this technology up this year in select trial markets. There is a small chance that a portable music player maker picks this technology up – but it won’t be Apple.

I think there is a very high likelihood that we will see a product that will allow someone to copy pages into a digital “handout” used in presentations (with a next button). Again, this product will probably languish in obscurity because photocopying just isn’t that inconvenient. E-paper’s real power will come when they have large scale production in swing, allowing for e-billboards and e-whiteboards (with pressure sensitive e-writing). That, and, of course, reading tablets. But these won’t hit the mainstream market for another year or two.

IE will still be #1, but monopoly abuse, no more.

That’s right, Internet Explorer 7 will continue to dominate. However, the share will soon look like the iPod’s market share: 75% internationally (including IE6, that is). The prediction I am making here is that the downward spiral of IE will slow and even stop in some markets. Part of the reason is that Microsoft will try to make a new IE-only client-side application framework. IE7 just isn’t that bad, and Firefox’s growth is slowing down now that all willing early-adopters (and their friends) are tapped. That said, 2007 will still see a continued decline for IE.

Overall, IE will be weakest in Europe while Firefox will continue to gain share until Firefox averages about 25% there (IE gets 70%). In the US, Firefox will rise to 20% while IE will lose another 5% to end up at 75%, thanks to the continuing security leaks being reported. But don’t be mistaken — Microsoft is not going to be losing the browser war anytime soon. They just won’t be given a free ride anymore.

Microsoft will try to re-exert its monopoly power to make client-side IE-only web applications. This means the first IE-only Microsoft desktop web applications will show up late this year, complete with new .NET libraries, and they will do well because they will center around Office. Most other software development businesses will not follow Microsoft and ignore 20% of the market, although some will. Firefox and friends won’t have a formal answer to the new technology until 2008, although portions of it will be emulated in Flash based implementations such as Flex.

IP technology will hit our homes, but not our living rooms.

While IP phones are making real progress in replacing conventional lines, IP TV will not replace the television set. While YouTube will grow in prominence, it will remain “just” a website until 2008 when mobile broadband technology (in the US) is mature enough to allow handheld streaming. While major content producers will sign up for YouTube, they won’t get on board with their arms wide open for at least a year due to the potential disruptive effect the medium will have on the traditional cash cow: prime time TV ad spots.

None of this will ever become mainstream until regular TV viewing takes a dip, which won’t happen until a vast portion of the population becomes more comfortable with streaming videos online than with watching commercial breaks on their TV sets. TiVo will have its day too, just not until near the end of the year when cable providers and networks realize they will be screwed out of the TV pie if they don’t react quickly. TV may eventually become a second monitor used only for media viewing, but if this does happen in a big way this year, it will either happen with Apple’s “iTV,” or not at all.

At the very least, I predict that YouTube will host a big event, such as the Super Bowl, a live news broadcast, or anything else that is live (or very close to live) and thus can be streamed with commercials and be *just like regular TV* in that it will be shown in a parallel time slot with the TV counterpart. I know they did a New Year’s thing this year (just online), but I’m thinking bigger and less anti-social sounding.

    And those are my predictions for 2007.

    Dynamic Constants

    Yesterday, I hit a wall in PHP that took a little brain power to solve. I posted the solution on the web for everybody to see. A little back-story:

    Constants in PHP (the const keyword) can not have dynamic values. Defines can. In other words, only ONE of the two following lines is valid PHP code:

    define('THE_CURRENT_TIME', time()); // OKAY
    const THE_CURRENT_TIME = time(); // FAIL

    This gets annoying when you are building constants that are pieced together from other constants. Class constants are useful because they are scoped to their class, which means two constants can share the same name. For example, two classes can both have a constant called NAME, and you don’t have to worry. This is better because separate developers don’t have to know what defines are already being used.
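    To make the naming point concrete, here is a minimal sketch. The class names User and Product, and both define names, are made up for illustration and are not from my original code.

```php
<?php
// Hypothetical classes for illustration only.
class User {
    const NAME = 'user';
}

class Product {
    const NAME = 'product';
}

// Each constant lives in its own class scope, so the names never collide.
echo User::NAME, "\n";    // user
echo Product::NAME, "\n"; // product

// Defines, by contrast, all share one global namespace,
// so every name has to be unique across the whole project.
define('USER_NAME', 'user');
define('PRODUCT_NAME', 'product');
```

    Both classes keep the short name NAME, while the defines need a prefix to stay apart.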

    But PHP doesn’t allow constants to have dynamic values. This is annoying. Why not? Probably so that two separate instances of a class don’t have differing values for a given constant (imagine a const with the definition equal to rand()). Sure, it makes sense, but then why aren’t there read-only variables in classes!?

    Scraping my post:

    In realizing it is impossible to create dynamic constants, I opted for a “read only” constants class.

    <?php
    abstract class aClassConstant {

        /**
         * Setting is not permitted.
         *
         * @param string constant name
         * @param mixed new value
         * @return void
         * @throws Exception
         */
        final function __set($member, $value) {
            throw new Exception('You cannot set a constant.');
        }

        /**
         * Get the value of the constant
         *
         * @param string constant name
         * @return mixed
         */
        final function __get($member) {
            return $this->$member;
        }
    }
    ?>

    The class would be extended by another class that would compartmentalize the purpose of the constants. Thus, for example, you would extend the class with a DbConstant class for managing database related constants, that might look like this:

    <?php
    /**
     * Constants that deal only with the database
     */
    class DbConstant extends aClassConstant {
       
        protected $host = 'localhost';
        protected $user = 'user';
        protected $password = 'pass';
        protected $database = 'db';
        protected $time;
       
        /**
         * Constructor. This is so fully dynamic values can be
         * set. This can be skipped and the values can be
         * directly assigned for non dynamic values as shown 
         * above.
         *
         * @return void
         */
        function __construct() {
            $this->time = time() + 1; // dynamic assignment
        }
    }
    ?>

    You would use the class like this:

    <?php
    $dbConstant = new DbConstant();
    echo $dbConstant->host;
    ?>

    The following would cause an exception:

    <?php
    $dbConstant = new DbConstant();
    $dbConstant->host = '127.0.0.1'; // EXCEPTION
    ?>

    It’s not pretty, nor ideal, but at least you don’t pollute the global name space with long winded global names and it is relatively elegant.

    Variables must be *protected*, not public. Public variables will bypass the __get and __set methods!! This class is, by design, not meant to be extended much further than one level, as it is really meant to only contain constants. By keeping the constant definition class separate from the rest of your classes (if you are calling this from a class), you minimize the possibility of accidental variable assignment.
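    The protected-versus-public point is easy to check. The sketch below repeats a compact copy of the base class so it runs on its own; LeakyConstant is a hypothetical subclass written only to show the difference.

```php
<?php
// Compact copy of the base class from above, so this runs on its own.
abstract class aClassConstant {
    final public function __set($member, $value) {
        throw new Exception('You cannot set a constant.');
    }
    final public function __get($member) {
        return $this->$member;
    }
}

// Hypothetical subclass with one protected and one public member.
class LeakyConstant extends aClassConstant {
    protected $host = 'localhost'; // guarded: outside code goes through __get/__set
    public $user = 'user';         // NOT guarded: directly accessible, no magic call
}

$c = new LeakyConstant();

$c->user = 'changed';   // silently succeeds, __set is never called
echo $c->user, "\n";    // changed

$message = '';
try {
    $c->host = '127.0.0.1'; // inaccessible from here, so __set runs and throws
} catch (Exception $e) {
    $message = $e->getMessage();
}
echo $message, "\n";    // You cannot set a constant.
echo $c->host, "\n";    // localhost
```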

    Managing this instance may be a slight pain that requires either caching a copy of the instance in a class variable, or using the factory pattern. Unfortunately, static methods can’t detect the correct class name when the parent name is used during the call (e.g., DbConstant::instance()). Thus there is no elegant, inheriting solution to that problem. Thus, it is easier to simply manage a single instance that is declared using conventional notation (e.g., new DbConstant…).
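    For the "cache a copy of the instance in a class variable" option, one rough way to do it is to declare the accessor on the concrete class itself, which sidesteps the parent-class lookup problem. DbSettings and instance() are hypothetical names for this sketch, and the class is kept standalone so the snippet runs by itself.

```php
<?php
// Hypothetical constants holder; stands in for a class like DbConstant.
class DbSettings {
    private static $instance = null; // cached copy of the single instance
    protected $host = 'localhost';

    // Declared on the concrete class itself, so no parent-class lookup is needed.
    public static function instance() {
        if (self::$instance === null) {
            self::$instance = new self();
        }
        return self::$instance;
    }

    // Read-only access to the protected members.
    public function __get($member) {
        return $this->$member;
    }
}

echo DbSettings::instance()->host, "\n"; // localhost
```

    The cost is that every constants class has to repeat the accessor, which is exactly the inelegance I was complaining about.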

    It’s not the best solution I’ve ever come up with, but it’s a hell of a lot better than having giant config files with 50 character long defines in them.

    Why an abstract class? Well, the whole point of constants, as I mentioned, is so that you don’t have a big pile of names in one namespace (so you can have two constants called “NAME,” for example). Any benefit of constants would be lost if I made a single unified “constants class.” Thus, I made it abstract. An abstract class can not be used without first being extended. This forces a developer who might use my code to properly separate out the various constants he/she may have into their own class groupings.

    I Hate Magic Quotes

    Today, I’m going to give away some source code! Celebrate! I wrote the code to address a relatively common problem among new programmers: the over-reliance on Magic Quotes.

    Do you know what Magic Quotes are? It’s the annoying feature in PHP that goes around randomly (okay, not so random, per se) modifying your data to protect you from yourself. If you have it turned on and someone types in crap in your web form, well… Let me show you.

    Original:

    I hate magic quotes because it’s awesome at screwing up otherwise good content. And if you’re unlucky and decide to edit your text, it tends to add even more backslashes into your pretty content. Thus, stuff like ‘\’ becomes \’\\\’.

    Freak version after I decide to edit the content:

    I hate magic quotes because it\’s awesome at screwing up otherwise good content. And if you\’re unlucky and decide to edit your text, it tends to add even more backslashes into your pretty content. Thus, stuff like \’\\\’ becomes \\’\\\\\\’.

    Awesome, huh? Do you see that freak-show at the end there? That’s right: if you forget to strip backslashes out of your content before you let people edit stuff they’ve saved, it can get progressively worse, adding in backslashes like your mother added veggies to your meal when you were a kid. It’s that bad. And it’s very common.
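    If you want to watch the backslashes pile up yourself, here is a small sketch. It emulates what magic quotes does by calling addslashes() by hand, which is my stand-in for the automatic behavior rather than the feature itself.

```php
<?php
$original = "it's";

// First save: the quote gets escaped (emulated here with addslashes).
$once = addslashes($original);   // it\'s

// Edit and save again without stripping: the backslash itself gets escaped too.
$twice = addslashes($once);      // it\\\'s

echo $once, "\n", $twice, "\n";

// Stripping once per escaping pass gets back to the original text.
echo stripslashes(stripslashes($twice)), "\n"; // it's
```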

    WHY?

    Of course, the logic for the feature is obvious. The designers of PHP decided that it was better for the content to get jacked up than for millions of developers everywhere getting fired for letting 15 year old hackers run “DROP DATABASE” in their corporate servers (for those of you who don’t know what I mean, that command equates to Armageddon on your servers).

    But I still hate it. It smells of noobishness, and it encourages sloppy coding. Having to strip slashes out of your code is not the way you should be doing things. There are four reasons I argue against magic quotes.

    1. If you are having to use magic quotes, you’re already committing tons of SQL sin.
    2. Not all servers you work with will have magic quotes on by default. Programming for security should be a defensive practice, and, thus, programmers should be trained to assume the least secure environment.
    3. Magic quotes alone won’t protect you from SQL injection attacks. Character encoding can be used to pass in un-escaped single quotes.
    4. This feature will be gone in PHP 6.

    I’m not going to go too much into #1 because that list is way too long to cover. In short, you should be using prepared statements to minimize SQL injection, and using a single, unified database abstraction class to handle all your querying to centralize security weaknesses.
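    As a rough idea of what a prepared statement looks like, here is a sketch using PDO. It assumes the pdo_sqlite extension is available and uses an in-memory database; the user table and its data are made up for the example.

```php
<?php
// Assumes the pdo_sqlite extension is available; the table and data are made up.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT)');

// The value travels separately from the SQL text, so quotes need no escaping.
$insert = $db->prepare('INSERT INTO user (name) VALUES (:name)');
$insert->execute([':name' => "O'Reilly"]);

// A classic injection string is treated as plain data, not as SQL.
$select = $db->prepare('SELECT name FROM user WHERE name = :name');
$select->execute([':name' => "x' OR '1'='1"]);
$rows = $select->fetchAll(PDO::FETCH_COLUMN);
echo count($rows), "\n"; // 0

// The real name still matches, apostrophe and all.
$select->execute([':name' => "O'Reilly"]);
$names = $select->fetchAll(PDO::FETCH_COLUMN);
echo $names[0], "\n"; // O'Reilly
```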

    The second point is important. You can’t be a great body guard if you assume nothing is ever suspicious. Same goes for being a programmer writing secure SQL. Of course, the whole point of magic quotes is to prevent the body guard from forgetting to check one of the closets, even though he checked every other room in the 1000 room house. People forget, and that’s unfortunate. Magic quotes is that paranoid body guard that goes around handcuffing anything that moves, and it’s your job to go around and free each person. It’s the guy that walks around nailing every door shut and insists everybody lives in a bullet proof glass box. There has to be a better way, and there is.

    The third point is the one novices rarely know. You can pass in invalid characters into a query that then get converted, thanks to Mr addslashes, into a single quote. This relates to the inconsistency of converting single-byte characters into multi-byte characters. As the article quoted here mentions:

    Whenever a multi-byte character ends in 0x5c (a backslash), an attacker can inject the beginning byte(s) of that character just prior to a single quote, and addslashes() will complete the character rather than escape the single quote. In essence, the backslash gets absorbed, and the single quote is successfully injected.

    Yeah, it’s gibberish to me too. The point is, converting certain hex values in certain foreign languages screws up and can leave you with a hanging single quote. Magic quotes don’t save you from that.
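    A byte-level look makes it a little less like gibberish. The sketch below uses GBK as the example encoding, which is my choice for illustration; the quote above does not name one. It only prints the bytes, so nothing here depends on a database.

```php
<?php
// 0xbf is not a complete character on its own in GBK; 0x27 is a single quote.
$input = "\xbf\x27";

// addslashes() works byte by byte and inserts 0x5c (a backslash) before the quote.
$escaped = addslashes($input);
echo bin2hex($escaped), "\n"; // bf5c27

// Read as GBK, the bytes 0xbf 0x5c form ONE two-byte character,
// which leaves the trailing 0x27 as a bare, unescaped single quote.
```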

    Lastly, PHP 6 isn’t going to have magic quotes on by default. Might as well take off those training wheels now.

    The Fix

    I have two solutions for you.

    The first is to write a database abstraction layer. What? A database abstraction “layer” is a fancy word for a class that manages your database connection and data manipulation. I gave a brief example of how to write one of these a while back. Another is to create a function (or method, if you’re talking about classes) that does INSERT and UPDATE querying for you. Such a function’s prototype would look like this:

    function perform($tableName, $data, $whereClause)

    The $tableName variable is a string, such as “user”. The $data variable is an array where each key is a column name and each value is the value to be assigned to that column. This data is sanitized (addslashes and what not) before being inserted. The $whereClause variable is used as a suffix to an UPDATE query (ex. “WHERE user_id = 1”). This method commonly exists in many open source projects.

    The problem with this method is that you still have to escape variables manually for the WHERE clause. And it doesn’t even cover how to do SELECT statements safely.
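    One way around the WHERE clause problem is to have the function hand back placeholders and a parameter list instead of escaped strings. This is a hypothetical variation on the prototype above, not the version found in those open source projects: buildPerform and the $whereParams argument are names I made up, and it swaps addslashes for prepared-statement placeholders.

```php
<?php
/**
 * Hypothetical sketch: builds an INSERT (no where clause) or an UPDATE query
 * with placeholders instead of escaped values.
 * Returns array($sql, $params) for use with a prepared statement.
 * Table and column names are assumed to come from the developer, never from
 * user input, and are not escaped here.
 */
function buildPerform($tableName, array $data, $whereClause = '', array $whereParams = []) {
    $columns = array_keys($data);
    $params  = array_values($data);

    if ($whereClause === '') {
        $placeholders = implode(', ', array_fill(0, count($columns), '?'));
        $sql = 'INSERT INTO ' . $tableName
             . ' (' . implode(', ', $columns) . ') VALUES (' . $placeholders . ')';
        return [$sql, $params];
    }

    $assignments = [];
    foreach ($columns as $column) {
        $assignments[] = $column . ' = ?';
    }
    $sql = 'UPDATE ' . $tableName . ' SET ' . implode(', ', $assignments)
         . ' ' . $whereClause;

    // WHERE values are appended after the SET values, matching placeholder order.
    return [$sql, array_merge($params, $whereParams)];
}

list($sql, $params) = buildPerform('user', ['name' => "O'Reilly"], 'WHERE user_id = ?', [1]);
echo $sql, "\n"; // UPDATE user SET name = ? WHERE user_id = ?
```

    It still says nothing about SELECT statements, so the second complaint stands.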

    So I sat here thinking for a bit about this, trying to think of a simple, elegant solution to this problem that would help PHP beginners everywhere. Before I hand out my solution for free, let’s go over the main points:

    1. SQL injection attacks (vulnerabilities) come from putting user provided input directly into SQL queries. User input comes from $_POST, $_GET, and sometimes $_COOKIE.
    2. Data that is passed around by the developer that was retrieved from the database is mostly safe since it has already been sanitized (if sanitization happened before it was loaded in).
    3. Data from files, XML, or other forms of potential stream input need to be sanitized as well. But this would be done manually.

    The Class

    That said, I wrote a class that solves the main point. With it, all POST, GET, request, and cookie data can be accessed through a nice clean abstraction layer. The goal is that if you’re using this, you’d avoid using un-sanitized data, unless you meant to. For example, to access the $_GET['name'], $_POST['name'], $_REQUEST['name'], or $_COOKIE['name'] variables, you’d call:

    $safe = new DbSafe();
    // O'Reilly becomes O\'Reilly
    $name = $safe->get('name');
    $name = $safe->post('name');
    $name = $safe->request('name');
    $name = $safe->cookie('name');

    If you wanted to get the original unmodified values, you’d call:

    $safe = new DbSafe();
    // O'Reilly is still O'Reilly
    $name = $safe->get('name', TRUE); // notice the second parameter
    $name = $safe->post('name', TRUE);
    $name = $safe->request('name', TRUE);
    $name = $safe->cookie('name', TRUE);

    That even takes into consideration whether or not magic quotes are on. In other words, if magic quotes are on and your variables are getting slashed up, the code I show above would spit out the original version that was typed in by the user. What good is my library if it didn’t do some auto-detection, eh? =)

    If you wanted to escape a value manually, you’d say:

    $safe = new DbSafe();
    // It's becomes It\'s
    $escapedValue = $safe->escape($value);

    Or…

    // It's becomes It\'s
    $escapedValue = DbSafe::escape($value);

    If you wanted to escape an entire array, you’d say:

    $safe = new DbSafe();
    $escapedArray = $safe->escapeArray($array);

    Or

    $escapedArray = DbSafe::escapeArray($array);

    All of these examples would convert a string (or an array of strings) that said:

    Hello, my name’s Michi

    To:

    Hello, my name\’s Michi

    When you saved this into the database, that little backslash disappears so next time you read it, it looks like this:

    Hello, my name’s Michi

    No need to strip anything! If you want to directly access the values without stupid slashes being automatically added in (“magically,” if you will), my class supports that as a secondary measure.

    Is this class the end-all-be-all for secure programming? No. Really, the better solution that I won’t give away today is to write a strong database abstraction layer. But this will do most of your dirty work without requiring magic quotes, and without making developers think PHP has some kind of built in “security.” Remember, you can’t always rely on magic quotes being on, nor should you.

    You can get the source here.

    Left Join Snafu

    How embarrassing. I learned something new today that I really should have known for some number of years now. Left joins can increase the result set size. 

    Here’s what I thought left joins do: When you combine two tables together with a left join, the source table (the one on the left) becomes the “anchor” for the results, guaranteeing that each and every record in the left table shows up in the result. If there are results in the right table that don’t correspond, those results are omitted. If there are results in the left table that don’t have corresponding records with the right table, those records are shown either way. For example…

    Let’s say table A has 10 records pertaining to people’s names. And table B has five records pertaining to where those people live. No people live in two places.

    If you did a left join on these two tables, you’d end up with five people and their addresses and five people (NULL sets) with no address information.

    And…

    Let’s say table A has 10 records pertaining to people’s names. And table B has 12 records pertaining to where those people live, where each person in A has a record in B. But two of those records don’t match up with anything in table A because some person records were accidentally deleted (oh no!).

    If you did a left join on these two tables, you’d end up with 10 people with information about where each one lives. The extra records in B are simply ignored. 

    Okay. That part was easy. Everybody knows that, even your grandmother. Let’s take this a few notches up.

    Now if table A has 10 records pertaining to people’s names. And table B has 15 records pertaining to where people live. And this time, those extras are no mistake! Because a bunch of people live in two places, thanks to vacation homes.

    If you did a left join on these two tables, what happens? Well, embarrassingly, I predicted this sucker wrong. Assuming all 10 people from A are mentioned in B with some mentioned twice or more, the result would have 15 records!! What!? 15!? Yeah, that was my reaction too. I thought MySQL would spit back 10 and ignore duplicates in B.

    Let’s do one more example. How many records will we find if we join the following scenario:

    Table A has 10 records pertaining to people’s names. And table B has 15 records pertaining to where people live. One guy has 15 vacation homes and everybody else is homeless (no records in B).

    Ok. Do a left join. Not an inner join. Not a regular join. A left join. How many results do we get, huh?

    Our result would be 24! Who the hell guessed that? Well, probably some of my more pretentious Computer Science readers, but certainly not me (so that’s what you learn in CS, huh?). It is 24 because you have 15 duplicate records for the one rich guy and 9 default records for the homeless saps. 
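    If you don’t trust my arithmetic, here is a small sketch that imitates a left join over plain PHP arrays. The leftJoin function, the column names, and the sample data are all made up for illustration; a real database does the same thing far more efficiently.

```php
<?php
/**
 * Minimal left join over arrays: every left row appears at least once,
 * and once per matching right row when there are matches.
 */
function leftJoin(array $left, array $right, $key) {
    $result = [];
    foreach ($left as $l) {
        $matched = false;
        foreach ($right as $r) {
            if ($r[$key] === $l[$key]) {
                $result[] = array_merge($l, $r);
                $matched = true;
            }
        }
        if (!$matched) {
            $result[] = array_merge($l, ['address' => null]); // the NULL row
        }
    }
    return $result;
}

// Table A: 10 people.
$people = [];
for ($id = 1; $id <= 10; $id++) {
    $people[] = ['person_id' => $id, 'name' => 'Person ' . $id];
}

// Table B: 15 homes, all owned by person 1.
$homes = [];
for ($n = 1; $n <= 15; $n++) {
    $homes[] = ['person_id' => 1, 'address' => 'Vacation home ' . $n];
}

echo count(leftJoin($people, $homes, 'person_id')), "\n"; // 24
```

    Spread the same 15 homes so that everybody has at least one and the count drops to 15, which is the vacation home scenario from earlier.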

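    Thus, as long as each record in B matches at most one record in A, the maximum number of records a left join can yield is sizeof(record set A) + sizeof(record set B) - 1. (If a record in B can match several records in A, the ceiling rises to sizeof(record set A) × sizeof(record set B).) Why is this never explicitly mentioned!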

    For a long time, I thought left joins meant the result set can never be more than the row count of the result set in the left table. I don’t know how I managed to go through this many years without realizing my error, but I suppose through good query structuring and table use, I never encountered a problem with this until now… And, to my credit, it wasn’t a query I wrote either.

    I have never seen this behavior mentioned in any documentation (even MySQL documentation). It seems to be an implicitly assumed function of the command. In fact, I found several examples out in “tutorials” about left joins, that conveniently left out mentioning this fact, but still showed it as an unexplained portion of their results. Nice.

    For all of you non-Computer Science gurus, I hope you learned something new from reading this post. Wasted about an hour of my time.

    Error Reporting

    For those of you striving to become great PHP developers, make sure you code with error reporting set to report on “strict” mode. This particular error reporting type gives suggestions on your code for commonly made mistakes due to sloppiness. The ideal configuration setting for error reporting is:

    E_STRICT | E_ALL

    Or in PHP, you would start the page with:

    error_reporting(E_STRICT | E_ALL);

    Voting Machine Software

    So I’m sure you’ve read a thing or two about all those crazy electronic voting machines being inaccurate. One thing I find slightly perplexing is why the misrepresented votes seem to always be in favor of the Republican party. I don’t get it. It’s not like the voting machine companies would be so blatant or stupid to try to rig an election so outright. Especially in a world that is already so suspicious of electronic machines. But if it were purely a bug, wouldn’t it be equally likely that Democrats benefit? Of course this could always be explained by the fact that perhaps there is a procedure in place for inputting candidates and Republicans and Democrats get placed into the system in a specific order (such as Democrats being added in first). Who knows.

    So with the constant attention those digital voting machines get, a lot of people ask, “WHAT is so difficult about writing software that tallies votes?” Now I’m not one to study the Diebold machines, but I thought it would be interesting to pick at the problem.

    Database Issues

    First of all, the votes must be logged. But not just any log. It must be secure and immune from tampering. And when I say “tamper,” I am talking about from everybody. That includes the developers, the database administrators, the voters, and the polling staff. I can only begin to imagine that they use a bunch of one way hardware encryption and md5 checksums.
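    I have no idea what the real machines do, but the checksum idea can be sketched as a chained log, where each entry’s checksum also covers the checksum before it. Everything here is hypothetical: appendVote and verifyLog are made-up names, and I used SHA-256 through PHP’s hash() rather than md5.

```php
<?php
// Appends a vote to the log; each entry's checksum covers the previous checksum.
function appendVote(array $log, $candidateId) {
    $previous = count($log) > 0 ? $log[count($log) - 1]['checksum'] : '';
    $log[] = [
        'candidate' => $candidateId,
        'checksum'  => hash('sha256', $previous . '|' . $candidateId),
    ];
    return $log;
}

// Recomputes the chain; an edited entry no longer matches its stored checksum.
function verifyLog(array $log) {
    $previous = '';
    foreach ($log as $entry) {
        $expected = hash('sha256', $previous . '|' . $entry['candidate']);
        if ($entry['checksum'] !== $expected) {
            return false;
        }
        $previous = $entry['checksum'];
    }
    return true;
}

$log = [];
foreach ([1, 2, 2, 1] as $vote) {
    $log = appendVote($log, $vote);
}
var_dump(verifyLog($log)); // bool(true)

$log[1]['candidate'] = 1;  // someone flips the second vote
var_dump(verifyLog($log)); // bool(false)
```

    This only catches edits by someone who cannot recompute the whole chain, so on its own it does nothing against the developers or administrators mentioned above.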

    The votes would need to be isolated from each other from the data integrity perspective: if vote #35252 breaks the system, all prior votes (#1 through #35251) must remain unscathed. Although most modern databases use transactions to ensure data integrity, I would imagine there is no fool proof means without creating a replica of the vote on a second or third physical location.

    Of course, such data replication causes problems in the event data is inconsistent. What happens if the primary fails and the vote was only recorded on one of the two slaves? Do you count that half vote? What if a replication error had occurred where one slave copied something differently from the primary? Which is right? These things happen (database corruption) and they usually tend to clump up together to result in catastrophic failures.

    Purposeful Fraud Issues

    Let’s attack this from another angle. The main culprit to election day problems will probably be human “error.” An electronic machine must protect against this. Unlike a punch card that the actual human physically pokes, a digital machine does the card punching for you (on its hard drive), which is almost like telling someone to punch in your vote as you specify.

    There have been instances of a programmer placing bugs in slot machines that gave them jackpots if they bet in a certain order. There have been cases of system administrators leaving back doors into the servers. There’s a huge list of historical events that show that no system, no matter how hard a company tries, is secure from malicious employees. But that is exactly what this system must be designed to fight. How would you ensure it is safe? Peer code reviews? Multi-part passwords that require three separate people with three separate passwords to authenticate? Physical keys, like the ones you see in movies, where both people have to have different keys turned at the same time to open a machine? Okay, so let’s say you somehow secure your employees. The problem doesn’t stop there.

    I’m setting up the machines. “Let’s see,” I say to myself with a grin, “Kerry is going to be candidate 1, and Bush will be candidate 2… for now.” At the end of the night, I go back and say, “Oops, I meant 1 equates to Bush and 2 equates to Kerry!” With any regular database, this is entirely possible, and everybody’s votes just got reversed. Of course a smart voting machine would never let you change around the names for a created record. But then again, hackers don’t need to worry about that.

    So the voting machine company decides that you “can’t” change the name of a candidate after it’s been put into the system. What happens if I were to put in a second “Bush” to dilute his votes between his mystical twin? Or what happens if I create a new candidate half way through the election under his name? Well, in some instances, the software might just show him twice (this is good) or in others, it would show him once (this is very bad). In crappier software, that of course means voters would be voting for one OR the other “Bush,” but nobody would know exactly which.

    Of course the voting company would protect us from ourselves by ensuring candidates can’t be added in after the machine is shipped out. But therein lies another problem.

    Synchronizing Issues

    Let’s say you’re running the voting company that is running an election across a few dozen districts. Of course, all the votes must be tallied. A “Bush” vote in one county must group up with a “Bush” vote in another. But how? The human answer is to use the name, but realistically, we know that another “Bush” might be running under a different position in some counties. You can’t just use the name as the qualifier because it is not unique. So you would use IDs, I presume.

    But of course this means every machine must use an ID that is not internal to it. You would say, “All 1’s are Kerry’s and all 2’s are for Bush!” Now that this is decided, you would have shipped out all of the machines to only accept votes for Kerry = 1 and Bush = 2. And when the machine gets back, you would save it into the main system as 1 = Kerry and 2 = Bush.

    But where’s the sanity check? Who knows what happened while that box was out there in the wild. How do you know that 1 is indeed still representing Kerry for that box? How do you know that everybody that voted “Kerry” on that box got saved in as a “1”? This is even more of a problem if you do the counting right in the same place that the voting is taking place.

    And even if you did use names, despite it being a horrible idea, how do you know that a “Kerry” vote got saved as “Kerry?” For all you know, there is a bug, and all Kerry votes are getting saved as “Bush” and all Bush votes are getting saved as “Nader” because someone forgot that array indices start at 0, not 1 (theoretical technical explanation for how these bugs could arise).
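    Here is how small that kind of bug can be. This is a purely hypothetical sketch; the array layout and variable names are invented and have nothing to do with any real voting machine.

```php
<?php
// Candidate names as the hypothetical machine stores them (0-based array).
$candidates = ['Kerry', 'Bush', 'Nader'];

// The ballot screen numbers choices from 1, as people naturally would.
$choiceOnScreen = 1; // the voter picked the first name, Kerry

// Bug: the 1-based choice is used directly as a 0-based index.
$buggy = $candidates[$choiceOnScreen];       // Bush

// Fix: convert to a 0-based index before the lookup.
$correct = $candidates[$choiceOnScreen - 1]; // Kerry

echo $buggy, ' vs ', $correct, "\n";
```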

    So of course, that means you would write a binary log of all activities that box experienced. But what is this log for? Auditing? Shouldn’t auditing be happening at every step of the way regardless? If anything, problems are much harder to catch in the digital version of voting so this audit trail would rarely if ever be used except in the most extreme cases. Okay, so I’ve convinced you that it should be used all the time, right? Okay, but then what?

    Is it being replicated? Is it safe from incomplete transactions? Will a corrupted insert break the entire file? What happens if the power cuts out right as it is writing a record? Is the whole file toast? Suddenly you realize the log file must also use a database to ensure its integrity. Possibly on a separate process to ensure it is isolated from the main vote records.
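    One common way to keep a single bad write from ruining the whole file is an append-only log in which every record carries its own checksum. The sketch below assumes Node.js and its built-in crypto module, and it works on an in-memory string rather than a real file; the function names are invented for this example.

```javascript
const crypto = require("crypto");

function checksum(text) {
  return crypto.createHash("sha256").update(text).digest("hex");
}

// Encode one event as a single line: "<sha256 of payload>\t<JSON payload>\n".
// JSON.stringify escapes tabs and newlines, so a record never spans lines.
function encodeRecord(event) {
  const payload = JSON.stringify(event);
  return `${checksum(payload)}\t${payload}\n`;
}

// Read a log back, keeping every record whose checksum matches and
// reporting the line numbers of the ones that do not.
function readLog(logText) {
  const good = [];
  const badLines = [];
  logText.split("\n").forEach((line, index) => {
    if (line === "") return; // trailing newline
    const tab = line.indexOf("\t");
    const sum = line.slice(0, tab);
    const payload = line.slice(tab + 1);
    if (tab === -1 || checksum(payload) !== sum) {
      badLines.push(index + 1);
    } else {
      good.push(JSON.parse(payload));
    }
  });
  return { good, badLines };
}

// Simulate the power cutting out partway through the third record.
const log = encodeRecord({ seq: 1, candidateId: 1 }) +
  encodeRecord({ seq: 2, candidateId: 2 });
const torn = log + encodeRecord({ seq: 3, candidateId: 1 }).slice(0, 20);
const result = readLog(torn);
```

    The torn third record is flagged and the first two survive. This says nothing about replication or process isolation, which are separate problems.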

    But what the hell is the point of all this? If there is going to be a discrepancy, shouldn’t it have been caught during testing? Why go through all this trouble double logging and replicating all of this data?

    Conclusion

    The last point is the most important. You’ll notice that through simple logic, we suddenly had to have tons of auditing overhead to do something so simple. And despite your best testing efforts, things that should be absolutely positively without error are still being audited to ensure their integrity. So what happens when you overlook one of these “no-brainer” assumptions?

    You get voter fraud.

    This only covers some theoretical problems that I might face when trying to put together a voting machine. I would assume a well-funded corporation would generate a list of problems 10x this length. While tallying votes may be simple in concept, if your application must be 200% bug-free and hacker-proof, developing the application becomes immensely difficult.

    This still doesn’t explain my original thought about the Republican vote bias though.

    Cookies and Frames

    Sometimes, I just don’t have the time to blog even if I have something interesting to write about. I also try to only update with relevant posts. FYI.

    I spent last week fixing up some JavaScript on a site that has frames. While I have plenty of past experience that tells me using frames is bad, this one took the cake. Here’s why.

    1. Everything is dandy until you have to have one frame access the other’s variables. This did not go smoothly in IE7. I was very frustrated with that since it was “supposed” to work.
    2. JavaScript functions need to be placed globally. This was a problem when I had two frames work closely together until one frame got logged out by the system and popped to the login screen (complete with missing function declarations)!!

    The moral of that story is not to use frames!

    Also, I got to play with cookie manipulation in JavaScript, which was new to me. Cookies, by the way, were the eventual solution I used to have my two frames pass data back and forth. Anyway, cookies are WEIRD in JavaScript.

    They are accessed through the “property” document.cookie. However, unlike regular properties, when you assign a cookie, you use a different format than when you output it. In other words, assignment looks like:

    document.cookie = "name=value; path=/; max-age=3600";

    Whereas reading the cookie (document.cookie) returns the name=value pairs for *all cookies* visible to the current page. Oh, and in one giant semicolon-delimited string. Annoying.

    My advice on using cookies in JavaScript? Write a class that manages them first.
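    Here is a minimal sketch of such a class. The name CookieJar and its methods are my own invention, not a standard API, and it takes the document as a constructor argument only so it can be exercised outside a browser. Values are run through encodeURIComponent on the way in and decodeURIComponent on the way out, so reading a cookie written by other code with a malformed percent sequence would throw.

```javascript
class CookieJar {
  // `doc` is any object with a browser-style `cookie` property.
  constructor(doc) {
    this.doc = doc;
  }

  // Split the "a=1; b=2" string into a Map of name to value.
  all() {
    const result = new Map();
    for (const pair of this.doc.cookie.split(";")) {
      const trimmed = pair.trim();
      const eq = trimmed.indexOf("=");
      if (eq === -1) continue; // skip empty or nameless entries
      const name = decodeURIComponent(trimmed.slice(0, eq));
      const value = decodeURIComponent(trimmed.slice(eq + 1));
      result.set(name, value);
    }
    return result;
  }

  // Return one cookie's value, or null if it is not set.
  get(name) {
    const cookies = this.all();
    return cookies.has(name) ? cookies.get(name) : null;
  }

  // Assigning to document.cookie sets ONE cookie; the others are untouched.
  set(name, value, maxAgeSeconds) {
    let text = `${encodeURIComponent(name)}=${encodeURIComponent(value)}; path=/`;
    if (maxAgeSeconds !== undefined) text += `; max-age=${maxAgeSeconds}`;
    this.doc.cookie = text;
  }

  // A max-age of 0 asks the browser to expire the cookie immediately.
  remove(name) {
    this.set(name, "", 0);
  }
}
```

    In a browser you would create it with new CookieJar(document) and then call set, get, and remove without ever touching the giant string yourself.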

    What NOT to Name Your Form Fields

    This one is a programming post. Did you know you should never, ever, EVER name a field in your form “submit”? If you’ve read my previous posts, you’ll recall that in JavaScript a function is just a value stored under a name, like any other variable. A form also exposes each of its fields as a property named after the field. So if you stupidly name your submit button “submit,” which I’ve seen done all the time, the field takes over that name and you overwrite your form’s ability to call the submit method!!!

    Don’t do this:

    <input type="button" name="submit" value="push me" />

    Or…

    <input type="text" name="submit" value="" />

    Etc. It all messes up your JavaScript!

    In other words, the following otherwise working function calls completely break (and you get cryptic errors about undefined functions or incorrect parameter counts):

    document.forms['form-name'].submit();
    this.form.submit();
    document.getElementById('form-id').submit();

    They all fail because “submit” now refers to your form field that you created, which clearly isn’t the function you thought you were calling!
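    A rough sketch of two defenses follows. The first is a check you could run over your field names; RESERVED_FORM_NAMES and findShadowingNames are hypothetical names, and the list is deliberately partial. The second is a browser-only workaround for a form you can’t rename: it calls the original method from HTMLFormElement.prototype, so the field that hides it on the form itself doesn’t get in the way.

```javascript
// A partial, illustrative list of form members that a field name can hide.
const RESERVED_FORM_NAMES = new Set([
  "submit", "reset", "action", "method", "target", "elements", "length",
]);

// Return the field names that would hide a built-in form member.
function findShadowingNames(fieldNames) {
  return fieldNames.filter((name) => RESERVED_FORM_NAMES.has(name));
}

// Browser-only: submit a form even when a field named "submit" hides
// the method on the form object.
function safeSubmit(form) {
  HTMLFormElement.prototype.submit.call(form);
}

const offenders = findShadowingNames(["username", "submit", "password"]);
```

    Renaming the field is still the better fix; the workaround just gets you out of a jam when the markup isn’t yours to change.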

    Catching up (8, 9, 10)

    I’ve been a little busy with moving, so that is my excuse for being AWOL. Let’s continue.

    I learned that Ikea dressers require far more effort to build than beds. Probably twice as much time and effort.

    I learned that the California Department of Corporations has wacky accounting that seems to work against their best interest. I just received a full refund for a processing fee they kept requesting last year. Strange.

    I learned that it is unexpectedly rare to find web developers who have tried creating their own object-oriented database abstraction layer. I found it is even rarer to find developers who took this abstraction layer and made the (in my opinion) relatively obvious step toward creating a generalized abstraction layer that removes the need to write SQL 90% of the time. For those of you who haven’t thought about this before, creating such a layer, the pride and joy of the Rails movement, is relatively simple. While there are many schools of thought on how to accomplish this, I think a simple place to start is to set up something like this:

    // Notice my example works for any table you want.
    $object = new DBLayer('tablename');

    // Runs the equivalent of:
    // SELECT * FROM tablename WHERE primary_key_field = 30;
    $object->load(30);

    // Overload PHP 5's __set() method (see the documentation) and store
    // the value in an internal array so that table fields like
    // 'tablename' don't accidentally erase object settings.
    // Thus, "$this->mData['username'] = $value;"
    // Just see the documentation of __set(). Trust me.
    $object->username = 'new username';

    // Runs the equivalent of:
    // UPDATE tablename SET username = 'new username'
    // WHERE primary_key_field = 30;
    $object->save();

    There are many ways to figure out the primary key. One idea is to standardize the primary key name so that “tablename” always has a primary key of “tablename_id”. Another idea is to dynamically determine it by running a “DESC tablename” and caching the results. Think it over. It’s an interesting and highly instructive challenge. My example may be a little advanced, but this is the starting point of those shiny “Rails frameworks” you hear about.
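    To show the moving parts in one place, here is a rough JavaScript sketch of the same idea. It is not the PHP class from the example: it swaps the __set() magic for an explicit set() method, it assumes the “tablename_id” convention for the primary key, and it only builds the SQL text and parameter list instead of talking to a database. All method names are made up for illustration.

```javascript
// Table and column names end up inside the SQL text, so only plain
// identifiers are accepted; values always travel as parameters.
function assertIdentifier(name) {
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(name)) {
    throw new Error(`Invalid identifier: ${name}`);
  }
  return name;
}

class DBLayer {
  constructor(tableName) {
    this.tableName = assertIdentifier(tableName);
    this.primaryKey = `${this.tableName}_id`; // naming convention from the text
    this.id = null;
    this.data = {}; // column values live here, apart from object settings
  }

  // Build the SELECT that a load() call would run.
  loadQuery(id) {
    this.id = id;
    return {
      sql: `SELECT * FROM ${this.tableName} WHERE ${this.primaryKey} = ?`,
      params: [id],
    };
  }

  // Stand-in for PHP's __set(): remember a changed column value.
  set(column, value) {
    this.data[assertIdentifier(column)] = value;
  }

  // Build the UPDATE that a save() call would run.
  saveQuery() {
    const columns = Object.keys(this.data);
    if (this.id === null || columns.length === 0) {
      throw new Error("Nothing to save");
    }
    const assignments = columns.map((column) => `${column} = ?`).join(", ");
    return {
      sql: `UPDATE ${this.tableName} SET ${assignments} WHERE ${this.primaryKey} = ?`,
      params: [...columns.map((column) => this.data[column]), this.id],
    };
  }
}

const user = new DBLayer("users");
const selectQuery = user.loadQuery(30);
user.set("username", "new username");
const updateQuery = user.saveQuery();
```

    Keeping column values in their own data object is the same trick as the internal array in the PHP comments: a column that happens to be called “tableName” can’t clobber the object’s own settings.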