Update: Rewrite Rules on Apache 1.3 are Greedy By Default

Well, that was annoying! Some of you may have noticed intermittent outages on my blog while I was trying to fix the URLs. Scary stuff. Anyway. This post will be gibberish if you don’t understand regular expressions. If you’re one of those people, I suggest you turn back now. 🙂 This is, after all, a technical blog too. 😛

What I discovered was that I was incorrect in my post last night about the catch-all .htaccess entry that would redirect all traffic from michikono.com/blog. Here’s the wrong rewrite rule:

RewriteRule ^/?(.*) http://www.michiknows.com/$1/ [R=301,L]

The goal was to take whatever text came after “michikono.com/blog” (such as the post name), and stick it on the end of “michiknows.com” so that all the old articles translate over without outages. Unfortunately, I noticed a few problems.

The correct code is as follows:

# match just blog
RewriteCond %{REQUEST_URI} blog/?$
RewriteRule . http://www.michiknows.com/ [R,L] 

# match blog posts
RewriteRule (.*) http://www.michiknows.com/$1 [R,L]

Why two regular expressions? Well, I couldn’t use the reluctant modifier (“?”) to make it catch the case when there was no trailing slash. Thus, no matter what I did, it would act as if there was no trailing slash. This broke stuff such as the RSS feed!!

The problem was that Apache 1.3 uses a greedy catch-all by default that cannot be disabled. In other words, the “*” can’t be made non-greedy by adding a “?” behind it, because Apache 1.3 matches with POSIX extended regular expressions, which have no reluctant quantifiers. This is possible in most modern regex implementations (anything Perl-compatible, for example). The warning sign is that when you put a question mark behind a “+” or “*”, Apache gives you an error!
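To see what the reluctant modifier would have bought me, here is a small sketch in Python. This is not Apache: Python’s re module is only standing in as an engine that does support “?” after “*”, and the sample path is made up.

```python
import re

path = "my-post/"  # hypothetical request path with a trailing slash

# Greedy: (.*) grabs everything, including the trailing slash,
# so the optional /? at the end is left with nothing to match.
greedy = re.match(r"^(.*)/?$", path).group(1)

# Reluctant: (.*?) grabs as little as possible, so /? gets the slash.
reluctant = re.match(r"^(.*?)/?$", path).group(1)

print(greedy)     # my-post/
print(reluctant)  # my-post
```

With the greedy version, the captured text keeps whatever trailing slash the visitor typed, which is exactly the behavior I could not switch off.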

So my new solution breaks the problem into two steps.

  1. First, I check specifically for a hard link to the blog home page, which may or may not contain a trailing slash. If so, it will just forward it to this site with a trailing slash.
  2. Then I set up a second catch-all rule that just does a straight search and replace. It doesn’t bother with the trailing slash stuff at all since it just snips the entire URL and tacks it on.
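The two steps can be modelled roughly like this in Python. It is a sketch of the logic only, not of how mod_rewrite actually evaluates rules: the function name redirect_target and the sample paths are made up, and I am assuming the rules live in the blog/ folder so that the second rule only sees what comes after “blog/”.

```python
import re

NEW_SITE = "http://www.michiknows.com/"

def redirect_target(request_uri):
    """Return the URL a request under /blog would be redirected to."""
    # Step 1: the blog home page, with or without a trailing slash.
    if re.search(r"blog/?$", request_uri):
        return NEW_SITE
    # Step 2: everything after "blog/" is tacked onto the new domain as-is.
    relative = request_uri.split("/blog/", 1)[-1]
    return NEW_SITE + relative

print(redirect_target("/blog"))           # http://www.michiknows.com/
print(redirect_target("/blog/my-post/"))  # http://www.michiknows.com/my-post/
```

Note that, like the condition it imitates, the first check is not anchored at the start, so it would also fire for any address that merely ends in “blog”.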

Why does it seem like the second rule could replace the first? Because the greedy operator acts weird and doesn’t behave as I want. I tried for an hour straight. Believe me. No matter what hack or work around I used, there was always a case that no longer worked (usually the home page bug). And to be honest, regular expressions with Apache are just plain horrible to work with. This solution finally worked, and is what I will settle with (even as I write this post, I tried three other solutions that should work in any other regular expression environment — but failed to generate positive results).

So if you ever decide to move your blog, try the above solutions before giving up.

Digital Transmission Right – The Anti-DRM Proposal of Bennett Lincoff

So the big news today is the new open letter against DRM. If you don’t want to read a 28 page PDF white paper, keep reading; I have summarized the article’s most important points.

An IP law attorney named Bennett Lincoff has thrown his hat into the ring, but unlike everybody before him, he has offered a solution. Lincoff is suggesting introducing a new type of digital music distribution right  (“digital transmission right”). Here are its upsides:

  • All other traditional distribution rights as applied to digital dissemination are abolished.
  • Regular consumers can copy their own music to other devices or mediums without fear.
  • Downloading non-DRM music would be the norm. No music would need DRM ever again.

But it doesn’t equate to unrestricted free distribution. And this is the key part.

  • Any website that would be distributing music would need to obtain a license.
  • Any centralized P2P network would need to obtain a license.
  • Webcasts would need to obtain licenses.
  • Users of social networking sites would need to obtain licenses unless the website already has one (key point).

How do the record companies make money? Consumers would flock to sites that have the newest, highest quality content where they can share and mingle. This means sites like MySpace, YouTube, and others would need to obtain licenses to continue to operate legally. Seeing as I just named two extremely popular web sites that largely owe their popularity to copyrighted material, you can see how there is definitely a market for this. The record companies make money by selling the right to distribute to these services.

The juicy stuff starts on page 12. Here’s a blurb about lawful operation of such services:

Licensed services, being lawful, would be able to operate openly, attract investment capital (without exposing investors to copyright infringement liability), and offer users the most sophisticated functionalities. … Service providers who obtain through-to-the-user licenses would have a competitive advantage over those who do not even though they would be required to pay license fees. The availability of through-to-the-user licenses under the digital transmission right would provide a positive economic incentive for service providers to secure the authorization they need.

The above quote also mentions another key point in his proposal, the ability of service providers to obtain “through-to-the-user” licenses, which would essentially be a license authorizing its users to share with each other through its service. In other words, because the license is explicit and replaces the old rights, it is now very clear that not obtaining a distribution license would be stupid. Thus, he argues licenses would be adopted widely.

Operators of centralized P2P networks would be jointly and severally liable with their network participants who share recorded music with others on the network. … Alternatively, a single license held by the operator of the network could authorize all digital transmissions of the licensed recordings through the network.

This means P2P would stay alive. Consumers are able to share and download music without fear of having personal liability – if they are at a properly sanctioned website.

Let’s be clear here: he is not advocating a free file-sharing utopia where nobody pays a dime. He suggests that P2P sites may need to charge to offset this new cost. But let’s think about this. If I can download all the music I want every month, with zero liability, no DRM, at CD quality, the ability to copy and burn it to as many separate devices as I want, and it costs money… Well, it doesn’t sound too shabby, right?

And this means iTunes will still be around. Nothing about it would change except now you would be able to download music without DRM. It also means Rhapsody and other streaming services will be forever changed. This is because his proposal no longer distinguishes between streams and downloads. All that matters is that it was transmitted. Today, there is no difference between a stream and a download except usually the stream has more DRM in it.

I just touched on a very important concept that Lincoff stresses. Because consumers are now able to copy media for personal use without restriction, a whole new market would open up. iPods work with your DVR that works with your laptop that works with your car audio system that works with your computer that works with your Zune that works with… well you get the point. Right now, the lack of interoperability between music devices is a huge obstacle that makes pirating more attractive than ever. He is arguing that with such barriers gone, new and innovative devices will flood the market, further pushing music into everybody’s daily lives, thus further increasing demand for legal channels to obtain the music.

This guy is smart, and has thought through his arguments. For example, he stresses how the natural ecosystem of the Internet will favor legal channels if this distribution model is used.

Decentralized P2P file-sharing networks, on the other hand, do not have network operators … Accordingly, each participant in a decentralized P2P file-sharing network would be responsible for securing authorization for their own conduct on that network. … And again, it stands to reason that the vast majority of consumers who are interested in P2P would likely seek out networks that had secured licenses that authorize their file-sharing activities…

A decentralized file-sharing network would stick out like a sore (and very liable) thumb because you as a user would be liable for whatever you share. Besides, why use such a service when there are other legal alternatives? Well, let’s be honest, some people always will. But through the logic of the above example, it’s clear that there will be a very strong demand for a legal downloading service.

Wait, but how is this different from today? I mean, it’s still illegal to be copying music over a decentralized network, right? Nothing is different!

Wrong. Sharing is legal on sanctioned web sites. Public perception will be different.

As in, YouTube could get a license letting users upload copyrighted content. Napster could buy a transmission license and exist exactly as it did in 1999 (assuming they can make the money back). It means my Rhapsody account lets me download whatever I want whenever I want.

Perhaps the most interesting point he makes is about litigation becoming accepted.

Under these circumstances there would be no justification for public outcry over the industry’s litigation campaign against those who continue to infringe.

He has a point. Under his model, pirating would be less attractive than it has ever been. His solution addresses both the economic (convenience) and social (“it’s okay because the RIAA is greedy”) reasons people pirate. He argues that action is urgent by reminding the reader that (slow) broadband adoption rates are the only thing keeping pirating at a still relatively low point:

The worst outcome for the music industry would be if worldwide broadband penetration overwhelms the industry’s ability to police unauthorized distribution of recordings before a full, fair and feasible solution for the digital music marketplace is in place.

While I haven’t gone into it here, the last 10 or so pages are dedicated to discussing royalties. He has thought through how the royalty structure could work in a global digital economy. He suggests royalties are paid by assessing both where the transmission originates and ends. He also discusses how the royalties should be divided between the owners of the works. Amazing.

In short, his approach doesn’t just swing blindly at the music industry and DRM. Rather, he takes an intelligent, unbiased, and fair stance that shows that a true compromise can yet be reached. I like his solution, how about you?

Who Else Wants to Hide Their WordPress Folder?

Tonight, I solved a very old problem in WordPress security among novice users. I will show you how to hide your WordPress admin directory while still being able to use it! When I say “hide,” I mean you can rename the wp-admin folder to whatever you want!

The Code (for people who don’t want to read)

Copy and paste the following into your .htaccess file (located wherever your WordPress folder is) to “rename” your wp-admin folder! If you are having trouble editing your .htaccess file, you should Google around for that as it’s beyond the scope of this article (or post a question in the comments and maybe another person can help).

  • Change YOURSECRETWORDHERE to something else. It can be any word you want. Just make sure it’s unique and somewhat long. Make it, like, your pet’s name or something random. Read this post to understand why this matters.
  • Change ADMINFOLDER to the new folder name you want. Letters, numbers, underscores, and dashes only. That ^ in front of it is on purpose. Don’t delete that.

RewriteEngine On
RewriteBase /
##### Michi’s code is BELOW #####
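RewriteCond %{REQUEST_URI} wp-admin/
RewriteCond %{QUERY_STRING} !YOURSECRETWORDHERE
RewriteRule .*\.php - [F,L]
RewriteCond %{QUERY_STRING} !YOURSECRETWORDHERE
RewriteRule ^ADMINFOLDER/(.*) wp-admin/$1?%{QUERY_STRING}&YOURSECRETWORDHERE [L]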
##### Michi’s code is ABOVE #####
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]

Note: there are a few drawbacks to this hack. Read the bottom of this post for those.

The Explanation

My adventure started when I read a pretty terrible piece of advice that suggested using the .htaccess file to restrict who sees your admin section by IP. Great, so if I’m at work, I can’t log in. If my IP changes, I can’t log in. If I’m at Starbucks, I can’t log in. That’s ridiculous. That’s not a solution!

But it’s on the right track. The .htaccess file can do a lot.

Oh, and if any WordPress developers ever read this, please make the WordPress admin folder name a variable you can change! It is ridiculous that it is hard-coded.

The .htaccess file shines best when it is used for URL rewriting rules. For you non-programmers, the next block explains a little about what I just said. If you don’t care, skip it.

It is good for making URLs access files that don’t necessarily exist on the server exactly as they appear in the URL. For example, Digg.com uses URL rewrites to hide file and variable names. So the URL digg.com/videos certainly does not point to a file or folder actually called “videos”. Rather, it probably turns into something like digg.com/somefilename.ext?type=videos. The point is, you can hide what’s actually happening behind the scenes. I hope you get the idea.
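If you want to see the idea in action, here is a tiny sketch in Python. It only mimics what a rewrite rule does; the function name rewrite is made up, and somefilename.ext is the same invented file name as in the paragraph above, not anything Digg really uses.

```python
import re

def rewrite(url_path):
    """Turn a pretty path like /videos into the file that really handles it."""
    match = re.match(r"^/(\w+)/?$", url_path)
    if match is None:
        return url_path  # anything else is left alone
    return "/somefilename.ext?type=" + match.group(1)

print(rewrite("/videos"))  # /somefilename.ext?type=videos
```

The visitor only ever sees /videos in the address bar; the translation happens on the server.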

Disabling the wp-admin Folder and Creating a Secret Mirror Folder

There are two steps in blocking access to the wp-admin folder. Disabling it is easy, but making it still functional is the hard part. Additionally, there are CSS files and other dependencies in that folder that must still be used. So after disabling it, a condition must be added that makes it only be disabled when appropriate.

RewriteCond %{REQUEST_URI} wp-admin/
RewriteCond %{QUERY_STRING} !YOURSECRETWORDHERE
RewriteRule .*\.php - [F,L]

  1. The first line says, “If the word wp-admin is found in the URL…”
  2. The second line says, “And if the query is missing our password…”
  3. The third line says, “And it’s a PHP file… deny access.”

We’ll get to that password thing in a minute. At this point, if you visit wp-admin/, it will not work. Halfway there!

The next part is the guts of it all. We get to set our very own admin folder! I want to call my admin folder “secret_room”. So here’s how the code would look:

RewriteCond %{QUERY_STRING} !YOURSECRETWORDHERE
RewriteRule ^secret_room/(.*) wp-admin/$1?%{QUERY_STRING}&YOURSECRETWORDHERE [L]

This next block is for you technically oriented people:

The first part basically makes sure the rule doesn’t trigger itself later (recursive condition). This is basically saying “if the URL starts with ‘secret_room,’ then replace that part with wp-admin. Then, add in the query string (things after the question mark). Finally, add in the secret word.”

Now, if I go to the folder secret_room/, it will work just like wp-admin used to!

Don’t use “secret_room.” That’s my example. You use whatever folder name you want. Letters, numbers, underscores, and dashes only.

But we’re not done yet. That secret word thing needs to be customized. Why? Well, try this. Go to your blog’s wp-admin folder, but this time, add on “?YOURSECRETWORDHERE” on the end and it will work too (as in, myblog.com/wp-admin/?YOURSECRETWORDHERE)! Curious why? If you’re a little geeky, read the next block. Otherwise, skip it.

Well, this hack works by changing the URL you type in by adding that “secret word” on the end of it. It only does this when someone visits the “secret_room” folder. But it doesn’t add it on when you just type in the wp-admin/ folder (or any other location). Then, when someone is looking at a wp-admin folder, it looks to see if that secret word is in the URL. If you went to the URL by hand, you likely did not type that word in. But the “secret_room” always makes sure the secret word is attached. This is how it distinguishes between visiting wp-admin directly, and visiting it through the mirror folder. Remember that any re-writing of the URL happens behind the scenes, so your browser won’t show you what’s going on.
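For the geeky crowd, the decision described above can be modelled in a few lines of Python. This is only a sketch of the logic, not how Apache runs it: the function name handle, the sample file names, and the order of the checks are my assumptions for illustration, and it ignores details like directory index handling.

```python
import re

SECRET = "YOURSECRETWORDHERE"  # placeholder secret word from this post
MIRROR = "secret_room"         # example mirror folder name from this post

def handle(path, query=""):
    """Return the path served behind the scenes, or 403 if access is denied."""
    # Visiting the mirror folder: swap in wp-admin and attach the secret word.
    match = re.match("^" + re.escape(MIRROR) + "/(.*)", path)
    if match and SECRET not in query:
        path = "wp-admin/" + match.group(1)
        query = query + "&" + SECRET
    # A PHP file inside wp-admin without the secret word is refused.
    if "wp-admin/" in path and SECRET not in query and re.search(r"\.php", path):
        return 403
    return path

print(handle("wp-admin/index.php"))                      # 403
print(handle("secret_room/index.php"))                   # wp-admin/index.php
print(handle("wp-admin/index.php", "YOURSECRETWORDHERE"))  # wp-admin/index.php
```

The last call shows why the secret word has to be changed: anyone who knows it can still reach wp-admin directly.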

Since I just gave this same code to about 10,000 people, it’s in your best interest to change your secret word to be unique to you. Note that nobody will ever see it, including you. You will forget what it is, and realistically, it doesn’t matter what the hell you set it to. As long as it’s not the default one I just gave to you. Ideally, it should be long and something highly unlikely to appear in a URL. Try your name, then maybe add your favorite color. I don’t know. Just do something random. Case matters.
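If you can’t think of anything, you could let a script pick the word for you. Here is a minimal sketch using Python’s standard secrets module; the function name make_secret_word is made up, and the length is just a guess at “long enough.”

```python
import secrets

def make_secret_word(nbytes=16):
    """Return a random word made only of letters, digits, "-" and "_"."""
    # token_urlsafe produces URL-safe text, so it will not break the query string.
    return secrets.token_urlsafe(nbytes)

print(make_secret_word())
```

Paste whatever it prints in place of YOURSECRETWORDHERE, everywhere it appears.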

Here is what the final .htaccess, ideally, should look like:

RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} wp-admin/
RewriteCond %{QUERY_STRING} !YOURSECRETWORDHERE
RewriteRule .*\.php - [F,L]
RewriteCond %{QUERY_STRING} !YOURSECRETWORDHERE
RewriteRule ^secret_room/(.*) wp-admin/$1?%{QUERY_STRING}&YOURSECRETWORDHERE [L]
# BEGIN WordPress
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# END WordPress

Benefits and Drawbacks to Hiding wp-admin

This hack has its drawbacks.

  • The “edit” link on your posts will no longer work. You may want to remove it from your theme.
  • The admin link on your side bar will no longer work. You may want to remove it from your theme.
  • The standard login link will no longer work. Instead, use a bookmark as it will redirect you back to your hidden login page after you finish logging in.

Note that the first two drawbacks can be addressed by editing wp-includes/link-template.php, lines 248 and 263: change “wp-admin” to your new folder name. However, this hack would need to be redone whenever you upgrade WordPress. If you make these changes, the links will only be visible to users who have permission to see them anyway.

There are a few significant upsides:

  • If ever again there is another vulnerability that hits the WordPress wp-admin folder, you are very likely immune.
  • Upgrading WordPress doesn’t un-hide the folder. It will persist through upgrades.

Remember, this hack will not protect you from having an insecure admin password. Although, it could protect you from a hacker since he won’t know where to go after successfully logging in (hah!).

Lastly, be careful when doing this. If you type something wrong, you’ll get server errors (I believe error code 500). Make sure you type it in exactly as you see it in these examples first. Then change one part at a time.

Changing the Admin User

One other point I noticed when tightening up my security was the default admin user name. Now, hah, this is assuming they actually brute force my password and then figure out how to get to the admin folder… good luck.

I noticed that I had an admin user account under the login name “admin”. Well, that’s a no-brainer. I went into the database and ran the following query:

UPDATE wpt_users SET user_login = '[my new username]', user_nicename = '[my new username]' WHERE wpt_users.ID = 1 LIMIT 1;

That solves another part of the problem. Now hackers have to guess not only my password, but also my username.

In Closing…

If you like what you’ve read, I’d appreciate it if you could Digg/Reddit/Stumble this article. 🙂

Michi Knows – Dedicated Blog URL

Update: These re-write rules are wrong. See this post for correct rewrite rules.

My friend gave me a suggestion to make myself a “brand” out of this blog by calling it “Michi Knows”. Apparently, he stumbled into the name while thinking about the URL of my website. Funny story.

Anyway, I liked the name so I bought it. What’s another domain, right? Besides, it was bugging me that my blog had no title and had weird formatting.

Porting WordPress over to this domain wasn’t all fun and games. Most importantly, I have tons of incoming links I wasn’t about to 404. My goal was to make sure every single link out there coming to my site would redirect correctly. Here’s how I did it.

There is a file called .htaccess that web servers use to set up rules for processing requests. Or in other words, when someone visits a server, that file is checked and any rules listed in it dictate what to do next. My blog was previously located in a folder called blog/. I placed a .htaccess file there and put in the following content:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^/?(.*) http://www.michiknows.com/$1/ [R=301,L]
</IfModule>

Ignore that weird HTML tag thing. What’s important is the stuff in between.

  1. The first line is telling the server “hey, we’re gonna do some funky stuff with the URL.”
  2. The second line tells the server “hey, everything is relative to where we are now” (blog/).
  3. The third line says “change anything you see into http://www.michiknows.com and then put whatever stuff came after ‘blog/’ on the end of the new URL.”

Thus, the above code would change:




Notice the distinction. The (.*) means “anything”, and the $1 refers to the “anything” found in between the parentheses. The “R” means redirect, and the “=301” marks the redirect as permanent, which matters mostly for things like search engines. The L stands for “Last” – as in: “this is the last rule, stop processing more”.
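Here is the same capture-and-paste idea as a rough Python sketch. It is not Apache, just the same regular expression run by Python; the function name old_redirect and the post name my-post are made up for the example.

```python
import re

OLD_RULE = r"^/?(.*)"  # the pattern from the rule above

def old_redirect(path):
    """Mimic the rule: capture "anything" and paste it into the new URL."""
    captured = re.match(OLD_RULE, path).group(1)  # what $1 would hold
    return "http://www.michiknows.com/" + captured + "/"

print(old_redirect("my-post"))   # http://www.michiknows.com/my-post/
print(old_redirect("my-post/"))  # http://www.michiknows.com/my-post//
```

In this model the second call ends up with a doubled slash, because the captured text already ends in one and the rule adds another.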

Let me know if anything in my new blog is broken. 🙂

Here are some things I considered before moving that I hope you all consider if you ever move blogs:

  • Moving means killing any page-rank I may have gained. My blog had a page-rank of 5 in only a few months. I figure I can do it again. 🙂
  • Moving means losing a lot of visibility in search results. Now that the content is on an unknown domain, results can omit me a lot. I don’t even know what Technorati is going to do!
  • Any tracking you have of users is destroyed due to the changing of cookie domains. Yep, now everybody is anonymous again.
  • Incoming link tracking in WordPress completely breaks. It currently says “No results found.”

Anyway, no big deal. I look forward to finally having a blog-only domain. 🙂 Thanks Jackson for the name idea.