Welcome to the Mobile Web 2009

I just realized that I will soon be learning a great deal about something I know next to nothing about, but am very interested in: mobile web development.  And that will be a large focus, albeit not the sole focus, of this blog in the foreseeable future.

I know next to nothing about mobile development, and am somewhat in the dark ages with my V3c (affectionately known as the RAZR), which has no data plan.  In fact, I would be somewhat intimidated to type web addresses into its browser.  It’s not the ideal way to browse the web, to say the least.

On the other side of the spectrum we have wonderphones such as the Apple iPhone, which is supposedly pure joy to use.  I’ve recently been indoctrinated into the Apple cult by purchasing a MacBook Pro earlier this year, but I suppose my devotion isn’t quite strong enough, as I don’t yet have the faith to baptize myself into the world of all things iPhone.

It’s not the price of the iPhone that scares me.  I know it’s a good phone, so I’d be happy plunking down upwards of $500 just to own it (the maps feature alone would save me hundreds of dollars in gas, as I’m a particularly terrible driver/navigator).  As a new AT&T customer, the price for me would actually only be $99.  What scares me is the roughly $100 monthly bill for the plan (I’m probably slightly exaggerating, but it’s close to that!).  By the end of the year, that totals the price of a brand new laptop.  And I’d much rather have the laptop.

Anyhow, back to the iPhone itself.  I hadn’t realized it, but the iPhone is part of something much bigger than mobile web development, something applicable to web development in general.  Its Safari browser has been continually implementing cool new HTML 5 features, and as those features gain support, web developers have been taking advantage of them.

That being said, there’s always the downside: many, many phones, such as my trustworthy RAZR, lag behind and don’t even come up to par.  In the desktop world, the bane of developers is Internet Explorer 6.0.  In the mobile world, I’m almost afraid to ask.  The sheer number of combinations of screen sizes and mobile browsers makes it humanly impossible to develop for ALL mobile browsers.

I suspect that most individual mobile developers decide to develop for only one phone, which happens to be the only phone at their disposal (more and more this is becoming the iPhone).  For the individual, this is really the only sane thing to do.

On the other hand, for big companies to focus on only one phone is extremely risky.  However popular the phone might be, the popular wisdom has always been not to put all of your eggs in one basket, right?  And besides, why develop for only one phone when you can develop for many, expand your reach, and ultimately increase sales?  (I say “ultimately”, but that’s basically the drive that was there from the beginning.)

Because this reach is unattainable for the individual developer, the task of researching and developing for these phones falls to the big corporations.  I like what Yahoo! is doing in principle with Blueprint, and I’m sure we’ll see similar services pop up soon from other companies (they may well exist already; I just don’t know of them).  It seems that making progress in this area will require either a huge effort on the part of one company or the collaboration of the community.  But so far the community (aka individual developers) seems unaware of, or unconcerned with, the larger phone market.  They’ve jumped into mobile development with a very limited focus, not realizing that most people simply don’t use an iPhone, however cool and promising it may be.

So what is the state of the mobile web?  It’s in the state I suspect it will always be in, just like desktop browsers: looking forward to the future while clinging to the past in the name of backwards compatibility.

A dead simple JavaScript Binary Search Tree (BST)

Nicholas Zakas has a nice writeup on creating a binary search tree in JavaScript as part of his computer science series, in which he explains basic CS principles such as bubble sorts and linked lists entirely in JavaScript.  This is a great idea in many ways, most of all because it further separates learning to program from learning a compiler (as with a statically typed language like C++).

For a beginner, Nicholas’s writeup will look a bit daunting.  It contains some advanced JavaScript concepts that beginners won’t be familiar with (namely prototype), along with some things that make the search tree look unnecessarily complicated at first glance.  There’s a fine line between writing great code and writing understandable code.  In this case, we should opt for the latter, since this is an educational exercise (and also because you would probably never want to use a binary search tree in JavaScript in practice, but that’s another subject).

So how then should we go about this?  I like the approach borrowed from philosophy that builds on first principles: start out small and gradually build onto your theory.  Likewise, we’ll start with a barely functional binary search tree (BST) and go from there:

[sourcecode language="javascript"]var SimpleBinarySearchTree = function () {
	// define our private variables
	var root = null;

	// node constructor
	var node = function () {
		var node = {
			value:  null,
			left:   null,
			right:  null
		};

		return node;
	};

	// create the root node
	root = new node();
	root.left = new node();
	root.right = new node();

	// recursive function that finds the correct place to insert the node
	var insert = function (value, curNode) {
		// if curNode isn't defined, set it to root
		curNode = typeof (curNode) !== 'undefined' ? curNode : root;

		if (curNode.value === null) {
			// empty spot has been found, so insert here!
			curNode.value = value;
			curNode.left  = new node();
			curNode.right = new node();
		} else {
			// still looking for a place to insert
			// if value is less than curNode, go down to the left node, otherwise go down to the right node
			insert (value, value < curNode.value ? curNode.left : curNode.right);
		}
	};

	// display to the console using inorder traversal
	var display = function (curNode) {
		// set default parameter for curNode to root
		curNode = typeof (curNode) !== 'undefined' ? curNode : root;

		if (curNode.value !== null) {
			display (curNode.left);
			console.log (curNode.value);
			display (curNode.right);
		}
	};

	// declare public functions
	return {
		insert: insert,
		display: display
	};
};

// time to test it out!
var bst = new SimpleBinarySearchTree();
bst.insert (5);
bst.insert (1);
bst.insert (10);
bst.insert (3);
bst.insert (9);
bst.insert (2);
bst.display ();
[/sourcecode]

Ok!  Now we're really getting down to it.

When we create a new SimpleBinarySearchTree(), an empty root node is created (line 17).  Then when we insert a value, there's a check to see whether the root node has a value yet (line 26).  If it doesn't, the root node is assigned that value.

When we insert more values after this, we start at the root node and compare it with the new value (line 34).  If the new value is less than the root value, we go down to the left, otherwise if the new value is greater than the root value, we go down to the right.

You will notice that insert() is a recursive function, since it calls itself.  If it hasn't found where to insert the new value, it goes down to the left or right of curNode as appropriate (see above), redefining curNode to be that left or right node, and repeats the process until it finds an empty node into which to insert the new value.  This is much easier to understand if you've encountered recursion before.  In Nicholas's version, a while loop is used instead, which is easier to comprehend but not as compact.  I opted for compactness.

Our display() function is also recursive, and works on the principle of inorder traversal, which means it first visits a node's left node, then the node itself, then the node's right node.  With the recursion trick shown above, we'll see all the numbers output to the console in order.

This is under the assumption that your browser has a console.log feature!  If it doesn't, just replace line 45 with alert(curNode.value);

Ok!  So that's the bare minimum binary search tree.  We have a function to insert values, and a function to check out and confirm that our output is in fact displayed in order!

At this point the tree is missing some nice functions that would be useful in practice:

  • a function to find whether a value is currently in the tree
  • a function to find and remove a value from the tree
  • a function to delete the entire tree
  • a function to find the depth of the tree
  • a function to find the number of elements in the tree

But these will have to come in another blogpost :)
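As a quick preview, here's a sketch of the first of those: a contains() lookup written in the same recursive style as insert() above.  The tree here is a condensed stand-in for SimpleBinarySearchTree, so the names are illustrative only:

```javascript
// Condensed BST with a contains() lookup added.
// contains() mirrors insert()'s recursion: descend left or right
// until we find the value or hit an empty node.
var SearchableBST = function () {
	var newNode = function () {
		return { value: null, left: null, right: null };
	};

	var root = newNode();

	var insert = function (value, curNode) {
		curNode = typeof (curNode) !== 'undefined' ? curNode : root;

		if (curNode.value === null) {
			// empty spot found, so insert here
			curNode.value = value;
			curNode.left  = newNode();
			curNode.right = newNode();
		} else {
			insert (value, value < curNode.value ? curNode.left : curNode.right);
		}
	};

	var contains = function (value, curNode) {
		curNode = typeof (curNode) !== 'undefined' ? curNode : root;

		if (curNode.value === null) { return false; }	// dead end: not in the tree
		if (curNode.value === value) { return true; }	// found it!
		return contains (value, value < curNode.value ? curNode.left : curNode.right);
	};

	return { insert: insert, contains: contains };
};

var tree = new SearchableBST();
tree.insert (5);
tree.insert (1);
tree.insert (10);
console.log (tree.contains (10));	// true
console.log (tree.contains (7));	// false
```

Note that contains() returns as soon as it knows the answer, so it visits at most one node per level of the tree.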

In the meantime, check out these links:

Nicholas Zakas's original post: Computer science in JavaScript: Binary search tree, Part 1

Edits to Nicholas's code which removes the root and implements recursion

$ is not defined error

I first ran into this error when I started using jQuery.  It turns out it’s a somewhat common (beginner’s) mistake: trying to run a script without first waiting for jQuery to finish loading, which results in a race condition.  The easiest fix is to make sure jQuery has its own separate script tag right above your own jQuery-dependent script:

[sourcecode language="html"]<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="myscript.js"></script>[/sourcecode]

This will ensure that myscript.js is loaded only after jQuery is loaded.  Of course there are other ways to do this, notably John Resig’s degrading script pattern, but the above solution will at least solve the “$ is not defined” error, which is the priority.

However, this may not be enough.  Theoretically it’s possible that jQuery doesn’t get loaded (due to some network error) but your script does.  In that case we come full circle to our original “$ is not defined” problem.  So how do we prevent this?

The first thing I thought of was to make sure $ is defined by wrapping it in an if-then statement.  Note that this does NOT fix the problem:

[sourcecode language="javascript"]//note: this method DOES NOT work!
if ($) {    //check if $ is defined - but this check results in an error!
    $("#myselector").click (function () {});    //my jQuery code here
}[/sourcecode]

With the above code, we run into the same problem!  Arg!  But don’t despair.  It turns out we need to prevent this sort of error by using the good old try-catch block:

[sourcecode language="javascript"]try {
    $("#myselector").click (function () {});    //my jQuery code here
} catch (e) {
    console.log (e.message);	//this executes if jQuery isn't loaded
}[/sourcecode]

And it works!  If $ is undefined (and therefore jQuery isn’t loaded), the error is caught and handled instead of exploding.

Since we don’t live in a perfect world, it’s safe to assume that jQuery won’t always load and that this “network error” scenario may well occur.  So it’s good practice to run a check like this before executing any code that depends on jQuery (and likewise for other scripts with other dependencies!).
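Incidentally, there’s one more way to dodge the error without a try-catch: typeof is allowed to inspect names that were never declared, so unlike the bare if ($) check, it won’t itself throw.  A minimal sketch:

```javascript
// typeof never throws a ReferenceError, even on an undeclared name,
// so this guard is safe whether or not jQuery loaded.
if (typeof $ !== 'undefined') {
	$("#myselector").click (function () {});	// jQuery-dependent code here
} else {
	console.log ("jQuery is not loaded");	// degrade gracefully
}
```

The try-catch version still has one advantage, though: it also catches errors thrown inside your jQuery code, not just the missing-$ case.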

Global variables and you

Lately I’ve been a bit addicted to answering questions on Yahoo! Answers.  I think it’s because of the excitement I feel when I can finally find a question I can answer!  :)

Anyhow, one of the recent questions was about global variables in JavaScript.  One person responded, incorrectly, that writing “var” in front of a variable name makes that variable global, citing documentation from Mozilla that they had misread.  So I jumped on my blog to correct this.  I was going to write that putting “var” in front of your variable name has no effect on whether the variable is global, and then I discovered that I myself had misread the documentation:

You can declare a variable in two ways:

With the keyword var. For example, var x = 42. This syntax can be used to declare both local and global variables.

By simply assigning it a value. For example, x = 42. This always declares a global variable and generates a strict JavaScript warning. You shouldn’t use this variant.

Oh!  So apparently writing “var” does matter!  For some reason I was under the impression that the fancy “module pattern” a la Crockford guaranteed the protection of variables within modules.  But I was wrong!  Check out the following code:

[sourcecode language="javascript"]var myModule = function () {
	var protectedVar = 'safe';	// "var" keeps this local to the function
	global5 = 'whoops!';		// no "var": this silently becomes a global variable
}();

alert (global5);	// alerts "whoops!" - global5 is visible out here[/sourcecode]

What I learned by testing it myself is something that apparently wasn’t made clear to me before: writing all your variables inside functions isn’t enough to protect them.  Even in Douglas Crockford’s “module pattern” there’s a way to foul up and end up with unintentional global variables.  That was especially surprising: I thought global5 was completely protected from the outside unless I returned it from the function.  But that isn’t the case – global5 is definitely accessible from outside the function.  Try it for yourself with a simple alert(global5) script!

Fortunately this can be remedied with a simple (yet annoying) habit: always declare your variables inside functions, and always put “var” in front of them.

Converting to XML or JSON with YQL

XML, JSON, YQL.  Is that enough acronyms for you?

XML and JSON you’ve probably heard of, but maybe not YQL, Yahoo!’s SQL-esque query language, released to the public late last year.  It’s primarily advertised as a service for accessing data from Yahoo! properties such as Flickr and Yahoo! News.  What you may not know is that you can also use it to access any XML/RSS feed or even raw HTML (!), as Chris Heilmann demonstrated on Ajaxian.  Yes, HTML!  This eliminates a lot of the headache of watching for updates on a page that doesn’t offer an RSS feed.  (YQL can read RSS feeds too, but it only exports to XML and JSON, so there’s extra work involved if you want to turn the HTML into RSS.)

What makes YQL especially awesome is the YQL Console, which lets you test queries instantly.  There are a few example queries to get you started, but they all use YQL tables from Yahoo! sites.  That’s cool, but what if we want information from any website?  Easy!

For instance, here’s a query to get the first 3 links on google.com in JSON format (with a callback of googlelinks):

[sourcecode language="sql"]select * from html where url="http://www.google.com/" and xpath='//a' limit 3[/sourcecode]

The console gives us a big REST URL and also shows us the output of the query (which you can see for yourself by visiting that URL):

[sourcecode language="javascript"]googlelinks({
 "query": {
  "count": "3",
  "created": "2009-02-25T03:45:00Z",
  "lang": "en-US",
  "updated": "2009-02-25T03:45:00Z",
  "uri": "http://query.yahooapis.com/v1/yql?q=select+*+from+html+where+url%3D%22http%3A%2F%2Fwww.google.com%2F%22+and+xpath%3D%27%2F%2Fa%27+limit+3",
  "diagnostics": {
   "publiclyCallable": "true",
   "url": [
    {
     "execution-time": "28",
     "content": "http://www.google.com/robots.txt"
    },
    {
     "execution-time": "53",
     "content": "http://www.google.com/"
    }
   ],
   "user-time": "85",
   "service-time": "81",
   "build-version": "911"
  },
  "results": {
   "a": [
    {
     "href": "http://images.google.com/imghp?hl=en&amp;tab=wi",
     "onclick": "gbar.qs(this)",
     "content": "Images"
    },
    {
     "href": "http://maps.google.com/maps?hl=en&amp;tab=wl",
     "onclick": "gbar.qs(this)",
     "content": "Maps"
    },
    {
     "href": "http://news.google.com/nwshp?hl=en&amp;tab=wn",
     "onclick": "gbar.qs(this)",
     "content": "News"
    }
   ]
  }
 }
});[/sourcecode]
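Since the response is wrapped in our googlelinks callback, using it in a page is just a matter of defining that function before loading the REST URL in a script tag.  A minimal sketch (the property names come straight from the JSON above; the logging is just for illustration):

```javascript
// Callback for the YQL JSONP response above: YQL calls this
// function with the entire response object as its argument.
function googlelinks (response) {
	var links = response.query.results.a;	// the scraped <a> elements
	for (var i = 0; i < links.length; i++) {
		console.log (links[i].content + " -> " + links[i].href);
	}
}
```

Then add a script tag whose src is the REST URL from the console, and googlelinks gets invoked with the data.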

If you check out our YQL query, you can see the field that specifies the XPath of the data you want to access.  Don’t worry, I hadn’t heard of XPath before this either.  If you’ve read some of my previous articles, you know I’m sort of a beginner transitioning into more intermediate stuff, and blogging about it along the way.  Well, here’s something new I learned this time!

XPath provides a standard for addressing data inside an XML document.  The simple XPath in our query above was //a, which basically says “get all a elements from anywhere in the document”.  And of course our “a” elements are our links!

You can read more on XPath syntax at W3Schools.

My ultimate goal for this was to convert HTML pages into RSS so I can check for new topics on forums I frequent.  Unfortunately, not all of the forums provide an RSS feed, so I have to take the non-lazy route and actually visit each forum to check for updates.

Again, unfortunately, YQL will only convert into XML or JSON format, not into RSS.  RSS is a type of XML, but with extra constraints.  So the next step would be to take the XML or JSON and write something to convert it to RSS.  YQL takes care of the hard part – now it’s up to us to use the data.  :)
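A first stab at that converter might look something like this sketch, which turns the link results from the earlier query into bare RSS <item> elements.  (A real feed would still need the <rss> and <channel> wrapper, dates, and so on; the function name is mine.)

```javascript
// Sketch: convert YQL's scraped-link JSON into RSS <item> markup.
// Assumes the same response shape as the googlelinks example earlier.
function linksToRssItems (response) {
	var links = response.query.results.a;
	var items = '';
	for (var i = 0; i < links.length; i++) {
		items += '<item><title>' + links[i].content + '</title>' +
		         '<link>' + links[i].href + '</link></item>\n';
	}
	return items;
}

// try it on a hand-made response in the same shape as YQL's output
var rss = linksToRssItems ({ query: { results: { a: [
	{ href: "http://example.com/forum/topic1", content: "Topic 1" }
] } } });
console.log (rss);
```

A real version would also need to escape any HTML entities in the scraped text before dropping it into the feed.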

Note to self: write an HTML to RSS converter.

Calling out bad crawlers: the Kintiskton nuisance

I have never been involved in creating a web crawler, but as a website owner I’m well aware of the behavior of good crawlers versus bad crawlers.  For instance, a good crawler must not only follow the rules set by robots.txt, but it must also not impose an undue load on the server being indexed.

Famously, Cuil exhibited this bad behavior for at least several months before claiming to have fixed it.  In any case, I had to ban their IP range because they were hitting my site far harder than any of the other major crawlers out there.

Today I was looking at the traffic stats for my WWI flight sim site and saw that yesterday I had over 200% more visitors than usual.  The strange thing was that there was no major referring site, only direct hits!  What on earth?  So I checked the logs and found that most of the IPs were in the range 65.208.151.112-65.208.151.119, which resolves to kintiskton-gw.customer.alter.net [63.114.61.170] before the traceroute dies.

Apparently this IP block is owned by Kintiskton LLC, whatever that is.  A Google search turns up not the actual company, only complaints about its crawler abusing people’s websites, going back to December 2008.

The IP block is hosted by Verizon Business, so I shot an email over to abuse@verizon.net.  After several months of Kintiskton’s excessive crawling, hopefully Verizon will eventually step up and look into it.  Apparently they haven’t yet…

In the meantime, it’s good old Apache to the rescue.

I’ll be adding this to my .htaccess file:

Deny from 65.208.151.112
Deny from 65.208.151.113
Deny from 65.208.151.114
Deny from 65.208.151.115
Deny from 65.208.151.116
Deny from 65.208.151.117
Deny from 65.208.151.118
Deny from 65.208.151.119
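Incidentally, since those eight addresses form one contiguous block, Apache’s Deny directive also accepts CIDR notation, so the same ban fits on a single line:

```
Deny from 65.208.151.112/29
```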

Video: John Resig: “The DOM Is a Mess”

John Resig, who will always be younger than me yet much smarter than I’ll ever be, gave a talk at Yahoo! last week on various issues in JavaScript development and the DOM.  He had some interesting things to say about how jQuery goes about testing for and resolving these issues.
These days you’re in good company with a variety of JavaScript frameworks to choose from, so we developers have to deal with these browser issues less and less.  But it’s still good to know; it’s fun to see what’s going on under the hood of your favorite framework.
Or if you’re looking at doing DOM manipulation without a framework, this is a good video to show you the landscape and what you’re up against.  Always check how other people have solved these issues.  Don’t reinvent the wheel!  :)

Where are the web development courses at colleges?

I keep hearing about these off-the-wall classes such as the new “Game Theory in Starcraft” at UC Berkeley, but where are the really useful, cutting-edge classes that teach things like frontend web development?

Sure, there are tons of classes for hardcore C++, Java, and maybe even PHP, but there’s nothing that really trains people in the latest web standards (frontend development).  And when a web course is offered, it likely teaches outdated practices such as font tags, table layouts, and the like.

You might say that HTML/CSS is quite easy to learn.  Sure, that’s true to some degree, but there’s more to it than simply making a webpage look good.  You can get away with making a webpage look and act halfway decent, show it to your employer and get paid, and that’s great if that’s all you care about.  However, probably unbeknownst to you, your webpages are sub-par and show poor craftsmanship, resulting in poor SEO, poor accessibility (closely tied to SEO), poor load times, and most likely cross-browser compatibility issues.  You owe it to yourself and your visitors to give them something more.

Perhaps some of the cause of this lack of education is due to employers not knowing the difference.  If you build your website layout using HTML tables, what does your employer care, as long as the website looks good?  Unless they were a former web developer, they are not likely to know much about web standards.  They will likely know about SEO, however, since their interest is closely tied to profit, and SEO is closely tied to profit…

Anyhow, back to the lack of frontend web courses in colleges.  This got me thinking about what a 16-week semester would look like for a class I’ll call “Frontend Web Development”.  (A bit of a brainstorm here; feel free to add topics that should be covered in the comments.)

  • Week 1: Introduction, a bit about the history of web development, frontend versus backend development, the benefits of having a good frontend (making the case).  Explaining that this isn’t a design class, but that it’s helpful to know some basic elements of design for creating good-looking webpages.
  • Week 2: HTML: semantic HTML for content (one of three FE components), basic markup of a page (doctype, page header and meta content, heading tags, paragraph tags, anchor/link tags, span, em, strong, image tags).  Difference between block and inline elements.  Lists.
  • Week 3: HTML: SEO and accessibility concerns, tables (inappropriate and appropriate use examples), div tags.  Transition into CSS.  Start building layouts with div elements and basic CSS (float, clear).  Intro to Firebug (just before getting deep into CSS).
  • Week 4: CSS: CSS for styling, selectors, basic CSS properties (color, border, margin/padding), knowing when to add extra HTML markup for styling
  • Week 5: CSS: selector specificity and conflicts (and using Firebug to debug).  Cross-browser compatibility: testing in A-grade browsers, common CSS hacks
  • Week 6: CSS: backgrounds, sprites
  • Week 7: More practice with CSS
  • Week 8: Mid-semester: Putting it all together: make a webpage utilizing everything we’ve learned so far.
  • Week 9: JavaScript (the interaction leg): discussion of its importance in web 2.0 applications.  Basics of JS syntax, but not much emphasis here (this isn’t a basic programming class).  Discussion of the DOM.
  • Week 10: JavaScript: briefly, native DOM manipulation.  Cross-browser issues, and the transition into the need for JS frameworks such as jQuery or YUI.  Show some demos of what they can do (jQuery’s documentation is especially nice, but YUI also has good examples).
  • Week 11: More practice with JS if needed.  Transition into setting up a developer environment (WAMP for Windows, MAMP for Mac?).  These don’t require much understanding.  Show where to put documents.
  • Week 12: PHP (is this getting too ambitious?): basic syntax and examples.  Again, nothing major here.  It’s just an introduction, so we can do some basic stuff and get the students started.  Demonstrate the use of HTML form elements to pass data between pages.  $_GET, $_POST, etc.  Cookies?
  • Week 13: Using PHP functions to load an RSS stream and display it on the page.  Using PHP for XHR in Javascript.  Use JS to fetch RSS streams from the PHP script.  Explain why the PHP intermediary is needed.
  • Week 14: Optimization of your website: using the best image type, placing script tags at the bottom of the document, lazy loading techniques, etc.  Start building a project that uses HTML, CSS, JS, PHP.
  • Week 15: Continue work on project.
  • Week 16: Finish up the project.

Is this too ambitious?  This is more or less the curriculum of the Yahoo! Juku program, and they accomplish this in about the same amount of time.

By the end of the course, the students should have been exposed to enough material to find more on their own if they’re interested, or at least to know the basics of web standards and frontend engineering.

What say you?  Is this too much to cover?  Did I leave out any important concepts (most likely!)?

“Divitus” and “classitus”

When you’re just starting with CSS, one of the things you’ll notice emphasized is the pattern of using divs instead of tables.  This is because, historically, div layouts replaced table layouts.  But this has also led to a misconception: that you should rely mostly on divs to present your content.  This overuse of div elements is called “divitus” (think of a disease…  “inflammation of the div”?..  hmm, doesn’t have quite the same ring to it…).

This stems from the mistaken belief that it’s OK to just wrap div tags around any content you want to style.  Even professional web developers make this mistake.  But if you care about your code, you shouldn’t make this rookie CSS mistake.

Bob Dylan’s code is scatterbrained

Here’s a live example from Bob Dylan’s website:

So what’s wrong with it?  It works and it looks good, right?  Well, using the same logic, you might as well go back to tables, which work and look good, right?

Actually, you will run into problems with this approach.  One of the most obvious problems with the above code is that it’s not friendly to search engines.  If you run a website and care about search engine rankings, you don’t want to put your album title in just a div.  Instead, wrap it in an <h2> tag, presuming the page lists multiple albums.  If the whole page is about that one album, wrap it in an <h1> tag.  This tells search engines the relative importance of the text, making it stand out to them.

The second big problem is that there’s just an excessive amount of code here.  Whenever possible, you should cut down on the amount of markup, even if the difference is only a few bytes.  If you can make it smaller, make it smaller!  Small improvements add up.

So how exactly do we make it smaller?

Original Code

First, here’s the original markup with the tabs tidied up and the CSS content removed, leaving just the selectors (the href is a stand-in):

[sourcecode language="html"]<div class="node">
	<div class="image">
		<img src="foreveryoung_rogers.jpg" alt="Forever Young" />
	</div>
	<div class="text">
		<div class="inside">
			<div class="title">Forever Young</div>
			<a class="link" href="#">Read the story</a>
		</div>
	</div>
	<div class="clear"></div>
</div>[/sourcecode]

There are a few things here that are just outright wrong:

  • Generally, if there’s only one child element, you can probably combine the parent and the child.  In the code above, this applies to <div class=”image”> (parent) and the sole <img> inside it.  It would also apply to <div class=”text”> and <div class=”inside”>
  • As mentioned above, <div class=”title”> should probably be replaced by an <h2> or <h3> tag (or <h1> if that title describes the entire page’s content)

Here’s how I would improve the code, while preserving all the necessary CSS selectors:

Revised Code

[sourcecode language="html"]<div class="node clearfix">
	<img src="foreveryoung_rogers.jpg" alt="Forever Young" />

	<h2 class="title">Forever Young</h2>

	<a href="#">Read the story</a>
</div>[/sourcecode]

Notice that I took it one step further and removed a lot of the class attributes too.  An epidemic of too many classes?  What could it be called?  You guessed it: “classitus”.  (Web developers aren’t too creative when naming programming epidemics, apparently.)

In this case I went on the assumption that there will be just one image in our block of HTML, so I went with the more generic selector “.node img {}”, which in English reads as “all images within an element with a class of ‘node’”.  Be mindful that if we add a second image that also matches this selector, it will be styled the same way.  If we want to style that image an entirely different way, that’s when to add a special class for it: “.node img.special-image {}”.

I used the same assumption on the link.  I saw that there was only one <a> element, so the only CSS selector we need is “.node a {}”.  And if we want to add a second link with different styling?  No problem – we use the same solution as with the images: “.node a.special-link {}”.

Also notice that the original code used an old-style clearing element (<div class=”clear”></div>), which requires extra HTML.  There’s a newer fix that is purely CSS-based and needs only one CSS selector.  So nowadays it’s handy to define a “clearfix” class in all your stylesheets and add it to elements as necessary.  You’ll notice I’ve added the clearfix class to the div in the revised code (there’s only one div now!).
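For reference, one common CSS-only version of that fix uses the :after pseudo-element, so the clearing lives entirely in the stylesheet (older versions of IE also need a hasLayout nudge such as zoom: 1):

```css
/* CSS-only clearfix: the generated :after box does the clearing,
   replacing the old empty <div class="clear"></div> element */
.clearfix:after {
	content: "";
	display: block;
	clear: both;
}
```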

So there you have it!  We went from 6 divs to only 1, from 9 classes to 4, and along the way our code got more semantic and search-engine friendly!

Related:
CSS Beginner Mistakes: “Divitus”, Absolute Positioning (css-tricks.com)

Z-index: don’t expect me to be working without no positioning!

I’m thinking of starting a section for “D’oh Moments” – aka moments where I made a simple mistake but it took me an unusually long time to track down.

In this case it’s z-index, which I was trying to use to fix a problem in IE (where most of the problems end up being, as you know).  The problem was that z-index had absolutely no effect.  D’oh!  That’s because the element wasn’t positioned.  Simply apply this rule:

[sourcecode language='css']position: relative;[/sourcecode]

Apply that to whatever you’re positioning with z-index, and voila, z-index magically works!  A noob mistake caused by me not reading the documentation closely enough…
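In other words, the working combination looks like this (#my-element is just a stand-in for whatever you’re stacking); without the position line, the z-index line is silently ignored:

```css
#my-element {
	position: relative;	/* any value other than static will do */
	z-index: 10;		/* now this actually takes effect */
}
```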