2005-06-30: queensland disability action week

Queensland readers may be interested to know that the Queensland Government Disability Services yearly action week is coming up shortly. Disability Online: Disability Action Week (17 - 23 July 2005): Disability Action Week celebrates the experiences of people with a disability, raises community awareness about barriers to equal access and promotes best practice. Through its focus on local action, the week has become a catalyst for lasting positive change in many Queensland communities.

2005-06-27: opera background image "bug"

I don't remember who said it, but someone once pointed out that if you solve a tricky problem... blog it. You'll help out the next sucker who is trying to solve it.

Recently I ran into a background image positioning bug in Opera 7/8. Well, ok, I thought it was a bug; but actually it was user error. Basically, I had a background image positioned center center. In everything except Opera, it was sitting - as expected - bang in the center of the page.

In Opera, the image was sliding up off the top of the browser canvas. I eventually figured out that Opera was positioning the image according to the length of the content, not the visible canvas area. Throw in some really long content and you'll find the image way down in the middle.

The problem was I'd forgotten to set the background-attachment, despite the fact that I did want it to be fixed. Since I hadn't specified what I wanted, Opera defaulted to the vertical centre of the element. Which, when you think it through, is the right thing to do - I'd said it should be in the middle of the element and the element was as long as the content. The other browsers use the 'we think you wanted...' approach and default to the length of the viewport.

So I explicitly set the background attachment:

background-image: url("img/watermark.gif");
background-repeat: no-repeat;
background-position: center center;
background-attachment: fixed;

...and all is well.

In my defence, half the office had a look at the problem and they didn't spot it either :) The moral of the story: beware of default settings and the lazy habits they breed. They will catch you out eventually. In this case, the overly-forgiving defaults of IE and Firefox had let me slip into the bad habit of expecting a background setting that I hadn't actually specified.

lies, damn lies and browser statistics

Browser statistics are troubled little beasts. They are affected by a multitude of inaccuracies, assumptions and technical limitations. Unless you really know what to look for, you're going to get some dud information.

A lot is being written at the moment about Firefox. Depending on where you look, it's either ripping through the market or it's languishing with just a small band of devotees. Both sides have stats to back their arguments.

i read it on the internet, it must be true

It doesn't take much to figure out that stats are only as good as their source; not to mention that you can make a statistic prove whatever you want:

  • "Satisfaction rates are extremely high, with nine out of ten people recommending this product."
  • "There is a high level of dissatisfaction, with 10% of customers saying they would not recommend this product to friends."

Both statements are absolutely true and backed up by the same statistic. Hardly elegant pieces of spin, and a demonstration of the obvious, but worth mentioning as a reminder.

Similarly, Coyote Blog reminds us that you should not assume your own personal circumstances are representative of the entire market. Coyote Blog: Browser Market Share? Depends on Who You Ask: I have been a marketer for almost 20 years, and one of the classic mistakes in marketing is to rely too much on your own experience and preferences.

very fast, very dumb machines

On top of human error, there's only so much information that the machines can produce anyway. Computers do not think, they act on absolutely literal information. This is why they can look 'stupid' when they make mistakes a human would never make (but they make them so damn fast!).

When a stats package crunches server logs, it only gets the information that was available to the server. That can be fooled by odd browser identification strings, the vagaries of browser vs. rendering engine and spoofing techniques. Many browsers can identify as something else, in order to get past bad browser sniffing. A few specifics to illustrate the point:

  • Out of the box, Opera is set to 'identify as IE6' in order to get around crappy browser sniffing which only recognises IE and Netscape.
  • Some versions of Safari identify as Mozilla due to a very strange ID string, which gets truncated by most log processors before the 'Safari' bit.
  • There never was a 'Netscape 5', but you'll see it in your logs. Most people attribute it to pre-1.0 versions of Mozilla. I imagine they are correct.
  • Search bots can take up a large amount of your site traffic, throwing the percentages out. I've seen some sites which get more Googlebot hits than anything else, so the percentage of hits from actual humans is lower than it should be.
  • Netscape 8 can fire up a completely different rendering engine - IE or Firefox, for example - and who knows what that is going to do to your logs.
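
To make a couple of those gotchas concrete, here is a minimal sketch of how a log processor might sort hits into browsers. The user-agent strings are illustrative samples in the style of the period, not lines from a real log, and the rule list and function names (classify, naive_classify, human_share) are made up for this example. A real stats package needs far more rules than this.

```python
from collections import Counter

# Illustrative user-agent strings in the style of mid-2005 browsers.
# They are sample data for this sketch, not lines from a real log.
SAMPLE_AGENTS = [
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 8.0",
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4",
    "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/412 (KHTML, like Gecko) Safari/412",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
]

# Order matters: the most specific tokens are checked first, because a
# spoofed string also contains the token of the browser it is imitating.
RULES = [
    ("Googlebot", "bot"),
    ("Opera", "Opera"),
    ("Safari", "Safari"),
    ("Firefox", "Firefox"),
    ("MSIE", "IE"),
]


def classify(agent):
    """Return a browser label for one user-agent string."""
    for token, label in RULES:
        if token in agent:
            return label
    return "other"


def naive_classify(agent):
    """What careless sniffing does: look for MSIE, else call it Mozilla."""
    return "IE" if "MSIE" in agent else "Mozilla"


def human_share(agents):
    """Percentage share per browser, with bot hits left out of the total."""
    counts = Counter(classify(a) for a in agents)
    counts.pop("bot", None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}
```

With this sample, naive_classify files the Opera hit under IE and the Safari hit under Mozilla, while classify separates them. human_share also drops the two Googlebot hits before working out percentages, so the four human hits come out at 25% each instead of being diluted by the bot.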

You know, it's a surprise geeks are into coffee and not vodka for morning tea.

throw me a frickin' bone here

Still, everyone gets asked for stats eventually. Developers may tear their hair out at the latest product which only works in IE, but managers will want some proof that IE is not the only browser being used... before you tar and feather anyone. Managers are paid to ask those annoying questions and besides, they have to justify the tar budget later on.

There are a few sources out there:

  • W3Schools | Browser Statistics. W3Schools are quite open that their stats probably show an abnormally low level of IE users, since their target market is web developers.
  • TheCounter.com | Global Stats gives statistics via page hit counters. Although serious business sites generally don't have stat counters, this is at least a large sample so it should avoid the sharp differences of sites like W3Schools. It is worth noting - as Stats Weenie observes - TheCounter seems to vary detection rules fairly frequently.
  • Browser News: Statistics gives a rundown of five sources and gives a prominent warning about trusting stats. You might leave more confused than you started, but at least that's accurate.
  • Many people cite WebSideStory - Data Spotlight, however I find their level of spin a bit annoying (or maybe it's just that the name sits right on the edge of 'they must be killed' levels of annoying). Currently their two most recent posts are 'Firefox keeps growing!' following their earlier statement 'Firefox gains beginning to slow'.
  • The least biased view probably comes from the Wikipedia: Usage share of web browsers - Wikipedia, the free encyclopedia.

Update: two more sources - Market share for browsers, operating systems and search engines | marketshare.hitslink.com and W3Counter - Global Web Stats.

the wood for the trees

Despite the warnings of Coyote Blog, you should not ignore your own site logs. If your site has a huge number of Netscape users, then you wouldn't discount it just because TheCounter shows 1% market share. Such anomalies are not unusual in large organisations, where the Standard Operating Environment (SOE) reigns supreme. If two thousand users were given computers with Netscape 6 loading by default, that's probably what most of them are still using.

Yes, I know, it's not very likely that it will be Netscape. Probably an old version of IE. But you get the point. The nice turn-about is when you run a weblog about web development and IE6 is way down the bottom of the list.

so where does this leave us?

Take everything with a grain of salt and you'll be ok. Generally speaking you will be able to find trends in between all the numbers. You just have to watch out for the gotchas and remember that there are probably errors that you cannot detect. Ultimately you are better off designing browser-independent sites and only looking at stats for amusement.

Finally... I will say from the various stats and logs that I watch, IE has definitely lost ground in the last 12-18 months; with most going to Firefox. I have a site with a very broad audience. A year ago on that site IE had a 95% share, now it has 74%; Firefox has 10%; Safari and Netscape 7 have 5% each; Opera and Mozilla have 2%. Odds, sods and bots round out the figures.

But... well, don't quote me. It's just stats.

2005-06-26: don't blame poor security, blame it on ipods

TechWeb | News | Pod Slurping Dangerous To Enterprises: Nearly a year ago, an analyst from Gartner recommended that enterprises should think about banning Apple's iPods -- and similar small-sized portable storage devices -- for fear of data walking out the door.

What I would have recommended is that organisations use real security for sensitive documents. Staff and intruders have been able to do everything described here since computers had floppy drives... meaning basically the whole time computers have been in general use.

Suddenly getting tense about it now is stupid - just because the storage device is nifty and popular doesn't make it any different from any other portable storage device. Word documents do not require gigs worth of storage space, hell they still fit on floppy disks!

If you want to prevent data theft, store your data in secured locations. Workstation hard drives are not secure. In fact, the industry joke is the only secure computer is the one that is switched off, unplugged and locked in a safe. Even then, the joke isn't talking about the physical security of the box.

2005-06-21: force validity or get tag soup

Juicy Studio: Validity and Accessibility: The validity of documents on the web is by no means an absolute measure of its accessibility. Validity does, however, provide a solid structure on which to build accessible content. With this in mind, why does the Web Content Accessibility Guidelines (WCAG) working group consider validity to be an ideal, rather than a fundamental principle to guide us to tomorrow's Web?

Why indeed... in some senses, this is the WCAG being pragmatic; but in other senses it is relaxing a requirement for a benefit you'll only understand if you're an expert. Those who currently abuse automated checks will now be able to abuse them even further. So, major vendors will be able to provide content management systems which barf out vile code; yet they'll be able to pass automated checks and will claim their products produce accessible content. Since the people who purchase the major products rarely know how to evaluate such claims, the vendors get away with it.

Essentially the only way to force the big players into caring about standards is to include them in accessibility requirements. Accessibility requirements make it into legislation in many countries, and at the very least they turn up in requirements for government jobs.

Standards do not get into legislation on their own since they do not benefit a specific group. They benefit everyone, but their absence does not disadvantage anyone badly enough for the law makers to get involved.

Obviously clients should be pushing back and requiring standards, accessibility etc as part of the package when they spend millions of dollars on the latest CMS or portal... but to date they just don't. They take whatever scraps the vendors throw to them since they've already invested too heavily to back out.

So, back to WCAG. Relaxing the requirement for standards does make a kind of sense, for example allowing numbered lists to be restarted on search result pages. The start and value attributes are deprecated, but there is no reliable new way to insert that content since CSS support just doesn't cut it. I really feel this is an error - numbering is content, the content should not rely on stylesheets for its numbering. So anyway, the best thing for accessibility is to use deprecated code which will not validate. Hence WCAG allowing for this eventuality.

Is that the right approach? Well to me it would make more sense to sort out the markup standards so that they never worked against accessibility standards. I can't be the only one, since XHTML 2.0 (draft) does currently include value.

Essentially you've got to force people to do some things; sadly at this stage accessibility and standards tend to fall into that category. People would rather continue with their familiar old techniques than face any kind of learning curve. While making allowances on accessibility might be a workable idea, doing so at the expense of standards compliance really isn't going to help in the long run.

2005-06-18: search engine optimisation is the new black

Where I work, everyone's talking about Search Engine Optimisation (SEO) all of a sudden. Clients who barely knew the web existed a few weeks ago are now concerned about pagerank. What's a web developer to do? Well, if you're a freelance developer the obvious answer is "charge appropriately" but that's another story ;)

I'm yet to understand why an issue as old as SEO suddenly leaps to the top of people's buzzword lists. However, let's not look a gift horse in the mouth; this is a great time to push web standards. Semantically-correct markup will increase your search rankings, tag soup will decrease them. You don't even have to mention standards, you can just tell the client you are "value-adding by leveraging bleeding-edge methodologies to optimise the search visibility of their pages and increase the ROI on their web presence" [Bingo!].

First off, let's get something straight here. SEO is not rocket science, no matter what the overpriced consultants tell you. The best thing you can do to get good search rankings is make your site accessible, write good content and stick to standards. I had an SEO consultant admit this straight up: If everyone used semantically-correct XHTML, nobody would need us. I'm not kidding and neither was he.

Search engines dig semantics

Why is semantic markup such a big part of SEO? Well, semantically-correct markup defines the structure of a document, which tells the search bots which bits of text are important. The <title> describes the entire document, the headings define the sections, link text indicates what information will be found at the target URL.

Basically, semantics give structure to a document in a way that computers can understand. In this case, the best thing for humans is the best thing for machines too.

Tag soup or semantically-void pages—even the ones that validate—don't have this structure. <font> tags have no semantic meaning, nor do layout tables (no heading cells) or even <div> and <span> tags (no matter how you style them). Only the real thing will do.
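
As a rough illustration of that difference, the sketch below pulls out only the elements that carry structural meaning: the title, the headings and the link text. It uses Python's standard html.parser module. The OutlineParser class, the outline helper and the sample page are invented for this example, and it makes no claim about how any real search engine parses pages.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collects the text of <title>, <h1>-<h6> and <a> elements."""

    WATCHED = {"title", "h1", "h2", "h3", "h4", "h5", "h6", "a"}

    def __init__(self):
        super().__init__()
        self.outline = []  # (tag, text) pairs, in the order the elements close
        self._open = []    # stack of [tag, text_parts] for watched elements

    def handle_starttag(self, tag, attrs):
        if tag in self.WATCHED:
            self._open.append([tag, []])

    def handle_data(self, data):
        # Text counts towards every watched element that is currently open,
        # so a link inside a heading is recorded for both.
        for entry in self._open:
            entry[1].append(data)

    def handle_endtag(self, tag):
        if self._open and self._open[-1][0] == tag:
            _, parts = self._open.pop()
            text = " ".join("".join(parts).split())
            self.outline.append((tag, text))


def outline(html):
    """Return the semantic outline of an HTML string."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


# Hypothetical page: one real heading, two things styled to look like one.
SAMPLE = """
<html><head><title>Widget care guide</title></head>
<body>
<h1>Caring for widgets</h1>
<div class="big">Looks like a heading, but is not one</div>
<p>See the <a href="/faq">widget FAQ</a>.</p>
<font size="5">Also not a heading</font>
</body></html>
"""
```

Running outline(SAMPLE) returns the title, the h1 and the link text. The div and font text never shows up, however large it is styled, because nothing in the markup says it is a heading.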

Accessibility is good for search engines, too

Search bots will benefit from the same accessibility features as many disabled users. Think of it this way: Search engines are blind, deaf, mobility impaired users with scripting and plugins turned off.

Search bots read the alt="" attribute of your images, since they can't "see" the image. Similarly, they use the alternative content in your <object> tags and not the big, glitzy Flash animation you've embedded. Search bots do not use a mouse, so they won't be onClicking anything. If you use JavaScript to trigger links, windows, navigation etc... search bots won't find any of it.

You can actually get a good idea of what a search bot can find by browsing your site with a text-only browser like Lynx. If you don't have access to Lynx (or an emulation), get into Opera or Firefox and disable everything - images, style (CSS), scripting, Java, plug-ins... turn off the lot. Then test your site. Anything you can't see that way is probably not visible to search engines.
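
If you want a quick approximation of that text-only view without installing anything, a few lines of Python will do. This is only a sketch under simple assumptions: it keeps page text and image alt text, and throws away script and style content. The TextOnlyView class, the text_only helper and the sample page are hypothetical, and it is no substitute for testing in Lynx.

```python
from html.parser import HTMLParser


class TextOnlyView(HTMLParser):
    """Keeps text and image alt text; drops <script> and <style> content."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "img":
            # An image contributes its alt text, if it has any.
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def text_only(html):
    """Return the page as one line of whitespace-normalised text."""
    viewer = TextOnlyView()
    viewer.feed(html)
    viewer.close()
    return " ".join(" ".join(viewer.parts).split())


# Hypothetical page fragment with a scripted menu and a scripted link.
PAGE = """
<h1>Products</h1>
<img src="logo.gif" alt="Acme Widgets logo">
<img src="spacer.gif" alt="">
<script>document.write("Menu built by script");</script>
<a href="#" onclick="openWindow('/specials')">Specials</a>
"""
```

For this fragment text_only(PAGE) gives "Products Acme Widgets logo Specials". The menu written by script is gone, and although the word "Specials" survives, the address it was meant to open lives only in the onclick handler, so there is nothing there for a bot to follow.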

Some search engines do try to index PDF, Flash, Microsoft Office documents etc, however they have mixed success. Even if they do read the file, most search engines don't seem to rank such documents as highly as an XHTML equivalent.

Don't try to cheat

Some people ask how they can get onto the first page of results for a specific search term, expecting to hear about some kind of trick. Well, you can purchase "sponsored links" and keyword-based advertising; but there is no way to guarantee a specific place in "organic" search results. Organic results are the results achieved naturally in the rankings by a specific page. There are ways to cheat on organic results, but most of them are detectable and you stand a fair chance of getting blacklisted. Besides, they give you bad 'net karma.

It's important to be realistic about what you will achieve and how fast you'll achieve it. You will not waltz into the top ten for certain keywords ("Search Engine Optimisation" springs instantly to mind). Accept there are no guarantees and no way to ensure a number one rank. Don't hire or pay anyone who claims otherwise.

Popularity

On top of all of this, most of the leading search engines now pay attention to popularity. Google pioneered this idea: a page with more links leading to it probably has 'better' content than a page with fewer links, or at least more people have felt it was good enough to link to it. If a popular page then links to something else, that link should get more weight than a link from a page that nobody else has linked to.
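
The idea can be sketched in a few lines. What follows is a simplified link-popularity calculation in the spirit of the published PageRank idea. It is not Google's actual ranking, and the damping value of 0.85, the number of rounds, the function name and the toy set of pages are all assumptions made for illustration.

```python
def link_popularity(links, damping=0.85, rounds=50):
    """Score pages by incoming links; links maps a page to the pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    if n == 0:
        return {}

    # Every page starts with an equal share of the total score.
    score = {p: 1.0 / n for p in pages}

    for _ in range(rounds):
        # Each page gets a small baseline regardless of who links to it.
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly across its links.
                share = damping * score[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * score[page] / n
                for p in pages:
                    new[p] += share
        score = new
    return score


# Toy web: three fans link to "popular", which links to "tip".
# "loner" has no incoming links and links to "other".
WEB = {
    "popular": ["tip"],
    "fan1": ["popular"],
    "fan2": ["popular"],
    "fan3": ["popular"],
    "loner": ["other"],
}
```

In this toy web "tip" and "other" each have exactly one incoming link, but "tip" ends up with the higher score because its one link comes from a page that three others link to. That is the weighting described above, and it is also why link farms and Googlebombs work at all.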

As it happens, popularity is susceptible to abuse. This can be fun, for example the various Googlebombs that have taken place; however it can also be used to spam search engines for less amusing purposes. This is one reason some SEO techniques walk a very fine line between legitimacy and blacklisting.

An effective approach to SEO

So what's an effective and legitimate approach to SEO?

  1. Write good content.
  2. Use logical, semantically-correct XHTML.
  3. Meet W3C WAI accessibility guidelines.
  4. Let your organic results find their place.
  5. Be realistic in your expectations.
  6. Don't try to cheat (should I really have to tell you that?).

That's really it. If you want to boost traffic and you have the budget, you might want to get into the Search Engine Marketing (SEM) game; which basically means buying ads according to keywords. Spend wisely and measure your results; but get your SEO techniques done first. SEM is a cut-throat business and you should not enter it lightly.

Key points & principles

  • Bad content will get bad rankings. If you're boring, nobody will link to you.
  • Search engines don't load images, multimedia, stylesheets, client-side scripts or plugins.
  • If you can't browse to it in Lynx (with a keyboard), search bots probably won't find it.
  • Don't try to cheat the system.
    • Don't create pages specifically for search engines.
    • Don't try to pull in searches for irrelevant keywords.
    • Don't use "hidden text" (eg. black-on-black) to feed "extra" content to search engines.
  • Meta-data has been so badly abused that most search engines give its contents a very low priority/weight.
  • Consider how pages from your site will look in search results. This will be particularly relevant when writing document titles.
  • Remember that Google is only one of many search engines.

Specific techniques

  • Get the <title> right:
    • Write meaningful <title> tags which are specific to each document.
    • If you include site and section names, include those after the page-specific terms to maximise results (annoyingly, Blogger templates don't appear to allow this configuration).
    • Don't use the same <title> text for every document on your site!
  • Structure heading tags properly and write meaningful heading text.
  • Use concise but meaningful link text.
  • Spell out acronyms and abbreviations in full, or embed the full text in <abbr> tags.
  • Provide alternate content for images and multimedia. Forget Flash.
  • Test your site with basically everything switched off, or better yet test it in Lynx.
  • Don't use long, ugly URLs; some search bots can't follow them.
  • Don't bother with meta-data. If you do, be careful.
    • Some search engines still use the description meta-data for search result summaries, so you may wish to include a concise summary of the document contents in a description meta-data tag. An increasing number generate contextual summaries, however, so weigh up the investment of time.
    • You may want to add extra terms in a keywords meta-data tag; however be careful not to get too far off the main topic; also don't insert a hundred keywords.
  • Target relevant, popular keywords. You can use adword suggestion tools to look up related search keys to the most obvious keywords:

Further reading

Some good places to continue reading on this topic:

2005-06-16: browser bits

2005-06-09: is that a meme in your pocket...

...or is Google just glad to see you? It appears that the @media 2005 Speakers are indulging in some "WordPl@y" (well they're not all on the list, but the correlation is too strong to ignore).

There are still a few on the list yet to post. I have a mental image of Zeldman looking for a network port for his laptop right now, or Joe Clark sitting in a net cafe... :)

2005-06-06: 20 questions

A pair of WSG 'ten questions' posts...

  • Web Standards Group - Ten Questions for Russ Weakley. Russ talks about the print-to-web transition, rebuts the art gallery sites argument and leaps into the fire on the XHTML MIME type issue... and that's just three of the questions!
  • Web Standards Group - Ten Questions for Jason Santa Maria: One of the biggest reasons web standards appealed to me is because I am a very organized and systematic person by nature. I can really get into code and zone out, obsessing over refining and optimizing it more and more. Being able to contain all that information, organized and itemized in one place is like popping mental bubble wrap. Heh. Popping mental bubble wrap.

2005-06-03: the joy of handheld

This is what I managed to output from a Treo handheld: <p>This is the third attempt at posting from a handheld.</p> (I also managed to post the title). Forensic examination has not yet determined what actual characters were inserted, since they weren't standard angle brackets.

The first attempt was using a Dell AXIM X30. After battling through the connection to wireless and our VPN, I finally logged in to Blogger only to have the compact IE browser crash as soon as I clicked the 'create post' button. Tried again, same result. No dice. Give up on the AXIM.

The second attempt was on a PalmOne Treo 650 connected to a networked PC workstation (no wireless this time). I was doing well, except for the fact I couldn't post until it finished downloading the entire page (including graphics and script). Then somehow I accidentally closed the browser, or something. Lost the pecked-out post. Spent a few minutes lost in the hell of the Treo operating system, wondering how the hell you get out of any given screen (there's no close button, on screen or physical; no obvious way of finding out what's running...).

Eventually just click the browser icon again (Blazer I think it was called). The Blogger page reappears, but it's reloading everything. Somehow the post title came back, but the post was lost. Infuriated, I tap out a one-sentence post. Laboriously inserted what I thought were HTML tags, after stumbling onto keyboard help and figuring out how to insert angle brackets. Post... it works.

Load up this page.... discover the angle brackets weren't angle brackets, so to speak. Too peeved to bugfix. Log into my workstation and have the edit screen up in a matter of seconds.

My wrist hurts from tapping at the Treo keys and I'm glad I didn't pay for a handheld. I have come to the conclusion that handhelds are currently tools for making a six-hour airport layover feel like ten minutes, since all you can do in six hours is type one email that would take ten minutes on a laptop. I could definitely SMS faster on my trusty Nokia 3310, using Predictive Text Input and the number pad. That may have a lot to do with practice; but both of the handhelds reproduce a keyboard in micro scale, so you're tapping tiny buttons with a stylus or pecking at minuscule keys rammed up against each other.

The operating systems for handhelds look Windows-ish, but behave differently. For example the lack of 'minimise' or 'close' buttons, depending on the browser. Input is painful as all hell and you can't mask usernames and passwords properly.

Handhelds? Fuggeddit. Lug that laptop around, at least it has a keyboard and you won't get sued for accidentally killing someone when you hurl that stupid stylus away in frustration.

Perhaps in time you can work out the weirdness well enough to be able to do something useful with a handheld; but it's no good for serious interaction (eg. posting to a blog). As far as web content is concerned, you want to click and not much else. You even want to avoid entering URLs if you possibly can because it is so painfully slow.

There is still some attraction in having a net-capable device that small, but you wouldn't want to rely on it.

2005-06-02: appy berfday

Apparently ten years ago, something was in the water:

Two very different people launched two very different sites; and ten years on they're both still going. I've read both of them pretty regularly for the past five years, since I became a professional web developer. I have occasionally described myself as "more Zeldman than Nielsen" in my approach to web development. I frequently disagree with Jakob, rarely disagree with Jeffrey and shouldn't really use their first names ;) Zeldman is generally "make it better" but Nielsen often slips to "make it more stupid" ;) I draw a line between making a website usable and letting users be really dumb.

It takes a lot of commitment to run a website for ten years (my longest-running site has been up for nine years) and even more commitment to keep it relevant and influential.

So, happy birthday to Jeffrey Zeldman Presents and Alertbox; and kudos to the guys who have made them happen.
