InsectNation

we are accidents waiting to happen


Writing a web site: hints and tips

This page is a collection of more advanced tips for good web design and implementation. The emphasis is on larger sites and more involved design than the website starter-guide covers, but hopefully there are tips here which are useful to all who want to implement their sites for maximum portability and functionality.

In the (so far) absence of time to write anything that might pass as coherent prose on this subject, here's a bulleted list of do's and don'ts[1]:

  • Don't use <br/> tags to make vertical separations: use CSS's margin, border and padding properties instead.
  • Write in XHTML, rather than HTML 4. It's actually easier, because the rule set is much cleaner and the tag nesting is consistent, as well as allowing you to expand to more funky technologies like XSLT without difficulty.
  • Don't use underscores to represent spaces in file (i.e. page) names: the underscore is harder to see and more awkward to type than a hyphen (minus).
  • Validate your HTML and CSS with the W3C tools!
  • Build pages according to semantics first and then style them with CSS afterwards: separating style from content is possible most of the time.
  • Hide the underlying technology from the end user: content-negotiation can be useful here since it means you can avoid having file extensions in your local links.
  • If your site is (going to be) large, having a sensible directory structure which follows the site content structure is a good plan. This also allows you to isolate data like common includes, script libraries, XSLT files and CSS files from the content pages. If using PHP, you can still use the require- and include-type directives as if you have a flat directory structure by setting the PHP include_path parameter. This is specified either in the php.ini configuration file or via a .htaccess file (if PHP is being run as an Apache module).
  • If you use PHP to preprocess your web pages, the auto_prepend_file and auto_append_file directives may make your life much more pleasant. These will automatically prepend a specified header file and append a specified footer file to your pages, meaning that the actual content is more independent of the containing site. Implementation independence is a good thing! :-) These configuration directives can be set via php.ini or .htaccess files as for the include path.
  • If you have more than one page in your site, use an external CSS file and, if you can, put the common tops and bottoms of your pages in external header and footer files.
  • Write your site in a web-oriented text editor rather than, e.g. FrontPage or Dreamweaver. If you do use a more funky tool, you should still do your coding by hand — no other method will give you the degree of control that's required for good semantic design and standards compliance.
  • Have some familiarity with the ethos and details of the W3C web standards — the antithesis of good design is the "This site is best viewed at 800x600 in Internet Explorer 6" attitude. The standards aren't perfect, but sticking to them is the best way to ensure that your site can be viewed everywhere.
  • Test your site in a text-only browser like Lynx. It may not be your "target market", but a site that still makes sense in Lynx is probably doing the right thing when it comes to semantic structure.
  • Please, don't use graphics to render paragraph text unless you absolutely have to. On images which have a small amount of text, repeat the textual content in the <img> tag's alt attribute so that text browsers and users with accessibility requirements can also read it.
  • Flash is no longer cool on web pages. It's hacky, slow, obfuscated and can't be guaranteed to work on the client's browser. Flash has a very poor record for functionality: it looks pretty and keeps designers happy, but from a usability point of view it's awful.
  • Java and JavaScript have much the same failings as Flash. They are often turned off or have their functionality restricted on the client browser. The main valid use of JS used to be implementing rollover behaviour on link images, but this is now much more efficiently implemented via CSS, using the :hover pseudoclass.
  • The image alt attribute is not meant to pop up a "tooltip" as IE does: it's intended to provide a textual alternative to the image. If you want a tooltip, which provides additional information about the object, then use the title attribute.
  • Designers love small text, so many websites suffer from tiny, unreadable paragraphs. The natural reaction to this is to resize the text (e.g. Ctrl-+ in Netscape/Mozilla), which often breaks the rest of the site and makes it look much worse than it would have done with larger text in the first place. Keep in mind that the point of a website is to disseminate information. Looking good is secondary, so keep the text the default size (or larger) unless you have a very good reason.
  • It's often better semantically to use <em> and <strong> tags rather than <i> and <b> to add emphases to text. Both are valid, but give it some thought :-)
  • Rendering blocks of code is often done on technically oriented sites, but the indentation isn't maintained within <code> tags. This leads many people to use a <code><pre>blah</pre></code> construction — uugh! Better is to use CSS again: declare something like code.display { white-space:pre; display:block; margin-top:1em; margin-bottom:1em; } and use that instead: it makes CSS styling of code blocks much more powerful and is, again, semantically better.
  • Every web page officially needs a DOCTYPE declaration as its first line of code (view the source of this site for an example). However, browsers will usually work without one and do their best to still render the page accurately. The DOCTYPE tells the browser what kind of HTML to expect: which HTML version or XHTML, and whether you want the "Transitional" (read as "sloppy" or "forgiving" according to prejudice) or "Strict" (read as "draconian" or "accurate") variant. If you want to make sure that the client browser views the HTML as you intended, rather than running in "quirks mode" as it usually will if the DOCTYPE is omitted, then use a DOCTYPE declaration. If you include a common header file for your pages, then this is dead easy: you only have to add it once!
  • For some reason having a "splash page" for your site (usually a funky graphic with "click here to enter the site" written on it somewhere) was once very popular. In reality it's dumb: most people will find your site through a search engine or a recommended link, and the splash page is either unseen or impedes them in their search for content. Don't use 'em. I find that putting the most regularly updated part of the site ("latest news" or suchlike) as the front page works best, since that's what most people want to see.
  • Don't use JavaScript to validate forms — form validation isn't something to be done client-side — it should be done on your server, where you have control over the validation process. A good way to do this is to read the submitted data into an associative array and if there's an error in any field, then write the data back to the form page (with an appropriate error code) as a query string. This behaviour is easily written into a function which can be used for all form processing and allows robust and flexible error handling.
  • Don't use cookies to store preferred language information: many web users regularly flush their cookie cache (for example I set my Firefox browser to flush the cookie cache at the end of each session). A better method is to use content negotiation, where pages can be stored on the server as e.g. index.en.html or index.fr.html. The web server will then select the appropriate page to view based on the Accept-Language HTTP header sent by the browser when requesting the page.
  • Use get rather than post for sending information to pages which might be reloaded: if you use post and someone wants to reload the page (for example, a photo gallery where the photo set being displayed is dependent on form-submitted data) then they'll get a dialogue box asking if they want to resubmit the data. If you want the page to behave like a normal web page, then encode the metadata in the URL as a query string rather than in the body of a POST request.
  • Use post rather than get for sending information to scripts where security is an issue: the opposite case from the previous example is where you don't ever want the page to be reloaded, such as in processing payment details from a commercial site. In this case use POST: not only will it ask the user if they really want to send the data twice, it's also more awkward to spoof input data. Note that this doesn't amount to genuine security — anyone who wants to spoof headers can do so, which is why you should never trust client-side validation, but it makes it more awkward.
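
The server-side form validation tip above can be sketched roughly as follows. This is only an illustration in Python rather than PHP; the field names, the error codes, the err_ prefix and the /contact and /thanks URLs are all made up for the example, and a real site would pick its own.

```python
from urllib.parse import urlencode


def validate_form(data):
    """Return a dict mapping field name to an error code (empty if all is well)."""
    errors = {}
    # Hypothetical rules: a non-blank name and something that looks like an email.
    if not data.get("name", "").strip():
        errors["name"] = "missing"
    if "@" not in data.get("email", ""):
        errors["email"] = "invalid"
    return errors


def handle_submission(data, form_url="/contact"):
    """Return the URL the server should redirect to after a submission."""
    errors = validate_form(data)
    if not errors:
        return "/thanks"
    # Send the submitted values back so the form can be refilled,
    # plus one err_<field> parameter for each field that failed.
    params = dict(data)
    params.update({"err_" + field: code for field, code in errors.items()})
    return form_url + "?" + urlencode(params)
```

The form page then only has to read the query string to refill the fields and show a message next to each err_ entry, and the same function can be reused for every form on the site.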
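
To make the language-negotiation tip above a bit more concrete, here is a simplified Python sketch of what the web server does with the Accept-Language header. Real servers (Apache's content negotiation, for instance) follow a fuller algorithm; the function names and the default of "en" are assumptions for the example.

```python
def parse_accept_language(header):
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # no q= parameter means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not wanted"
        weighted.append((quality, tag.strip().lower()))
    # The sort is stable, so equal weights keep the order the browser sent.
    weighted.sort(key=lambda pair: pair[0], reverse=True)
    return [tag for quality, tag in weighted if quality > 0]


def choose_variant(header, available, default="en"):
    """Pick which index.<lang>.html file to serve."""
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]  # "fr-ca" falls back to "fr"
        if tag in available:
            return "index." + tag + ".html"
        if primary in available:
            return "index." + primary + ".html"
    return "index." + default + ".html"
```

For example, a browser sending "fr-CA,fr;q=0.9,en;q=0.5" to a site that has English and French pages would be given index.fr.html, with no cookie involved.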
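
The photo gallery case from the get-versus-post tips might look something like this in Python. The parameter names set and page, and the /gallery path, are invented for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def gallery_url(photo_set, page=1, base="/gallery"):
    """Build a reloadable, bookmarkable URL for one page of a photo set."""
    return base + "?" + urlencode({"set": photo_set, "page": page})


def read_gallery_request(url):
    """Recover the photo set and page number from such a URL."""
    query = parse_qs(urlsplit(url).query)
    photo_set = query.get("set", ["default"])[0]
    page = int(query.get("page", ["1"])[0])
    return photo_set, page
```

Because everything the page needs is in the URL, reloading or bookmarking it behaves like any other web page and the browser never asks about resubmitting data.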

That's it for now. If you have any other suggestions, please drop me an email :-)

[1] What the hell is the punctuation meant to do in that phrase, huh?