Convert HTML to XHTML-compliant code

I’ve been working with some web pages that were written in 2005. I made some changes, but the page wasn’t displaying the way I wanted. The code has lots of nested tables with DIVs inside tables, so I probably just made a mistake opening or closing something. The easiest way to find this kind of mistake is to validate the code and fix the errors. Because the code is so old, it doesn’t validate as XHTML Transitional, so there were hundreds of errors. Most of the issues are related to capitalization, but a few are because tags are not closed. I fixed one file by hand, but since I have lots of files to work with, I created this sed script to automate the process.


#### ConvertHTML.sed created 2016-01-09
#### Updated 2016-01-15
#### The global flag g is required to replace multiple occurrences on the same line
#### Sometimes the code is in JavaScript functions, so use single quotes instead of double quotes when replacing
#### TH is part of WIDTH, so need to use < and >
#### SELECT and TABLE are MySQL commands so make sure to use the < and >

s/HTML>/html>/g
s/HEAD>/head>/g
s/TITLE>/title>/g
s/BODY>/body>/g
s/META NAME=/meta name=/g
s/<LINK REL=/<link rel=/g

# Change the case and add the type
s/<SCRIPT LANGUAGE="JavaScript">/<script language="Javascript" type="text\/javascript">/g
s/<SCRIPT/<script/g
s/SCRIPT>/script>/g

#### Tables: be careful with TD, TR, and TH, which also appear inside other tag and attribute names
s/<TABLE/<table/g
s/<TD/<td/g
s/<TR/<tr/g
s/<TH/<th/g

s/TABLE>/table>/g
s/TD>/td>/g
s/TR>/tr>/g
s/TH>/th>/g

s/COLSPAN/colspan/g
s/ROWSPAN/rowspan/g

s/VALIGN=/valign=/g
s/=TOP/='top'/g
s/=BOTTOM/='bottom'/g
s/=CENTER/='center'/g

s/=top/='top'/g
s/=bottom/='bottom'/g
s/=center/='center'/g

s/ALIGN=/align=/g
s/=RIGHT/='right'/g
s/=LEFT/='left'/g
s/=right/='right'/g
s/=left/='left'/g

s/CELLPADDING/cellpadding/g
s/CELLSPACING/cellspacing/g
s/BORDER/border/g

# Make the tag conform
s/NOWRAP>/nowrap='nowrap'>/g
s/NOWRAP /nowrap='nowrap' /g

s/<HR>/<hr \/>/g
s/<BR>/<br \/>/g
s/<BR\/>/<br \/>/g
s/CENTER/center/g
s/<DIV/<div/g
s/DIV>/div>/g

s/H1/h1/g
s/H2/h2/g
s/H3/h3/g
s/H4/h4/g
s/H5/h5/g
s/H6/h6/g

s/<P/<p/g
s/P>/p>/g

s/CLASS=/class=/g
s/ID=/id=/g
s/STYLE=/style=/g

s/<SELECT/<select/g
s/SELECT>/select>/g
s/<IMG/<img/g
s/ALT=/alt=/g
s/SRC=/src=/g

s/A HREF/a href/g
s/<\/A>/<\/a>/g
s/_NEW/_blank/g

s/<B>/<b>/g
s/<\/B>/<\/b>/g
s/STRONG/strong/g
s/SPAN/span/g

s/<UL/<ul/g
s/UL>/ul>/g
s/<LI/<li/g
s/LI>/li>/g

s/HEIGHT=/height=/g
s/WIDTH=/width=/g
s/SIZE=/size=/g
s/FONT/font/g
s/COLOR=/color=/g
s/TYPE=/type=/g
s/Type=/type=/g
s/VALUE=/value=/g
s/NAME=/name=/g
s/<INPUT/<input/g
s/<FORM/<form/g
s/FORM>/form>/g
s/<OPTION/<option/g
s/OPTION>/option>/g
s/<INPUT/<input/g

s/<TEXTAREA/<textarea/g
s/TEXTAREA>/textarea>/g
s/ROWS/rows/g
s/COLS/cols/g

s/VALUE=/value=/g
s/METHOD=POST/method="post"/g
s/ACTION=/action=/g
s/TARGET=/target=/g

# JavaScript Calls
s/onLoad/onload/g
s/onMouse/onmouse/g
s/onmouseOut/onmouseout/g
s/onmouseOver/onmouseover/g

s/onChange/onchange/g
s/onSubmit/onsubmit/g
s/onClick/onclick/g
s/onError/onerror/g
s/ONERROR/onerror/g

s/cellspacing=\([0-9]*\)/cellspacing=\'\1\'/g
s/cellpadding=\([0-9]*\)/cellpadding=\'\1\'/g
# These can be percent
s/width=\([0-9]*\)%/width=\'\1%\'/g
s/height=\([0-9]*\)%/height=\'\1%\'/g
s/border=\([0-9]*\)%/border=\'\1%\'/g

s/width=\([0-9]*\)/width=\'\1\'/g
s/height=\([0-9]*\)/height=\'\1\'/g
s/border=\([0-9]*\)/border=\'\1\'/g
s/colspan=\([0-9]*\)/colspan=\'\1\'/g

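# [0-9]* also matches zero digits, so values that were already quoted pick up extra quotes; clean those up here.
# Should be able to match one or more digits above with \+ instead, but it isn’t working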
s/\'\'\'/\'/g
s/\'\'\"/\"/g

# Make the selected attribute conform. Mine are in Perl statements and conditionals
s/selected\"/selected='selected'\"/g

# Lots of image tags aren’t closed
#s/<img \([0-9a-zA-Z\=\/\.\'\"]*\)>/<img \1 \/>/g
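
On the `\+` comment in the script: `\+` is a GNU extension to basic regular expressions, so other sed implementations may not accept it. That is only a guess at why it failed here, since the post doesn't say which sed was used. A portable way to require at least one digit is `[0-9][0-9]*`. The sketch below uses a hypothetical helper name, `quote_width`, that is not part of ConvertHTML.sed:

```shell
#!/bin/sh
# quote_width: hypothetical helper, not part of ConvertHTML.sed.
# Quotes unquoted numeric width values. [0-9][0-9]* requires at least
# one digit, so a value that is already quoted does not match.
quote_width() {
    sed "s/width=\([0-9][0-9]*\)/width='\1'/g"
}

# An unquoted value gets quoted; an already-quoted value is left alone.
printf '%s\n' "<td width=100>" | quote_width
printf '%s\n' "<td width='100'>" | quote_width
```

Because the pattern never matches an empty run of digits, it should not produce the doubled quotes that the cleanup lines remove. Percent values would still need their own rule first, as in the script.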

To run the script, save it in a file (mine is called ConvertHTML.sed), then redirect the output to a temporary file for review.


sed -f ./ConvertHTML.sed original.html > converted.html
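
Since the script is meant for lots of files, the same command can be wrapped in a loop. This is a sketch of one way to do it, not necessarily the workflow used here: `convert_all` is a hypothetical helper name and the `.converted` suffix is an assumption. Each result goes to a separate file so the original stays untouched until it has been reviewed.

```shell
#!/bin/sh
# convert_all: hypothetical helper (not from the original post).
# Usage: convert_all SCRIPT.sed DIR
# Runs the sed script over every .html file in DIR and writes the
# result next to it as NAME.html.converted, leaving the original intact.
convert_all() {
    script=$1
    dir=$2
    for f in "$dir"/*.html; do
        [ -e "$f" ] || continue      # skip when no .html files match
        sed -f "$script" "$f" > "$f.converted"
    done
}

# Example: convert_all ./ConvertHTML.sed ./site
# Then review each change with: diff page.html page.html.converted
```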

Fix the img tags by hand, adding the closing slash and alt="" where they are missing. Then validate the result. Once you are happy with it, copy it over your original file. I just started using this script, so I’ll probably make updates for tags that I missed. I put the date at the top so you can tell if it is the latest version.
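
To find the img tags that still need attention, a couple of grep one-liners can help. This is a rough, line-based sketch; the helper names `list_open_imgs` and `list_imgs_without_alt` are hypothetical, and a tag split across several lines can be missed or misreported.

```shell
#!/bin/sh
# list_open_imgs: hypothetical helper (not from the original post).
# Prints, with line numbers, lines where an <img ...> tag ends in ">"
# without a "/" directly before it.
list_open_imgs() {
    grep -n '<img[^>]*[^/]>' "$1"
}

# list_imgs_without_alt: hypothetical helper.
# Prints lines that contain <img but no alt= anywhere on the line.
list_imgs_without_alt() {
    grep -n '<img' "$1" | grep -v 'alt='
}

# Example: list_open_imgs converted.html
```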

Put footer at the bottom of the page

This only works if the page content is not bigger than the window.
Try it yourself or check it out at FixedFooter.


<!DOCTYPE html>
<html lang="en">

<head>
  <title>Fixed Footer Example</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

  <style type="text/css">
  html, body, #container {
      height: 100%;
      margin: auto;
      text-align: center;
  }

  #footer {
        position:absolute;
        bottom:0;
        width:100%;
        margin: auto;
        height:60px;
        text-align: center;
    }

  </style>
</head>
<body>

<div id="container">
    <h1>This is the main content</h1>
    <p>If it is bigger than the window, then the footer will overlap it.</p>
    <p>The footer can be inside or outside of the container.</p>
</div>

<div id="footer">
    <p>This is the footer content.</p>
</div>

</body>
</html>

Rollover Text on Images

Here are two ways to do image rollovers. One is pure CSS and the other uses JavaScript. You can view them here.

The CSS version uses a background image, and the hover image is a transparent PNG with just the text in it.

The JavaScript version swaps out the image on rollover and links to the full image on click.


<style type="text/css">

.image {
    height:300px;
    width:500px;
    display: block;
    float:left;
    clear: left;
    margin-top:20px;
}

.rollover img {
    opacity: 0;
}

.rollover img:hover {
    opacity: 1.0;
}

</style>

<div class="image" style="background:url(Roscher_Photo_1.jpg);">
    <a class="rollover" href=""><img src="Roscher_Photo_1_Text.png" alt="Rollover text" /></a>
</div>

<div class="image">
    <a href="Roscher_Photo_1.jpg"
    onmouseover="if (document.images) document.imagename1.src='Roscher_Photo_1_Annotated.jpg';"
    onmouseout="if (document.images)  document.imagename1.src='Roscher_Photo_1.jpg';">

        <img src="Roscher_Photo_1.jpg" name="imagename1"
        alt="Moving the cursor over the image will bring up an annotated version.
        Clicking on the image will bring up the highest resolution version available." />
    </a>
</div>

In-page style tag

I’ll often use an in-page style tag when I am developing a page, then move the style to a .css file when I have everything just the way I want it. There are some cases, though, when you must use an in-page style. Changing the background image is one.

For my responsive web page, I want to change the background image depending on the page that is being viewed. I could use an inline style attribute, style="background: url(images/Hawk.jpg);", but I need more properties to make the background center and scroll. For readability, it is easier to put it in an in-page style tag and modify an existing style. At the same time, I use some PHP to decide which image to display. Putting it right before the element that uses it also makes it easy to debug and change. (When you’re done, you need to move it to the <head> tag if you want your page to validate.)


<style>
    .intro {
        background: url(./images/<?php echo $page ?>.jpg) no-repeat bottom center scroll;
        -webkit-background-size: cover;
        -moz-background-size: cover;
        -o-background-size: cover;
        background-size: cover;
    }
</style>
<header class="intro">