Encoding Text for Web Pages
If you want to hide web page content from robots, and yet have it available to all browsers, even those that have JavaScript turned off, the content can be encoded with the
&#______;
format, where the underline is the decimal character number of the character being encoded (for ordinary English text, its ASCII number).
For example, the letter "A" would be &#65;
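The format can be sketched in a few lines of Python. This is only an illustration of the encoding itself, not the Master Text Encoder script; the function name encode_text is made up for this example.

```python
def encode_text(text: str) -> str:
    """Write every character as &#NNN;, where NNN is its decimal character number."""
    return "".join(f"&#{ord(ch)};" for ch in text)


print(encode_text("A"))    # &#65;
print(encode_text("Hi!"))  # &#72;&#105;&#33;
```

Spaces and end-of-line characters are encoded like any other character, so the result is always a single line.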
Master Text Encoder, a free CGI program from /a/24t/pl.pl?mte, can help you with that task. Simply paste the text content into the control panel text box and click the button. The next page contains the encoded text, ready for copying and pasting into your web page.
The encoding hides content only from robots that don't decode text encoded with this method; a robot that decodes it the way a browser does will still read it. Site visitors can see the content because browsers decode it before displaying it in the browser's window.
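To illustrate the difference between a robot that scans the raw page source and software that decodes it, here is a small Python sketch. The address is a placeholder and encode_text is a hypothetical helper. Python's standard html.unescape stands in for the decoding a browser performs.

```python
import html


def encode_text(text: str) -> str:
    """Write every character as &#NNN; (hypothetical helper for this example)."""
    return "".join(f"&#{ord(ch)};" for ch in text)


address = "me@example.com"  # placeholder address, not a real one
encoded = encode_text(address)

# A naive harvester that searches the raw source for "@" finds nothing.
naive_hit = "@" in encoded

# Anything that decodes the sequences, as a browser does, gets the address back.
decoded = html.unescape(encoded)

print(naive_hit)           # False
print(decoded == address)  # True
```

The second result is the limit of the method: a robot written to decode these sequences sees the same content a visitor does.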
Content that might be encoded includes:
- Email addresses, to hide them from spammers' email-harvesting robots.
- URLs in link directories, to hide them from robots that would retrieve your links for their masters' directories.
- Content that's being retrieved by other sites for use on their own web pages. While the content can be retrieved and displayed in the encoded state, if the robot looks for content with only specific words or phrases before grabbing the text, it will see only encoded text.
- Just because you can; to have fun.
Any text content can be encoded, but HTML tags can't. The values of attributes can be encoded, but not the attribute names and not the tag names.
The underlined sections of this example are sections that can be encoded:
<p>__________________________________
_____________________________________
<a href="_____________">_________</a>
_________________________________</p>
Some browsers may correctly display web pages with attribute and tag names encoded. But that's not the case with all browsers.
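As a sketch of that rule, the following Python builds a link in which only the attribute value and the link text are encoded, while the tag name and the attribute name stay as ordinary characters. The helper names and the sample address are assumptions made for this example.

```python
import html


def encode_text(text: str) -> str:
    """Write every character as &#NNN; (hypothetical helper for this example)."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def encoded_link(url: str, label: str) -> str:
    """Encode the href value and the link text; leave <a> and href literal."""
    return f'<a href="{encode_text(url)}">{encode_text(label)}</a>'


link = encoded_link("mailto:me@example.com", "Email me")
print(link[:24])            # <a href="&#109;&#97;&#10
print(html.unescape(link))  # <a href="mailto:me@example.com">Email me</a>
```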
When Master Text Encoder encodes text, it encodes spaces and end-of-line characters, too. Thus, the encoded text is all one line.
If you wish to break the line of encoded text into multiple lines, put the line break between a semicolon (the last character of an encoded sequence) and an ampersand (the first character of the following encoded sequence). Because browsers display a line break in the page source as a space, the safest place is where a space already falls, such as right after an encoded space (&#32;).
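One way to do the line breaking automatically is sketched below in Python. It assumes the input consists only of &#NNN; sequences, and the function names and the width value are made up for this example. Line breaks are placed only between a semicolon and the following ampersand. As an extra precaution that is a choice of this sketch, a break is made only right after an encoded space (&#32;), since browsers show a source line break as whitespace.

```python
import html
import re


def encode_text(text: str) -> str:
    """Write every character as &#NNN; (hypothetical helper for this example)."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def wrap_encoded(encoded: str, width: int = 72) -> str:
    """Break fully encoded text into lines, only after an encoded space."""
    lines, current = [], ""
    for sequence in re.findall(r"&#\d+;", encoded):
        current += sequence
        # Break once the line is long enough and the last character was a space.
        if len(current) >= width and sequence == "&#32;":
            lines.append(current)
            current = ""
    if current:
        lines.append(current)
    return "\n".join(lines)


wrapped = wrap_encoded(encode_text("to be or not"), width=20)
print(wrapped)
# &#116;&#111;&#32;&#98;&#101;&#32;
# &#111;&#114;&#32;&#110;&#111;&#116;
```

Lines can run somewhat past the chosen width, because the sketch waits for the next encoded space before breaking.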
Even if you don't currently feel a need to encode any of your content, Master Text Encoder is still fun to play with. It can be downloaded from /a/24t/pl.pl?mte (instructions are in the script itself).
Will Bontrager
©2004 Bontrager Connection, LLC