March 16, 2005

BBEdit HTML Entity Maker

I constantly paste text into BBEdit in the middle of markup. That copy often needs its special characters encoded as entities before the page can be published, or it won't validate. I wrote this simple Perl filter to substitute the correct entity code for each unencoded character in a selection of text.

Place the following script in a text file and save it to:

BBEdit > BBEdit Support > Unix Support > Unix Filters

You'll now have a new item in BBEdit's #! menu under Unix Filters. Select some text containing unencoded characters and choose this menu item to convert it.
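To give a sense of the result, here is a minimal sketch that applies four of the script's ASCII-only substitutions to a made-up sample string. The sample text and the variable name $sample are assumptions for illustration only; the real filter reads the selection from standard input, as in the full script below.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample text, standing in for a BBEdit selection.
my $sample = "Widgets(TM) & Gadgets (C) 2005 -- all rights reserved";

# Four of the filter's rules, applied in the same order as in the full script.
$sample =~ s/\(C\)/&copy;/g;                   # (C)  becomes &copy;
$sample =~ s/\(TM\)/&trade;/g;                 # (TM) becomes &trade;
$sample =~ s/ & / &amp; /g;                    # a bare ampersand between spaces
$sample =~ s/([^!])--([^>])/$1&mdash;$2/g;     # double hyphen becomes an em dash

print "$sample\n";
# prints: Widgets&trade; &amp; Gadgets &copy; 2005 &mdash; all rights reserved
```

Note that the ampersand rule only matches an ampersand with a space on each side, so the entities produced by the earlier rules are not encoded a second time.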

#!/usr/bin/perl -w
#
# Used as a Unix Filter in the BBEdit #! menu to form html entities
#
# Written by Joshua McFarren 1/20/05
# This work is licensed under a Creative Commons License.
# http://creativecommons.org/licenses/by-sa/2.0/

use strict;
my $input = "";
while (<>) { $input .= $_; }

$input =~ s/\(R\)/&reg;/g;
$input =~ s/\(C\)/&copy;/g;
$input =~ s/\(TM\)/&trade;/g;
$input =~ s/ & / &amp; /g;
$input =~ s/ - / &ndash; /g;
$input =~ s/([^!])--([^>])/$1&mdash;$2/g;	# em dash, but leave the -- in <!-- and --> comment delimiters alone

$input =~ s/À/&Agrave;/g;
$input =~ s/Á/&Aacute;/g;
$input =~ s/Â/&Acirc;/g;
$input =~ s/Ã/&Atilde;/g;
$input =~ s/Ä/&Auml;/g;
$input =~ s/Å/&Aring;/g;
$input =~ s/à/&agrave;/g;
$input =~ s/á/&aacute;/g;
$input =~ s/â/&acirc;/g;
$input =~ s/ã/&atilde;/g;
$input =~ s/ä/&auml;/g;
$input =~ s/å/&aring;/g;
$input =~ s/Ç/&Ccedil;/g;
$input =~ s/ç/&ccedil;/g;
$input =~ s/È/&Egrave;/g;
$input =~ s/É/&Eacute;/g;
$input =~ s/Ê/&Ecirc;/g;
$input =~ s/Ë/&Euml;/g;
$input =~ s/è/&egrave;/g;
$input =~ s/é/&eacute;/g;
$input =~ s/ê/&ecirc;/g;
$input =~ s/ë/&euml;/g;
$input =~ s/Ì/&Igrave;/g;
$input =~ s/Í/&Iacute;/g;
$input =~ s/Î/&Icirc;/g;
$input =~ s/Ï/&Iuml;/g;
$input =~ s/ì/&igrave;/g;
$input =~ s/í/&iacute;/g;
$input =~ s/î/&icirc;/g;
$input =~ s/ï/&iuml;/g;
$input =~ s/Ñ/&Ntilde;/g;
$input =~ s/ñ/&ntilde;/g;
$input =~ s/Ò/&Ograve;/g;
$input =~ s/Ó/&Oacute;/g;
$input =~ s/Ô/&Ocirc;/g;
$input =~ s/Õ/&Otilde;/g;
$input =~ s/Ö/&Ouml;/g;
$input =~ s/Ø/&Oslash;/g;
$input =~ s/ò/&ograve;/g;
$input =~ s/ó/&oacute;/g;
$input =~ s/ô/&ocirc;/g;
$input =~ s/õ/&otilde;/g;
$input =~ s/ö/&ouml;/g;
$input =~ s/ø/&oslash;/g;
$input =~ s/Ù/&Ugrave;/g;
$input =~ s/Ú/&Uacute;/g;
$input =~ s/Û/&Ucirc;/g;
$input =~ s/Ü/&Uuml;/g;
$input =~ s/ù/&ugrave;/g;
$input =~ s/ú/&uacute;/g;
$input =~ s/û/&ucirc;/g;
$input =~ s/ü/&uuml;/g;
$input =~ s/ÿ/&yuml;/g;
$input =~ s/Ÿ/&Yuml;/g;
$input =~ s/¡/&iexcl;/g;
$input =~ s/¢/&cent;/g;
$input =~ s/£/&pound;/g;
$input =~ s/¥/&yen;/g;
$input =~ s/§/&sect;/g;
$input =~ s/¨/&uml;/g;
$input =~ s/©/&copy;/g;
$input =~ s/ª/&ordf;/g;
$input =~ s/«/&laquo;/g;
$input =~ s/¬/&not;/g;
$input =~ s/®/&reg;/g;
$input =~ s/¯/&macr;/g;
$input =~ s/°/&deg;/g;
$input =~ s/±/&plusmn;/g;
$input =~ s/´/&acute;/g;
$input =~ s/µ/&micro;/g;
$input =~ s/¶/&para;/g;
$input =~ s/·/&middot;/g;
$input =~ s/¸/&cedil;/g;
$input =~ s/º/&ordm;/g;
$input =~ s/»/&raquo;/g;
$input =~ s/–/&ndash;/g;
$input =~ s/—/&mdash;/g;
$input =~ s/‘/&lsquo;/g;
$input =~ s/’/&rsquo;/g;
$input =~ s/‚/&sbquo;/g;
$input =~ s/“/&ldquo;/g;
$input =~ s/”/&rdquo;/g;
$input =~ s/„/&bdquo;/g;
$input =~ s/†/&dagger;/g;
$input =~ s/‡/&Dagger;/g;
$input =~ s/•/&bull;/g;
$input =~ s/…/&hellip;/g;
$input =~ s/‰/&permil;/g;
$input =~ s/‹/&lsaquo;/g;
$input =~ s/›/&rsaquo;/g;
$input =~ s/¿/&iquest;/g;
$input =~ s/Æ/&AElig;/g;
$input =~ s/æ/&aelig;/g;
$input =~ s/ß/&szlig;/g;
$input =~ s/÷/&divide;/g;
$input =~ s/Œ/&OElig;/g;
$input =~ s/œ/&oelig;/g;
$input =~ s/ƒ/&fnof;/g;
$input =~ s/ˆ/&circ;/g;
$input =~ s/˜/&tilde;/g;
$input =~ s/Ω/&Omega;/g;
$input =~ s/π/&pi;/g;

print $input;
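The long run of one-character substitutions above could also be expressed as a lookup table with a single pass over the text. What follows is only a sketch of that alternative, not the script as written: the names %entity_for and encode_entities_sketch are hypothetical, the table holds just a handful of the entries, and it assumes the text has already been decoded into Perl characters (the keys are written as \x{...} escapes so the example does not depend on how the source file is saved).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical lookup table holding a few of the filter's entries.
my %entity_for = (
    "\x{C9}"   => '&Eacute;',   # E with acute accent
    "\x{E9}"   => '&eacute;',   # e with acute accent
    "\x{F1}"   => '&ntilde;',   # n with tilde
    "\x{B0}"   => '&deg;',      # degree sign
    "\x{2013}" => '&ndash;',    # en dash
    "\x{2014}" => '&mdash;',    # em dash
    "\x{2026}" => '&hellip;',   # horizontal ellipsis
);

# Build one character class from the table's keys. None of these keys is
# special inside a character class, so they can be joined without escaping.
my $chars   = join '', keys %entity_for;
my $pattern = qr/([$chars])/;

# Replace every character found in the table with its entity.
sub encode_entities_sketch {
    my ($text) = @_;
    $text =~ s/$pattern/$entity_for{$1}/g;
    return $text;
}

# Hypothetical sample: "Caf", e-acute, space, em dash, space, "5", degree sign.
print encode_entities_sketch("Caf\x{E9} \x{2014} 5\x{B0}"), "\n";
# prints: Caf&eacute; &mdash; 5&deg;
```

One trade-off of this form is that adding a character means adding one line to the table rather than another substitution. Multi-character rules such as (TM) or the double-hyphen case would still need their own substitutions.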
Posted by joshua at March 16, 2005 10:00 PM