XHTML Tutorial

HTTP Status Messages

18 May 2009

by navid100 in html, XHTML Tags: html, html tutorials, HTTP Status Messages, lessons, tutorials, web, XHTML, XHTML Tutorial

When a browser requests a service from a web server, the server returns a status message. Some of these messages indicate that an error occurred.

This is a list of HTTP status messages that might be returned:

1xx: Information

Message:	Description:
100 Continue	Only a part of the request has been received by the server, but as long as it has not been rejected, the client should continue with the request
101 Switching Protocols	The server switches protocol

2xx: Successful

Message:	Description:
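200 OK	The request succeeded
201 Created	The request is complete, and a new resource has been created
202 Accepted	The request has been accepted for processing, but the processing is not complete
203 Non-Authoritative Information	The returned metadata comes from a local or third-party copy rather than from the origin server
204 No Content	The request succeeded, but there is no content to return
205 Reset Content	The request succeeded, and the client should reset the document view that sent the request
206 Partial Content	The server is returning only part of the resource, as requested by the client with a range header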

3xx: Redirection

Message:	Description:
300 Multiple Choices	The requested resource is available at several locations. The user (or the browser) can select one of them and go to that location
301 Moved Permanently	The requested page has moved to a new URL
302 Found	The requested page has moved temporarily to a new URL
303 See Other	The requested page can be found under a different URL
304 Not Modified	The page has not been modified since the last request, so the client can use its cached copy
305 Use Proxy	The requested page must be accessed through the proxy given in the response
306 Unused	This code was used in a previous version. It is no longer used, but the code is reserved
307 Temporary Redirect	The requested page has moved temporarily to a new URL

4xx: Client Error

Message:	Description:
400 Bad Request	The server did not understand the request
401 Unauthorized	The requested page needs a username and a password
402 Payment Required	This code is reserved for future use
403 Forbidden	Access is forbidden to the requested page
404 Not Found	The server cannot find the requested page
405 Method Not Allowed	The method specified in the request is not allowed
406 Not Acceptable	The server can only generate a response that is not accepted by the client
407 Proxy Authentication Required	You must authenticate with a proxy server before this request can be served
408 Request Timeout	The request took longer than the server was prepared to wait
409 Conflict	The request could not be completed because of a conflict
410 Gone	The requested page is no longer available
411 Length Required	The “Content-Length” is not defined. The server will not accept the request without it
412 Precondition Failed	The precondition given in the request evaluated to false by the server
413 Request Entity Too Large	The server will not accept the request, because the request entity is too large
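414 Request-URI Too Long	The server will not accept the request, because the URL is too long. This can occur when a “post” request is converted to a “get” request with long query information
415 Unsupported Media Type	The server will not accept the request, because the media type is not supported
416 Requested Range Not Satisfiable	The server cannot return the byte range that the client requested
417 Expectation Failed	The server cannot meet the requirements of the Expect request header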

5xx: Server Error

Message:	Description:
500 Internal Server Error	The request was not completed. The server met an unexpected condition
501 Not Implemented	The request was not completed. The server did not support the functionality required
502 Bad Gateway	The request was not completed. The server received an invalid response from the upstream server
503 Service Unavailable	The request was not completed. The server is temporarily overloaded or down
504 Gateway Timeout	The gateway has timed out
505 HTTP Version Not Supported	The server does not support the “http protocol” version
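
The five groups above are determined by the first digit of the status code. The following is a minimal sketch of that grouping in Python, which is not a language used by this tutorial. The helper names `status_class` and `describe` are hypothetical, the class labels are copied from the headings above, and the reason phrases come from Python's standard `http.HTTPStatus` enum.

```python
from http import HTTPStatus

# Class labels taken from the section headings above, keyed by first digit.
STATUS_CLASSES = {
    1: "Information",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    """Return the class of an HTTP status code based on its first digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


def describe(code: int) -> str:
    """Combine the standard reason phrase with the class label."""
    phrase = HTTPStatus(code).phrase  # raises ValueError for unknown codes
    return f"{code} {phrase} ({status_class(code)})"


for code in (200, 301, 404, 503):
    print(describe(code))
```

A client can usually decide what to do from the class alone: retry later on 5xx, follow the new location on 3xx, and fix the request on 4xx.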

HTML URL Encoding Reference

18 May 2009

by navid100 in html, XHTML Tags: html, html tutorials, HTML URL Encoding, lessons, tutorials, web, XHTML, XHTML Tutorial

URL encoding converts characters into a format that can be safely transmitted over the Internet.

URL – Uniform Resource Locator

Web browsers request pages from web servers by using a URL.

The URL is the address of a web page like: http://www.w3schools.com.

URL Encoding

URLs can only be sent over the Internet using the ASCII character-set.

Since URLs often contain characters outside the ASCII set, the URL has to be converted. URL encoding converts the URL into a valid ASCII format.

URL encoding replaces unsafe ASCII characters with “%” followed by two hexadecimal digits corresponding to the character values in the ISO-8859-1 character-set.
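
The rule above can be sketched in a few lines of Python. This is only an illustration: the function name `percent_encode` is hypothetical, and the sketch encodes everything except ASCII letters and digits, which is stricter than real URL encoders (they leave a few punctuation characters alone).

```python
def percent_encode(text: str, safe: str = "") -> str:
    """Percent-encode every character that is not an ASCII letter,
    an ASCII digit, or listed in `safe`."""
    out = []
    for ch in text:
        if (ch.isascii() and ch.isalnum()) or ch in safe:
            out.append(ch)
        else:
            # One ISO-8859-1 byte per character, written as "%" plus
            # two hexadecimal digits. Characters outside ISO-8859-1
            # raise UnicodeEncodeError here.
            byte = ch.encode("iso-8859-1")[0]
            out.append(f"%{byte:02X}")
    return "".join(out)


print(percent_encode("a b&c"))  # a%20b%26c
print(percent_encode("café"))   # caf%E9
```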

URLs cannot contain spaces. URL encoding normally replaces a space with a + sign (in submitted form data) or with %20.

URL Encoding Functions

In JavaScript, PHP, and ASP there are functions that can be used to URL encode a string.

In JavaScript you can use the encodeURI() function. PHP has the rawurlencode() function and ASP has the Server.URLEncode() function.

Note: The JavaScript function encodes space as %20.
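
For comparison, here is a short sketch using Python's standard `urllib.parse` module. Python is not one of the languages named above; it is used here only to show the two ways of encoding a space and the effect of the character-set.

```python
from urllib.parse import quote, quote_plus, unquote_plus

text = "Tom & Jerry"

as_percent = quote(text)    # space becomes %20
as_plus = quote_plus(text)  # space becomes +

print(as_percent)  # Tom%20%26%20Jerry
print(as_plus)     # Tom+%26+Jerry

# The encoded value depends on the character-set used for the bytes.
latin1 = quote("é", encoding="iso-8859-1")  # one byte
utf8 = quote("é")                           # UTF-8 is the default: two bytes

print(latin1)  # %E9
print(utf8)    # %C3%A9

# Decoding reverses the process.
print(unquote_plus(as_plus))  # Tom & Jerry
```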

URL Encoding Reference

ASCII Character	URL-encoding
space	%20
!	%21
"	%22
#	%23
$	%24
%	%25
&	%26
'	%27
(	%28
)	%29
*	%2A
+	%2B
,	%2C
-	%2D
.	%2E
/	%2F
0	%30
1	%31
2	%32
3	%33
4	%34
5	%35
6	%36
7	%37
8	%38
9	%39
:	%3A
;	%3B
<	%3C
=	%3D
>	%3E
?	%3F
@	%40
A	%41
B	%42
C	%43
D	%44
E	%45
F	%46
G	%47
H	%48
I	%49
J	%4A
K	%4B
L	%4C
M	%4D
N	%4E
O	%4F
P	%50
Q	%51
R	%52
S	%53
T	%54
U	%55
V	%56
W	%57
X	%58
Y	%59
Z	%5A
[	%5B
\	%5C
]	%5D
^	%5E
_	%5F
`	%60
a	%61
b	%62
c	%63
d	%64
e	%65
f	%66
g	%67
h	%68
i	%69
j	%6A
k	%6B
l	%6C
m	%6D
n	%6E
o	%6F
p	%70
q	%71
r	%72
s	%73
t	%74
u	%75
v	%76
w	%77
x	%78
y	%79
z	%7A
{	%7B
|	%7C
}	%7D
~	%7E
	%7F
€	%80
	%81
‚	%82
ƒ	%83
„	%84
…	%85
†	%86
‡	%87
ˆ	%88
‰	%89
Š	%8A
‹	%8B
Œ	%8C
	%8D
Ž	%8E
	%8F
	%90
‘	%91
’	%92
“	%93
”	%94
•	%95
–	%96
—	%97
˜	%98
™	%99
š	%9A
›	%9B
œ	%9C
	%9D
ž	%9E
Ÿ	%9F
	%A0
¡	%A1
¢	%A2
£	%A3
¤	%A4
¥	%A5
¦	%A6
§	%A7
¨	%A8
©	%A9
ª	%AA
«	%AB
¬	%AC
soft hyphen	%AD
®	%AE
¯	%AF
°	%B0
±	%B1
²	%B2
³	%B3
´	%B4
µ	%B5
¶	%B6
·	%B7
¸	%B8
¹	%B9
º	%BA
»	%BB
¼	%BC
½	%BD
¾	%BE
¿	%BF
À	%C0
Á	%C1
Â	%C2
Ã	%C3
Ä	%C4
Å	%C5
Æ	%C6
Ç	%C7
È	%C8
É	%C9
Ê	%CA
Ë	%CB
Ì	%CC
Í	%CD
Î	%CE
Ï	%CF
Ð	%D0
Ñ	%D1
Ò	%D2
Ó	%D3
Ô	%D4
Õ	%D5
Ö	%D6
×	%D7
Ø	%D8
Ù	%D9
Ú	%DA
Û	%DB
Ü	%DC
Ý	%DD
Þ	%DE
ß	%DF
à	%E0
á	%E1
â	%E2
ã	%E3
ä	%E4
å	%E5
æ	%E6
ç	%E7
è	%E8
é	%E9
ê	%EA
ë	%EB
ì	%EC
í	%ED
î	%EE
ï	%EF
ð	%F0
ñ	%F1
ò	%F2
ó	%F3
ô	%F4
õ	%F5
ö	%F6
÷	%F7
ø	%F8
ù	%F9
ú	%FA
û	%FB
ü	%FC
ý	%FD
þ	%FE
ÿ	%FF

URL Encoding Reference

The ASCII device control characters %00-%1F were originally designed to control hardware devices. Control characters have no place inside a URL.

ASCII Character	Description	URL-encoding
NUL	null character	%00
SOH	start of header	%01
STX	start of text	%02
ETX	end of text	%03
EOT	end of transmission	%04
ENQ	enquiry	%05
ACK	acknowledge	%06
BEL	bell (ring)	%07
BS	backspace	%08
HT	horizontal tab	%09
LF	line feed	%0A
VT	vertical tab	%0B
FF	form feed	%0C
CR	carriage return	%0D
SO	shift out	%0E
SI	shift in	%0F
DLE	data link escape	%10
DC1	device control 1	%11
DC2	device control 2	%12
DC3	device control 3	%13
DC4	device control 4	%14
NAK	negative acknowledge	%15
SYN	synchronize	%16
ETB	end transmission block	%17
CAN	cancel	%18
EM	end of medium	%19
SUB	substitute	%1A
ESC	escape	%1B
FS	file separator	%1C
GS	group separator	%1D
RS	record separator	%1E
US	unit separator	%1F

HTML Symbol Entities Reference

18 May 2009

by navid100 in html, XHTML Tags: html, HTML Symbol Entities, html tutorials, lessons, tutorials, web, XHTML Tutorial

HTML Symbol Entities

This entity reference includes mathematical symbols, Greek characters, various arrows, technical symbols and shapes.

Note: Entity names are case sensitive.
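
A quick way to see the case sensitivity is to decode two entity names that differ only in case. This sketch uses `html.unescape` from Python's standard library, purely as an illustration; a browser does the same decoding when it renders the page.

```python
import html

# Same letters, different case: two different characters.
lower = html.unescape("&dagger;")  # dagger
upper = html.unescape("&Dagger;")  # double dagger
print(lower, upper)

# An entity name and its entity number decode to the same character.
by_name = html.unescape("&forall;")
by_number = html.unescape("&#8704;")
print(by_name == by_number)  # True
```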

Math Symbols Supported by HTML

Character	Entity Number	Entity Name	Description
∀	&#8704;	&forall;	for all
∂	&#8706;	&part;	part
∃	&#8707;	&exist;	exists
∅	&#8709;	&empty;	empty
∇	&#8711;	&nabla;	nabla
∈	&#8712;	&isin;	isin
∉	&#8713;	&notin;	notin
∋	&#8715;	&ni;	ni
∏	&#8719;	&prod;	prod
∑	&#8721;	&sum;	sum
−	&#8722;	&minus;	minus
∗	&#8727;	&lowast;	lowast
√	&#8730;	&radic;	square root
∝	&#8733;	&prop;	proportional to
∞	&#8734;	&infin;	infinity
∠	&#8736;	&ang;	angle
∧	&#8743;	&and;	and
∨	&#8744;	&or;	or
∩	&#8745;	&cap;	cap
∪	&#8746;	&cup;	cup
∫	&#8747;	&int;	integral
∴	&#8756;	&there4;	therefore
∼	&#8764;	&sim;	similar to
≅	&#8773;	&cong;	approximately equal
≈	&#8776;	&asymp;	almost equal
≠	&#8800;	&ne;	not equal
≡	&#8801;	&equiv;	equivalent
≤	&#8804;	&le;	less or equal
≥	&#8805;	&ge;	greater or equal
⊂	&#8834;	&sub;	subset of
⊃	&#8835;	&sup;	superset of
⊄	&#8836;	&nsub;	not subset of
⊆	&#8838;	&sube;	subset or equal
⊇	&#8839;	&supe;	superset or equal
⊕	&#8853;	&oplus;	circled plus
⊗	&#8855;	&otimes;	circled times
⊥	&#8869;	&perp;	perpendicular
⋅	&#8901;	&sdot;	dot operator

Greek Letters Supported by HTML

Character	Entity Number	Entity Name	Description
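Α	&#913;	&Alpha;	Alpha
Β	&#914;	&Beta;	Beta
Γ	&#915;	&Gamma;	Gamma
Δ	&#916;	&Delta;	Delta
Ε	&#917;	&Epsilon;	Epsilon
Ζ	&#918;	&Zeta;	Zeta
Η	&#919;	&Eta;	Eta
Θ	&#920;	&Theta;	Theta
Ι	&#921;	&Iota;	Iota
Κ	&#922;	&Kappa;	Kappa
Λ	&#923;	&Lambda;	Lambda
Μ	&#924;	&Mu;	Mu
Ν	&#925;	&Nu;	Nu
Ξ	&#926;	&Xi;	Xi
Ο	&#927;	&Omicron;	Omicron
Π	&#928;	&Pi;	Pi
Ρ	&#929;	&Rho;	Rho
	undefined		Sigmaf
Σ	&#931;	&Sigma;	Sigma
Τ	&#932;	&Tau;	Tau
Υ	&#933;	&Upsilon;	Upsilon
Φ	&#934;	&Phi;	Phi
Χ	&#935;	&Chi;	Chi
Ψ	&#936;	&Psi;	Psi
Ω	&#937;	&Omega;	Omega

α	&#945;	&alpha;	alpha
β	&#946;	&beta;	beta
γ	&#947;	&gamma;	gamma
δ	&#948;	&delta;	delta
ε	&#949;	&epsilon;	epsilon
ζ	&#950;	&zeta;	zeta
η	&#951;	&eta;	eta
θ	&#952;	&theta;	theta
ι	&#953;	&iota;	iota
κ	&#954;	&kappa;	kappa
λ	&#955;	&lambda;	lambda
μ	&#956;	&mu;	mu
ν	&#957;	&nu;	nu
ξ	&#958;	&xi;	xi
ο	&#959;	&omicron;	omicron
π	&#960;	&pi;	pi
ρ	&#961;	&rho;	rho
ς	&#962;	&sigmaf;	sigmaf
σ	&#963;	&sigma;	sigma
τ	&#964;	&tau;	tau
υ	&#965;	&upsilon;	upsilon
φ	&#966;	&phi;	phi
χ	&#967;	&chi;	chi
ψ	&#968;	&psi;	psi
ω	&#969;	&omega;	omega

ϑ	&#977;	&thetasym;	theta symbol
ϒ	&#978;	&upsih;	upsilon symbol
ϖ	&#982;	&piv;	pi symbol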

Other Entities Supported by HTML

Character	Entity Number	Entity Name	Description
Œ	&#338;	&OElig;	capital ligature OE
œ	&#339;	&oelig;	small ligature oe
Š	&#352;	&Scaron;	capital S with caron
š	&#353;	&scaron;	small s with caron
Ÿ	&#376;	&Yuml;	capital Y with diaeresis
ƒ	&#402;	&fnof;	f with hook
ˆ	&#710;	&circ;	modifier letter circumflex accent
˜	&#732;	&tilde;	small tilde
 	&#8194;	&ensp;	en space
 	&#8195;	&emsp;	em space
 	&#8201;	&thinsp;	thin space
‌	&#8204;	&zwnj;	zero width non-joiner
‍	&#8205;	&zwj;	zero width joiner
‎	&#8206;	&lrm;	left-to-right mark
‏	&#8207;	&rlm;	right-to-left mark
–	&#8211;	&ndash;	en dash
—	&#8212;	&mdash;	em dash
‘	&#8216;	&lsquo;	left single quotation mark
’	&#8217;	&rsquo;	right single quotation mark
‚	&#8218;	&sbquo;	single low-9 quotation mark
“	&#8220;	&ldquo;	left double quotation mark
”	&#8221;	&rdquo;	right double quotation mark
„	&#8222;	&bdquo;	double low-9 quotation mark
†	&#8224;	&dagger;	dagger
‡	&#8225;	&Dagger;	double dagger
•	&#8226;	&bull;	bullet
…	&#8230;	&hellip;	horizontal ellipsis
‰	&#8240;	&permil;	per mille
′	&#8242;	&prime;	minutes
″	&#8243;	&Prime;	seconds
‹	&#8249;	&lsaquo;	single left angle quotation
›	&#8250;	&rsaquo;	single right angle quotation
‾	&#8254;	&oline;	overline
€	&#8364;	&euro;	euro
™	&#8482;	&trade;	trademark
←	&#8592;	&larr;	left arrow
↑	&#8593;	&uarr;	up arrow
→	&#8594;	&rarr;	right arrow
↓	&#8595;	&darr;	down arrow
↔	&#8596;	&harr;	left right arrow
↵	&#8629;	&crarr;	carriage return arrow
⌈	&#8968;	&lceil;	left ceiling
⌉	&#8969;	&rceil;	right ceiling
⌊	&#8970;	&lfloor;	left floor
⌋	&#8971;	&rfloor;	right floor
◊	&#9674;	&loz;	lozenge
♠	&#9824;	&spades;	spade
♣	&#9827;	&clubs;	club
♥	&#9829;	&hearts;	heart
♦	&#9830;	&diams;	diamond

HTML ISO-8859-1 Reference

18 May 2009

by navid100 in html, XHTML Tags: html, html tutorials, lessons, tutorials, web, XHTML, XHTML Tutorial

Modern browsers support several character-sets:

ISO-8859-1

ISO-8859-1 is the default character set in most browsers.

The first 128 characters of ISO-8859-1 are the original ASCII character-set (the numbers from 0-9, the uppercase and lowercase English alphabet, and some special characters).

The higher part of ISO-8859-1 (codes from 160-255) contains the characters used in Western European countries and some commonly used special characters.

Entities are used to implement reserved characters or to express characters that cannot easily be entered with the keyboard.

Reserved Characters in HTML

Some characters are reserved in HTML and XHTML. For example, you cannot use the greater than or less than signs within your text because the browser could mistake them for markup.

HTML and XHTML processors must support the five special characters listed in the table below:

Character	Entity Number	Entity Name	Description
"	&#34;	&quot;	quotation mark
'	&#39;	&apos; (does not work in IE)	apostrophe
&	&#38;	&amp;	ampersand
<	&#60;	&lt;	less-than
>	&#62;	&gt;	greater-than

Note: Entity names are case sensitive!
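
When text is generated by a program, the reserved characters are usually escaped automatically rather than by hand. Here is a small sketch with Python's standard `html` module; the sample markup is made up for the example.

```python
import html

# Hypothetical text that contains all of the reserved characters except
# the apostrophe.
raw = '<p class="note">Tom & Jerry</p>'

escaped = html.escape(raw)
print(escaped)
# &lt;p class=&quot;note&quot;&gt;Tom &amp; Jerry&lt;/p&gt;

# Unescaping gives the original text back.
restored = html.unescape(escaped)
print(restored == raw)  # True
```

Note that `html.escape` writes the apostrophe as the numeric entity `&#x27;` instead of `&apos;`, which avoids the browser problem mentioned in the table.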

ISO 8859-1 Symbols

Character	Entity Number	Entity Name	Description
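 	&#160;	&nbsp;	non-breaking space
¡	&#161;	&iexcl;	inverted exclamation mark
¢	&#162;	&cent;	cent
£	&#163;	&pound;	pound
¤	&#164;	&curren;	currency
¥	&#165;	&yen;	yen
¦	&#166;	&brvbar;	broken vertical bar
§	&#167;	&sect;	section
¨	&#168;	&uml;	spacing diaeresis
©	&#169;	&copy;	copyright
ª	&#170;	&ordf;	feminine ordinal indicator
«	&#171;	&laquo;	angle quotation mark (left)
¬	&#172;	&not;	negation
­	&#173;	&shy;	soft hyphen
®	&#174;	&reg;	registered trademark
¯	&#175;	&macr;	spacing macron
°	&#176;	&deg;	degree
±	&#177;	&plusmn;	plus-or-minus
²	&#178;	&sup2;	superscript 2
³	&#179;	&sup3;	superscript 3
´	&#180;	&acute;	spacing acute
µ	&#181;	&micro;	micro
¶	&#182;	&para;	paragraph
·	&#183;	&middot;	middle dot
¸	&#184;	&cedil;	spacing cedilla
¹	&#185;	&sup1;	superscript 1
º	&#186;	&ordm;	masculine ordinal indicator
»	&#187;	&raquo;	angle quotation mark (right)
¼	&#188;	&frac14;	fraction 1/4
½	&#189;	&frac12;	fraction 1/2
¾	&#190;	&frac34;	fraction 3/4
¿	&#191;	&iquest;	inverted question mark
×	&#215;	&times;	multiplication
÷	&#247;	&divide;	division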

ISO 8859-1 Characters

Character	Entity Number	Entity Name	Description
À	&#192;	&Agrave;	capital a, grave accent
Á	&#193;	&Aacute;	capital a, acute accent
Â	&#194;	&Acirc;	capital a, circumflex accent
Ã	&#195;	&Atilde;	capital a, tilde
Ä	&#196;	&Auml;	capital a, umlaut mark
Å	&#197;	&Aring;	capital a, ring
Æ	&#198;	&AElig;	capital ae
Ç	&#199;	&Ccedil;	capital c, cedilla
È	&#200;	&Egrave;	capital e, grave accent
É	&#201;	&Eacute;	capital e, acute accent
Ê	&#202;	&Ecirc;	capital e, circumflex accent
Ë	&#203;	&Euml;	capital e, umlaut mark
Ì	&#204;	&Igrave;	capital i, grave accent
Í	&#205;	&Iacute;	capital i, acute accent
Î	&#206;	&Icirc;	capital i, circumflex accent
Ï	&#207;	&Iuml;	capital i, umlaut mark
Ð	&#208;	&ETH;	capital eth, Icelandic
Ñ	&#209;	&Ntilde;	capital n, tilde
Ò	&#210;	&Ograve;	capital o, grave accent
Ó	&#211;	&Oacute;	capital o, acute accent
Ô	&#212;	&Ocirc;	capital o, circumflex accent
Õ	&#213;	&Otilde;	capital o, tilde
Ö	&#214;	&Ouml;	capital o, umlaut mark
Ø	&#216;	&Oslash;	capital o, slash
Ù	&#217;	&Ugrave;	capital u, grave accent
Ú	&#218;	&Uacute;	capital u, acute accent
Û	&#219;	&Ucirc;	capital u, circumflex accent
Ü	&#220;	&Uuml;	capital u, umlaut mark
Ý	&#221;	&Yacute;	capital y, acute accent
Þ	&#222;	&THORN;	capital THORN, Icelandic
ß	&#223;	&szlig;	small sharp s, German
à	&#224;	&agrave;	small a, grave accent
á	&#225;	&aacute;	small a, acute accent
â	&#226;	&acirc;	small a, circumflex accent
ã	&#227;	&atilde;	small a, tilde
ä	&#228;	&auml;	small a, umlaut mark
å	&#229;	&aring;	small a, ring
æ	&#230;	&aelig;	small ae
ç	&#231;	&ccedil;	small c, cedilla
è	&#232;	&egrave;	small e, grave accent
é	&#233;	&eacute;	small e, acute accent
ê	&#234;	&ecirc;	small e, circumflex accent
ë	&#235;	&euml;	small e, umlaut mark
ì	&#236;	&igrave;	small i, grave accent
í	&#237;	&iacute;	small i, acute accent
î	&#238;	&icirc;	small i, circumflex accent
ï	&#239;	&iuml;	small i, umlaut mark
ð	&#240;	&eth;	small eth, Icelandic
ñ	&#241;	&ntilde;	small n, tilde
ò	&#242;	&ograve;	small o, grave accent
ó	&#243;	&oacute;	small o, acute accent
ô	&#244;	&ocirc;	small o, circumflex accent
õ	&#245;	&otilde;	small o, tilde
ö	&#246;	&ouml;	small o, umlaut mark
ø	&#248;	&oslash;	small o, slash
ù	&#249;	&ugrave;	small u, grave accent
ú	&#250;	&uacute;	small u, acute accent
û	&#251;	&ucirc;	small u, circumflex accent
ü	&#252;	&uuml;	small u, umlaut mark
ý	&#253;	&yacute;	small y, acute accent
þ	&#254;	&thorn;	small thorn, Icelandic
ÿ	&#255;	&yuml;	small y, umlaut mark

HTML ASCII Reference

18 May 2009

by navid100 in html, XHTML Tags: html, HTML ASCII, html tutorials, lessons, tutorials, web, XHTML, XHTML Tutorial

The ASCII character-set is used to send information between computers on the Internet.

The ASCII Character Set

ASCII stands for the “American Standard Code for Information Interchange”. It was designed in the early 1960s as a standard character-set for computers and hardware devices like teleprinters and tape drives.

ASCII is a 7-bit character set containing 128 characters.

It contains the numbers from 0-9, the uppercase and lowercase English letters from A to Z, and some special characters.
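
The link between an ASCII character, its code, and its numeric HTML entity can be sketched in Python. The helper name `to_entity` is hypothetical and exists only for this illustration.

```python
def to_entity(ch: str) -> str:
    """Return the numeric HTML entity code for a single character."""
    return f"&#{ord(ch)};"


print(to_entity("A"))  # &#65;
print(to_entity("~"))  # &#126;

# 7 bits give 2**7 = 128 codes, numbered 0 to 127.
ascii_size = 2 ** 7
print(ascii_size)  # 128

# Characters outside that range cannot be encoded as ASCII.
try:
    "é".encode("ascii")
    is_ascii = True
except UnicodeEncodeError:
    is_ascii = False
print(is_ascii)  # False
```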

The character-sets used in modern computers, HTML, and the Internet are all based on ASCII.

The following table lists the 128 ASCII characters and their equivalent HTML entity codes.

ASCII Printable Characters

ASCII Character	HTML Entity Code	Description
 	&#32;	space
!	&#33;	exclamation mark
"	&#34;	quotation mark
#	&#35;	number sign
$	&#36;	dollar sign
%	&#37;	percent sign
&	&#38;	ampersand
'	&#39;	apostrophe
(	&#40;	left parenthesis
)	&#41;	right parenthesis
*	&#42;	asterisk
+	&#43;	plus sign
,	&#44;	comma
-	&#45;	hyphen
.	&#46;	period
/	&#47;	slash
0	&#48;	digit 0
1	&#49;	digit 1
2	&#50;	digit 2
3	&#51;	digit 3
4	&#52;	digit 4
5	&#53;	digit 5
6	&#54;	digit 6
7	&#55;	digit 7
8	&#56;	digit 8
9	&#57;	digit 9
:	&#58;	colon
;	&#59;	semicolon
<	&#60;	less-than
=	&#61;	equals-to
>	&#62;	greater-than
?	&#63;	question mark
@	&#64;	at sign
A	&#65;	uppercase A
B	&#66;	uppercase B
C	&#67;	uppercase C
D	&#68;	uppercase D
E	&#69;	uppercase E
F	&#70;	uppercase F
G	&#71;	uppercase G
H	&#72;	uppercase H
I	&#73;	uppercase I
J	&#74;	uppercase J
K	&#75;	uppercase K
L	&#76;	uppercase L
M	&#77;	uppercase M
N	&#78;	uppercase N
O	&#79;	uppercase O
P	&#80;	uppercase P
Q	&#81;	uppercase Q
R	&#82;	uppercase R
S	&#83;	uppercase S
T	&#84;	uppercase T
U	&#85;	uppercase U
V	&#86;	uppercase V
W	&#87;	uppercase W
X	&#88;	uppercase X
Y	&#89;	uppercase Y
Z	&#90;	uppercase Z
[	&#91;	left square bracket
\	&#92;	backslash
]	&#93;	right square bracket
^	&#94;	caret
_	&#95;	underscore
`	&#96;	grave accent
a	&#97;	lowercase a
b	&#98;	lowercase b
c	&#99;	lowercase c
d	&#100;	lowercase d
e	&#101;	lowercase e
f	&#102;	lowercase f
g	&#103;	lowercase g
h	&#104;	lowercase h
i	&#105;	lowercase i
j	&#106;	lowercase j
k	&#107;	lowercase k
l	&#108;	lowercase l
m	&#109;	lowercase m
n	&#110;	lowercase n
o	&#111;	lowercase o
p	&#112;	lowercase p
q	&#113;	lowercase q
r	&#114;	lowercase r
s	&#115;	lowercase s
t	&#116;	lowercase t
u	&#117;	lowercase u
v	&#118;	lowercase v
w	&#119;	lowercase w
x	&#120;	lowercase x
y	&#121;	lowercase y
z	&#122;	lowercase z
{	&#123;	left curly brace
|	&#124;	vertical bar
}	&#125;	right curly brace
~	&#126;	tilde

ASCII Device Control Characters

The ASCII device control characters were originally designed to control hardware devices.

Control characters have no place inside an HTML document.

ASCII Character	HTML Entity Code	Description
NUL	&#00;	null character
SOH	&#01;	start of header
STX	&#02;	start of text
ETX	&#03;	end of text
EOT	&#04;	end of transmission
ENQ	&#05;	enquiry
ACK	&#06;	acknowledge
BEL	&#07;	bell (ring)
BS	&#08;	backspace
HT	&#09;	horizontal tab
LF	&#10;	line feed
VT	&#11;	vertical tab
FF	&#12;	form feed
CR	&#13;	carriage return
SO	&#14;	shift out
SI	&#15;	shift in
DLE	&#16;	data link escape
DC1	&#17;	device control 1
DC2	&#18;	device control 2
DC3	&#19;	device control 3
DC4	&#20;	device control 4
NAK	&#21;	negative acknowledge
SYN	&#22;	synchronize
ETB	&#23;	end transmission block
CAN	&#24;	cancel
EM	&#25;	end of medium
SUB	&#26;	substitute
ESC	&#27;	escape
FS	&#28;	file separator
GS	&#29;	group separator
RS	&#30;	record separator
US	&#31;	unit separator

DEL	&#127;	delete (rubout)

HTML Character Sets

18 May 2009

by navid100 in html, XHTML Tags: html, HTML Character Sets, html tutorials, lessons, tutorials, web, XHTML, XHTML Tutorial

To display an HTML page correctly, the browser must know what character-set to use.

The character-set for the early world wide web was ASCII. ASCII supports the numbers from 0-9, the uppercase and lowercase English alphabet, and some special characters.

Complete ASCII reference.

Since many countries use characters which are not a part of ASCII, the default character-set for modern browsers is ISO-8859-1.

Complete ISO-8859-1 reference.

If a web page uses a different character-set than ISO-8859-1, it should be specified in the <meta> tag.
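
What goes wrong when the character-set is not known can be shown with a short Python sketch. The sample word is made up; the point is that the same bytes give different text depending on the character-set used to decode them.

```python
# A page saved as UTF-8: the letter é is stored as two bytes.
data = "café".encode("utf-8")

right = data.decode("utf-8")       # decoded with the correct character-set
wrong = data.decode("iso-8859-1")  # decoded with the wrong character-set

print(right)  # café
print(wrong)  # cafÃ©
```

This kind of garbled output is what a visitor sees when the declared character-set does not match the one the page was actually saved in.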

ISO Character Sets

It is the International Organization for Standardization (ISO) that defines the standard character-sets for different alphabets/languages.

The different character-sets being used around the world are listed below:

Character set	Description	Covers
ISO-8859-1	Latin alphabet part 1	North America, Western Europe, Latin America, the Caribbean, Canada, Africa
ISO-8859-2	Latin alphabet part 2	Eastern Europe
ISO-8859-3	Latin alphabet part 3	SE Europe, Esperanto, miscellaneous others
ISO-8859-4	Latin alphabet part 4	Scandinavia/Baltics (and others not in ISO-8859-1)
ISO-8859-5	Latin/Cyrillic part 5	The languages that are using a Cyrillic alphabet such as Bulgarian, Belarusian, Russian and Macedonian
ISO-8859-6	Latin/Arabic part 6	The languages that are using the Arabic alphabet
ISO-8859-7	Latin/Greek part 7	The modern Greek language as well as mathematical symbols derived from the Greek
ISO-8859-8	Latin/Hebrew part 8	The languages that are using the Hebrew alphabet
ISO-8859-9	Latin 5 part 9	The Turkish language. Same as ISO-8859-1 except Turkish characters replace Icelandic ones
ISO-8859-10	Latin 6 Lappish, Nordic, Eskimo	The Nordic languages
ISO-8859-15	Latin 9 (aka Latin 0)	Similar to ISO 8859-1 but replaces some less common symbols with the euro sign and some other missing characters
ISO-2022-JP	Latin/Japanese part 1	The Japanese language
ISO-2022-JP-2	Latin/Japanese part 2	The Japanese language
ISO-2022-KR	Latin/Korean part 1	The Korean language

The Unicode Standard

Because the character-sets listed above are limited in size, and are not compatible in multilingual environments, the Unicode Consortium developed the Unicode Standard.

The Unicode Standard aims to cover all the characters, punctuation marks, and symbols in the world.

Unicode enables processing, storage and interchange of text data no matter what the platform, no matter what the program, no matter what the language.

The Unicode Consortium

The Unicode Consortium develops the Unicode Standard. Its goal is to replace the existing character-sets with its standard Unicode Transformation Format (UTF).

The Unicode Standard has become a success and is implemented in XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, etc. The Unicode standard is also supported in many operating systems and all modern browsers.

The Unicode Consortium cooperates with the leading standards development organizations, like ISO, W3C, and ECMA.

Unicode can be implemented by different character encodings. The most commonly used encodings are UTF-8 and UTF-16:

Character-set	Description
UTF-8	A character in UTF-8 can be from 1 to 4 bytes long. UTF-8 can represent any character in the Unicode standard. UTF-8 is backwards compatible with ASCII. UTF-8 is the preferred encoding for e-mail and web pages
UTF-16	16-bit Unicode Transformation Format is a variable-length character encoding for Unicode, capable of encoding the entire Unicode repertoire. UTF-16 is used in major operating systems and environments, like Microsoft Windows 2000/XP/2003/Vista/CE and the Java and .NET byte code environments
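
The variable length of both encodings is easy to check in Python. The four sample characters below were chosen for this illustration so that each needs a different number of UTF-8 bytes.

```python
# A, é, the euro sign, and an emoji outside the 16-bit range.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]
# "utf-16-le" is used so that no byte order mark is added to the count.
utf16_lengths = [len(ch.encode("utf-16-le")) for ch in samples]

print(utf8_lengths)   # [1, 2, 3, 4]
print(utf16_lengths)  # [2, 2, 2, 4]

# Backwards compatibility with ASCII: the bytes are identical.
print("Hello".encode("utf-8") == "Hello".encode("ascii"))  # True
```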

Tip: The first 256 characters of Unicode character-sets correspond to the 256 characters of ISO-8859-1.
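
The tip above can be verified with a few lines of Python: decoding each of the 256 possible byte values as ISO-8859-1 gives the Unicode character with the same number.

```python
all_bytes = bytes(range(256))
decoded = all_bytes.decode("iso-8859-1")

# Every byte value maps to the Unicode code point with the same number.
matches = all(ord(ch) == i for i, ch in enumerate(decoded))
print(matches)  # True
```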

Tip: All HTML 4 processors already support UTF-8, and all XHTML and XML processors support UTF-8 and UTF-16!