RFC 3986 specifies that the characters in a URL are limited to a set of reserved characters and a set of unreserved characters, all drawn from US-ASCII. No other characters are allowed in URLs. However, URLs often need to carry characters outside these sets, so those characters must be converted into a valid US-ASCII form for global interoperability.
URL encoding (also called percent-encoding) is the process of encoding the information in a URL so that it can be transmitted reliably over the internet.
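Characters from the world's many writing systems are mapped into URLs in two steps: the character is first converted to its UTF-8 byte sequence, and each byte that is not an unreserved character is then written as a percent sign followed by two hexadecimal digits.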
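For instance, the string François would be written as Fran%C3%A7ois.
The letter ç (c-cedilla; uppercase Ç) is a Latin-script letter outside US-ASCII. Its UTF-8 encoding is the two bytes C3 A7, which is why it appears as %C3%A7.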
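The two steps can be sketched by hand in Python. This is a minimal illustration, not a reference implementation: the helper name `percent_encode` and the constant `UNRESERVED` are hypothetical, and the sketch assumes that every byte outside the unreserved set should be encoded.

```python
# Unreserved characters per RFC 3986: letters, digits, "-", ".", "_", "~".
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def percent_encode(text: str) -> str:
    """Hypothetical helper: encode every byte that is not unreserved."""
    out = []
    # Step 1: convert the text to its UTF-8 byte sequence.
    for byte in text.encode("utf-8"):
        char = chr(byte)
        if char in UNRESERVED:
            out.append(char)
        else:
            # Step 2: write the byte as "%" plus two uppercase hex digits.
            out.append(f"%{byte:02X}")
    return "".join(out)


print(percent_encode("François"))  # Fran%C3%A7ois
```

Only the ç changes, because it is the only character whose UTF-8 bytes fall outside the unreserved set.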
Some characters are reserved because they may (or may not) be defined as separators by the generic syntax or by a specific URL scheme. For instance, the forward slash / is used to separate different sections of a URL.
If the data in a URL component contains a character that conflicts with a reserved character defined as a separator in the URL scheme, the conflicting character must be percent-encoded before the URL is formed. The reserved characters in a URL, with their percent-encoded forms, are:
!   | #   | $   | &   | '   | (   | )   | *   | +   | ,   | /   | :   | ;   | =   | ?   | @   | [   | ]
%21 | %23 | %24 | %26 | %27 | %28 | %29 | %2A | %2B | %2C | %2F | %3A | %3B | %3D | %3F | %40 | %5B | %5D
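Each encoding in the table is simply the character's ASCII code in hexadecimal, which a short Python sketch can reproduce. The query value and the `example.com` URL below are made up for illustration; `quote` is the standard-library function from `urllib.parse`.

```python
from urllib.parse import quote

# The 18 reserved characters from the table above.
RESERVED = "!#$&'()*+,/:;=?@[]"

# Each encoding is "%" followed by the ASCII code in uppercase hex.
encodings = {char: f"%{ord(char):02X}" for char in RESERVED}
print(encodings["/"], encodings["&"], encodings["?"])  # %2F %26 %3F

# Hypothetical query value containing the separators "&" and "=".
value = "salt&pepper=yes"
# safe="" asks quote() to encode every reserved character, including "/".
encoded = quote(value, safe="")
print("https://example.com/search?q=" + encoded)
```

Without the encoding step, a parser would most likely read `pepper=yes` as a second query parameter rather than as part of the value.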
Unreserved characters are characters that are allowed in URLs but have no reserved purpose. They include uppercase letters, lowercase letters, decimal digits, hyphens, periods, underscores, and tildes. Here is a table of all the unreserved characters in URLs:
A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
- | _ | . | ~
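As a quick check, the unreserved set can be built from Python's `string` constants and passed through the standard-library `quote` function, which should leave it untouched. This sketch assumes Python 3.7 or later, where `quote` treats the tilde as unreserved.

```python
import string
from urllib.parse import quote

# 26 uppercase + 26 lowercase + 10 digits + 4 marks = 66 characters.
UNRESERVED = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_.~"
)
print(len(UNRESERVED))  # 66

# Unreserved characters never need percent-encoding.
unchanged = quote(UNRESERVED, safe="") == UNRESERVED
print(unchanged)  # True on Python 3.7+
```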
According to RFC 3986, characters are encoded and decoded by the following converter:
Enter a character and click the Encode or Decode button to view the result.
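The same encode and decode round trip can also be sketched in code. The example below assumes Python's standard `urllib.parse` module, and the sample text is invented for illustration.

```python
from urllib.parse import quote, unquote

original = "François & fils"

# Encode: UTF-8 bytes outside the unreserved set become %XX sequences.
encoded = quote(original, safe="")
print(encoded)  # Fran%C3%A7ois%20%26%20fils

# Decode: %XX sequences are turned back into bytes and read as UTF-8.
decoded = unquote(encoded)
print(decoded == original)  # True
```

Note that `quote` writes a space as `%20`. The `+` form seen in some query strings comes from a separate form-encoding convention, which `urllib.parse.quote_plus` implements.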