Properly encoding and escaping for the web
When processing untrusted user input for (web) applications, filter the input and encode the output. That is the most widely given advice for preventing (server-side) injections. Yet it can be deceptively difficult to encode user input properly. Encoding depends on the type of output: a string that will be used in a JavaScript variable, for example, has to be encoded differently from a string that will be used in plain HTML.
When outputting untrusted user input, encode or escape it based on the context, that is, the location in the output where it will end up.
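To make the context dependence concrete, here is a minimal Python sketch that encodes the same untrusted string for two different output locations. The function names encode_for_html and encode_for_js_string are hypothetical helpers made up for this illustration, not part of any framework, and the JavaScript variant assumes the value is placed inside an inline script block as a string literal. Treat it as an illustration under those assumptions rather than a complete defence.

```python
import html
import json


def encode_for_html(value: str) -> str:
    """Encode a value for use in HTML text or a quoted attribute value."""
    # html.escape replaces &, <, > and (with quote=True) both quote characters.
    return html.escape(value, quote=True)


def encode_for_js_string(value: str) -> str:
    """Encode a value as a JavaScript string literal for an inline <script>."""
    # json.dumps yields a double-quoted literal with quotes, backslashes and
    # control characters escaped. It does not escape <, > or &, so a value
    # containing "</script>" could still end the script block early.
    literal = json.dumps(value)
    # Assumption of this sketch: rewriting those three characters as \uXXXX
    # escapes keeps the literal valid while hiding them from the HTML parser.
    return (
        literal.replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )


untrusted = '</script><script>alert("x")</script>'

print(encode_for_html(untrusted))       # for: <p>HERE</p>
print(encode_for_js_string(untrusted))  # for: <script>var s = HERE;</script>
```

The two results look nothing alike, even though the input is identical. Using the HTML-encoded value inside the script block, or the other way round, would either corrupt the data or leave the injection open.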
And what's the difference between escaping and encoding?
Encoding is transforming data from one format into another format.
Escaping is a subset of encoding in which not all characters are transformed. Only the characters that have a special meaning in the target format are encoded (by using an escape character), and everything else is left as it is.
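The distinction can be illustrated with a short Python sketch. Base64 is used here only as a convenient example of an encoding that transforms the whole input; it is an assumption of this example, not a mechanism the text above refers to. HTML escaping, by contrast, touches only the characters that are special in HTML, with the ampersand acting as the escape character.

```python
import base64
import html

text = "Tom & Jerry <3"

# Encoding: the whole string is transformed into another representation.
encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")

# Escaping: only the characters that are special in HTML are replaced.
escaped = html.escape(text)

print(encoded)
print(escaped)  # Tom &amp; Jerry &lt;3

# Both transformations are reversible.
print(base64.b64decode(encoded).decode("utf-8") == text)
print(html.unescape(escaped) == text)
```

In the escaped output most of the original string is still readable, while the Base64 output shares no visible structure with the input.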
There are quite a number of encoding mechanisms, which make this more difficult than it might look at first glance.
URL encoding
URL encoding is a method to encode information in a Uniform Resource Identifier. There is a set of reserved characters, which have a special meaning, and a set of unreserved (safe) characters, which can be used as they are. If a character is reserved, then the …