Understanding HTML Injection

HTML injection is a basic security weakness in which data (information like an email address or first name) and code (the grammar of a web page, such as the creation of script tag elements) mix in undesirable ways.

An XSS attack rewrites the structure of a web page or executes arbitrary JavaScript within the victim’s web browser. XSS comes into play when a visitor can use characters normally reserved for HTML markup in input that the page echoes back, such as a search query.

An important aspect of XSS attacks is that the context in which the payload is echoed determines the characters required to hack the page. In some cases, new elements can be created, such as a script tag or an iframe. In other cases, an element’s attribute might be modified. If the payload shows up within a JavaScript variable, then the payload need only consist of code.
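To make the point about context concrete, here is a minimal Python sketch of three deliberately unsafe templates. The function names, the field name q, and the alert(1) payloads are assumptions chosen for illustration; they do not come from the book. Each payload only needs the characters that its own context requires.

```python
# Three deliberately UNSAFE templates (hypothetical names) that echo
# untrusted input into different rendering contexts.

def render_element(value: str) -> str:
    # Element content: the payload has to create a new element.
    return f"<p>Results for {value}</p>"

def render_attribute(value: str) -> str:
    # Attribute value: the payload has to close the quote first.
    return f'<input type="text" name="q" value="{value}">'

def render_js_variable(value: str) -> str:
    # Unquoted JavaScript assignment: the payload is just code.
    return f"<script>var q = {value};</script>"

element_payload = "<script>alert(1)</script>"  # needs angle brackets
attribute_payload = '" onfocus="alert(1)'      # needs a double quote
js_payload = "alert(1)"                        # needs no markup characters

print(render_element(element_payload))
print(render_attribute(attribute_payload))
print(render_js_variable(js_payload))
```

The third case is the reason a filter that only looks for angle brackets or quotes is not enough: the payload contains neither.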

More vicious payloads have been demonstrated to:

  • steal cookies so attackers can impersonate victims without having to steal passwords;
  • spoof login prompts to steal passwords (attackers like to cover all the angles);
  • capture keystrokes for banking, e-mail, and game websites;
  • use the browser to port scan a local area network;
  • surreptitiously reconfigure a home router to drop its firewall;
  • automatically add random people to your social network;
  • lay the groundwork for a Cross-Site Request Forgery (CSRF) attack.

Form Fields

Forms collect information from users, which immediately makes the supplied data tainted. The obvious injection points are the fields that users are expected to fill out, such as login name, e-mail address, or credit card number. Less obvious are the fields that users are not expected to modify, such as input type=hidden or input fields with the disabled attribute. A common mistake among naive developers is to assume that if the user can’t modify the form field in the browser, then the form field can’t be modified.
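The sketch below illustrates why hidden or disabled fields are still tainted: the request body is just text, and a client can build it by hand without ever rendering the form. The field names, the CATALOG lookup, and the helper functions are hypothetical, used only to show the idea.

```python
from typing import Dict, Optional
from urllib.parse import parse_qs, urlencode

# Hypothetical server-side source of truth for prices.
CATALOG = {"1001": "49.99"}

# Values the server placed in the form; "price" was sent as a hidden field.
served_form = {"item_id": "1001", "price": "49.99", "quantity": "1"}

# Nothing ties the request to the rendered form, so a client can send
# any value it likes for the "hidden" field.
tampered_body = urlencode(
    {"item_id": "1001", "price": '"><script>alert(1)</script>', "quantity": "1"}
)

def parse_submission(body: str) -> Dict[str, str]:
    """Decode a form-encoded body, keeping the first value of each field."""
    fields = parse_qs(body, keep_blank_values=True)
    return {name: values[0] for name, values in fields.items()}

def trusted_price(submitted: Dict[str, str]) -> Optional[str]:
    """Ignore the submitted price and look it up on the server instead."""
    return CATALOG.get(submitted.get("item_id", ""))

submitted = parse_submission(tampered_body)
print(submitted["price"])        # attacker-controlled, despite being hidden
print(trusted_price(submitted))  # value the server should actually use
```

Treating every submitted field as untrusted, including the ones the browser does not let the user edit, is the design choice this sketch is meant to highlight.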

A common example of this attack vector is when the site populates a form field with a value previously supplied by the user. We already used an example of this at the beginning of the chapter. In another case, the user inserts a quotation mark and closing bracket (">) in order to close the input tag and create a new script element.
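As a rough illustration of that case, the following Python sketch re-populates an input field with a user-supplied value, first without escaping and then with html.escape from the standard library. The function names and the email field are assumptions for the example, not code from the book.

```python
import html

def render_email_field_unsafe(email: str) -> str:
    # Echoes the previous value verbatim into the attribute.
    return f'<input type="text" name="email" value="{email}">'

def render_email_field(email: str) -> str:
    # quote=True also encodes " and ', so the value cannot close the attribute.
    return f'<input type="text" name="email" value="{html.escape(email, quote=True)}">'

payload = '"><script>alert(1)</script>'

print(render_email_field_unsafe(payload))
print(render_email_field(payload))
```

In the unsafe version the quotation mark and closing bracket end the input tag and the script element becomes part of the page. In the escaped version the same characters stay inside the value attribute as text.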

Element Attributes

HTML element attributes are fundamental to creating and customizing web pages. Two attributes relevant to HTML injection attacks are href and value.

The single- and double-quote characters are central to escaping the context of an attribute value. As we’ve already seen in examples throughout this chapter, a simple HTML injection technique prematurely terminates the attribute, then inserts arbitrary HTML to modify the DOM.

All elements can have custom attributes, but these serve little purpose for code execution hacks. The primary goal when attacking this rendering context is to create an event handler, or to terminate the element and create a script tag.
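A minimal sketch of defending the attribute context, assuming a link whose href comes from user input. The allow-list of schemes, the "#" fallback, and the function names are assumptions made for this example. It combines two checks: escaping quotes so the value cannot add an event handler, and rejecting schemes such as javascript: that execute code without leaving the attribute at all.

```python
import html
from urllib.parse import urlsplit

# Assumption for this sketch: only absolute http(s) links are acceptable.
ALLOWED_SCHEMES = {"http", "https"}

def safe_href(url: str) -> str:
    """Return an escaped href value, or "#" when the URL is not acceptable."""
    candidate = url.strip()
    try:
        scheme = urlsplit(candidate).scheme.lower()
    except ValueError:
        return "#"  # malformed URL, treat as untrusted
    if scheme not in ALLOWED_SCHEMES:
        return "#"  # javascript:, data:, and relative URLs are rejected here
    return html.escape(candidate, quote=True)

def render_link(url: str, label: str) -> str:
    return f'<a href="{safe_href(url)}">{html.escape(label)}</a>'

print(render_link("javascript:alert(1)", "profile"))
print(render_link('https://example.com/" onmouseover="alert(1)', "profile"))
print(render_link("https://example.com/?a=1&b=2", "profile"))
```

Rejecting relative URLs is stricter than many sites need; it keeps the sketch short rather than suggesting a general rule.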

Why Encoding Matters for HTML Injection

The previous discussions of percent-encoding detoured from XSS with demonstrations of attacks against the web application’s programming language (e.g., Perl, Python, and %00) or against the server itself (IIS and %c0 %af). These detours through the characters in a URI emphasize how character encoding schemes can be used to bypass security checks.

The angle brackets (< and >), quotes, and parentheses are the usual prerequisites for an XSS payload. If the attacker needs one of those characters and the site filters it, then the focus of the attack switches to control characters such as NULL and to alternate encodings that bypass the web site’s security filters.

Probably the most common reason XSS filters fail is that the input string isn’t correctly normalized.
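The following Python sketch shows one way a filter can fail for lack of normalization: it inspects the raw string, while a later layer percent-decodes it. The filter functions, the "<script" check, and the choice to decode repeatedly up to a fixed number of rounds are all assumptions for illustration, not a recommended filter design.

```python
from urllib.parse import unquote

def naive_filter(raw: str) -> bool:
    """Flawed check: looks at the raw string before any decoding."""
    return "<script" not in raw.lower()

def normalize(raw: str, max_rounds: int = 5) -> str:
    """Percent-decode until the value stops changing (or rounds run out)."""
    current = raw
    for _ in range(max_rounds):
        decoded = unquote(current)
        if decoded == current:
            break
        current = decoded
    return current

def filter_after_normalizing(raw: str) -> bool:
    """Same check, applied to the normalized form of the input."""
    return "<script" not in normalize(raw).lower()

encoded = "%3Cscript%3Ealert(1)%3C/script%3E"
double_encoded = "%253Cscript%253Ealert(1)%253C/script%253E"

for sample in (encoded, double_encoded):
    print(naive_filter(sample), filter_after_normalizing(sample))
```

Both samples pass the naive check and fail the normalized one. Even so, a blocklist like this only demonstrates the normalization problem; encoding output for the context in which it is rendered remains the more dependable defense.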

Often the impact of an HTML injection hack is limited only by the hacker’s imagination or effort. Whether you believe your app has little to risk from an XSS attack because it doesn’t collect credit card data (supposedly!), or you believe that alert() windows are merely a nuisance, the fact remains that a bug exists within the web application: a bug that should be fixed and that, depending on the craftiness of the attacker, will be put to good use in surprising ways.

Bibliographic Information

Hacking Web Apps

By: Mike Shema
Publisher: Syngress
Pub. Date: October 11, 2012
Print ISBN-13: 978-1-59749-951-4
Web ISBN-13: 978-1-59749-956-9

These are notes I made after reading this book.

Just to let you know, this page was last updated Tuesday, Mar 19 24