
  • (With the historical perspective behind us, we dive into HTML5. This series concludes on Wednesday.)

    Security (and Privacy) From HTML5

    Most HTML5 security checklists rehash the recommendations and warnings from the specs themselves. It’s always a good sign when specs acknowledge security and privacy. Getting to that point isn’t trivial. There were two detours on the way to HTML5. WAP was a first stab at putting the web on mobile devices when mobile devices were dumb. And one of its first failings was the lack of cookie support.

    XHTML was another blip on the radar. Its only improvement over HTML seemed to be that mark-up could be parsed under a stricter XML interpreter so typos would be more easily caught. XHTML caught on as a cool thing to do, but most sites served it with a text/html MIME type that completely negated any difference from HTML in the first place. Herd mentality ruled the day on that one.

    CSRF and clickjacking are called out as security concerns in the HTML5 spec. For some developers, that may have been the first time they heard about such vulns even though they’re fundamental to how the web works. They’re old, old vulns. The good news is that HTML5 has some design improvements that might relegate those vulns to history.

    The <video> element doesn’t speak to security; it highlights the influence of non-technical concerns on a standard. The biggest drama around this element was choosing whether an explicit codec should be mandated.

    WebGL is an example of pushing beyond the browser into graphics cards. The hardware for these cards doesn’t care about the Same Origin Policy or, for that matter, security. Early versions of the spec had two major problems: denial of service and information leakage. It was refreshing to see privacy (information leakage) receive such attention. As a consequence of these risks, browsers pulled support. Early implementations allowed researchers to find these problems and improve WebGL. Part of its revision tied it to another HTML5 security mechanism: Cross Origin Resource Sharing (CORS).

    Like WebGL, the WebSocket API is another example where browsers implemented an early draft, revoked support due to security concerns, and now offer an improved version. For example, the WebSocket protocol includes a handshake and masking to prevent the kind of cross-protocol attacks that caused early web browsers to block access to ports like SMTP and telnet.

    These examples show us a few things. One, we shouldn’t be surprised at the tensions from competing desires during the drafting process. Two, secure design takes time. (Remember PHP?) And three, browser developers are pushing the curve on security.

    It’s only a matter of time before XSS rears its ugly head during a discussion of web security. After all, HTML injection has tormented developers from the beginning. Early examples of malicious HTML used LiveScript, the original name for JavaScript. In 1995 Netscape offered a Bug Bounty for its browser. The winning exploit exposed a privacy hole and netted $1000. Interestingly, the runner-up was a crypto timing attack that could, for example, reveal the secret key of an SSL server. Even if RSA has a secure design in terms of cryptographic primitives, vulns will appear in its implementation. That was merely a hint of the trouble to come for SSL/TLS.

    Anyway, that was a nice $1000 bug in 1995. HTML injection continued to grow, with one of the first hacks demonstrated against a web-based email system in 1998. Behold, the mighty <img> tag using a javascript: URI to pop up a login prompt. That was just a few years after the term phishing had been coined.

    So is there really an HTML5 injection? What terrible flaws does the new standard contain that its predecessors did not?

    Not much. An important improvement from HTML5 is that parsing HTML documents is codified with instructions on order of operations, error handling, and fixup steps. A large portion of XSS history involves payloads that exploit browser quirks or bizarre parsing rules.

    A key component of the infamous Samy worm’s success was Internet Explorer’s “fix up” of a javascript: token split by a newline character (i.e. java\nscript) into a single, valid URI. A unified approach to parsing HTML should minimize these kinds of problems, or at least make it easier to test for them. Last year a bug was found in Firefox’s parsing of HTML entities when a NULL byte (%00) was present. That was an implementation error; HTML5 actually provides instructions on how that entity should have been handled. The persistent danger will be a browser’s legacy support and non-standards (or relaxed-standards) mode.
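
    To make that concrete, here’s a minimal sketch (the regex and payload are illustrative, not from any particular site’s filter) of how a deny list misses a split token that a lenient parser later reassembles:

        // A naive deny list that looks for the literal "javascript:" scheme.
        var denyList = /javascript:/i;

        // The Samy-era payload splits the token with a newline.
        var payload = 'java\nscript:alert(1)';

        denyList.test(payload);          // false -- the filter is satisfied
        payload.replace(/\s+/g, '');     // "javascript:alert(1)" -- what a
                                         // lenient parser effectively rebuilds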

    Sites that rely on weak deny lists will suffer the most from the arrival of HTML5. HTML5 has new elements and new attributes that provide JavaScript execution contexts. If your site relies on fancy regexes to strip out all the cool hacks from the XSS cheat sheets you’ve been scouring, then it’s likely to miss the new tags of HTML5.
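
    A hypothetical filter makes the gap obvious; the tag list and payload below are illustrative:

        // A deny list written before HTML5: strips the "dangerous" tags it knows about.
        function sanitize(html) {
          return html.replace(/<\/?(script|iframe|object|embed)[^>]*>/gi, '');
        }

        // An HTML5 payload the filter never heard of -- no script tag required:
        sanitize('<video src=x onerror=alert(1)>');
        // returns the payload untouched, because <video> isn't on the list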

    The initial excitement around HTML5-based XSS was the autofocus attribute. A common reflection point for HTML injection is the value of an <input> element. Depending on the kind of payload injected, an exploit would require the victim to perform some action (submit the form, click a field, etc.). The autofocus attribute lets an exploit automatically execute JavaScript tied to an onfocus or onblur event.
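
    As a sketch, the classic illustrative payload turns a passive reflection into an automatic one:

        // Injected into the value context of an <input>, this payload closes
        // the attribute and adds autofocus so onfocus fires without any
        // user interaction:
        var payload = '" autofocus onfocus="alert(document.cookie)';
        // Reflected result:
        // <input type="text" value="" autofocus onfocus="alert(document.cookie)">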

    There’s a cynical perspective that HTML5 will bring a brief period of worse XSS problems, caused by developers who embrace HTML5’s enhanced form validation while forgetting to apply server-side validation. There’s nothing misleading about HTML5’s approach to this. More pre-defined <input> types and client-side regexes improve the user experience. They’re not intended to be a security barrier; they’re a usability enhancement, especially for browsers on mobile devices.

    HTML5 offers distressingly few ways to minimize the impact of XSS attacks: <iframe> sandboxing and Cross Origin Resource Sharing controls. They help, but they don’t fundamentally change the design of the Same Origin Policy, which has the drawback that all content within an Origin receives equal treatment. Rather than providing least-privilege access, it’s a binary, all-or-nothing privilege. That’s unappetizing for modern web apps that wish to implement everything from mashups to advertising to running third-party JavaScript within a trusted Origin.
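
    A minimal sketch of the sandbox attribute (the URL is a placeholder; the tokens come from the spec):

        // Embed untrusted content with its privileges stripped:
        var frame = document.createElement('iframe');
        frame.src = 'https://untrusted.example/widget.html';

        // An empty sandbox denies scripts, forms, plugins, and top-level navigation.
        frame.setAttribute('sandbox', '');

        // Tokens selectively restore privileges; omitting allow-same-origin keeps
        // the content in a unique Origin even when scripts are allowed:
        // frame.setAttribute('sandbox', 'allow-scripts');

        document.body.appendChild(frame);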

    The Content Security Policy (CSP) introduces design-level countermeasures for vulns like XSS. CSP moved from a Mozilla project to a standards track for all browsers to implement. A smart design choice is providing monitor and enforcement modes. Its implementation will likely echo that of early web app firewalls: CSP’s complexity has the potential to break sites, so expect monitor mode to last for quite a while before sites start enforcing rules. The ability to switch between monitor and enforce is a sign of design that encourages adoption: make it easier for devs to test policies over time.
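
    Here’s a sketch of the monitor-then-enforce pattern on a bare Node server. The policy is a minimal illustration, and the unprefixed header name assumes a browser that implements the standards-track version (early implementations used vendor-prefixed names):

        var http = require('http');

        http.createServer(function (req, res) {
          // Monitor mode: violations are reported to /csp-report, nothing breaks.
          res.setHeader('Content-Security-Policy-Report-Only',
                        "default-src 'self'; report-uri /csp-report");
          // Once the reports quiet down, the same policy moves to enforcement:
          // res.setHeader('Content-Security-Policy',
          //               "default-src 'self'; report-uri /csp-report");
          res.end('<!doctype html><p>hello</p>');
        }).listen(8000);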

    HTML injection deserves emphasis since it’s the most pervasive problem for web apps. But it’s not the only problem for web apps. Other pieces of HTML5 have equally serious concerns.

    The Web Storage API adds key-value storage to the browser. It’s effectively a client-side database. Avoid the immediate jump to SQL injection whenever you hear the word database; the concern here is privacy extraction. Web Storage has already been demonstrated as yet another tool for insinuating supercookies into the browser. In an era when developers still neglect to encrypt passwords in server-side databases, consider the mistakes awaiting data placed in browser databases: personal information, credit card numbers, password recovery, and more. All of it just an XSS away from being exfiltrated. And XSS isn’t the only threat. Malware has already demonstrated the inclination to scrape hard drives for financial data, credentials, keys, etc. An unencrypted store of 5MB (or more!) of data is an appealing target. Woe to the web developer who thinks Web Storage is a convenient place to store a user’s password.
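
    The API is trivially simple, which is exactly the problem; a sketch (the keys and values are illustrative):

        // Web Storage is one function call away -- and so is the mistake.
        localStorage.setItem('theme', 'dark');         // harmless preference
        localStorage.setItem('password', 'hunter2');   // the mistake in question

        // Any script running in the same Origin -- including an XSS payload --
        // can walk the entire store:
        for (var i = 0; i < localStorage.length; i++) {
          var key = localStorage.key(i);
          console.log(key + ' = ' + localStorage.getItem(key));
        }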

    The WebSocket API entails a different kind of security. The easy observation is that it should use wss:// instead of ws://, just like HTTPS should be everywhere. The subtler problem lies with the protocol layered over a WebSocket connection.

    Security controls like HTTPS, the Same Origin Policy, and session cookies don’t automatically transfer to WebSockets. For example, consider a simple chat protocol. Each message includes the usernames of the sender and recipient. If the server just routes messages based on usernames without verifying that the sender’s name matches the WebSocket connection it arrived on, then it’d be trivial to spoof messages. Or consider if the app does verify the sender and recipient, but users’ session cookies are used to identify them. If the recipient receives a message packet that contains the sender’s session ID – well, I hope you see the insecurity there.
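
    Here’s a sketch of the routing check that chat example needs. It assumes Node with the ws package, and the message format and helper functions are hypothetical:

        var WebSocketServer = require('ws').Server;
        var wss = new WebSocketServer({ port: 8080 });

        // Illustrative stubs: real code would check the session cookie from
        // the handshake and track one socket per authenticated user.
        function authenticate(request) { return 'alice'; }
        var connections = {};
        function deliverTo(user, message) {
          if (connections[user]) connections[user].send(JSON.stringify(message));
        }

        wss.on('connection', function (socket) {
          // Bind the username to the connection once, at handshake time.
          // Never trust a name that arrives inside a message.
          var username = authenticate(socket.upgradeReq);
          connections[username] = socket;

          socket.on('message', function (data) {
            var msg = JSON.parse(data); // e.g. { from: 'alice', to: 'bob', text: 'hi' }
            if (msg.from !== username) return; // spoofed sender: drop it
            deliverTo(msg.to, { from: username, text: msg.text });
          });
        });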

    If there’s one victim of the HTML5 arms race, it’s the browser exploit. Not that exploits have disappeared, but they’ve become more complex. A byproduct of keeping up with (relatively) quickly changing drafts is that modern browsers are quicker to update. More importantly, self-updating browsers share a set of features like plugin sandboxing, process separation, and even rudimentary XSS protection. Whatever your choice of browser, the only version number you need anymore is HTML5.

    That’s the desire. In practice, accelerating browser updates isn’t going to adversely affect the Pwn2Own and exploit-writing communities any time soon. IE6 refuses to disappear from the web. Qualys’ BrowserCheck stats show that browsers still tend to be out of date. Worse, the plugins remain out of date even when the browser is patched. In other words, Flash and Java deserve the finger-pointing for exposing security holes. When was the last time Adobe released a non-critical Flash update?

    Browser security isn’t restricted to internal code. A header like X-Frame-Options offers an easy defense against clickjacking. New HTML5 capabilities like the sandbox attribute for iframes would defeat JavaScript-based frame busters intended to block clickjacking. With one fell swoop of security design (and adding a single header at your web server), it should be possible to get rid of an entire class of vulnerability. The catch is getting sites to implement it.
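
    The defense really is a single header; a sketch on a bare Node server:

        var http = require('http');

        http.createServer(function (req, res) {
          // DENY refuses all framing; SAMEORIGIN permits frames from the same site.
          res.setHeader('X-Frame-Options', 'DENY');
          res.end('not frameable');
        }).listen(8000);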

    The browser needs the complicity of sites in order for a feature like X-Frame-Options to matter. It’s one thing to scrutinize the design of a half-dozen or so web browsers. It’s quite another to consider the design of millions and millions of web sites.

    There is a looming XSS threat, but it’s a byproduct of the ecosystem building around HTML5. Heavy JavaScript libraries have become major components of modern web apps. JavaScript is a challenging environment for security. Its interaction with the DOM is restricted by the Same Origin Policy. On the other hand, its prototype-based design and global namespace leave every script on a page able to affect every other.

    JavaScript libraries are great. They reinforce good programming patterns and provide functionality that would otherwise have to be created from scratch. The flip side of libraries is that they offer additional exploit vectors and need to be maintained.

    Let’s return to the idea of deny lists to discuss the other insidious aspect of XSS. These libraries also have functions that expose eval(), DOM manipulation, and XHR calls, among others. By no means is there anything insecure or inadvisable about this. All it does is magnify the impact if an XSS vuln already exists on the site – which isn’t likely to be from the JavaScript library itself.
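
    As an illustrative sketch (assuming a page that loads jQuery; the selector and URLs are placeholders), the same conveniences become sinks once attacker-controlled data reaches them:

        // Attacker-influenced input arriving via the fragment:
        var name = decodeURIComponent(location.hash.slice(1));

        // Convenient -- and a DOM-manipulation sink if "name" carries HTML:
        $('#greeting').html('Hello, ' + name);

        // eval-equivalents hiding behind friendly names:
        $.getScript('https://cdn.example/widget.js');  // fetch-and-execute
        $.globalEval('render()');                      // eval() with a nicer spelling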

  • (The series continues with a look at the relationship between security and design in web-related technologies prior to HTML5. Look for part 3 on Monday.)

    Security From Design

    The web has had mixed success with software design and security. Before we dive into HTML5, consider some other web-related examples:

    PHP superglobals and the register_globals setting exemplify the long road to creating something that’s “default secure.” PHP 4.0 appeared 12 years ago in May 2000, just on the heels of HTML4 becoming official. PHP allowed a class of global variables to be set from user-influenced values like cookies, GET, and POST parameters. What this meant was that if a variable wasn’t initialized in a PHP file, a hacker could set its initial value just by including the variable name in a URL parameter. (Leading to outcomes like bypassing security checks, SQL injection, and accessing other users’ accounts.) Another problem with register_globals was that it was a run-time configuration controllable by the system administrator. In other words, code secure in one environment (secure, but poorly written) became insecure (and exploitable) simply because a site administrator switched register_globals on in the server’s php.ini file. Security-aware developers tried to influence the setting from their code, but that created new conflicts: you’d run into situations where one app depended on register_globals behavior while another required it to be off.
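
    PHP is the subject here, but the failure mode translates to any language; a hypothetical JavaScript sketch of the same class of bug:

        // A stand-in for register_globals: every query parameter becomes a global.
        // (globalThis is window in a browser, global in Node.)
        function registerGlobals(params) {
          Object.keys(params).forEach(function (name) {
            globalThis[name] = params[name];   // user input now lives in global scope
          });
        }

        // Application code that forgot to initialize "authorized" first:
        registerGlobals({ user: 'mallory', authorized: '1' }); // from ?user=...&authorized=1
        if (typeof authorized !== 'undefined' && authorized === '1') {
          console.log('access granted'); // a check the attacker defined into existence
        }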

    Secure design is far easier to discuss than it is to deploy. It took two years for PHP to switch the default value to off. It took another seven to deprecate it. Not until this year was it finally abolished. One reason for this glacial pace was PHP’s extraordinary success. Changing default behavior or removing an API is difficult when so many sites depend upon it and programmers expect it. (Keep this in mind when we get to HTML5. PHP drives many sites on the web; HTML is the web.) Another reason for this delay was resistance by some developers who argued that register_globals isn’t inherently bad, it just makes already bad code worse. Kind of like saying that bit of iceberg above the surface over there doesn’t look so big.

    Such attitudes allow certain designs, once recognized as poor, to resurface in new and interesting ways. Thus, “default insecure” endures. The Ruby on Rails “mass assignment” feature is a recent example. Mass assignment is an integral part of Rails’ data model. Warnings about the potential insecurity were raised as early as 2005 – in Rails’ security documentation, no less. Seven years later, in March 2012, a developer demonstrated the hack against the Rails paragon, GitHub, by showing that he could add his public key to any project and therefore impact its code. The hack provided an exercise for GitHub to embrace a positive attitude towards bug disclosure (eventually). It finally led to a change in the defaults for Ruby on Rails.
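
    The same default-insecure shape appears anywhere a record is bound wholesale from a request; a hypothetical JavaScript sketch:

        // Mass assignment, JavaScript style: bind the whole request body to a record.
        function updateProfile(user, requestBody) {
          Object.assign(user, requestBody);   // every key the client sent is trusted
        }

        var user = { name: 'alice', admin: false };
        // The form only shows "name", but nothing stops the client from sending:
        updateProfile(user, { name: 'alice', admin: true });
        console.log(user.admin); // true -- privilege granted by the attacker

        // The fix mirrors Rails' eventual default: a whitelist of permitted keys.
        function updateProfileSafely(user, body) {
          ['name', 'email'].forEach(function (key) {
            if (key in body) user[key] = body[key];
          });
        }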

    SQL injection has to be mentioned if we’re going to talk about vulns and design. Prepared statements are the easy, recommended countermeasure for this vuln. You can pretty much design it out of your application. Sure, implementation mistakes happen and a bug or two might appear here and there, but that’s the kind of programming error that happens because we’re humans who make mistakes. Avoiding prepared statements is nothing more than advanced persistent ignorance of at least six years of web programming. A tool like sqlmap stays alive for so long because developers don’t adopt basic security design. SQL injection should be a thing of the past. Yet “developer insecure” is eternal.
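
    For reference, the countermeasure in a sketch, assuming Node with the mysql package (the query shape is the point, not the driver):

        var mysql = require('mysql');
        var db = mysql.createConnection({ host: 'localhost', user: 'app', database: 'shop' });

        var id = "42' OR '1'='1";   // hostile input standing in for a request parameter

        // The classic injection: concatenation hands the input to the SQL parser.
        // db.query("SELECT * FROM users WHERE id = '" + id + "'");

        // The parameterized version: the value is bound, never parsed as SQL.
        db.query('SELECT * FROM users WHERE id = ?', [id], function (err, rows) {
          if (err) throw err;
          console.log(rows);
        });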

    But I’m not bringing up SQL injection to rant about its tenacious existence. Like the heads of the hydra, where one SQL injection is lopped off, another takes its place. The NoSQL (or anything-but-a-SQL-server) movement has the potential to reinvent these injection problems. Rather than SELECT statements, developers will be crafting filters with JavaScript, effectively sending eval() statements between the client and server. This isn’t a knock against choosing JavaScript as part of a design; it’s the observation that executable code is originating in the browser. When code and data mix, vulns happen.
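
    A sketch of how the problem resurfaces, using MongoDB’s $where operator (which evaluates a JavaScript expression against each document) as one example:

        var name = "x' || '1'=='1";   // hostile input standing in for a request parameter

        // Building a $where filter by concatenation recreates SQL injection,
        // with JavaScript as the injected language:
        var filter = { $where: "this.name === '" + name + "'" };
        // Evaluates as: this.name === 'x' || '1'=='1'  -- matches every document.

        // The injection-free shape: let the driver match values, not code.
        var safeFilter = { name: name };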

    Then there’s JavaScript itself. ECMAScript, for the purists out there. At a high level, JavaScript’s variables exhibit the global scope of PHP’s superglobals. Its prototype system is reminiscent of Rails’ mass assignment. Its eval() function wreaks the same havoc as SQL or command injection. And we need it.
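
    Each of those traits fits in a line or two of illustrative code:

        // Global scope: a forgotten "var" quietly publishes a variable to every
        // script on the page the moment track() is called.
        function track() { visitor = 'alice'; }

        // Prototypes: any script can redefine behavior for every object around.
        Array.prototype.push = function () { /* now every push() is suspect */ };

        // eval(): data becomes code the moment it touches the wrong function.
        var response = '{"total": 42}';        // imagine this arrived over the network
        var data = eval('(' + response + ')'); // one crafted response away from trouble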

    JavaScript is fundamental to the web. Fundamental to HTML5. And for all the good it brings to the browsing experience, some unfortunate insecurities lurk within it. Forget VBScript, skip Google’s Dart, JavaScript is the language for browser computing. Good enough, in fact, that it has leapt the programming chasm from the browser to server-side code. If we were to tease developers that PHP stood for Pretty Horrible Programming, then Node must stand for New Orders of Developer Error. Note the blame for insecurity falls on the developers, not the choice of programming language or technology. (Although you’d be crazy to expose a node.js server directly to the internet.)

    Basic web technologies didn’t start off much better than the server-side technologies we’ve just sampled. Cookies grew out of implementation, not specification. That’s one reason for their strange relationship with browser security. Cookies have a path attribute that’s effectively useless; it’s an ornamentation that has no bearing on Origin security. The httponly and secure attributes affect a cookie’s availability to JavaScript and to https:// connections, respectively. Sometimes you have access to a cookie; sometimes you don’t. These security controls differ from other internal browser security models in that they rely on domain policy rather than Origin policy. Many of HTML5’s features are tied to the Same Origin Policy rather than a domain policy because the Origin has more robust integration throughout the browser.
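
    The attributes in question, as they appear on the wire (values illustrative), set from a bare Node server:

        var http = require('http');

        http.createServer(function (req, res) {
          // Path scopes the cookie but provides no security boundary; HttpOnly
          // hides it from document.cookie; Secure restricts it to https://.
          res.setHeader('Set-Cookie',
                        'sid=abc123; Path=/; HttpOnly; Secure; Domain=example.com');
          res.end('cookie set');
        }).listen(8000);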

    The Same Origin Policy is a core of browser security. One of its drawbacks is that it permits pages to request any resource – which is why the web works in the first place, but also why we have problems like CSRF. Another drawback is that sites had to accept its all-or-nothing approach – which is why problems like XSS are so worrisome.
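
    The canonical illustration (the URLs are hypothetical): any page may issue the request, and the browser helpfully attaches the victim’s cookies:

        // Hosted on attacker.example and loaded by a logged-in victim:
        var img = new Image();
        img.src = 'https://bank.example/transfer?to=mallory&amount=1000';
        // The browser sends bank.example's session cookie with the request;
        // without a CSRF token, the server can't tell this from a real click.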

    User Agent sniffing and HTTPS are two other examples of behavior that’s slow to change. Good JavaScript programming patterns prefer feature detection rather than making assumptions based on a single, convoluted User-Agent string. In spite of the problems around SSL/TLS, there’s no reason HTTP should be the default connection type for web sites. Using HTTPS places a site – and its users – in a far stronger security stance.
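
    The difference in a few lines (the version check is the anti-pattern):

        // Fragile: parse a convoluted, spoofable string and guess.
        if (/Chrome\/1[3-9]/.test(navigator.userAgent)) {
          // ...assume WebSockets work
        }

        // Robust: ask the browser whether the feature exists.
        if ('WebSocket' in window) {
          var socket = new WebSocket('wss://example.com/updates');
        }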

  • (This is the first part in a series of articles that accompany my Security Summit presentation, HTML5 Unbound: A Security & Privacy Drama.)

    The Meaning & Mythology of HTML5

    HTML5 is the most comprehensive update in the last 12 years to a technology that’s basically twenty years old. It’s easy to understand the excitement over HTML5 by looking at the scope and breadth of the standard and its related APIs. It’s easy to understand the significance of HTML5 by looking at how many sites and browsers implement something that’s officially still in draft.

    It’s also easy to misunderstand what HTML5 means for security. Is it really a whole new world of cross-site scripting? SQL injection in the browser? DoS attacks with Web Workers and WebSockets? Is there something inherent to its design that solves these problems? Or worse, does it introduce new ones?

    We arrive at some answers by looking at the history of security design on the web. Other answers require reviewing what HTML5 actually encompasses and the threats we expect it to face. If we forget to consider how threats have evolved over the years, then we risk giving a thumbs up to a design that merely works against hackers’ rote attacks rather than their innovation.

    First let’s touch on the meanings of HTML5. A simple definition is a web page with a <!doctype html> declaration. In practice, this means new elements for interacting with audio and video; elements for drawing, layout, and positioning content; as well as APIs that have their own independent specifications. These APIs emerge from a nod towards the real-world requirements of web applications: ways to relax the Same Origin Policy (without resorting to insecure programming hacks like JSONP), bidirectional messaging (without resorting to programming hacks like long-polling), and increased data storage for key/value pairs (without resorting to programming hacks that use cookies or plugins). It also includes items like Web Workers to help developers efficiently handle the increasing amount of processing being pushed into the browser.

    There’s a mythology building around HTML5 as well. Some of its myths are innocuous. The web continues to be an integral part of social interaction, business, and commerce because browsers are able to perform with desktop-like behaviors regardless of what your desktop is. So it’s easy to dismiss labels like “social” and “cloud” as imprecise, but mostly harmless. Some myths are clearly off the mark: neither Flash nor Silverlight is HTML5, but their UI capabilities are easily mistaken for the kind of dynamic interaction associated with HTML5 apps. In truth, HTML5 intends to replace the need for plugins altogether.

    Then there are counter-productive mythologies that creep into HTML5 security discussions. The mechanics of CSRF and clickjacking are inherent to the design of HTML and HTTP. In 1998, according to Netcraft, there were barely two million sites on the web; today Netcraft counts close to 700 million. It took years for vulns like CSRF and clickjacking to be recognized, described, and popularized before their dangers were appreciated. Hacking a few hundred million users with CSRF has vastly different rewards than hacking a few hundred thousand, and consequently more appeal. If CSRF is to be conflated with HTML5, it’s because the spec acknowledges security concerns more explicitly than its ancestors ever did. HTML5 mentions security over eighty times in its current draft. HTML4 barely broke a dozen. Privacy? It showed up once in the HTML4 spec. (HTML5 gives privacy a little more attention.) We’ll address that failing in a bit.

    So, our stage is set. Our players are design and implementation. Our conflict, security and privacy.