Douglas Crockford: Principles of Security


Uploaded by yuilibrary on 09.04.2012

Transcript:
>> ALLEN RABINOVICH: Welcome. He really doesn't need any introduction so I'm just ad-libbing:
Mr. Douglas Crockford!
[applause]
>> DOUGLAS CROCKFORD: Good evening. Glad to see you all here. Even though I need no introduction,
I feel compelled to introduce myself anyway. I'm Douglas Crockford of Yahoo! Yeah, this
is my Yahoo! ID, okay. But there's more. This is my California driver's license, okay? This
is my United States passport. All right, okay? Now that you have verified my credentials,
you now must believe everything I tell you, you must obey all of my orders, and when asked,
you must reveal all your secrets. It makes sense because we've gone through this identification
ritual.
Oh, one other: this is my identity membership card from the Cosmopolitan Hotel in Las Vegas.
Las Vegas is one of my most favorite cities, and the Cosmopolitan is my favorite of all
hotels. If anyone from the Cosmopolitan is watching this, you probably want to bump my
identity status from Silver up to Platinum. That'd be a really good idea.
[laughter]
The reason I went through all that is because tonight's topic is really important. Officers,
you may lock the doors now. We're going to be talking about security. Security can mean
a lot of things, so let me first tell you what we're not going to talk about. It's not
going to be about physical security, not about emotional security, or national security,
whatever that is. It's not going to be about imaginary security, like the security theater
that we all enjoy at airports. I'm not going to be talking about that tonight. I'm not
going to be talking about the anti-security which is obtained by getting tough on crime,
or going to war unwisely. I'm going to be talking about the security of distributed
applications. I've been concerned about this problem for much longer than I've been concerned
about JavaScript, so I've got a lot to tell you tonight.
It's very often thought that this is a conflict between white hats and black hats. The white
hats are good guys who are trying to stop the bad guys, who are wearing black hats and
who are trying to get into our systems. I think this is a really bad way to think about
security. I don't trust this model for a couple of reasons. One is it turns out a lot of the
most famous white hats were once black hats, and there have been some famous black hats
who had been white hats. There are lots of gray hats who seem to be playing both sides.
It means you really have to be suspicious of anybody who claims to be a security expert.
You just have to.
But it's worse than that. There's a sense of overspecialization, that we can rely on
the white hats to take care of all of our security so we don't have to worry about security
at all. As developers it's just not our job. That turns out to be really toxic. In my travels
I've seen organizations where project managers would instruct their developers to intentionally
violate the company's security in order to make a deadline, which is an awful thing to
do. But their thinking is that eventually the company's white hats will figure out what
they did and some engineers will have to spend some unpaid overtime in order to fix it. But
the manager still gets the win; he still gets credit for having met the deadline, even though
he violated the company's security in order to do that. That isn't tolerable, we can't
do that.
That's possible because of this overspecialization, that we can delegate all of that security
thinking to experts. We can't do that. Instead, we have to recognize that security is everybody's
job. Everybody needs to be working on security. It's something we can't leave to specialists.
Now, about the format of tonight's talk. I'm going to be giving you a lot of principles
that have to do with security, and each one of them is going to have a purple background,
purple for principle. The other convention I'm going to have tonight is that sources
of insecurity will be presented in red boxes.
One of the sources of insecurity is that things change. Here's an example. When the World
Wide Web was first imagined, it was going to be a simple document delivery system. The
security requirements for a simple document delivery system are pretty minimal, but over
time it has evolved into an application delivery system. The web is only interesting today
because it has become an application delivery system. An application delivery system has
very different security properties and requirements than a document delivery system, and the web
still hasn't caught up to that. The web was, as originally intended, possibly secure, but
is not as a result of the changes it's gone through. So it is not unusual for the purpose,
or use, or scope of software to change over its life. Rarely are the security properties
of a software system reexamined in the context of new or evolving missions, and this leads
to insecure systems.
In thinking about security, it means we have to do the right thing all the time. That turns
out to be surprisingly hard, because our intuitions about security of network systems are very
often wrong, so just intending to do the right thing is not enough.
I'm going to give you tonight a set of principles. I'm not going to be giving you tricks and
hacks, because ultimately those don't work. There's no security obtained from tricks.
I've seen very specific talks about how to make your application secure which give extremely
bad advice which might be effective at the moment that it's given -- but sometimes not
-- but which will age very poorly and actually introduce new security problems when the next
iterations happen. Also, there is such a huge volume of tricks and hacks, there's no way
I could deliver them all in an evening, no way you could memorize them all, and there's
no way you could keep up to date. It's just too much. But the principles don't change.
Once you get the principles down, then you can reason about this stuff yourself and you
can do it properly. So that's going to be the focus for tonight.
Now, this stuff is hard to reason about because your intuitions are wrong. When we think about
other modes of security, deterrence is often a good thing to do. If we can prevent people
from doing things by frightening them or by going after them after they do it, then that's
sometimes effective. But in online systems deterrence is not effective. You can't punish
an invisible attacker, you can't punish a bot, you can't punish a script, so there's
no form of intimidation which will keep you safe. The only thing that works is prevention.
Prevention is the only card we have to play.
We need some historical context so we can start thinking about how to think about security.
We're going to start with this guy. This is Johann Martin Schleyer. He was a 19th century
German priest, a Roman Catholic priest in the Baden area in Germany. One night God came
to him in a dream and told him to do something. What did he tell him to do? Well, in order
to understand what God told him we have to go back a little bit further.
A long time ago on the Plain of Shinar, some of the best architects, material specialists,
and builders in the world got together to build a tower to reach all the way to Heaven.
For some reason, God didn't want them to do that. He was concerned about them all getting
together and working together and accomplishing great things. That was not something that
he intended, so he came down and confounded our speech, caused everybody on the project
to start speaking different languages. Then they all wandered off and started their own
countries, apparently, and went off speaking their own languages there. Basically he created
the i18n problem. We don't know exactly when this happened. It was some time after Noah
got drunk and his robe popped open, but before Lot offered his virgin daughters to the Sodomites.
Somewhere in that area is when that happened.
Then many thousands of years later, God appears to Schleyer and says I changed my mind about
that confounded speech thing; I want you to invent a new language which will unify the
world. And he did. He created a new language called Volapük. He punished a book about
his new language in 1880 in German, and it was a hit. There was a huge amount of activity
around Volapük. Someone told him that people who speak English were uncomfortable with umlauts
-- and I can tell you that I'm not, which is why I keep mispronouncing the name of the
language.
Now, he was not the first to invent a new language. If you read The Baroque Cycle by
Neal Stephenson you'll remember John Wilkins, who was inventing a philosophical language.
George Dalgarno in England was doing a similar thing about the same time. Those are real
people, and they were inventing real languages. They were not the first to be inventing new
languages, and many people after them invented languages. But Schleyer's, for some reason,
caught the imagination of Europe, and its growth was explosive. Every two weeks or so
there was a new journal being published about Volapük, or in Volapük. Every other day
a book was being published about or in Volapük. There were Volapük societies forming all
over the world; there were about 30 of them. The estimated number of speakers was between
a quarter million and a million, and this all happened in a few years. Just explosive
growth.
The reason for this wasn't because it was such a great language design. It was actually
a problematic design. But the people of Europe were tired of war. Europe had been in a constant
state of war for centuries, and they were never good wars, they were always wars about
ambitions or the failure of politics. Regular people were getting killed all the time and
not benefiting from any of it, and they were tired of it. They could see that Europe was
getting even more militant, and they could see bad things on the horizon. There was some
philosophical belief that languages were part of the root of the problem, because everyone
was speaking different languages it was impossible to unify Europe. And there was not a great
sense of cultural pride about languages because in most cases you were speaking the language
of some conqueror from many centuries ago, and the conqueror is gone but you're still
speaking his language. Let's find another language. A lot of people got behind Volapük
and said this is it, this is the way we can go forward together.
So there was a huge amount of excitement, and a lot of it was due to this guy: Auguste
Kerckhoffs. He was a Dutch linguist who translated the Volapük books into lots of other languages,
went all over Europe lecturing about the language, helping to create a lot of enthusiasm about
it. He was rewarded for his efforts by being appointed the director of the International
Volapük Institute, so he was responsible for helping to promote the language. Being
a linguist, and having spent a lot of time trying to teach this language, he found that
there were some aspects of the language which were unnecessarily complicated, and he thought
that if he could reform the language, if he could simplify it, reduce it to its good parts,
if you will, then the language would be a lot easier to teach, a lot easier to learn,
a lot easier to adopt, and it would be more likely that the world would adopt this language.
At the third international Volapük conference, which was the first conference held all in
Volapük -- even the waiters at the conference were speaking Volapük -- he presented his
idea for these improvements to the language. Many of the delegates at the convention said
this is great. Schleyer, on the other hand, said no. God had told him to invent this language,
and he wasn't going to let anybody else change it, so he insisted that the convention give
him a veto. At that point the convention forked. Half of it went with Schleyer, the other half
went with Kerckhoffs, and the movement fell apart. Suddenly there were cunning linguists
all over the place who started saying hey, it's open season on new ideas -- I've got
new features I want to add! They started publishing new features, and the language started splintering
all over the place. Other people were saying well, let's forget this language, let's start
all over with Esperanto. Esperanto was actually a better language design, but it never had
anywhere near the reach that Volapük had.
The thing fell apart less than a decade after Schleyer had published his book. It was done.
It left things in a worse state: it was now less likely that there would ever be a
universal language. Instead of debabelization, it resulted in rebabelization. There were
actually more languages when he finished than when he started.
And it goes on. There have been lots of artificial languages invented since then. Charles Ogden
invented Basic English. He was taking English and reducing it to 850 words that would be
easy for anybody to learn. The guy who created Simon Templar, The Saint, created a language
called Paleneo. The guy who created the board game Careers created a language called Loglan.
J.R.R. Tolkien created lots of languages for all these mythic races, then wrote poetry in those
languages, and then histories about the people who were making up the poems and used all
of that as material in his epic novel The Lord of the Rings.
Tolkien called his compulsion to design languages a secret vice. It seems that the compulsion
that causes some people to make languages is really similar to the compulsion that makes
some people make programming languages; they're the same sort of thing. Any idea what the
most popular artificial language is today? Any guesses? Yeah, it's Klingon. It's true.
So why did I tell you all of that? Well, it's because I wanted to introduce you to Kerckhoffs.
Before he got mixed up over that Volapük thing, he wrote in French a book about military
cryptography. It was the first modern book about cryptography. Telegraph was a fairly
recent invention at this time, and Kerckhoffs was the first to reexamine the requirements
of cryptography in a world that had telegraph, which had electronic communication. All previous
systems had been based on paper systems, and the properties of an electronic system can
be quite different. Kerckhoffs was the first to figure this out. The principles that he
came up with still work. It was an amazing piece of work.
One of the things he recommends has become called the Kerckhoffs Principle. He said:
"the design of a system should not require secrecy; and compromise of the system should
not inconvenience the correspondents". What did he mean by that?
Here we have Alice and Bob. Alice and Bob want to exchange a message, but they're afraid
that they're going to be spied upon. They want to use some cryptography so that nobody
can find out what they're saying, so they've got a crypto system. Alice will take her message
of plain text and the key and put it into a machine which will encrypt the message and
produce a cypher text. She can then transmit the cypher text to Bob, who will then put
it in his decryption machine -- which might be identical to the encryption machine -- and
a key which is probably the same as Alice's key by previous arrangement. That way nobody
else can read the text.
The thing that Kerckhoffs said, which was amazing, was that there should be no secrets
inside of the encryption machine, that you should assume that the enemy is going to find
out how the machine works, and having them know how it works does not compromise the
security of the system. Prior to that, there had been thinking that you could do something
kind of silly in the box and as long as you could keep the contents of the box secret
it didn't matter. Kerckhoffs said no, you've got to do this right so that there can be
no secrets in that. Some people have taken it further and said not only should you assume
that the enemy has it, you should go ahead and publish it, make sure that the enemy has
it, because that's the only way you can be confident that you've got the right discipline.
One of the corollaries of that is that there is no security in obscurity. Doing something
in which the truth is hard to find is not effective, because the enemy can find it.
Making it hard to find is not good enough. I still see experts claiming that the best
way to build a cryptographic system is to try to hide as much material as possible
to force the bad guys to try to find more stuff, uncover more information in order to
break it. It's wrong. Kerckhoffs showed us it was wrong over a century ago. There are
some people today who still haven't figured that out. It's because the more secrets you
have, the harder they are to keep. Keeping any secrets is hard, so in a cryptographic
system the only thing you want to have secret is the keys. Just that is hard enough.
Sometimes when we look at cryptographic systems people say well, we're using a good algorithm
and therefore we're unlikely to ever be broken. But there are many other places where you
can break a cryptographic protocol, and you don't get to choose what part of it the attacker
is going to go after. They're going to go after you where you're weakest and not strongest.
Let me show you an example of that. There's an encryption algorithm called the One Time
Pad, which is provably unbreakable. You cannot break this. All the computers running
till the end of time can never break this code. It's amazing. You are going to break
it. But first let me show you how it works. There are a couple of rules on the use of
the One Time Pad. The first is that the key must always be kept secret -- this is going
to be true of any crypto system. The key must be at least as long as the plain text, so
however many bits you have in the plain text, the key is the same size. That's a little
unusual, that's one of the distinct characteristics about this algorithm. And the cypher text
is obtained by xor-ing the text with the key, so it's pretty easy.
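[A minimal sketch of that xor step in JavaScript; not from the talk. Node's crypto.randomBytes stands in for a true random source, which, as the talk notes, is the hard part:]

```javascript
const crypto = require('crypto');

// One-time pad: the key is secret, random, as long as the plaintext,
// and never reused.
function makeKey(length) {
  return crypto.randomBytes(length); // cryptographically strong randomness
}

function xorBytes(a, b) {
  const out = Buffer.alloc(a.length);
  for (let i = 0; i < a.length; i += 1) {
    out[i] = a[i] ^ b[i];
  }
  return out;
}

const plaintext = Buffer.from('ATTACK AT DAWN', 'utf8');
const key = makeKey(plaintext.length);
const ciphertext = xorBytes(plaintext, key);  // transmit this
const recovered = xorBytes(ciphertext, key);  // Bob xors again with the same key
console.log(recovered.toString('utf8'));      // 'ATTACK AT DAWN'
```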
Here's an example of it. This is my message, this is my plain text. It happens to be a
picture, in this case. It's the JSON logo. Usually we think about encrypting text, but
there's no reason why we can't encrypt pictures, and pictures are easier to demonstrate. That's
my text. This is my key. It's a bunch of random numbers; that's all it is, random pixels.
For those of you who grew up with broadcast television you might remember this; we used
to watch a lot of this. And it's random -- I took a lot of trouble to create random numbers
here, and that turns out to be a surprisingly difficult thing to do. But they're there.
If I now exclusive or the two things together, I now get the cypher text. If I did my job
correctly, you cannot see any aspect of the image in that; it's completely hidden inside
of the randomness. It looks like I did it right that time.
Okay, there was one more rule: the key must be perfectly random, whatever perfectly random
is. In cryptography there is a sense of randomness which is much more severe, or precise, than
in other disciplines. Let me demonstrate that. This is a key that I made with Photoshop.
I just used Photoshop's noise filter, and made this. It looks exactly like the other
one, right? For you in the back, all you see is gray. You sitting closer, you might see
some pixels but you can't see any pattern in this, right? But it was not a cryptographically
secure random number generator, so when I exclusive or with this one, you can see the
image, right? It leaked through, because the key was not sufficiently random. You just
broke the code, you could see what the message was that I was hiding. You're a cryptanalyst
now.
The final rule is that a key must never be used more than once. Let me show you what
happens if you use a key more than once. Here is another message. This happens to be a picture
of me in Istanbul. You remember this; this is the same key that I used the first time,
right? You recognize the pixels; it's exactly the same. I'll exclusive or them together,
and good, the image is still hidden. Now I'm going to exclusive or together the two cypher
texts. This is something that an eavesdropper would have access to. When I do that, the
two keys cancel out, and I can see both messages. Exclusive or-ing two messages together gives
you no security, so you've now broken two messages. Congratulations.
Cryptography is not security. Cryptography is one of the tools that we use to build secure
systems, but simply putting cryptography into a system doesn't guarantee it's secure. There's
a lot of other stuff you have to attend to. Part of the discipline of being a cryptographer
is you have to imagine that at every stage in the development of your protocol, you may
be vulnerable to attack, and you have to reason all of that stuff out. That's the kind of
intuition that we need to be developing, as well, in our applications.
One of the things that cryptographers think about is that there are more people involved
in this transaction than just Bob and Alice. For example, there may be Eve, the eavesdropper.
Eve might have a packet sniffer, and she's just watching all the traffic going on between
Alice and Bob. By analyzing that traffic and by keeping all the messages, she may be able
to figure out stuff about what's going on there and compromise them. Eve is one of the
standard characters that's of importance to a cryptographer.
Another character is Mallory. Mallory can do man-in-the-middle attacks. Mallory might
be operating a free public WiFi hot spot, and Bob connects to Mallory, thinking that
he's connecting to Alice. Alice asks what's the password, she asks it of Mallory, Mallory
asks it of Bob, Bob gives his password to Mallory, Mallory gives it to Alice, Bob asks
Mallory what's my balance, Mallory says to Alice, change my password, and Mallory says
please transfer my account to the Caymans, and then sends a message back to Bob saying
everything's great. That's something cryptographers worry a lot about.
Then in my own practice, I developed one more character who I call Satan. Satan is very
powerful and totally malicious, and wants to be one of our customers. Now, some people
think the way you deal with Satan is with an identity system: we'll figure out who everybody
is and as long as nobody is Satan, then everything's going to be fine. That turns out not to work;
there's no way you can do that. So what you have to do instead is assume somewhere among
all of our users and customers is going to be Satan, and if we do our jobs right, Satan
can come and interact with us and cannot cause us any harm, and cannot cause any harm to
any of our other customers or partners. Only if we're that confident have we done a good
job.
One of the things we should learn from the cryptographers is that security needs to be
factored into every decision. Not only is it our job -- we didn't used to think of it
as our job, but it is -- but it's our job all the time. Everything we do has to consider
the security of what we're doing. In the development of an application we might end up making thousands
or millions of decisions, and all of those decisions must consider the security implications
of what we're doing.
One of the biggest causes of insecurity is 'we'll go back and make it secure later'.
That's really common when architects or developers are building a new operating system, or a
new stack, or a new platform, or a new set of services, or a new protocol or whatever.
They think the hard part is getting the machine to boot, or getting the thing to cycle, or
getting the pixels on the screen, or getting the boxes to talk to each other. Making it
secure, we'll save that to 2.0, and that's a tragic mistake that is really, really difficult
to do, and is rarely done.
Part of the reason for that is that you can't add security, you can only remove insecurity.
If you've published a platform and people have used it in an insecure way, it's difficult
to then remove those features, because they are inherently insecure, and replace them
with more reliable features. You just can't do that without causing big breakage. It's
not compatible. You need to fix it before you release it.
Another fallacy is that having survived to this point probably means that we're not going
to be hacked. That doesn't work at all. As we become more successful, as our business
grows, we become bigger targets, and eventually we can expect that they come after us.
The impossible is not possible. Maybe that should be called Crockford's Principle; I
think it's pretty good. You should not be depending on anything which can't be done
in order to make you secure, because it can't be done. But related to that is the idea that
you shouldn't be trying to do things that are not going to be effective. Sometimes there's
the idea, well, we can't stop them but we sure as heck can slow them down. We'll put
some speed bumps in the information super highway and that'll keep us safe for a little
while. That turns out not to work at all. If what you're doing is not effective then
it's ineffective, and you're wasting your time. In putting together these speed bumps,
you're using resources that could have been used to do something that was more effective.
Don't prohibit what you can't prevent. The corollary is that the bad guys will exploit
whatever you cannot prevent.
False security is worse than no security. If you know that you have no security then
at least you'll be smart about risk taking. If you think you are secure and you're not,
you're going to make really bad judgments. Also, it turns out that false security has
a cost, and by pursuing false security you're not pursuing better forms of security.
Which brings us to the browser platform. The browser is horribly insecure. We're still
fixing it. The web is now of drinking age and we're still 'fixing it later'. And HTML5 made
it worse instead of better by adding a bunch of powerful new capabilities but not constraining
the ability of bad guys to get at those, so it's made everything worse.
Despite that, the web is better than everything else. All other application platforms and
application delivery systems are strictly worse than the web. The reason for that is
their blame the victim security model. One thing all those systems have in common that
the web does not do is ask questions of the user about what a program should be able to
do, and generally ask them in a way that the user cannot answer them correctly. All that
accomplishes is that when the thing finally goes wrong, you can say well, it's your fault,
you agreed to this. The user says, I never agreed to have my identity stolen. But you did
agree that it could have access to your file system. You go, yeah. You did agree that
it could have access to the internet. You go, yeah. Well, there you go, that's identity
theft. The web doesn't do that, and that's why the web is better at security than everything
else, but there is still a lot that it gets wrong.
The thing that the web got right that everybody else got wrong was whose interest does the
program represent. The browser knows that the program does not represent the user, or
the owner of the computer. The site is represented by the program. All other systems, going back
to UNIX and way beyond, all the way to the beginning of time, got that wrong. They think
that the program represents the user, so the program gets all of the user's privileges.
But the user may not necessarily intend that the program be able to do anything beyond
do the useful thing that the program was obtained for.
So the web got some things wrong. What did it get wrong? It turns out there are more
interests involved than the users and the sites. There can be third parties, and fourth
parties, and many other parties on the same page. A malicious party can exploit code conventions
to inject malicious code onto the page, and that code gets all of the rights of the site.
This can compromise the site and the user. This is known as the XSS problem.
So what can an attacker do if he can get some script onto your page? The attacker can request
additional scripts from any server in the world. Once it gets a foothold -- and it only
needs a tiny amount of code to do that -- it can obtain all the additional scripts it wants
from the most evil websites in the world. The browser has the same origin policy that
limits the ability of a page to interact with other sites, but it in no way limits the ability
of an attacker to get more script to run on your page.
An attacker can read the document. The attacker can see everything the user can see and a
lot of the things the user can't see. It can see hidden fields, comments, cookies, all
sorts of stuff which is invisible on the page. The attacker can make requests of your server,
and your server cannot detect that the requests did not originate from your application. Now,
you should be using SSL to secure your connections, but if you do, it doesn't help you here because
the attacker gets access to your secure connections. You should be authenticating your requests
from the browser with a special token, sometimes called a crumb -- that doesn't help. The attacker
has access to that as well.
If your server accepts SQL queries then the attacker gets direct access to your database,
and can do anything that SQL will allow them to do. Now, if your server application is
creating SQL queries by concatenating together pieces of material that it gets from the browser,
then you probably gave access to the attacker to your database, because SQL was optimized
for SQL injection attacks.
The attacker has control over the display and can request additional information from
the user, and the user cannot detect that the request did not originate from your application.
The browsers all have anti-phishing chrome in them now. The problem with it is that the
users don't pay any attention to it. If they did, the chrome would be saying this is a
legit request, go ahead and give it. Because what the browser is looking for is where the
HTML came from, not where the script came from, and it's the script that's evil here.
The HTML is inert.
Some sites, whenever something scary is about to happen, think okay, let's make sure that
the user is still on board, so let's ask for their password again. That doesn't help you
in this case because the attacker has control of the screen, so he can go to the user and
say what's your password, and everything tells the user that this is a legitimate request:
give it up. In fact, if your site routinely asks the user to give up their password at
unlikely times, what you're doing is training the users to give up the password anytime
an attacker asks for it.
The attacker can then send the information that it obtained by talking to your servers
or scraping the page or talking to the user and send it to any server in the world. Again,
there's the same origin policy in the browser, which does not limit the ability of the attacker
to send this information to the most evil site on the planet. Anybody freaked out yet?
This is a problem. This is why we worry about security. The browser does not prevent any
of these, and web standards require these weaknesses. If your browser does not expose
your site and your users to all of these problems, it is not standards compliant. There's something
deeply wrong in the W3C standards.
The consequences of an attack are horrible. There's harm to customers, loss of trust,
legal liabilities. There's even been talk about criminal liabilities, because of negligence
of exposing people to harm.
This general category of attacks is called XSS, which is supposed to stand for Cross
Site Scripting. It's not CSS because that's some other abomination that we'll talk about
another time. Some stylists I hear in the audience. One of the problems with this attack...
It was identified many years ago by some security white hats, I suppose, and they misunderstood
what the attack was about so they gave it the wrong name. The problem isn't with cross
site scripting -- cross site scripting is actually a desirable thing. We call that mashups.
The problem is this confusion of interests. But the white hats gave it the wrong name
from the beginning, continued to use the wrong name, and they expect you guys to keep up.
It doesn't make any sense.
Cross site scripting attacks were invented in 1995. We have made no progress on this
problem since then. It's appalling.
A mashup is a self-inflicted XSS attack. Mashups are great. It's a way of creating an application
out of components that come from several independent parties and letting them work together
for the user's benefit. But they're not safe as currently practiced in the browser. And
it turns out, advertising is a mashup, which means advertising is a self-inflicted cross
site scripting attack. It turns out the most reliable cost effective method of injecting
evil code into a website is to buy an ad. Yeah.
So why did this happen? There are a number of causes. First, the web stack is too complicated.
There are too many languages: there's HTML, HTTP, the cookie language, URLs are a language,
CSS is a language, JavaScript is a language. All of these can be embedded inside of each
other, they all have different styling and quoting conventions, and also the browsers
are all competing to try to make sense out of badly written code, and that makes it really
easy for the attackers to hide stuff inside the code stream.
We have template based web frameworks that are optimized for XSS injection; PHP is a
popular example of that. The JavaScript global object -- or the window object as it's called
in the browsers -- gives every scrap of script the same set of powerful capabilities, so
there's no way a page can defend itself from any other script that happens to get into
that page. But then, once again, as bad as it is at security, the browser is still a
vast improvement over everything else. I wish that weren't the case, but it is. Even platforms
that were developed after the browser seem to have avoided learning any of the lessons
that the browser figured out.
This all comes down to confusion of interests. The browser distinguishes between the interests
of the user and the site, but didn't anticipate that there could be other interests represented.
Within a page, interests are confused, and an ad, or a widget, or an Ajax library, they
all get the same rights as your own script. If you're loading jQuery, you hope it's
trustworthy, but there's nothing to prevent jQuery from deciding we're going to go rogue
today and start harvesting identities. If you're loading their stuff, it can happen.
JavaScript got close to getting it right. There's a lot that's wrong with JavaScript,
but you can avoid most of that, and for the rest of it, at ECMA where we maintain the
language, we are slowly making progress in reforming JavaScript into an object capability
language with which we can, finally, write secure applications.
HTML, on the other hand, we haven't seen any progress there. It grants power to confusers,
it is itself easily confused, it's forgiving because web developers in times past were
incompetent, and there was a competition between Netscape and Microsoft to try to capture as
many incompetent webmasters as possible. And the DOM API -- the DOM itself -- is just awful.
I don't think the DOM can be repaired; I think ultimately we have to replace it. And we should
replace it with something that looks like YUI or jQuery, because we know how it should
work, and the raw DOM is just awful.
Anyway, this stuff is not going to get fixed in a hurry, so it's up to you to create secure
applications on an insecure platform. It's hard, but there is hope, and there's hope
in principle. The name of the principle is the Principle of Least Authority, which says
that any unit of software should be given just the capabilities it needs to do its work
and no more. The problem we have in the browser today is that it gives capabilities to all
of the script, but we're getting better at being able to put some constraints on it.
The capability model came out of the Actor model. It was developed by Carl Hewitt and
his students at MIT in 1973. The Actor model is a brilliant thing which is slowly starting
to get recognized as the brilliant thing that it is. It has the potential of solving the
multi-core problem, and the cloud problem. There's nothing else that can scale at those
two extremes. It's pretty amazing. And it also has the security model that follows out
of it.
In the Actor model, an actor is a computational entity. It could be an object, or a little
program or something. It's something that can run somewhere. An actor can send messages
to other actors only if it knows their addresses. Every actor has an address. An actor can create
new actors, and an actor can receive messages from other actors. Web workers are like actors
in that you can create a web worker and give it some work and it'll send you a message
when it's done. That's pretty neat. Web services are not because the security model is different,
but that could be repaired.
There's a system I invite you to look at called Waterken, which applies the Actor model to
web services. You get very easily distributed, reliable services with also a very high level
of security. Check out Waterken; it's really neat.
I've been talking about capabilities. The address of an actor is a capability, a reference
to an object. Like if you have a reference to a JavaScript object or a JavaScript function,
that is a capability. Let me tell you more about capabilities, because I think this is
the most likely mechanism to allow us to be secure in the browser. A is an object. A has
state and behavior. Object A has a reference to Object B, and Object A can communicate
with Object B because it has that reference. Object B provides an interface that constrains
access to its state and references, so having a reference to B shouldn't mean that you can
get in the middle of B and mess with it. It just means you can get to B's interface.
Object A does not have a reference to Object C, so Object A cannot communicate with C.
It's almost like there's a virtual firewall between A and C. A simply can't get to C because
it doesn't have its address. In an object capability system, an object can only communicate
with objects it has references to. An object capability system is produced by constraining
the ways that references are obtained. A reference cannot be obtained simply by knowing the name
of a global variable or a public class.
There are exactly three ways to obtain a reference: by creation, by construction, and by introduction.
By creation means that if a function creates an object, it gets a reference to that object.
That's pretty straightforward. By construction means that an object may be endowed by its
constructor with references, so as part of its initialization it can get some stuff,
some ability to communicate with things. Then three: this is the most interesting one, by
introduction. Here we have a situation where A has references to B and C, and it would
like for B and C to be able to communicate with each other. It can do that by introducing
them, so it sends a message to B which includes a reference to C. When that message is delivered,
B now has that reference, now has the capability of interacting with C. It has acquired the
capability. If references can be obtained by creation, construction and introduction,
then you may have a safe system.
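[A small JavaScript sketch of those three ways of obtaining a reference; the object names are illustrative, not from the slides:]

```javascript
// By creation: whoever makes an object holds a reference to it.
function makeLog() {
  const entries = [];                 // created here, so this closure can reach it
  return {
    write: function (line) { entries.push(line); },
    read: function () { return entries.slice(); }
  };
}

// By construction: an object is endowed with references when it is made.
function makeReporter(log) {          // log is handed in as part of construction
  return {
    report: function (msg) { log.write('report: ' + msg); }
  };
}

// By introduction: A, holding references to both, passes one to the other.
const log = makeLog();                // A created the log
const reporter = makeReporter(log);   // A introduces the log to the reporter
reporter.report('hello');             // the reporter can now reach the log
console.log(log.read());              // [ 'report: hello' ]
```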
Potential weaknesses include arrogation, corruption, confusion and collusion. Arrogation means
to take for yourself without right. Examples of that would be global variables -- that's
the big problem in JavaScript. In Java it would be public static variables, and also
standard libraries that give you access to powerful capabilities simply by knowing the
name of the library. Address arithmetic allows for this, so C++ is not a secure language.
Known URLs can also do this. Corruption: it should not be possible to tamper with or circumvent
the system or other objects. Confusion: it should be possible to create objects that
are not subject to confusion. And fourth, collusion: it must not be possible for two
objects to communicate until they are formally introduced.
Let's skip that. Ultimately, every object should be given exactly the capabilities it
needs to do its work, and no more. Capabilities should be granted on a need to do basis. In
good design you have information hiding -- it turns out you also have capability hiding.
Intermediate objects, or facets, can be very lightweight, and class-free languages can
be especially effective here. A facet object limits the guest object's access to a powerful object.
Here the guest object cannot tamper with the facet to get a direct reference to the dangerous
object. References are not revocable, so once you give out a reference to an object you can't
take it back. Actually, you can ask for it back, but you shouldn't depend on the other party
obeying your request. But you can work around that with one level of indirection. We can have
a guest object that has a reference to an agency object and the guest asks for an introduction
to the powerful object. It gets given a facet, not a direct reference. At any time, the agency
can ask the facet to drop its link, and then it becomes useless. A facet can mark requests
so that the powerful object can know where the request came from. It gives us some accountability.
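[A minimal sketch of a revocable facet in JavaScript; illustrative, not the slide's code. The guest only ever holds the facet, and the agency can cut the link at any time:]

```javascript
// A powerful object we want to attenuate.
const file = {
  append: function (text) { console.log('appended:', text); }
};

// The agency hands the guest a facet instead of the file itself.
function makeRevocableAppender(target) {
  let current = target;
  return {
    facet: function (text) {
      if (current === null) { throw new Error('revoked'); }
      current.append(text);                    // forward the request
    },
    revoke: function () { current = null; }    // the agency drops the link
  };
}

const pair = makeRevocableAppender(file);
const guestFacet = pair.facet;   // give only this to the guest
guestFacet('hello');             // works
pair.revoke();                   // agency cuts the guest off
// guestFacet('again');          // would now throw 'revoked'
```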
So facets are great; they're very expressive, they're easy to construct. In JavaScript it's
just a function. They're very, very cheap. They allow us to attenuate the power of dangerous
objects. They give us revocation, notification, delegation. It turns out that the best object
oriented patterns are also capability patterns. Sometimes when you're trying to design a system
and you're trying to figure out the interfaces and the APIs, if you look at it from a capability
perspective, you usually get the right design. It turns out that good systems are also secure
systems.
It all depends on functions. Functions in JavaScript become the mechanism by which we
can build secure applications. For more on this I recommend that you watch The Lazy
Programmer's Guide to Secure Computing by Marc Stiegler. When you finish with this, just
go to Yahoo! and Google for The Lazy Programmer's Guide to Secure Computing and watch that.
It's a great show. It's about being smart and lazy at the same time, and getting secure
systems as a consequence of that.
One of the things that makes JavaScript difficult to program securely is that there are hazards
in it. This is an example that Mark Miller produced. We want to build a table object
which will have three methods in it called get, store, and append, which all work on
a secret array. The array is encapsulated inside of a closure -- this is the object's
closure pattern -- so that the attacker should not be able to get at the array. The array is
the thing that we're trying to defend. It turns out there is an attack which will allow the
attacker to get the array out of this object. Anybody see what the attack is?
In this case our array is not a global variable, so no, it's not going to be that.
Okay, so here's the attack. I use the table's store method to replace the array's push method
with my own function, and that function, when I call it, or when I trick the table into calling
it by calling its append method, will then leak the secret array to me. The reason for this
is one of the design errors in JavaScript. JavaScript doesn't have real arrays. Its arrays
are just a little bit of trickery on top of objects. With get and store, we assume that
i is going to be a number, but it can be anything, and if it turns out to be the name of a method,
we can replace a method. That's not what we intended. But this is confusion because JavaScript
doesn't work the way we think it does. For our own purposes we need for arrays to work
the way we think arrays should work, but that's not what JavaScript does. The difference between
those two causes confusion, confusion causes bugs, and in this case a security hazard.
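[A rough reconstruction in JavaScript of the hazard being described; this is a sketch of Mark Miller's example, not the exact slide code:]

```javascript
function makeTable() {
  const array = [];                   // the secret we are trying to defend
  return {
    get: function (i) { return array[i]; },
    store: function (i, v) { array[i] = v; },
    append: function (v) { array.push(v); }
  };
}

const table = makeTable();

// The attack: i was assumed to be a number, but nothing enforces that.
// Storing at the key 'push' replaces the array's push method.
let stolen;
table.store('push', function (v) {
  stolen = this;                      // 'this' is the secret array itself
});
table.append('x');                    // append calls array.push, i.e. our function
// stolen now refers to the hidden array
```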
Oh sure, there are lots of things. There are a number of fixes to this. The most obvious
would be to put a typeof check on i and make sure it's a number. But that's a level of defensive
programming that most of us don't anticipate that we need to do. The fact that the language
doesn't match our expectations is what leads us into these sorts of problems.
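[A sketch of the guard just mentioned; again illustrative rather than the slide's code:]

```javascript
function makeSafeTable() {
  const array = [];
  function assertIndex(i) {
    // reject anything that is not a whole, non-negative number
    if (typeof i !== 'number' || i < 0 || i !== Math.floor(i)) {
      throw new TypeError('index must be a whole number');
    }
  }
  return {
    get: function (i) { assertIndex(i); return array[i]; },
    store: function (i, v) { assertIndex(i); array[i] = v; },
    append: function (v) { array.push(v); }
  };
}
```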
Confusion, confusion's a bad thing. Confusion causes bugs, confusion gets in the way of
reliability, also gets in the way of security. Confusion aids the enemy. Bugs are a manifestation
of confusion. With great complexity comes great confusion. It's hard enough to reason
about what our programs do just in terms of their functionality, but now we have a whole
'nother level of reasoning we have to do, so in order to have any hope of being able
to do that effectively we need to keep our designs as clean and as simple as we can because
complicated, busy designs are difficult to reason about.
So we should code well. It turns out good code is ultimately cheaper to produce than
bad code over its whole life cycle, so you might as well just write good code all the
time. Good code helps serve the interests of security. Good code is easier to reason
about; code that is difficult to reason about is more likely to be problematic in terms
of reliability and security. So strict conformance to good style rules is really important. It's
important for reliability and even more important for security, so if you're using JavaScript,
you should be using JSLint. It does not guarantee that you're not going to have any security
problems, but if your code passes JSLint without warnings, then you know your code's going
to be easier to reason about and you'll have a better time trying to find the big problems.
Never trust a machine that's not under your absolute control. Sometimes I'm not even sure
about those. I can trust my server; I know what I put on the server and I know what it's
going to do, but I'm not sure about the services that it's talking to, the things that are
out on the other side of my wall. Probably the machine that I'm most worried about is
the browser. Never trust the browser. The browser cannot and will not protect your interests.
You must properly filter and validate everything that comes from the browser. You must properly
encode all output that's going to the browser. Context is everything, so you need to understand
where inside of the HTML stack stuff is going to go. Encoding inside of a paragraph is different
than inside of a style sheet, is different than inside of a URL, different than an event
handler. Wherever you're putting that stuff, you've got to make sure it's encoded properly
for its context. Everything has to be filtered and encoded.
Let me tell you an example. A friend of mine was going to fly to Asia, and he had a coach
ticket. He thought that's a long trip, it'd be nicer to go first class, so he went to
the airline's website and tried to upgrade. They said well, you need this many upgrade
certificates, and you have zero. So he went, oh. He opened Firefox, found the variable
that contained his number of certificates, bumped it up, tried again, and he flew first
class to Asia. The reason that worked was the server trusted the browser to enforce
the policy about how many certificates you needed in order to fulfill the request. Shouldn't
do that. You should not trust the browser to be looking after your interest that way.
The server has to validate everything that comes to it.
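[An illustrative sketch of that rule; the account record and database lookup are hypothetical. The point is that the server recomputes the count from its own records and ignores anything the browser claims:]

```javascript
// Never trust counts sent by the browser; read them from your own records.
function canUpgrade(accountRecord, requiredCertificates) {
  return accountRecord.upgradeCertificates >= requiredCertificates;
}

// Hypothetical request handler
function handleUpgradeRequest(request, database) {
  const account = database.lookup(request.session.userId); // server-side source of truth
  if (!canUpgrade(account, 2)) {
    return { status: 403, body: 'Not enough upgrade certificates.' };
  }
  return { status: 200, body: 'Upgraded.' };
}
```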
One of the things that makes the web stack problematic is templating. With templating,
it's really easy to echo something into a form and allow for evil content to get injected.
This is one of the simplest possible XSS attacks. You trick a user into somehow following a
URL to your site with this goofy looking file name. There are a lot of ways to do that.
If you can get them onto a page, you can generate this. You could post to an invisible iframe,
or you could have a short URL that will translate into this. The user might never know that
you're doing this stunt. And a lot of web servers by default will generate a 404 page,
simply taking that file name and sticking it in a body and sending it back.
The effect of this, now, is that the script runs with your authority, so that script gets
your cookies, gets your local storage, your local database -- anything that you can get
at on the browser, he has now gotten to, and he's got the chrome working for him too. Everything
says it's a legit site. He can be loading in more stuff really easy. The fault here
was that something which was safe in URL position is not safe in HTML position, so it had to
be encoded. One of the problems with PHP and other templating systems is it's much easier
to do it wrong than to do it right. It's on us to do it right. Any time you're going to
be inserting anything into the HTML stream, if there is any chance that it could be harmful
stuff, you've got to make sure that it's encoded properly. Just getting it from your database
is not evidence that it's safe. Everything has to be properly encoded.
There's a similar kind of confusion that can happen when you're concatenating. This can
happen in JavaScript too, but it happens in lots of languages. Like, you should never
build a JSON text by concatenation because the attacker could give you text to insert
into the JSON payload which contains quote marks. Those quote marks will then break the
JSON encoding, and if this stuff gets evaled -- some people are still using eval for this;
they shouldn't but that still goes on -- that can cause an injection of bad code. So when
there's a good encoder around, you should always use it. There are JSON encoders everywhere
now. Always use JSON encoders, don't build stuff by concatenation. The same goes for
SQL -- never build SQL strings by concatenation. It's way too easy to create an SQL injection
attack.
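[An illustrative contrast in JavaScript between concatenation and a proper encoder; parameterized queries play the same role for SQL:]

```javascript
const userInput = 'hello", "admin": true, "x": "';   // attacker-chosen text

// Wrong: concatenation lets the attacker's quote marks restructure the JSON.
const badJson = '{"comment": "' + userInput + '"}';

// Right: the encoder escapes the quotes for us.
const goodJson = JSON.stringify({ comment: userInput });

console.log(badJson);   // an extra "admin": true field has been injected
console.log(goodJson);  // the quotes are safely escaped inside one string value
```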
One of the excuses I used to hear a lot about security is 'why would anybody do that? Why
should I have to defend against these things? It doesn't make sense why anyone would do
that.' I haven't heard this in a long time now. We've spent enough time on the internet
that we know the answer to this. The reason people do that is because we let them do that.
If we don't want them to do that, we shouldn't let them do that, and that's that.
In our programs, there can't be any capability leakage. We can't allow arrogation. Everything
must be solid. If anything leaks capabilities, then everything can be compromised. If a user
has rooted their identity in one of our accounts, and if we leak, the consequences could be
tragic. This is especially true for sites that offer any kind of email, like Yahoo!
or Google or AOL or anybody else, because some users will root their identity in that.
When they forget their bank password it goes to that account, so if we're giving access
to that account to attackers simply by doing any of these XSS tricks, we've done a lot
of harm. We need to be really, really diligent.
I'm at the end, so I've got a few more principles I want to throw at you. Inconvenience is not
security. We see a lot of instances where it says oh, you want to do something routine,
huh? What's your root password? That doesn't make any sense. Just because you're making
people inconvenienced doesn't mean you're making things any better for them.
Identity is not security. I see that mistake a lot too, that if we can just get people's
credentials then we can compel them to do whatever we want. That's probably not a good
idea.
There's a model of security called tainting where we try to find the sources of insecurity
and track them all the way through the system. It seems nice, but it doesn't work in practice.
There are some operators who've gotten so tired of the fact that they're constantly
under attack that they've sort of given up, so we're not going to prevent the attacks
anymore, we'll just try to figure out when we're attacked. I can see why they want to
do that, but that's not a substitute for security. Knowing that you've been attacked doesn't
do you any good.
Last slide. The last source of insecurity is mismanagement. It should be everybody's
job to maintain the security of the site. The executive staff has to make sure that
they are not creating incentive systems which incent anybody, at any time, for any reason,
to violate the site's security, because you just can't tolerate that. You, as developers,
need to know that you're supported all the way up to the top, that everybody agrees about
the fundamental importance of getting security right.
With that, I'm done. Danog ols e neit gudik [thank you and good night].
[applause]