Authors@Google: Allen B. Downey


Uploaded by AtGoogleTalks on 29.05.2012

Transcript:
>>Male speaker: It is my pleasure today to introduce Allen Downey.
He's a professor of Computer Science at Olin College, and he's also the author
of at least three books for O'Reilly: Think Python, Think Stats, and
Think Complexity, which look at an intersecting world of ideas
around using software to explore other concepts. A lot of us here at Google
know him from his stint as a visiting scientist in Nathan Glasgow's
group Under Hiatus. And just a quick personal anecdote: I took
this book home on Tuesday, and by Wednesday afternoon there was already
an iterated prisoner's dilemma tournament going on in my house when I got home.
So it's interesting. Thank you very much, Allen.
>>Allen: Thank you very much, David. Thank you all for coming.
It's a lot of fun for me to be back. I was just pointing out
the part in the book that's about you. So enjoy that.
If you want to follow along, that's the URL for the slides I'm using.
There are a few links in the slides, so if you go to that, you'll be
able to follow those links.
So, as David mentioned, we're here to talk about this book, Think
Complexity. I want to start out by telling you a little
bit of the history of how this book came about. I teach at Olin
College, which is a small engineering college in Needham.
It's relatively new, it's only been operating for about ten years.
The mission of the college is to create an innovative engineering program.
And specifically for Computer Science, we needed to create a curriculum
that made sense for engineers. We don't offer a degree in Computer Science,
we offer degrees in engineering. Students can concentrate in Computer Science,
which is kind of like doing a minor, but we didn't want to just do an Engineering
major with a C.S. minor. We thought that was an
opportunity to do something richer than that.
So what we weren't looking for is a Department of Computer Science.
What we ended up doing is, taking the ideas in Computer Science,
took the whole curriculum, and you can imagine if each class is a box,
full of stuff, we just took all the stuff and dumped it out on the table,
threw away the boxes, and started rearranging the stuff.
'Cause a lot of the groupings of ideas are historical,
and might not be the best choices for a current curriculum.
So, for example, if you have the luxury of offering lots of
classes, you'll have an Operating Systems class, a
Networking class, and a Databases class,
and there might not be explicit connections between those. But if you now take all those
pieces, and dump them on the table,
and say I'm going to teach one class that covers those three topics,
well, you lose something, because you're not going to be able
to cover everything that you did before. But you gain something because if you take
Operating Systems and Networks together, you're naturally talking
about Distributed Systems. If you take File Systems and Databases together,
you're naturally looking at a File System as a kind of Database,
or a Database as a kind of File System, and I think there are ideas that come out
of that.
So we wrote about it, and we called it "The Small Footprint Curriculum in Computer
Science." The one problem that we found ourselves with
is that there was no natural place
to teach Data Structures. Now, this is a picture of my Ph.D. advisor,
Paul Hilfinger, who's been teaching Data Structures at Berkeley,
for quite a long time, and I think he would be disappointed looking
at my curriculum to see we weren't doing justice to Data Structures.
The problem that we ran into was that most of the way Data Structures gets taught
is out of context. Kind of one Data Structure after another.
Lots of pros and cons of this implementation versus that implementation,
but mostly unmotivated. So that was stuck in my head,
okay, how are we going to get Data Structures back
into the curriculum. At the same time, historically, we had Complexity Science happening, and it's
a little hard to characterize exactly what was happening,
and when. This is a time line from the Wikipedia page
on Complexity Science, that gives some of the timing
of the major events. The thing about this that was exciting
to me is that it's all relatively new. You can study a whole lot of Math and Science
at the undergraduate level before you get past,
say, the nineteenth century. There's very little twentieth century math
and science that you get to, until you get to grad school.
But I thought there was an opportunity there because Complexity Science, in addition to
being new, is also very accessible. I'll talk in a minute
about ways that it is accessible. One of them is, there were lots of
popular nonfiction books written about Complexity. So I started building a class around that.
Version one of this class was in 2005. And I started by just grabbing all the books
about this stuff, and trying to figure out if we could find out what's there.
You know, there's a certain amount of hype, is there some substance behind it?
And the other thing that I wanted to do was start with the popular nonfiction,
which usually doesn't have the technical details in it.
There's no code. There's very little mathematics. But what you can do, in the context of a class,
is start with that, and dive in deeper. So we found a number of the papers, the original
papers on these topics. Students re-implemented
a lot of the models that were described in those papers.
So that was the first version of the class. At the end of the class,
I tried to summarize the whole thing with what I call "the ultra-secret point of
the class." Because for the whole semester,
we didn't really know yet what the class was about.
I kind of did the same thing the second time around.
I felt like I was playing "hide the football" with the class for the whole semester.
Because I hadn't, in my own head, got to the point where I could explain
a coherent big picture of what it was about. So I think we were having a good time,
we learned some more Python, we got some more Data Structures.
We found ourselves talking about philosophy of science quite a lot,
because the models that were coming up in Complexity Science
raised a lot of questions about what exactly are we doing with these models?
What kind of science is this? [phone rings] That was the second iteration of the class.
The third iteration was just this past fall, and I got to do a couple of things.
One of them is, I wrote this book. I had drafted parts of it,
but I finally sat down and wrote everything except chapter one.
We used that for the class, and what the students did over the course
of the semester was, a set of case studies, where they worked in teams of three or four.
And each of them chose a topic that could have been a chapter in the book,
but wasn't. And they wrote the missing chapters of the
book in the form of these case studies.
And some of them are now included in the book that you've got. So this, I thought, was a
nice opportunity to give the students a chance to do some authentic
work and get published.
In fact, I recruited a program committee to get
other members of the faculty to do my grading for me.
So the students turned in their case studies, I pretended I was the program chair,
and sent out all the papers. The other faculty wrote reviews,
we decided which ones would be included; which ones weren't quite ready.
And then I edited in January, and sent it to the publisher,
and O'Reilly I think does a really good job of turning books around quickly.
I turned in the manuscript in January, and I think it was available in March.
Which if you've worked with other publishers, that's pretty remarkable.
And [claps hands softly] in January, I finally sat down and wrote
what I think is the "ultra secret point of the book",
which is the "ultra secret point of the class." And I will now tell you what it is.
Which is, I think, the development of Complexity Science
has caused a quiet revolution in science. What I mean by that is,
what kind of activities do we mean when we say "science"?
What do we think is a good theory? If you propose an explanation for something,
what's a satisfying explanation? And maybe hand in hand with that,
what's a good model of a physical system?
I am going to talk about some of the axes that I think we are moving along.
And I don't mean to say that this is black and white.
That it used to be a hundred percent this, and now it's a hundred percent that,
but that I think the center of mass of scientific activity is gradually shifting
from your point of view, from left to right, on each of these axes.
From models of physical systems that tend to be
in the form of equations, toward things that tend to be
in the form of simulation. Away from mathematical analysis,
like symbolic computation, and toward discrete computation.
Continuous mathematics toward discrete mathematics; linear models, linear systems, toward non-linear
models. I'll give an example of some of these in a
minute, I don't want to go too much into details,
but I talk about it more in the book. Deterministic to stochastic,
I've also mentioned abstract and detailed in the context of modeling.
What I mean by that is a detailed model is a realistic description of a physical system.
Where I can look at the physical system, and the elements of my model,
and there are clear analogs between them. I can match them up one to one,
and I can say "yep, that's a realistic description of the system."
And when I say abstract, I mean mostly the opposite of that.
It's a more abstracted, less detailed, less realistic description.
Some of the other axes, if you are doing analysis, if you're doing
mathematical computation, you tend to have to limit yourself to simple
systems that have a small number of components,
and those components are usually identical. With computational models,
you can often move toward large numbers of components,
that can be different, heterogeneous.
Okay, that was pretty abstract. Let me give a more concrete example.
So, celestial mechanics might be the canonical example of classical science.
Which I'll just use the term of classical science,
just to contrast it with, whatever the new thing is,
maybe complexity science. So the question is, why are planetary orbits
elliptical? It's a natural kind of question to ask,
we'd like an explanation of that behavior. It turns out, if you know the law of Universal
Gravitation, you can write down a set of differential equations
that describes the motions of planets, at least for a simplified
solar system model. And if you now solve those differential equations,
you get ellipses, and you've explained why those orbits are ellipses.
And I think there's something very satisfying about that kind of explanation. I think at
least most people find it satisfying.
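If you want to see what that looks like in code, here is a minimal sketch, not from the book, of numerically integrating Newton's inverse-square law for one planet around a fixed sun; the unit choices and starting speed are just illustrative.

```python
# A minimal sketch (not from the book): integrating Newton's inverse-square
# law for one planet around a fixed sun, with units chosen so G * M_sun = 1.
# Starting below circular-orbit speed gives a closed elliptical orbit.
import numpy as np

def orbit(pos, vel, dt=0.001, steps=20000):
    """Integrate d2r/dt2 = -r/|r|^3 with a simple symplectic Euler step."""
    path = []
    for _ in range(steps):
        r = np.linalg.norm(pos)
        acc = -pos / r**3          # gravitational acceleration toward the origin
        vel = vel + acc * dt       # update velocity first (symplectic Euler)
        pos = pos + vel * dt
        path.append(pos.copy())
    return np.array(path)

# Start at distance 1 with 80% of circular-orbit speed: the result is an ellipse.
path = orbit(np.array([1.0, 0.0]), np.array([0.0, 0.8]))
print(path.min(axis=0), path.max(axis=0))   # bounding box of the orbit
```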
I want to offer, as a point of comparison, this is Thomas Schelling's model of racial
segregation. Similarly, it is motivated by a "why" question.
If you look at a lot of cities, people are segregated by race, so you might
ask "why?" What he proposed is a model
where the people in the city are of two kinds, in this case, red and green,
and they all live on a two dimensional grid, so that each of them has eight neighbors,
and the model of their behavior is that they are happy
if at least a few of their neighbors are like themselves.
So, if two or three out of the eight are the same color,
they're happy, and they stay put. If they have fewer neighbors like themselves,
they start to feel unhappy, and they move, and moving in this case means just picking
a random, unoccupied cell in the grid, and moving to it, right?
So, you can characterize the agents, in this model,
as being mildly xenophobic. They don't like to be completely surrounded by people who
are not like themselves, but you probably wouldn't
describe them as rabid racists. Nevertheless, the outcome of the system tends
to look like what's happening on the right there,
where things get almost completely segregated by color, in this case.
And in fact, this is an intermediate step in the process. If you keep running that,
it almost just becomes two great big blobs, with one boundary between them,
depending on the parameters that you tweak in the system.
The red and green, anybody colorblind? Oh, alright.
I just thought it was funny that the people who made that particular
graph made it as inaccessible as possible.
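Here is a minimal sketch of a Schelling-style model in Python. It is an illustration in the spirit of that figure, not the code from Think Complexity, and the grid size, empty fraction, and happiness threshold are placeholder values.

```python
# A minimal Schelling-style sketch: agents stay put if enough of their
# occupied neighbors share their color, otherwise they move to a random
# empty cell. The constants are illustrative, not the book's values.
import random

SIZE, EMPTY_FRAC, THRESHOLD = 20, 0.1, 0.3   # happy if >= 30% of occupied neighbors match

def make_grid():
    """Random grid: ~10% empty cells, the rest split between red and green."""
    grid = {}
    for i in range(SIZE):
        for j in range(SIZE):
            r = random.random()
            grid[i, j] = None if r < EMPTY_FRAC else ('red' if r < 0.55 else 'green')
    return grid

def neighbors(grid, i, j):
    """The eight surrounding cells (wrapping at the edges)."""
    return [grid[(i + di) % SIZE, (j + dj) % SIZE]
            for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0)]

def unhappy(grid, i, j):
    occupied = [n for n in neighbors(grid, i, j) if n is not None]
    same = sum(1 for n in occupied if n == grid[i, j])
    return bool(occupied) and same / len(occupied) < THRESHOLD

def step(grid):
    """Unhappy agents move to a randomly chosen empty cell."""
    empties = [loc for loc, c in grid.items() if c is None]
    movers = [loc for loc, c in grid.items() if c is not None and unhappy(grid, *loc)]
    random.shuffle(movers)
    for loc in movers:
        if not empties:
            break
        dest = empties.pop(random.randrange(len(empties)))
        grid[dest], grid[loc] = grid[loc], None
        empties.append(loc)

grid = make_grid()
for _ in range(50):
    step(grid)      # segregated clusters emerge even with mild preferences
```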
So the question is, do you find the Newtonian explanation
of elliptical orbits more satisfying than the Schelling explanation of racial segregation?
I think most people do, but I want to poke a little bit at why.
There are a few characteristics of classical science
that make us feel good about it, but it's not obvious that we are justified
in feeling good. So one of them is the appeal to a Universal
Law, in that case we were able to invoke
Universal Gravitation and I think there's something
about the fact that the same law applies to things moving on the Earth's surface,
and also to planets moving in solar systems, that makes us feel like that law has validity,
as opposed to a law that has very narrow scope. Okay.
There's a certain amount of mathematical virtuosity that I think impresses us,
and makes the work seem more real. I have a colleague in Physics that commonly
refers to the sort of thing I do as
"playing with computers." So, mathematics is real, playing with computers
is fun and games. On the other hand I don't want to bias this
too much, I think that one might have been a little
bit edgy. But legitimately, predictive power is something
we would like to see in a model or a theory, and that is the great selling point
of Newton's explanation of Celestial Mechanics. We can predict solar eclipses, we can predict
all sorts of things that are going to happen. That's not as true of Schelling's model.
And "proofiness." It feels like we've just done
a mathematical proof. Now, I would say, it's not a proof. We can't
prove a thing about actual physical systems.
We can prove things about mathematical abstractions, and we might hope that the proof
about mathematical abstractions is analogous to something in the real world, but that part's
just a hope.
On the flip side, thinking about Schelling's model,
the obvious characteristic of Schelling's model
is that you've just described human beings by a set of rules,
that says "I'm going to count the number of my neighbors and be happy
or unhappy depending on that count." Most sociologists would not be happy
with the reduction that you just made of human behavior. The other,
as I mentioned, is that you're just playing with computers.
And the last is, if models like this can't make detailed predictions,
that allow us to validate whether the model is correct or not,
then what kind of work can they do?
So this is the framework that I use when I teach the class,
and it appears in the book as well, I think this is a general model
of a lot of what we do when we model the world, which describes
scientific activity, but I would also argue
it describes how we think. Most of our thinking about the world
is in models like this where we have some physical system that we can observe,
we have either direct or indirect sensory data.
We then construct a model, sometimes implicitly, sometimes explicitly,
that is an abstraction of the real thing, meaning that we've left out details.
The nice thing about the model though, is that, unlike the real world,
we can write proofs in models, we can do analysis of models,
we can run simulations of models. What you get out of that are predictions
and explanations that you can then compare to whatever it was that you saw
in the physical system, and get some kind of validation out of that.
This is the framework at least that I propose to my students.
And then we can ask, what kind of work does Schelling's model do?
It's not a proof. It has very little predictive power.
I can't tell you, I can't use Schelling's model
to tell you how segregated Boston will be in the future.
But what it does provide is, kind of a logical argument.
It's an existence proof, or a kind of counter example
to the assumption that if you see segregation in a city,
that the people who live there must be racist.
That the only explanation of segregation is racist agents.
So the claim, I think, that Schelling would make is that this
might be sufficient to cause segregation, but it's not necessary.
You don't have to actually have people behaving like this in order to see
segregation as the outcome. And what that means is
that segregation, the fact that the city seems to be racist,
could be an emergent property. Meaning that the system as a whole has that
property, but the components do not.
The individual agents might not be racist.
I want to give one other example of this kind of model, because I think
a couple of examples will help us think about it.
I'm going to talk about the Six Degrees experiment
that Stanley Milgram ran. How many people are familiar with Stanley
Milgram?
Okay, if you're not, I highly encourage you to find out more about
him. I think he's fascinating in part
because he ran two out of the three most infamous experiments in all of social
science. One of them, that I just think you have to
read about if you haven't, is his study
of obedience to authority. This is one of the original
deceptive social behavior experiments where he brought people in,
and told them they were participating in a teaching and learning experiment,
where they would act as a teacher, and the "learner" on the other side of the
glass was actually an actor.
And the subjects were told to administer electric shocks whenever the "learner"
got the answer wrong. And they would start out
by giving a fifteen volt shock, and work their way up a board,
to four hundred fifty volts. And what would happen,
as you went from fifteen to four hundred fifty, two things would happen:
one, the warning labels on the board, would say things like mild shock,
severe shock, hazard, warning, danger, I don't remember exactly what it was,
but it was increasingly dramatic. Yellow and black, diamond and stripes,
everything. Also, the actor on the other side of the glass, would go from saying things
like "ow", or "hey stop that" to "I'm having a heart
attack, please stop!" to silence [hesitant chuckles] at one point.
The nature of the experiment was to see how far up the board people would be willing
to go under different circumstances. And, some of the circumstances, for example,
would be the "investigator" in a lab coat, also an actor, standing behind the subject
and saying things like "You must continue, the experiment requires you to continue."
And you probably see where the punch line of this is going, which is that many people
will go much farther up the board than you would like to believe about human
nature. In fact, something like six--in some conditions,
I forget the exact details, something like sixty percent of the subjects
went all the way to four hundred and fifty volts,
and were asking for more. And--
>> Male speaker 2: I think it was when the authority said that they were
assuming all responsibility.
>>Allen: Yeah, you're right. And there were different scripts and different layouts
of whether they could see the authority, or whether they could see the learner, and
so on. Fascinating stuff, especially because no one
will ever be allowed to run that experiment again.
[laughter]
>>Allen: Institutional review boards were pretty much created because of this;
they should be called the Stanley Milgram Memorial Institutional
Review Boards.
[laughter]
>>Allen: Anyway, fascinating stuff. Read more about it if you're not already familiar
with it. The other experiment that he did, that is
the one that I'm going to talk a little bit about is
"The Small World" experiment, where he wanted to investigate the structure of social networks
by seeing whether he could get a package delivered across the country, following only social
hops. So, I think they started in Wichita, Kansas,
and they had a couple of different targets, one of them was a stockbroker in Sharon, Massachusetts.
My father was a stockbroker in Sharon, Massachusetts, but not the target of the Milgram experiment.
[laughter]
>> Allen: The idea was that Milgram, or one of his associates
would give a package to someone and say, "I want you to deliver this to
this person" and they would be given the name, occupation, and where they lived,
but the rules are you can only give this to someone that you know personally.
So, if you know the target personally, you can just go give it to them, and you're
done, and otherwise you have to give it to someone
that you know personally. What he wanted to know was,
how many hops it would take to get from Wichita to Sharon, Massachusetts.
And this graph shows the distribution of hops for the packages that were delivered.
Now, not all of the packages were delivered, so there's a bias behind this, and that's
been the subject of a lot of discussion but the interesting thing about this is
that the mode is at six. It took six hops most often to get from source to destination,
and that's where the six degrees of separation term
comes from, alright?
So at the time, very little was known about the structure of social networks,
since then, it's been the subject of a lot of study,
and we have several good explanations of why the diameter of these social graphs
seems to be so much shorter than you would expect.
So that's the good thing that we have lots of explanations. The bad thing is
we have too many explanations, and they're incompatible.
So, there are two models that people have proposed:
simplified topologies that capture important elements of social networks.
Watts and Strogatz had what they call a Small World Graph, which is a parameterized
graph that interpolates between a completely regular graph on one
end and a completely random graph on the other,
and what they find is a sweet spot in the middle
that behaves like social networks, in the sense that it is highly clustered,
meaning that your friends tend to know each other,
but also has a small diameter, meaning that it has this small world behavior.
So that's one possible explanation.
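A quick way to see that sweet spot, sketched here with NetworkX's built-in generator rather than the book's own implementation:

```python
# A sketch using NetworkX (not the book's own implementation). As the
# rewiring probability p grows, the average path length drops long before
# the clustering does; that middle region is the small-world regime.
import networkx as nx

n, k = 1000, 10                      # 1000 nodes, each linked to its 10 nearest neighbors
for p in [0.0, 0.01, 0.1, 1.0]:      # p=0 is a regular ring, p=1 is essentially random
    g = nx.connected_watts_strogatz_graph(n, k, p)
    print(p,
          round(nx.average_clustering(g), 3),
          round(nx.average_shortest_path_length(g), 2))
```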
The other is Barabasi and Albert's model, which is a scale free network.
The way you build one of these is by growing it.
So if you start out with one, or a small number of nodes,
and gradually add nodes, with the property of preferential attachment,
meaning the rich get richer: new nodes are more likely to attach to nodes
that already have many edges. What you get is a long-tailed distribution of degree
in the network. Most people have a small number of friends,
but there are a few people who have a very large number of friends.
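Here is a quick sketch of that, again using NetworkX rather than the book's code:

```python
# A sketch (using NetworkX, not the book's code) of preferential attachment:
# grow a graph where each new node attaches to m existing nodes, favoring
# nodes that already have many edges. The degree distribution comes out
# long-tailed: most nodes have few links, a few hubs have very many.
import networkx as nx

g = nx.barabasi_albert_graph(n=10000, m=2)
degrees = sorted((d for _, d in g.degree()), reverse=True)
print("largest degrees:", degrees[:5])        # a handful of big hubs
print("median degree:  ", degrees[len(degrees) // 2])
```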
That graph also has the Small World Property, and so now, if you're trying to explain
the Small World Property, you don't know which explanation is right.
This is part of the reason in my class that I found myself inevitably
talking about philosophy of science,
because we just left the world of doing science, building models,
and using them to predict, explain and design. What we're now talking about is
theory choice. Why should I prefer one model, or theory,
over another? And so that led us to Thomas Kuhn,
who wrote The Structure of Scientific Revolutions, which is probably his most famous book.
If you're not familiar with that book in particular, you're almost certainly familiar with
Paradigm Shifts. That book introduced that term. So all the bad jokes
about paradigm shifts that we've had for the last, let's see,
since 1963 or 4. You can blame Thomas Kuhn for that
particular piece of vocabulary.
He also wrote an essay called "Objectivity, Value Judgment, and Theory Choice"
which is explicitly about the situation we were just talking about.
Two competing models, in some sense they both explain the data,
so you can't choose one over the other purely in terms of how well it fits data.
There have to be other criteria, and that's exactly what he explains.
Well, what are those other criteria? Both descriptively,
like if we watch scientists and listen to what they say,
what are the criteria that they seem to be applying?
And also normatively, which is, which of those criteria do we think
are justified? And which might be biased?
One of the reasons that the class inevitably ends up in philosophy was...
explained nicely by xkcd, which found that the structure of information,
at least as represented by Wikipedia, has a tendency to lead inevitably to philosophy.
And this came out of the experiment that if you take any article,
and click on the first link, and repeat that process,
you will eventually end up at "Philosophy." If you haven't read that cartoon,
I recommend it to you.
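If you want to play with the idea, here is a toy sketch of that click-the-first-link process; the link table is invented for illustration, not scraped from live Wikipedia.

```python
# A toy sketch of the "first link leads to Philosophy" game. The link table
# here is made up; the real experiment follows the first link of live articles.
first_link = {
    'Python (programming language)': 'Programming language',
    'Programming language': 'Formal language',
    'Formal language': 'Logic',
    'Logic': 'Reason',
    'Reason': 'Philosophy',
}

def follow(article, links, limit=100):
    """Follow first links until reaching Philosophy, a dead end, or a loop."""
    trail = [article]
    while article != 'Philosophy' and article in links and len(trail) < limit:
        article = links[article]
        if article in trail:        # loop detected
            break
        trail.append(article)
    return trail

print(follow('Python (programming language)', first_link))
```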
So I took some of the things that we were talking about in the class,
and made them into a running theme throughout the book.
So each chapter tends to raise a different issue
in philosophy of science, and then I tried to
write a little bit about it, just enough to introduce what I thought were interesting
questions and hopefully point you toward more reading.
I mentioned Theory Choice as one of the first topics
that comes up. The Demarcation problem is one of the others.
The Demarcation problem is: is there a justified definition
of science that lets us distinguish between things
that are real science, good science, and other things like pseudo-science,
things that seem like they might be sciencey, but
they're not? Or, say, religion, things that are not science
at all? And how do we clearly justify that definition?
Realism and instrumentalism comes up quite a lot.
This pertains to how we interpret the theories and models that we're using
to describe physical systems. I won't say too much
about that now, I'm going to come back to it
just a little bit later. Holism and reductionism was one of the other
topics that comes up. Again I won't say too much
about that right now.
But I will come back, as I said, to the ultra secret point of this book.
I started with this set of axes, at the beginning,
and I speculate, or at least my thesis is, that we are living through a transition
toward a new kind of science. I'm kind of borrowing that phrase
from Stephen Wolfram, but I mean it in a different sense than he
does. So, I'm going to have to violate
his trademark on that phrase. He is primarily talking about, well,
actually, tell you what, if you want to know what Wolfram is saying
when he says new kind of science, ask me after; let's not go down that.
What I mean by it is just what I was saying before,
the shift along these axes in what we think is science; what we think
is a good model; what we think is a satisfying kind of explanation; what
we think is good work for a scientist to do.
I think there are some other shifts going on at the same time
that are related. One of them is what kind of work
do we want models to do? With the Newton example, and the Schelling example, I wanted to show
what I think is a shift away from models that are primarily meant to be predictive. They were
often tools that we would use as computational devices,
for making predictions that had practical consequences.
I think there's a shift toward models that may be explanatory without necessarily
having predictive power. I mentioned realism and instrumentalism,
alright, let me finally say what I mean by that.
Let me throw out an idea. Are electrons real? By a common language use of the word real?
Alright, so I think we would all buy, or at least
I'll ask you to accept, that chair is real. Okay?
Unicorns are not real. So on that axis of real/not real,
[laughter] what's your feeling about electrons?
>>male speaker 2: Not real.
[muffled laughter]
>>Allen: Not real?
>>male speaker 2: They are merely an abstraction.
>>Allen: They might be a useful abstraction, and you can
certainly take that view. That would be an instrumentalist
interpretation of a theory involving electrons. Now, here's a trivial thing that you can do
with that theory, electrons, for historical reasons,
have negative charge. You could imagine a whole new
physics of the world where you just flip the sign
of everything. Okay? That new physics would be just
as good as the old physics, except that electrons, in some sense would be different. They would
have positive charge. And it's clear that you haven't done any work
by doing that transition, but it does suggest that the entities that
make up your model are, in some sense, arbitrary.
I could have...
>>male speaker 2: The charge is arbitrary, the numbers arbitrary [unintelligible]
>>Allen: Yeah. Right, so in that case, I haven't really done much,
but I could imagine another physics that postulated a different
set of entities. More different than just positive and negative charge.
And that other physics might be just as good as what we currently call physics.
It's not easy to make one up as a good example but it's at least imaginable.
Anyway, that discussion would take you down the road toward talking about
realism and instrumentalism. The other one is reductionism and holism.
I'm torn about how far to get into that one. Hofstadter talks about this in Gödel, Escher,
Bach. This is one of his figures that I kind of like because it's a nice
way to think of things that you would interpret differently at different scales. And that's
related to reductionism and holism. I think I'm going to punt on that for now, but we'll
give you a chance to talk about it later if you want to.
If there's a new kind of science going on here,
one could imagine there'd be a new kind of engineering
that goes with it. And I think there is. These, again, are some of the axes where I
think we're shifting over time. Away from engineered
systems that tended to be centralized toward decentralized
systems. Isolation to interaction: what I mean by that
is, that in a lot of engineering design, if you want to build
something complex, that involves many subsystems, you want to
be able to design each subsystem in isolation
without having to worry about all the others at the same time.
In the context of software, this is often described in terms
of abstraction and encapsulation, but it's a fundamental idea through all of engineering,
that if every component of your system depends on every other component, you very quickly
won't be able to design anything. It will collapse
under the weight of its own complexity. But that's increasingly, I won't say not true,
I'll say less true: we now have more tools for dealing with complexity, and that
allows us to relax a little bit the requirement that everything be very carefully isolated.
We can design things that have more interacting components now, and still be able to manage
that.
One of the ways of managing that is to replace classical analysis with computation. You can
design more complicated things if you throw
more computational power at it. One example of that,
does anybody know what that building is?
>>male speaker 2: The Guggenheim Museum...
>>Allen: The Guggenheim Museum in Bilbao, Spain.
Who designed it?
>> male speaker 2: Frank Gehry
>>Allen: Anything else around here also designed by Frank Gehry?
[laughter]
>>Allen: Stata center. Is it Stah-ta or state-a?
>>Male speaker 3: Stahta
>>Male speaker 4: State-a, I think.
>>Allen: Okay.
>>Male speaker 4: The guy's name is State-a, but the people who work there call it
Stah-ta Center.
>>Allen: Interesting. Okay. [Laughter]
>>Allen: So, one of the reasons they were able to build
that thing is that they used CAD software that had been
developed for designing boats, and Gehry's innovation
was to apply it to buildings. It would not have been
possible or feasible to design or build that prior to the
availability of that software, in the same way it wasn't
feasible to design or build the Eiffel Tower prior to
Eiffel's development of analytic techniques for designing and analyzing those types of
structures.
The other piece of this that I think is interesting is that, I think, increasingly
we will search for engineering solutions, rather than design them. And maybe
I'll just leave it at that.
One more piece of this, so I'm speculating on a new kind of science, a new kind of modeling,
a new kind of engineering, and here's a new kind
of thinking. Which is, I think, our fascination with
Aristotelian logic will gradually fade, and be
replaced by some kind of multi-valued logic. Being kind of a Bayesian myself, I'm
biased toward Bayesianism as my favorite multi-valued logic. But that inevitably
takes you on the road from imagining that your scientific theories are objective,
or that your beliefs about the world are objective, toward acknowledging that they're subjective,
and if they are, then what?
Subjectivity seems like a dangerous slope toward pure relativism, you know,
what's true is true for me, what's true for you
is true for you, and we can't talk to each other.
That doesn't seem to be the case, so it raises the question: can we have subjectivity without
complete collapse?
A couple of other shifts that I think are going on,
away from thinking of scientific theories in terms
of Universal Physical Laws, more toward theories and models.
I would argue that there's no meaningful difference between a theory and a model, that when you
tell me that you have a theory, what you're really
doing is recommending a model to me. And the language
that we would use for choosing one of those models
again goes back to subjectivity and theory choice.
The discussion that Kuhn brings up.
And the last is away from determinism and into indeterminism. So, I kinda blew through
a lot of this, in part, because, me standing up here,
and talking about it, is not the most effective thing.
Part of the reason that I wrote this down is that I think it's better to read a careful
exposition of it, than to get the thoughts off the top
of my head. But more importantly, what you really need
to do is wrestle with the stuff. So just reading
what I think isn't going to get you very far,
what I really want you to do is pick up the book,
and do some of the exercises, and, as I'll talk about in a minute,
I wanna get you to write a case study.
but just to finish off a couple of thoughts here,
I hope you'll read the book. I think the Complexity Science stuff might be fun for you, and I
think parts of it are probably familiar to each
of you, but it's unlikely that all of you are familiar
with all of it. So there might be some good stuff there.
The philosophy of science might be new to you,
unless that's a hobby of yours. The analysis of algorithms stuff is probably not that new
to many of you, at least the software engineers. And similarly, intermediate Python; those
are two topics that suited my class but may, or may not,
be the most important to you.
Alright, the case studies! So I mentioned that the students worked on
case studies, and we picked the best ones, and then we published
them. We're planning to do the same thing with the
next edition of the book. So if you're familiar with the
cookbook books that O'Reilly does, the Python Cookbook,
for example: they collected a bunch of short articles,
where someone would explain a recurring problem, and give a solution to it, and explain the
alternatives. They published a bunch of them and then there's
a webpage that collects submissions, and then every
once in a while, they seem to collect all the good ones off
the webpage, and that becomes the next edition of the book.
[coughing]
So, we're thinking of something similar, just because there's so much good stuff here,
and I didn't want to, and couldn't, cram it all into the book,
so I've left some for my students, and they've left some for you.
Some of the ones they've done so far, this is an ant trail model, which is another
agent based model. We have a very simple model of ant behavior
here, their decisions whether they will turn left or right, and
how they propagate out from the nest. The interesting thing about
that model is that a very simple system reproduces trails
that are at least qualitatively similar to actual ant
trails, but maybe more interestingly, different species
of ants, have different foraging patterns, and if you
track them, and look at the shape of those trails, they
have a distinctive texture to them. You could look at them and visually
identify which species it was, by the structure of
those trails. This model has a few parameters that you can
tune that will make the trails look like species
one, or species two, or species three, depending on how you tweak
it. So, it's a nice model.
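To give a flavor of that kind of model, here is a bare-bones sketch, not the students' code; the turn probability is a made-up stand-in for the parameters they tuned.

```python
# A bare-bones sketch of this kind of ant-trail model: ants leave the nest,
# occasionally turn left or right, and deposit pheromone as they go. The
# turn probability is a made-up stand-in for the tunable parameters; varying
# it changes the texture of the resulting trails.
import random
from collections import defaultdict

TURN_PROB = 0.2                 # chance of turning at each step (the "knob")
N_ANTS, STEPS = 100, 200
HEADINGS = [(1, 0), (0, 1), (-1, 0), (0, -1)]   # east, north, west, south

pheromone = defaultdict(int)
for _ in range(N_ANTS):
    x, y, h = 0, 0, 0                           # every ant starts at the nest
    for _ in range(STEPS):
        if random.random() < TURN_PROB:
            h = (h + random.choice([-1, 1])) % 4     # turn left or right
        dx, dy = HEADINGS[h]
        x, y = x + dx, y + dy
        pheromone[x, y] += 1                    # mark the trail

# The most-visited cells sketch out the trail structure.
print(sorted(pheromone.items(), key=lambda kv: -kv[1])[:5])
```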
Slime molds are another classic agent-based model, where the emergent behavior
of the system is different in character from the behavior
of the individual agents, and surprising. And the surprisingness might
be an important characteristic of emergent properties.
Another student project looked at the distribution of wealth. It's a common observation that in
a lot of economic systems, the distribution of wealth
quickly becomes long tailed. That seems to be hard
to avoid, and hard to fix. The system tends to gravitate to those long tailed distributions,
and you might wonder why. Sugarscape is, again, a very abstract, very unrealistic model
of an "economy" of a kind. A sugar economy. But it develops long tailed distributions
in ways that, I think, can be explanatory.
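Sugarscape itself has sugar fields, vision, and metabolism; here is a much smaller stand-in, a random-exchange sketch, that shows the same qualitative point about long tails. It is not the students' model.

```python
# Not Sugarscape itself, just a bare-bones exchange economy (a "yard sale"
# style model) illustrating the qualitative point: start everyone equal,
# let random pairs trade a random slice of the poorer party's wealth, and
# the distribution quickly becomes skewed, with a long tail of rich agents.
import random

agents = [100.0] * 1000                   # everyone starts with the same wealth
for _ in range(200000):
    a, b = random.sample(range(len(agents)), 2)
    amount = random.random() * min(agents[a], agents[b])
    agents[a] += amount                   # a is the (randomly chosen) winner
    agents[b] -= amount

agents.sort(reverse=True)
print(f"top 1% of agents hold {sum(agents[:10]) / sum(agents):.0%} of the wealth")
```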
Knots in Wikipedia: if you have a directed graph,
and you follow links through the graph, you might
find that you've arrived in a neighborhood that you can no longer get out of. [laughter]
And so the students were searching to see whether
the directed graph created by Wikipedia has knots in it.
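Here is one way you might search for knots, sketched with NetworkX on a toy graph rather than real Wikipedia link data:

```python
# A sketch of one way to look for knots: in the condensation of the graph
# (one node per strongly connected component), a knot shows up as a
# multi-page component with no outgoing edges, so once you're in, every
# link keeps you inside. Toy edges here, not real Wikipedia data.
import networkx as nx

g = nx.DiGraph([('A', 'B'), ('B', 'C'), ('C', 'B'),      # B <-> C traps you
                ('A', 'D'), ('D', 'E'), ('E', 'A')])

condensed = nx.condensation(g)            # collapse each SCC to a single node
for scc in condensed.nodes:
    members = condensed.nodes[scc]['members']
    if condensed.out_degree(scc) == 0 and len(members) > 1:
        print("knot (no way out):", sorted(members))
```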
The Norms game is related to the iterated prisoner's dilemma.
And then I had one student that took on a very
ambitious project that didn't quite get to the finish line,
that's not in the book, but they did a lot of very cool stuff,
with the evolution of virtual creatures. So the creatures
would have a genotype, that would describe their phenotype
in a simulated three-dimensional environment. And so, you would
evolve them over time, by having them interact in this simulated 3D world. Have some kind
of fitness contest to see who gets to be represented in the next
generation, and then follow the gene pool over time. What
they discovered, and lots of people have built versions of
this, is that the animals discover lots of innovative solutions to problems
like, gathering food, and hitting each other,
and locomotion and things like that. Including the ability
to exploit bugs in your simulation of the 3D world. [laughter]
>>male speaker 4: They discover magic.
>>Allen: That's right, they discover magic. [laughter]
One of the things that they discovered was that there was an error in the boundary condition
so that if your critter was very close
to the boundary, and kind of moving back and forth across the
boundary of the world, it would start translating along one of the
axes much faster than anything else in the world could move, and
they kind of discovered the transporter, and animals were exploiting
it. And when they were running away from something,
they would head for the boundary, and then zip along. [laughter]
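For a flavor of the overall loop, here is a heavily simplified sketch; the real project had 3D bodies and a physics simulation, and the bit-string genome and counting-ones fitness here are stand-ins.

```python
# A heavily simplified sketch of the evolve-and-select loop described here.
# The real creatures have 3D bodies and physics; the "genotype" below is
# just a bit string and "fitness" counts ones, standing in for the contest.
import random

def fitness(genome):
    return sum(genome)                       # stand-in for the simulated contest

def mutate(genome, rate=0.02):
    return [1 - g if random.random() < rate else g for g in genome]

population = [[random.randint(0, 1) for _ in range(32)] for _ in range(50)]
for generation in range(100):
    population.sort(key=fitness, reverse=True)
    survivors = population[:25]                              # winners stay
    children = [mutate(random.choice(survivors)) for _ in range(25)]
    population = survivors + children                        # next gene pool

print("best fitness after 100 generations:", max(fitness(g) for g in population))
```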
Okay, just a couple of thoughts about things I'm currently working on.
In seven days, I need to turn in the manuscript of Think Python,
and they're going to try and turn it around and make it available by July, so we'll see
how all that goes. This is the second draft of the cover.
I have to tell you a little bit of the story of the cover.
You might know that O'Reilly puts animals on all the covers
and a lot of people have wondered, who gets to choose
the animals. They actually have a person who works full time; that’s her job; she
picks the animals. [laughter] And they tell you, if you're an author,
they tell you very explicitly don't try to pick your animal.
We don't want to hear it. The most input that you get is that if you
can give a few words that describe the character of your book,
that will help the animal chooser choose your animal.
Although for my books, so far, they haven't even asked me for adjectives,
they've just given me an animal. Now, I've been happy so
far. I've got an Archerfish on the statistics book, which
is really cool. I don't know if you know about the Archerfish,
but it has a little groove in its throat and it shoots
a jet of water, out of the water, and it hits insects. So
if an insect is out on a branch, it will shoot this jet
of water, knock it off the branch into the water,
and then the Archerfish swims over and eats it.
And it turns out that it can shoot out of the water and hit things
about two meters away with high accuracy, sorta, you know, insect size
precision, despite the fact that its eyes are underwater
and the jet is going out into air, so it's got the refraction at the surface of the water
that throws off the aim, so it compensates for refraction and hits
things. So cool animal, really happy with that.
This is, I believe, a Black Eagle, now I forget; it's on the last page
of the book if you want to look it up. So, okay, it doesn't shoot water or anything,
but [laughter] it's, it's cool enough. The Python book, the first draft of the cover
came back with, all together now.
>>[all together] A python!
>>Allen: A python. Just like almost every other Python book,
with a couple of exceptions, so I wrote back and said,
"The Python's fine, and if you want to go with that,
no problem, but Python, the programming language, is named
after Monty Python. It's not named after the snake, so
every time someone puts a python on the cover of a Python book,
you're basically saying, “we didn't get the joke."[laughter]
So, so here's a suggestion, can we put a dead parrot on the cover?"
[loud laughter]
>>male speaker 5: A Norwegian blue.
>>Allen: A Norwegian blue, exactly right. So, and the only thing
I had to say was, the text at the end of the book
that describes the animal that's on the cover, I said, "Look, if you'll please put a parrot
on the cover, you could have a lot of fun with the text
at the end because you could basically lift lines out of the
sketch, 'A Norwegian blue, beautiful plumage isn't it?' That kind of
thing." So they went along with it, I believe this is not actually
a Norwegian Blue Parrot, because there's no such thing. [laughter]
But as far as I'm concerned, that's a Norwegian Blue Parrot.
[laughter] I haven't seen the text that they've
written yet, but I'm going to try to convince them
to actually say it's a Norwegian Blue Parrot and see if people get it.
Projects that are up next, is anybody familiar with The Little Book of Semaphores? I don't
know if that's crossed anybody's path. I wrote a book a number of
years ago where I took all the fun, synchronization
puzzles, like readers/writers, but also the Santa Claus
problem, Dining Philosophers, what are the other fun ones? There's Dining
Savages. If you took an operating systems class, you
might have done one of these.
>>male speaker 6: The Banker?
>>Allen: Which one?
>>male speaker 6: The Banker?
>>Allen: Oh yeah, now that's the deadlock one, right?
>>male speaker 6: Yeah.
>>Allen: Yeah, no, I didn't do a lot with deadlock,
but anyway, collected a bunch of synchronization puzzles,
and I'm thinking of turning that into a book called
Think Synch, and I'm also thinking about Linear Algebra.
So there might be a Think Linear, at some point.
If you want to follow the progress, [clears throat]
Green Tea Press, is me. Green Tea Press is where
I publish the first drafts of all my books, and then see if I can find a publisher
that wants to work with them. But while I am
developing them, they're all available under free
licenses. For a while I was using the GNU Free Documentation
License; lately I've been using a lot of the Creative
Commons licenses, but the nice thing is that you can
modify these books; you can translate them into other languages;
you can pull chapters out of several different books
and recombine them. So there are a lot of cool projects
that I think come from making material available like that.
The Python book, which is the origin of the Statistics book
and the Complexity book, was originally a Java book
that somebody else translated from Java into Python, and sent
it back to me, and I actually learned Python by reading my own book.
[laughter] Which was very strange. [laughter] Because I recognized
my writing [laughter] but it was telling me things I didn't know.
[laughter] And it's kind of like those time travel plots [xylophone sounds]
where the you from the future comes and tells you things. That's exactly it. [laughter]
So I've had that experience. So, yes, check out the free books,
read Think Complexity, write a case study. I think that's all I wanted to talk about
for now, but I'm happy to take questions, or hear comments,
or.
>>male speaker 7: My question is that I have to interview new candidates
at Google, often around, you know, algorithms and complexity.
Do you have good questions? Because, you know, most of them are banned now.
>>Allen: Right. [laughter] I'll have to think about that,
because here we're taking a kind of different approach
to it, and that might actually lend itself to the next generation of algorithm design
questions. 'Cause it used to be kind of a test of the
canon, there: how much of Cormen, Leiserson, Rivest, and Stein
have you absorbed and can deliver on request? Usually applied
in non-trivial ways, so I mean, the thing I like about the
questions is a lot of what you learn by memorizing the
data structures book, you might not be able to apply in flexible
ways to a new problem. I think that is what interview questions are
really testing so I think that's the good part. But, I'm kinda,
I'm getting away from this canonical idea that you have to know three
kinds of self-balancing trees, or that
you have to know five implementations of a queue and all the
pros and cons of them.
>>male speaker 8: That's because we design by search now.
>>Allen: Because we design by search, yeah. So, I mean,
I think there is a different tool kit now, and a different
set of knowledge. I think there are a number of sufficiently
solved problems that we kinda don't really need to think much
about their implementations, at least a very small number
of people will, the vast majority of us will now work above
that level of abstraction, so let's put some of that stuff aside
and think about what are the interesting and hard questions
now at that level. But I don't have any great ones
off the top of my head, but I love the question.
>> David: You talked about moving from, the predictive to the explanatory,
>>Allen: Um hmm.
>>David: But we still need predictions.
>>Allen: Sure
>>David: Even in the complex worlds, in truly learning
new things.
>>Allen: Sure.
>>David: If it's true science is moving from here to there,
but I take it you're saying, we're able, if we move over here,
to discover more new things per decade than if we just stayed where
we were, where we kind of mined the kind of things that gave us [unintelligible] mechanics,
>>Allen: Yes.
>>David: But in truth, you're giving us a model of science,
you're saying we're likely to get more discoveries per decade
if we move over here, but we need a prediction about how many more?
For example, if I had one of these scale-free network graphs,
and I built the largest social network in the world,
is that worth one hundred billion dollars? [laughter]
>>Allen: So yes, we still need models that are capable of prediction,
I think, when I'm describing that shift, I think two things are going on.
One of them is we are relaxing the requirement that every model be predictive, and admitting
that there can be good science at the other end of the spectrum. That, if,
there's lots of good work to be done there, even if we relax the requirement
to be predictive. And the other is that we can be a little more
fearless about attacking large, complex, heterogeneous
systems that, I think for a long time we would have avoided
because we knew we wouldn't be able to make any progress on them, with the
tools that we had. Now, with a different set of tools, we can
start to attack a bigger set of problems.
Yeah?
>>male speaker 10: I had a question about the trend towards instrumentalism.
>>Allen: Mm hmm.
>>male speaker 10: When we talked about whether electrons exist,
I was kind of reminded of Zeno's school of paradoxes.
One of the paradoxes was the millet paradox, right?
>>Allen: I don't think I know that.
>>male speaker 10: Why does a bag of millet when dropped make a sound, when a single pellet
of millet when dropped does not make a sound? You know, I think this speaks both to their
summation and their use of perception as a means of
determining reality. All right, so the Greeks said "well,
sound doesn't exist unless my ear can actually hear it." Right,
an electron, in my understanding, is real in the sense
that you can prove, what is it, Thomson's experiment, the
thing about the droplets moving at these discrete velocities, and that there's
a quantized charge. And so, one can assign quantification to that,
so are we moving towards a trend of other things being considered,
like reified things, like a traffic jam or some type of common phenomenon,
as some sort of real, ya know, as part of your paradigm shift?
>>Allen: Right, okay. So I think the Zeno example
is one of those cases where the two endpoints are clear, but something
must change in the middle, and we have a really hard time saying where it changes in the middle,
and I think realism and instrumentalism has that character
as well. There are some examples where the vast majority
of us feel entirely comfortable saying, "Yup, that's real." At the other end
of the spectrum are things where we all, at least, have a consensus
that they're not. So there has
to be a transition somewhere in the middle. Where do we put it?
One way to resolve that is to accept as real any entity postulated
by a theory that we use. So if it's a good theory,
then the entities that it postulates are real. In other words, you just expand the space
of what you're willing to call real, to make the conflict go away. The other is
to go all the way to instrumentalism, and say that all of our theories are tools
that we use to explain stuff, and we're not obligated
to consider any of our postulated entities as real.
Even things that seem obviously real, like nice firm objects,
which at the meso scale of physics have nice reality to them,
start to get fuzzy if you go to very small scales.
Now, with nice solid objects, I think we have a hard time with that.
One of the examples I like to use is mushrooms. When you look at a mushroom, what you're
actually looking at is the fruiting body of a fungus.
And that thing that's discrete and we pick up and eat, seems
to have nice realness to it. But that organism, that you
just harvested, is actually a very fine network of cells that
is mostly underground. The underground part might be hundreds or
thousands, or millions of times more massive than the thing you just picked
up. So in the sense of being big,
and concrete, it's real, but it's also really diffuse. It doesn't have borders that
you can track. You couldn't dig it up;
you would end up digging up an acre of land. And you couldn't separate it from the soil
that it's part of, so it has some of the characteristics that we associate with nice, concrete, well
delineated objects, but not all of them. That's one of those
weird middle test cases, that I like to think about.
One of the other ones that's a little bit fun is
trying to figure out which atoms do you consider to be you.
[laughter] Okay, one of them is, if you weigh
yourself, you get the total mass of the stuff that you consider to be
you, but that includes cells that have your DNA,
those seem like they're probably you. But it also includes a lot of dead stuff,
that may or may not be you, it includes a lot of other organisms,
like the bacteria that live in your gut, and the mites that live in your eyebrows,
and all kinds of other things, that again, you know, so are you a symbiotic colony of
all those things? You could have a lot of fun with this.
Sorry if I'm wandering around a little bit. Yes?
>>David: We have time for one more question if anyone has one.
>>Allen: Great, yes?
>>male speaker 11: One thing, this is going to be a vague wandering question.
>>Allen: I will give you a vague wandering answer.
>>Male speaker 11: The problems that human brains seem to have
dealing with complexity, one of the main manifestations of that
is very smart people make very stupid mistakes. NASA's been pretty good at that with centimeters
versus inches, things crashing.
And then there's the sort of recurrence of the number a hundred and fifty, which is roughly the
size of a village, and the number of social connections. Do you
think there are limitations, physical limitations even,
to our brain in dealing with increasing complexity?
Are there ways out of that?
>>Allen: Right. Umm, it does seem clear that there are aspects
of human psychology that influence what we consider
to be good theories and bad theories, and also
our decisions about real and not real, and all that.
And that does seem to be a consequence of the environment that our brains evolved
in. And it comes with a certain set of capabilities,
things that we're really good at, and also as you said, certain types of lapses,
where things like engineering systems tend to poke at the exactly spots in our brains
that are kind of soft. So I promised a vague and sort of wandering
answer, and I think that might have been it. [quiet
laughter] So I think the challenge then for theory choice
is to distinguish between what are the preferences we have,
just because that's how we are, versus preferences that we can justify in terms of properties
such as being truth tropic. You know, I think this is more likely to lead
toward more truth, you know, I think this theory is better than
that one, not just because my monkey brain tells me
that it's better, but because I've got a justified reason to
believe that it's more likely to lead toward truth,
be good for prediction, whatever the other characteristics are that
we have for a model.
>>David: Alright.
>>Allen: Thank you very much.
[Loud applause]