QUT Seminar: Acoustic indices for ecologists

Uploaded by TheQUTube on 09.02.2012

Michael: Ok, so Stuart and Jason have set the background for some of the results which
I want to show you. I started on this work in 2007 when Paul took me on, and very early
on, Stuart and Peter had a workshop in this room actually and that was my first introduction
to acoustic indices. And it was the AHQI, Acoustic Health Quality Index, which I notice
now Stuart’s renamed. But the idea was that, if you made an assumption that, in the low
frequency portion of the spectrum, most of the sound is due to human sources, and
in the higher part of the spectrum it’s due to biological sources, then you can take
some ratio of the two and so measure the acoustic health. And then that got me thinking “well
what other indices might be possible”, because there’s a, I think Stuart will agree, there’s
a problem with this particular index, and that is that of course you can have frogs
and whatever in the low frequency, and you can have technological sources in the high
frequency part of the spectrum, so that can create problems with this particular ratio.
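
As a rough illustration of the ratio idea Michael describes, the sketch below sums the energy above and below a cut-off frequency and divides one by the other. The function name `band_energy_ratio`, the 2 kHz cut-off and the toy spectrum are all assumptions made for illustration; the talk does not give the actual definition of the index.

```python
def band_energy_ratio(freqs_hz, power, split_hz=2000.0):
    """Ratio of energy at or above split_hz to energy below it.

    split_hz is an illustrative assumption, not a value given in the talk.
    """
    if len(freqs_hz) != len(power):
        raise ValueError("freqs_hz and power must have the same length")
    low = sum(p for f, p in zip(freqs_hz, power) if f < split_hz)
    high = sum(p for f, p in zip(freqs_hz, power) if f >= split_hz)
    if low == 0:
        return float("inf") if high > 0 else 0.0
    return high / low


# Toy spectrum: four frequency bins with made-up power values.
freqs = [500.0, 1500.0, 3000.0, 5000.0]
quiet_site = band_energy_ratio(freqs, [1.0, 1.0, 3.0, 1.0])

# The weakness raised in the talk: a loud low-frequency frog chorus
# lowers the ratio even though the extra sound is biological.
with_frogs = band_energy_ratio(freqs, [5.0, 1.0, 3.0, 1.0])
```

In this toy case the frog chorus pulls the ratio down, which is exactly the kind of misreading that makes a fixed frequency split unreliable.
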
So, the good old engineering approach is just to see what’s the signal-to-noise ratio, and how that
applies to an acoustic landscape. This is a portion of an acoustic recording. In the
top there you can see the dark red. That’s a koala call. It’s a real koala call.
(Audience chuckles)
This one here is an artefact caused by a mobile phone noise. So this is low frequency and
this is high frequency. Time is through there. And so, and through here we have continuous
cicada noise. And the way I want you to think about this is that, you can picture this as
a landscape in which we have some topography. That’s actually the land itself and then
dotted on the landscape we have these acoustic events like trees or forest on top of the
actual land. And it would be nice to make a distinction between the background, essentially
that’s the noise, and the signal, that’s the events which are dotted on this acoustic
landscape. So, to attempt to gather that Stuart had already added some of his data - it was
getting towards this kind of thing. So this is the acoustic landscape for a site near
Brisbane Airport, averaged over, so the ‘x’ axis is midnight to midnight, averaged over
15 days in October 2007. The top graph is the background noise, and the bottom graph
is the number of acoustic events that, if you like, are dotted in the landscape. Near
Brisbane Airport, you’ll see that…, probably easier if I point, we have a lot of acoustic
energy in that low frequency area, and that’s because there is a motorway nearby, so that’s
essentially reflected motorway noise. This here is a very strong population of cicadas,
and you’ll notice that they’re temperature sensitive, because the frequency increases
as the temperature gets hotter. And, the other thing about those is that, although there
are individual cicadas which are close to the microphone, there are
so many of them dotted through the landscape that it becomes part of the background noise
of that particular location. On the other hand, you’ll see that we have a very, in
the lower graph, where we’re now looking at acoustic events. So this is actually things
that are dotted, if you like, acoustic events that are dotted on the acoustic landscape.
Very strong morning chorus and quite a strong evening chorus. And this activity through
here would be birds, and this activity here is actually cicadas, which are close to the
microphone, which stand out as individual events. So that’s an acoustic energy map
for that particular location, and it’s quite meaningful as Stuart said. You can interpret
these things: if you know what’s there, it makes perfect sense, so you should be able
to make predictions about other landscapes. Another way of representing the number of
acoustic events, in the bottom-right there. So it just shows you the remarkable nature
of the morning and evening chorus at that particular location.
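
One way to picture the split between background and events is sketched below: take a low percentile of the frame energies as the background level, then count each run of frames that rises a few decibels above it as one acoustic event. The percentile, the 3 dB threshold and the function name `background_and_events` are assumptions for illustration only; the talk does not say how the background or the events were actually computed.

```python
def background_and_events(frame_db, threshold_db=3.0):
    """Return (background_db, event_count) for a list of frame energies in dB.

    Background is taken as roughly the 10th percentile of the frame
    energies; both that choice and threshold_db are assumptions.
    """
    if not frame_db:
        raise ValueError("frame_db must not be empty")
    ordered = sorted(frame_db)
    background = ordered[len(ordered) // 10]
    events = 0
    in_event = False
    for value in frame_db:
        above = value > background + threshold_db
        if above and not in_event:
            events += 1  # a new run of loud frames starts here
        in_event = above
    return background, events


# Made-up frame energies: a steady floor with two bursts on top of it.
frames = [10, 10, 11, 20, 21, 10, 10, 25, 10, 10]
background, events = background_and_events(frames)
```

Under this sketch a dense, continuous cicada chorus raises the background level instead of adding events, while a single cicada close to the microphone shows up as an event, which matches the distinction drawn in the talk.
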
So, “Hypothesis”, that if we wanted to compare environments could we say that a city
environment will have high intensity background noise, but be poor in acoustic events. A natural
environment would have low intensity background and rich in acoustic events. And I had the
good fortune to be able to compare 3 different environments. The top-left is St Bees Island,
so that’s a, if you like, undisturbed site. In the middle is the Brisbane Airport site which I’ve just
shown you, and bottom-right is the Brisbane CBD. Actually each of these graphs makes perfect sense, if you understand the
environment. Probably I don’t need to go a lot into it but, here we have a form of
cicada through here which is not temperature sensitive, like this one is here. And Ian
told me that’s because it is a cicada that lives underground so maybe we mentioned it
once. (Audience mumbling)
Peter: Did you see it? (Audience chuckle)
Ian: [Reply inaudible]
Michael: Just looking at this graph here, and bearing in mind that this is Margaret
Street. And we have in the low frequency area quite a high intensity of noise around 6:00
and 8:00 in the morning, but not the equivalent in the evening. And so the question is, why
would that be?
Peter: It’s a one way street. (audience surprised and laugh)
Ian: Well done Peter!
Michael: 130 I.Q. straight up. (Audience laugh). Aw yes, ok, so this is a little bit harder
question, again background noise, not much in the morning here, but noticeably more background
noise in the evening or the afternoon. This is a particular site. I suppose you should
know that St Bees Island is a clue.
Kris: Jet skis?
Michael: Ah, I don’t think so.
Michelle: I was going to say coastal winds always pick up in the afternoon -
Michael: Yes, it’s coastal winds, so…
Michelle: - and they break off at midnight or 2:00am.
Peter: Yeah.
Michael: So, that’s just to give you the sense that these diagrams are interpretable
and that they make good sense. The only other thing I’ll point out from these is that
as far as the hypothesis is concerned about the city, well, rich in noise, or high intensity
background noise, and poor in acoustic events, compared to this landscape, but actually St
Bees is surprisingly, I expected more. It’s there, the birds are singing, but these are
averages over 15 days in each case, so the averages tend to smooth things out. So what
you’re seeing in these graphs is only what’s persistent day after day after day, so it’s
a statistic. Like all statistics it hides the detail.
Peter: So when you say poor in acoustic events, the diversity of them there - the diversity
of the frequencies. Is that what you're saying?
Michael: Poor in acoustic events means that, on average, there's less happening at this
site (St. Bees Island), than there is at this site (Brisbane Airport).
Peter: Ok.
Michael: Again, like all statistics they only make any sense when you compare them. So,
uses of acoustic indices. What we've been talking about is acoustic indices in terms
of health, acoustic signatures, that was the word I was looking for. And you could have
a bench-mark site or reference site, and one can see how this could contribute to habitat
hectares and habitat condition, measures of environmental health. Ok, we can use these
indices to detect non-biological events, and some of this work that's being done to detect
rain, wind, thunder events. Why would you want to do that? Because, these recorders,
you can hide them under the undergrowth in the bush somewhere, whereas if you leave a
weather station out in the open, in public places, it's probably going to get damaged.
And it's useful to know when it's raining because you can cut a lot of that out, and
also for example, frogs calling after rain - and it's good to know when the rain is.
So that's its use, and finally I want to talk about another use for acoustic indices, and
this is what Jason has set the background for, is smart sampling of very long recordings.
So, as Jason has made it quite clear, you can't possibly listen for 3 weeks continuous
recording, and that's the sort of thing we're getting. So there just has to be some way
of sampling that intelligently to get the maximum leverage from those recordings, and
one could imagine sampling for different reasons. Like, you want to give samples to Birds (Australia),
or you want to give samples to Citizen Scientist Annual, or whatever it might be - the project.
So this is what I've been doing for the last few weeks, and this is a graph of the data
which Jason's got. He showed you several days of this. This is just one day, I think 13th
of October last year, and that's 24 hours of recording broken into minute chunks, and
for each minute the taggers have counted the number of unique bird calls in that minute.
And that's what the profile looks like over the day. And so, the question is "well how
can we identify the most species with the least number of samples" and we can see why
dawn-plus-3-hours is a good choice because that gives you this portion here, where you're
most likely to have the most different species. So, what I did was, to extract from each one
of those minute recordings, 13 indices, and those are plotted as these tracks here. So this is
midnight to midnight. And to give you an idea of how this is working - the bottom track
here is the species count - so that previous graph that I showed you. The intensity of
this black line here is proportional to the height of the line, so this is the morning
chorus here, and then this is night time here - there is no activity.
Paul: There's one thing: give us an idea about the variation in species Mike, because that's
the trick isn't it? That’s what we’re after really?
Michael: Yes, if I try and go through all these indices, you'd probably get bored too
quickly. Couple of things, ok, some things that I'll point out. This third track here
is signal-to-noise ratio - this one through here. So that's what I was attempting to plot
in those other graphs I showed you before. The signal to noise ratio is high in the morning
chorus, and then there will be particular places through here. And this is not particularly
informative this signal to noise ratio, simply because it depends upon whether some bird
is squawking close to the microphone or not, so it's not telling you anything deep about
the particular location. This one's interesting here - you'll notice that when you have this
intense activity, acoustic activity, here, and that marks the end of the day's activity
for the birds. This is the bird counts through here, and that's a cicada chorus, which accordingly
terminates the day. What you'll see here is that most of these indices have low values
during the night. In other words, there’s not much acoustic energy and most of these
indices are related to acoustic energy, but we have 3 here, which are reversed, and these
are measures of entropy. That's the statistic - the measures of entropy. You understand
entropy in this context as the degree of dispersal of acoustic energy, either through time or
through the spectrum. So high dispersal of acoustic energy will give you this black line,
and low dispersal, concentration of acoustic energy, in particular calls or particular
times is low entropy. So as we would expect we have high entropy at night, and less during
the day, but you'll notice during here, during the cicada chorus, because it so dominates
the landscape, the acoustic energy's very intense, but it's very dispersed, so we have
high entropy here as well. The question which Paul asked about, is there any measure of
diversity of different spectral types in the landscape, acoustic landscape, and that's
what I attempted to capture in this third from the bottom. This one through here, and
it seems to be pretty good, obviously we've got, so this is the number of different spectral
types, how the spectrum varies in that minute. And we have lots of spectral variation through
here, and not much at night, and so anyway we got 13 indices, and the trick was to see,
can we use these indices to direct our sampling of very long recordings so we're starting
to pick up places where we're likely to get bird calls. That's the point of all of that.
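
The entropy idea can be made concrete with a small sketch. It treats the energy in each time frame (or each frequency bin) as a probability distribution and computes Shannon entropy, normalised so that evenly dispersed energy gives a value near 1 and energy concentrated in one place gives 0. The normalisation and the name `normalised_entropy` are assumptions; the talk does not give the formula used for the three entropy indices.

```python
import math


def normalised_entropy(energies):
    """Shannon entropy of an energy distribution, scaled to the range 0..1.

    energies: non-negative energy per time frame or per frequency bin.
    """
    total = sum(energies)
    n = len(energies)
    if n < 2 or total <= 0:
        raise ValueError("need at least two bins and some energy")
    h = 0.0
    for e in energies:
        if e > 0:
            p = e / total
            h -= p * math.log(p)
    return h / math.log(n)


# Evenly dispersed energy, as in a quiet night or a wall-to-wall cicada chorus.
dispersed = normalised_entropy([1.0, 1.0, 1.0, 1.0])

# Energy concentrated in one bin, as in a single loud call.
concentrated = normalised_entropy([1.0, 0.0, 0.0, 0.0])
```

Because the measure only looks at how energy is spread and not at how much there is, a quiet night and an intense but uniform cicada chorus can both score high, as described in the talk.
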
Ok, so just to give you an example of how we might use this information. On the left
hand side we got that same graph of the number of species counts. On the right hand side
in this case is that entropy track, and you'll see that entropy is high at night and low
in the day, as a measure of dispersal of acoustic energy. We plot those 2 together, and you'll
see that, ok, you can fit a line through there, exponential decay, and it's got an R² value
of about 0.5. And I think that's bad.
Ian: (laughter)
Michael: Ian seems to think that's not so bad. I'll stay out of it. But-
Peter: n=?
Ian: Well, n = 1400, that's quite healthy. (laughter) Umm, the point is that, to find
a simple relationship between these indices and a species is not going to work. But can
we use this information in a list to guide scientists? Can we get some leverage from
that? And the answer is we can. So coming back to the problem again: "How to detect
the most different bird species with the minimum effort?". And, so let's just concentrate on
getting “s75”. So, "s25, s50, s75" means the number of samples required to get 25%, 50% or 75%
of the identified species. So s50 is like a “t(1/2)” value for radioactive decay,
but counted in samples rather than time: the number of samples to get 50% of species.
And so let's forget 100%. Jason thinks that we should be going for 75%, and that's not
a bad deal. So this is the same graph, and because we can repeat random sampling many
times, we can get some sense of the standard deviation. That's what you might expect, sampling
many times. So this is best possible deal. This is the good bench mark. And then this
is the informed sampling strategy, where you sample only in the 3 hours after sunrise. The difference being
that if you only sample in the 3 hours after sunrise, you'll never get 100% of the species, whereas
you obviously can if you use some other strategy. So the question is: "Can we do better than
this informed sampling strategy where you sample sunrise plus 3 hours?", using the indices
which I showed you before.
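
The random-sampling benchmark can be sketched as follows, assuming the tagged data is available as one set of species names per minute (a hypothetical layout, as are the function names). `samples_to_reach` counts how many one-minute samples are needed, in a given listening order, to find a chosen fraction of all species in the recording; `random_sampling_stats` repeats that over many random orders to get a mean and standard deviation.

```python
import math
import random
import statistics


def samples_to_reach(order, minute_species, fraction=0.75):
    """Samples needed, listening in `order`, to find `fraction` of all
    species present in the recording; None if the order never gets there."""
    all_species = set().union(*minute_species)
    # Small epsilon guards against floating-point overshoot before ceil.
    target = math.ceil(fraction * len(all_species) - 1e-9)
    found = set()
    for count, minute in enumerate(order, start=1):
        found |= minute_species[minute]
        if len(found) >= target:
            return count
    return None


def random_sampling_stats(minute_species, fraction=0.75, trials=200,
                          seed=0, window=None):
    """Mean and standard deviation of samples_to_reach over random orders.

    window: optional list of minute numbers to restrict sampling to,
    for example the minutes in the 3 hours after sunrise.
    """
    rng = random.Random(seed)
    minutes = list(window) if window is not None else list(range(len(minute_species)))
    results = []
    for _ in range(trials):
        order = minutes[:]
        rng.shuffle(order)
        n = samples_to_reach(order, minute_species, fraction)
        if n is not None:
            results.append(n)
    if not results:
        return None  # the window never contains enough species
    return statistics.mean(results), statistics.pstdev(results)
```

Restricting `window` to a few minutes shows the limitation mentioned in the talk: if some species never call inside the window, the target can be unreachable and the function returns `None`.
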
Paul: So, that’s saying that I can take 100 x 1 minute samples randomly, plus around
sunrise, and I’ll get 75% of the species?
Michael: On average, yeah.
Paul: On average.
Michael: Yeah.
Paul: I mean that’s pretty good. That’s 2 hours, isn’t it? But still, that’s better
than the “birdoes” are doing right? For the same amount?
Michael: Yes, well that’s the point that Jason was making. Yeah, but the question is,
it’s still random sampling, can we use the indices to do better than that? That’s what
I’m getting at. And the last graph, I think it’s the last graph. And essentially the
answer is, well we can do better. So this is the benchmark, random sampling over 24
hours. This is sampling randomly in the 3 hours after sunrise. And this is taking just 1 of the best
indices which I found, simply ranking the 1 minute samples by the value of that index,
and sampling in that order. And it’s the number of active segments, so essentially
you’re just counting the number of acoustic events in the one minute.
And if you just do something simple like that, you can already, to get 75% of the species
present, you’ve already cut your sampling down from 100 to 40/47. Now, if you do something
else, which is, instead of just relying upon one of those indices to guide your sampling,
you start to take some weighted combination of those indices. Then, I just played around
yesterday for some time and the best I got down to was 32. So essentially what we’re
down to now is, in theory, 32 samples will get you 75%. That’s half an hour’s work.
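
A minimal sketch of index-guided sampling, under assumed data shapes: each minute has a row of index values, the minutes are ranked by a weighted sum of those values, and the listener works down the ranking. Ranking by a single index is the special case where one weight is 1 and the rest are 0. The function names, the two toy indices and the species sets are all made up for illustration and are not the data or weights from the talk.

```python
import math


def weighted_order(index_rows, weights):
    """Minute numbers sorted from highest to lowest weighted index score."""
    scores = [sum(w * v for w, v in zip(weights, row)) for row in index_rows]
    return sorted(range(len(scores)), key=lambda m: scores[m], reverse=True)


def samples_to_reach(order, minute_species, fraction=0.75):
    """Samples needed, listening in `order`, to find `fraction` of all species."""
    all_species = set().union(*minute_species)
    target = math.ceil(fraction * len(all_species) - 1e-9)
    found = set()
    for count, minute in enumerate(order, start=1):
        found |= minute_species[minute]
        if len(found) >= target:
            return count
    return None


# Toy data: per minute, [number of acoustic events, an entropy-like value].
index_rows = [[0, 0.9], [2, 0.5], [5, 0.3], [4, 0.2], [1, 0.8], [3, 0.4]]
minute_species = [set(), {"a"}, {"a", "b"}, {"c", "d"}, set(), {"b"}]

# Rank by the event count alone, then compare with plain chronological order.
ranked = weighted_order(index_rows, [1.0, 0.0])
s75_ranked = samples_to_reach(ranked, minute_species)
s75_chronological = samples_to_reach(range(len(minute_species)), minute_species)
```

In this toy case listening to the busiest minutes first reaches the 75% target in fewer samples than working through the recording in order, which is the effect the talk reports on real data.
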
Audience: 1 hour and a half.
Michael: Well if you take 30 samples, but then of course you’ve got to listen to them…
so essentially, let’s put it this way, that with smart sampling, you only need to listen
to ½ an hour of recordings in order to get 75% of the species present. Now where this
is going, is that this is an optimisation problem. In other words, it’s just essentially
“how do I weight these indices?”. And there are a few other parameters thrown in there,
where we have tried to find some optimum value. So let’s say you’ve got 10 parameters
to optimise, and so this can now go very fast since all the information is stored in the
database, and we could possibly get this down to 20 samples. But the one catch is that when
you optimise your sampling regime for this particular 24 hours of data, there
is no guarantee that the same combination of the same weight parameters is going to apply
for the next day. So the next step is to get another day’s data, another 2 days’ data,
optimise the parameters on Day 1, and then test it on Days 2 and 3. And you’ve
got a…(better result). My guess is, just from working with this sort of training and test
data before, that we could perhaps optimise, on a particular day, get down to 20 samples,
but when you go to another day, you’re probably back up to 30 or 40. And at the moment,
using the indices we got, that’s probably about what we’ll manage. But that’s not
to say that there aren’t other indices and other tricks that, you know, that I could
try. It’s just these ideas, they all take time. And I think that’s it. That’s it.
Audience: (Applause)
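
The train-on-one-day, test-on-another step described near the end of the talk could be sketched as follows. The weights are tuned by exhaustive search over a small grid on Day 1 and then applied unchanged to Day 2. The grid search, the function names and the toy data are assumptions for illustration; the talk does not say which optimisation method was used.

```python
import itertools
import math


def rank(index_rows, weights):
    """Minute numbers sorted from highest to lowest weighted index score."""
    scores = [sum(w * v for w, v in zip(weights, row)) for row in index_rows]
    return sorted(range(len(scores)), key=lambda m: scores[m], reverse=True)


def samples_to_reach(order, minute_species, fraction=0.75):
    """Samples needed, listening in `order`, to find `fraction` of all species."""
    all_species = set().union(*minute_species)
    target = math.ceil(fraction * len(all_species) - 1e-9)
    found = set()
    for count, minute in enumerate(order, start=1):
        found |= minute_species[minute]
        if len(found) >= target:
            return count
    return None


def tune_weights(index_rows, minute_species, grid=(0.0, 0.5, 1.0), fraction=0.75):
    """Try every weight combination on the grid; return (best_weights, samples)."""
    best = None
    for weights in itertools.product(grid, repeat=len(index_rows[0])):
        n = samples_to_reach(rank(index_rows, weights), minute_species, fraction)
        if n is not None and (best is None or n < best[1]):
            best = (weights, n)
    return best


# Toy Day 1 (training) and Day 2 (test) data with two indices per minute.
day1_rows = [[0, 3], [5, 1], [4, 0], [1, 2]]
day1_species = [set(), {"a", "b"}, {"c"}, {"a"}]
day2_rows = [[3, 0], [0, 4], [1, 1], [2, 2]]
day2_species = [{"x"}, {"y", "z"}, set(), {"x"}]

best_weights, day1_samples = tune_weights(day1_rows, day1_species)
day2_samples = samples_to_reach(rank(day2_rows, best_weights), day2_species)
```

With this made-up data the weights that look best on Day 1 need more samples on Day 2, which is the overfitting risk Michael warns about.
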