Finding a Source
How do scientists know when they have truly determined
the location of
a source emitting high-energy photons? They try to find out if
the source has statistical significance, and in order to do this, they
use a "short cut" method based on the standard deviation.
To understand what a standard deviation (also referred to as sigma)
is, and how it helps you determine whether or not you have detected a source in your data, we first need to learn about what
statisticians call a "normal distribution" of data.
A normal distribution of data means that most of the samples in
a set of data
are close to the "average," while relatively few samples tend to one
extreme or the other. If you looked at normally distributed data on a graph, it
would look something like this:
One standard deviation away from the mean in either
direction on the
horizontal axis (the red area on the above graph) accounts for
68 percent of the data in this set. Two standard deviations away from the mean
(the red and green areas) account for roughly 95 percent of the data. And three
standard deviations (the red, green and blue areas) account for over 99 percent
of the data. If a datum is more than 3 sigma away from the mean, it is
truly an exceptional sample compared to all the rest of the data. This is what you would expect a source to look like compared to the background noise! In other
words, if the difference between the two numbers you are testing is "greater
than three times sigma", then you can be about 99 percent sure that you have located
a source emitting high-energy photons.
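
The percentages quoted for one, two and three standard deviations can be checked with a few lines of Python. This is only a sketch: the function name fraction_within is made up for this illustration, and it relies on the standard result that the fraction of a normal distribution within k sigma of the mean is erf(k / sqrt(2)).

```python
import math

def fraction_within(k_sigma):
    """Fraction of a normal distribution within k_sigma standard deviations of the mean."""
    return math.erf(k_sigma / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {fraction_within(k) * 100:.1f}%")
# within 1 sigma: 68.3%
# within 2 sigma: 95.4%
# within 3 sigma: 99.7%
```

Only about 0.3 percent of normally distributed samples fall more than 3 sigma from the mean, which is why a value that far out is treated as exceptional.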
In this activity, we can assume that one sigma is well approximated by the
square root of the value of the pixel count that we are testing. You may ask
your more advanced students to research why this would be the case, and when
this approximation would break down.
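
One way students might explore this is to assume that photon counts follow Poisson statistics. That is an assumption made for this sketch and is not stated in the activity itself. Under that assumption, the code below computes the standard deviation directly from the Poisson probabilities and compares it with the square root of the mean count. The function name poisson_sigma is hypothetical.

```python
import math

def poisson_sigma(mean):
    """Standard deviation of a Poisson distribution, summed from its probabilities."""
    # Sum far enough into the tail that the remaining terms are negligible.
    k_max = int(mean + 20 * math.sqrt(mean) + 20)
    p = math.exp(-mean)  # probability of observing 0 counts
    variance = 0.0
    for k in range(k_max + 1):
        variance += p * (k - mean) ** 2
        p *= mean / (k + 1)  # step from P(k) to P(k + 1)
    return math.sqrt(variance)

for mean in (4, 20, 60):
    print(mean, round(poisson_sigma(mean), 3), round(math.sqrt(mean), 3))
```

For each mean, the two printed numbers should agree. Note that sqrt(N) / N = 1 / sqrt(N), so the relative scatter grows as the counts get smaller, which is a useful starting point for discussing where the approximation might struggle.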
Now consider the following:
Find the pixel with the highest number in it. Then exclude all of the pixels
immediately surrounding this pixel and look for the highest number in any of
the pixels directly outside the excluded area.
In the example below, the maximum pixel count is 60 and the highest pixel
count outside the excluded area is 20.
Note that we exclude a box of pixels around
the highest pixel because a real source will be imaged onto more than a
single pixel; thus we exclude the nearest-neighbor pixels from
consideration when looking for a statistical significance of a source
(indicated by the maximum pixel count over the whole array) above the
background, or noise.
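
The search just described can be sketched in Python. The 5 x 5 grid of counts below is invented for illustration (it is not the activity's own example image), and the function name find_source_and_background is hypothetical. The sketch also assumes that the excluded area is the 3 x 3 box centered on the brightest pixel and that every pixel outside that box is searched.

```python
def find_source_and_background(image):
    """Return (peak count, peak position, highest count outside the excluded box)."""
    rows, cols = len(image), len(image[0])

    # Step 1: locate the pixel with the highest count.
    peak_row, peak_col = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: image[rc[0]][rc[1]],
    )
    peak = image[peak_row][peak_col]

    # Step 2: skip the peak and its nearest neighbors (a 3 x 3 box),
    # then take the highest count among the remaining pixels.
    background = max(
        image[r][c]
        for r in range(rows)
        for c in range(cols)
        if abs(r - peak_row) > 1 or abs(c - peak_col) > 1
    )
    return peak, (peak_row, peak_col), background

# Made-up pixel counts: a bright source spread over several pixels.
image = [
    [12, 15,  9, 14, 11],
    [10, 33, 41, 30, 13],
    [16, 38, 60, 36, 20],
    [ 8, 29, 37, 31, 17],
    [14, 11, 18, 12, 10],
]

print(find_source_and_background(image))  # (60, (2, 2), 20)
```

In this made-up grid the neighbors of the peak hold counts in the 30s and 40s. Without the exclusion box, one of them would be mistaken for the background level.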
So our source is 60, our sigma is sqrt(60) = 7.7, and the highest pixel outside our excluded area is 20. We see that 60 - 3 x 7.7 > 20, so you can be
99% sure that you have located a source at the pixel position with 60 counts.
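
The same test can be written as a short Python check. This is a minimal sketch of the rule used in this activity, with sigma taken as the square root of the source pixel count. The function name is_significant and its n_sigma parameter are made up for this illustration.

```python
import math

def is_significant(source_count, background_count, n_sigma=3):
    """True if the source count stands more than n_sigma above the background count."""
    sigma = math.sqrt(source_count)  # the activity's approximation for one sigma
    return source_count - n_sigma * sigma > background_count

# 60 - 3 * sqrt(60) is about 36.8, which exceeds 20.
print(is_significant(60, 20))  # True

# A fainter peak fails: 30 - 3 * sqrt(30) is about 13.6, which is below 20.
print(is_significant(30, 20))  # False
```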