Archive for the 'Anze' Category

First paper accepted

22nd May 2008 | Category: Anze, Kate

We have just found out that the first Galaxy Zoo paper we submitted has been accepted for publication in the Monthly Notices of the Royal Astronomical Society. This is really exciting, and what is even better is that we were not required to make any revisions or corrections first! This is almost unheard of - so unheard of that it’s not listed as a possibility in Kevin’s outline of the peer review process in a previous post. We expected the anonymous referee to produce a report that commented, criticised, and quibbled about parts of the paper that they were not satisfied with. But we in fact received no report and it went straight to being accepted. Which means that the referee must have been very happy with what we had done. Hoorah!

It will hopefully be appearing in the online version of MNRAS in a month, then a little longer for printed versions. But there is no need to have a subscription to MNRAS to view the paper as the paper is also available publicly here (and we will be sure to keep the versions the same).

We will now be writing a less technical version of the paper for those interested - the idea is to make the results accessible to all Galaxy Zoo users. So this version will have the same figures, data, etc. but with more clarification where necessary, less astro-jargon, and some extra reflection on the results. Watch this space!

New Scientist

10th April 2008 | Category: Anze, Kate

As some of you may have noticed, our first paper has caught the eye of New Scientist (in fact they have written about us before). This is pretty cool considering that we effectively had a null result in our paper - concluding that the Universe seems to be relatively normal. The real excitement is due to the ‘people power’ that the Zoo has harnessed.

I think you need a subscription to see the full article, and however much I’d love to give out Chris’ account details I am sure this would break certain rules! However, if you have been able to take a look at the full piece then you may be a little curious about the comments from Prof Michael Longo towards the end:

It turned out people have a preference when picking orientation: despite the mirroring, 52 per cent of the galaxies were still described as anticlockwise. “Rather than the universe being odd, it might be that people are odd,” says Land. The team has submitted the findings to Monthly Notices of the Royal Astronomical Society (www.arxiv.org/0803.3247).

Longo, however, is unconvinced. The mirroring analysis was only carried out for 5 per cent of the galaxies studied and he believes this sample is too small to justify rejecting the original excess that users spotted, which corroborated the existence of the axis. “[Land and colleagues] have done an impressive job of organising the Galaxy Zoo project, but I believe their analysis is flawed,” he says.

It is really thanks to him that this part of the Zoo, looking into the spins of galaxies as opposed to just the morphology, took place (see here for more on the motivation of this part of the study). And he raises an interesting point in the article that I thought it’d be worth responding to…

The point was raised that because we only did the bias study on about 5% of the Galaxy Zoo sample then we cannot really comment at all on the level of bias in our full dataset. Indeed this sounds quite reasonable. We can see how the classifications for this random ~5% of the data behave when we flip the images, but how do we know that the full sample wouldn’t behave differently?

Well, I think there are two important points to be aware of. Firstly, with statistics we were able to confidently detect a bias in the classifications of these ~50,000 galaxies. The analysis we performed is discussed in some detail on this blog. It is a bit technical, but not only do we detect a bias effect, but with a method called resampling we further established the uncertainty in this result - the probability that the effect could just appear by chance.

For this the data was split into further subsets, and by looking at how the results varied between these groups we could estimate the overall uncertainty in our results. For example - if it turns out that removing a few pieces of information causes the results to vary wildly then this means that you have a huge uncertainty and cannot make strong final conclusions about the full dataset. In our case this method actually returned relatively small errors because even between subsets of the data the results did not vary much. When we formally computed the uncertainty (we used the jackknife method to be specific) we were able to detect the bias at the ‘3-sigma level’.
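To make the resampling idea concrete, here is a minimal sketch of a delete-one-group jackknife. It is a generic textbook version, not necessarily the exact procedure used in the paper; the function names, the 1/0 vote encoding and the toy votes are all made up for illustration.

```python
import math

def jackknife_error(groups, statistic):
    """Delete-one-group jackknife: return (mean leave-one-out estimate, error)."""
    n = len(groups)
    if n < 2:
        raise ValueError("need at least two groups")
    leave_one_out = []
    for skip in range(n):
        # Pool every group except the one being left out.
        pooled = [x for i, g in enumerate(groups) if i != skip for x in g]
        leave_one_out.append(statistic(pooled))
    centre = sum(leave_one_out) / n
    # Standard jackknife variance: (n - 1) / n times the summed squared scatter.
    variance = (n - 1) / n * sum((v - centre) ** 2 for v in leave_one_out)
    return centre, math.sqrt(variance)

def anti_fraction(votes):
    # Hypothetical encoding: 1 = anti-clockwise vote, 0 = clockwise vote.
    return sum(votes) / len(votes)

# Made-up toy votes, split into four subsets.
subsets = [[1, 1, 0, 1], [1, 0, 1, 0], [0, 1, 1, 1], [1, 0, 0, 1]]
estimate, error = jackknife_error(subsets, anti_fraction)
print(f"anti-clockwise fraction: {estimate:.3f} +/- {error:.3f}")
```

If the leave-one-out estimates barely move when a subset is dropped, the error comes out small, which is the situation described above.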

This kind of lingo is used a lot by scientists, and what we mean by ‘sigma’ is one standard deviation. This is a measure of how much numbers can be expected to vary by chance. Consider for example that you toss a fair coin a thousand times, and you want to know how many heads you can expect to get. Well obviously you’d expect 500 heads - but not necessarily exactly 500 as there will be some natural variance in the results. In this example it actually turns out that the number of heads you expect roughly obeys a Normal Distribution with a mean of 500 and a standard deviation of ~16. This means that if we repeated the experiment a number of times we would expect 68.3% of the results to find the number of heads to be within 1 standard deviation of 500 (between 484 and 516), 95.4% within ‘2 sigma’ (468-532), and 99.7% within ‘3 sigma’ (452-548). If your experiment returned 450 heads from 1,000 tosses of the coin, then this would be unexpected at the ‘3-sigma level’ and would be highly unlikely - thus indicating that the coin is probably biased.
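For anyone who wants to check the coin numbers, the sketch below reproduces them from the binomial mean and standard deviation. The function names are invented for this example.

```python
import math

def binomial_mean_sigma(n_tosses, p=0.5):
    """Mean and standard deviation of the number of heads in n_tosses."""
    mean = n_tosses * p
    sigma = math.sqrt(n_tosses * p * (1 - p))
    return mean, sigma

def sigma_offset(observed, n_tosses, p=0.5):
    """How many standard deviations the observed count sits from the mean."""
    mean, sigma = binomial_mean_sigma(n_tosses, p)
    return (observed - mean) / sigma

mean, sigma = binomial_mean_sigma(1000)
for k in (1, 2, 3):
    # Print the k-sigma band around the expected number of heads.
    print(f"{k} sigma: {mean - k * sigma:.0f} to {mean + k * sigma:.0f}")
print(f"450 heads is {sigma_offset(450, 1000):.2f} sigma from the mean")
```

The exact standard deviation is sqrt(250), a little under 16, so 450 heads sits just beyond 3 sigma.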

Well, similarly we found that our original and our flipped classifications were inconsistent at more than 3 standard deviations - and this means we can be sure at the 99.7% level that there is a bias effect in our study. This is what we mean by confidently!

But what about the full dataset? Well this is the second point - the bias-study galaxies were selected completely at random from the full dataset in order to get a representative sample of them. We have conclusively shown that there is a bias in the way people classify galaxies and hence the same effect should be present in the full sample. We cannot be 100% sure that the full sample would show exactly the same bias effect, but we can be over 99.73% sure (3 sigma). In other words, for the bias effect to be a statistical fluctuation due to reanalyzing just 5% of the data, we would have to be very lucky (not quite LOTTO lucky, but more than BINGO lucky!). But once the bias effect is taken into account, the axis (or more specifically the excess of anti-clockwise galaxies) disappears. Alas!

In the eye of the beholder?

10th January 2008 | Category: Anze, Kate

Hey guys and girls,

So, as you probably know, the last month or so of Galaxy Zoo has been dedicated to testing whether we have any bias in our classifications (and if you want to know why we are interested in looking at the rotation of galaxies then please have a read here). By ‘bias’ we basically mean some systematic error in the way people classify (you can get a good explanation in Jordan’s post), and this is different from just random general scatter of results. For example, we know that when a galaxy is faint or small then people are more likely to think it is an elliptical galaxy - and this particular morphology bias is something that Steven must compensate for in his work.

It has been really exciting to work on the rotation classifications of Galaxy Zoo, and as many of you know early on in the project we realised that people were classifying more galaxies as anti-clockwise (see the Telegraph article for example). Specifically, if we take those galaxies that are well classified (i.e. more than 80% of people agree) then we find we have an anti-clockwise:clockwise ratio of about 52:48. This may not sound particularly significant, but as you increase the number of galaxies that you have in your sample (as more of you lovely people classify for us) then this ratio becomes more significant, and is highly unlikely to arise by chance for the ~35,000 galaxies that we have. [For those of you who like probability, the number of anti-clockwise galaxies that we expect is distributed according to a Binomial probability distribution. And if we assume that the ratio is really 50:50, then out of a total of N galaxies we expect N/2 to be anti-clockwise, with a standard deviation of sqrt(N/4).]
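The bracketed remark can be turned into a quick calculation. The sketch below assumes a true 50:50 split and asks how many standard deviations a 52:48 ratio lies from N/2 for different sample sizes; the function name is made up for this example.

```python
import math

def excess_significance(n_anticlockwise, n_total):
    """Offset of the anti-clockwise count from N/2, in units of sqrt(N/4)."""
    expected = n_total / 2
    sigma = math.sqrt(n_total / 4)
    return (n_anticlockwise - expected) / sigma

# The same 52:48 ratio becomes more significant as the sample grows.
for n_total in (1000, 10000, 35000):
    n_anti = round(0.52 * n_total)
    print(n_total, f"{excess_significance(n_anti, n_total):.1f} sigma")
```

For 1,000 galaxies a 52:48 split is only a little over 1 sigma, but for roughly 35,000 it is more than 7 sigma, which is why the ratio could not be dismissed as chance without the bias checks.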

In the plot below we show the relative excess of clockwise votes (for users that classified more than about 300 galaxies) - this is the number of clockwise votes minus the number of anti-clockwise votes, divided by the sum of the two. For example, this number would be 1 if a user always clicks clockwise, and zero if they click both clockwise and anti-clockwise equally.
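Written out as code, the per-user quantity is just the following (a minimal sketch; the function name is made up):

```python
def relative_clockwise_excess(n_clockwise, n_anticlockwise):
    """(clockwise - anti-clockwise) / (clockwise + anti-clockwise) for one user."""
    total = n_clockwise + n_anticlockwise
    if total == 0:
        raise ValueError("user has no clockwise or anti-clockwise votes")
    return (n_clockwise - n_anticlockwise) / total

print(relative_clockwise_excess(10, 0))   # always clockwise
print(relative_clockwise_excess(5, 5))    # evenly split
print(relative_clockwise_excess(48, 52))  # slight anti-clockwise preference
```

A user who leans anti-clockwise gets a negative value, and one who always clicks anti-clockwise gets -1.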

[Figure: bausers.gif - relative excess of clockwise votes per user]
This graph confirms that everyone is generally clicking anti-clockwise more often, because we see that the mean tends to lie below the zero line. But this plot cannot distinguish between an intrinsic excess of anti-clockwise galaxies on the sky and human bias, and it is obviously very important for our rotation results that we get a handle on this, as we could not announce our possible anti-clockwise excess result to the scientific community without doing these bias checks. So the basic idea is to look at the votes for a galaxy before and after a galaxy image is flipped. For example, if 6 out of 10 people thought it was originally clockwise, then after flipping we expect about 6 out of 10 people to now think it is rotating anti-clockwise (if there is no rotation bias).

Since the end of November many of the images in Galaxy Zoo have been flipped for this purpose (and we’ve been monitoring the status here), and we now think that we have enough data to measure the levels of bias. This week Anze has flown over from Berkeley (in California) especially to crunch the numbers with Kate (in Oxford); it is quite a job - with over 7 million classifications to go through! And during our analysis some rather subtle points arose… as with most science, things don’t go exactly to plan!

So we basically wanted to compare the classifications for a galaxy before and after flipping, but we quickly realised that people’s behaviour in the last month or so is very different to the earlier datasets (see Anze’s post for an explanation of how we reduce the data). For example, recently people have been more likely to click the ‘Star/Don’t know’ button. This might be because we have lots of new users, recruited through our latest publicity drive. Or maybe lots of old members have come back after receiving the newsletters. Either way it meant we couldn’t simply compare before and after votes. Also, annoyingly, the original unflipped images are no longer on the site and so getting a handle on this behaviour change was a bit tricky (note that one of the first rules of scientific experiments is to have a control test, but a miscommunication amongst team members meant that in this case our control sample accidentally got left out!). Fortunately though, we are able to use the monochrome images that are currently on the site to compare to (as we observe that being in black and white does not change how people choose between anti-clockwise and clockwise).

So we want to know what the average votes per button are, for the average galaxy in Galaxy Zoo. This is where we encountered our second problem - our bias sample does not cover all of the Galaxy Zoo galaxies, but just 10% of them, and this 10% was not selected at random. In particular we know that we have more anti-clockwise galaxies in the bias sample (on the site at the moment). Therefore we needed to carefully undo what we did when we selected this subsample, so as to construct an effectively random subsample of our full database. Then we could look at the average weights.

In the figure we show the average fraction of votes that a galaxy gets for clockwise (class=2) and anti-clockwise (class=3). We show the result for the original classifications in black (before December), for the monochrome images in red, and for the flipped images in green. We also show the 1 standard deviation errorbars from sampling.

[Figure: jack2.gif - average fraction of clockwise (class=2) and anti-clockwise (class=3) votes per galaxy]
So what we see is that the class=3 points are always higher than the class=2 points, and crucially this is true even after we flip a galaxy image! Looking at the red points, we find that before flipping there is a 6.0% chance of hitting anti-clock and a 5.5% chance of hitting clock for our sample. Then after flipping (green) there is a 5.9% chance of hitting anti-clock and a 5.6% chance of hitting clock. So the point is that those numbers stay the same (within 1 standard deviation) when they should actually reverse if there is no bias. It is easier to think in terms of the ratio of fractions:
anti/(anti+clock)=0.522 before flipping
anti/(anti+clock)=0.512 after flipping.
And if we had:
a) no bias and no excess then these should both be 0.5.
b) no bias and a real excess then one should be the opposite of the other (i.e. 0.52 and then 0.48)
c) a bias and no excess then we would expect them to stay the same and not equal 0.5.
But what we actually find is that 0.522 is 5 standard deviations away from 0.5, 0.512 is 3 standard deviations away from 0.5, and 0.522 & 0.512 are within 1 standard deviation of each other. So you see we appear to be convincingly in situation (c).
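The comparison between scenarios (b) and (c) can be sketched in a few lines. Scenario (a) is already ruled out because 0.522 sits 5 standard deviations from 0.5, so the sketch only asks which of the other two predictions the after-flip value is closer to. It ignores the error bars that the real analysis relies on, and the function name and dictionary labels are made up for illustration.

```python
def predicted_after_flip(before):
    """Anti-clockwise fraction each scenario predicts once images are mirrored."""
    return {
        "(b) real excess, no bias": 1.0 - before,  # the excess swaps sides
        "(c) bias, no excess": before,             # people click the same way
    }

before, after = 0.522, 0.512  # values quoted in the post
predictions = predicted_after_flip(before)
# Pick the scenario whose prediction lies nearest the measured value.
closest = min(predictions, key=lambda name: abs(predictions[name] - after))
for name, value in predictions.items():
    print(f"{name}: predicts {value:.3f}, measured {after:.3f}")
print("closest:", closest)
```

The measured 0.512 is much nearer to the unchanged 0.522 than to the mirrored 0.478, which is the same conclusion the error bars give.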

So what next? Well - it is fantastic that we have been able to get a handle on the bias even if it did turn out to be affecting our results. Only with Galaxy Zoo, which has so many contributors, were we able to detect the bias (and it may turn out to be an inherent bias in the way people see galaxies, which would be an interesting psychology result). Without so many classifications the excess result would have always remained uncertain. And while we no longer think we have an overall excess of anti-clockwise galaxies (which we never expected in the first place!) we can still do a lot of interesting work and pursue our original scientific aims, as explained here and here.

Thanks guys! And keep up the good work. Current classifications remain useful, and we hope to give you some more images next week (possibly returning to the full catalogue!).

Cheers, Kate & Anze

Final pre-bias data download

1st January 2008 | Category: Anze

Today I delivered the final pre-bias testing data to the collaboration. In other words, for some time now, the website has been serving only images for the purpose of bias testing - the mirrored, black and white, etc. images. Therefore, the standard data are as complete as they will ever be. The process of getting the data into a form suitable for processing for individual science projects is beautifully inefficient and convoluted! Below is a somewhat technical description of what I do.

First I log in to a database server at the Johns Hopkins University and perform an SQL query that dumps the entire live database into a text file, which I then compress and FTP over to my computer workstation at the Lawrence Berkeley Laboratory. This is to bridge the gap between the computer science world (pretty ASP.NET code and SQL backend) and the science world (spaghetti FORTRAN code on UNIX and binary files).

The data is then reduced in a series of steps. First, the data is organized and sorted by galaxies, and usernames are converted into consecutive numbers (so that the usernames are anonymous in the final database). Second, the data from various downloads are combined into one big dataset. Third, bad data are weeded out (misconfigured browsers, bots and similar). Finally, the reduced “histograms” for each galaxy are produced. These correspond to our final state of knowledge about each galaxy.
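Two of these steps, the anonymisation and the per-galaxy histograms, can be sketched roughly as follows. This is only an illustration in Python, not the actual reduction code; the record layout (username, galaxy id, label), the function names and the sample clicks are all assumptions.

```python
from collections import Counter, defaultdict

def anonymise(clicks):
    """Replace each username with a consecutive integer, in order of first appearance."""
    user_ids = {}
    anonymous = []
    for username, galaxy_id, label in clicks:
        uid = user_ids.setdefault(username, len(user_ids))
        anonymous.append((uid, galaxy_id, label))
    return anonymous

def galaxy_histograms(anonymous_clicks):
    """Count how many votes each galaxy received for each classification."""
    histograms = defaultdict(Counter)
    for _uid, galaxy_id, label in anonymous_clicks:
        histograms[galaxy_id][label] += 1
    return {galaxy: dict(counts) for galaxy, counts in histograms.items()}

# Made-up sample clicks: (username, galaxy id, classification).
clicks = [
    ("alice", "g1", "anti-clockwise"),
    ("bob", "g1", "anti-clockwise"),
    ("alice", "g2", "elliptical"),
    ("carol", "g1", "clockwise"),
]
anonymous = anonymise(clicks)
print(anonymous)
print(galaxy_histograms(anonymous))
```

Combining downloads and weeding out bad data would sit between these two functions.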

There are four ways of doing these: spirals can be combined or separate and users can be reweighted or not (and two times two makes four). In the combined spirals sample, we combine all three spiral subsamples (clockwise, anti-clockwise, and edge-on) into a single spiral category: science projects that are interested purely in galaxy evolution don’t care about the orientation of a given galaxy. In the reweighted sample, we try to improve the sample by essentially comparing the agreement between users: the idea is that if ten users claim that a certain galaxy is a spiral and the eleventh user says it is an elliptical, it is likely that the 11th user got it wrong. Users who commonly disagree with everyone else get down-weighted and those who always agree get up-weighted.
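One very simple way to express the reweighting idea is to score each user by how often they agree with the majority vote on each galaxy. The sketch below does exactly that and nothing more; it is an assumption made for illustration, not the weighting scheme actually used in the reduction, and the function name and sample votes are made up.

```python
from collections import Counter, defaultdict

def agreement_weights(votes):
    """Weight each user by the fraction of their votes matching the majority label."""
    per_galaxy = defaultdict(Counter)
    for _user, galaxy, label in votes:
        per_galaxy[galaxy][label] += 1
    # Majority label for each galaxy (the toy data below has no ties).
    majority = {g: counts.most_common(1)[0][0] for g, counts in per_galaxy.items()}

    agreed, total = Counter(), Counter()
    for user, galaxy, label in votes:
        total[user] += 1
        agreed[user] += int(label == majority[galaxy])
    return {user: agreed[user] / total[user] for user in total}

# Made-up votes: (user, galaxy, label).
votes = [
    ("u1", "g1", "spiral"), ("u2", "g1", "spiral"), ("u3", "g1", "elliptical"),
    ("u1", "g2", "elliptical"), ("u2", "g2", "elliptical"), ("u3", "g2", "elliptical"),
]
print(agreement_weights(votes))
```

Here the third user disagrees with the majority on one of two galaxies and ends up with half the weight of the other two.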

It is a purely statistical exercise meant to remove pranksters that click randomly and up-weight careful users. In practice, we can check how well it works. We do this (well, Steven does it) by looking at galaxies that have the same absolute luminosity and size and shouldn’t evolve over the small redshift range probed by the SDSS. The upshot is that it doesn’t work as well as initially anticipated. As an old English proverb goes: if one million French believe in something, it doesn’t make it right. And so we also produce the unweighted sample in which all users are given the same weight. It is up to individual science projects to decide which combination to use.

Finally, the reduced data is uploaded to a super-secret web server where other collaborators can download it.

The final datasets contain 34,617,406 clicks made by 82,931 users. Hooray for all of you! However, the previous downloads already went over 30 million, and hence this will make only small improvements to our science results. Now the important task is to gather enough information about biases in our datasets, so please keep clicking!
