Archive | February 2008

Galaxy Zoo: Behind the scenes

For the past few months on this blog, we’ve been talking about the science of Galaxy Zoo – what your millions of classifications have revealed to us about the way the universe works. Right now, as Steven and I described Friday and Monday, the members of the Galaxy Zoo team are writing papers announcing our science results, and offering feedback on each other’s papers.

But of course, Galaxy Zoo has become much more than just a science project. The site has become an Internet phenomenon, and for the next few posts, we’d like to focus on some other aspects of the Galaxy Zoo phenomenon. Today, we wanted to talk about the thing that makes everything else work – the site itself.

Without a good-looking and well-functioning website, we could have never invited all of you to participate in this project, and you could not have generated the excellent scientific dataset that you have generated. The site was designed by two professional web designers: Phil Murray and Dan Andreescu. Galaxy Zoo is now proudly listed as a featured project on Phil’s web site.

Phil designed the look and feel of the site, and Dan wrote the code that allowed the website to take your input and write it into a database of classifications. Dan left the project in late 2007, and Danny Locksmith has taken over the coding.

The best way to tell the story of Galaxy Zoo’s design is to let Phil and Danny tell the story themselves. So here is Phil, talking about how he designed the layout of Galaxy Zoo:

GZ1.0 Visual Design

I was asked by Chris Lintott to design the Galaxy Zoo logo and web site in March 2007, and I realised early on that this had the potential to be a hugely successful project — little did we know just how successful it would be! I was given a completely free rein to handle the visual design of both the logo and the web site.

The Galaxy Zoo Logo

I wanted to create a visually appealing logo that would work in several formats – web and print. It had to be flexible enough to work as a standalone logo or to be incorporated into an overall page design – as is the case with the web site. The graphic part of the logo is in fact based on a Hubble image of Supernova 1987A Rings, which seemed to fit very neatly into the text of ‘GALAXY ZOO’ to form an official logo. A variation was developed for both web and print use.

The Galaxy Zoo Web Site

It seemed obvious that part of the attraction of the GZ1.0 project to non-astronomers was the sheer beauty of the galaxy images, plus the fact that many of these images had never previously been seen by human eyes. So I wanted to maximise impact right up front on the Home Page of the site by using a large galaxy image as a main background to the page and to carry this theme through into the inner pages. I wanted all text to sit on a semi transparent screened background to give the impression of depth on the page.

Choosing a colour palette was relatively straightforward given the colours within the logo, hence the basic black, grey, orange and gold colour scheme. It was decided to go with a slightly shallower header graphic for inner pages with all top navigation shown horizontally and any secondary navigation to be contained in a left column (as is the case in the Analysis Page). I decided on a fixed width solution catering for a minimum screen resolution of 1024×768 pixels.

When it came to the buttons for the Galaxy Analysis page, I spent some time designing what I hoped would be generic buttons for the various options on offer (Spiral Galaxy – Anticlockwise, Clockwise and so on). The intention was to design buttons that would not influence the decision making of the Galaxy Zoo visitor but would still be intuitive to use. In fact it quickly became apparent that designing the buttons to look like part of an ‘online game’ also helped with the appeal and overall usability of the site. The feedback and data received as part of GZ1.0 have given us some valuable information about how to present these and other buttons for GZ2.0.

As for building the site, I constructed all the pages as HTML templates, which were then integrated into the ASP.Net web programming environment by the excellent Dan Andreescu. Danny Locksmith has taken over the ASP.Net duties since late 2007.

I think you’ll agree that Phil did a great job – he created a really beautiful site that is easy to navigate. Now here is Danny, talking about how he took over the coding from Dan:

Most of the ASP programming was decided on before I got involved. My task so far has been to try to understand how someone else thought it should work!


In effect this is how it works:

Your login to the site, your user preferences, etc. are all controlled by the .NET 2.0 framework. The site uses a template that provides the basic logic involved in recording your clicks and ensuring that the right person is credited with each classification. The persistent data is stored in a database.

When you load the Galaxy Analysis page, a galaxy is selected randomly and displayed on the page. Next to the galaxy’s image is the Galaxy ID, which is a hotlink to an SDSS page where you can view details of the galaxy – its spectrum, a zoomable picture, etc. Watch for changes to this in GZ2!
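
As a rough illustration of the selection step described above, here is a minimal Python sketch. The galaxy IDs, the `pick_galaxy` function and the link address are all hypothetical placeholders; the real site was written in ASP.NET, and its code is not shown in this post.

```python
import random

# Hypothetical catalog of galaxy IDs; the real IDs come from the SDSS.
GALAXY_IDS = ["galaxy-0001", "galaxy-0002", "galaxy-0003"]

# Placeholder address, NOT the real SDSS page.
DETAIL_URL = "https://example.org/sdss/object?id={galaxy_id}"


def pick_galaxy(rng=random):
    """Choose one galaxy uniformly at random and build its detail link."""
    galaxy_id = rng.choice(GALAXY_IDS)
    return galaxy_id, DETAIL_URL.format(galaxy_id=galaxy_id)


galaxy_id, link = pick_galaxy()
print(galaxy_id, link)
```

Passing in a seeded `random.Random` instance instead of the module default makes the choice repeatable, which is handy for testing.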

Next to the galaxy image is a custom control which has the various buttons you can click to classify the galaxy. Since we learned about the anticlockwise bias, various theories have been put forward to explain it – one of them is the design and layout of these buttons. Another problem here was that people tended to click the button several times, recording several results. This was worked around by only allowing one classification per galaxy. Yet another potential problem was that you could easily make an error, but you could not go back and fix it. Look for changes in GZ2!

Once you click a button, your classification, your user ID, the date and time, etc. are recorded in a database. The stored data was designed to answer specific questions and to support the scientific papers which are to be published soon. With the advent of the bias testing phase, additional information was stored in the same database – the way the displayed image had been transformed.
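
To make the recording step concrete, here is a minimal sketch in Python using the standard `sqlite3` module. Everything in it is an assumption made for illustration: the table layout, the column names, the `record_click` helper, and the reading of "one classification per galaxy" as one per user per galaxy. The actual Galaxy Zoo database and its ASP.NET code are not described in that much detail here.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE classifications (
        user_id       INTEGER NOT NULL,
        galaxy_id     TEXT    NOT NULL,
        choice        TEXT    NOT NULL,   -- e.g. 'spiral_anticlockwise'
        image_variant TEXT    NOT NULL,   -- e.g. 'original' or 'mirrored'
        classified_at TEXT    NOT NULL,
        UNIQUE (user_id, galaxy_id)       -- assumed: one row per user per galaxy
    )
    """
)


def record_click(user_id, galaxy_id, choice, image_variant="original"):
    """Store one classification; return False if it was a repeat click."""
    now = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "INSERT OR IGNORE INTO classifications VALUES (?, ?, ?, ?, ?)",
        (user_id, galaxy_id, choice, image_variant, now),
    )
    conn.commit()
    # rowcount is 1 for a new row and 0 when the UNIQUE rule ignored the insert.
    return cur.rowcount == 1


print(record_click(1, "galaxy-0001", "spiral_anticlockwise"))
print(record_click(1, "galaxy-0001", "spiral_anticlockwise"))
```

Putting the uniqueness rule in the database, rather than in the page logic, means that a second click on the same button is dropped no matter how quickly it arrives.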

In GZ2, the data collected will be more generic, and will create a very comprehensive catalog of galaxy information, almost certainly the biggest ever. To a great extent, the inner workings of the site are defined by the various scientists involved in the project. It is very much designed by the entire team, and as such my task is to ensure that the finished site meets all the goals of the team, and at the same time is pleasurable to navigate and to use.

You will be able to decide if I was successful when GZ 2 is launched!

I’ll add just two things to what Phil and Danny said:

1) The servers that run Galaxy Zoo are in the Physics and Astronomy building at Johns Hopkins University in Baltimore, Maryland, USA. (Here is the building in Google Maps – the Johns Hopkins lacrosse stadium is just to the north, and the building across the winding street is the Space Telescope Science Institute).

2) One of the things I find amazing about Galaxy Zoo is that no member of the team has ever met all the other members face-to-face. Chris has come closest – he has met everyone except Jan and Alainna, who do IT support for the servers at JHU. In addition, 8,549 km separates Anze in Berkeley, California from Chris, Kate, and Kevin in Oxford. The Galaxy Zoo team could not have existed without the Internet, and communication tools that allow us to work together productively on different continents.
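
For anyone who wants to check that distance, here is a small Python sketch of a great-circle calculation using the haversine formula. The coordinates are approximate city-centre values that I have assumed, and the Earth is treated as a perfect sphere, so the result should land within a percent or so of the figure quoted above rather than matching it exactly.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Approximate city-centre coordinates (assumed, not taken from the post).
BERKELEY = (37.87, -122.27)
OXFORD = (51.75, -1.26)

print(round(great_circle_km(*BERKELEY, *OXFORD)), "km")
```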

Reading the drafts

As Steven mentioned in his post last Friday, we are hard at work on the first round of papers from Galaxy Zoo. Back when we started this blog, Chris listed the four papers that we expect to come out in the first round. To review them:

1) A paper summarizing the structure of Galaxy Zoo, with details of how we turn your clicks into a catalog of galaxies. Chris is the first author on this one, and Anze talked here on the blog about how we got our catalog of galaxies. Chris’s talk at the AAS meeting also gives a good introduction to what is likely to go in this paper.

2) A paper about the relationship between what a galaxy looks like and where it lives. Steven is the first author on this one, and he wrote about the results very clearly here.

3) A paper about the unusual “blue ellipticals” that you found. Kevin is the first author on that one, and he wrote about it here, with lots of really nice sample images.

4) A paper examining the structure of the universe by studying the rotation direction of galaxies. Kate is the first author on this one, and Anze is working closely with her. She wrote about the reasons for the study on the forum, and her paper will also include the results of the bias study. The bias study showed that the apparent excess of anti-clockwise galaxies seems to be a result of people’s perception of galaxies on the site, rather than any feature of the galaxies themselves or our position relative to them. We actually never expected to find any excess – and often in science, disproving a result is just as important as finding a new result.
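
The reasoning behind the bias study can be sketched in a few lines of Python. This sketch assumes the test compares classifications of original images with classifications of mirror-flipped copies, which is one way an image could be transformed. The function name, the tolerance and the example fractions are made up for illustration and are not Galaxy Zoo results.

```python
def interpret_mirror_test(frac_acw_original, frac_acw_mirrored, tolerance=0.01):
    """Compare the anticlockwise fraction seen in original and mirrored images.

    Mirroring turns every anticlockwise spiral into a clockwise one, so a
    real excess on the sky should flip sides, while a perceptual bias in
    the classifiers should stay on the anticlockwise side.
    """
    if abs(frac_acw_original - 0.5) <= tolerance:
        return "no excess"
    if abs(frac_acw_mirrored - frac_acw_original) <= tolerance:
        return "perceptual bias"      # same excess either way
    if abs(frac_acw_mirrored - (1 - frac_acw_original)) <= tolerance:
        return "real asymmetry"       # excess flipped with the image
    return "inconclusive"


# Made-up fractions, purely to show the two outcomes.
print(interpret_mirror_test(0.52, 0.52))
print(interpret_mirror_test(0.52, 0.48))
```

If the anticlockwise excess survives mirroring, it follows the classifiers rather than the galaxies, which is the outcome described above.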

Steven’s post Friday did a great job of describing what goes on in writing a scientific paper. Here, I’ll talk about what it’s like to read over a paper and provide comments to the first author.

The results so far have been really interesting, and it’s been a lot of fun to see them written down. I looked through Chris’s paper in detail, since I know a good deal about the process by which we created Galaxy Zoo, and the SDSS data that Galaxy Zoo uses. I know less about the astronomy, so I’ve just skimmed through Steven’s and Kate’s papers. I haven’t seen Kevin’s yet.

We’ve been exchanging drafts of the papers as PDFs, then sending comments back to the first authors by E-mail. I’ve been reading along and making notes as I go. I’m trying to make sure that everything would make sense to an astronomer who hadn’t worked with Galaxy Zoo before.

One of the most important parts of any scientific paper is the figures. The old statement that “a picture is worth a thousand words” is definitely true in science, but in this case the pictures are usually plots of data. I’m checking over the figures to make sure the x and y axes are clearly labeled and the figure caption makes sense. A lot of readers read the figures first, then come back to the text, so the figure captions should make sense when read apart from the paper. The way that figures can depict scientific data is quite interesting, and creating figures for professional astronomers frequently calls for quite a different visual style from creating figures for the public.

The last section of any science paper is the References – the previous papers that this paper builds on. Any assertion that you make in a paper should either be a direct result of your study, blindingly obvious, or referenced in a standard style. So, when Chris talked about how images from Galaxy Zoo were generated, I sent him a reference on how we take individual black-and-white images in different wavelengths and combine them into a color image.

Can you feel a draft?

As you may have noticed from the sparseness of recent posts, the Galaxy Zoo team have all been buckling down in an effort to get some work done. Progress on my Galaxy Zoo paper has been a little delayed by the need to do some work for another project I’m involved with: the GAMA survey. The first observations for this survey will start in a few weeks’ time, so I couldn’t really put off doing my bit to help make sure we target the right objects! Astronomers usually have several different projects on the go, some of which span years. Juggling them all can get a little tricky. Anyway, with my most urgent work out of the way, I’ve had another push on my Galaxy Zoo paper and today I sent a draft version around to the team members for them to have a look at.

Generally, although several people may contribute to a scientific paper, the main business of writing the text and putting together the figures is done by one person, the lead author. This person is usually the one who has done the most work producing the results that are described in the paper, and their name comes first in the author list. Once the paper is mostly complete, although still at an early stage, it is often sent around to the coauthors for comments. It is helpful to get input from the coauthors at this stage to help refine the overall structure and content of the paper before too much effort has gone into checking the fine details, because these will get messed up again if the paper ends up being rearranged following the coauthors’ feedback.

We’ve recently had drafts sent around the Galaxy Zoo team by Kevin and Chris, and today Kate also sent hers. These are all looking good, though some still need a little more analysis to be included. When the rest of the authors have given their feedback, the lead author tries to incorporate their suggestions into the paper. Sometimes this process might go through a couple of cycles before a final draft is produced. The final draft is then proofread by some of the coauthors, to check the spelling, grammar, and style, and generally to improve the clarity of the text.

Finally, when all the authors are content, the paper is submitted to a journal for peer review, prior to being published. We’ll describe that stage in a bit more detail when we get there.

Keep watching the skies!

One of the things that constantly amazes me about astronomy is how much we can learn from so little. The only information we get from stars and galaxies is the light they give off. Whether this light comes in the form of visible light, infrared radiation, radio waves, x-rays, and so on – it’s still just light. We can’t go see these galaxies ourselves, even by robotic probes. We can’t bring samples back. We can’t even get a different view of the galaxy – we are stuck here on Earth, watching from one location. And if something in the sky changes – as it does constantly, sometimes dramatically – we can’t say “hey, I missed that. Do it again!”

Given these constraints, it is sometimes amazing to me that we learn anything at all. But we have learned so much about the sky – everything from our planet’s place in the Solar System to the origin of the universe!

Since all our knowledge of astronomy is gained by looking, it makes sense that we should look as hard as we can. That’s the premise of the Sloan Digital Sky Survey, which you’ve heard a lot about, since it’s the source of all the images you see on Galaxy Zoo. The Sloan used a 2.5-meter telescope in New Mexico, USA to look up at the sky every clear night for five years. Its goal: to use these observations to make a map of the universe.

The more we look at the sky, the more likely we are to see something interesting. That’s the guiding principle of astronomy, of the Sloan Digital Sky Survey, and now of Galaxy Zoo.

Many of you are now scanning through SDSS images, classifying galaxies by shape. This is a critically important thing to do, but computers find it difficult, so it takes people watching carefully. And, as we have all seen with the Voorwerp (described here once, twice, three times), the more you look, the more you see.

And starting in 2013, the new Large Synoptic Survey Telescope will watch the sky as never before, viewing the entire sky at a higher resolution than the SDSS – every four nights.

It’s an exciting time to be involved with astronomy, and we’re glad that Galaxy Zoo has been a part of that. Keep looking up!