SDSS Data Access

Physics R

How to get hold of data from the SDSS astronomical catalogue.

Eugene https://www.fizzics.ie
2021-12-16

When it comes to astronomical catalogues, SDSS was without peer for many years. We used it routinely for our work on differential photometry, and developed ways of accessing SDSS data directly to R. This post is really just a reminder-to-self, so that next time I need to get my SDSS data I won’t need to scrabble around looking for old scripts.

The packages needed are tidyverse (of course), RCurl, and glue.

N <- 5 # makes a table of 5 star targets
# we'll just get the spectrum from the first one
# delta <- 0.1 # 
bands_min <- 15 # this places an upper limit on the brightness of the star
bands_max <- 20 # this places an lower limit on the brightness of the star
# fainter stars have noisy spectra
# SQL that downloads some info on the chosen target from SDSS.
# ObjID from SDSS specifies the target

master_target_SqlQuery <- glue("SELECT top {N} p.ra, p.dec, ",
                       "p.u, p.g, p.r, p.i, p.z, p.objid, ", 
                       "s.specobjid, s.class, s.subclass, s.survey, ", 
                       "s.plate, s.mjd, s.fiberid ", 
                       "FROM photoObj AS p ", 
                       "JOIN SpecObj AS s ON s.bestobjid = p.objid ",
                       "WHERE p.g BETWEEN {bands_min} AND {bands_max} ",
                       "AND p.r BETWEEN {bands_min} AND {bands_max} ", 
                       "AND p.i BETWEEN {bands_min} AND {bands_max} ", 
                       "AND s.class = 'STAR' ",
                       "AND s.survey != 'eboss'" )
# downloads target data
# dataframe master_targets has necessary info
master_target_SqlQuery <- str_squish(master_target_SqlQuery) |> 
  str_replace_all(" ", "%20")
my_format <- "&format=csv"
urlBase <- "http://skyserver.sdss.org/dr16/SkyServerWS/SearchTools/SqlSearch?cmd="
master_targets <- read_csv(paste0(urlBase, master_target_SqlQuery, my_format), skip = 1)


ra dec u g r i z objid specobjid subclass plate mjd fiberid
245.1802 6.98543 16.73 15.56 15.78 15.96 16.06 1237662636377833472 1950104876876851200 A0 1732 53501 168
245.2807 6.83412 21.76 19.48 18.13 17.02 16.43 1237662199901913344 1950105151754758144 M3 1732 53501 169
245.2640 7.33773 19.19 17.89 17.39 17.22 17.15 1237662200438718720 1950105426632665088 F9 1732 53501 170
245.2317 7.38190 21.69 19.30 17.88 16.95 16.42 1237662636914639616 1950106251266385920 M2 1732 53501 173
245.2420 7.31339 21.93 19.31 17.92 16.92 16.39 1237662200438718720 1950107075900106752 M2 1732 53501 176

There are times when the SDSS data server is down. In this case, expect to see an error message like - “Error: InternalServerError”.

Now that we have the plate, mjd, and fiberid for some stars, we can go ahead and download their spectra.

index <- 1 # uses first star from list
get_spectrum <- function(object, wavelength_lower_limit = 5500, wavelength_upper_limit = 7000){
  plate <- object$plate
  mjd <- object$mjd
  fiber <- object$fiberid
  url_spect <- glue("http://dr12.sdss.org/csvSpectrum?plateid={plate}", 
                    "&mjd={mjd}&fiber={fiber}&reduction2d=v5_7_0")
  spectrum <- read_csv(file = url_spect)
  spectrum %>% 
    filter(between(Wavelength, wavelength_lower_limit, wavelength_upper_limit)) %>% 
    dplyr::select(Wavelength, BestFit)
}
spect1 <- get_spectrum(master_targets[index,], 
                       wavelength_lower_limit = 3500, 
                       wavelength_upper_limit = 8000)

This gives a dataframe with wavelength and intensity. Let’s plot this

The spectrum from sdss is shown below, and can be seen here

SDSS Spectrum RA = 245.1801662, Dec = 6.9854337

The SDSS image of the star itself is given here

Full code is available from github.

Note, for some reason the SDSS objid returned from the SQL query is incorrect. I’m still trying to figure out why and to fix this.

Update, the following is an ugly workaround on the objid issue. Haven’t verified that it always works yet.

radial_url_root <- "http://skyserver.sdss.org/dr17/SkyServerWS/SearchTools/SqlSearch?cmd="
radial_url_core <- glue("SELECT top {N} p.ra, p.dec, ",
                        "p.u, p.g, p.r, p.i, p.z, p.objid, ", 
                        "s.specobjid, s.class, s.subclass, s.survey, ", 
                        "s.plate, s.mjd, s.fiberid ", 
                        "FROM photoObj AS p ", 
                        "JOIN SpecObj AS s ON s.bestobjid = p.objid ",
                        "WHERE p.g BETWEEN {bands_min} AND {bands_max} ",
                        "AND p.r BETWEEN {bands_min} AND {bands_max} ", 
                        "AND p.i BETWEEN {bands_min} AND {bands_max} ", 
                        "AND s.class = 'STAR' ",
                        "AND s.survey != 'eboss'" ) %>% 
  str_replace_all(" ", "%20") %>% 
  str_replace_all("\n", "")
w <- rvest::read_html(glue::glue(radial_url_root, radial_url_core, "&format=csv"))
X <- as_list(w)$html$body$p[[1]] %>% 
  as.character() %>% 
  str_remove("#Table1\n")
master_targets <- read.table(text = X, header = TRUE, sep = ",", dec = ".", comment.char = "#") %>% 
  mutate(across(where(is.numeric), round, 2),
         objid = as.character(objid),
         specobjid = as.character(specobjid))
master_targets$objid # check the top one to see if this works

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/eugene100hickey/fizzics, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Eugene (2021, Dec. 16). Euge: SDSS Data Access. Retrieved from https://www.fizzics.ie/posts/2021-12-16-sdss-data-access/

BibTeX citation

@misc{eugene2021sdss,
  author = {Eugene, },
  title = {Euge: SDSS Data Access},
  url = {https://www.fizzics.ie/posts/2021-12-16-sdss-data-access/},
  year = {2021}
}