HWR's Occasional Notes to Self: (variability-based) YSO selection for SDSS-V

Starting point

In the previous post I made a proposal on how to select (low-contamination) samples of YSOs for SDSS-V targeting, through a combination of WISE W1-W2 excess, variability, and a parallax cut to eliminate backgrounds. This was based on:
YSOs can be discerned by (any combination of) the following observational properties:
1) their SEDs (0.5-20 μm) are not just a simple photosphere (disks, accretion, etc.)
2) they lie off the(ir) main sequence
3) many (most?) of them show some flux variability
4) they are clustered in position and velocity space.

This (at first glance) seems to do very well at selecting YSOs (Class 0, I, II) that a) have a mid-IR excess (W1-W2>0.25) and b) are bright enough to show up in the Gaia catalog (G<18). But it leaves out later YSO phases (no W1-W2 excess), and thereby leaves out objects (Class III) that a portion of the SDSS-V YSO group cares about.

Here I propose a considerably broader YSO selection for SDSS-V (which encompasses the above approach). It is largely based on optical variability but still seems to yield low-contamination samples, though these are more dominated by more 'mature' (low-mass) YSOs. To keep background contamination in check, the sample distance needs to be limited (e.g. to ~1kpc). Which subset of these YSOs is interesting enough to be targeted in the SDSS-V context needs to be sorted out.

The overall approach can be summarized as:

[ Gaia-detection (var>0.0x mag) or WISE-color-excess-Gaia-non-detections ] AND H~<12

where Gaia-detection may mean G<18 and parallax>x mas (x=0.3-1.5).

Variability Selection

Selection "Philosophy"

We are seeking YSOs that a) are bright enough to be well within Gaia's flux limit (G<18), b) are bright enough in H-band to be sensibly observed within the SDSS-V context, and c) are YSOs in the sense that they have not yet reached the main sequence for their initial mass (M_init).
"Variability" in Gaia DR2 is defined as sqrt(g.phot_g_n_obs)/g.phot_g_mean_flux_over_error i.e. via the photometric excess noise.

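To make the units of this quantity explicit, here is a minimal Python sketch (not the code used for this post). It assumes that the catalog's flux error is the standard error of the mean over N observations, in which case sqrt(N)/flux_over_error is the fractional flux scatter per observation. The function names and the example numbers are made up for illustration, and the factor 2.5/ln(10) ≈ 1.086, which converts a small fractional scatter into magnitudes, is not applied anywhere in the post itself.

```python
import math

def gaia_variability(n_obs, mean_flux_over_error):
    """Fractional per-observation flux scatter: sqrt(N_obs) / (flux / flux_error)."""
    return math.sqrt(n_obs) / mean_flux_over_error

def fractional_to_mag(fractional_scatter):
    """Small-amplitude conversion from fractional flux scatter to magnitudes."""
    return 2.5 / math.log(10) * fractional_scatter

# Hypothetical source: 200 G-band observations, mean flux known to 1 part in 500.
var = gaia_variability(n_obs=200, mean_flux_over_error=500.0)
print(f"variability = {var:.4f}, i.e. about {fractional_to_mag(var):.4f} mag")
print("passes the 0.02 cut:", var > 0.02)
```

Under that assumption, a threshold of 0.02 in this quantity corresponds to slightly more than 0.02 mag rms, so quoting the cut "in mag" is a good approximation at these amplitudes.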
Query

I run the following query on ESA's Gaia DR2 server, with a distance cut at 1kpc (TBD; g.parallax - g.parallax_error > 1):

SELECT *, sqrt(g.phot_g_n_obs)/g.phot_g_mean_flux_over_error AS variability
FROM gaiadr2.gaia_source AS g
INNER JOIN gaiadr2.allwise_best_neighbour AS xaw
  ON xaw.source_id = g.source_id
INNER JOIN gaiadr1.allwise_original_valid AS allwise
  ON xaw.allwise_oid = allwise.allwise_oid
WHERE g.phot_g_mean_mag < 18.
  AND sqrt(g.phot_g_n_obs)/g.phot_g_mean_flux_over_error > 0.02
  AND sqrt(g.astrometric_chi2_al / (g.astrometric_n_good_obs_al - 5)) < 2.
  AND allwise.w1mpro < 12.5
  AND g.parallax - g.parallax_error > 1.

See here for an explanation of the query "philosophy". I subsequently make a cut to H<12.5, and excise sources with W1_error or W2_error > 0.05 (TBD: is this necessary?). I also run -- on a subset of the sky -- the exact same query, just insisting that variability < 0.02.
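The cuts applied after the query can be sketched as a simple per-source filter. This is an illustrative Python sketch under stated assumptions, not the script actually used: the function name and the dictionary keys are hypothetical, and sources with a missing H magnitude or missing WISE errors are dropped here, which is a choice the post does not specify.

```python
import math

def passes_post_query_cuts(h_mag, w1_err, w2_err, h_max=12.5, err_max=0.05):
    """Keep sources with H < h_max and both WISE errors <= err_max."""
    values = (h_mag, w1_err, w2_err)
    # Assumption: a missing measurement (None or NaN) fails the cut.
    if any(v is None or math.isnan(v) for v in values):
        return False
    return h_mag < h_max and w1_err <= err_max and w2_err <= err_max

# Made-up rows, only to show the behaviour of the filter.
rows = [
    {"h": 11.0, "w1_err": 0.02, "w2_err": 0.03},          # kept
    {"h": 12.7, "w1_err": 0.02, "w2_err": 0.03},          # too faint in H
    {"h": 10.5, "w1_err": 0.08, "w2_err": 0.03},          # W1 error too large
    {"h": float("nan"), "w1_err": 0.02, "w2_err": 0.03},  # no H magnitude
]
kept = [r for r in rows
        if passes_post_query_cuts(r["h"], r["w1_err"], r["w2_err"])]
print(len(kept), "of", len(rows), "rows kept")
```

Relaxing err_max (or setting it very large) is one way to test whether the WISE error cut changes the sample noticeably, which is the open question flagged above.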

This yields a set of 93,000 stars; the query output (i.e. "the sample") is here as a FITS file.

The following Figure shows the distribution of this sample in the mid-IR excess (W1-W2) vs variability space. There are three regimes: W1-W2>0.25 (Class I,II objects), a group of low-variability objects at W1-W2~0.1, and the dominant plume at W1-W2=0 (stellar photosphere only?).

W1-W2 vs variability (for H<12.5, D<1kpc). Three regimes are apparent: sources with mid-IR excess (W1-W2>0.3); the dominant subset of sources with simple "photospheric" colors (W1-W2~0); and the (low-mass) sources with W1-W2~0.17.

CMD Distribution of Variability Selected Stars

Obviously, youth or YSO-ness is not the only source of >0.02mag variability among stars within 1kpc: stars also pulsate, eclipse, etc. I certainly had no idea or preconceived notion of what fraction of the H<12.5, D<1kpc, var>0.02mag stars may be YSOs. The availability of parallaxes allows us to put these stars onto a CMD.
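Putting a star onto a CMD from its parallax amounts to the standard distance-modulus relation M = m + 5 log10(parallax/mas) - 10. The following Python sketch illustrates it under simplifying assumptions that are not spelled out in the post: the distance is taken as the plain inverse of the parallax, and extinction is ignored. The function names are hypothetical.

```python
import math

def distance_pc(parallax_mas):
    """Naive distance in parsec from a parallax in milliarcseconds."""
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive to be inverted")
    return 1000.0 / parallax_mas

def absolute_magnitude(apparent_mag, parallax_mas):
    """M = m + 5*log10(parallax_mas) - 10, with no extinction correction."""
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive to be inverted")
    return apparent_mag + 5.0 * math.log10(parallax_mas) - 10.0

# Example: a G = 15 star with a 2 mas parallax (500 pc).
print(distance_pc(2.0), round(absolute_magnitude(15.0, 2.0), 2))
```

At the sample limit (parallax of 1 mas) the correction is exactly 10 mag, so the H<12.5 cut corresponds to an absolute H magnitude brighter than 2.5 at 1kpc.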

Let's start by looking at the complementary set: (a subset of) the stars with H<12.5 within D<1kpc that do NOT vary (at the rms 0.02mag level):

Non-varying stars (at <0.02mag level) with H<12.5 (G<18) and D<1kpc(plx>1mas). Overplotted are (Padova) isochrones of log(t)=6.6 -- 9.6 in steps of 0.5 dex (4Myrs,12Myrs,40Myrs, etc..); this all looks nice and "boring".
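For reference, the isochrone grid quoted in the caption can be tabulated with a few lines of Python. This is only a sketch of the arithmetic (the helper name is made up), and it says nothing about how the Padova isochrones themselves were retrieved.

```python
def isochrone_log_ages(start=6.6, stop=9.6, step=0.5):
    """Grid of log10(age/yr) values from start to stop, inclusive."""
    n_steps = int(round((stop - start) / step))
    return [round(start + i * step, 3) for i in range(n_steps + 1)]

log_ages = isochrone_log_ages()
ages_myr = [10 ** (log_age - 6.0) for log_age in log_ages]

for log_age, age in zip(log_ages, ages_myr):
    print(f"log(t) = {log_age:.1f}  ->  {age:8.1f} Myr")
```

The grid has seven isochrones, with the youngest at about 4 Myr and the oldest at about 4 Gyr.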

One can see a hint of the binary sequence, but otherwise this looks like a 3-10Gyr old population (to me). As expected, the combination of geometric survey volume and magnitude limit favors a certain stellar luminosity (stars still bright enough to make the magnitude cut at the maximal distance).

Now let's look at the analogous query, but for ALL stars with variability > 0.02mag.

Stars H<12.5,D<1kpc and variability > 0.02. The vast majority of them lie above the MS!

The following is the same plot as above, increasing point size to show that all the very reddened YSOs are still in here...

As above, but showing the sparse parts of CMD space.

Remarkably, the vast majority of these stars lie above the (old) main sequence, around the 4-40Myr isochrones. That pattern becomes even more distinct if we look at the stars that vary by at least 0.05mag rms, as shown here; insisting on larger variability also selects against low-mass YSOs (?).

As above, but restricted to stars varying (rms) >0.05mag.

If that is true, then the majority of stars (H<12, D<1kpc) that vary by 0.02mag (or certainly 0.05mag) are YSOs (??!!??). Can someone educate me whether that can be true?

Obviously, this sample includes all kinds of other variable stars, but they seem to be a modest fraction.

-------------------------------------------------------------

Aside: Variability Selection in the Orion Region

If I select all stars from Gaia (G<18.5) within 190<l<215, -26<b<-8 and 2<parallax<3,

I get this

I can now split this into the non-variable sources (variability < 0.02mag; 53,000 sources)

and variable ones (variability > 0.02mag; 7000 sources)

and those 2000 with variability > 0.05
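The Orion box and the variability split can be sketched in Python as follows. The function names, the tuple layout and the example stars are all made up for illustration; the cut values are the ones quoted above (the parallax range 2-3 mas corresponds to roughly 330-500pc).

```python
def in_orion_box(l_deg, b_deg, parallax_mas, g_mag):
    """Box in Galactic coordinates and parallax, with G < 18.5."""
    return (g_mag < 18.5
            and 190.0 < l_deg < 215.0
            and -26.0 < b_deg < -8.0
            and 2.0 < parallax_mas < 3.0)

def split_by_variability(variabilities):
    """Count sources per variability regime (the last is a subset of 'variable')."""
    return {
        "non_variable": sum(1 for v in variabilities if v < 0.02),
        "variable": sum(1 for v in variabilities if v > 0.02),
        "strongly_variable": sum(1 for v in variabilities if v > 0.05),
    }

# Hypothetical stars: (l, b, parallax [mas], G, variability)
stars = [
    (209.0, -19.4, 2.5, 15.0, 0.08),
    (209.0, -19.4, 2.5, 15.0, 0.01),
    (209.0, -19.4, 1.2, 15.0, 0.03),  # behind the parallax window
    (100.0, 5.0, 2.5, 15.0, 0.06),    # outside the sky box
    (200.0, -12.0, 2.2, 17.0, 0.03),
]
selected = [s for s in stars if in_orion_box(*s[:4])]
counts = split_by_variability([s[4] for s in selected])
print(counts)
```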

And now an aside on the aside: this is all for PMS stars (in Orion absG>4-ish). The more massive stars in Orion (absG<3, parallax and PM selected) that have presumably reached their MS show no variability: shown as larger symbols (color-coded by variability) on top of the lower-mass var > 0.02 background.

Seems all pretty neat.

[end of aside] -----------------------------------------------------

On-Sky (and Parallax) Distribution of these Stars

If that interpretation is correct that variability >0.02 is an efficient (both reasonably complete and pure) YSO selector, then this should be reflected in the sky distribution.

Let's start with the "easy" case: 2300 YSOs with distinct W1-W2 excess (>0.2) (and H<12.5 and D<1kpc) shown as a sky map with parallax as color-coding.

Stars with W1-W2>0.2, H<12.5 and D<1kpc.

As above with larger dot sizes to show the correlation between position and distance.

This seems to give a very clean sample, as before here, just restricted to D<1kpc.

Let's now contrast that with the control sample that has no (<0.02mag) variability.

.. a nice smooth on-sky distribution, with most stars near the geometric sample limit, 1kpc.

Now what about the stars that have no W1-W2 excess (actually all variable stars, most of which have no W1-W2 excess)? Their sky distribution looks like this (var > 0.02mag)

All stars (H<12.5, D<1kpc) that vary at >0.02mag, color-coded by distance. The two vertical features must be data artifacts.

and like this when restricting to var > 0.05mag, or like this

Question: what is that warped configuration? Gould's belt? I have no idea... Tell me what paper to read.

when restricting to D<500pc. If we take the subsample with small mid-IR excess (W1-W2~0.18) the sky distribution looks like this:

Again showing the stronger spatial clustering of younger objects. (my conjecture)

Next steps: verification

My current conclusion is that variability alone is very effective at picking out objects too young to have settled on the MS of their mass. If this is a useful definition of YSOs, then most of them are YSOs. The ones with strong mid-IR excess (W1-W2) are very tightly clustered. The ones with W1-W2=0 and low-variability (0.02-0.05mag) have a considerably smoother sky distribution. If many of them are 30Myr+ old, this may not be surprising.

Questions to all:

What needs to be done to verify this?

What's the interest in young (PMS) "field" stars?

Please play with the query results, as a proposed sample file to draw from. ( https://www.dropbox.com/s/de832x5p78q6y8h/GDR2_var%3E0.02_H%3C12.5_G%3C18.fits?dl=0 )

What's the best way to augment all of this by WISE-selected (Class I) sources, that don't show up in Gaia, but have H<12? Of the MANY towards the Galactic center, which are interesting?

==============================================================

Another aside: which sources does a simple criterion W1-W2>0.25, H<12 select, and which of these does a Gaia variability selection miss?

And, is it enough to get all the ones that have H<11 through the GalacticGenesis program anyway? [Should those get a priority flag?] The plots below show ALL W1-W2>0.25, H<12. sources.

Here is an approximate map of those sources that are NOT in Gaia

and here is their galactic latitude distribution quantified: they are almost ALL exactly in the Galactic plane.

I.e. there is a modest number of such missed sources (11<H<12, not in Gaia) in Orion, but the vast majority of them are in the inner disk (within 1deg of the Galactic plane). What should we do about them?

Let's look at the Orion region, defined as

There are 1127 sources that pass H<12 and W1-W2>0.25. Of those, 1085 (96%) are in Gaia, and 979 (87%) are bright enough to be included in the variability selection. Do we need to address those?

end of aside
====================================================================

Implications for SDSS-V target selection:

This picks out nearly 100k YSO/young-star targets, which is more than we can target.

My proposal for SDSS-V YSO targeting: let's make the sample definition:

-- all stars with H<12.5, and W1-W2>0.25 and D<5kpc

and

-- all stars with H<12.5, variability>0.02mag and D<1kpc

Or put differently (equivalent as long as the W1-W2>0.25 sources within 1kpc also pass the variability cut):

-- all stars with H<12.5, variability>0.02mag and D<1kpc

augmented by

-- all stars with H<12.5, and W1-W2>0.25 and 1kpc<D<5kpc

Implicit is G<18, and we then need to set a targeting priority, where priority decreases as W1-W2 decreases. E.g. if "second priority" are the targets with slight W2 excess, we still zoom in on clusters.
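The two ways of writing the sample definition, and the priority ordering, can be expressed as simple predicates. This is a hedged Python sketch: the function names and the example targets are hypothetical, and D is treated as a known distance in kpc, whereas the query above cuts on parallax minus parallax error. Taken literally, the two formulations differ for one kind of source: a star with W1-W2>0.25 inside 1kpc that does not pass the variability cut is included by the first but not by the second.

```python
def in_sample_union(h, g, w1w2, var, d_kpc):
    """First formulation: excess sources out to 5kpc, plus variables within 1kpc."""
    base = h < 12.5 and g < 18.0
    excess = w1w2 > 0.25 and d_kpc < 5.0
    variable = var > 0.02 and d_kpc < 1.0
    return base and (excess or variable)

def in_sample_augmented(h, g, w1w2, var, d_kpc):
    """Second formulation: variables within 1kpc, plus excess sources at 1-5kpc."""
    base = h < 12.5 and g < 18.0
    variable = var > 0.02 and d_kpc < 1.0
    distant_excess = w1w2 > 0.25 and 1.0 < d_kpc < 5.0
    return base and (variable or distant_excess)

# Made-up targets, only to exercise the predicates.
targets = [
    {"name": "A", "h": 11.0, "g": 16.0, "w1w2": 0.50, "var": 0.06, "d_kpc": 0.4},
    {"name": "B", "h": 11.5, "g": 17.0, "w1w2": 0.02, "var": 0.03, "d_kpc": 0.8},
    {"name": "C", "h": 12.0, "g": 17.5, "w1w2": 0.40, "var": 0.01, "d_kpc": 3.0},
    {"name": "D", "h": 11.0, "g": 16.0, "w1w2": 0.50, "var": 0.01, "d_kpc": 0.4},
]
union = [t["name"] for t in targets
         if in_sample_union(t["h"], t["g"], t["w1w2"], t["var"], t["d_kpc"])]
augmented = [t["name"] for t in targets
             if in_sample_augmented(t["h"], t["g"], t["w1w2"], t["var"], t["d_kpc"])]

# Priority decreases as W1-W2 decreases: sort by W1-W2, largest excess first.
by_priority = sorted(targets, key=lambda t: t["w1w2"], reverse=True)
print(union, augmented, [t["name"] for t in by_priority])
```

How many sources fall into the gap between the two formulations can be checked directly on the sample file; if nearby excess sources are nearly always variable, the difference is negligible.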

Sunday, 23 December 2018