JWST associations are produced by combining science exposures using a set of predefined rules that depend on the type of data and observation.
Associations describe the relationships between multiple exposures and give the user the means to identify a set of exposures that belong together and may depend on one another. They allow data to be calibrated, archived, retrieved, and reprocessed as a group rather than as individual objects, and they allow the user to combine exposures from a single observation, from multiple observations, or even from different programs. Finally, associations capture the relationship between exposures and higher level data products.
To capture the list of exposures that could potentially form an association, along with relevant information about those exposures, DMS first generates an association pool. Pools are simple ASCII files that contain the metadata for all the data for a given proposal. The association generator, which runs within the calibration software, uses these pools to generate association definitions in JSON format. Based on the association table content, the calibration software creates the associated data products.
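As a rough illustration of what reading a pool might involve, the sketch below parses a small in-memory table with Python's standard csv module. The pipe delimiter, the column names, and the file names are all assumptions made for this example; they are not taken from an actual DMS pool file.

```python
import csv
import io

# Hypothetical pool content: one row of metadata per exposure.
# Delimiter, column names, and file names are illustrative assumptions.
POOL_TEXT = """\
filename|program|target|instrument|filter
jw00001001001_01101_00001_nrca1_uncal.fits|00001|1|nircam|f200w
jw00001001001_01101_00002_nrca1_uncal.fits|00001|1|nircam|f200w
"""


def read_pool(text, delimiter="|"):
    """Return the pool as a list of dicts, one per exposure row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


rows = read_pool(POOL_TEXT)
for row in rows:
    print(row["filename"], row["target"], row["filter"])
```

Each row keeps its metadata as plain strings, which is enough for a rule to compare values such as target or filter.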
The basic associations that the pipeline generates are combinations of mosaics/dithers for a single observation. Higher level products are built by associating data outside the routine science data processing pipeline; these might include multiple observations of a single target in a program, or large mosaics similar to those of the Hubble Legacy Archive (HLA) for HST data.
Components of an association
An association can contain data products, related files, and contemporaneous calibration files.
- Data products are multiple exposures from a science instrument that are taken as part of a dither, a mosaic, a coronagraphic or AMI image, a time series observation, or a moving target observation.
- Related data files that provide support to the science data are included in an association. Supporting data files include jitter data, target acquisition images, background images and confirmation images.
- Contemporaneous calibration files are exposures used to calibrate the science data, such as wavelength calibrations, lamp data, or flat fields, which are executed in the same time frame as the science exposures.
There is a separate association for each program, and for each target within a program if it can be determined that multiple targets in the program are not related. Each association follows the same format. The pool of data considered when constructing associations within a specific program is called the association pool, and it includes:
- All observations from the same target in a given program
- Observations from multiple science instruments of a given target within the same program
- Different filters for the same target within the same program
- Exposures from linked observations within the same program
- Calibration exposures, which can be members of more than one association pool
When all of the exposures for an observation, or set of observations, have been collected, the association generator determines which exposures are needed for the stage 3 (and in some instances stage 2) data products and outputs an association table that documents the content of the association. Multiple association tables can be generated from a single association pool. In DMS, the association generator creates associations by applying rules that classify the data into one or more associations. Users should not need to run the generator; instead, they are expected to edit the existing association that accompanies their JWST data.
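The idea of rules classifying pool entries into associations can be sketched with a simple grouping function. This is a toy model under stated assumptions: the pool rows, the column names, and the two "rules" (sets of columns that must match) are hypothetical and far simpler than the rules the real generator applies.

```python
from collections import defaultdict

# Hypothetical pool rows; names and columns are illustrative only.
POOL = [
    {"filename": "exp_a", "target": "1", "instrument": "nircam", "filter": "f200w"},
    {"filename": "exp_b", "target": "1", "instrument": "nircam", "filter": "f200w"},
    {"filename": "exp_c", "target": "1", "instrument": "nircam", "filter": "f444w"},
    {"filename": "exp_d", "target": "2", "instrument": "miri", "filter": "f770w"},
]


def group_by_rule(pool, keys):
    """Group exposures whose metadata agree on every column in `keys`."""
    groups = defaultdict(list)
    for row in pool:
        groups[tuple(row[k] for k in keys)].append(row["filename"])
    return dict(groups)


# Two toy rules applied to the same pool yield different sets of associations.
per_filter = group_by_rule(POOL, ("target", "instrument", "filter"))
per_target = group_by_rule(POOL, ("target",))
print(per_filter)
print(per_target)
```

Applying more than one rule to the same pool is what allows a single pool to yield several association tables.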
An association table is a JSON-formatted file that lists all related data that might or might not be combined into a single image. An example of the format for this file is below.
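The following is a simplified sketch of such a table, built as a Python dictionary and serialized with the standard json module. The key names are assumed to follow the layout used by the jwst calibration package and should be checked against its documentation; every value (program number, file names, pool name, product name) is made up for illustration.

```python
import json

# Simplified, hypothetical association table. Key names are assumed to
# match the jwst package layout; all values are illustrative.
association = {
    "asn_type": "image3",
    "program": "12345",
    "asn_id": "o001",
    "target": "t001",
    "asn_pool": "hypothetical_pool.csv",
    "products": [
        {
            "name": "jw12345-o001_t001_nircam_f200w",
            "members": [
                {
                    "expname": "jw12345001001_01101_00001_nrca1_cal.fits",
                    "exptype": "science",
                },
                {
                    "expname": "jw12345001001_01101_00002_nrca1_cal.fits",
                    "exptype": "science",
                },
            ],
        }
    ],
}

asn_text = json.dumps(association, indent=4)
print(asn_text)
```

Editing an association then amounts to adding or removing entries in a product's list of members and saving the file again.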
Association names have to be unique and allow for different possible types of data to be associated. Associations produced in stage 3 of the calibration pipeline have the following format:
- <ProgramID> or <ppppp> is the program identifier,
- <AC_ID> is the association candidate (AC) ID. This AC ID table is a byproduct of the APT. There are four types of association candidates:
  - <"o"ooo> can be o001 - o999 and is for associations constructed directly from all observations in an observation folder in APT.
  - <"c"cccc> can be c1000 - c2999 and is for candidate associations constructed from linked observations within a program via APT.
  - <"a"aaaa> can be a3000 - a4999 and is for archive associations, constructed from linked observations within a program that are not linked explicitly via APT.
  - <"r"rrrr> can be r5000 - r9999 and is reserved for future use. This type includes associations that do not fall within the above types; i.e. high-level products.
- <target|source ID> is the target or source identifier; one of the following should be present:
  - <tTTT> is a three-digit target ID, usually for imaging targets.
  - <sSSSSS> is a five-digit source ID, usually for spectral targets.
- <"epoch"X> is the text "epoch" followed by a single digit indicating the epoch number. This is an optional parameter.
- <science_instrument> is the science instrument; e.g. nircam, miri, nirspec, etc.
- <optical_elements> is a list of optical elements separated by "-"; e.g. grating-filter.
- <subarray> is the subarray. This is an optional parameter.
- <product_type> is the suffix for the product type. See File Naming Conventions and Data Products for a listing of product types.
- <ACT_ID> is a two-digit number indicating the activity ID
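The association candidate ranges listed above lend themselves to a small lookup. The sketch below classifies an AC ID string by its prefix letter and number; the function name and the type labels are hypothetical, and only the prefixes and numeric ranges come from the list.

```python
import re

# Prefix letter -> (label, lowest number, highest number), from the list above.
AC_TYPES = {
    "o": ("observation", 1, 999),
    "c": ("candidate", 1000, 2999),
    "a": ("archive", 3000, 4999),
    "r": ("reserved", 5000, 9999),
}


def classify_ac_id(ac_id):
    """Return the candidate type label for an AC ID such as 'o001' or 'c1000'."""
    match = re.fullmatch(r"([ocar])(\d{3,4})", ac_id)
    if match is None:
        raise ValueError(f"not a recognised AC ID: {ac_id!r}")
    prefix, digits = match.group(1), match.group(2)
    label, low, high = AC_TYPES[prefix]
    # "o" IDs carry three digits; the other types carry four.
    expected_width = 3 if prefix == "o" else 4
    if len(digits) != expected_width or not low <= int(digits) <= high:
        raise ValueError(f"{ac_id!r} is outside the {label} range")
    return label


for example in ("o001", "c1000", "a3001", "r5000"):
    print(example, classify_ac_id(example))
```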
Underscores are used to separate fields within the file name. Dashes are used to separate values within fields. An example for an imaging association would look like:
Types of associations
The following tables list the types of data that might belong to an association, if they exist for that type of observation. Any background or PSF observation can belong to more than one association.
Table 1. How data is associated
|Type of data||Association|
|Astrometric confirmation images||x||x||x|
|Single target observation||All dither points||x||x||x||x|
|All nodding points||x||x|
|All mosaic tiles within an observation folder||x||x|
|All mosaic tiles in different observation folders||x||x|
In different orientation (target grouped via special requirements sequence observation non-interruptible)
|Autowave calibration observations||x|
|Autoflat calibration observations||x|
|MSA Plan sources catalog||x|
|PSF observation associated with the science target||x|
Outlier Detection Associations Products
The Outlier Detection step is part of stage 3 of the calibration pipeline; however, its products have the characteristics of those produced in stage 2. They are copies of the _cal files produced in stage 2, but with the DQ array updated with flags for new outliers detected as part of the mosaicking or cube building step. The outliers are mainly due to cosmic rays that were not detected and handled as part of CALDETECTOR1.
Because these files are produced in stage 3, their root names include the association ID from which they were derived. In this case, the name has the form
where all the fields, except for <AC_ID> and <suffix>, follow the same naming convention as those of the exposure file names produced in stage 2. The <AC_ID> parameter takes the value of the association ID that created the files and can have the form of any of the association candidate types. These files are referred to as stage 2c products, and they can be recreated as new data for a given association become available. To clearly distinguish these products from their original stage 2 counterparts, the <suffix> is different.
Table 2. Stage 2c products suffix and associated stage 2 data
|Stage 3 suffix||Type of data|
|_crf||2D calibrated data with DQ array updated by outlier_detection step. Copy of _cal input.|
|_crfints||3D calibrated data with DQ array updated by outlier_detection step. Copy of _calints input. One _annnn_crfints product per target _calints input (none for PSF inputs).|
In this case, the _crf suffix stands for "cosmic-ray flags".
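The suffix mapping in Table 2 can be sketched as a small renaming function. It assumes that the association ID is placed immediately before the suffix, as the `_annnn_crfints` pattern in Table 2 suggests; the function name and the example file name are hypothetical.

```python
# Stage 2 suffix -> stage 2c suffix, following Table 2.
STAGE2C_SUFFIX = {"cal": "crf", "calints": "crfints"}


def stage2c_name(stage2_name, ac_id):
    """Derive a stage 2c file name from a stage 2 file name and an AC ID.

    Assumes the AC ID goes directly before the product suffix.
    """
    root, extension = stage2_name.rsplit(".", 1)
    base, suffix = root.rsplit("_", 1)
    if suffix not in STAGE2C_SUFFIX:
        raise ValueError(f"no stage 2c counterpart for suffix {suffix!r}")
    return f"{base}_{ac_id}_{STAGE2C_SUFFIX[suffix]}.{extension}"


print(stage2c_name("jw12345001001_01101_00001_nrca1_cal.fits", "o001"))
print(stage2c_name("jw12345001001_01101_00001_nrca1_calints.fits", "a3001"))
```

Because the association ID is part of the name, the same stage 2 exposure can have several stage 2c counterparts, one per association that used it.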
Table 3 shows some examples of stage 2c products. The first and second associations shown here are constructed from a two-point dither, 2 × 1 mosaic, which generates four stage 2c products (eight if NINTS > 1), each with the DQ array updated after combining these observations. The third example is a mosaic-of-mosaics association that uses the two previously created associations as input. Note that the latter uses association ID <"c"cccc> because it is assumed to be an association constructed from linked observations within a program via APT; otherwise, <"a"aaaa> would have been used.
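The product counts quoted above follow from simple arithmetic, sketched here. The function is hypothetical and assumes one stage 2 file per exposure (a single detector), with one _crf product per exposure and one additional _crfints product per exposure when NINTS > 1.

```python
def count_stage2c_products(dither_points, mosaic_tiles, nints):
    """Count stage 2c products, assuming one stage 2 file per exposure."""
    exposures = dither_points * mosaic_tiles
    products = exposures  # one _crf per _cal input
    if nints > 1:
        products += exposures  # one _crfints per _calints input
    return products


# Two-point dither on a 2 x 1 mosaic (two tiles).
print(count_stage2c_products(dither_points=2, mosaic_tiles=2, nints=1))  # 4
print(count_stage2c_products(dither_points=2, mosaic_tiles=2, nints=5))  # 8
```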
Table 3. Example association and its stage 2c products
(combines o001 and o002 above)
Note that the above example omits the _calints and _crfints products that are created when NINTS > 1.
Observations of this type are characterized by a large number of integrations, which can result in very large stage 1 and stage 2 files; these can be 200 GB in size or more. Transferring or calibrating files of this size can be limiting for the user, so DMS will break them into files of more manageable size and document their relationship and order within the association, so that users can easily reconstruct their observations.
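One way to picture this splitting is as a plan that assigns consecutive integration ranges to numbered segments. The sketch below is illustrative only: the function name, the per-integration size, and the size limit are hypothetical, and the actual limit and bookkeeping used by DMS are not described here.

```python
def plan_segments(nints, bytes_per_integration, max_segment_bytes):
    """Split integrations 1..nints into ordered, consecutive segments.

    Returns a list of (segment_number, first_integration, last_integration).
    """
    # Always place at least one integration in a segment.
    ints_per_segment = max(1, max_segment_bytes // bytes_per_integration)
    segments = []
    start = 1
    while start <= nints:
        end = min(start + ints_per_segment - 1, nints)
        segments.append((len(segments) + 1, start, end))
        start = end + 1
    return segments


# Ten integrations of 4 units each, with at most 12 units per segment.
for segment in plan_segments(nints=10, bytes_per_integration=4, max_segment_bytes=12):
    print(segment)
```

Recording the segment number together with the integration range is what would let a user put the pieces back in order.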
JWST-STScI-004078, "Design of Imaging Associations"