JWST Data Products
Content Migration
WARNING: This page has been deprecated and the content migrated to JWST Science Data Overview. Please update your links accordingly.
Working with JWST data requires an understanding of the JWST data products and of the steps in the JWST Science Calibration Pipeline that produce them.
See also: Getting Started with JWST Data, Understanding JWST Data Files, File Header Contents
Software documentation outside JDox: File Naming Conventions, Data Product Types, Science Product Structures and Extensions, Data File Associations, JWST Data Products Information
Science data products
JWST science data products can be divided into 4 main types, depending on the stage of processing and on how the files relate to other exposures:
- Uncalibrated products from single exposures.
- Stage 1 products, also from single exposures.
- Stage 2 products, produced from single or multiple exposures during stage 2 of the JWST Science Calibration Pipeline.
- Stage 3 products, which result from combining and/or associating single exposures into a single integrated product in stage 3 of the JWST Science Calibration Pipeline.
Within the 2 categories that include multiple exposures, there are different ways in which a set or subset of exposures can be combined, and each of these has a unique association. More information about the JWST data products and how they relate to the different types of data and calibration steps can be found in the software documentation.
Uncalibrated data
Uncalibrated data are generated by the Science Data Processing (SDP) system from telemetry data and are the input to stage 1 of the calibration pipeline. They come from single exposures and are usually contained within a single FITS file. However, when the raw data volume for an individual exposure is large enough, as for time-series observations, the uncalibrated data are broken into multiple segments of less than 2 GB each to keep file sizes at a reasonable level. The file names of these segments usually include "segNNN", where NNN is 1-indexed and always includes any leading zeros.
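As an illustration of the "segNNN" convention, the following minimal sketch extracts the segment number from a file name and orders the segments of one exposure. The file names are hypothetical, and the assumption that the token appears as "-segNNN" with exactly three digits is ours; check it against real file names before relying on it.

```python
import re

# Assumption: the segment token appears as "-segNNN", with NNN a
# zero-padded, 1-indexed, three-digit number.
_SEGMENT_RE = re.compile(r"-seg(\d{3})")


def segment_number(filename):
    """Return the 1-indexed segment number, or None for an unsegmented file."""
    match = _SEGMENT_RE.search(filename)
    return int(match.group(1)) if match else None


# Hypothetical file names, for illustration only.
names = [
    "jw01234001001_01101_00001-seg002_nrs1_uncal.fits",
    "jw01234001001_01101_00001-seg001_nrs1_uncal.fits",
    "jw01234001001_01101_00002_nrs1_uncal.fits",
]

# Keep only segmented files and put them in segment order.
segments = sorted(
    (name for name in names if segment_number(name) is not None),
    key=segment_number,
)
```

Because the leading zeros are always present, a plain alphabetical sort of the file names gives the same order as long as the rest of the name is identical.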
Stage 1 data products
Stage 1 data products from single exposures can be science or non-science products. Science products include files generated by intermediate steps, like count rate data, or the final product generated by this stage. Some of these data are generated by default and can be retrieved from MAST, while others are optional products that can be generated by reprocessing the data yourself; check the example Jupyter notebooks to learn how to run the pipeline. Non-science products can be calibration data products, like dark exposures, that are processed with the same steps used in stage 1 for JWST science data. They can also be auxiliary products that provide relevant information about the data, such as charge trap state data.
Stage 2 data products
Stage 2 data products depend on the observing mode and can be generated from a single exposure or from the combination of more than one exposure. The number and type of products generated vary for imaging, spectroscopy, and time-series observation (TSO) data. For spectroscopy data, they also depend on how the observation was planned. When observations include background exposures, background-subtracted data are also produced by default. Information about which data should be combined is captured in stage 2 association files. In this stage, users have the option to produce an on-the-fly constructed flat for NIRSpec data.
Stage 3 data products
Stage 3 data products result from the combination of stage 2 exposures into a single integrated product in stage 3 of the JWST Science Calibration Pipeline. The type and number of data products of this stage vary with the type of observation, but each of them corresponds to a unique association. The information about which data will be combined is captured in association files, described below.
Association overview
See also: Understanding JWST Data Files
Software documentation outside JDox: JWST Associations
Relationships between multiple exposures are captured in an association, which is a means of identifying a set of exposures that belong together and may be dependent upon one another. The association concept permits exposures to be calibrated, archived, retrieved, and reprocessed as a set rather than as individual objects.
To capture a list of exposures that could potentially form an association product, and to provide relevant information about those exposures, the JWST Science Calibration Pipeline first generates an association pool. Association pools are Astropy tables that contain the metadata for all the data in a given proposal. These pools are then used by the association generator, which runs within the software, to generate stage 2 or stage 3 associations in JSON format. Based on the association pool content, the JWST Science Calibration Pipeline software creates the associated data products.
The basic association products that the pipeline generates are combinations of mosaics or dithers for a single observation. Higher level products are built by associating data outside the routine science calibration pipeline; these might include multiple observations of a single target in a program, or large mosaics similar to the Hubble Legacy Archive (HLA) products for HST data. An association file can contain several science data products, related files that support the science data (e.g., jitter data or target acquisition images), and contemporaneous calibration files used to calibrate the science data.
Association pool
See also: Understanding JWST Data Files, Association Generator, Types of Associations
There is a separate association file for each program and each target within a program. The list of data that goes into constructing an association file, within a specific program, is contained in what is called an association pool. This will include:
- All observations from the same target in a given program
- Observations of a given target from multiple science instruments within the same program
- Different filters for the same target within the same program
- Exposures from linked observations within the same program
- Calibration exposures, which can be members of more than one association pool
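To make the idea of a pool more concrete, the sketch below represents a few pool rows as plain dictionaries and groups them by program and target. The column names and values are hypothetical; real pools are Astropy tables with many more columns, and the grouping shown here is only a toy stand-in for how exposures of the same target end up together.

```python
from collections import defaultdict

# Hypothetical pool rows; column names and values are illustrative only.
pool = [
    {"filename": "exp_a.fits", "program": "01234", "target": "1",
     "instrument": "NIRCAM", "filter": "F200W"},
    {"filename": "exp_b.fits", "program": "01234", "target": "1",
     "instrument": "MIRI", "filter": "F770W"},
    {"filename": "exp_c.fits", "program": "01234", "target": "2",
     "instrument": "NIRCAM", "filter": "F200W"},
]


def group_by_target(rows):
    """Collect file names per (program, target) pair."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["program"], row["target"])].append(row["filename"])
    return dict(groups)


groups = group_by_target(pool)
```

In this toy example, exposures of target "1" taken with different instruments and filters fall into the same group, mirroring the list above.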
Association generator
See also: Understanding JWST Data Files, Association Pool, Types of Associations
In the JWST Science Calibration Pipeline, the association generator classifies the data into one or more association files based on a set of rules. When all of the exposures for an observation or set of observations have been collected, the association generator determines which exposures are needed for the stage 2 and stage 3 data products, and outputs an association file listing all of these files. Multiple association files can be generated from a single association pool. Observers should not need to run the generator; instead, it is expected that they will edit the existing association files that accompany the JWST data.
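Editing an existing association usually amounts to changing its list of members. The following is a minimal sketch of removing one exposure, assuming the association has been loaded into a Python dictionary with "products", "members", and "expname" fields. The content shown is hypothetical and the field names should be checked against an actual association file.

```python
import copy
import json

# Hypothetical association content, for illustration only.
asn = {
    "asn_type": "image3",
    "asn_id": "o001",
    "products": [
        {
            "name": "example_product",
            "members": [
                {"expname": "exp_a_cal.fits", "exptype": "science"},
                {"expname": "exp_b_cal.fits", "exptype": "science"},
                {"expname": "exp_c_cal.fits", "exptype": "background"},
            ],
        }
    ],
}


def drop_members(association, unwanted):
    """Return a copy of the association without the named exposures."""
    edited = copy.deepcopy(association)  # leave the original untouched
    for product in edited["products"]:
        product["members"] = [
            member for member in product["members"]
            if member["expname"] not in unwanted
        ]
    return edited


edited = drop_members(asn, {"exp_b_cal.fits"})

# Serialized form, ready to be written to a new file.
edited_text = json.dumps(edited, indent=4)
```

Writing the result to a new file name, rather than overwriting the delivered association, keeps the original available for comparison.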
Association file
An association file is a JSON-formatted file that includes the list of all related data that might be combined into a single image, as shown in this example stage 3 association. Each association is intended to produce a specific kind of science product, identified by its association type, which in turn determines which pipeline module to use.
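As a rough sketch of how the association type can drive the choice of pipeline module, the example below parses a small JSON association and looks up a pipeline name. The JSON content is hypothetical, the field names ("asn_type", "products", "members", "expname") are assumptions to verify against a real file, and the mapping is illustrative and not exhaustive.

```python
import json

# Hypothetical association text, for illustration only.
ASN_TEXT = """
{
    "asn_type": "image3",
    "asn_id": "o001",
    "products": [
        {
            "name": "example_product",
            "members": [
                {"expname": "exp_a_cal.fits", "exptype": "science"},
                {"expname": "exp_b_cal.fits", "exptype": "science"}
            ]
        }
    ]
}
"""

# Illustrative, non-exhaustive mapping from association type to the name
# of a pipeline class; treat it as an assumption, not a reference.
PIPELINE_FOR_ASN_TYPE = {
    "image2": "Image2Pipeline",
    "spec2": "Spec2Pipeline",
    "image3": "Image3Pipeline",
    "spec3": "Spec3Pipeline",
    "coron3": "Coron3Pipeline",
    "ami3": "Ami3Pipeline",
    "tso3": "Tso3Pipeline",
}


def describe(asn_text):
    """Return (association type, pipeline name or None, member file names)."""
    asn = json.loads(asn_text)
    members = [
        member["expname"]
        for product in asn["products"]
        for member in product["members"]
    ]
    asn_type = asn["asn_type"]
    return asn_type, PIPELINE_FOR_ASN_TYPE.get(asn_type), members


asn_type, pipeline_name, members = describe(ASN_TEXT)
```

An unknown association type yields None for the pipeline name, so that unexpected files are flagged rather than silently processed.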
Types of association
See also: Understanding JWST Data Files, Association Pool, Association Generator
Table 1 shows a list of data types that might be included in an association file. Some of these files, if they exist for that type of observation, may or may not be used to create the association products. Any background or PSF observation can be part of more than one association product.
Table 1. List of data types that can be included in an association file
Type of data | Association | ||||
---|---|---|---|---|---|
Imaging | Spectroscopic | Coronagraphic | AMI | ||
Target Acquisition | x | x | x | x | |
Astrometric confirmation images | x | x | x | ||
Single target observation | All dither points | x | x | x | x |
All nodding points | x | x | |||
All mosaic tiles within an observation folder | x | x | |||
All mosaic tiles in different observation folders | x | x | |||
In different orientation (target grouped via special requirements sequence observation non-interruptible) | x | x | |||
Background observations | x | x | x | ||
Autowave calibration observations | x | ||||
Autoflat calibration observations | x | ||||
Confirmation Images | x | ||||
Pre-image | x | ||||
MSA Plan sources catalog | x | ||||
Leak image | x | ||||
PSF observation associated with the science target | x |
There are currently three types of associations: "candidate", "observation", and "discovered" associations. The rules used to select the data, along with the naming convention for each case, are provided in Table 2. Note that different types of associations might include the same datasets, associated following different rules.
Table 2. Association Types and type of data included
Association type | Name identifier | What | Type of data included |
---|---|---|---|
observation | oNNN | Groups data within an observation | |
candidate | c1NNN | Groups data within an observation or across different observations | |
discovered | a3NNN | Groups data in the whole program for the same target and configuration | This type of association is not being created at the moment. |
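The name identifiers in Table 2 can be recognized programmatically. The following minimal sketch classifies an identifier by its prefix, assuming that each N stands for a single decimal digit; the helper name is hypothetical.

```python
import re

# Patterns follow the identifiers in Table 2, assuming N is a decimal digit.
_ID_PATTERNS = [
    (re.compile(r"^o\d{3}$"), "observation"),
    (re.compile(r"^c1\d{3}$"), "candidate"),
    (re.compile(r"^a3\d{3}$"), "discovered"),
]


def association_kind(identifier):
    """Return the association type for an identifier, or None if unrecognized."""
    for pattern, kind in _ID_PATTERNS:
        if pattern.match(identifier):
            return kind
    return None


kinds = [association_kind(name) for name in ("o001", "c1004", "a3001", "x999")]
```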