JWST Data Structures
JWST data products generally share a basic structure, with slight variations that depend on the observing mode or instrument used. Their particular format depends on the stage of the JWST Science Calibration Pipeline at which they were created.
See also: Data File Naming Conventions, Associations, JWST Data Calibration Considerations
Figure 1. Flow of data from the telemetry packages to high-level data products
Uncalibrated data
After telemetry conversion and transformation from detector to science coordinates, the Science Data Processing (SDP) subsystem will generate uncalibrated (stage 0) FITS files with the pixel data organized by detector and exposure. SDP will format the science data into a 4-D data array, with keywords NAXIS1 = number of columns, NAXIS2 = number of rows, NAXIS3 = Ngroup, and NAXIS4 = Nint. See Understanding Exposure Times for the meaning of these parameters.
Figure 2. Science exposure data cube
The first 2 dimensions span each 2-D science image. The length of the 3rd axis is the number of groups in the exposure, and the length of the 4th axis is the number of integrations. All 4 dimensions will be used, even if Ngroup = 1 or Nint = 1. Each integration within an exposure must have the same number of groups. The standard readout sampling for all JWST detectors is up-the-ramp readout (sometimes referred to as MULTIACCUM), meaning pixel values increase between groups within an integration.
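As a rough illustration of this layout, the sketch below builds a small synthetic array with NumPy. It is not pipeline code: the array sizes and the count rate are made-up toy values, and the ramp is noise-free, so real data will not look this clean. The one point it relies on is that FITS lists the fastest-varying axis first (NAXIS1) while NumPy lists it last, so a FITS array read into Python has its axes in the reverse order.

```python
import numpy as np

# Toy sizes (assumptions for illustration, not real detector dimensions)
naxis1, naxis2, naxis3, naxis4 = 3, 4, 5, 2  # columns, rows, Ngroup, Nint

# NumPy axis order is the reverse of the FITS NAXISn order
shape = (naxis4, naxis3, naxis2, naxis1)  # (Nint, Ngroup, rows, columns)

# Noise-free up-the-ramp signal: every pixel gains `rate` counts per group
rate = 10.0  # hypothetical counts per group
group_index = np.arange(naxis3, dtype=float).reshape(1, naxis3, 1, 1)
cube = np.broadcast_to(rate * group_index, shape).copy()

# One integration is a 3-D stack of groups; one group is a 2-D image
first_integration = cube[0]      # shape (Ngroup, rows, columns)
last_group_image = cube[0, -1]   # shape (rows, columns)

# Differences between consecutive groups along the group axis
group_differences = np.diff(cube, axis=1)
ramps_increase = bool(np.all(group_differences >= 0))

print(cube.shape, last_group_image.shape, ramps_increase)
```

Indexing the first axis selects an integration and indexing the second selects a group, which is why the sketch differences along `axis=1` to inspect the ramp.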
Header keywords and relationships
The FITS header of stage 0 files will have keywords required by the FITS standard and keywords relevant to the observation that are extracted from the telemetry packet headers and science image headers. Figure 3 shows a schematic representation of how data from many sources are used by the SDP subsystem to populate the FITS headers of the science data. After completion of science data processing, the stage 0 data file will be ready for calibration.
Figure 3. Source and usage of information by the SDP system to populate the headers of science data
Knowledge of the keywords is an important first step to understanding JWST data. By examining the file header using tools such as Astropy, observers can find detailed information about the data, including:
- Coordinates of the target, program number, and other observation identifiers
- Date and time of the observation including start, end, and mid-exposure times
- Exposure parameter information, such as the instrument configuration (DETECTOR, FILTER, SUBARRAY)
- Readout definition parameters (READPATT, NINTS, NGROUPS, and GROUPGAP)
- Exposure-specific information, such as detailed timing and world coordinate system information
- Calibration information, such as the calibration switches and reference files used by the pipeline
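A minimal sketch of examining such keywords with Astropy follows. To keep it self-contained, it builds a header in memory instead of opening a file; the keyword values are placeholders chosen for illustration and do not come from a real exposure. With an actual data file, `astropy.io.fits.getheader` would return the primary header in the same form.

```python
from astropy.io import fits

# Placeholder values for illustration only (not from a real exposure)
header = fits.Header()
header["DETECTOR"] = "NRCA1"
header["FILTER"] = "F200W"
header["SUBARRAY"] = "FULL"
header["READPATT"] = "RAPID"
header["NINTS"] = 2
header["NGROUPS"] = 5
header["GROUPGAP"] = 0

# Collect the readout definition parameters into a plain dictionary
readout_keywords = ("READPATT", "NINTS", "NGROUPS", "GROUPGAP")
readout = {key: header[key] for key in readout_keywords}

# Header.get returns a default when a keyword is absent instead of raising
missing = header.get("NOTAKEY", "not present")

for key, value in readout.items():
    print(f"{key:8s} = {value!r}")
```

Using `header.get` with a default is a convenient way to handle instrument-specific keywords that may not appear in every file.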
Following FITS conventions, each keyword name is no longer than 8 characters, and its value can be an integer, a real (floating-point) number, or a character string. Several keywords are common to all JWST data, and others are instrument-specific.
Header keywords related to a particular topic are kept together logically, such as the program information or target information. This sample data header shows some of the keywords and groupings. The full sample of schematic headers for all the JWST modes can also be found in MAST. The JWST Keyword Dictionary in the MAST documentation contains the complete list of standard JWST header keywords, the FITS header extension where they can be found, where the information comes from, and their valid values.
Calibrated data
All JWST FITS calibrated data products have a few common features in their structure and organization:
- The FITS primary header data unit only contains header information, in the form of keyword records, with an empty data array, which is indicated by the occurrence of NAXIS = 0 in the primary header. Metadata that pertains to the entire product is stored in keywords in the primary header. Metadata related to specific extensions in the data products is stored in keywords in the headers of each extension.
- All data related to the product are contained in one or more FITS image ("IMAGE") or binary table ("BINTABLE") extensions. The header of each extension may contain keywords that pertain uniquely to that extension.
The default and optional output files for each stage of processing (stages 1–3) are listed in the table of data products. The number and type of extensions depend on the data product, as shown in the Science Product tables, which also provide more details about the structure and type of information contained in each extension.