JWST Data Structures
JWST data products generally share a basic structure, with slight variations that depend on the observing mode or instrument used. Their particular format depends on the stage of the JWST Science Calibration Pipeline at which they were created.
After telemetry conversion and transformation from the detector to science coordinates, Science Data Processing (SDP) will generate uncalibrated (stage 0) FITS files with the pixel data organized by detector and exposure. SDP will format the science data into a 4D data array, with NAXIS1 = column, NAXIS2 = row, NAXIS3 = NGROUPS, NAXIS4 = NINTS. See Understanding Exposure Times for the meaning of these parameters.
The first two dimensions are the 2D science images, the dimension of the third axis is determined by the number of groups in the exposure, and the fourth axis corresponds to the number of integrations. All four dimensions will be used, even if NGROUPS = 1 or NINTS = 1. Each integration within an exposure must have the same number of groups. The standard readout sampling for all JWST detectors is up-the-ramp readout (sometimes referred to as MULTIACCUM), meaning pixel values increase between groups within an integration.
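The axis layout described above can be illustrated with a small synthetic array. This is only a sketch: the dimensions and count rates are made-up values, not taken from any real exposure. Note that NumPy (and therefore astropy) indexes arrays in the reverse of the FITS axis order, so the 4D array appears as (NINTS, NGROUPS, row, column).

```python
import numpy as np

# Hypothetical small exposure: 2 integrations, 5 groups, 4 x 6 pixel subarray.
nints, ngroups, nrows, ncols = 2, 5, 4, 6

# FITS order is NAXIS1 = column ... NAXIS4 = NINTS; NumPy reverses it.
shape = (nints, ngroups, nrows, ncols)

# Simulate noiseless up-the-ramp sampling: each pixel accumulates a
# constant (made-up) number of counts per group.
rng = np.random.default_rng(0)
rate = rng.uniform(1.0, 10.0, size=(nrows, ncols))
group_index = np.arange(1, ngroups + 1).reshape(1, ngroups, 1, 1)
ramps = np.broadcast_to(rate * group_index, shape).copy()

# A single 2D science image: integration 0, group 0.
first_group_image = ramps[0, 0]

# The ramp of one pixel (row 2, column 3) within integration 0.
pixel_ramp = ramps[0, :, 2, 3]

# In this idealized case, values increase from group to group.
is_increasing = bool(np.all(np.diff(ramps, axis=1) > 0))
```

Real ramps include noise, cosmic-ray jumps, and saturation, so a strict increase between groups is not guaranteed for every pixel in actual data.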
Header keywords and relationships
The FITS header of stage 0 files will have keywords required by the FITS standard and those extracted from the telemetry packet headers and science image headers. Figure 3 shows a schematic representation of how the data from many sources are used by the SDP system to populate the FITS headers of the science data. After the completion of science data processing, the stage 0 data file will be ready for calibration.
Knowledge of the keywords is an important first step to understanding JWST data. By examining the file header using tools such as astropy, observers can find detailed information about the data, including:
- Coordinates of the target, program number, and other observation identifiers
- Date and time of the observation including start, end, and mid-exposure times
- Exposure parameter information, such as the instrument configuration (DETECTOR, FILTER, SUBARRAY)
- Readout definition parameters (READPATT, NINTS, NGROUPS, and GROUPGAP)
- Exposure-specific information, such as detailed timing and world coordinate system information
- Calibration information, such as the calibration switches and reference files used by the pipeline
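As a minimal sketch of examining a header with astropy, the example below builds a stand-in primary header in memory and reads back some of the keywords named in the list above. The keyword values are placeholders chosen for illustration, not values from a real observation.

```python
from astropy.io import fits

# Build a stand-in primary HDU so the example runs without a data file.
# All values below are placeholders, not real observation metadata.
primary = fits.PrimaryHDU()
primary.header["DETECTOR"] = "NRCA1"
primary.header["FILTER"] = "F200W"
primary.header["SUBARRAY"] = "FULL"
primary.header["READPATT"] = "RAPID"
primary.header["NINTS"] = 2
primary.header["NGROUPS"] = 5
primary.header["GROUPGAP"] = 0
hdul = fits.HDUList([primary])

header = hdul[0].header

# Collect the readout definition parameters into a dictionary.
readout_keys = ("READPATT", "NINTS", "NGROUPS", "GROUPGAP")
readout = {key: header[key] for key in readout_keys}

# Header.get returns a default instead of raising when a keyword is absent.
pupil = header.get("PUPIL", "not present")

for key, value in readout.items():
    print(f"{key:8s} = {value}")
```

For a file on disk, the same header object could be obtained with fits.getheader("your_file.fits", 0), where the file name is hypothetical.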
Following FITS conventions, each keyword is no longer than eight characters, and its value can be an integer, a real (floating-point) number, or a character string. Several keywords are common to all JWST data, and others are instrument-specific.
Header keywords related to a particular topic are kept together logically, such as the program information or target information. This sample data header shows some of the keywords and groupings. The full sample of schematic headers for all the JWST modes can also be found in MAST. The JWST Keyword Dictionary in the MAST documentation contains the complete list of standard JWST header keywords, the FITS header extension where they can be found, where the information comes from, and their valid values.
All JWST FITS calibrated data products have a few common features in their structure and organization:
- The FITS primary Header Data Unit (HDU) contains only header information, in the form of keyword records, with an empty data array, which is indicated by NAXIS = 0 in the primary header. Metadata that pertains to the entire product is stored in keywords in the primary header. Metadata related to specific extensions in the data products is stored in keywords in the headers of each extension.
- All data related to the product are contained in one or more FITS IMAGE or BINTABLE extensions. The header of each extension may contain keywords that pertain uniquely to that extension.
The default and optional output files for each stage of processing (stages 1-3) are listed in the table of data products. The number and type of extensions depend on the data product, as shown in the Science Product tables, which also provide more details about the structure and type of information contained in each extension.