JWST Data Structures

JWST data products generally share a basic structure, with slight variations that depend on the observing mode or instrument used. The particular format of a file depends on the stage of the JWST Science Calibration Pipeline that created it.


See also: Data File Naming Conventions, Associations, JWST Data Calibration Considerations


Figure 1 shows the flow of the data through the Data Management System (DMS), from the original telemetry packages to the combined set of data. Uncalibrated data (stage 0 data files) and all science data processed in stage 1 and stage 2 of the JWST Science Calibration Pipeline are FITS files that contain the pixel values for a single exposure from a single detector. Stage 3 of that pipeline associates and combines a set of these files into a single integrated product. Any catalogs generated by the science calibration pipeline are ASCII ECSV (Enhanced Character-Separated Values) files. As data travels through the stages of processing, its structure may change.

Figure 1. Flow of data from the telemetry packages to high-level data products

Uncalibrated data

After telemetry conversion and transformation from detector to science coordinates, the Science Data Processing (SDP) subsystem will generate uncalibrated (stage 0) FITS files with the pixel data organized by detector and exposure. SDP will format the science data into a 4-D data array, with keywords NAXIS1 = number of columns, NAXIS2 = number of rows, NAXIS3 = Ngroup, and NAXIS4 = Nint. See Understanding Exposure Times for the meaning of these parameters.

Figure 2. Science exposure data cube

The first 2 dimensions are the 2-D science images. The length of the 3rd axis is the number of groups in each integration, and the length of the 4th axis is the number of integrations in the exposure. All 4 dimensions will be used, even if Ngroup = 1 or Nint = 1. Each integration within an exposure must have the same number of groups. The standard readout sampling for all JWST detectors is up-the-ramp readout (sometimes referred to as MULTIACCUM), meaning pixel values increase between groups within an integration.
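The axis ordering described above can be sketched in a few lines of plain Python. This is a minimal illustration, not pipeline code: the header values are invented, and `array_shape` and `is_up_the_ramp` are hypothetical helper names. One detail worth knowing is that FITS lists the fastest-varying axis first (NAXIS1), whereas C-ordered arrays such as those returned by `astropy.io.fits` list it last, so the in-memory shape is the reverse of the NAXISn order.

```python
def array_shape(header):
    """Return the in-memory array shape implied by the NAXISn keywords.

    FITS lists the fastest-varying axis first (NAXIS1); C-ordered arrays
    list it last, so the axis order is reversed.
    """
    naxis = header["NAXIS"]
    return tuple(header[f"NAXIS{i}"] for i in range(naxis, 0, -1))


def is_up_the_ramp(samples):
    """True if successive group samples of one pixel never decrease."""
    return all(later >= earlier for earlier, later in zip(samples, samples[1:]))


# Illustrative values only: 2 integrations of 5 groups on a 2048 x 2048 frame.
header = {"NAXIS": 4, "NAXIS1": 2048, "NAXIS2": 2048, "NAXIS3": 5, "NAXIS4": 2}
shape = array_shape(header)
nints, ngroups, nrows, ncols = shape

# Idealized, noiseless ramp for one pixel: 3 counts accumulate per group.
ramp = [3 * (group + 1) for group in range(ngroups)]

print(shape)                       # (nints, ngroups, nrows, ncols)
print(ramp, is_up_the_ramp(ramp))
```

The synthetic ramp is noiseless by construction; samples in real exposures include noise, so a strict monotonic check like this one is only a teaching device.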

Header keywords and relationships

The FITS header of stage 0 files will have keywords required by the FITS standard and keywords relevant to the observation that are extracted from the telemetry packet headers and science image headers. Figure 3 shows a schematic representation of how data from many sources are used by the SDP subsystem to populate the FITS headers of the science data. After completion of science data processing, the stage 0 data file will be ready for calibration.

Figure 3. Source and usage of information by the SDP system to populate the headers of science data

Knowledge of the keywords is an important first step to understanding JWST data. By examining the file header using tools such as Astropy, observers can find detailed information about the data, including:

  • Coordinates of the target, program number, and other observation identifiers
  • Date and time of the observation including start, end, and mid-exposure times
  • Exposure parameter information, such as the instrument configuration (DETECTOR, FILTER, SUBARRAY)
  • Readout definition parameters (READPATT, NINTS, NGROUPS, and GROUPGAP)
  • Exposure-specific information, such as detailed timing and world coordinate system information
  • Calibration information, such as the calibration switches and reference files used by the pipeline
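As a rough sketch of how the keyword groups listed above might be pulled out of a header, the helper below accepts any mapping that supports `.get()`. A plain dictionary with invented values stands in for a real header here so the example runs on its own; `summarize` is a hypothetical helper name, and the commented-out Astropy call uses a placeholder file name.

```python
# Keyword groups named in the list above.
CONFIG_KEYWORDS = ("DETECTOR", "FILTER", "SUBARRAY")
READOUT_KEYWORDS = ("READPATT", "NINTS", "NGROUPS", "GROUPGAP")


def summarize(header, keywords):
    """Collect the requested keywords; missing ones are reported as None."""
    return {key: header.get(key) for key in keywords}


# With Astropy installed, a real header could be obtained with something like:
#   from astropy.io import fits
#   header = fits.getheader("your_file_uncal.fits")  # placeholder file name
# The dictionary below is a stand-in with made-up values.
header = {
    "DETECTOR": "NRCA1",
    "FILTER": "F200W",
    "SUBARRAY": "FULL",
    "READPATT": "RAPID",
    "NINTS": 2,
    "NGROUPS": 5,
}

config = summarize(header, CONFIG_KEYWORDS)
readout = summarize(header, READOUT_KEYWORDS)

for name, value in {**config, **readout}.items():
    print(f"{name:<8} = {value}")
```

Using `.get()` rather than indexing means that a keyword absent from a given header (GROUPGAP in this stand-in) shows up as `None` instead of raising an error.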

Following FITS conventions, each keyword is no longer than 8 characters, and its value can be an integer, a real (floating-point) number, or a character string. Several keywords are common to all JWST data, and others are instrument-specific.
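To make the keyword and value conventions concrete, here is a minimal sketch of parsing a single 80-character FITS header card by hand. It assumes the standard fixed layout (keyword in columns 1 to 8, the value indicator "= " in columns 9 and 10, and an optional comment after a slash). `parse_card` is a hypothetical helper, the card contents are invented, and EFFEXPTM is used only as an example of a real-valued keyword. The sketch also recognizes the FITS logical values T and F, which the text above does not mention. In practice a library such as Astropy does this parsing for you.

```python
def _parse_string(text):
    """Read a quoted FITS string; a doubled quote ('') is an escaped quote."""
    chars = []
    i = 1  # skip the opening quote
    while i < len(text):
        if text[i] == "'":
            if text[i + 1 : i + 2] == "'":
                chars.append("'")
                i += 2
                continue
            break  # closing quote
        chars.append(text[i])
        i += 1
    return "".join(chars).rstrip()  # trailing padding is not significant


def parse_card(card):
    """Split one FITS header card into (keyword, value)."""
    card = card.ljust(80)
    keyword = card[:8].strip()
    if card[8:10] != "= ":
        return keyword, None  # commentary card such as COMMENT or HISTORY
    text = card[10:].strip()
    if text.startswith("'"):
        return keyword, _parse_string(text)
    text = text.split("/", 1)[0].strip()  # drop the trailing comment
    if not text:
        return keyword, None  # undefined value
    if text in ("T", "F"):
        return keyword, text == "T"
    try:
        return keyword, int(text)
    except ValueError:
        # FITS also allows a 'D' exponent marker for real numbers.
        return keyword, float(text.replace("D", "E"))


cards = [
    "NGROUPS =                    5 / Number of groups in integration",
    "READPATT= 'RAPID   '           / Readout pattern",
    "EFFEXPTM=               21.474 / Illustrative real-valued keyword",
]
parsed = [parse_card(card) for card in cards]
print(parsed)
```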

Header keywords related to a particular topic, such as the program information or target information, are grouped together. This sample data header shows some of the keywords and groupings. The full set of schematic headers for all the JWST modes can also be found in MAST. The JWST Keyword Dictionary in the MAST documentation contains the complete list of standard JWST header keywords, the FITS header extension where they can be found, where the information comes from, and their valid values.

Calibrated data

All JWST FITS calibrated data products have a few common features in their structure and organization:

  1. The FITS primary header-data unit (HDU) contains only header information, in the form of keyword records. Its data array is empty, which is indicated by NAXIS = 0 in the primary header. Metadata that pertains to the entire product is stored in keywords in the primary header. Metadata related to specific extensions in the data products is stored in keywords in the headers of each extension.

  2. All data related to the product are contained in one or more FITS image ("IMAGE") or binary table ("BINTABLE") extensions. The header of each extension may contain keywords that pertain uniquely to that extension.

The default and optional output files for each stage of processing (stages 1–3) are listed in the table of data products. The number and type of extensions depend on the data product, as shown in the Science Product tables, which also provide more details about the structure and type of information contained in each extension.
