JWST Data Structures

JWST data products share a basic structure, with slight variations that depend on the instrument and observing mode used. The particular format of a product depends on the stage of the JWST Data Reduction Pipeline that created it.


See also: File Naming Conventions, Associations, Data Processing and Calibration Files

Figure 1 shows the flow of the data through the Data Management System (DMS), from the original telemetry packages to the combined set of data. Uncalibrated data (stage 0 data files) and all science data processed in stage 1 and stage 2 of the calibration pipeline are FITS files that contain the pixel values for a single exposure from a single detector. Stage 3 of the calibration pipeline associates and combines a set of these files into a single integrated product. Any catalogs generated by the calibration pipeline are ASCII ECSV (Enhanced Character-Separated Values) files. The structure of the data may change as they pass through the stages of processing.

Figure 1. Flow of data from the telemetry packages to high-level data products

Uncalibrated data

After telemetry conversion and transformation from the detector to science coordinates, Science Data Processing (SDP) will generate uncalibrated (stage 0) FITS files with the pixel data organized by detector and exposure. SDP will format the science data into a 4D data array, with NAXIS1 = column, NAXIS2 = row, NAXIS3 = NGROUPS, NAXIS4 = NINTS. See Understanding Exposure Times for the meaning of these parameters.

Figure 2. Science exposure data cube


The first two dimensions are the 2D science images, the dimension of the third axis is determined by the number of groups in the exposure, and the fourth axis corresponds to the number of integrations. All four dimensions will be used, even if NGROUPS = 1 or NINTS = 1. Each integration within an exposure must have the same number of groups. The standard readout sampling for all JWST detectors is up-the-ramp readout (sometimes referred to as MULTIACCUM): the detector is read non-destructively, so the accumulated signal in each pixel increases from group to group within an integration.
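The axis layout described above can be sketched with a small synthetic array. This is an illustration only: the dimensions and pixel values are invented, and NumPy is used as an assumed tool. Note that NumPy reports array shapes in the reverse order of the FITS NAXISn keywords, so a stage 0 science array appears as (NINTS, NGROUPS, rows, columns).

```python
import numpy as np

# Hypothetical, deliberately tiny dimensions chosen for illustration.
nints, ngroups, nrows, ncols = 2, 5, 4, 6

# FITS lists axes fastest-varying first (NAXIS1 = column ... NAXIS4 = NINTS);
# NumPy shapes are written in the opposite order.
rng = np.random.default_rng(0)

# Invented positive signal added in each group, so the accumulated counts
# rise "up the ramp" within every integration.
increments = rng.uniform(1.0, 10.0, size=(nints, ngroups, nrows, ncols))
ramps = np.cumsum(increments, axis=1)

print("NumPy shape (NINTS, NGROUPS, rows, columns):", ramps.shape)

# Differences between consecutive groups within each integration.
group_diffs = np.diff(ramps, axis=1)
is_monotonic = bool(np.all(group_diffs > 0))
print("Signal increases between groups:", is_monotonic)

# An exposure with NGROUPS = 1 and NINTS = 1 still keeps all four axes.
single = ramps[:1, :1]
print("Single group, single integration:", single.shape)
```

Slicing with `ramps[i, :, y, x]` returns the ramp of one pixel in integration `i`, which is often the most convenient view when inspecting up-the-ramp data.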

Header keywords and relationships

The FITS header of stage 0 files will have keywords required by the FITS standard and those extracted from the telemetry packet headers and science image headers. Figure 3 shows a schematic representation of how the data from many sources are used by the SDP system to populate the FITS headers of the science data. After the completion of science data processing, the stage 0 data file will be ready for calibration.

Figure 3. Source and usage of information by the SDP system to populate the headers of science data



Knowledge of the keywords is an important first step to understanding JWST data. By examining the file header using tools such as astropy, observers can find detailed information about the data, including:

  • Coordinates of the target, program number, and other observation identifiers
  • Date and time of the observation including start, end, and mid-exposure times
  • Exposure parameter information, such as the instrument configuration (DETECTOR, FILTER, SUBARRAY)
  • Readout definition parameters (READPATT, NINTS, NGROUPS, and GROUPGAP)
  • Exposure-specific information, such as detailed timing and world coordinate system information
  • Calibration information, such as the calibration switches and reference files used by the pipeline
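As a minimal sketch of this kind of header inspection with astropy, the example below builds a stand-in primary header in memory and reads back a few of the keywords named above. The keyword values are invented for illustration, and placing these keywords in the primary header is an assumption of the sketch; the JWST Keyword Dictionary remains the authority on where each keyword lives.

```python
from astropy.io import fits

# Stand-in for the primary header of a stage 0 file.
# All values below are invented for illustration.
primary = fits.PrimaryHDU()
primary.header["DETECTOR"] = "NRCA1"
primary.header["FILTER"] = "F200W"
primary.header["SUBARRAY"] = "FULL"
primary.header["READPATT"] = "RAPID"
primary.header["NINTS"] = 2
primary.header["NGROUPS"] = 5
hdul = fits.HDUList([primary])

header = hdul[0].header

# Look up instrument configuration and readout keywords by name.
wanted = ("DETECTOR", "FILTER", "SUBARRAY", "READPATT", "NINTS", "NGROUPS")
summary = {key: header[key] for key in wanted}
for key, value in summary.items():
    print(f"{key:8s} = {value!r}")

# header.get() returns None instead of raising KeyError when a
# keyword is absent, which is convenient for optional keywords.
groupgap = header.get("GROUPGAP")
print("GROUPGAP =", groupgap)
```

With a file on disk, the same lookups apply after opening it with `fits.open()` or reading the header directly with `fits.getheader()`.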

Following FITS conventions, each keyword name is no longer than eight characters, and its value can be an integer, a real (floating-point) number, or a character string. Some keywords are common to all JWST data, and others are instrument-specific.

Header keywords related to a particular topic, such as the program information or target information, are grouped together. This sample data header shows some of the keywords and groupings. The full set of schematic headers for all the JWST modes can also be found in MAST. The JWST Keyword Dictionary in the MAST documentation contains the complete list of standard JWST header keywords, the FITS header extension where they can be found, where the information comes from, and their valid values.

Calibrated data

All JWST FITS calibrated data products have a few common features in their structure and organization:

  1. The FITS primary Header Data Unit (HDU) only contains header information, in the form of keyword records, with an empty data array, which is indicated by the occurrence of NAXIS=0 in the primary header. Metadata that pertains to the entire product is stored in keywords in the primary header. Metadata related to specific extensions in the data products is stored in keywords in the headers of each extension.

  2. All data related to the product are contained in one or more FITS IMAGE or BINTABLE extensions. The header of each extension may contain keywords that pertain uniquely to that extension.

The default and optional output files for each stage of processing (stages 1 through 3) are listed in the table of data products. The number and type of extensions depend on the data product, as shown in the Science Product tables, which also provide more details about the structure and type of information contained in each extension.



