JWST Data Structure

The JWST data files contain information about the science image or spectra; science proposal, planning, and scheduling information; spacecraft position; time conversions; pointing information; and select engineering parameters. Some of this information may change, and the data format is updated, as the data go through the calibration steps of the pipeline. Optional files contain extra information about the calibration that can help users further understand their observations.


JWST data are FITS format files. Uncalibrated data (also known as stage 0 data files), as well as all data processed in the first two stages of the calibration pipeline, are science data FITS files that contain the pixel values for a single exposure from a single detector. Stage 3 of the calibration pipeline associates a set of these files into a single unit. Figure 1 shows the flow of the data from the telemetry packets to the most combined set of data.

Figure 1. Flow of data from the telemetry packets to the most combined set of data



Stage 0 files (_uncal)

After telemetry conversion and coordinate transformation, Science Data Processing (SDP) will generate stage 0 FITS files with the pixel data by detector and exposure. SDP will format the science data into a 4D data array.

Figure 2. Science exposure data cube

NAXIS1 = column, NAXIS2 = row, NAXIS3 = Ngroups, NAXIS4 = Nints

The first two dimensions are the 2D science images, the dimension of the third axis is determined by the number of groups in the exposure, while the fourth axis corresponds to the number of integrations. All four dimensions will be used, even if NGROUPS = 1 or NINTS = 1. Each integration within an exposure is required to have the same number of groups. Due to up-the-ramp accumulation, pixel values increase between groups within an integration. The first frame of an integration may be read out separately, even if that frame is also averaged into the first group of the integration. A zero frame readout is identified with Group ID = 0. If there is a zero frame readout, there will be one for each integration. The zero frame will be stored in the data cube as the first image in each integration.
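When these files are read in Python, array libraries generally report the axes in the reverse order of the FITS NAXISn keywords, because FITS lists the fastest-varying axis first. The sketch below illustrates that reversal for the 4D layout described above. It assumes the header is available as a plain dictionary; the helper name `array_shape_from_header` and the example dimensions are hypothetical.

```python
def array_shape_from_header(header):
    """Return the row-major (C-ordered) array shape implied by NAXISn keywords.

    FITS lists the fastest-varying axis first (NAXIS1), while row-major
    arrays list it last, so the axis order is reversed.
    """
    naxis = header["NAXIS"]
    return tuple(header[f"NAXIS{i}"] for i in range(naxis, 0, -1))


# Hypothetical stage 0 exposure: 2048 x 2048 pixels, 5 groups, 3 integrations.
example_header = {
    "NAXIS": 4,
    "NAXIS1": 2048,  # columns
    "NAXIS2": 2048,  # rows
    "NAXIS3": 5,     # groups
    "NAXIS4": 3,     # integrations
}

shape = array_shape_from_header(example_header)
print(shape)  # ordered as (nints, ngroups, nrows, ncols)
```

With this ordering, indexing the first axis selects an integration and indexing the second selects a group within it.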

The FITS header of these files will have the keywords required by the FITS standard and those extracted from the CCSDS telemetry packet headers and science image headers. The second phase of stage 0 processing will add additional keyword/value pairs to the stage 0 files. These additional header data include the proposal's planning and scheduling information; spacecraft position; time conversions; pointing information; and select engineering parameters. Figure 3 is a schematic representation of how the data from many sources (planning, scheduling, and observatory) are used by the Science Data Processing system to populate the FITS headers of the science data.

Figure 3. Source and usage of information by the Science Data Processing system to populate the headers of the science data.


At the completion of science data processing, the stage 0 (_uncal) data file will be ready for calibration. The overall file structure of the _uncal files is shown in Figure 4.

Figure 4. Stage 0 (_uncal) file structure


The primary Header Data Unit (HDU) contains keywords that apply to all subsequent extensions, which are:

  • Science Data (SCI) extension – contains the uncalibrated pixel values in an image extension.
  • Reference image (REFOUT) extension – contains reference pixels from the fifth output line, read out simultaneously with the science data for MIRI.
  • Zero frame images (ZEROFRAME) extension – contains the zero-frame data, if requested to be downlinked.
  • Group Information (GROUP) extension – contains an ASCII table of group information extracted from the telemetry image header.
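The extension list above can be turned into a simple completeness check. The following is a minimal sketch, not part of any JWST tool: the function names are hypothetical, the check works on a plain list of extension names, and the optional file-reading helper assumes that the astropy package is installed.

```python
# Extension names taken from the stage 0 structure described above.
ALWAYS_EXPECTED = ("SCI", "GROUP")
MIRI_ONLY = ("REFOUT",)  # reference output, MIRI exposures only


def missing_uncal_extensions(extension_names, is_miri):
    """Return the expected extensions that are absent from a list of names.

    ZEROFRAME is not checked because it is present only when the zero
    frame was requested to be downlinked.
    """
    present = {name.upper() for name in extension_names}
    expected = ALWAYS_EXPECTED + (MIRI_ONLY if is_miri else ())
    return [name for name in expected if name not in present]


def extension_names_from_file(path):
    """List the extension names of a FITS file (assumes astropy is installed)."""
    from astropy.io import fits  # imported lazily so the check above has no dependencies

    with fits.open(path) as hdul:
        return [hdu.name for hdu in hdul]


# Hypothetical list of names, as might be read from a non-MIRI _uncal file.
names = ["PRIMARY", "SCI", "ZEROFRAME", "GROUP"]
print(missing_uncal_extensions(names, is_miri=False))  # nothing is missing
print(missing_uncal_extensions(names, is_miri=True))   # REFOUT is reported
```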



From stage 0 to stage 3 data

The default and optional output files produced by the full calibration pipeline modules are listed in the table of data products. All data products have:

  • a primary header unit
  • one or several extensions, each with their own header unit

as shown in the stage 0 file structure diagram (Figure 4). The number and type of extensions depend on the type of data product and, in some cases, on the instrument. The products of the calibration pipeline fall into three main categories:

  1. Default: products produced by the calibration pipeline module and archived.
  2. Optional: products produced only when the calibration pipeline is run from the command line and the parameter to save that particular file is set to True in the step that produces it.
  3. Intermediate: the output of an individual calibration step when that step is run independently rather than as part of a complete module such as calwebb_detector1.

Table 1 provides information for each of these products.
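As an illustration of the three categories, the sketch below maps a file name to its product type using suffixes listed in Table 1. It is only an assumption-laden example: the function name `classify_product` and the file names are hypothetical, and the rule that any name ending in "step" is an intermediate product is a simplification of the naming pattern in the table.

```python
from pathlib import Path

# Suffixes and types copied from Table 1 (default and optional products only).
PRODUCT_TYPES = {
    "_uncal": "default",
    "_trapsfilled": "default",
    "_rate": "default",
    "_rateints": "default",
    "_cal": "default",
    "_calints": "default",
    "_i2d": "default",
    "_s2d": "default",
    "_s3d": "default",
    "_x1d": "default",
    "_x1dints": "default",
    "_output_pers": "optional",
    "_ramp": "optional",
    "_bsub": "optional",
    "_bsubints": "optional",
}


def classify_product(filename):
    """Return 'default', 'optional', 'intermediate', or 'unknown' for a file name."""
    stem = Path(filename).stem  # drop the .fits extension
    # Check longer suffixes first so that a longer suffix is never shadowed by a shorter one.
    for suffix in sorted(PRODUCT_TYPES, key=len, reverse=True):
        if stem.endswith(suffix):
            return PRODUCT_TYPES[suffix]
    # Simplifying assumption: single-step outputs end in "step".
    if stem.endswith("step"):
        return "intermediate"
    return "unknown"


# Hypothetical file names used only for illustration.
for name in ("example_rateints.fits", "example_output_pers.fits", "example_dqinitstep.fits"):
    print(name, classify_product(name))
```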



Output products

This section provides the complete list of default and optional products of the calibration pipeline and their formats.


Table 1. Default and optional output products

Step                          Module             Type                      Product extension
                                                 default                   _uncal
Group Scale Correction        calwebb_detector1                            _groupscalestep
Data Quality Initialization   calwebb_detector1  intermediate              _dqinitstep
Saturation Detection          calwebb_detector1  intermediate              _saturationstep
Superbias Subtraction         calwebb_detector1  intermediate              _superbiasstep
Reference Pixel Correction    calwebb_detector1  intermediate              _refpixstep
Linearity Correction          calwebb_detector1  intermediate              _linearity
Persistence Correction        calwebb_detector1  intermediate              _persistencestep
                                                 optional                  _output_pers
                                                 default                   _trapsfilled
Dark Current Subtraction      calwebb_detector1  intermediate              _darkcurrentstep
                                                 optional
Jump Detection                calwebb_detector1  intermediate              _jumpstep
Ramp Fitting                  calwebb_detector1  default                   _rate
                                                 default                   _rateints
                                                 optional                  _ramp
Last Frame Correction         calwebb_detector1  intermediate (MIRI only)  _lastframestep
RSCD Correction               calwebb_detector1  intermediate (MIRI only)  rscd_step.fits
Assign WCS                                       intermediate              _assignwcsstep
Background Image Subtraction                     intermediate              _backgroundstep
                                                 optional                  _bsub
                                                 optional                  _bsubints
Flat Field Correction                            intermediate              _flatstep
                                                 optional                  [_flat]
Photometric Correction                           intermediate              _photomstep
Resample                                         default                   _cal
                                                 default                   _calints
                                                 default                   _i2d
                                                 default                   _s3d
                                                 default                   _s2d
                                                 default                   _x1d
                                                 default                   _x1dints



File format

The format of each calibration product is described below, in alphabetical order.

assignwcsstep.fits 

The output FITS file for this step has the same format as the rampfit.fits file (Table 5).

Primary Header: inserts the calibration status keyword S_WCS and records the reference files in the R_CAMERA, R_COLLIMATOR, R_DISPERSER, R_DISTORTION, R_FORE, R_FPA, R_MSA, R_OTE, R_REGIONS, and R_WAVELENGTHRANGE keywords.

backgroundstep.fits 

The output FITS file for this step has the same format as the rampfit.fits file (Table 5).

Primary Header: inserts the calibration status keyword S_BKDS.

darkcurrentstep.fits

The output FITS file for this step has the same format as the dqinitstep.fits file (Table 2).

Primary Header: inserts the calibration status keyword S_DARK and records the reference file in the R_DARK keyword.

dqinitstep.fits 

The output FITS file for this step will have the extensions listed in Table 2.


Table 2. File format for the dqinit file

No.  Name     Type                  Cards  Dimensions                      Format                                       Units
0    PRIMARY  PrimaryHDU            218    ()
1    SCI      ImageHDU              11     (ncols, nrows, ngroups, nints)  float32                                      DN
2    PIXELDQ  ImageHDU              10     (ncols, nrows)                  int32 (rescales to uint32)                   n/a
3    GROUPDQ  ImageHDU              10     (ncols, nrows, ngroups, nints)  uint8                                        n/a
4    ERR      ImageHDU              10     (ncols, nrows, ngroups, nints)  float32                                      DN
5    REFOUT   ImageHDU (MIRI only)  10     (258, 1024, ngroups, nints)     float32                                      DN
6    GROUP    BinTableHDU           37     6R x 13C                        [I, I, I, J, I, 26A, I, I, I, I, 36A, D, D]  n/a
7    ASDF     ImageHDU              7      (4760,)                         uint8                                        n/a

The GROUP extension is a table with the following columns:

name:
    ['integration_number', 'group_number', 'end_day', 'end_milliseconds', 'end_submilliseconds', 'group_end_time', 'number_of_columns', 'number_of_rows', 'number_of_gaps', 'completion_code_number', 'completion_code_text', 'bary_end_time', 'helio_end_time']
unit:
    ['', '', '', '', '', '', '', '', '', '', '', 'MJD', 'MJD']

The ASDF extension is contained in every data product file produced by the calibration pipeline. It stores metadata for the product. The main constituents are the values of header keywords and, most importantly, all of the detailed data used by the World Coordinate System transformations. The metadata contained in the ASDF extension is of limited use at the file level but is used extensively within the calibration pipeline processing code.
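Returning to the GROUP table, its Format entry in Table 2 uses the standard FITS binary table codes: I is a 16-bit integer, J is a 32-bit integer, D is a 64-bit float, and a number followed by A is a character string of that length. The sketch below pairs each column name with a description of its format. The helper name `describe_tform` is hypothetical, and only the codes that appear in this table are handled.

```python
GROUP_COLUMNS = [
    "integration_number", "group_number", "end_day", "end_milliseconds",
    "end_submilliseconds", "group_end_time", "number_of_columns",
    "number_of_rows", "number_of_gaps", "completion_code_number",
    "completion_code_text", "bary_end_time", "helio_end_time",
]
GROUP_FORMATS = ["I", "I", "I", "J", "I", "26A", "I", "I", "I", "I", "36A", "D", "D"]

# Meaning of the FITS binary table format codes used by this table.
TFORM_MEANING = {"I": "16-bit integer", "J": "32-bit integer", "D": "64-bit float"}


def describe_tform(tform):
    """Describe a FITS binary table format code such as 'J' or '26A'."""
    code, repeat = tform[-1], tform[:-1]
    if code == "A":
        return f"{repeat or '1'}-character string"
    return TFORM_MEANING[code]


layout = {name: describe_tform(fmt) for name, fmt in zip(GROUP_COLUMNS, GROUP_FORMATS)}
for name, description in layout.items():
    print(f"{name:24s} {description}")
```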

Primary Header: inserts the calibration status keyword S_DQINIT and records the reference file in the R_MASK keyword.

flatfieldstep.fits 

The output FITS file for this step has the same format as the rampfit.fits file (Table 5).

Primary Header: inserts the calibration status keyword S_FLAT and records the reference file in the R_FLAT keyword.

groupscale.fits

The output FITS file for this step will have the extensions listed in Table 3.


Table 3. File format for the groupscale file

No.  Name     Type                  Cards  Dimensions                      Format
0    PRIMARY  PrimaryHDU            42     ()
1    SCI      ImageHDU              309    (ncols, nrows, ngroups, nints)  float32
2    PIXELDQ  ImageHDU              10     (ncols, nrows)                  int32 (rescales to uint32)
3    GROUPDQ  ImageHDU              10     (ncols, nrows, ngroups, nints)  uint8
4    ERR      ImageHDU              10     (ncols, nrows, ngroups, nints)  float32
5    REFOUT   ImageHDU (MIRI only)  10     (258, 1024, ngroups, nints)     float32
6    ASDF     ImageHDU              7      (20445,)                        uint8
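The "rescales to" note in the PIXELDQ rows reflects the standard FITS convention for unsigned 32-bit integers: the file stores signed 32-bit values together with BSCALE = 1 and BZERO = 2^31, and the reader applies physical = BSCALE * stored + BZERO. The sketch below shows that arithmetic in plain Python; the function names are hypothetical.

```python
BSCALE = 1
BZERO = 2**31  # offset used by the FITS convention for unsigned 32-bit integers


def stored_to_physical(stored):
    """Convert a stored signed 32-bit value to the unsigned value it represents."""
    return BSCALE * stored + BZERO


def physical_to_stored(physical):
    """Convert an unsigned 32-bit value to the signed value written to the file."""
    return (physical - BZERO) // BSCALE


# The extremes of the signed range map onto the extremes of the unsigned range.
print(stored_to_physical(-2**31))     # smallest stored value
print(stored_to_physical(2**31 - 1))  # largest stored value
```

This is why a data quality array can use all 32 flag bits even though the FITS standard has no native unsigned 32-bit image type.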

jumpstep.fits 

The output FITS file for this step has the same format as the dqinitstep.fits file (Table 2).

Primary Header: inserts the calibration status keyword S_JUMP and records the reference files in the R_READNS and R_GAIN keywords.

lastframestep.fits  

The output FITS file for this step has the same format as the dqinitstep.fits file (Table 2).

Primary Header: inserts the calibration status keyword S_LASTFR.

linearitystep.fits  

The output FITS file for this step has the same format as the dqinitstep.fits file (Table 2).

Primary Header: inserts the calibration status keyword S_LINEAR and records the reference file in the R_LINEAR keyword.

persistencestep.fits  

The output FITS file for this step has the same format as the dqinitstep.fits file (Table 2).

Optional output file _output_pers has format

Primary Header: inserts the calibration status keyword S_PERSIS and records the reference files in the R_TRAPDS, R_PERSAT, and R_TRAPAR keywords.

photomstep.fits  

The output FITS file for this step will have the extensions listed in Table 4.

Table 4. File format for the photomstep file

No.  Name     Type        Cards  Dimensions                                Format
0    PRIMARY  PrimaryHDU  247    ()
1    SCI      ImageHDU    9      (ncols, nrows)                            float32
2    DQ       ImageHDU    10     (ncols, nrows)                            int32 (rescales to uint32)
3    ERR      ImageHDU    8      (ncols, nrows)                            float32
4    AREA     ImageHDU    8      (ncols-refpix, nrows), e.g. (1024, 1024)  float32
5    ASDF     ImageHDU    7      (5037,)                                   uint8


Primary Header: inserts the calibration status keyword S_PHOTOM and records the reference file in the R_PHOTOM keyword.

rampfit.fits 

The output FITS file for this step will have the extensions listed in Table 5.

Table 5. File format for the rampfit file

No.  Name     Type        Cards  Dimensions      Format
0    PRIMARY  PrimaryHDU  247    ()
1    SCI      ImageHDU    9      (ncols, nrows)  float32
2    DQ       ImageHDU    10     (ncols, nrows)  int32 (rescales to uint32)
3    ERR      ImageHDU    8      (ncols, nrows)  float32
4    ASDF     ImageHDU    7      (5037,)         uint8


Primary Header: inserts the calibration status keyword S_RAMP and records the reference file in the R_READNS keyword.
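To illustrate why the ramp-fitting output is a 2D image while its input is a 4D ramp, the sketch below fits a straight line to the group values of a single pixel and returns the slope in DN per group. This is an ordinary unweighted least-squares fit written for illustration only, not the algorithm used by the calibration pipeline; the function name and the sample values are hypothetical.

```python
def ramp_slope(group_values):
    """Return the unweighted least-squares slope (DN per group) of one pixel's ramp."""
    n = len(group_values)
    if n < 2:
        raise ValueError("at least two groups are needed to fit a slope")
    mean_x = (n - 1) / 2            # mean of the group indices 0..n-1
    mean_y = sum(group_values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(group_values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical up-the-ramp samples for one pixel in one integration (DN).
samples = [100.0, 110.0, 120.0, 130.0]
slope = ramp_slope(samples)
print(slope)  # DN gained per group for this idealized ramp
```

Applying such a fit to every pixel reduces the group axis to a single value per pixel, which is the shape change seen between Table 2 and Table 5.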

This step produces an optional product called ramp.fits with the extensions listed in Table 6.


Table 6. File format for the ramp file

No.  Name      Type        Cards  Dimensions                      Format
0    PRIMARY   PrimaryHDU  7      ()
1    SLOPE     ImageHDU    10     (ncols, nrows, ngroups, nints)  float32
2    SIGSLOPE  ImageHDU    10     (ncols, nrows, ngroups, nints)  float32
3    YINT      ImageHDU    10     (ncols, nrows, ngroups, nints)  float32
4    SIGYINT   ImageHDU    10     (ncols, nrows, ngroups, nints)  float32
5    PEDESTAL  ImageHDU    9      (ncols, nrows, nints)           float32
6    WEIGHTS   ImageHDU    10     (ncols, nrows, ngroups, nints)  float32
7    CRMAG     ImageHDU    10     (ncols, nrows, 11, 1)           float32
8    ASDF      ImageHDU    7      (1187,)                         uint8

refpixstep.fits

The output FITS file for this step has the same format as the dqinitstep.fits file (Table 2).

Primary Header: inserts the calibration status keyword S_REFPIX.

rscd_step.fits

MIRI-only step.

The output FITS file for this step has the same format as the dqinitstep.fits file (Table 2).

Primary Header: inserts the calibration status keyword S_RSCD and records the reference file in the R_RSCD keyword.

saturationstep.fits

The output FITS file for this step has the same format as the dqinitstep.fits file (Table 2).

Primary Header: inserts the calibration status keyword S_SATURA and records the reference file in the R_SATURA keyword.

superbiasstep.fits

The output FITS file for this step has the same format as the dqinitstep.fits file (Table 2).

Primary Header: inserts the calibration status keyword S_SUPERB and records the reference file in the R_SUPERB keyword.

trapsfilled.fits

The output FITS file for this step will have the extensions listed in Table 7.


Table 7. File format for the trapsfilled file

No.  Name     Type        Cards  Dimensions     Format
0    PRIMARY  PrimaryHDU  141    ()
1    SCI      ImageHDU    9      (100, 100, 3)  float32
2    ASDF     ImageHDU    7      (2689,)        uint8

This is a default output of the Persistence Correction step.
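The S_ and R_ keywords quoted for each step in this section follow a common prefix pattern, so a product's calibration history can be summarized by scanning its primary header. The sketch below does this for a header represented as a plain dictionary. The function name is hypothetical, and the header values are placeholders rather than real pipeline output.

```python
def calibration_keywords(header):
    """Split a header mapping into step-status (S_*) and reference-file (R_*) entries."""
    status = {key: value for key, value in header.items() if key.startswith("S_")}
    reference_files = {key: value for key, value in header.items() if key.startswith("R_")}
    return status, reference_files


# Hypothetical primary header contents; every value here is a placeholder.
example_header = {
    "TELESCOP": "JWST",
    "S_DARK": "status placeholder",
    "R_DARK": "dark reference file placeholder",
    "S_JUMP": "status placeholder",
    "R_GAIN": "gain reference file placeholder",
}

status, reference_files = calibration_keywords(example_header)
print(sorted(status))           # steps that left a status keyword
print(sorted(reference_files))  # reference-file keywords that were recorded
```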






