JWST General Calibration Pipeline Caveats

Content Migration

WARNING: This page has been deprecated and the content migrated to Known Issues with JWST Data and Running the JWST Science Calibration Pipeline.  Please update your links accordingly.

Unique features of the JWST Science Calibration Pipeline that are not necessarily instrument- or mode-specific, including caveats for users, are described in this article. Users should also refer to the other articles in this section for characteristics and caveats that are specific for their mode of interest. This information reflects the status for the jwst calibration pipeline package 1.8.2 released with build 9.0 of the JWST Operational Pipeline.


Summary of specific general calibration pipeline issues

General issues

The information in this table about general calibration pipeline issues is excerpted from Known Issues with JWST Data Products.  

Symptoms | Cause | Workaround | Mitigation Plan
GI02: Embedded world coordinate system (WCS) in JWST data products is incorrect.

Errors in the guide star catalog, misidentified guide stars, and uncertainties in the spacecraft roll angle result in errors in the WCS of pipeline data products even when target acquisition was performed to place science targets in the correct location. Typical errors are a few tenths of an arcsec, with some cases that are greater than 1 arcsec.

The workaround depends on instrument and mode. Also see issue NC-I01. 

Updated issue

Improve accuracy of the guide star catalog. This is a long-term project. (Updated "Workaround" to mention NC-I01)

GI03: Images contain snowballs and shower artifacts.

These are caused by large cosmic ray impacts.

The calwebb_detector1 pipeline includes a snowball/shower correction, but it is turned off by default while testing is underway.

There is no workaround that works for all science cases.

The correction is not recommended for NIRISS SOSS or AMI, MIRI coronagraphic data, and data with 1–4 groups.

For general science cases, users can re-run the calwebb_detector1 pipeline with the following jump step parameters set:
   find_showers = True          (for MIRI)
   expand_large_events = True   (for NIR instruments)


Updated issue

See the section titled "Large Events (Snowballs and Showers)" in the JumpStep documentation and the Snowballs and Shower Artifacts article for information on modifying the parameters for your science case and observations.

Snowball/shower correction in the jump detection step of calwebb_detector1 will be implemented via delivery of new parameter reference files for each instrument, as they become available.

Reprocess affected data products with updated reference files. The schedule is TBD, depending on testing results for each instrument. Reprocessing of affected data typically takes 2–4 weeks.

GI04: NIR instruments only: There is large-scale striping (horizontal for NIRCam, vertical for NIRISS and NIRSpec) across the field that is not fully removed in the reference pixel subtraction. Note that the IRS2 readout mode for NIRSpec substantially mitigates this behavior.

1/f noise from the SIDECAR ASICs (detector readout electronics) causes this effect.

There are several community tools available that are designed to remove 1/f noise. 

Updated issue

A mitigation plan is being developed.


GI05: Stage 3 processing of large imaging mosaics can take longer than the normal amount of time.

Unknown.

None.

Created issue

Updates are planned for the methods used in the tweakreg and resample steps to make them more efficient.

GI06: World Coordinate System (WCS) in pure parallel data products is incorrect by amounts that are different for every dither position.

In pure parallel data, the values of the WCS-related header keywords are currently derived using a "coarse" algorithm, because the guide star information is currently not being transferred from the headers of prime exposures to those of the associated pure parallel exposures. Typical errors are of order 0.1 arcsec. While these errors can be benign for imaging data (since the spatial offsets can be corrected in the tweakreg step of the calwebb_image3 pipeline), they are problematic for pure parallel WFSS-mode grism exposures, because the placement of spectral extraction boxes by the calwebb_spec2 pipeline relies on the WCS information in the data headers.

The WCS in pure parallel data products can be corrected by running a script in the JWST caveat examples GitHub repository.

Created issue

Over the longer term, this issue will be addressed in front-end processing during visit preparation. An implementation date is to be determined.

GI01: TARG_RA and TARG_DEC in the FITS primary header are not at the epoch of the JWST exposure. This is one reason the 1-D spectral extraction aperture can be offset from the target location in the 2-D extracted spectrum image (see relevant instrument modes below).

Initially, science data processing was not applying proper motion to the target coordinates specified by the user (PROP_RA, PROP_DEC). After an update, science data processing began applying a proper motion correction that was too small by a factor of 0.36533.

Download uncalibrated data. Update TARG_RA and TARG_DEC (see workaround). Rerun calibration pipeline.

Updated Operations Pipeline

Proper motion is now applied correctly. STScI reprocessed affected data products with an updated Operations Pipeline installed on August 24, 2023. Reprocessing of affected data typically takes 2–4 weeks after the update.

General issues for time-series observations

The information in this table about general time-series observations calibration pipeline issues is excerpted from Known Issues with JWST Data Products.  

Symptoms | Cause | Workaround | Mitigation Plan

GI-TS01: For time-series data (for all instruments), FITS primary header keywords differ from the values in the "INT_TIMES" extension. In particular, this concerns the start/end times (BSTRTIME and BENDTIME) and the barycentric correction (BARTDELT) keywords.

"INT_TIMES" are based on the group times directly read into the engineering data. This is not the case with the header keywords, which do not account for electronic shifts on the reading of the data.

Use "INT_TIMES".

Updated Operations Pipeline

A change to the JWST Science Data Processing subsystem to correctly compute the barycentric and heliocentric time keywords and the JWST barycentric position keywords was part of the updated Operations Pipeline installed on August 24, 2023. STScI reprocessed affected data products, which typically takes 2–4 weeks after the update.



About general calibration pipeline caveats

See also: JWST Operational Pipeline Build Information

The following sections highlight some aspects of the JWST calibration pipeline that may affect all or several modes and instruments. It is also important to note that early in the mission our understanding of the observatory performance is evolving quite rapidly, and changes to calibration procedures are expected.

Note: Check the "Latest Update" box at the bottom of each article to see when the last updates were made.

This page highlights some information about general data processing using the JWST Science Calibration Pipeline software that may be helpful for observers. For a list of known issues, bugs, and updates for a particular software release, please see the articles under JWST Operational Pipeline Build Information. A compilation of known pipeline issues is available at Known Issues with JWST Data Products.



All stages of processing

Software documentation outside JDox: Parameters, Reference Files, JWST Data Products 

Reference files

Each stage of the JWST Science Calibration Pipeline uses a set of instrument-specific reference files that ensure the science calibration pipeline meets its accuracy requirements. Calibration reference files are stored in the Calibration Reference Data System (CRDS). CRDS is directly integrated with calibration steps and pipelines, and the reference file mappings are set by default to always access the most recently delivered reference files according to certain selection rules (for example, instrument and filter used for an observation).

Before launch, instrument teams used dummy or ground testing data to support development of the science calibration pipeline stages. These files can be identified using the PEDIGREE header keyword, which will have a value of DUMMY or GROUND. After launch, instrument teams began using data taken during commissioning to produce higher quality reference files; these files have a PEDIGREE header keyword value of INFLIGHT. All reference files include the DATE header keyword, which indicates the UTC creation date of the reference file. Teams are actively developing and delivering improved files, and will continue to do so as we begin Cycle 1 calibration programs. As such, some of the pre-flight reference files are still in use until enough data is available to replace them.

Observers have the flexibility to create their own reference file versions or to override the default reference files when running the calibration pipeline manually. More information about the reference files can be found in this article, JWST Data Calibration Reference Files.
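As an illustration of one way to override a reference file when running a step manually, each calibration step accepts override_<reftype> parameters for the reference file types it uses. A minimal sketch, using the jump step and a local gain reference file (both file names below are hypothetical):

from jwst.jump import JumpStep

# Hypothetical file names; substitute your own intermediate product and reference file.
result = JumpStep.call(
    "my_intermediate_product.fits",
    override_gain="my_custom_gain.fits",   # point the step at a local gain reference file
)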

Parameters

The JWST Science Calibration Pipeline, used by the MAST Archive to calibrate all JWST data, was designed to process JWST data as optimally as possible for most instruments and modes using a common set of calibration steps and parameters within those steps. The default set of parameters was chosen based on ground test data or simulations and will likely be updated as we obtain more data on-orbit. The parameters are stored in parameter reference files that are included in CRDS along with the standard calibration reference files. For specific science cases or instrument modes, there may be ways to further improve or optimize the JWST Science Calibration Pipeline by using the parameters to change the standard data flow.

Most calibration steps include a set of parameters for processing that can be tweaked to improve the outputs of various stages of the pipeline when observers are running it manually. For example, in the jump detection step of stage 1 processing, which flags outliers and cosmic rays in the uncalibrated data, there are multiple parameters included in the algorithm. Observers may find that too many (or too few) cosmic rays are flagged in their data, and decide to rerun the step on their own with different settings. Available parameters for a step can be found in multiple ways:

  • By looking at the parameter reference file for a step, which contains the default parameters used in the operations pipeline
  • By visiting the "Arguments" section for the calibration step software documentation (e.g., for jump detection)
  • By importing a calibration step in a Python session and using the .spec attribute, as shown in the example below:

    # Import the jump detection step and print its available parameters
    from jwst.jump import JumpStep
    print(JumpStep.spec)

    # Instantiate the step and change a parameter before running it
    step = JumpStep()
    step.rejection_threshold = 10

To learn more about how to edit parameters and run the calibration pipeline manually, video tutorials are available in JWebbinars.

Intermediate data products


The operational pipeline was designed to provide a specific set of data products to observers when retrieving data from MAST; however, there are many additional data products that can be produced by the calibration steps. When running the science calibration pipeline on its own, observers can opt to save the data output after each calibration step is completed, or only after the steps of interest. In either case, the additional data products are accessed by manually processing the data and changing the parameters for the step or pipeline. 

For example, the Python code below demonstrates how to save intermediate data products for a single step (jump detection step) and also for the stage 1 pipeline (calwebb_detector1) using the save_results attribute:

from jwst.jump import JumpStep
from jwst.pipeline import Detector1Pipeline

# Save the output of a single step when running it on its own
step = JumpStep()
step.save_results = True

# Save the intermediate outputs of selected steps when running the full stage 1 pipeline
pipeline = Detector1Pipeline()
pipeline.jump.save_results = True
pipeline.ramp_fit.save_results = True

Calibration step precedence

The flow of data through the stages and steps of the calibration pipeline was intentionally designed to process the raw data to produce count rate (slope) images, calibrate the slope images, and then carry out any additional processing, including the creation of combined images and spectra. As such, observers should be careful when turning steps on and off, as subsequent steps may rely on a change to the data values, data structure and format, or header keywords that would have been made during a previous step. 
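As an illustration of how steps can be turned on or off when running the pipeline manually, each step exposes a skip parameter. A minimal sketch (the file name is hypothetical, and skipping the dark current step is only an example):

from jwst.pipeline import Detector1Pipeline

# Hypothetical uncalibrated file name; substitute your own data.
pipeline = Detector1Pipeline()
pipeline.dark_current.skip = True   # example: turn off the dark current subtraction step
result = pipeline.run("jw01234001001_01101_00001_nrcb1_uncal.fits")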

Error propagation

Error arrays are initialized in stage 1 processing and are stored in the "ERR" extension of the data. The uncertainty from each step that contributes noise to the final measurement is separately calculated and propagated by various steps in the calibration pipeline using a noise model. Anytime a step creates or updates variances, the total error array values are recomputed as the square root of the quadratic sum of all variances available at the time. Note that the "ERR" array values are always expressed as standard deviation (i.e., square root of the variance), and the variances are stored in the "VAR_POISSON" (variance due to Poisson noise) and "VAR_RNOISE" (variance due to read noise) arrays. In some cases, the variance arrays are only used internally within a given step.
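In code, the relationship described above amounts to the following. A minimal sketch using astropy (the file name is hypothetical):

import numpy as np
from astropy.io import fits

# Hypothetical file name; substitute one of your own calibrated products.
with fits.open("jw01234001001_01101_00001_nrcb1_cal.fits") as hdul:
    err = hdul["ERR"].data
    var_poisson = hdul["VAR_POISSON"].data
    var_rnoise = hdul["VAR_RNOISE"].data

# The total error is the square root of the quadratic sum of the variances
# available at this stage (any other variance extensions present in the
# product would be included in the same way).
total_err = np.sqrt(var_poisson + var_rnoise)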

Different uncertainty sources behave in different ways. Some noise sources (e.g., photon noise) are independent between integrations and others (e.g., flat field noise) are not. Additionally, the spatial covariance of different sources varies. By propagating each term through the calibration pipeline, the use of each term can be customized for the processing. For example, the use of the flat field noise term is different between non-dithered and dithered observations. For the former, the noise does not reduce with the addition of more integrations while for the latter it does.

In level 3 mosaic products, observers may find that the "WHT" (see the Data quality information section below) and "ERR" extensions are useful, since the "WHT"  extension can be used for source detection (as it contains the background noise terms) and the "ERR" extension can be used for calculating photometric errors in an aperture (as it contains all the noise, i.e., background plus photon noise). 
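For example, a rough photometric error in an aperture can be estimated from the "ERR" extension by summing the pixel errors in quadrature. A minimal sketch, assuming approximately uncorrelated pixel errors (the file name and the box "aperture" are hypothetical, and resampling does correlate neighboring pixels, so treat the result as an approximation):

import numpy as np
from astropy.io import fits

# Hypothetical mosaic file name; substitute your own level 3 product.
with fits.open("jw01234-o001_t001_nircam_f200w_i2d.fits") as hdul:
    sci = hdul["SCI"].data
    err = hdul["ERR"].data

# A simple box "aperture" for illustration only.
aperture = np.zeros(sci.shape, dtype=bool)
aperture[100:110, 100:110] = True

flux = np.nansum(sci[aperture])
flux_err = np.sqrt(np.nansum(err[aperture] ** 2))   # quadrature sum of per-pixel errors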



Stage 1 processing

Software documentation outside JDox: Stage 1 Detector Processing

Data quality information

The data quality (DQ) initialization step in the calibration pipeline populates the data quality mask for a dataset to flag any pixels that may be unreliable or unusable for a number of reasons, such as dead pixels, hot pixels, etc. These flags are carried through the steps of the pipeline and may inform how the calculations within a calibration step are performed for a pixel. Different instruments monitor different characteristics and hence may have differing pixel flags; however, the common value name for pixels that get excluded from calculations is "DO_NOT_USE". Other unreliable or sub-optimal pixels may still be included in the calculations for a calibration step, so observers should keep this in mind when analyzing their data products. For a full list of data quality flags that may be used, refer to the software documentation.

Throughout stage 1 processing, this information is stored in the "PIXELDQ" and "GROUPDQ" extensions, until they are replaced by a single "DQ" extension in the final stage 1 data products. More information on data quality flags for subsequent processing stages is provided below.
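As an illustration of how these flags can be applied during analysis, the sketch below masks pixels flagged as "DO_NOT_USE" (bit 0 of the DQ bit mask; see the software documentation for the full bit definitions) in a stage 1 count rate product. The file name is hypothetical:

import numpy as np
from astropy.io import fits

# Hypothetical file name; substitute one of your own count rate products.
with fits.open("jw01234001001_01101_00001_nrcb1_rate.fits") as hdul:
    sci = hdul["SCI"].data
    dq = hdul["DQ"].data

DO_NOT_USE = 1                      # bit 0 of the DQ bit mask
good = (dq & DO_NOT_USE) == 0       # True where the pixel is usable
sci_masked = np.where(good, sci, np.nan)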

Jump detection

The jump detection step in the calibration pipeline flags jumps in the ramp where the ADU level between 2 consecutive groups is large relative to those between other consecutive pairs of groups. These ramp jumps are often caused by cosmic rays (CRs) that deposit large amounts of charge in a pixel, and the number of sigmas above the noise threshold (called the rejection threshold) is given as a parameter. The default parameter chosen for the operational pipeline was determined from pre-flight data, so it may not be optimal for some in-flight data. 

Observers running the calibration pipeline manually may find that they need to increase or decrease the jump detection threshold depending on whether they notice over- or under-flagging of jumps in their data, or they may decide to increase the detection threshold in order to speed up the step calculations. This is done by updating the rejection_threshold parameter for the step (see the Parameters section above). Bear in mind that a second pass at flagging outliers happens during stage 3 in the outlier detection step, which uses the overlapping regions observed in different exposures to catch cosmic rays undetected during the jump step. Note that efficiency improvements are underway for a few long-running steps in the calibration pipeline, including the jump and outlier detection steps. Additionally, the jump step will be skipped if the input data contain fewer than 3 groups per integration, but an update to the algorithm that allows CR flagging in 2-group integrations is in development.
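A minimal sketch of re-running stage 1 with a different jump rejection threshold (the file name and the threshold value are illustrative only):

from jwst.pipeline import Detector1Pipeline

# Hypothetical uncalibrated file name; substitute your own data.
result = Detector1Pipeline.call(
    "jw01234001001_01101_00001_nrcb1_uncal.fits",
    steps={"jump": {"rejection_threshold": 6.0}},   # raise to flag fewer jumps, lower to flag more
)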

One particular type of CR that was seen in ground tests and has now been detected in flight is referred to as a snowball. While these phenomena and how to correct for them are still in discussion, snowballs seem to account for a very small part (possibly less than 0.1%) of the total CR population, but can have a significant impact on affected pixels. They appear to generally be round and feature a heavily saturated "core" and an extended "halo" region, and are often accompanied by a shower of CRs in their immediate vicinity of varying intensity and size. Updates to the jump detection step to improve cosmic ray flagging for snowballs and to handle charge spilling into neighboring pixels are in progress. 

Because this step flags jumps or outliers in the ramp, it is also possible that observations that have guide star instability or any movement of the image during the exposure may end up with false jump flagging in their data, since the calibration pipeline sees this change in the pixel values as a jump in the ramp. While this does not appear to be a common issue, observers who notice false flagging in cases like this can report them via the JWST Help Desk. Also note that jump detection for moving target observations has not been tested extensively using ground data, so the performance in-flight for those types of observations will need to be evaluated. 



Stage 2 and 3 processing

Software documentation outside JDox: Pipeline Stages 

Data quality information

At the end of stage 1 processing and throughout stage 2, the individual "PIXELDQ" and "GROUPDQ" extensions for a ramp are replaced by a single "DQ" extension, which is a data array containing DQ flags for each pixel, for each integration (or for averaged integrations, depending on the data product type). In stage 3 processing, the data is resampled based on the WCS and distortion information and then combined into a single undistorted product. Resampled data products contain "WHT" and "CON" extensions in place of the "DQ". These extensions provide observers with the 2-D weight image giving the relative weight of the output pixels (WHT) and the 2-D context image, which encodes information about which input images contribute to a specific output pixel (CON).

Associations

Stage 2 and stage 3 data processing becomes increasingly specific to the instrument and mode. For stage 1, all data (regardless of the instrument or mode) goes through calwebb_detector1 and the data files are processed individually. For the next 2 stages, association files are used. If observers choose to run the stage 2 or stage 3 pipelines manually, it is likely they will need to download the association file from MAST for their program, or create one manually if they wish to use their own set of data files. Example associations are provided in the software documentation. Some stage 2 or stage 3 steps may accept an individual data file as input, but in general, the input to these pipeline stages and steps is expected to be an association file.
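For example, a stage 3 imaging association can be processed by passing the association file to the corresponding pipeline. A minimal sketch (the association file name is hypothetical, and other modes use their corresponding stage 2/3 pipelines, e.g., calwebb_spec3):

from jwst.pipeline import Image3Pipeline

# Hypothetical association file name, as downloaded from MAST or created manually.
result = Image3Pipeline.call("jw01234-o001_20230101t000000_image3_00001_asn.json")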

Information on how to run the calibration pipeline using association files is available at JWebbinars.




Latest updates
  • Updated pipeline version note to build 9.0
Originally published