Running the JWST Science Calibration Pipeline

This article describes how to run the JWST Science Calibration Pipeline.


See also: Algorithm Documentation, Stages of JWST Data Processing, Understanding JWST Data Files, JWST Data Associations
Software documentation outside JDox: Software Documentation, Running the Pipeline, File Naming Conventions, Data Product Types, Science Product Structures and Extensions, Data File Associations

Standard calibration pipeline processing should produce publication-quality data products. However, some science cases may require specialized processing with settings other than the defaults used in pipeline processing. Also, while bulk reprocessing will be performed by STScI as conditions and resources permit, you may wish to expedite reprocessing of data sets of interest to you when new calibration and reference files become available.

Most data have now been calibrated using on-orbit calibration files from commissioning; over time, these will be replaced with calibrations based on more up-to-date on-orbit data from Cycle 1 and 2 calibration programs. Other conditions under which an observer may need to reprocess data will not be known until JWST has completed Cycle 1 and 2 calibration activities and the instrument data are better characterized; however, as more calibration data become available, STScI will provide guidance on whether users should reprocess their data.

Since the JWST Science Calibration Pipeline software is under continuous development and improvement, there may be limitations to the pipeline products for certain instrument modes and science data types. Users are encouraged to review the JWST Science Calibration Pipeline known issues information, the Data Artifacts and Features article, and the JWST Calibration Pipeline Caveats information to determine if there are any special considerations that should be made for their particular science.

For information on citing your data reduction for publication, see the guidelines in How to Cite JWST Data Reductions and Reference Files.



Science calibration pipeline stages

There are 3 main calibration pipeline stages required to completely process a set of exposures for a given observation:

  • Stage 1: Apply detector-level corrections to the raw data for individual exposures and produce count rate (slope) images from the "ramps" of non-destructive readouts
  • Stage 2: Apply physical corrections (e.g., slit loss) and calibrations (e.g., absolute fluxes and wavelengths) to individual exposures
  • Stage 3: Combine the fully calibrated data from multiple exposures
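
To make the flow concrete, here is a minimal sketch that chains the three stages for non-TSO imaging data using the .call() method described later in this article. The file names are hypothetical placeholders; the output suffixes follow the standard pipeline naming conventions.

from jwst.pipeline import Detector1Pipeline, Image2Pipeline, Image3Pipeline

# Stage 1: fit the non-destructive ramps into a count rate (slope) image
Detector1Pipeline.call("jw_example_uncal.fits", save_results=True)

# Stage 2: apply physical corrections and calibrations to the slope image
Image2Pipeline.call("jw_example_rate.fits", save_results=True)

# Stage 3: combine the fully calibrated exposures, normally via an association
Image3Pipeline.call("jw_example_image3_asn.json", save_results=True)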


Summaries of the algorithms used for the corrections and calibrations can be found in JDox, while more detailed information is contained in the software documentation. Each stage may use different modules depending on the observation mode and instrument. Individual steps and pipeline modules can be run in several ways, described in the sections below: with the .call() method in a Python session, with the strun command on the command line (or Step.from_cmdline in Python), or with the lower-level .run() method.

There are generally 2 types of input: science data files or associations, and reference files. The reference files are provided by the Calibration Reference Data System (CRDS) unless they are explicitly overridden.
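
If you need to substitute your own reference file for the CRDS-selected one, each step accepts an override parameter named after the reference file type. A minimal sketch, assuming a custom dark reference file (both file names here are hypothetical placeholders):

from jwst.pipeline import Detector1Pipeline

# Override only the dark reference file; CRDS still selects all other
# reference files and parameter settings.
result = Detector1Pipeline.call(
    "jw_example_uncal.fits",
    steps={"dark_current": {"override_dark": "my_custom_dark.fits"}},
)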



Reproducing MAST data products

In some cases, observers may want to run the pipeline in exactly the same way as the operational pipeline used for data going into MAST. The syntax for this will depend on the method being used to run the pipeline. The available methods are outlined below.

Note: Observers must use the same pipeline version and CRDS context currently used in operations in order to reproduce the operational pipeline settings. Check the header of the MAST data product for the CAL_VER and CRDS_CTX keywords, which provide the jwst pipeline software version and the CRDS context (on the operational server, https://jwst-crds.stsci.edu/), respectively.
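
As a sketch of how to check and then pin these values (the product file name is a hypothetical placeholder):

import os
from astropy.io import fits

# Read the pipeline version and CRDS context from the primary header
header = fits.getheader("jw_example_cal.fits")
print(header["CAL_VER"], header["CRDS_CTX"])

# Point the CRDS client at the operational server and pin the same context
# before rerunning the pipeline
os.environ["CRDS_SERVER_URL"] = "https://jwst-crds.stsci.edu"
os.environ["CRDS_CONTEXT"] = header["CRDS_CTX"]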

Using the .call() method

Using the .call() method will search for and use any parameter reference files that exist in CRDS, which can contain parameter overrides applied to different exposure types (e.g., TSO vs. non-TSO). The particular parameter reference files used during processing are recorded in the standard log but are not stored in the header, as that information is redundant with the CRDS context. As long as the matching context is used, all parameters will be set in the same way. This is how data are processed by the Data Management System (DMS) within the operational environment. The table below lists the pipeline modules to use for each stage of processing, depending on the mode:

Mode                    Pipelines          Syntax
Imaging (non-TSO)       calwebb_detector1  from jwst.pipeline import Detector1Pipeline
                        calwebb_image2     from jwst.pipeline import Image2Pipeline
                        calwebb_image3     from jwst.pipeline import Image3Pipeline
Imaging (TSO)           calwebb_detector1  from jwst.pipeline import Detector1Pipeline
                        calwebb_image2     from jwst.pipeline import Image2Pipeline
                        calwebb_tso3       from jwst.pipeline import Tso3Pipeline
Spectroscopy (non-TSO)  calwebb_detector1  from jwst.pipeline import Detector1Pipeline
                        calwebb_spec2      from jwst.pipeline import Spec2Pipeline
                        calwebb_spec3      from jwst.pipeline import Spec3Pipeline
Spectroscopy (TSO)      calwebb_detector1  from jwst.pipeline import Detector1Pipeline
                        calwebb_spec2      from jwst.pipeline import Spec2Pipeline
                        calwebb_tso3       from jwst.pipeline import Tso3Pipeline
Coronagraphy            calwebb_detector1  from jwst.pipeline import Detector1Pipeline
                        calwebb_image2     from jwst.pipeline import Image2Pipeline
                        calwebb_coron3     from jwst.pipeline import Coron3Pipeline
AMI                     calwebb_detector1  from jwst.pipeline import Detector1Pipeline
                        calwebb_image2     from jwst.pipeline import Image2Pipeline
                        calwebb_ami3       from jwst.pipeline import Ami3Pipeline

After activating a pipeline environment, a pipeline stage can be imported and run as follows in a Python session (stage 1 is shown here):

from jwst.pipeline import Detector1Pipeline

# .call() is invoked on the class itself; no instance is needed
result = Detector1Pipeline.call("jwxxxxx_uncal.fits")

Note that the stage 2 and 3 pipelines will take either an individual file or an association table as input; a sketch using an association follows. See the documentation for more information about the inputs and outputs of each pipeline.
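
For example, a stage 3 pipeline is normally given an association file listing the calibrated exposures to combine. A minimal sketch, assuming two hypothetical calibrated imaging exposures and using the asn_from_list helper described in the pipeline documentation:

from jwst.pipeline import Image3Pipeline
from jwst.associations.asn_from_list import asn_from_list

# Build a Level 3 association from a list of calibrated exposures
asn = asn_from_list(["obs1_cal.fits", "obs2_cal.fits"], product_name="my_target")

# Serialize the association to a JSON file the pipeline can ingest
name, serialized = asn.dump()
with open("my_target_asn.json", "w") as f:
    f.write(serialized)

result = Image3Pipeline.call("my_target_asn.json", save_results=True)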

Using the strun command line or Step.from_cmdline method

The strun command can also be used to run the pipeline or pipeline steps from the command line in the same way DMS does to process the data. The table below lists the pipeline modules to use for each stage of processing depending on the mode:

Mode                    Pipelines          Syntax
Imaging (non-TSO)       calwebb_detector1  strun calwebb_detector1 <file>
                        calwebb_image2     strun calwebb_image2 <file>
                        calwebb_image3     strun calwebb_image3 <file>
Imaging (TSO)           calwebb_detector1  strun calwebb_detector1 <file>
                        calwebb_image2     strun calwebb_image2 <file>
                        calwebb_tso3       strun calwebb_tso3 <file>
Spectroscopy (non-TSO)  calwebb_detector1  strun calwebb_detector1 <file>
                        calwebb_spec2      strun calwebb_spec2 <file>
                        calwebb_spec3      strun calwebb_spec3 <file>
Spectroscopy (TSO)      calwebb_detector1  strun calwebb_detector1 <file>
                        calwebb_spec2      strun calwebb_spec2 <file>
                        calwebb_tso3       strun calwebb_tso3 <file>
Coronagraphy            calwebb_detector1  strun calwebb_detector1 <file>
                        calwebb_image2     strun calwebb_image2 <file>
                        calwebb_coron3     strun calwebb_coron3 <file>
AMI                     calwebb_detector1  strun calwebb_detector1 <file>
                        calwebb_image2     strun calwebb_image2 <file>
                        calwebb_ami3       strun calwebb_ami3 <file>

After activating a pipeline environment, the 3 pipeline stages can be run as follows from the command line:

$ strun <pipeline_name> <input_file>

The first argument to strun must be either a pipeline name, the Python class of the step or pipeline to be run, or the name of a parameter file for the desired step or pipeline (see Parameter Files). The second argument to strun is the name of the input data file to be processed. It is also possible to use the command line method from within a Python session, as shown below.
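
For example, individual step parameters can be overridden on the command line with the --steps syntax, and the equivalent call can be made from Python via Step.from_cmdline (the file name and threshold value here are hypothetical placeholders):

$ strun calwebb_detector1 jw_example_uncal.fits --steps.jump.rejection_threshold=5.0

from jwst.stpipe import Step

# Equivalent invocation from within a Python session; the argument list
# mirrors the command line exactly
result = Step.from_cmdline(["calwebb_detector1", "jw_example_uncal.fits",
                            "--steps.jump.rejection_threshold=5.0"])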

Using the .run() method

The .run() method is the lowest-level way to execute a step or pipeline, so initialization and parameter settings are left entirely up to the user. This makes it relatively complicated to replicate the operational settings, and we therefore do not recommend this method for reproducing the data products delivered by MAST. More information on this method is provided here: Running a Step in Python.
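
For completeness, a minimal sketch of the .run() approach (the file name and parameter value are hypothetical placeholders):

from jwst.pipeline import Detector1Pipeline

# With .run(), CRDS parameter reference files are not applied; every
# non-default setting must be made by hand on the instance
pipe = Detector1Pipeline()
pipe.jump.rejection_threshold = 5.0
pipe.save_results = True
result = pipe.run("jw_example_uncal.fits")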



More examples

Documentation outside JDox: JWebbinars

While the pipeline software documentation offers a general description of how to run the pipeline, a number of intricacies exist in the way the various software components and data products interact. Several Jupyter notebooks have been developed to help you understand your data or to highlight general science workflows that you may want to consider while reducing your own data.



