To assist Early Release Science proposers in planning their data-analysis and software development efforts, this article provides pointers to information about STScI-provided JWST data products, the data-reduction pipeline, and post-pipeline data-analysis tools.
The JWST pipeline will process JWST data to remove and calibrate instrumental signatures, combine relevant exposures, and perform basic spectral and imaging source extraction. The data sets at various stages of processing can be downloaded from the Mikulski Archive for Space Telescopes (MAST). Users can install, re-run, and customize the JWST pipeline software to modify steps if necessary. STScI also provides a set of data-analysis tools written in Python to assist with standard tasks such as inspecting, converting, combining, measuring, or modeling the data.
JWST Pipeline Data Processing
The pipeline will process data from all instruments and observing modes. It produces both fully calibrated individual exposures and high-level data products (mosaics, extracted spectra, etc.).
The pipeline will be distributed as part of AstroConda. Development takes place in a public GitHub repository and is ongoing.
The calibration pipeline consists of three stages, each made up of many steps. Where practical, pipeline stages and steps are shared across instruments and observing modes. For example, the near-infrared instruments use similar detectors, which allows similar calibration steps in the initial detector-processing stage. Subsequent stages are split by observing mode, for example imaging versus spectroscopy.
The pipeline has three main stages that provide data to the archive (see Fig. 1). The first stage, CALDETECTOR1, processes the data from raw non-destructively read ramps to uncalibrated slope images. The second stage calibrates the individual slope images and is where the first branching occurs: imaging data are processed by the CALIMAGE2 module and spectroscopic data by the CALSPEC2 module. The third stage processes ensembles of slope images. This stage is the most specific to the particular observations, with different modules for imaging (CALIMAGE3), spectroscopy (CALSPEC3), coronagraphy (CALCORON3), aperture masking interferometry (CALAMI3), and time series observations (CALTSO3).
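The branching described above can be summarized as a small lookup table. This is only an illustrative sketch: the module names come from the text, but the dictionary layout, the keys, and the function name `modules_for` are hypothetical and are not part of the pipeline software. It also assumes that every observation is tagged with both an observation type and a data kind (imaging or spectroscopy).

```python
# Stage-2 modules: the first branching is imaging versus spectroscopy.
STAGE2_MODULES = {
    "imaging": "CALIMAGE2",
    "spectroscopy": "CALSPEC2",
}

# Stage-3 modules: one per observation type named in the text.
STAGE3_MODULES = {
    "imaging": "CALIMAGE3",
    "spectroscopy": "CALSPEC3",
    "coronagraphy": "CALCORON3",
    "ami": "CALAMI3",
    "tso": "CALTSO3",
}


def modules_for(observation_type, data_kind):
    """Return the (stage 1, stage 2, stage 3) module names for an observation.

    `observation_type` selects the stage-3 module; `data_kind` selects the
    stage-2 module. Both keys are hypothetical labels used for illustration.
    """
    return (
        "CALDETECTOR1",  # stage 1 is common to all data
        STAGE2_MODULES[data_kind],
        STAGE3_MODULES[observation_type],
    )


# Example: a spectroscopic time series observation.
print(modules_for("tso", "spectroscopy"))
```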
Pipeline Data Products
Exposure level data
All the raw data is archived in FITS format. The headers include information from the instrument, observatory, and spacecraft in addition to the relevant proposal details. This is 'level 1b' data.
The intermediate output of the first pipeline stage (CALDETECTOR1) is a set of uncalibrated count-rate images. This is 'level 2a' data.
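To illustrate what reducing a ramp to a count rate means, the sketch below fits a straight line to the non-destructive reads of a single pixel by ordinary least squares. This is a simplified illustration and not the pipeline's algorithm, which also has to deal with detector effects before the fit. The function name `fit_ramp_slope` and the sample values are hypothetical.

```python
def fit_ramp_slope(times, counts):
    """Ordinary least-squares slope of counts versus time for one pixel."""
    n = len(times)
    if n < 2 or n != len(counts):
        raise ValueError("need at least two reads and matching lengths")
    mean_t = sum(times) / n
    mean_c = sum(counts) / n
    # Slope = covariance(time, counts) / variance(time).
    num = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, counts))
    den = sum((t - mean_t) ** 2 for t in times)
    if den == 0:
        raise ValueError("read times must not all be identical")
    return num / den


# Four non-destructive reads of one pixel (made-up values).
times = [0.0, 10.0, 20.0, 30.0]        # seconds
counts = [100.0, 150.0, 200.0, 250.0]  # accumulated counts
print(fit_ramp_slope(times, counts))   # 5.0 counts per second
```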
The outputs of CALIMAGE2 and CALSPEC2 are the calibrated images from each exposure. This includes images from each integration in an exposure and an average of all the integrations in an exposure. This is 'level 2b' data.
Ensemble level data
These high-level data are labeled 'level 3' data.
CALIMAGE3 produces mosaics of the data in each program, one for each unique band and object combination. In addition, a source catalog is created using simple aperture photometry.
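As an illustration of simple aperture photometry, the sketch below sums the pixels whose centers fall inside a circular aperture and subtracts a constant per-pixel background. It is a minimal example under stated assumptions (whole pixels only, a background value supplied by the caller) and does not reproduce how the pipeline builds its source catalog. The function name `aperture_photometry` is hypothetical.

```python
import math


def aperture_photometry(image, x0, y0, radius, background=0.0):
    """Background-subtracted sum of pixels centered within `radius` of (x0, y0).

    `image` is a list of rows; pixels are counted whole, with no
    partial-pixel weighting at the aperture edge.
    """
    total = 0.0
    npix = 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if math.hypot(x - x0, y - y0) <= radius:
                total += value
                npix += 1
    return total - background * npix


# A 5x5 image with a flat background of 1.0 and one bright pixel.
image = [[1.0] * 5 for _ in range(5)]
image[2][2] = 11.0
print(aperture_photometry(image, 2, 2, 1.0, background=1.0))  # 10.0
```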
CALSPEC3 produces rectified 2D (slit/slitless) mosaics or 3D (IFU) spectral cubes for each unique grating and object combination. In addition, a 1D spectrum is extracted for a point or extended source (as appropriate) using a simple boxcar algorithm, and is archived.
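A boxcar extraction can be sketched in a few lines: for each wavelength column of a rectified 2D spectrum, sum the pixels in a fixed window of rows around the trace. The example assumes that dispersion runs along the columns and that the trace sits on a single row; the function name `boxcar_extract` and its parameters are hypothetical and are not taken from the pipeline.

```python
def boxcar_extract(image, center_row, half_width):
    """Sum each column over rows center_row - half_width .. center_row + half_width.

    `image` is a rectified 2D spectrum stored as a list of rows, with the
    dispersion direction along the columns. The window is clipped at the
    image edges.
    """
    first = max(0, center_row - half_width)
    last = min(len(image), center_row + half_width + 1)
    n_cols = len(image[0])
    return [sum(image[r][c] for r in range(first, last)) for c in range(n_cols)]


# Five spatial rows, three wavelength columns (made-up values).
image = [
    [0.0, 0.0, 0.0],
    [1.0, 2.0, 3.0],
    [10.0, 20.0, 30.0],  # trace center
    [1.0, 2.0, 3.0],
    [0.0, 0.0, 0.0],
]
print(boxcar_extract(image, center_row=2, half_width=1))  # [12.0, 24.0, 36.0]
```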
The final product of CALCORON3 is a 2D combined image after reference PSF subtraction for each unique object and band combination. In addition, aligned stacks of the reference PSFs, object integrations, and object minus reference integrations are produced and archived.
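The idea of reference PSF subtraction can be illustrated with a toy example. The text does not say how the reference is aligned or scaled, so the sketch below makes its own assumptions: the images are already aligned, and the reference is scaled to the target by a single least-squares factor. The function name `subtract_reference` is hypothetical.

```python
def subtract_reference(target, reference):
    """Subtract a least-squares scaled reference PSF from an aligned target image.

    Both images are lists of rows with identical shapes. Returns the scale
    factor and the residual image (target minus scaled reference).
    """
    # The least-squares scale minimizes sum((target - scale * reference) ** 2).
    num = sum(t * r for t_row, r_row in zip(target, reference)
              for t, r in zip(t_row, r_row))
    den = sum(r * r for r_row in reference for r in r_row)
    if den == 0:
        raise ValueError("reference image is identically zero")
    scale = num / den
    residual = [[t - scale * r for t, r in zip(t_row, r_row)]
                for t_row, r_row in zip(target, reference)]
    return scale, residual


reference = [[1.0, 2.0], [3.0, 4.0]]
target = [[2.0, 4.0], [6.0, 8.0]]  # exactly twice the reference
scale, residual = subtract_reference(target, reference)
print(scale)     # 2.0
print(residual)  # [[0.0, 0.0], [0.0, 0.0]]
```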
Aperture Masking Interferometry (AMI)
CALAMI3 produces measurements of the fringe parameters and the normalized fringe parameters for each exposure, together with averages over all the reference PSF exposures and over all the target exposures.
Time Series Observations (TSO)
CALTSO3 products are the aperture photometry for each integration (imaging) and extracted spectra for each integration (spectroscopy). For the spectroscopic data, white-light photometry (direct sum of spectrum) is also produced.
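White-light photometry as described above is simply the direct sum of each extracted spectrum. The sketch below produces one value per integration; the function name `white_light_curve` and the input layout (one list of flux values per integration) are hypothetical.

```python
def white_light_curve(spectra):
    """Return one summed flux value per integration.

    `spectra` is a list of 1D extracted spectra, one per integration.
    """
    return [sum(spectrum) for spectrum in spectra]


# Two integrations, three wavelength bins each (made-up values).
spectra = [
    [1.0, 2.0, 3.0],
    [1.0, 1.5, 2.5],
]
print(white_light_curve(spectra))  # [6.0, 5.0]
```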
For TSO observations, an additional output from CALDETECTOR1 is saved: the ramp data that have been corrected for all detector effects but not yet reduced to a count-rate image.
Post-Pipeline Data Analysis Tools
JWST post-pipeline data-analysis tools are distributed as part of AstroConda to assist observers in viewing and analyzing their JWST data. They are generally written in Python and designed to work together with Astropy. Development is ongoing. All software is open source, and community contributions are welcome in the form of suggestions, bug reports, or actual code.
The suite of post-pipeline data-analysis tools is intended to help astronomers with the often iterative and interactive workflow involved in converting these data products into meaningful scientific results. This involves tasks such as the following:
- Inspecting data and data-quality information.
- Masking or flagging data for one reason or another and using those markings to guide later steps in the analysis.
- Using the results of interactive analysis to guide a custom run of the pipeline (e.g. tweaking spectral extraction parameters or background estimates).
- Combining data sets in various ways, with careful attention to astrometry, PSF matching, and other issues.
- Source detection and photometry using different choices or algorithms than those used in the pipeline.
- Measuring lines and continuum in spectral data.
- Fitting models to data or otherwise testing hypotheses.
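As an example of using data-quality markings to guide a later analysis step, the sketch below averages only those pixels whose data-quality value has none of the rejected flag bits set. The flag names and bit values are hypothetical placeholders chosen for illustration, as is the function name `masked_mean`; consult the data-product documentation for the real flag definitions.

```python
# Hypothetical data-quality flag bits, for illustration only.
DO_NOT_USE = 1
SATURATED = 2
COSMIC_RAY = 4


def masked_mean(values, dq, reject=DO_NOT_USE | SATURATED):
    """Mean of the values whose DQ word has none of the `reject` bits set."""
    good = [v for v, flag in zip(values, dq) if (flag & reject) == 0]
    if not good:
        raise ValueError("all values are flagged")
    return sum(good) / len(good)


values = [1.0, 2.0, 100.0, 3.0]
dq = [0, COSMIC_RAY, SATURATED, 0]

# The saturated value is excluded; the cosmic-ray flag is not rejected here.
print(masked_mean(values, dq))  # 2.0

# Rejecting cosmic-ray hits as well leaves only the two unflagged values.
print(masked_mean(values, dq, reject=DO_NOT_USE | SATURATED | COSMIC_RAY))  # 2.0
```

Keeping the rejection mask as a parameter makes it easy to repeat a measurement with different flag choices during exploratory analysis.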
A typical workflow involves highly interactive exploratory analysis on small portions of the data, followed by development of custom scripts to automate the analysis on larger data sets.
The table below provides links to further information about the tools. User-training workshops are being offered to help familiarize new users.
| ||Astronomy-related tools for Python.|
| ||Linked dataset visualization.|
| ||2-D image visualization with Python plug-in capability.|
|photutils||Tools for detecting and performing photometry of astronomical sources.|
|imexam||Interactive image analysis (Python equivalent of IRAF imexam).|
|specviz||One-dimensional spectra visualization and analysis (Python equivalent of IRAF splot).|
| ||Visualization and quick interactive analysis for multi-object spectroscopy.|
|cubeviz||Visualization and quick interactive analysis for 3-D spectroscopy.|
|gwcs||Tools for constructing and manipulating world-coordinate systems (WCS). Supports a data model that includes the entire transformation pipeline from input coordinates (detector by default) to world coordinates.|
|asdf||Advanced Scientific Data Format (ASDF) is a next-generation interchange format for scientific data (which can be packaged in FITS files, but which has much richer capabilities for handling metadata).|
|astroimtools||Convenience tools for working with astronomical images.|
The data-analysis tools are aimed at supporting a wide variety of science topics, but stop short of providing integrated high-level tools for specialized scientific goals. Astropy and its affiliated packages may nevertheless be a very useful platform for developing such tools and making them available to a broad community. Libraries and data structures with associated documentation will be available, and STScI strongly encourages people developing software in Python to use them (and to inform STScI of any problems that prevent this) rather than developing completely parallel suites of software. If in doubt about whether software you might propose to develop as part of an ERS program duplicates the STScI effort, please contact the JWST help desk for guidance.