This is a personal fork of the IDIA MeerKAT pipeline, a radio interferometric calibration pipeline designed to process MeerKAT data. It implements cross-calibration, self-calibration, and science imaging. This fork tracks the upstream master and adds a number of changes aimed at full-polarization processing, Python 3.12 compatibility, and science imaging improvements.
- Polarization calibration on linear feeds: L-band polarization calibrator support in `setjy`, plus XY-phase ambiguity solving when the polarization calibrator and phase calibrator share a name.
- `polcalfield` config option (`[crosscal]`): explicit fallback XY-phase calibrator, only used when no canonical pol calibrator (3C286/3C138/3C48/J1130-1449) is found in the MS. Optional; defaults to `''` and is auto-annotated by `-B`.
- `atrous_do` config option (`[selfcal]`): enables PyBDSF à-trous (wavelet) decomposition during self-cal source finding to better recover extended/diffuse emission. Defaults to `False` (existing behaviour). Applied in `selfcal_part2.py`.
- Science-imaging masking modes (`[image]`): choose `usemask = 'user'` (standard, uses `mask`) or `usemask = 'auto-multithresh'` (uses `sidelobethreshold`, `noisethreshold`, `lownoisethreshold`, `negativethreshold` instead).
- PyBDSF-driven spectral-index (alpha) imaging: for multi-Stokes / non-`I` `mtmfs` runs, the science imaging step builds a noise-thresholded `alpha` map and `alpha.error` map (with a restoring beam inherited from Stokes I so PyBDSF can read it), controlled by `alpha_nsigma`.
- Per-SPW science imaging (`[image]`): set `spw_cube = True` to image each spectral window separately (into `SPW_MFSs/`) instead of producing a single full-bandwidth averaged image. Frequency labels are auto-derived from the MS metadata, and `spwid` optionally restricts which SPWs are imaged (`''` = all). Combine with `stokes = 'IQUV'` for full-Stokes per-SPW imaging.
- Automatic log cleanup: once all pipeline jobs finish, a lightweight dependent SLURM job removes stray `casa*.log` files from the working directory.
- Python 3.12 fixes: `SafeConfigParser` → `RawConfigParser`, invalid escape-sequence `SyntaxWarning`s resolved.
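As a rough illustration of the Python 3.12 fix listed above (`SafeConfigParser` was removed from the standard library in 3.12), the sketch below reads a small config fragment with `RawConfigParser`. The helper name `read_config`, the example fragment, and the idea of evaluating each value as a Python literal are assumptions made for illustration; this is not the fork's actual config reader.

```python
import ast
from configparser import RawConfigParser

# Hypothetical config fragment. The section and key names come from this README;
# treating the values as Python literals is an assumption.
EXAMPLE_CONFIG = """
[crosscal]
polcalfield = ''

[selfcal]
atrous_do = False
"""


def read_config(text):
    """Parse config text and evaluate each value as a Python literal."""
    parser = RawConfigParser()   # no '%' interpolation, unlike ConfigParser
    parser.optionxform = str     # keep key names case-sensitive
    parser.read_string(text)
    config = {}
    for section in parser.sections():
        config[section] = {}
        for key, raw in parser.items(section):
            try:
                config[section][key] = ast.literal_eval(raw)
            except (ValueError, SyntaxError):
                config[section][key] = raw  # fall back to the raw string
    return config


config = read_config(EXAMPLE_CONFIG)
print(config)
```

`RawConfigParser` performs no value interpolation, so values containing `%` are read back verbatim.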
This pipeline is designed to run on the Ilifu cluster, making use of SLURM and MPICASA. For other uses, please contact the authors. Currently, use of the pipeline requires access to the Ilifu cloud infrastructure. You can request access using the following form.
Note: It is not necessary to copy the raw data (i.e. the MS) to your working directory. The first step of the pipeline does this for you by creating an MMS or MS, and does not attempt to manipulate the raw data (e.g. stored in /idia/projects - see data format).
In order to use the processMeerKAT.py script, source this fork's setup.sh on ilifu:
source /users/amani/processMeerKAT_fork/processMeerKAT/setup.sh
This adds the correct paths to your $PATH and $PYTHONPATH to use the pipeline. You could consider adding this to your ~/.profile or ~/.bashrc for future use.
If you switch between this fork and the upstream install (`source /idia/software/pipelines/master/setup.sh`), re-source the one you want and regenerate your sbatch scripts (`-R`) so they point at the correct `processMeerKAT` directory.
processMeerKAT.py -B -C myconfig.txt -M mydata.ms
processMeerKAT.py -B -C myconfig.txt -M mydata.ms -P
processMeerKAT.py -B -C myconfig.txt -M mydata.ms -2
processMeerKAT.py -B -C myconfig.txt -M mydata.ms -I
Each command above builds a config file (`-B`) for the given MS. The config file defines several variables that are read by the pipeline while calibrating the data, as well as the resources requested on the cluster. Its parameters are described by in-line comments in the config file itself wherever possible. The [-P --dopol] option can be used in conjunction with the [-2 --do2GC] and [-I --science_image] options to enable polarization calibration as well as self-calibration and science imaging.
processMeerKAT.py -R -C myconfig.txt
This will create submit_pipeline.sh, which you can then run with ./submit_pipeline.sh to submit all pipeline jobs to the SLURM queue. After all jobs complete, stray casa*.log files are automatically removed from the working directory.
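The automatic log cleanup amounts to deleting files that match `casa*.log` in the working directory. The sketch below shows that idea in plain Python; it is not the fork's actual SLURM job, and the function name `remove_casa_logs` is hypothetical.

```python
import glob
import os


def remove_casa_logs(workdir='.'):
    """Delete casa*.log files directly inside workdir and return their names."""
    removed = []
    for path in sorted(glob.glob(os.path.join(workdir, 'casa*.log'))):
        if os.path.isfile(path):  # skip directories that happen to match
            os.remove(path)
            removed.append(os.path.basename(path))
    return removed
```

Only the top level of `workdir` is searched, so logs inside per-SPW directories are left alone in this sketch.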
Other convenience scripts are also created that allow you to monitor and (if necessary) kill the jobs.
- `summary.sh` provides a brief overview of the status of the jobs in the pipeline.
- `findErrors.sh` checks the log files for commonly reported errors (after the jobs have run).
- `killJobs.sh` kills all the jobs from the current run of the pipeline, ignoring any other (unrelated) jobs you might have running.
- `cleanup.sh` wipes all the intermediate data products created by the pipeline. This is intended to be launched after the pipeline has run and the output is verified to be good.
For help, run processMeerKAT.py -h, which provides a brief description of all the command line options.
These keys are added/used by this fork. They all have sensible defaults, so existing config files keep working unchanged.
| Section | Key | Default | Purpose |
|---|---|---|---|
| `[crosscal]` | `polcalfield` | `''` | Fallback XY-phase calibrator; only used when no canonical pol calibrator is in the MS. |
| `[selfcal]` | `atrous_do` | `False` | Enable PyBDSF à-trous (wavelet) decomposition during self-cal source finding. |
| `[image]` | `usemask` | `'user'` | `'user'` uses `mask`; `'auto-multithresh'` uses the thresholds below instead. |
| `[image]` | `sidelobethreshold` | `0.5` | Only used when `usemask = 'auto-multithresh'`. |
| `[image]` | `noisethreshold` | `5.0` | Only used when `usemask = 'auto-multithresh'`. |
| `[image]` | `lownoisethreshold` | `0.01` | Only used when `usemask = 'auto-multithresh'`. |
| `[image]` | `negativethreshold` | `0.0` | Only used when `usemask = 'auto-multithresh'`. |
| `[image]` | `alpha_nsigma` | `1.0` | Sigma cut for the final alpha mask (used when `stokes != 'I'` to produce a spectral-index image). |
| `[image]` | `spw_cube` | `False` | Image each SPW separately into `SPW_MFSs/` instead of one full-bandwidth averaged image. |
| `[image]` | `spwid` | `''` | Comma-separated SPW IDs to image when `spw_cube = True` (e.g. `'0,1,2'`); `''` = all SPWs. |
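To make the `[image]` keys above more concrete, here is a minimal sketch of how the two masking modes and the `spwid` string could be interpreted. The function names `build_mask_kwargs` and `parse_spwid` are hypothetical, and the empty-string fallback for `mask` is an assumption; the threshold names and defaults are the ones listed in the table. This is not the fork's actual imaging code.

```python
def build_mask_kwargs(image_cfg):
    """Return the masking-related keyword arguments for the chosen usemask mode."""
    usemask = image_cfg.get('usemask', 'user')
    if usemask == 'user':
        # Standard mode: only the user-supplied mask is passed on.
        # The '' fallback for mask is an assumption for this sketch.
        return {'usemask': 'user', 'mask': image_cfg.get('mask', '')}
    if usemask == 'auto-multithresh':
        # Auto-masking mode: the four thresholds replace the user mask.
        return {
            'usemask': 'auto-multithresh',
            'sidelobethreshold': image_cfg.get('sidelobethreshold', 0.5),
            'noisethreshold': image_cfg.get('noisethreshold', 5.0),
            'lownoisethreshold': image_cfg.get('lownoisethreshold', 0.01),
            'negativethreshold': image_cfg.get('negativethreshold', 0.0),
        }
    raise ValueError(f"Unsupported usemask value: {usemask!r}")


def parse_spwid(spwid):
    """Turn '0,1,2' into [0, 1, 2]; an empty string means all SPWs (None)."""
    if not spwid.strip():
        return None
    return [int(part) for part in spwid.split(',') if part.strip()]
```

Returning `None` for an empty `spwid` keeps "all SPWs" distinct from an explicit list, so the caller can decide how to enumerate the spectral windows.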
Starting with v1.1 of the processMeerKAT pipeline, the default behaviour is to split up the MeerKAT band into several spectral windows (SPWs), and process each concurrently. This results in a few major usability changes as outlined below:
- Calibration output: Since the calibration is performed independently per SPW, all the output specific to that SPW is within its own directory. Output such as the calibration tables, logs, and plots for each SPW can be found within its SPW directory.
- Logs in the top level directory: Logs in the top level directory (i.e., the directory where the pipeline was launched) correspond to the scripts in the `precal_scripts` and `postcal_scripts` variables in the config file. These scripts are run from the top level before and after calibration respectively. By default these correspond to the scripts to calculate the reference antenna (if enabled), partition the data into SPWs, and concatenate the individual SPWs back into a single MS/MMS.
More detailed information about SPW splitting is found here.
The documentation can be accessed on the pipelines website, or on the Github wiki.
