feat: refactored basecalling, updated coverage merge, enhanced modkit, unified tuple IDs and updated containers
main.nf:
- Updated the use of ‘channel.from’ between steps;
- Refactored ‘fast5_path’ and ‘pod5_path’ channel construction so that all inputs carry the same tuple ID;
- The new ID format is now derived from the directory structure and filename (this will be altered in future updates);
- Implemented the logic for selecting the basecalling configuration (basecall_arg_val), which was previously defined in ‘BASECALLING.nf’;
- Improved logging in step selection and execution parameters;
- Fixed a pipeline failure that occurred when the modkit subworkflow expected one input channel but received two;
- Improved readability and code organization.
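
A minimal sketch of the kind of ID derivation described above, written in Python only to make the string logic concrete. The exact format used in ‘main.nf’ is not stated here, so the underscore separator and the helper names ‘derive_sample_id’ and ‘to_tuple’ are assumptions for illustration:

```python
from pathlib import PurePosixPath
from typing import Tuple


def derive_sample_id(file_path: str, root: str) -> str:
    """Build an ID from the sub-directories below `root` plus the file stem.

    Hypothetical format: directory components and the filename (without
    extension) joined by underscores. The real pipeline may differ.
    """
    rel = PurePosixPath(file_path).relative_to(PurePosixPath(root))
    # Directory components below the root, then the filename without extension.
    parts = list(rel.parent.parts) + [rel.stem]
    return "_".join(parts)


def to_tuple(file_path: str, root: str) -> Tuple[str, str]:
    """Pair a file with its derived ID, mirroring an (id, path) channel tuple."""
    return (derive_sample_id(file_path, root), file_path)
```

Using the same function for both fast5 and pod5 inputs would give every downstream step one key to join on, which is the point of the shared tuple ID.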
Basecalling:
- Updated Dorado to the latest version (v1.3.0);
- Removed the internal ‘basecall_arg_val’ computation from ‘BASECALLING.nf’; the process now receives the argument computed once in ‘main.nf’;
- Refactored demultiplexed BAM sorting;
- Improved structure and indentation of the shell block.
Modkit:
- Updated Modkit to the latest version (v0.6.0);
- Modkit pileup now explicitly sets ‘--modified-bases 6mA 5mC 5hmC’ and ‘--reference ${reference_file}’, reflecting changes made in Modkit v0.6.0.
Container (debian-nanopore.def):
- Added a dedicated step to install NumPy/Pandas versions compatible with pycoQC (v3.0.0);
- Switched jq installation to a pinned GitHub binary using ‘JQ_VERSION’;
- Updated versions.txt to track new and updated software versions.
Collapse Strands (EXPERIMENTAL):
- Added initial draft of a ‘COLLAPSE_STRANDS’ process/subworkflow for future strand-collapsing logic;
- Kept in the repo as an experimental block; it will be modified and fixed in future work to address duplicated modifications across opposite strands.