ATG · 01
NGS Pipeline Development
Custom Snakemake or Nextflow workflows for Illumina and Oxford Nanopore data, covering raw FASTQ/POD5 through QC, alignment, feature quantification, peak calling, and variant annotation. All workflows are containerized and cloud-ready, and every pipeline passes CI tests before delivery.
Snakemake
Nextflow
STAR / minimap2
GATK / DeepVariant
Dorado
Docker
scope includes: QC (FastQC, MultiQC, NanoPlot) · trimming (Cutadapt, Trimmomatic) ·
alignment (STAR, HISAT2, minimap2, bwa-mem2) · quantification (featureCounts, HTSeq, Salmon, StringTie) ·
peak calling (MACS3, HOMER) · variant calling (GATK, DeepVariant) · AWS Batch / GCP Batch CI
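As a rough illustration of the kind of check a CI test could run against a QC stage, the sketch below parses a tiny in-memory FASTQ and keeps reads whose mean Phred score clears a cutoff. It is not taken from any delivered pipeline: the function names, the Phred+33 encoding, and the cutoff of 20 are assumptions chosen for the example.

```python
from io import StringIO
from typing import Iterator, List, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of input
            return
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:], seq, qual


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of one quality string (Phred+33 is an assumption)."""
    if not qual:
        raise ValueError("empty quality string")
    return sum(ord(c) - offset for c in qual) / len(qual)


def passing_reads(handle: TextIO, min_mean_q: float = 20.0) -> List[str]:
    """Return IDs of reads whose mean quality is at least min_mean_q."""
    return [rid for rid, _seq, qual in read_fastq(handle)
            if mean_phred(qual) >= min_mean_q]


# Two hypothetical reads: one high quality ('I' = Q40), one low ('#' = Q2).
EXAMPLE = "@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n####\n"
print(passing_reads(StringIO(EXAMPLE)))  # prints ['read1']
```

In a real workflow a check like this would sit beside the tool-level QC reports (FastQC, MultiQC, NanoPlot) as a fast smoke test on a small fixture file.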
GCT · 02
Statistical Analysis & Figures
Differential expression, pathway enrichment, translation efficiency (TE) analysis from ribosome profiling, m6A peak calling, and survival modeling. Every statistical test is justified, every threshold is documented, and every figure is vector-quality and ready for journal submission with no post-processing required.
DESeq2
edgeR / limma
fgsea
clusterProfiler
ggplot2
seaborn
scope includes: DESeq2 / edgeR / limma-voom (DE) · fgsea / clusterProfiler (enrichment) ·
riboWaltz / Ribo-seQC (Ribo-Seq) · exomePeak2 / m6aViewer (m6A) ·
ggplot2 / seaborn / matplotlib (figures) · R Markdown / Quarto (reproducible reports)
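To make the idea of a documented threshold concrete, here is a minimal sketch of Benjamini-Hochberg adjustment followed by an explicit significance call. It is a plain-Python illustration rather than the DESeq2 or edgeR implementation, and the cutoffs (adjusted p below 0.05, absolute log2 fold change of at least 1) are placeholder assumptions that a real analysis would justify per dataset.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    if m == 0:
        return []
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_significant(pvalues: List[float], log2fc: List[float],
                     alpha: float = 0.05, min_abs_lfc: float = 1.0) -> List[bool]:
    """Flag features passing both the FDR and effect-size thresholds."""
    if len(pvalues) != len(log2fc):
        raise ValueError("pvalues and log2fc must have the same length")
    padj = benjamini_hochberg(pvalues)
    return [q < alpha and abs(lfc) >= min_abs_lfc
            for q, lfc in zip(padj, log2fc)]


# Hypothetical values for four genes.
pvals = [0.01, 0.04, 0.03, 0.005]
lfcs = [2.1, 0.3, -1.5, 1.2]
print(call_significant(pvals, lfcs))  # prints [True, False, True, True]
```

Keeping `alpha` and `min_abs_lfc` as named parameters means the thresholds appear in the report exactly as they were applied.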
CAG · 03
Multi-Omics Integration
MOFA2 latent factor decomposition across RNA-Seq, ATAC-Seq, bisulfite methylation, and proteomics layers. Identifies co-varying signatures across modalities, quantifies per-modality variance contribution, and reveals sample-level heterogeneity invisible to single-omics analysis.
MOFA2
Bismark / DMRcate
ANNOVAR
mixOmics
ATAC-Seq
scope includes: MOFA2 latent factors · variance partition (variancePartition) ·
DNA methylation (Bismark, DMRcate, methylKit) · chromatin accessibility (ATAC-Seq: MACS3, chromVAR) ·
proteomics integration · cross-modality correlation heatmaps
TAA · 04
ML Model Development
Variant pathogenicity classifiers, survival prediction from transcriptomic features, scRNA-Seq cell type annotation with label transfer, and drug response prediction models. All models are SHAP-interpretable, so every feature's contribution is explainable to a biology audience. Deployed on SageMaker or Vertex AI.
PyTorch
XGBoost / scikit-learn
SHAP
SageMaker
Vertex AI
scope includes: feature engineering from genomic data · cross-validation / bootstrap CI ·
SHAP summary + dependence plots · model cards · SageMaker / Vertex AI endpoint deployment ·
Docker model containers · confusion matrices + AUC-ROC reporting
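The bootstrap CI and AUC-ROC reporting items can be sketched together. The example below computes AUC as the probability that a random positive outscores a random negative, then derives a percentile bootstrap interval. It is a dependency-free illustration with hypothetical function names and toy data; a production report would more likely use scikit-learn's `roc_auc_score`.

```python
import random
from typing import List, Sequence, Tuple


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels: Sequence[int], scores: Sequence[float],
                     n_boot: int = 1000, alpha: float = 0.05,
                     seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for the AUC."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)  # fixed seed keeps the report reproducible
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lost a class; AUC undefined, so draw again
        stats.append(auc_roc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data for illustration only.
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.5]
print(auc_roc(y_true, y_score), bootstrap_auc_ci(y_true, y_score))
```

The pairwise AUC is quadratic in sample size, which is acceptable for a sketch but would be replaced by a rank-based computation on large cohorts.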
GCA · 05
Cloud Infrastructure
Migrate legacy HPC pipelines to AWS Batch or GCP Life Sciences. Infrastructure-as-code via Terraform covers VPC, IAM, S3/GCS, Batch compute environments, spot instance auto-scaling, and CI/CD on GitHub Actions. HIPAA-eligible service configurations are available with a BAA. Each engagement includes a cost model showing projected monthly spend.
AWS Batch
GCP Life Sciences
Terraform
GitHub Actions
HIPAA
scope includes: Terraform modules (VPC, IAM, S3, Batch) · spot instance optimization ·
S3 lifecycle policies · CloudWatch cost alerts · GitHub Actions CI ·
encrypted transfer setup · BAA-eligible architecture · cost model spreadsheet
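A projected-spend cost model can be as simple as the arithmetic below. This is a hedged sketch: the field names are hypothetical, and every rate in the example is a placeholder rather than a real AWS or GCP price, so actual figures would come from the provider's current pricing.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CostInputs:
    vcpu_hours_per_month: float          # total compute demand
    on_demand_rate_per_vcpu_hour: float  # placeholder rate, not a real price
    spot_fraction: float                 # share of hours run on spot (0 to 1)
    spot_discount: float                 # assumed discount vs on-demand (0 to 1)
    storage_tb: float                    # object storage held per month
    storage_rate_per_tb_month: float     # placeholder rate, not a real price


def projected_monthly_spend(c: CostInputs) -> Dict[str, float]:
    """Split projected monthly spend into compute, storage, and total."""
    if not 0.0 <= c.spot_fraction <= 1.0:
        raise ValueError("spot_fraction must be between 0 and 1")
    if not 0.0 <= c.spot_discount <= 1.0:
        raise ValueError("spot_discount must be between 0 and 1")
    on_demand_hours = c.vcpu_hours_per_month * (1.0 - c.spot_fraction)
    spot_hours = c.vcpu_hours_per_month * c.spot_fraction
    compute = (on_demand_hours * c.on_demand_rate_per_vcpu_hour
               + spot_hours * c.on_demand_rate_per_vcpu_hour * (1.0 - c.spot_discount))
    storage = c.storage_tb * c.storage_rate_per_tb_month
    return {
        "compute": round(compute, 2),
        "storage": round(storage, 2),
        "total": round(compute + storage, 2),
    }


# Illustrative inputs only.
example = CostInputs(
    vcpu_hours_per_month=10_000,
    on_demand_rate_per_vcpu_hour=0.05,
    spot_fraction=0.8,
    spot_discount=0.6,
    storage_tb=20,
    storage_rate_per_tb_month=23.0,
)
print(projected_monthly_spend(example))
```

Separating the spot fraction from the spot discount makes it easy to show how sensitive the monthly total is to spot availability.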
TGC · 06
Consulting Retainer
Monthly advisory capacity for biotech startups and academic groups who need ongoing bioinformatics support without a full-time hire. 8–20 dedicated hours/month: pipeline design reviews, grant figure support, ad-hoc analyses, team training sessions, and strategic genomics planning. Direct Slack access to the founder.
strategy
pipeline review
grant figs
ad-hoc analysis
team training
scope includes: monthly strategy calls · code review (PRs) ·
pipeline architecture consultation · grant figure delivery · ad-hoc analysis ·
journal club / methods review · vendor evaluation (sequencing providers, platforms)