Research

Computational methods for predictive genomics.

Heru develops the analytical methods behind Haeckel. Every analyzer is calibrated against reference samples, every prediction is backed by confidence intervals, and every method is validated before it ships to users.

Focus areas

What we work on.

Ancestry inference

MLE-EM with spatial thinning, bootstrap Wald tests, and in-silico AMR deconvolution. 26,000 independent ancestry-informative markers (AIMs) across 62 sub-populations from 1000 Genomes and HGDP. Validated at 97.3% EUR on reference samples with zero ghost signals.
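To make the MLE-EM idea concrete, here is a minimal sketch of an EM loop that estimates one individual's ancestry proportions from fixed reference allele frequencies. It is an illustration under stated assumptions, not Haeckel's implementation: the function name `em_ancestry`, the input layout, and the stopping rule are hypothetical, and spatial thinning, bootstrap Wald tests, and AMR deconvolution are left out.

```python
def em_ancestry(genotypes, freqs, n_iter=200, tol=1e-9):
    """Estimate ancestry proportions q for one individual (illustrative sketch).

    genotypes: list of alt-allele counts per marker (0, 1, 2, or None if missing).
    freqs: freqs[k][j] is the alt-allele frequency of marker j in population k.
           Assumed to lie strictly between 0 and 1.
    """
    k = len(freqs)
    q = [1.0 / k] * k  # start from uniform proportions
    for _ in range(n_iter):
        acc = [0.0] * k
        n_alleles = 0
        for j, g in enumerate(genotypes):
            if g is None:
                continue  # skip missing calls
            # Treat the g alt copies and the (2 - g) ref copies separately.
            for count, is_alt in ((g, True), (2 - g, False)):
                if count == 0:
                    continue
                # E-step: posterior that an allele copy came from population i.
                like = [
                    q[i] * (freqs[i][j] if is_alt else 1.0 - freqs[i][j])
                    for i in range(k)
                ]
                total = sum(like)
                if total == 0:
                    continue
                for i in range(k):
                    acc[i] += count * like[i] / total
                n_alleles += count
        # M-step: new proportions are the average responsibilities.
        new_q = [a / n_alleles for a in acc]
        delta = max(abs(a - b) for a, b in zip(new_q, q))
        q = new_q
        if delta < tol:
            break
    return q


# Toy example: 20 markers that are common in population 0 and rare in population 1.
toy_freqs = [[0.9] * 20, [0.1] * 20]
toy_q = em_ancestry([2] * 20, toy_freqs)
```

In the toy example the individual is homozygous for the allele that is common in population 0, so the estimate moves almost entirely to that population.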

Polygenic risk scoring

Six PRS methods in production: basic PRS that skips palindromic (strand-ambiguous) SNPs, PRS-CS (Bayesian MCMC), C+T (clumping and thresholding), LDpred2 (four models), Lassosum (coordinate descent), and SBayesR (four-component mixture). Ensemble weighting with R-hat convergence diagnostics.
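The simplest of these methods can be sketched in a few lines: a weighted sum of effect-allele dosages that leaves out A/T and C/G SNPs, whose strand cannot be resolved from the alleles alone. The names `basic_prs` and `is_palindromic` and the input layout are assumptions made for illustration, not the production code.

```python
# A/T and C/G pairs read the same on both strands, so strand flips are undetectable.
PALINDROMIC = {frozenset("AT"), frozenset("CG")}


def is_palindromic(a1, a2):
    return frozenset((a1.upper(), a2.upper())) in PALINDROMIC


def basic_prs(weights, dosages):
    """Weighted-sum PRS that skips palindromic SNPs (illustrative sketch).

    weights: list of (snp_id, effect_allele, other_allele, beta).
    dosages: dict mapping snp_id to the effect-allele count (0, 1, or 2).
    Returns (score, n_used, n_skipped_palindromic).
    """
    score, used, skipped = 0.0, 0, 0
    for snp_id, effect, other, beta in weights:
        if is_palindromic(effect, other):
            skipped += 1
            continue
        dosage = dosages.get(snp_id)
        if dosage is None:
            continue  # SNP not genotyped in this sample
        score += beta * dosage
        used += 1
    return score, used, skipped


toy_weights = [
    ("rs1", "A", "G", 0.5),
    ("rs2", "A", "T", 1.0),   # palindromic, skipped
    ("rs3", "C", "T", -0.2),
    ("rs4", "G", "C", 0.3),   # palindromic, skipped
]
toy_dosages = {"rs1": 2, "rs2": 2, "rs3": 1, "rs4": 1}
toy_score, toy_used, toy_skipped = basic_prs(toy_weights, toy_dosages)
```

Returning the used and skipped counts alongside the score makes it easy to report how much of the model was actually evaluated.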

Haplogroup resolution

Recursive phylogenetic traversal across 140 nodes with back-mutation handling. Bayesian confidence scoring against random expectation. Y-DNA and mtDNA trees annotated with historical figures and migration context.

Archaic introgression

S* statistic for deep coalescence detection, ABBA-BABA D-statistics with block jackknife standard errors, and a 4-state Viterbi HMM for introgression tract detection with coalescent dating. 53 Neanderthal and Denisovan markers.
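As an illustration of the D-statistic step, the sketch below computes D from per-block ABBA and BABA site counts and estimates its standard error with a delete-one block jackknife. It assumes blocks of roughly equal size (so an unweighted jackknife is reasonable) and at least one informative site in every leave-one-out set; the function names are hypothetical.

```python
import math


def d_statistic(abba, baba):
    """D = sum(ABBA - BABA) / sum(ABBA + BABA) over genomic blocks."""
    num = sum(a - b for a, b in zip(abba, baba))
    den = sum(a + b for a, b in zip(abba, baba))
    return num / den


def block_jackknife_d(abba, baba):
    """Return (D, standard error, Z) using a delete-one block jackknife.

    abba, baba: lists of per-block site counts, one entry per block.
    """
    n = len(abba)
    d_full = d_statistic(abba, baba)
    # Recompute D with each block left out in turn.
    loo = [
        d_statistic(abba[:i] + abba[i + 1:], baba[:i] + baba[i + 1:])
        for i in range(n)
    ]
    mean_loo = sum(loo) / n
    var = (n - 1) / n * sum((d - mean_loo) ** 2 for d in loo)
    se = math.sqrt(var)
    z = d_full / se if se > 0 else float("nan")
    return d_full, se, z


toy_d, toy_se, toy_z = block_jackknife_d([10, 12, 8, 11], [5, 6, 4, 5])
```

The Z-score (D divided by its jackknife standard error) is the quantity usually compared against a significance threshold. Unequal block sizes would call for a weighted jackknife instead.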

Phenotype prediction

HIrisPlex-S models for eye, hair, and skin color. GIANT consortium height prediction. Per-SNP contribution tracking for explainability. 67% coverage threshold to prevent systematic bias from missing MC1R variants.

Pharmacogenomics & safety

Star allele calling for 12 CPIC Level-A genes (CYP2D6, CYP2C19, DPYD, SLCO1B1, HLA-B, and more). 10 FDA Black Box interactions surfaced as critical alerts. Activity score modeling for metabolizer classification.
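Activity score modeling can be sketched for a single gene as follows: each star allele gets an activity value, the two values of a diplotype are summed, and the sum is mapped to a metabolizer class. The allele values and cutoffs below follow the commonly cited CPIC scheme for CYP2D6 but are included only as illustration and should be checked against current CPIC tables; the function names are hypothetical, and copy-number variation is ignored.

```python
# Illustrative CYP2D6 allele activity values (assumption: verify against CPIC).
ALLELE_ACTIVITY = {"*1": 1.0, "*4": 0.0, "*10": 0.25, "*41": 0.5}


def activity_score(diplotype):
    """Sum the activity values of the two alleles in a diplotype like '*1/*4'."""
    first, second = diplotype.split("/")
    return ALLELE_ACTIVITY[first] + ALLELE_ACTIVITY[second]


def metabolizer_class(score):
    """Map an activity score to a metabolizer class (illustrative cutoffs)."""
    if score == 0:
        return "poor"
    if score < 1.25:
        return "intermediate"
    if score <= 2.25:
        return "normal"
    return "ultrarapid"


for toy_diplotype in ("*4/*4", "*4/*10", "*1/*41", "*1/*1"):
    toy_class = metabolizer_class(activity_score(toy_diplotype))
    print(toy_diplotype, activity_score(toy_diplotype), toy_class)
```

Keeping the allele table separate from the classification cutoffs lets each be updated independently when guidelines change.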

Whole-genome offspring modeling

The CHURCH inheritance calculator combines polygenic scores with Mendelian predictions across traits. Partner compatibility scoring, offspring trait simulation, and embryo comparison across complex phenotypes.

Quality & validation

Data quality control on every upload: call rate, heterozygosity, Ti/Tv ratio, per-chromosome statistics. Reference-sample validation: NA18486 (YRI) assigned 100% AFR, NA18525 (CHB) assigned 100% EAS, HG03006 (BEB) assigned 87.6% SAS.
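Three of the upload checks named above can be computed in a single pass over a sample's genotype calls. The sketch below is illustrative only: the input layout (reference allele, alternate allele, VCF-style call string) and the function name `qc_summary` are assumptions, Ti/Tv is counted only over sites where the sample carries the alternate allele, and per-chromosome statistics are omitted.

```python
# Transitions are A<->G (purines) and C<->T (pyrimidines); all else are transversions.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def qc_summary(sites):
    """Compute call rate, heterozygosity rate, and Ti/Tv for one sample.

    sites: list of (ref, alt, call) with call in {'0/0', '0/1', '1/1', './.'}.
    """
    total = len(sites)
    called = het = ti = tv = 0
    for ref, alt, call in sites:
        if call == "./.":
            continue  # no-call
        called += 1
        if call == "0/1":
            het += 1
        if call in ("0/1", "1/1"):
            if frozenset((ref, alt)) in TRANSITIONS:
                ti += 1
            else:
                tv += 1
    return {
        "call_rate": called / total if total else None,
        "het_rate": het / called if called else None,
        "ti_tv": ti / tv if tv else None,
    }


toy_sites = [
    ("A", "G", "0/1"),  # transition, heterozygous
    ("C", "T", "1/1"),  # transition, homozygous alt
    ("A", "C", "0/1"),  # transversion, heterozygous
    ("G", "T", "0/0"),  # homozygous ref, not counted in Ti/Tv
    ("A", "T", "./."),  # no-call
]
toy_qc = qc_summary(toy_sites)
```

Each metric returns `None` rather than raising when its denominator is zero, so an empty or all-missing upload can still be reported.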

Methodology

Validated, not just shipped.

Our methodology is grounded in established scientific principles. Every analyzer is validated against reference samples from 1000 Genomes and HGDP before reaching users. Confidence intervals, convergence diagnostics, and per-SNP contribution tracking are default, not optional.

We publish calibration results openly inside the product so users can see exactly how each prediction was made. Where a method has limitations, we surface them. Where data is missing, we mark it. Where a population is underrepresented, we say so.

This is research-grade analysis, open to scrutiny and refinement.

Open to collaboration.

If you work in genomics, bioinformatics, population genetics, or reproductive medicine and want to collaborate on methods, data, or validation, reach out.