Phase 2: Transformation
      ┌───────────┐   ┌────────────────────────────┐
      │ ImmunData │   │ AnnData / Seurat / TCRdist │
      └───────────┘   │ seur@meta.data / adata.obs │
            │         └────────────────────────────┘
            │                             │
            ├─────────────────────────────┘
            │
            ▼
  annotate_immundata()   ──── Import external annotations to ImmunData
            │
            ▼
    agg_repertoires()    ──── Aggregate repertoires
            │
            ▼
   filter_immundata()    ──── Filter receptors or repertoires
            │
            ▼
   mutate_immundata()    ──── Create or modify columns, compute statistics
            │
            │     ┌────────────────┐
            ├────►│ save / plot #1 │
            │     └────────────────┘
            ▼
  annotate_immundata()   ──── Annotate ImmunData with the computed statistics
            │
            │     ┌────────────────┐
            ├────►│ save / plot #2 │
            │     └────────────────┘
            │
            ▼
┌────────────────────────┐
│seur@meta.data[:] <- ...│ ──── Export ImmunData annotations
│    adata.obs = ...     │
└────────────────────────┘
Transformation is a loop of annotation → modification and computation → visualisation, always producing a new ImmunData while leaving the parent intact. That immutability is what turns every notebook into a reproducible pipeline.
- Import external annotations to ImmunData:
  annotate_immundata() (or its thin wrappers annotate_barcodes() / annotate_receptors()) merges labels from Seurat, AnnData, TCRdist, or any other source that can be expressed as a keyed data frame into the main table, so that each chain has a corresponding annotation.
- Aggregate repertoires:
  Now that the extra labels are present, you might regroup receptors with agg_repertoires(), for example by donor × cell state.
- Filter receptors or repertoires:
  filter_immundata() accepts tidyverse-style predicates on chains, receptors, or repertoires.
- Create or modify columns, compute statistics:
  In this step, you compute per-repertoire or per-receptor statistics from the input receptor features. There are several scenarios, depending on what you want to achieve:
  1) use immunarch for the most common analysis functions. The package will automatically annotate both receptors/barcodes/chains (!) and repertoires (!!) where possible;
  2) simply mutate the whole dataset using dplyr syntax, for example computing the edit distance to a specific pattern with mutate_immundata();
  3) run a more complex computation that requires applying a custom function to the values and is probably not supported by duckplyr.
- Save / plot #1:
  Cache the ImmunData. Use ggplot2 to visualise the statistics computed from the ImmunData.
- Annotate ImmunData with the computed statistics:
  annotate_immundata() (again) joins the freshly computed statistics back to the canonical dataset.
- Save / plot #2:
  Save the ImmunData with the new annotations to disk. Plot the results of the analysis.
- Export ImmunData annotations:
  Write the annotated data back to the cell-level dataset (Seurat / AnnData) for subsequent analysis. Additionally, you can write the ImmunData itself to disk if needed.
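
The steps above can be pictured on plain data frames. The following is a minimal sketch in base R only, assuming toy tables and hypothetical column names (barcode, donor, cdr3, cell_state, repertoire, dist_to_pattern). It deliberately does not call the ImmunData functions, whose exact arguments are not shown in this section; it only illustrates what each step does to the data: a keyed join, a regrouping, a filter, a computed column, a join back, and an export keyed by barcode.

```r
# Toy stand-ins for the real objects. This is NOT the ImmunData API:
# every table and column name below is an assumption made for illustration.
receptors <- data.frame(
  barcode = c("c1", "c2", "c3", "c4", "c5"),
  donor   = c("d1", "d1", "d1", "d2", "d2"),
  cdr3    = c("CASSLGTDTQYF", "CASSLGTDTQYF", "CASSPGQGNEQFF",
              "CASSLGADTQYF", "CASRDRGYEQYF"),
  stringsAsFactors = FALSE
)

# Hypothetical cell-level metadata, e.g. what seur@meta.data or adata.obs
# might provide once it is expressed as a keyed data frame.
cell_meta <- data.frame(
  barcode    = c("c1", "c2", "c3", "c4", "c5"),
  cell_state = c("effector", "effector", "naive", "effector", "naive"),
  stringsAsFactors = FALSE
)

# 1. Import external annotations: left join keyed on the barcode.
annotated <- merge(receptors, cell_meta, by = "barcode", all.x = TRUE)

# 2. Aggregate repertoires: regroup receptors by donor x cell state.
annotated$repertoire <- paste(annotated$donor, annotated$cell_state, sep = "_")
rep_sizes <- aggregate(barcode ~ repertoire, data = annotated, FUN = length)
names(rep_sizes)[2] <- "n_receptors"

# 3. Filter: keep receptors from effector cells only.
effector <- annotated[annotated$cell_state == "effector", ]

# 4. Mutate: Levenshtein edit distance from each CDR3 to a pattern.
pattern <- "CASSLGTDTQYF"
effector$dist_to_pattern <- as.vector(utils::adist(effector$cdr3, pattern))

# 5. Annotate with the computed statistics: join them back to the full table.
#    Receptors that were filtered out get NA.
stats  <- effector[, c("barcode", "dist_to_pattern")]
result <- merge(annotated, stats, by = "barcode", all.x = TRUE)

# 6. Export: reorder by the cell-level barcodes so the columns can be
#    assigned back to the single-cell metadata table.
export <- result[match(cell_meta$barcode, result$barcode),
                 c("barcode", "repertoire", "dist_to_pattern")]
print(export)
```

Every step assigns its result to a new object, so receptors and annotated are never modified in place. This mirrors the "new ImmunData, parent intact" behaviour described above, here simply as a consequence of R's copy-on-modify semantics.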