HDC tutorial

How to Build a Brain: Introduction to Hyperdimensional Computing
Michiel Stock
Engineering Biology Doctoral School 2025, 28 November 2025

Outline
• Lecture outline:
  1. Motivation & theory (what?)
  2. Computing & operations (how?)
  3. Applications (why?)
• Each part has some theory and notebook exercises
• See https://kermit-ugent.github.io/HDC_tutorial/ for the notebook!
• About one hour per part, feel free to take a break during the exercises

My background
• Studied Bioscience Engineering (cellular biotech) at Ghent University
• As of September 2023, a lecturer in applied dynamical systems at the Faculty of Bioscience Engineering
• Teaches mathematical modeling courses
• Co-leading the KERMIT (knowledge-based systems) research unit
• Primary research interest: computing for and computing by living systems

1. Motivation & theory

• How do I represent the ingredients?
• Can I make infinitely many new combinations?
• How do I represent new combinations, techniques?
• How do I compare two recipes?
• Which ingredient would be best to add?

(cooking is a metaphor for synbio)

HDC in one slide

HDC vs deep learning
Deep learning/artificial neural networks (ANN) and HDC are both (highly abstracted) models of the brain:
• Artificial neural networks model neurons and synapses (connectionist)
• HDC models distributed memory and processing
HDC is a kind of neuro-symbolic AI, i.e. brain-inspired models based on symbolic reasoning.
What is 🍏?
• ANN: inputs → neural network → output = apple
• HDC: sphere ⊗ fruit ⊗ green ⊗ tree = apple

Main idea of bioinformatics: it’s all about similarity
• sequence-based similarity
• homology
• phylogeny
• structure prediction
• gene annotation
• …

Prototype methods
• For classification or regression, pick the label of the most similar annotated data point.
• Each category is represented by one or more “prototypes”.
• It can be made more robust by taking the majority of the k nearest neighbors.
• Simple, flexible and powerful, but does not scale to high dimensions.
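The nearest-neighbor rule above can be sketched in a few lines. This is only an illustration: the toy points, the labels and the function names (`euclidean`, `knn_predict`) are made up for this example and do not come from the tutorial notebook.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equally long tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(query, points, labels, k=3):
    """Majority label among the k annotated points closest to the query."""
    order = sorted(range(len(points)), key=lambda i: euclidean(query, points[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two features per annotated sample
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["plant", "plant", "plant", "bacterium", "bacterium", "bacterium"]

print(knn_predict((1.1, 1.0), points, labels, k=1))  # label of the single nearest point
print(knn_predict((4.9, 5.0), points, labels, k=3))  # majority of the 3 nearest points
```

With k = 1 this is the plain "most similar annotated data point" rule; larger k gives the majority vote.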

Computational biology (from BLAST to AlphaFold) is driven by similarity

A bit of history of HDC
• In the 90s, much of AI work was focused on connectionist models (Hinton, Hopfield, etc.), where the challenge was setting weights.
• A different group focused on representations (how does the brain represent concepts without symbols?)
• Pioneers:
  • Paul Smolensky (1990): Tensor Product Representations
  • Tony Plate (1995): Holographic Reduced Representations
  • Pentti Kanerva (1988/2009): Sparse Distributed Memory & HDC
• This family of methods is also known as Vector Symbolic Architectures (VSA).
https://web.stanford.edu/class/ee380/Abstracts/171025-slides.pdf

Why would we care about this stuff?
• As biologists:
  • toy model of the mind
  • meaning/information from randomness
• As AI/bioinformatics practitioners:
  • low-energy/low-resource AI
  • compositionality & reasoning on complex structures

Energy-efficient machine learning
• The brain uses 20 W
• Gaming (PS5) uses 200 W
• Energy use of LLMs (inference)

Concepts are (probably) vectors: the mathematics of psychology
Piantadosi, S.T., Muller, D.C.Y., Rule, J.S., Kaushik, K., Gorenstein, M., Leib, E.R., Sanford, E., 2024. Why concepts are (probably) vectors. Trends in Cognitive Sciences 0. https://doi.org/10.1016/j.tics.2024.06.011

Neural computation vs symbolic computation
• Symbolic compositional computing encodes concepts via symbols and works via the composition principle.
• Neural computation encodes concepts via vectors and works via the continuity principle, which allows learning by gradients.
P. Smolensky, R. T. McCoy, R. Fernandez, M. Goldrick, and J. Gao, “Neurocompositional computing: from the central paradox of cognition to a new generation of AI systems,” AI Magazine, vol. 43, no. 3, pp. 308–322, 2022, doi: 10.1002/aaai.12065.

Representation building blocks (in the hyperdimensional space)

Realistic hypervector (it just has to be big)
• Hypervector: a very big (ca. 10,000 dimensions), randomly generated binary vector.
• There are $2 \times 10^{3010}$ unique hypervectors, versus “merely” $10^{80}$ atoms in the universe.
• Often binary (0/1) or bipolar (-1/1), but can also be real, graded, or even complex.

Generation of hypervectors: i.i.d. sampling
• Atomic hypervectors are the building blocks for your concepts (letters, symbols, atoms…).
• The type of the elements does not matter too much.
• Clever idea: use a hash of the object as a seed for vector generation!
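A minimal sketch of the hash-as-seed idea, assuming bipolar hypervectors with 10,000 elements. The helper name `atomic_hv` and the use of SHA-256 are choices made for this illustration (Python's built-in `hash` of a string changes between runs, so a stable digest is used instead).

```python
import hashlib
import random

N = 10_000  # dimensionality, as on the slides

def atomic_hv(obj, n=N):
    """Bipolar hypervector whose random seed is derived from a hash of obj."""
    digest = hashlib.sha256(repr(obj).encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.choice((-1, 1)) for _ in range(n)]

a1 = atomic_hv("alanine")
a2 = atomic_hv("alanine")
g = atomic_hv("glycine")
n_matches = sum(x == y for x, y in zip(a1, g))

print(a1 == a2)   # the same object always yields the same hypervector
print(n_matches)  # unrelated atoms agree on roughly N/2 positions
```

The practical benefit is that no lookup table of atoms has to be stored: the hypervector of an object can be regenerated on demand.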

Hallmarks of HDC
• hyperdimensional: the Law of Large Numbers kicks in for many properties
• homogeneous: most of the HVs “look the same”
• holographic: information is spread over the whole HV
• robust: every vector is surrounded by thousands of similar vectors

High-dimensional spaces are peculiar
In high dimensions, there is no “typical” representative.

Why does it work? The Blessing of Dimensionality
• In high dimensions, almost all points are at the same distance from each other (homogeneous). This simplifies the geometrical structure, but makes similarity search hard or useless.
• [Figure: samples of a high-dimensional Gaussian versus the expected value of a high-dimensional Gaussian; cfr. Gaussian soap bubbles. Thinking out of the 10-dimensional box.]
A. N. Gorban and I. Y. Tyukin, “Blessing of dimensionality: mathematical foundations of the statistical physics of data,” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, vol. 376, no. 2118, p. 20170237, Mar. 2018, doi: 10.1098/rsta.2017.0237.

Similarity: detect whether two vectors are related
For bipolar HVs, the cosine similarity is a good similarity measure:
  $\tau_{\cos}(u, v) = \frac{u \cdot v}{|u|\,|v|}$
• Orthogonal: $\tau_{\cos}(u, v) = 0$
• Quasi-orthogonal: $\tau_{\cos}(u, v) \approx 0$
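The cosine similarity is easy to check numerically. The sketch below assumes bipolar hypervectors of 10,000 elements; `random_hv` and `cosine` are names chosen for this example.

```python
import math
import random

N = 10_000
rng = random.Random(1)

def random_hv(n=N):
    """Random bipolar hypervector with i.i.d. elements in {-1, 1}."""
    return [rng.choice((-1, 1)) for _ in range(n)]

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

u = random_hv()
v = random_hv()

print(cosine(u, u))  # identical vectors: similarity 1
print(cosine(u, v))  # unrelated vectors: close to 0 (quasi-orthogonal)
```

For bipolar vectors both norms equal the square root of N, so the cosine is simply the dot product divided by N.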

Similarities are tightly bounded
For $N = 10\,000$, we expect on average 5000 matches with a standard deviation of 50.
Concentration inequalities such as Hoeffding's inequality bound mismatches by chance:
  $P(\text{more than 500 matches more than expected} \mid \text{unrelated}) < 1.4 \times 10^{-11}$

Bounding using Hoeffding: most HVs are equally (dis)similar
We can bound the chance that the number of matches exceeds its expected value by at least $t$ via Hoeffding’s inequality:
  $P(S_N - E[S_N] \ge t) \le \exp\left(-\frac{2t^2}{\sum_{i=1}^{N} (b_i - a_i)^2}\right)$
• number of matches: $S_N = \sum_i X_i$
• expected number of matches: $E[S_N] = N/2$
• lower ($a_i$) and upper ($b_i$) bounds for the matches $X_i$ (hence $b_i - a_i = 1$)
Dropping the factor 2 in the exponent gives the slightly looser bound used here:
  $P(\text{number of matches more than expected} \ge t) \le \exp\left(-\frac{t^2}{N}\right)$
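The numbers on these two slides can be reproduced with a short calculation. The sketch below evaluates the bound exp(−t²/N) for N = 10,000 and t = 500, compares it with the standard Hoeffding form that has a factor 2 in the exponent, and runs a small simulation. Binary hypervectors are stored as Python integers purely for speed; that representation and the names used are choices made for this example.

```python
import math
import random

N = 10_000  # dimensionality
t = 500     # excess matches above the expected N/2

bound_slide = math.exp(-t**2 / N)         # exp(-25), about 1.4e-11
bound_standard = math.exp(-2 * t**2 / N)  # exp(-50), tighter standard form
sd = math.sqrt(N * 0.5 * 0.5)             # standard deviation of a Binomial(N, 1/2)

rng = random.Random(0)

def n_matches():
    """Number of matching bits between two fresh random binary hypervectors."""
    u = rng.getrandbits(N)
    v = rng.getrandbits(N)
    return N - bin(u ^ v).count("1")

# Excess over the expected N/2 matches for 200 unrelated pairs
excess = [n_matches() - N // 2 for _ in range(200)]

print(bound_slide, bound_standard, sd)
print(max(abs(e) for e in excess))  # far below t = 500
```

In practice the observed excess stays within a few standard deviations (a few times 50), which is why a cutoff of 500 is essentially never crossed by chance.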

There are many quasi-orthogonal HVs: an exponentially large number of them!
• In an $N$-dimensional space, it is only possible to find a set of $N$ orthogonal vectors, and the chance of randomly drawing an orthogonal set is extremely low!
• Draw $M$ random hypervectors and let $\delta$ be the chance that at least one pair shares more than $\varepsilon$ bits more than expected. We can write
  $\delta \le \binom{M}{2} e^{-\varepsilon^2/N} \approx \frac{M^2}{2} e^{-\varepsilon^2/N}$
• So, the size of a quasi-orthogonal set is given by
  $M \ge \sqrt{2\delta\, e^{\varepsilon^2/N}}$
• Let $\delta = 0.0001$, then
  • for $\varepsilon = 100$: $M \approx 0$
  • for $\varepsilon = 500$: $M \approx 2 \times 10^{8}$
  • for $\varepsilon = 1000$: $M \approx 10^{41}$
• Independent of $N$ if the similarity cutoff $\varepsilon$ scales with $\sqrt{N}$!

Linear separation in high dimensions: always possible!
In the hyperspace, you can separate the classes using a cheap linear classifier (e.g. logistic regression).
Why? For large N, there are more separating hyperplanes than ways to label your points!

Robustness
Every vector is surrounded by thousands of similar vectors.
• Take a vector representing a concept. It shares 10,000 elements with itself.
• Flip 40% of its elements: it now shares only 6,000 elements with the original.
• It is still within the closest $10^{-42}$ % of all vectors to the original concept!
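The robustness claim can be illustrated with a small experiment: corrupt 40% of a hypervector and compare it against unrelated random vectors. The sample of 50 random vectors and the helper names are assumptions made for this sketch.

```python
import random

N = 10_000
rng = random.Random(42)

def random_hv():
    return [rng.choice((-1, 1)) for _ in range(N)]

def n_matches(u, v):
    """Number of positions on which two hypervectors agree."""
    return sum(a == b for a, b in zip(u, v))

concept = random_hv()

# Flip exactly 40% of the positions, chosen at random
flipped = set(rng.sample(range(N), k=int(0.4 * N)))
noisy = [-x if i in flipped else x for i, x in enumerate(concept)]

# Unrelated vectors agree with the concept on about N/2 positions
others = [random_hv() for _ in range(50)]
best_other = max(n_matches(concept, o) for o in others)

print(n_matches(concept, noisy))  # 6000 by construction
print(best_other)                 # close to 5000, never near 6000
```

Even after heavy corruption, the noisy copy is still far closer to the original than any unrelated vector in the sample, so a nearest-neighbor lookup recovers the right concept.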

Closeness is not transitive (holographic)
[Figure: concept map with the labels apple, Adam & Eve, computer, green, fruit, water, fish, Bouillabaisse, French]

Why does it work? HDC works similarly to a Bloom filter
• A Bloom filter is a stochastic data structure to test whether an element is part of a set.
• Every element is hashed k times to flip certain bits in a bit array.
• Checking for the presence of an element is done similarly, by checking whether all required bits are “on”.
• False positives are possible, but false negatives are not.
• Variants such as the counting Bloom filter or autoscaling Bloom filter allow more flexibility.
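A Bloom filter fits in a few lines. The sketch below is a toy version for illustration only: the class name, the array size of 1024 bits, k = 3 and the way the k hash functions are derived from SHA-256 are all choices made for this example.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: k hash functions setting bits in a fixed-size bit array."""

    def __init__(self, n_bits=1024, k=3):
        self.n_bits = n_bits
        self.k = k
        self.bits = [False] * n_bits

    def _positions(self, item):
        # Derive k bit positions by salting the item with the hash index
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = True

    def __contains__(self, item):
        # Present only if all k bits are "on"
        return all(self.bits[p] for p in self._positions(item))

bf = BloomFilter()
for word in ["apple", "fish", "water"]:
    bf.add(word)

print("apple" in bf)     # True: an added element is always found
print("computer" in bf)  # most likely False, but a false positive is possible
```

The analogy with HDC: a bundled hypervector also stores a set in a fixed-size array, and membership is tested by checking agreement with the stored pattern.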

Time for exercises!
• Open the Pluto notebook and get started (ideally live, otherwise via the HTML)
• Go at your own pace through the examples and exercises of 2. Theory & Foundations:
  1. Low and high dimensions 🧊
  2. Hypervectors 🎲
  3. Holographic property 👽
  4. Robustness 🪨
  5. Example: amino acids 🥩
• Make sure to do the 2.2 Hypervectors part; the remainder of the exercises will build upon it!

Julia crash course: everything you need!
1. Vectors and matrices
2. Function one-liners
3. Broadcasting via “.”
4. Loops and comprehensions
5. Pipes via “|>”

2. Reasoning & computing

Basic ingredients for HDC
What we already have:
• a way to generate new unrelated concepts (random vectors)
• a way to check whether two vectors are related (similarity)
What is missing:
• combining multiple HVs into a new, overarching concept
• turning one or multiple HVs into a new, independent concept

Bundling or superposition: combining multiple HVs into one that is similar to all
Bundling is usually done by element-wise majority:
  $u = [v_1 + v_2 + v_3]$
The brackets $[\ldots]$ denote a normalization step, which might be needed.
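A minimal sketch of bundling for bipolar hypervectors, assuming the normalization is the element-wise sign of the sum (a majority vote). The function names are chosen for this example.

```python
import math
import random

N = 10_000
rng = random.Random(7)

def random_hv():
    return [rng.choice((-1, 1)) for _ in range(N)]

def sign(x):
    return 1 if x > 0 else (-1 if x < 0 else 0)

def bundle(*hvs):
    """Element-wise sum followed by the sign (majority vote); ties become 0."""
    return [sign(sum(column)) for column in zip(*hvs)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))

v1, v2, v3, unrelated = random_hv(), random_hv(), random_hv(), random_hv()
u = bundle(v1, v2, v3)

print([round(cosine(u, v), 2) for v in (v1, v2, v3)])  # each around 0.5
print(round(cosine(u, unrelated), 2))                   # around 0
```

With an odd number of bipolar inputs there are no ties, so the bundle is again bipolar; with an even number, the zeros mark positions without a majority.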

Bundling combines concepts
Merge concepts so that the result is similar to all its constituents.

Some thoughts about bundling
• Trivially easy to implement for bipolar vectors!
• Adding vectors without normalizing (thresholding) retains information on the relative importance of the elements and the number of elements that were combined.
• Possibility of weighting concepts
• Usually, it's best to normalize to compare sets of different sizes
• Zero as neutral element (no longer bipolar!)

Binding: combining multiple HVs into one that is different from all
Binding is done by xor-ing bits or multiplying bipolar vectors element-wise:
  $u = v_1 \circ v_2$
Binding allows us to generate completely new HVs from other vectors!
Often reversible:
  $v_1 \oslash u = v_1 \oslash (v_1 \circ v_2) = v_2$
When using element-wise multiplication, binding is its own inverse.
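Binding by element-wise multiplication, and the fact that it undoes itself, can be checked directly. This sketch assumes bipolar hypervectors; the function names are chosen for this example.

```python
import math
import random

N = 10_000
rng = random.Random(8)

def random_hv():
    return [rng.choice((-1, 1)) for _ in range(N)]

def bind(u, v):
    """Element-wise multiplication of two bipolar hypervectors."""
    return [a * b for a, b in zip(u, v)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))

v1, v2 = random_hv(), random_hv()
u = bind(v1, v2)

print(round(cosine(u, v1), 2), round(cosine(u, v2), 2))  # both around 0
print(bind(v1, u) == v2)                                  # True: unbinding recovers v2 exactly
```

Because every element of a bipolar vector squares to 1, binding with v1 a second time cancels the first binding.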

Binding problem and superposition catastrophe
• Combining multiple distinct embeddings into a new embedding without them leaking is hard! For example, the composition of an image.
• [Figure: result of me generating an image of a fox and rabbit on my GPU to illustrate Lotka-Volterra equations]
• Fundamental problem of neural embeddings! The binding operator solves this elegantly!
[1] K. Greff, S. van Steenkiste, and J. Schmidhuber, “On the binding problem in artificial neural networks.” arXiv, Dec. 09, 2020. doi: 10.48550/arXiv.2012.05208.

Permutation or shifting: generate a new HV from an old one
Permutation is a special case of binding to create a variant of an HV:
  $\rho(v) \nsim v$
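A sketch of shifting, assuming the permutation ρ is a circular shift by one position (the choice listed for both bipolar and binary HDC later in the deck). The function names are chosen for this example.

```python
import math
import random

N = 10_000
rng = random.Random(9)
v = [rng.choice((-1, 1)) for _ in range(N)]

def shift(hv, k=1):
    """Circular shift by k positions; a negative k shifts back."""
    k %= len(hv)
    return hv[-k:] + hv[:-k] if k else list(hv)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))

print(shift([1, 2, 3, 4]))             # [4, 1, 2, 3]
print(round(cosine(v, shift(v)), 2))   # around 0: the shifted HV is dissimilar
print(shift(shift(v, 3), -3) == v)     # True: shifting is reversible
```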

Binding hypervectors to new concepts
Combine concepts into something that is different from the original concepts.
[Figure: binding (∘) and shifting (ρ) of hypervectors]

Drifting: exchanging / perturbing bits
• Creating a “continuous” interpolation between two HVs can be done by exchanging bits: this makes one HV more similar to another vector.
• Flipping random bits is randomly drifting in the hyperspace.

Bipolar vs binary HDC
            Bipolar                            Binary
Example     [-1, 1, 1, 1, -1, -1, 1, …, -1]    [0, 1, 0, 1, 0, 0, 1, …, 0]
Bundle      majority (summing + normalizing)   majority
Bind        element-wise multiply              element-wise xor
Permute     circular shift                     circular shift
Compare     cosine similarity                  Tanimoto or Hamming

Other HDC architectures
Schlegel, K., Neubert, P., Protzel, P., 2022. A comparison of vector symbolic architectures. Artif Intell Rev 55, 4523–4555. https://doi.org/10.1007/s10462-021-10110-3

Dictionary lookup
These simple operations suffice for building a learning system!
Suppose we have a bunch of key-value pairs. We can bind each key ($u_i$) to its corresponding value ($v_i$) and bundle all pairs into a single HV:
  $h = [u_1 \circ v_1 + \ldots + u_n \circ v_n]$
Inference:
  $u_i \oslash h \approx [u_i \circ u_1 \circ v_1 + \ldots + u_i \circ u_n \circ v_n] = [v_i + \text{noise}] \approx v_i$
since $u_i \circ u_j = \text{noise}$ ($i \neq j$) and $u_i \circ u_i = 1$.
Most schemes of HDC are more advanced versions of this simple procedure!
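The dictionary lookup can be sketched with bipolar hypervectors, element-wise multiplication as binding and the sign of the sum as bundling. Five random key-value pairs are used (an odd number avoids ties); the names and the final nearest-value step are assumptions made for this example.

```python
import math
import random

N = 10_000
rng = random.Random(10)

def random_hv():
    return [rng.choice((-1, 1)) for _ in range(N)]

def bind(u, v):
    return [a * b for a, b in zip(u, v)]

def bundle(*hvs):
    """Majority vote; with an odd number of bipolar inputs there are no ties."""
    return [1 if sum(column) > 0 else -1 for column in zip(*hvs)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))

keys = [random_hv() for _ in range(5)]
values = [random_hv() for _ in range(5)]

# h = [u1 ∘ v1 + ... + un ∘ vn]
h = bundle(*(bind(k, v) for k, v in zip(keys, values)))

def lookup(key):
    """Unbind the key from h and return the index of the most similar stored value."""
    noisy_value = bind(key, h)
    return max(range(len(values)), key=lambda j: cosine(noisy_value, values[j]))

print([lookup(k) for k in keys])  # each key retrieves its own value: [0, 1, 2, 3, 4]
```

The unbound result is only a noisy version of the value, so the last step (a nearest-neighbor search among the known values) is what cleans it up.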

Example: dollar of Mexico. Learning by association
• Represent countries by name (NAM), capital (CAP) and currency (MON), written out below as COUNTRY, CAPITAL and MONEY
• USTATES = [(COUNTRY ◦ USA) + (CAPITAL ◦ WDC) + (MONEY ◦ DOL)]
• MEXICO = [(COUNTRY ◦ MEX) + (CAPITAL ◦ MXC) + (MONEY ◦ PES)]
• Dollar of Mexico?
• DOL ◦ (USTATES ◦ MEXICO) ≈ PES
• USTATES ◦ MEXICO links each of the corresponding concepts (USA - MEX, WDC - MXC, DOL - PES)
Kanerva, P., n.d. What We Mean When We Say “What’s the Dollar of Mexico?”: Prototypes and Mapping in Concept Space.
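The "dollar of Mexico" query can be reproduced with random bipolar atoms. The sketch assumes element-wise multiplication for binding and a majority vote for bundling; the `nearest` helper and the dictionary of atoms are constructions for this example.

```python
import math
import random

N = 10_000
rng = random.Random(12)

def random_hv():
    return [rng.choice((-1, 1)) for _ in range(N)]

def bind(u, v):
    return [a * b for a, b in zip(u, v)]

def bundle(*hvs):
    """Majority vote over an odd number of bipolar hypervectors."""
    return [1 if sum(column) > 0 else -1 for column in zip(*hvs)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))

names = ["COUNTRY", "CAPITAL", "MONEY", "USA", "WDC", "DOL", "MEX", "MXC", "PES"]
atom = {name: random_hv() for name in names}

ustates = bundle(bind(atom["COUNTRY"], atom["USA"]),
                 bind(atom["CAPITAL"], atom["WDC"]),
                 bind(atom["MONEY"], atom["DOL"]))
mexico = bundle(bind(atom["COUNTRY"], atom["MEX"]),
                bind(atom["CAPITAL"], atom["MXC"]),
                bind(atom["MONEY"], atom["PES"]))

def nearest(hv):
    """Name of the atom most similar to hv."""
    return max(atom, key=lambda name: cosine(hv, atom[name]))

# DOL ∘ (USTATES ∘ MEXICO) ≈ PES
answer = bind(atom["DOL"], bind(ustates, mexico))
print(nearest(answer))  # PES
```

The same mapping answers other analogies, for example binding it with WDC points towards MXC.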

Encoding data: from atoms to complex objects
Bundling, binding and permuting suffice to create HV representations of almost every data structure you require!
• sets: bundle the elements
  $u = [v_1 + v_2 + v_3 + v_4]$
• graphs: encode edges by binding + bundling
• sequences: position encoded via shifting, n-grams for longer sequences
  $u = \rho^0(v_1) \circ \rho^1(v_2) \circ \rho^2(v_3) \circ \rho^3(v_4)$
  $u = [\rho^0(v_1) + \rho^1(v_2) + \rho^2(v_3) + \rho^3(v_4)]$

Encoding sequences
• Short sequences are encoded directly
  • binding does not encode position: $A * B * C = B * A * C = C * A * B$
  • permuting + binding does encode position: $A * \rho(B) * \rho^2(C) \neq B * \rho(A) * \rho^2(C)$
• Longer sequences: split into k-mers + bundling
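A sketch of the sequence encoding, assuming a DNA alphabet, circular shifts for ρ, element-wise multiplication for binding and an unnormalized sum of 3-mer hypervectors as the bundle. The example sequences and function names are made up for this illustration.

```python
import math
import random

N = 10_000
rng = random.Random(5)
atoms = {ch: [rng.choice((-1, 1)) for _ in range(N)] for ch in "ACGT"}

def bind(u, v):
    return [a * b for a, b in zip(u, v)]

def shift(hv, k):
    k %= len(hv)
    return hv[-k:] + hv[:-k] if k else list(hv)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))

def encode_kmer(kmer):
    """Bind the atoms after shifting each by its position: A * rho(B) * rho^2(C)."""
    hv = [1] * N
    for pos, ch in enumerate(kmer):
        hv = bind(hv, shift(atoms[ch], pos))
    return hv

def encode_sequence(seq, k=3):
    """Unnormalized bundle (element-wise sum) of all overlapping k-mers."""
    kmers = [encode_kmer(seq[i:i + k]) for i in range(len(seq) - k + 1)]
    return [sum(column) for column in zip(*kmers)]

# Position matters once the atoms are shifted
print(round(cosine(encode_kmer("ACG"), encode_kmer("CAG")), 2))  # around 0

s1 = encode_sequence("ACGTACGTACGT")
s2 = encode_sequence("ACGTACGTACGA")  # differs from s1 in the last letter only
s3 = encode_sequence("TTGGCCAATTGG")  # shares no 3-mer with s1
print(round(cosine(s1, s2), 2), round(cosine(s1, s3), 2))  # high versus around 0
```

Sequences of any length map to a vector of the same size, so comparing two sequences costs the same regardless of their length.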

Encoding graphs/molecules

Encoding numbers: numerical objects
• Scalars: determine the endpoints of an interval and randomly switch elements to obtain an interpolation.
• Numerical composite objects, such as real-valued vectors or functions, can be constructed using the aforementioned atomic scalar representations and operations.
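One way to realize the scalar encoding is sketched below: two random hypervectors represent the endpoints of the interval, and a value in between takes a proportional share of its elements from the upper endpoint. The fixed random switching order (so that nearby values share most switched positions) and all names are assumptions of this sketch.

```python
import math
import random

N = 10_000
rng = random.Random(6)

low_hv = [rng.choice((-1, 1)) for _ in range(N)]   # represents the lower endpoint
high_hv = [rng.choice((-1, 1)) for _ in range(N)]  # represents the upper endpoint
order = rng.sample(range(N), N)                    # fixed random order of switching

def encode_scalar(x, lo=0.0, hi=1.0):
    """Start from low_hv and switch a share of elements to high_hv proportional to x."""
    n_switch = round((x - lo) / (hi - lo) * N)
    hv = list(low_hv)
    for i in order[:n_switch]:
        hv[i] = high_hv[i]
    return hv

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))

near = cosine(encode_scalar(0.5), encode_scalar(0.6))
far = cosine(encode_scalar(0.5), encode_scalar(0.9))
ends = cosine(encode_scalar(0.0), encode_scalar(1.0))
print(round(near, 2), round(far, 2), round(ends, 2))  # similarity decreases with distance
```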

Encoding vectors: random projections
• Hypervectors for real-valued vectors can be obtained by random projection (cfr. the Johnson–Lindenstrauss lemma).
• Retains relative distances with high probability!
• Projection matrices are typically i.i.d. (e.g., scaled normal) or sparse.
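A sketch of a random projection for three-dimensional inputs (for instance RGB colors), using an i.i.d. standard normal projection matrix. Taking the sign of the projection to obtain a bipolar hypervector is an extra step assumed here; it is not part of the Johnson–Lindenstrauss lemma itself.

```python
import math
import random

N = 10_000  # hypervector dimensionality
D = 3       # input dimensionality, e.g. an RGB color
rng = random.Random(11)

# Projection matrix with i.i.d. standard normal entries (N rows, D columns)
projection = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(N)]

def project(x):
    """Random projection followed by the sign, giving a bipolar hypervector."""
    return [1 if sum(w * xi for w, xi in zip(row, x)) >= 0 else -1
            for row in projection]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))

red, orange, blue = (1.0, 0.0, 0.0), (1.0, 0.5, 0.0), (0.0, 0.0, 1.0)
sim_close = cosine(project(red), project(orange))
sim_far = cosine(project(red), project(blue))
print(round(sim_close, 2), round(sim_far, 2))  # similar inputs stay similar
```

Inputs that point in similar directions end up with similar hypervectors, while orthogonal inputs become quasi-orthogonal.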

Pretrained embeddings into the hyperspace: HDC as a glue for combining embeddings
images, proteins, text, molecules → pretrained deep neural network → embedding vector → random projection → hypervector
Sutor, P., Yuan, D., Summers-Stay, D., Fermuller, C., Aloimonos, Y., 2022. Gluing neural networks symbolically through hyperdimensional computing. https://doi.org/10.48550/arXiv.2205.15534

Learning and reasoning with HDC: memory-based learning
• [Figure: data → encoding → atoms → learning → memory → reasoning → decoding → output]
• Bundling often does not suffice to achieve competitive performance: a form of retraining is often needed.

Retraining and learning: just find a linear boundary
• In an ideal world, a computed HV would closely match its target concept. In practice, this might not perform well and a form of retraining is needed.
• Common method: iteratively add/bundle wrongly classified data points to improve the prototype HV.
• One can also use a linear classifier, such as Logistic Regression, Fisher Discriminant Analysis or the Perceptron.
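The retraining loop can be sketched on toy data. Each class prototype starts as the unnormalized bundle of its training hypervectors; misclassified points are then added to the correct prototype. Subtracting them from the wrongly predicted prototype is a common perceptron-like variant that is assumed here, as are the synthetic data and all names.

```python
import math
import random

N = 2_000  # smaller than on the slides to keep this pure-Python sketch fast
rng = random.Random(3)

def random_hv():
    return [rng.choice((-1, 1)) for _ in range(N)]

def noisy_copy(hv, flip_fraction):
    flips = set(rng.sample(range(N), k=int(flip_fraction * N)))
    return [-x if i in flips else x for i, x in enumerate(hv)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))

# Hypothetical toy data: each class is a cloud of noisy copies of a hidden centre
centres = {"plant": random_hv(), "bacterium": random_hv()}
data = [(noisy_copy(c, 0.35), label) for label, c in centres.items() for _ in range(20)]

# Step 1: one prototype per class by bundling (summing) its training points
prototypes = {label: [0] * N for label in centres}
for hv, label in data:
    prototypes[label] = [p + x for p, x in zip(prototypes[label], hv)]

def predict(hv):
    return max(prototypes, key=lambda label: cosine(prototypes[label], hv))

# Step 2: retraining on the misclassified points
for epoch in range(5):
    n_errors = 0
    for hv, label in data:
        guess = predict(hv)
        if guess != label:
            n_errors += 1
            prototypes[label] = [p + x for p, x in zip(prototypes[label], hv)]
            prototypes[guess] = [p - x for p, x in zip(prototypes[guess], hv)]
    if n_errors == 0:
        break

accuracy = sum(predict(hv) == label for hv, label in data) / len(data)
print(accuracy)
```

On this easy toy problem the initial prototypes are already good, so the loop stops after one pass; on real data the updates gradually move the prototypes towards a better linear boundary.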

Perceptron algorithm
• Frank Rosenblatt kickstarting the first AI hype cycle
• Remember: all classes are linearly separable in the hyperspace!

Back to work!
• Get back to the notebook and go through Section 2: Computing in the hyperspace.
• Implementation and examples of bundling, binding, shifting and retraining (optional)
• Example 1: text analysis
• Example 2: color spaces

Example 1: language recognition
• Cleaned snippets of the Wikipedia article in eight languages
• Explore the N-gram/k-mer representation
• Which languages are most similar?
• Can you detect the presence of a sentence using the HVs?

Example 2: color spaces
• Random projection of RGB color vectors into the hyperspace
• Comparing and interpolating colors
• Reverse mapping via dictionary lookup

Example 2: color spaces. Learning emoji colors
• Data set 1: set of one (noisy) color per emoji
• Data set 2: three colors per emoji (only one is correct!)

3. Adventures in biology & beyond

HDC for computational biology
• fast and efficient: extremely fast and energy-efficient compared to traditional methods
• multimodal: combining several data sources
• explainable: interpretable and explainable!
• neuro-symbolic and compositional: can represent complex compositions (e.g. sequences such as ...ATCAAC...)
Stock M, Van Criekinge W, Boeckaerts D, Taelman S, Van Haeverbeke M, Dewulf P, et al. (2024) Hyperdimensional computing: A fast, robust, and interpretable paradigm for biological data. PLoS Comput Biol 20(9): e1012426. https://doi.org/10.1371/journal.pcbi.1012426

Solving Raven’s progressive matrices
• Raven’s progressive matrices is a type of IQ test and machine learning benchmark.
• Deep learning does poorly (the figure compares with human performance).
• A neuro-symbolic architecture obtains an accuracy of 87% (and is two orders of magnitude faster).
M. Hersche, M. Zeqiri, L. Benini, A. Sebastian, and A. Rahimi, “A neuro-vector-symbolic architecture for solving Raven’s progressive matrices,” Nat Mach Intell, vol. 5, no. 4, Art. no. 4, Apr. 2023, doi: 10.1038/s42256-023-00630-8.
https://www.quantamagazine.org/a-new-approach-to-computation-reimagines-artificial-intelligence-20230413/

Low power machine learning
HDC is effective for biosignals such as EEG data due to ultra-low energy usage, robustness under low signal-to-noise ratios and online, fast learning.
• “That is, good performance depends on good design rather than automated training, and this is a harder research task” (Rahimi et al., 2019, p. 6)
• “(1) The HD classifier demands much less training data thanks to its simple and one-shot learning; (2) It also naturally operates with noisy and less preprocessed inputs; (3) There is no need for domain expert knowledge or electrode selection process. Last, but not least, the produced HD code is analyzable and interpretable.” (Rahimi et al., 2019, p. 15)
[1] A. Rahimi, P. Kanerva, L. Benini, and J. M. Rabaey, “Efficient biosignal processing using hyperdimensional computing: network templates for combined learning and classification of ExG signals,” Proceedings of the IEEE, vol. 107, no. 1, pp. 123–143, Jan. 2019, doi: 10.1109/JPROC.2018.2871163.

Omics at scale
Traditional alignment is slow and depends on sequence length. HDC converts sequences into fixed-size vectors, enabling constant-time matching and incredibly fast performance.
• Genomics (BioHD): Uses Processing-in-Memory (PIM) to achieve 100x speedups and massive energy efficiency compared to standard accelerators.
• Metagenomics (Demeter): A food profiler that is 100x faster and uses 30x less memory than state-of-the-art tools (Kraken2), enabling real-time monitoring on small devices.
• Epigenetics: Successfully classifies tumor vs. non-tumor samples based on methylation profiles.

Demeter: metagenomics profiler for food
• Infers relative microbial abundance based on metagenomics readings.
• Uses binary HVs with k-mer encoding.
• Accuracy is within 2% of the existing method (Kraken2).
• Using hardware acceleration, it can achieve more than a 100-fold throughput improvement and a 30-fold memory improvement, making real-time analysis possible!
Shahroodi, T., Zahedi, M., Firtina, C., Alser, M., Wong, S., Mutlu, O., Hamdioui, S., 2022. Demeter: A Fast and Energy-Efficient Food Profiler Using Hyperdimensional Computing in Memory. IEEE Access 10, 82493–82510. https://doi.org/10.1109/ACCESS.2022.3195878

Drug discovery: predicting pharmaceutical properties from molecules
• “For all the reported datasets together, MoleHD is able to achieve the reported accuracy within 10 minutes using CPU only from the commodity desktop”
• Can compete with graph convolutional neural networks.
[1] D. Ma, R. Thapa, and X. Jiao, “MoleHD: Ultra-Low-Cost Drug Discovery using Hyperdimensional Computing.” arXiv, Feb. 05, 2022. Accessed: Jun. 27, 2023. [Online]. Available: http://arxiv.org/abs/2106.02894

Trees of life: beyond sequences
Ongoing research by Carlos Vigil Vásquez
• Reconstruct the Tree-of-Life from metabolic network hypervectors
• Compare to reconstruction based on classical methodologies and metabolism-centric approaches
• Showcase the ability of hypervector representations to explain evolutionary events

HDC-based phylogeny
Effect of the encoding strategy on phylogenetic reconstruction (cont.)
Ongoing research together with Carlos Vigil Vásquez

Final exercise: transmembrane domains
• Dataset of 500 transmembrane domains of plant or bacterial origin
• Classify the origin based on sequence and/or ESM-2 embeddings
• Context: synthetic biology to transfer cytochrome P450 enzymes from plants to bacteria as a bioproduction platform

Further reading
• Our review paper on HDC for computational biology (link)
• The review papers by Denis Kleyko, such as this one and this one
• The papers of Pentti Kanerva, in particular this slide deck

Available software
• For many applications easy (and fun!) to implement from scratch
• Available libraries:
  • Torchhd
  • OpenHD
  • hdlib
  • 🚧 HyperdimensionalComputing.jl (contributions welcome)
• Make use of hardware accelerations (GPU, FPGA…)!

Acknowledgements and further reading
• This work has benefited from discussions with Dimi Boeckaerts, Steff Taelman, Maxime Van Haeverbeke, Pieter Dewulf, Bernard De Baets, Denis Kleyko, Carlos Vigil Vásquez and several thesis students.
• Thanks to Bram Spanoghe for many of the visuals.
• Thanks to Miguel De Block for the TMD dataset!
• Thanks to Tom Gorochowski for the invitation!

Q&A!
