Category: Contextual Hyper-Embedding

  • Comparison between Google S2R, Pribor’s Combinatorial Magic and Pribor’s CHE (Contextual Hyper-Embedding)

    This document presents the characteristics, differences and synergies between three approaches: Google's S2R, Pribor's Combinatorial Magic and Pribor's CHE (Contextual Hyper-Embedding).

    1. Google’s S2R *

    “S2R” stands for Speech-to-Retrieval. It is a recent voice-search architecture that Google is deploying, which bypasses the explicit speech-to-text transcription step and instead matches the spoken audio directly against the information sought. The model relies on a dual encoder: one encoder processes the audio, the other the candidate texts, so that their vector representations are brought close together in the same semantic space.
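    To make the dual-encoder matching step concrete, here is a minimal sketch (Python/NumPy). It is not Google's implementation: the audio and text embeddings are assumed to come from two already-trained encoders, and only the retrieve-by-similarity step is shown.

      import numpy as np

      def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
          # Similarity of two vectors living in the shared semantic space.
          return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

      def retrieve(audio_embedding: np.ndarray, candidates: dict) -> str:
          # Rank candidate texts by similarity to the spoken query's embedding
          # and return the best match, with no intermediate speech-to-text transcript.
          return max(candidates, key=lambda doc: cosine_similarity(audio_embedding, candidates[doc]))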

    2. Pribor’s Combinatorial Magic

    Combinatorial Magic is a bijective, lossless and fixed-dimensional encoding of simple sentences into 4D or 5D vectors: three symbolic components (Subject, Verb, Object) plus an 8- or 16-bit “meta” register. It is distinguished by O(1) complexity, a total absence of information loss, and perfect interpretability.
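    The following sketch illustrates what such a bijective (Subject, Verb, Object) + meta encoding could look like. The vocabularies and the 8-bit meta register below are hypothetical placeholders, not Pribor's actual symbol tables.

      # Illustrative vocabularies (placeholders).
      SUBJECTS = ["Alice", "Bob", "the cat"]
      VERBS = ["sees", "likes", "chases"]
      OBJECTS = ["Alice", "Bob", "the mouse"]

      def encode(subject, verb, obj, meta):
          # O(1) and lossless: each simple sentence maps to exactly one 4D integer vector.
          assert 0 <= meta < 256  # 8-bit meta register
          return (SUBJECTS.index(subject), VERBS.index(verb), OBJECTS.index(obj), meta)

      def decode(vector):
          # Inverse mapping: the encoding is bijective, so nothing is lost.
          s, v, o, meta = vector
          return (SUBJECTS[s], VERBS[v], OBJECTS[o], meta)

      vector = encode("the cat", "chases", "the mouse", meta=1)
      assert decode(vector) == ("the cat", "chases", "the mouse", 1)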

    3. CHE (Contextual Hyper-Embedding uint8)

    CHE is an extremely economical contextual encoding approach, which represents each token by a uint8 integer. Unlike the floating-point attention of Transformers, it avoids matrices and softmax, reducing energy consumption by a factor of up to 5000.
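    A small sketch of the memory footprint involved (the 1024-dimensional float32 baseline and the per-token codes below are illustrative assumptions, not CHE's actual mapping):

      import numpy as np

      tokens = ["the", "cat", "chases", "the", "mouse"]

      # CHE-style representation: one uint8 per token (placeholder codes).
      che_codes = np.array([17, 203, 54, 17, 88], dtype=np.uint8)

      # Baseline: one dense float32 embedding per token (1024 dimensions assumed).
      float_embeddings = np.zeros((len(tokens), 1024), dtype=np.float32)

      print(che_codes.nbytes)         # 5 bytes     -> 1 byte per token
      print(float_embeddings.nbytes)  # 20480 bytes -> 4096 bytes per token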

    4. Comparative Table

    Feature            | S2R (Google)       | Combinatorial Magic            | CHE
    Data type          | float16 / float32  | symbolic indices + meta uint8  | uint8
    Dimension          | 512–4096D          | 4D / 5D                        | 1 byte/token
    Complexity         | O(n²)              | O(1)                           | O(n) linear
    Information loss   | lossy              | none                           | bounded / quantized
    Energy efficiency  | low                | extreme                        | extreme (×500–5000)
    Interpretability   | low                | total                          | medium

    5. Synergies and Integration

    The three approaches can be integrated into a hybrid architecture: S2R provides the global semantic geometry, CHE ensures contextual efficiency through uint8 quantisation, and Combinatorial Magic formalises symbolic propositions without loss. This combination gives rise to a family of S2R–CHE–CM models combining semantic generalisation, energy frugality and complete interpretability.
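    As a way of picturing this combination, the sketch below chains the three stages. Every function body is a placeholder that only fixes the data flow (audio → retrieval → uint8 context → symbolic proposition), not any real implementation.

      def s2r_retrieve(audio):
          # S2R stage: match the spoken query to a candidate text in the shared
          # semantic space (placeholder result).
          return "the cat chases the mouse"

      def che_encode(text):
          # CHE stage: one uint8 code per token for cheap contextual processing
          # (placeholder hashing).
          return [hash(token) % 256 for token in text.split()]

      def cm_encode(text):
          # Combinatorial Magic stage: lossless symbolic (S, V, O, meta) vector
          # (placeholder indices).
          return (2, 2, 2, 0)

      text = s2r_retrieve(b"...")      # semantic geometry
      codes = che_encode(text)         # uint8 contextual codes
      proposition = cm_encode(text)    # interpretable symbolic proposition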

    * Ehsan Variani and Michael Riley, Research Scientists, Google Research, “Speech-to-Retrieval (S2R): A new approach to voice search”, October 7, 2025.

  • PRIBOR: CHE (Contextual Hyper-Embedding uint8)

    CHE (Contextual Hyper-Embedding uint8) is more economical than the standard attention of LLMs. Similar processes are already in use, but they are less economical than CHE.

    ————————————————–

    1. Memory savings

      Standard attention: float16/float32 matrices → 700 to 4000 bits per token

      CHE uint8 → 8 bits per token

    → a gain of ×500 to ×5000 in memory

    ————————————————–

    2. Similar processes already in use

      INT-FlashAttention (Peking University, 2024): attention entirely in INT8, 72% faster, 82% less error

      SageAttention (OpenReview, 2024): INT8 attention + smoothing, plug-and-play

      LLM.int8() (NeurIPS 2022): matrix multiplication entirely in INT8

    → uint8 is already standard in quantized attention.

    ————————————————–

    3. Compatibility with CHE

      CHE = compressed uint8 (SHA-256[0:8]) → 8 bits per token (see the sketch below)

      No 700×700 matrix, no softmax, no floats;

      Just one uint8 within the ℝ⁴ triplet;

    → More economical, and already in use in quantized attention.
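    Read literally, “SHA-256[0:8]” suggests keeping the first 8 bits (one byte) of a token's SHA-256 digest. The sketch below illustrates that reading; it is an assumption, not a published specification of CHE.

      import hashlib

      def che_uint8(token):
          # Map a token to a single uint8: the first byte (first 8 bits) of its
          # SHA-256 digest, which is one reading of "SHA-256[0:8]" above.
          return hashlib.sha256(token.encode("utf-8")).digest()[0]

      print(che_uint8("cat"))  # a value in 0..255, stable across runs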

    Contact: pauljorion@pribor.ai