Native Hierarchical and Compositional Representations with Subspace Embeddings

Code for the KDD 2026 paper "Native Hierarchical and Compositional Representations with Subspace Embeddings".

WordNet reconstruction

The WordNet noun hierarchy is fetched from nltk.corpus, so no manual download is required.

Train (128×128 projection matrices, ridge λ = 0.2, noun synsets):

python train_wordnet_reconstruction.py --N 128 --D 128 --lbd 0.2 --synset n

The resulting ReconstructionData (optimized embeddings + optimization config) is saved to:

./wn_r_embeddings/{synset}_{N}x{D}_{lbd}_{group_size}/
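To find the output folder for a given run, the template above can be filled in by hand. The helper below is a minimal sketch, not part of the repository: reconstruction_dir is a hypothetical name, group_size=1 is a placeholder because the default is not stated here, and the training script may format λ differently (for example 0.20 instead of 0.2).

```python
from pathlib import Path


def reconstruction_dir(synset: str, n: int, d: int, lbd: float, group_size: int,
                       root: str = "./wn_r_embeddings") -> Path:
    """Fill in the {synset}_{N}x{D}_{lbd}_{group_size} template (hypothetical helper)."""
    return Path(root) / f"{synset}_{n}x{d}_{lbd}_{group_size}"


# group_size=1 is an assumed placeholder; use the value from your own training run.
print(reconstruction_dir("n", 128, 128, 0.2, group_size=1))
```

If the printed folder does not exist after training, list ./wn_r_embeddings/ to see the name the script actually produced.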

Evaluate:

python eval_wordnet_reconstruction.py \
    --embed-path <path to the ReconstructionData saved above> \
    --device cuda

HyperLex

Download HyperLex from https://github.com/cambridgeltl/hyperlex, then:

python eval_hyperlex.py \
    --embed-path <path to the ReconstructionData> \
    --hyperlex-path <hyperlex>/nouns-verbs/hyperlex-nouns.txt
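HyperLex results are conventionally reported as Spearman's rank correlation between model scores and human ratings. Whether eval_hyperlex.py computes it exactly this way is an assumption; the sketch below only illustrates the metric on made-up numbers, using the standard library.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up values for illustration only; they are not taken from HyperLex.
human_ratings = [9.5, 7.0, 3.2, 0.5]
model_scores = [0.91, 0.60, 0.35, 0.10]
print(spearman(human_ratings, model_scores))
```

Because only ranks matter, the model scores do not need to be on the same scale as the human ratings.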

WordNet link prediction

Download the WordNet splits from https://github.com/lapras-inc/disk-embedding/tree/master/data/maxn. The directory should contain:

noun_closure.tsv.vocab
noun_closure.tsv.train_{percent}percent
noun_closure.tsv.valid
noun_closure.tsv.test
noun_closure.tsv.full_neg
noun_closure.tsv.valid_neg
noun_closure.tsv.test_neg
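Before training, it can help to confirm that the folder is complete. The check below is a minimal sketch, not part of the repository: missing_files is a hypothetical helper, and it assumes that --closure 0.1 corresponds to the file noun_closure.tsv.train_10percent.

```python
from pathlib import Path

# Suffixes taken from the file list above; the training split is added separately.
REQUIRED_SUFFIXES = ["vocab", "valid", "test", "full_neg", "valid_neg", "test_neg"]


def missing_files(root, percent):
    """Return the expected file names that are absent from `root`."""
    root = Path(root)
    names = [f"noun_closure.tsv.{suffix}" for suffix in REQUIRED_SUFFIXES]
    names.append(f"noun_closure.tsv.train_{percent}percent")
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    # "./data/maxn" is a placeholder; point it at your own download location.
    for name in missing_files("./data/maxn", percent=10):
        print("missing:", name)
```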

Train (10% closure coverage, 128×128 projection matrices, ridge λ = 0.2, γ⁺ = 0.8, γ⁻ = 0.1):

python train_wordnet_lp.py \
    --dataset-path <root folder of the files above> \
    --closure 0.1 \
    --gamma-pos 0.8 --gamma-neg 0.1 \
    --N 128 --D 128 --lbd 0.2
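The file names and the --closure flag refer to the transitive closure of the hypernymy relation: every (descendant, ancestor) pair implied by the direct edges. The toy example below is a sketch of that notion only; the hierarchy is invented, and how train_wordnet_lp.py selects its 10% of closure pairs is not described here.

```python
def transitive_closure(edges):
    """Return all (descendant, ancestor) pairs implied by direct (child, parent) edges."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, set()).add(parent)

    closure = set()
    for node in parents:
        stack = list(parents[node])
        seen = set()
        while stack:
            ancestor = stack.pop()
            if ancestor in seen:
                continue  # already handled; also guards against cycles
            seen.add(ancestor)
            closure.add((node, ancestor))
            stack.extend(parents.get(ancestor, ()))
    return closure


# Invented toy hierarchy, not WordNet data.
direct = [("poodle", "dog"), ("dog", "mammal"), ("mammal", "animal")]
full = transitive_closure(direct)
implied = full - set(direct)  # pairs that hold only by transitivity
print(sorted(implied))
```

A lower closure percentage presumably means the model sees fewer of the implied pairs during training and has to infer the rest.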

The resulting LinkPredictionData is saved to:

./wn_lp_embeddings/{seed}_{int(100*closure)}_wordnet_subspace_{N}x{D}_{lbd}_{group_size}/

Evaluate:

python eval_wordnet_lp.py \
    --embed-path <path to the LinkPredictionData saved above> \
    --dataset-path <same root folder used for training>

SNLI

Train (sentence-transformers/all-mpnet-base-v2, 128×128 projection matrices, two-way):

python train_nli.py \
    --base-model-name sentence-transformers/all-mpnet-base-v2 \
    --N 128 --D 128 --two-way

The NLITrainingData (state dict + training config) is saved to:

./nli_models/{base_model}_{N}x{D}_lbd{lbd}_context{max_length}_seed{seed}[_2way][_benchmark]/
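The bracketed suffixes in the template are optional. The sketch below shows one way to assemble the name expected by --model-name; nli_run_name is a hypothetical helper, the values for lbd, max_length, and seed are placeholders, and how the script turns a model ID containing a slash into {base_model} is not stated here.

```python
def nli_run_name(base_model, n, d, lbd, max_length, seed,
                 two_way=False, benchmark=False):
    """Assemble a run name following the template above (hypothetical helper)."""
    name = f"{base_model}_{n}x{d}_lbd{lbd}_context{max_length}_seed{seed}"
    if two_way:
        name += "_2way"
    if benchmark:
        name += "_benchmark"
    return name


# lbd, max_length, and seed are assumed placeholders, not documented defaults.
print(nli_run_name("all-mpnet-base-v2", 128, 128, lbd=0.2, max_length=128,
                   seed=0, two_way=True))
```

If in doubt, list ./nli_models/ after training and copy the folder name from there.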

Evaluate:

python eval_snli.py --root ./nli_models --model-name <name generated above>

Citation

@inproceedings{moreira2026native,
  author    = {Moreira, Gabriel and Marinho, Zita and Marques, Manuel and Costeira, Jo{\~a}o Paulo and Xiong, Chenyan},
  title     = {Native Hierarchical and Compositional Representations with Subspace Embeddings},
  booktitle = {Proceedings of the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '26)},
  year      = {2026},
}
