Memorization in 3D Shape Generation: An Empirical Study

1Princeton University, 2Harvard University

Abstract

Generative models are increasingly used in 3D vision to synthesize novel shapes, yet it remains unclear whether their generation relies on memorizing training shapes. Understanding their memorization could help prevent training data leakage and improve the diversity of generated results. In this paper, we design an evaluation framework to quantify memorization in 3D generative models and study the influence of different data and modeling designs on memorization. We first apply our framework to quantify memorization in existing methods. Next, through controlled experiments with a latent vector-set (Vecset) diffusion model, we find that, on the data side, memorization depends on data modality, grows with data diversity and finer conditioning; on the modeling side, it peaks at moderate guidance and can be mitigated by longer Vecsets and simple rotation augmentation. Together, our framework and analysis provide an empirical understanding of memorization in 3D generative models and suggest simple yet effective strategies to reduce it without degrading generation quality.


Methodology

Memorization in 3D generative models is still underexplored. We propose a simple evaluation framework to quantify memorization in generated 3D shapes.

Distance Metrics

We benchmark seven candidate distance metrics for memorization detection on 133 generated shapes from four ShapeNet categories. A retrieval is counted as correct if the generated shape is visually near-identical to its retrieved nearest training neighbor.

As shown below, Light Field Distance (LFD) achieves the highest accuracy among all seven metrics. Therefore, we adopt LFD as our primary retrieval metric.

Metric     LFD    Uni3D   ULIP-2   CD     DinoV2   PointNet++   SSCD
Acc. (%)   78.4   74.8    66.2     46.8   20.1     18.7         7.2

Top-1 retrieval accuracy (%) of distance metrics on 133 generated shapes from four ShapeNet categories. LFD achieves the highest accuracy.
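
To make the retrieval protocol concrete, below is a minimal sketch of top-1 nearest-neighbor retrieval. It uses Chamfer distance on sampled point clouds, one of the benchmarked metrics, only because it fits in a few lines; LFD, the metric we actually adopt, requires rendering light-field descriptors and is not reproduced here. The function names and the brute-force loop are illustrative, not the paper's implementation.

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point clouds p (N, 3) and q (M, 3)."""
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)  # (N, M) pairwise distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def top1_retrieve(generated, training, metric=chamfer_distance):
    """For each generated shape, return the index of its nearest training shape."""
    return [int(np.argmin([metric(g, t) for t in training])) for g in generated]
```

Top-1 accuracy is then the fraction of retrieved neighbors that are judged visually near-identical to the corresponding generated shapes.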


Memorization Metric

With LFD as our retrieval metric, we define a memorization score using the Mann-Whitney U test. The test yields a standardized z-score ZU computed from the nearest-neighbor distances of generated samples (Q) and held-out test samples (P_test) to the training set (T).

When ZU < 0, the generated set is, on average, closer to the training data than the test set is. Thus, we treat ZU < 0 as evidence of memorization, with the strength of memorization increasing as ZU decreases.
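
Below is a minimal sketch of this score, assuming d_gen and d_test hold the nearest-neighbor LFD distances of generated and held-out test shapes to the training set. It uses the standard normal approximation of the U statistic; the exact normalization in the paper (e.g., tie handling and sign convention) may differ.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def memorization_score(d_gen, d_test):
    """Standardized Mann-Whitney U z-score.

    d_gen:  nearest-neighbor distances of generated samples to the training set.
    d_test: nearest-neighbor distances of held-out test samples to the training set.
    Negative values indicate the generated set is, on average, closer to the
    training data than the test set, i.e. evidence of memorization.
    """
    u, _ = mannwhitneyu(d_gen, d_test, alternative="two-sided")  # U statistic for d_gen
    n1, n2 = len(d_gen), len(d_test)
    mu = n1 * n2 / 2.0                                  # E[U] under the null
    sigma = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)     # Std[U] under the null (no tie correction)
    return (u - mu) / sigma
```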


Evaluation Framework

Our evaluation framework also includes a generation-quality indicator based on the Fréchet Distance (FD). We compare ZU only among models with similar FD values, so that memorization is decoupled from generation quality.
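
A minimal sketch of the FD computation is given below, assuming each shape has already been encoded into a fixed-length feature vector (the specific shape encoder is an assumption here, not stated in this section). FD is the Fréchet distance between Gaussians fitted to the generated and reference feature sets.

```python
import numpy as np
from scipy import linalg

def frechet_distance(feat_gen, feat_ref):
    """Fréchet distance between Gaussians fitted to generated and reference
    shape features (rows are shapes, columns are feature dimensions)."""
    mu_g, mu_r = feat_gen.mean(axis=0), feat_ref.mean(axis=0)
    cov_g = np.cov(feat_gen, rowvar=False)
    cov_r = np.cov(feat_ref, rowvar=False)
    covmean, _ = linalg.sqrtm(cov_g @ cov_r, disp=False)  # matrix square root of the covariance product
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard small imaginary parts from numerical error
    diff = mu_g - mu_r
    return float(diff @ diff + np.trace(cov_g + cov_r - 2.0 * covmean))
```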


Evaluating Existing Methods

Memorization on Single Category (ShapeNet Chairs)

Method                     ZU
LAS-Diffusion (uncond.)   -7.02
LAS-Diffusion (class)     -4.93
Wavelet Generation        -1.35
3DShape2VecSet             4.56
Michelangelo               9.25

Gen & Retrieved: image pairs of each generated shape (Gen) and its retrieved nearest training neighbor (Ori).

ZU scores on ShapeNet chairs. Lower scores indicate stronger memorization. The images show generated shapes (left) vs. their nearest training neighbors (right).


Memorization on Entire Training Sets

Method                   Split          ZU
LAS-Diffusion (class)    IM-NET          0.46
3DShape2VecSet           3DILG           1.07
Michelangelo             3DILG          -0.33
Trellis-small            Trellis500K    -0.67
Trellis-large            Trellis500K    -1.57
Trellis-xlarge           Trellis500K    -2.19

ZU scores evaluated on full datasets. Scores near zero indicate the model generalizes well rather than memorizing.

Qualitative Retrieval Results

Qualitative retrieval on ShapeNet Chairs. Earlier models (left) show strong memorization even at high percentiles (60th), while modern models (right) generate novel geometries.
Qualitative retrieval on Entire Datasets. Across all large-scale models, retrieved training shapes become visually distinct from generated samples at moderate percentiles (20th–60th).

Controlled Experiments

BibTeX


@article{pu2025memorization,
  title={Memorization in 3D Shape Generation: An Empirical Study},
  author={Pu, Shu and Zeng, Boya and Zhou, Kaichen and Wang, Mengyu and Liu, Zhuang},
  journal={arXiv preprint arXiv:2512.23628},
  year={2025}
}