
Find Similar

  • SAM detections start out labeled as unknown.
  • Find Similar uses embeddings to group visually similar objects and automatically propagate labels.
  • It combines unsupervised structure (K-Means clusters) with semi-supervised refinement (an LDA classifier) to scale labeling efficiently.

Key insight

Find Similar converts visual similarity into semantic labels, leveraging both automation and SME guidance for scalable dataset curation.


How it works

Pooling embeddings

  • Per-detection embeddings (leip_embedding) are extracted from the SAM feature map (see the sketch after this list).
  • Each embedding encodes appearance information: texture, shape, and size.
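
A minimal sketch of one plausible pooling step, assuming the SAM feature map is a NumPy array of shape (H, W, C) and each detection comes with a binary mask of shape (H, W); the helper name pool_detection_embedding and the masked-average strategy are illustrative, not the product's actual API.

```python
import numpy as np

def pool_detection_embedding(feature_map: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Average the feature vectors that fall inside a detection mask."""
    region = feature_map[mask.astype(bool)]  # (N, C) feature vectors under the mask
    if region.size == 0:
        # Empty mask: fall back to a zero vector of the right width.
        return np.zeros(feature_map.shape[-1], dtype=feature_map.dtype)
    return region.mean(axis=0)  # becomes the detection's leip_embedding

# One embedding per detection mask:
# embeddings = np.stack([pool_detection_embedding(fmap, m) for m in masks])
```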

Clustering

  • K-Means groups similar embeddings and assigns each detection a cluster_id.
  • A majority vote within each cluster propagates known labels to unknown detections when the vote exceeds a configurable threshold (see the sketch after this list).
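
A hedged sketch of the clustering and majority-vote step using scikit-learn; the function name propagate_by_cluster, the parameters n_clusters and vote_threshold, and the "unknown" sentinel are assumptions for illustration.

```python
from collections import Counter

import numpy as np
from sklearn.cluster import KMeans

def propagate_by_cluster(embeddings, labels, n_clusters=20, vote_threshold=0.6):
    """embeddings: (N, D) array; labels: length-N list with 'unknown' for unlabeled detections."""
    cluster_ids = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(embeddings)
    labels = np.asarray(labels, dtype=object)
    new_labels = labels.copy()
    for cid in np.unique(cluster_ids):
        in_cluster = cluster_ids == cid
        known = labels[in_cluster][labels[in_cluster] != "unknown"]
        if known.size == 0:
            continue  # nothing to propagate in an all-unknown cluster
        top_label, top_count = Counter(known).most_common(1)[0]
        # Propagate only when the majority label dominates the known votes.
        if top_count / known.size >= vote_threshold:
            new_labels[in_cluster & (labels == "unknown")] = top_label
    return cluster_ids, new_labels
```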

LDA (semi-supervised step)

  • Activated once enough labeled examples exist (min_classes_for_lda, min_examples_per_class).
  • Trains a simple linear classifier (Linear Discriminant Analysis) on the labeled embeddings.
  • Assigns unknowns to classes while preserving confirmed labels (see the sketch after this list).
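
A hedged sketch of the LDA refinement step with scikit-learn. The gating parameters mirror min_classes_for_lda and min_examples_per_class from above; the defaults, the "unknown" sentinel, and the function name refine_with_lda are illustrative.

```python
from collections import Counter

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def refine_with_lda(embeddings, labels, min_classes_for_lda=2, min_examples_per_class=5):
    """Train LDA on confirmed labels and assign classes to the remaining unknowns."""
    embeddings = np.asarray(embeddings)
    labels = np.asarray(labels, dtype=object)
    known = labels != "unknown"
    counts = Counter(labels[known])
    eligible = [cls for cls, n in counts.items() if n >= min_examples_per_class]
    if len(eligible) < min_classes_for_lda:
        return labels  # not enough confirmed labels yet; skip the LDA step
    train = known & np.isin(labels, eligible)
    clf = LinearDiscriminantAnalysis().fit(embeddings[train], labels[train])
    refined = labels.copy()
    if (~known).any():
        # Only unknowns are reassigned; confirmed labels are preserved.
        refined[~known] = clf.predict(embeddings[~known])
    return refined
```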

UMAP visualization

  • Projects embeddings into 2D for interpretability (see the sketch after this list).
  • SMEs can inspect clusters, correct misgrouped objects, and feed corrections back into Find Similar (FS).
  • Re-running FS updates clusters and propagates corrections.
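
A minimal sketch of the 2D projection for inspection, using the umap-learn and matplotlib packages; the function name plot_embeddings_2d and the plotting details are illustrative, not the product's built-in view.

```python
import matplotlib.pyplot as plt
import numpy as np
import umap  # provided by the umap-learn package

def plot_embeddings_2d(embeddings, labels):
    """Project detection embeddings to 2D and color points by their current label."""
    coords = umap.UMAP(n_components=2, random_state=0).fit_transform(np.asarray(embeddings))
    for label in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == label]
        plt.scatter(coords[idx, 0], coords[idx, 1], s=8, label=label)
    plt.legend(title="label")
    plt.title("UMAP projection of detection embeddings")
    plt.show()
```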

Why it matters

  • Minimizes manual labeling effort while maintaining quality.
  • Progressive pipeline: start unsupervised → inject SME labels → expand coverage with LDA.
  • Embeddings act as a bridge: unknown objects become semantically meaningful.
  • Visual feedback allows human-in-the-loop correction, preserving accuracy and trustworthiness.