Discovery Queries

The Discovery API has three top-level queries for browsing, searching, and autocomplete. They are all optimized for read performance.

The Discovery API offers three query types — browse, search, and autocomplete — that share the same toolbox: full-text search, filtering, faceting, sorting, pagination, ranking, and personalization. This page is the reference for those shared concepts; the per-query pages cover how each query type applies them.

Browse queries

Browse queries are your go-to for fetching items of a given shape from your Crystallize tenant. They share the same capabilities as search queries — filtering, faceting, sorting, and pagination — but target a single shape.

Typical use cases:

  • Fetching all products from a specific path
  • Fetching all categories
  • Fetching the latest blog posts matching a topic

See a sample browse query below.

Search queries

Search queries have the same capabilities as browse queries, but query across every shape in your tenant. Because the result set can contain documents of different shapes, your query must handle each shape it expects to receive.

Autocomplete queries

Autocomplete queries are designed for partial matching, typically returning a small subset of the data available through search and browse.

Use cases:

  • Typeahead in search
  • Auto-completion of input fields

Indexed fields, types & vocabularies

Every Discovery capability operates on the fields you index. Each indexed field has a type that determines how it can be used — filtered, faceted, sorted, or used to rank results.

Fields are referenced in GraphQL by their mapped name (graphqlMapping), a GraphQL-safe alias for the underlying document path — for example a path such as variants.stock is exposed under a normalized name, since dots and colons are not valid in GraphQL field names. The supported field types and what each one supports:

Field type | Filter | Facet | Sort | rankBy
token
string / keyword
autocomplete | ✓ (typeahead)
number
date
boolean
geo (GEO_POINT) | ✓ (geo)

Vocabularies are named vector (embedding) spaces configured per tenant. Each item stores a materialized vector per vocabulary (vectors.<vocabulary>). Vocabularies power the vector-based capabilities: tasteCosine ranking, userTaste personalization, and nearestTo similarity search.

Direct filter parameters

Alongside the structured filter argument, queries accept top-level parameters that scope what is searched:

  • language — one or more languages to query (defaults to the tenant's default language).
  • publicationState — one or more of published or draft (defaults to published).
  • path — restrict results to a branch of the catalogue tree.
  • pathResolutionMethod — how path is matched: Canonical, Alias, History, or Shortcut.
  • topic — restrict results to items tagged with a given topic.
  • parentId — fetch the direct children of an item or topic (takes precedence over path on the topics query).
  • itemId — identify a specific item (for example, as the anchor of a nearestTo search).

Filtering

The filter argument is a tree of field conditions. Each filterable field appears under its mapped name and accepts a type-specific set of operators:

Operator | Applies to | Description
exists | all types | Whether the field is present on the document.
equals / not_equals | string, number, date, boolean | Exact (in)equality.
in / not_in | string, number, date | Value is (not) one of a list.
contains / not_contains | string | Substring match.
phrase | string | Match an ordered sequence of words.
regex / not_regex | string | Regular-expression match.
range (gt, gte, lt, lte) | number, date | Open or closed range bounds.
autocomplete | autocomplete | Prefix/typeahead match against a partial term; accepts fuzzy options.
within | geo | Geo containment (see Geo filtering).

Geo filtering

Fields of type GEO_POINT support the within operator, which matches points contained in a region. Three region shapes are available:

  • box — an axis-aligned rectangle defined by bottomLeft and topRight points.
  • circle — a center point and a radius in meters.
  • polygon — an ordered list of points forming a closed area.

Each point is a { lat, long } pair.
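
To make the box and circle shapes concrete, here is a small local model of what "contained in a region" means for { lat, long } points. It is only a sketch: the function names are hypothetical, the haversine formula and Earth radius are assumptions about how distance is measured, and the box check ignores rectangles that cross the antimeridian. The server's actual implementation may differ.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (approximation)


def in_box(point, bottom_left, top_right):
    """True when point lies inside the axis-aligned rectangle."""
    return (bottom_left["lat"] <= point["lat"] <= top_right["lat"]
            and bottom_left["long"] <= point["long"] <= top_right["long"])


def haversine_m(a, b):
    """Great-circle distance in meters between two { lat, long } points."""
    lat1, lat2 = math.radians(a["lat"]), math.radians(b["lat"])
    dlat = lat2 - lat1
    dlong = math.radians(b["long"] - a["long"])
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlong / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def in_circle(point, center, radius_m):
    """True when point is at most radius_m meters from center."""
    return haversine_m(point, center) <= radius_m


oslo = {"lat": 59.9139, "long": 10.7522}
bergen = {"lat": 60.3913, "long": 5.3221}
```

With these sample points, Oslo falls inside a box spanning lat 59 to 61 and long 10 to 11, while Bergen does not; Bergen is outside a 100 km circle around Oslo but inside a 400 km one.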

Boolean composition

Field conditions combine through nested AND and OR lists, each holding further filter objects, so you can express arbitrarily nested boolean logic. Conditions on different fields at the same level are combined with AND.

On tenant-wide queries (no single shape), type_in restricts results to a list of item types.
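
The composition rules can be illustrated with a tiny evaluator that applies a filter tree to a single document in memory. This is a sketch of the semantics described above, not the API's implementation: the dictionary layout (field names as keys, AND and OR holding lists of nested filters) and the field names in the example are assumptions, and only a few operators are modelled.

```python
def in_range(value, bounds):
    """Check value against any combination of gt, gte, lt, lte bounds."""
    checks = {
        "gt": lambda v, b: v > b,
        "gte": lambda v, b: v >= b,
        "lt": lambda v, b: v < b,
        "lte": lambda v, b: v <= b,
    }
    return all(checks[name](value, bound) for name, bound in bounds.items())


def field_matches(actual, condition):
    """Check one field against an operator dict such as {"equals": "red"}."""
    for op, expected in condition.items():
        if op == "exists":
            ok = (actual is not None) == expected
        elif op == "equals":
            ok = actual == expected
        elif op == "in":
            ok = actual in expected
        elif op == "range":
            ok = actual is not None and in_range(actual, expected)
        else:
            raise ValueError(f"operator not modelled in this sketch: {op}")
        if not ok:
            return False
    return True


def matches(doc, flt):
    """Evaluate a nested filter tree against a flat document dict."""
    results = []
    for key, value in flt.items():
        if key == "AND":
            results.append(all(matches(doc, sub) for sub in value))
        elif key == "OR":
            results.append(any(matches(doc, sub) for sub in value))
        else:
            results.append(field_matches(doc.get(key), value))
    # Conditions at the same level are combined with AND.
    return all(results)


# Hypothetical filter: color is red or blue, AND (price < 100 OR onSale).
example_filter = {
    "color": {"in": ["red", "blue"]},
    "OR": [
        {"price": {"range": {"lt": 100}}},
        {"onSale": {"equals": True}},
    ],
}
```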

Faceting

Facets return aggregated counts alongside results, for building filterable navigation. Request facets per field; the available facet shapes depend on the field type:

  • String facets — counts per distinct value. An optional limit (default 10) caps the number of buckets.
  • Number facets — counts grouped into ranges you define with boundaries, plus an optional limit.
  • Date facets — counts grouped into date ranges defined with boundaries, plus an optional limit.

Every facet accepts an optional key — the name under which its result appears in the response. Facet results are returned on the summary (see Results & SearchSummary).
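
As a rough illustration of what the facet shapes compute, the sketch below counts buckets over an in-memory list of values. It assumes that consecutive boundaries form half-open ranges (lower bound included, upper bound excluded) and that values outside every range are ignored; the source does not state this, so treat the bucketing rule as an assumption. The function names are hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def string_facet(values, limit=10):
    """Counts per distinct value, keeping the `limit` largest buckets."""
    return Counter(values).most_common(limit)


def number_facet(values, boundaries):
    """Counts per half-open range [boundaries[i], boundaries[i + 1])."""
    counts = [0] * (len(boundaries) - 1)
    for value in values:
        index = bisect_right(boundaries, value) - 1
        if 0 <= index < len(counts):  # values outside all ranges are skipped
            counts[index] += 1
    return [((boundaries[i], boundaries[i + 1]), count)
            for i, count in enumerate(counts)]


colors = ["red", "blue", "red", "green", "red", "blue"]
prices = [5, 50, 99, 100, 250]
```

For the sample data, `string_facet(colors, limit=2)` keeps the two most frequent colors, and `number_facet(prices, [0, 50, 100, 200])` places 50 in the second range and drops 250.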

Sorting

The sorting argument orders results by one or more fields, each with a direction of asc or desc. Token, number, and date fields are sortable. You can also sort by score, the relevance (or rankBy) score of each hit.
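
Multi-field sorting means that later fields only break ties left by earlier ones. A minimal in-memory sketch of that behaviour, assuming hits are plain dictionaries and the sorting list is given as hypothetical (field, direction) pairs:

```python
def sort_hits(hits, sorting):
    """Order hits by several (field, direction) pairs, most significant first."""
    ordered = list(hits)
    # Apply the least significant key first; Python's sort is stable,
    # so earlier passes survive as tie-breakers for later ones.
    for field, direction in reversed(sorting):
        ordered.sort(key=lambda hit: hit[field], reverse=(direction == "desc"))
    return ordered


hits = [
    {"name": "a", "price": 10, "score": 1.0},
    {"name": "b", "price": 5, "score": 2.0},
    {"name": "c", "price": 10, "score": 3.0},
]
by_price_then_score = sort_hits(hits, [("price", "desc"), ("score", "desc")])
```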

Ranking & relevance (rankBy)

rankBy gives you structured control over result ordering. It evaluates a list of weighted terms and sums them into a final score (finalScore = Σ weight × signal). Each term selects one signal:

  • relevance — full-text match strength against the search term.
  • tasteCosine — cosine similarity between a taste vector and the item's stored vector for a vocabulary; requires vocabulary and from (userTaste or nearestTo).
  • recency — exponential time-decay favouring newer items; requires a date field and halfLifeDays.
  • fieldBoost — uses a numeric field's value directly as the signal; requires field.
  • inStockBoost — contributes 1 when a numeric field is greater than 0; requires field.

Each term takes a weight (negative weights penalize). Term-level options include vocabulary, from, field, halfLifeDays, and normalize (min–max normalize the term across the result window before weighting). rankBy itself takes a tieBreaker field for stable ordering of equal scores, and explain to include a per-term score breakdown on each hit. Scoring runs within a bounded top-K window of the text-search candidates.
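
The weighted-sum formula can be sketched locally as follows. Several details here are assumptions rather than documented behaviour: recency is modelled as 0.5 raised to (age / halfLifeDays), the tie-breaker is applied in ascending order, and each term's signal is supplied as a hypothetical Python callable instead of a named signal.

```python
def recency(age_days, half_life_days):
    """Exponential decay: 1.0 for a new item, 0.5 after one half-life."""
    return 0.5 ** (age_days / half_life_days)


def min_max(values):
    """Rescale values to the 0..1 range across the result window."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank(hits, terms, tie_breaker):
    """Score each hit as the sum of weight * signal, highest score first."""
    totals = [0.0] * len(hits)
    for term in terms:
        signals = [term["signal"](hit) for hit in hits]
        if term.get("normalize"):
            signals = min_max(signals)
        for i, signal in enumerate(signals):
            totals[i] += term["weight"] * signal
    scored = [dict(hit, finalScore=total) for hit, total in zip(hits, totals)]
    return sorted(scored, key=lambda h: (-h["finalScore"], h[tie_breaker]))


hits = [
    {"sku": "a", "relevance": 2.0, "ageDays": 0, "stock": 0},
    {"sku": "b", "relevance": 1.0, "ageDays": 30, "stock": 5},
    {"sku": "c", "relevance": 2.0, "ageDays": 30, "stock": 5},
]
terms = [
    {"weight": 1.0, "signal": lambda h: h["relevance"], "normalize": True},
    {"weight": 0.5, "signal": lambda h: recency(h["ageDays"], 30)},
    {"weight": 0.25, "signal": lambda h: 1.0 if h["stock"] > 0 else 0.0},
]
ranked = rank(hits, terms, tie_breaker="sku")
```

In this example, items a and c both reach a final score of 1.5 (a through recency, c through the in-stock boost), so the tie-breaker decides their order.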

Personalization (context.userTaste)

context.userTaste personalizes ranking using the current shopper's taste. Provide one taste vector per vocabulary; each is compared (cosine similarity) against items' stored vectors and the per-vocabulary scores are summed.

Each userTaste entry has three fields:

  • vocabulary — the embedding space these weights belong to.
  • weights — a sparse vector, given as a JSON object mapping each dimension key (e.g. "color:red") to a preference number: positive likes, negative dislikes, magnitude is strength.
  • magnitude — the vector's L2 length (√Σ weightᵢ²), which you precompute so the server can normalize the similarity without recomputing it per item.
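
A short sketch of how a client might precompute magnitude and how it enters the cosine similarity. The helper names, the "style" vocabulary, and the dimension keys are hypothetical; the scoring function is a local model of the description above, not the server's code.

```python
import math


def user_taste_entry(vocabulary, weights):
    """Build one userTaste entry, precomputing the L2 length of the weights."""
    magnitude = math.sqrt(sum(w * w for w in weights.values()))
    return {"vocabulary": vocabulary, "weights": weights, "magnitude": magnitude}


def taste_score(entries, item_vectors):
    """Sum the per-vocabulary cosine similarities for one item."""
    total = 0.0
    for entry in entries:
        item = item_vectors.get(entry["vocabulary"], {})
        item_norm = math.sqrt(sum(v * v for v in item.values()))
        if entry["magnitude"] == 0 or item_norm == 0:
            continue  # no usable vector on one side: contributes nothing
        dot = sum(w * item.get(key, 0.0) for key, w in entry["weights"].items())
        total += dot / (entry["magnitude"] * item_norm)
    return total


# Likes red (3), dislikes blue (-4): magnitude is sqrt(9 + 16) = 5.
entry = user_taste_entry("style", {"color:red": 3.0, "color:blue": -4.0})
```

An item whose "style" vector is `{"color:red": 1.0}` scores 3 / (5 × 1) = 0.6 against this entry.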
Tip: Taste in rankBy

For per-vocabulary weighting, or to blend taste with other signals, use rankBy with tasteCosine terms (from: userTaste) instead of relying on context.userTaste alone.

Similarity search (nearestTo)

nearestTo powers "more like this": it ranks items by similarity to an anchor item's own stored vector, rather than a supplied taste vector. It takes vocabulary (which vector space), like (the anchor, identified by exactly one of sku or itemId), and k (how many nearest neighbours to return). The anchor item itself is excluded from the results.
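
Conceptually, this is a k-nearest-neighbour lookup over the stored vectors of one vocabulary. The sketch below models it in memory with sparse vectors and cosine similarity; the item identifiers and dimension keys are made up, and a real engine would use an index rather than a full scan.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors given as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def nearest_to(vectors, anchor_id, k):
    """Return the ids of the k items most similar to the anchor item."""
    anchor = vectors[anchor_id]
    scored = [
        (cosine(anchor, vector), item_id)
        for item_id, vector in vectors.items()
        if item_id != anchor_id  # the anchor itself is excluded
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item_id for _, item_id in scored[:k]]


vectors = {
    "red-shirt": {"color:red": 1.0, "type:shirt": 1.0},
    "red-polo": {"color:red": 1.0, "type:shirt": 0.5},
    "red-dress": {"color:red": 1.0, "type:dress": 1.0},
    "blue-jeans": {"color:blue": 1.0, "type:jeans": 1.0},
}
similar = nearest_to(vectors, "red-shirt", k=2)
```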

Typo tolerance (fuzzy)

Full-text and autocomplete matching can tolerate typos through options.fuzzy:

  • fuzziness — maximum single-character edits allowed to still match: NONE, SINGLE, or DOUBLE (default NONE).
  • prefixLength — number of leading characters that must match exactly (default 0).
  • maxExpensions — maximum number of term variations to generate and search (default 50).
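
To see how fuzziness and prefixLength interact, the following sketch checks a query against a single indexed term. It assumes plain Levenshtein distance, where a transposition counts as two edits; the engine may count edits differently. The term-expansion limit is not modelled, and the function names are hypothetical.

```python
MAX_EDITS = {"NONE": 0, "SINGLE": 1, "DOUBLE": 2}


def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + (char_a != char_b),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def fuzzy_match(query, term, fuzziness="NONE", prefix_length=0):
    """True when query matches term within the allowed number of edits."""
    # The first prefix_length characters must match exactly.
    if query[:prefix_length] != term[:prefix_length]:
        return False
    return edit_distance(query, term) <= MAX_EDITS[fuzziness]
```

For example, "jaket" matches "jacket" with SINGLE but not with NONE, and "kacket" stops matching once prefix_length is 1.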

Results & SearchSummary

Every browse and search response contains hits (the matched documents) and a summary.

Each hit is a document exposing common fields such as itemId, name, path, language, type, shape, publicationState, the timestamps createdAt / updatedAt / publishedAt / indexedAt, its score, and a paginationToken for cursor pagination.

The summary carries query-level metadata:

  • totalHits — total number of matching documents.
  • hasMoreHits / hasPreviousHits — whether further pages exist (when paging with skip).
  • endToken — cursor token for the next page (use with after).
  • facets — the aggregated facet results requested via facets.
  • priceRange — the min and max price across the result set (arguments: priceIdentifier, quantity).
  • profiling — execution metadata such as executionTime.
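
Cursor pagination with endToken and after can be sketched as a simple loop. Here fetch_page is a hypothetical callable standing in for whatever client code executes the query; it is assumed to take the after token and return a dictionary with hits and summary. The fake backend below exists only so the loop can run without a network call, and its token format is invented.

```python
def iter_hits(fetch_page):
    """Yield every hit by following summary.endToken until it runs out."""
    after = None
    while True:
        page = fetch_page(after)
        hits = page["hits"]
        if not hits:
            return
        yield from hits
        after = page["summary"].get("endToken")
        if after is None:
            return


def make_fake_fetch(items, page_size):
    """Stand-in backend: tokens are just stringified list offsets."""
    def fetch_page(after):
        start = int(after) if after is not None else 0
        chunk = items[start:start + page_size]
        end = start + len(chunk)
        return {
            "hits": chunk,
            "summary": {
                "totalHits": len(items),
                "endToken": str(end) if end < len(items) else None,
            },
        }
    return fetch_page


all_hits = list(iter_hits(make_fake_fetch(["a", "b", "c", "d", "e"], page_size=2)))
```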

The topics query

The topics query returns the topics of your tenant. It accepts language, and either path or parentId (parentId takes precedence) to fetch the children of a given topic. Each returned topic exposes id, name, path, parentId, displayColor, and meta.
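
Because each topic carries both id and parentId, a flat list of topics can be assembled into a tree on the client. The sketch below assumes that root topics have a parentId of None and that the whole list is already in memory; both are assumptions, and the sample topics are made up.

```python
from collections import defaultdict


def build_topic_tree(topics):
    """Nest a flat list of topics under their parents using parentId."""
    children = defaultdict(list)
    for topic in topics:
        children[topic["parentId"]].append(topic)

    def attach(parent_id):
        return [
            dict(topic, children=attach(topic["id"]))
            for topic in children.get(parent_id, [])
        ]

    return attach(None)


topics = [
    {"id": "1", "name": "Clothing", "parentId": None},
    {"id": "2", "name": "Shirts", "parentId": "1"},
    {"id": "3", "name": "Jeans", "parentId": "1"},
    {"id": "4", "name": "Recipes", "parentId": None},
]
tree = build_topic_tree(topics)
```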

Price queries

Product documents expose price helpers so you can resolve prices directly from Discovery results:

  • priceVariantFor(identifier, marketIdentifiers, count) — the price for a given price variant, optionally scoped to markets and a line quantity (for tier prices).
  • priceVariantForEveryone(identifier, count) — the market-agnostic price for a price variant.
  • bestPriceList / bestPriceListFor / bestPriceListForEveryone(identifier, …) — the resolved price list for a price variant (optionally per market).

At the result-set level, summary.priceRange returns the min and max price across all hits.