Rule Matching Algorithm
Rules can contradict or override each other. On this page, we explain the logic that we use to handle contradicting Rules and consequences.
Conflicting Rules
Your index may contain many Rules, so you need to be careful not to repeat yourself, or create conflicting Rules. Most Rules target different and distinct situations, so this is usually not a concern.
However, consider the following two Rules:
- Rule 1: if a query contains “potter”, then promote “Harry Potter’s Deluxe DVD collection” to the first result.
- Rule 2: if a query contains “potter”, then promote “Harry Potter 8” to the first result.
What would be the first result, “Harry Potter’s Deluxe DVD collection” or “Harry Potter 8”? In other words, which Rule wins?
The Rules engine resolves this kind of conflict with a preset precedence logic, outlined below, which evaluates a fixed set of criteria in order.
Precedence acts mainly along two axes: specificity (the more specific a Rule is, the higher its precedence—similar to CSS selectors) and query text.
Note that multiple Rules can still match a given query, provided they match a distinct set of words.
Precedence logic
Algolia uses a tie-breaking algorithm, much like the ranking formula, to determine the precedence of all the Rules that apply to any given query. In other words, a criterion is only considered when all its preceding criteria rank equal.
The precedence logic, ranked by importance, is as follows:
- Position: the earliest match wins (i.e., closest to the beginning of the query string).
- Match length: the longest match wins; match length is determined by the number of words matched from the query string and the number of filters matched from the query parameters.
- Anchoring: is > starts with > ends with > contains.
- Context: a contextual Rule has higher priority than a general Rule or than a Rule using filters.
- Filters: a Rule using filters has higher priority than a general Rule.
- Literals over placeholders: if a word could match either a literal or a facet value, the literal takes precedence.
- Temporary over permanent: if both a temporary and a permanent Rule match, the temporary Rule takes precedence.
- Rule ID: if there are still conflicts after all other criteria have been applied, we take the smallest objectID in lexicographical order. This final criterion is guaranteed to break every tie. It most likely applies only when there are duplicate Rules.
Essentially, Rules apply from the beginning of the query to its end.
Note that the precedence logic applies the same way with multi-conditional Rules. The engine evaluates each condition as if it were an independent Rule.
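The tie-breaking order listed above can be sketched as a sort key, where each criterion only matters when all earlier ones are equal. This is a minimal illustration, not the engine's actual implementation: the `RuleMatch` record, its field names, and the `winner` helper are hypothetical and exist only for this example.

```python
from dataclasses import dataclass

# Anchoring strength, strongest first (a lower rank wins).
ANCHORING_RANK = {"is": 0, "startsWith": 1, "endsWith": 2, "contains": 3}


@dataclass(frozen=True)
class RuleMatch:
    """Hypothetical record describing how one Rule matched a query."""
    object_id: str
    position: int            # index of the first matched query word
    match_length: int        # matched words plus matched filters
    anchoring: str           # one of the ANCHORING_RANK keys
    has_context: bool = False
    has_filters: bool = False
    literal: bool = True     # False if matched through a facet placeholder
    temporary: bool = False


def precedence_key(m: RuleMatch) -> tuple:
    """Build a tuple that sorts the highest-precedence match first."""
    return (
        m.position,                   # earliest match wins
        -m.match_length,              # longest match wins
        ANCHORING_RANK[m.anchoring],  # is > starts with > ends with > contains
        not m.has_context,            # contextual Rules first
        not m.has_filters,            # then Rules using filters
        not m.literal,                # literals over placeholders
        not m.temporary,              # temporary over permanent
        m.object_id,                  # smallest objectID breaks any remaining tie
    )


def winner(matches):
    """Return the match with the highest precedence."""
    return min(matches, key=precedence_key)
```

Because Python compares tuples element by element, a later field is only consulted when every earlier field is equal, which mirrors the tie-breaking behavior described above.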
Conflicting consequences
We only use the precedence logic above when two or more conditions conflict. If the conditions don't conflict, precedence doesn't apply, even if the consequences do.
So what happens when two consequences conflict, or overlap? For example:
- Rule 1: if a query contains “Shakespeare”, then promote “The Lost Shakespeare Diaries” to the first result.
- Rule 2: if a query contains the phrase “how much is”, then promote your company’s “Full Price List” to the first result.
What happens when the query is “how much is Shakespeare”?
Here, the conditions don't conflict because they are not the same. Yet the consequences overlap: two different consequences are competing for the first position in the results. There is no clear precedence logic to determine the winner in this situation. The result can sometimes depend on the creation date of the Rules. It's best to avoid this kind of situation.
Edge cases
- If a Rule removes a word from the query string, all subsequent Rules that would have been triggered by this word (be it via a literal or a facet placeholder) are disabled.
- If a Rule replaces the query string entirely, all subsequent Rules are disabled.
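The two edge cases above can be illustrated with a small simulation. This is a sketch under simplifying assumptions: each Rule is triggered by a single word, the Rules are already sorted by precedence, and the dictionary keys (`id`, `trigger`, `remove_trigger`, `replace_query`) are hypothetical names chosen for this example, not fields of the Rules API.

```python
def triggered_rules(query, rules):
    """Return the IDs of the Rules that fire, in order.

    `rules` is assumed to be sorted by precedence already.
    """
    words = query.split()
    fired = []
    for rule in rules:
        if rule["trigger"] not in words:
            continue  # the trigger word is absent, or an earlier Rule removed it
        fired.append(rule["id"])
        if rule.get("replace_query") is not None:
            break  # query replaced entirely: all subsequent Rules are disabled
        if rule.get("remove_trigger"):
            # Removing the word disables later Rules that depend on it.
            words = [w for w in words if w != rule["trigger"]]
    return fired
```

For example, if a first Rule removes "cheap" from the query "cheap phone", a second Rule triggered by "cheap" no longer fires, while a Rule triggered by "phone" still does.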
Some further considerations
Hit promotion’s effect on relevance
Only objects coming from the same index can be promoted. Promoted objects have to be explicitly identified by their objectID.
A promoted object is considered a hit, even if it doesn’t match the query. If it matches the query, it is removed from its original position and inserted at its promoted position, even if the original position was better than the promoted position (in other words, promoted hits can also be “demoted”). For performance reasons, promoted positions are restricted to the range [0, 300] (zero-based).
Inside the same Rule, each promoted object must have a unique promoted position. If promoted objects from two distinct Rules are triggered for the same query:
- Duplicates are merged, using the best position.
- If the resulting positions conflict between distinct objects, objects are shifted down until a free slot is found.
- All regular hits are shifted down as many times as necessary to ensure that all promoted objects get as close to their promoted position as possible (modulo conflicts between objects, as stated above).
Hidden objects are removed from the hits. The following hits are shifted up so that pagination works seamlessly.
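The merging behavior described above can be sketched as follows. This is an illustration only, not the engine's code: the function name and its arguments are hypothetical, hits are represented by their objectID alone, and two choices are assumptions the text above does not settle: when two distinct objects ask for the same position, the one listed first keeps the slot, and no object is both promoted and hidden.

```python
def apply_promotions(hits, promotions, hidden=()):
    """Merge promoted objects into a ranked list of objectIDs.

    hits:       regular hits, best first
    promotions: (object_id, position) pairs from all triggered Rules
    hidden:     objectIDs to remove from the regular hits
    """
    # Merge duplicates, keeping the best (smallest) position.
    best = {}
    for object_id, position in promotions:
        if object_id not in best or position < best[object_id]:
            best[object_id] = position

    # Resolve conflicts between distinct objects: shift down to the next free slot.
    slots = {}
    for object_id, position in sorted(best.items(), key=lambda kv: kv[1]):
        while position in slots:
            position += 1
        slots[position] = object_id

    # Promoted objects leave their original position; hidden objects are dropped.
    result = [h for h in hits if h not in best and h not in hidden]

    # Insert in ascending position order so earlier slots are not disturbed.
    for position in sorted(slots):
        result.insert(min(position, len(result)), slots[position])
    return result
```

With hits `["a", "b", "c", "d"]` and promotions `[("x", 0), ("c", 0), ("x", 2)]`, the duplicate `x` keeps position 0, `c` is shifted down to position 1, and the regular hits follow.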
Injecting user data
User data lets you inject data into the results that doesn’t come from the index and, as such, doesn’t compete with other hits for pagination. A typical use case is to display a banner on top of the result list.
User data can be any JSON object. It is not interpreted by the API whatsoever.
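As an illustration, a Rule that attaches a banner might look like the following. The field names (`conditions`, `pattern`, `anchoring`, `consequence`, `userData`) are assumed to follow the Rule JSON format and should be checked against the API reference; the objectID, pattern, and banner contents are made up for this example.

```python
import json

# Hypothetical Rule: when the query contains "sale", return banner data.
banner_rule = {
    "objectID": "summer-sale-banner",  # made-up Rule ID
    "conditions": [{"pattern": "sale", "anchoring": "contains"}],
    "consequence": {
        # Any JSON object is allowed here; its shape is up to your front end.
        "userData": {
            "banner": {"title": "Summer sale", "url": "/promo/summer"}
        }
    },
}

# Serialize the Rule as JSON, for example before sending it to the API.
payload = json.dumps(banner_rule, indent=2)
```

Since the user data is passed through uninterpreted, the front end decides how to read and render it.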