Filter By Null or Missing Attributes
What happens when your index contains an attribute that is not present in all records?
For example, consider an online book store where people can buy books and also rate them from 0 to 5. Any record without the rating attribute is assumed not to be rated yet.
Objects are schemaless, so this isn’t a problem until you want to filter on records with and without a specific attribute.
Generally speaking, selective filtering becomes a problem when the existence or non-existence of a filter value actually means something. The Algolia engine does not support filtering on null values or missing attributes.
In other words, taking the example above, matching books with a specific rating and books that aren’t yet rated in the same filtering statement requires some modification of the data.
There are two approaches:
- using the _tags attribute,
- using a boolean attribute.
Dataset
Let’s look at the following three records: one with a correctly filled rating attribute, a second with a null rating, and a third without a rating attribute:
[
  {
    "title": "The Shining",
    "author": "Stephen King",
    "rating": 5
  },
  {
    "title": "Fantastic Beasts and Where to Find Them",
    "author": "J. K. Rowling",
    "rating": null
  },
  {
    "title": "Run Away",
    "author": "Harlan Coben"
  }
]
Here, only the first record has a rating. The other two are assumed not to have been rated yet. Note that a null or nonexistent attribute is different from zero, which represents a book with a rating equal to 0.
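The distinction between null, missing, and zero can be checked in application code before indexing. The following is a minimal sketch in plain PHP; hasRating is a hypothetical helper introduced here for illustration and is not part of the Algolia API client.

```php
<?php
// Hypothetical helper (not part of the Algolia API client): reports whether
// a record carries a usable rating.
function hasRating(array $record): bool
{
    // isset() is false for a missing key and for a null value, but true
    // for 0, so a book rated 0 still counts as rated. Using empty() here
    // would be wrong because it treats 0 as empty.
    return isset($record['rating']);
}

$records = [
    ['title' => 'The Shining', 'author' => 'Stephen King', 'rating' => 5],
    ['title' => 'Fantastic Beasts and Where to Find Them', 'author' => 'J. K. Rowling', 'rating' => null],
    ['title' => 'Run Away', 'author' => 'Harlan Coben'],
];

foreach ($records as $record) {
    echo $record['title'], ': ', hasRating($record) ? 'rated' : 'not rated', PHP_EOL;
}
```

The same check can drive either of the two approaches described on this page.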
Creating a Tag
At indexing time, you can compute a tag that records whether the attribute is set (is_rated) or is null or absent (is_not_rated).
[
  {
    "title": "The Shining",
    "author": "Stephen King",
    "rating": 5,
    "_tags": ["is_rated"]
  },
  {
    "title": "Fantastic Beasts and Where to Find Them",
    "author": "J. K. Rowling",
    "rating": null,
    "_tags": ["is_not_rated"]
  },
  {
    "title": "Run Away",
    "author": "Harlan Coben",
    "_tags": ["is_not_rated"]
  }
]
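One way to produce these tags is to transform each record before sending it to the index. The sketch below assumes records are plain PHP arrays; withRatingTag is a hypothetical helper name, and the step that actually uploads the records is left out.

```php
<?php
// Hypothetical helper: returns a copy of the record with an `is_rated` or
// `is_not_rated` entry in `_tags`, depending on the `rating` attribute.
function withRatingTag(array $record): array
{
    // isset() is false for both a missing key and a null value.
    $tag = isset($record['rating']) ? 'is_rated' : 'is_not_rated';

    // Keep any tags the record already has and avoid adding a duplicate.
    $tags = $record['_tags'] ?? [];
    if (!in_array($tag, $tags, true)) {
        $tags[] = $tag;
    }
    $record['_tags'] = $tags;

    return $record;
}

$records = [
    ['title' => 'The Shining', 'author' => 'Stephen King', 'rating' => 5],
    ['title' => 'Fantastic Beasts and Where to Find Them', 'author' => 'J. K. Rowling', 'rating' => null],
    ['title' => 'Run Away', 'author' => 'Harlan Coben'],
];

$tagged = array_map('withRatingTag', $records);
echo json_encode($tagged, JSON_PRETTY_PRINT), PHP_EOL;
```

The helper leaves the original rating value untouched, so a null rating stays null in the record and only the tag carries the "not rated" meaning.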
To search for records where the attribute is missing or null, you can now use tags filtering:
$index->search('query', [
  'filters' => '_tags:is_not_rated'
]);
Creating a Boolean Attribute
At indexing time, you can compute a boolean attribute named is_rated:
[
  {
    "title": "The Shining",
    "author": "Stephen King",
    "rating": 5,
    "is_rated": true
  },
  {
    "title": "Fantastic Beasts and Where to Find Them",
    "author": "J. K. Rowling",
    "rating": null,
    "is_rated": false
  },
  {
    "title": "Run Away",
    "author": "Harlan Coben",
    "is_rated": false
  }
]
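The boolean variant can be computed the same way before indexing. This is a sketch under the assumption that records are plain PHP arrays; withIsRatedFlag is a hypothetical helper name, not an Algolia API method.

```php
<?php
// Hypothetical helper: returns a copy of the record with a boolean
// `is_rated` attribute derived from the `rating` attribute.
function withIsRatedFlag(array $record): array
{
    // true when `rating` exists and is not null (including a rating of 0),
    // false when it is null or missing.
    $record['is_rated'] = isset($record['rating']);

    return $record;
}

$records = [
    ['title' => 'The Shining', 'author' => 'Stephen King', 'rating' => 5],
    ['title' => 'Fantastic Beasts and Where to Find Them', 'author' => 'J. K. Rowling', 'rating' => null],
    ['title' => 'Run Away', 'author' => 'Harlan Coben'],
];

$flagged = array_map('withIsRatedFlag', $records);
echo json_encode($flagged, JSON_PRETTY_PRINT), PHP_EOL;
```

Compared with the tag approach, the flag is always present on every record, so both the rated and the unrated case can be addressed through a single attribute.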
To search for records where the attribute is missing or null, you can now use boolean filtering:
$index->search('query', [
  'filters' => 'is_rated = 0'
]);