
Range Predicates Unable to Avoid Filter Computation Due to Null Column Values #14694

ankitsultana opened this issue Dec 21, 2024 · 5 comments

@ankitsultana
Contributor

ankitsultana commented Dec 21, 2024

Say we have this column: event_timestamp, which may have null values. These values are expected to be in "mostly" non-decreasing order over time.

If we have queries with predicates such as event_timestamp IS NOT NULL AND event_timestamp BETWEEN x AND y, then because the default null value can practically only be an extremal value (0 or MAX_VALUE), we end up unnecessarily evaluating the range predicate for segments that have no non-null values in the range [x, y].

At high QPS, this starts becoming a bottleneck even with a range index. For queries where the other filters are selective, I am planning to try disabling the range index altogether for one of our tables so we can rely on the scan-based filter, which only evaluates the docs that remain after the other filters.

Another related optimization is to early-terminate the AndFilterOperator if any of the other filters has already turned out to be empty. To support this, BlockDocIdSet could add a new method that returns the cardinality of the underlying docs. Computing this may not always be possible, so the method could also return -1 to indicate that the cardinality is unknown at that point.

Edit: I may be missing something since I didn't get a chance to take a deeper look into this.

// Current AndFilterOperator#getTrues(): getTrues() is called on every child
// filter unconditionally; there is no short circuit when one of them is
// already known to be empty.
protected BlockDocIdSet getTrues() {
  Tracing.activeRecording().setNumChildren(_filterOperators.size());
  List<BlockDocIdSet> blockDocIdSets = new ArrayList<>(_filterOperators.size());
  for (BaseFilterOperator filterOperator : _filterOperators) {
    blockDocIdSets.add(filterOperator.getTrues());
  }
  return new AndDocIdSet(blockDocIdSets, _queryOptions);
}
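
For illustration, a minimal sketch of what that short circuit might look like, assuming a hypothetical getCardinality() method on BlockDocIdSet that returns -1 when the cardinality is not yet known (the method and the empty-result handling below are illustrative, not existing Pinot APIs):

protected BlockDocIdSet getTrues() {
  Tracing.activeRecording().setNumChildren(_filterOperators.size());
  List<BlockDocIdSet> blockDocIdSets = new ArrayList<>(_filterOperators.size());
  for (BaseFilterOperator filterOperator : _filterOperators) {
    BlockDocIdSet docIdSet = filterOperator.getTrues();
    // Hypothetical method: 0 means provably empty, -1 means unknown
    // (e.g. for lazy, scan-based doc id sets).
    if (docIdSet.getCardinality() == 0) {
      // One empty side makes the whole conjunction empty, so skip
      // evaluating the remaining (potentially expensive) filters.
      return EmptyDocIdSet.getInstance();
    }
    blockDocIdSets.add(docIdSet);
  }
  return new AndDocIdSet(blockDocIdSets, _queryOptions);
}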

@Jackie-Jiang
Contributor

I don't fully follow. To solve event_timestamp IS NOT NULL AND event_timestamp BETWEEN x AND y, we do need to evaluate both sides, right? Or are all values NULL here?

@ankitsultana
Contributor Author

Yeah I realized this needs more context.

Problem

Consider a realtime table with a single Kafka partition, where event_timestamp values arrive in increasing order but can occasionally be null. Visually, the segments will look as follows:

Figure-1:

range of event_timestamp values per segment for only the non-null values
+------------------+------------------+------------------+------------------+
| S0: [0, 1000]    | S1: [1000, 2000] | S2: [2000, 3000] | S3: [3000, 4000] | 
+------------------+------------------+------------------+------------------+

Figure-2:

range of event_timestamp values per segment when you also consider the default null value (0 in this case)
+------------------+------------------+------------------+------------------+
| S0: [0, 1000]    | S1: [0, 2000]    | S2: [0, 3000]    | S3: [0, 4000]    | 
+------------------+------------------+------------------+------------------+

From a customer perspective, if they have range filters on event_timestamp (say event_timestamp BETWEEN 2500 AND 2800), they usually expect to evaluate only the segments whose min/max values overlap with the range predicate in the query. Technically this is what happens today, but what is easily overlooked is that when the column can have nulls, the default null value ends up becoming the min or the max of almost every segment, as can be seen in Figure-2.

This means that segment pruning no longer works and the range predicate has to be evaluated for nearly every segment. Even with a range index, the cost of evaluating these predicates ends up dominating the CPU in quite a few workloads I have seen internally. The sketch below makes this concrete.
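
Segment pruning for a range predicate is essentially an interval-overlap check against the segment's min/max metadata. A simplified sketch (not Pinot's actual pruner code) of how the null default value defeats the lower-bound half of that check:

// Simplified illustration of min/max based segment pruning for
// "event_timestamp BETWEEN x AND y". A segment can be skipped only if
// its [min, max] range does not overlap [x, y].
public class MinMaxPruningExample {
  static boolean mayMatch(long segmentMin, long segmentMax, long x, long y) {
    return segmentMin <= y && segmentMax >= x;
  }

  public static void main(String[] args) {
    // Figure-1 metadata (nulls excluded): only S2 overlaps [2500, 2800].
    System.out.println(mayMatch(2000, 3000, 2500, 2800)); // true  (S2 kept)
    System.out.println(mayMatch(3000, 4000, 2500, 2800)); // false (S3 pruned)
    // Figure-2 metadata: the default null value drags every segment's min
    // down to 0, so "segmentMin <= y" is always true and pruning degrades
    // to the max check alone; S3 is no longer pruned.
    System.out.println(mayMatch(0, 4000, 2500, 2800));    // true  (S3 kept)
  }
}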


Factors Affecting Prominence of this Issue

  1. When you have a large number of segments.
  2. When segment pruning via bloom filters, etc. is not sufficient.

Optimizing When the Other Predicates Are Selective and/or Efficient

While segment pruning is not easy to do (see section below), we can still try to limit the number of times the range predicate is evaluated per query when the other filters are quite selective. Example: uuid_col1 = uuid1 AND uuid_col2 = uuid2 ... AND event_timestamp BETWEEN x AND y.

To achieve this, we should short-circuit the AndFilter computation when the filters evaluated so far have already led to 0 matching rows (along the lines of the sketch above). This is something we should do regardless of this issue since it can also help many other cases (e.g. multiple text_match predicates in a query).

Possible Long Term Fix?

If we could set the expectation for users that they should add event_timestamp IS NOT NULL to their queries alongside their range predicates to achieve better segment pruning, then we might have some possible solutions to implement. But most of the options I can think of would look awkward.

@richardstartin
Member

FWIW the data structure underlying the range index allows passing in a RoaringBitmap representing an existing filter, so the range predicate is implicitly intersected with the filter, skipping over any part of the index not in the filter. This could only help here.

@richardstartin
Member

I lie: it turns out that intersection pushdown was never implemented for between, but I implemented it here just now; it should be available soon with an upgrade.
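
For reference, a rough sketch of that pushdown using the RangeBitmap structure from the RoaringBitmap library (which underlies Pinot's range index). The exact overloads depend on the library version; in particular, the context-accepting between is the one just added above, so it may not exist in older releases:

import org.roaringbitmap.RangeBitmap;
import org.roaringbitmap.RoaringBitmap;

public class RangeBitmapPushdownExample {
  public static void main(String[] args) {
    // Index 10,000 docs where doc i has value i.
    RangeBitmap.Appender appender = RangeBitmap.appender(9999);
    for (int i = 0; i < 10000; i++) {
      appender.add(i);
    }
    RangeBitmap rangeBitmap = appender.build();

    // Result of the other, more selective filters (e.g. uuid equality
    // predicates): only docs 42 and 4242 survive.
    RoaringBitmap context = RoaringBitmap.bitmapOf(42, 4242);

    // Without pushdown: evaluate the range over all docs, then intersect.
    RoaringBitmap withoutPushdown = rangeBitmap.between(2500, 2800);
    withoutPushdown.and(context);

    // With pushdown: the filter is passed as a context bitmap, so parts of
    // the index containing no candidate docs are skipped entirely.
    RoaringBitmap withPushdown = rangeBitmap.between(2500, 2800, context);

    System.out.println(withoutPushdown.equals(withPushdown)); // true
  }
}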

@gortiz
Contributor

gortiz commented Jan 7, 2025

I think what we actually need to do here is to invest some time in finding a proper solution that treats null values natively. They should be understood by dictionaries and min/max values.
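
For illustration only, one shape that native handling could take. The class below is hypothetical, not an existing Pinot class (though Pinot does already persist which docs are null via the null value vector when null handling is enabled): compute min/max over the non-null docs and track nullness explicitly, so pruning sees Figure-1 style metadata instead of Figure-2:

import org.roaringbitmap.RoaringBitmap;

// Hypothetical null-aware segment metadata, for illustration only.
public class NullAwareColumnMetadata {
  final long min;        // min over non-null values only
  final long max;        // max over non-null values only
  final boolean hasNull; // whether any doc in the segment is null

  NullAwareColumnMetadata(long[] values, RoaringBitmap nullDocs) {
    long mn = Long.MAX_VALUE;
    long mx = Long.MIN_VALUE;
    for (int docId = 0; docId < values.length; docId++) {
      if (!nullDocs.contains(docId)) {
        mn = Math.min(mn, values[docId]);
        mx = Math.max(mx, values[docId]);
      }
    }
    min = mn;
    max = mx;
    hasNull = !nullDocs.isEmpty();
  }

  // Null docs never match a range predicate, so pruning on the non-null
  // min/max is safe whether or not the query adds IS NOT NULL.
  boolean mayMatchRange(long x, long y) {
    return min <= y && max >= x;
  }
}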
