Let’s say that we have a sequence *A* of *1<=n<=10^5* elements, and that this sequence may or may not be ordered but all the numbers *{1…n}* are present exactly once. And now given 1<=k<=10, you’re asked to find all the increasing sub-sequences of *A* with exactly *k+1* elements.

This is a problem I get from a competitive programming portal, you may find the details here. If at the first glance one don’t have any intuition on how to solve a problem, the best approach is to write down and analyze an example, we may put *A={1,2,3,5,4}* with* n=5* and *k=2 -Totally random numbers! -.*

So, we want to have a look at all the increasing subsequences of three elements, at first let’s check all the possible *k<=n* subsequences (From now on, let’s be our *k* just the *k+1* from the problem statement) so for k={1..5} we have:

*1;2;3;4;5,*five sub sequences*1,2;1,3;1,5;1,4;2,3;2,5;2,4;3,5;3,4,*nine subsequences*1,2,3;1,2,5;1,2,4;1,3,5;1,3,4;2,3,5;2,3,4,*seven subsequences*1,2,3,5;1,2,3,4*, two subsequences- no subsequences

And now we know that for *k=3* the answer is seven. A brute force approach for the problem is just to check all the possible *nC(k+1)* combinations of A and count the amount of them which have strictly increasing elements, this is going to work but will take ages for big *n’s*.

To do better we need to make some observations, let’s note first that for increasing *k* we are building the multi subset by taking all the *k-1* subsets and checking if we can append a new element maintaining the condition that the subsequence is in strictly increasing order, let’s have a look at *k=3:*

- For
*k-1*we have:*1,2;1,3;1,5;1,4;2,3;2,5;2,4;3,5;3,4,*can we use any of them bo build*k*? *{1,2}*is good with*3,5,4:*We have*{1,2,3},{1,2,5},{1,2,4}*.*{1,3}*is good with*5,4*: We have*{1,3,5},{1,3,4}**{1,5}*is good with nothing, no*Ai>5*are available*{1,4}*is good with nothing, 4 is the last element of A, nothing to append available.*{2,3}*is good with*5,4*:*{2,3,5},{2,3,4}*- The remaining elements of
*k-1*are not good, we cannot build an increasing sequence of*k*elements with them.

Note that we are appending the new element to the *k-1/n-1* subsequence ending with the bigger number smaller than the new one, which of course make sense since we want to build increasing subsequences and since from the problem statement we know that all the numbers from *{1…n}* appear exactly once, then the closest number to a given *Ai* must be *Ai-1 (*or smaller if A is not in order*)*

This is a good starting point, but is not sufficient. Let’s think about how the amount of subsequences increase if we increase* n* adding some more elements to *A*. If for a given *n* we have counted say *c* subsequences, increasing *n* will not decrease *c*! *c* will not change, but will eventually increase!.

So for *k=1* we have:

For bigger *k* the same rule apply, so let’s start drawing a table for* {1,2,3,5,4}* and *k=1…5:*

Drawing the table is easy for *k=1*, we know the values of *c*. for increasing *k* we can only say that if *n<k* then *c=0* for sure (you cannot build a k-element sequence from n if *n<k*), and if *k==n* then it may contain one subsequence or eventually zero. For example *k=4* to *n=4 *we have *{1,2,3,5}*, which is increasing.

But what about the rest of the table? Well we may use our first observation, that is possible to build *k* from *k-1 (and n-1 of course)*:

So for example, if *n=5* and *k=4* then *c=7* (the correct solution of our sample problem). We built 7 from *4+3*, where *4* is *1+3* and *3* came from the calculated c for *{1,2,3}*. we are not picking *{1,2,3,5}* since* 5>4*!

At any point in the table once we have calculated *c*, we are using that* c* for all the next columns when adding new *n’s*. Let’s have a look again at the row *k:3* we have: *0,0,1,1+3,1+3+3* and so forth, what we have here are cumulative sums! As already said before once a value of *c* is calculated for a given *k/n* then increasing *n* will give us *c’>=c* or in other words *c’=c+q*, where *q* is the amount of additional subsequences introduced by the new element!

Now we can write down our first algorithm to solve this problem:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
using ll = long long; ll num_of_elements; ll max_k; ll cum[100001][12]; //Update the cumulative values for the row k void update_cum(ll val,int start_index,int k) { for(int i{start_index};i<=num_of_elements;i++) { cum[i][k] += val; } } void run() { cin >> num_of_elements >> max_k; ++max_k; ll value; update_cum(1,0,0); for(ll i{1};i<=num_of_elements;i++) { cin >> value; for(int k{1};k<=max_k;k++) { ll cur_cum = cum[value-1][k-1]; //calculate c update_cum(cur_cum,value,k); } } cout<<cum[num_of_elements][max_k]<<endl; } |

The code is performing the same operations completed on the table above. The first call to *update_cum(1,0,0)* set *c* for *k=1*, we know for sure what values this row will have. For the other rows we use the observations made before to find all the *c*‘s.

This code works perfectly fine but has one major flaw: Is slow! The most expensive operation this algorithm performs is updating of the cumulative table, if you look at the inner loop you’ll notice that a call to *update_cum *will be performed for all the *Ai* from *A*, even if not in order we’ll have a pass to *update_cum* value’s from *one* to *num_of_elements*;

Since *update_cum *iterate from *i* to *num_of_elements* for each *1<=i<=num_of_elements*, then the execution time is given by *n+n-1+n-2+…+1=n(n+1)/2* or *O(n^2). *We should really improve *update_cum (*Yes, you can ignore *k* since *k<<n)*!

The Fenwick tree (Or Binary Indexed Tree) is a beautiful data structure which is exceptionally well suited for operating on cumulative tables. If this is the first time you hear about the Fenwick’s tree then you must absolutely read the original paper from Peter Fenwick or at least have a look at here or here! You’ll find this data structure useful very frequently.

Let’s have a look at the modified version of the algorithm, which is identical but the way it handle the cumulative tables. For the sake of performance comparison I introduced one minor modification at the code presented above, the function *run *now do not contain any *cin* and *cout* operation, the data are collected in main and then provided to *run*.

Here’s the code:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
ll cum_sum(ll idx,int k) { if(k==0) return 1; ll sum{0}; while(idx>=1) { sum += cum[idx][k]; idx -= idx & -idx; } return sum; } void cum_add(ll val,ll idx,int k) { while(idx<=num_of_elements) { cum[idx][k] += val; idx += idx & -idx; } } ll run() { for(ll i{1};i<=num_of_elements;i++) { for(int k{1};k<=max_k;k++) { ll cur_cum = cum_sum(data[i]-1,k-1); cum_add(cur_cum,data[i],k); } } return cum_sum(num_of_elements, max_k); } |

If you don’t understand how *cum_add* or *cum_sum* work’s don’t worry and have a look at the links I provided before. My point is that *run* is exactly the same as before, but now is much more faster! *sum_sum* and *sum_add* are O(*log2(n))* functions, and the new algorithm is thus *O(n*log2(n))*.

You may find on my github the code of those algorithms, if you execute both the brute force approach and the *smart force* approach –*Using fenwick tree*– you’ll notice a huge difference in the execution time. Let’s have a look at this very simple testing procedure:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
void bench(std::function<ll()> fnc, const string& name) { //Make sure the cum table is clear for(ll i{0};i<=num_of_elements;i++) for(int k{0};k<=max_k;k++) cum[i][k] = 0; //Ok, let's go.. cout<<"Running: "<<name<<endl; auto start = chrono::high_resolution_clock::now(); auto ret_value = fnc(); auto stop = chrono::high_resolution_clock::now(); cout<<"Execution time: "<<chrono::duration_cast(stop-start).count()<<"msn"; cout<<"Result: "<<ret_value<<endl; } int main() { data.resize(10000); std::iota(data.begin(),data.end(),0); num_of_elements = data.size()-1; max_k = 10; bench(brute_force::run, "Brute"); bench(smart_force::run, "Smart"); } |

For *n=9999* (*note the 0 is ignored, 1<=n<=N*) and *k = 10*, the output is:

1 2 3 4 5 6 |
<span style="color:#ff0000;">Running: Brute Execution time: 1515ms</span> Result: 4152669778378200987 <span style="color:#00ff00;">Running: Smart Execution time: 4ms</span> Result: 4152669778378200987 |

I think no more comments are required here, Fenwick tree’s are absolutely awesome!

Thanks for reading!