## Segment Tree – Basics

This is one of the most popular data structures in competitive programming, we will be covering its basic implementation and how it works in this tutorial. You will find a number of its applications in many questions, so let’s get started.

Suppose we have an array of numbers of order 1e5, now the question gives queries of order 1e5 and in each query it asks the sum of numbers in the range l to r. By doing this naively, one might end up with a solution that takes q*n time which is not suitable for contests.

We can easily conquer this problem using this data structure that divides our query into at most logn number of pieces.

Now, first of all we need to build a tree for that.

1. The root node of this is having the sum of all elements, this root node has two children that will contain the sum of ranges 0 to n/2 in the left child and n/2+1 to n-1 in the right child.

2. Similarly, this happens to each and every node, and a tree would be formed. This could be easily done by recursively calling on nodes. An example tree is given below.

3. This tree can be stored in an array where i’th position will have its children at 2*i and 2*i +1 positions. Note that root node is at 0 th position. Also, since this is a recursive function, position i will always get a value after its children got the values.

4. Also, note that the height of tree will always be logn base 2 because at each node its divided in to two equal ranges. And because of this, in general case, at each level there will be atmost n mergings, so the building complexity will be in limit of N*logN, look at below example for better understanding of the tree

5. Here the blue colored values in the tree represents the values stored at corresponding red colored indexes, for example 36 is stored at 0 th position covering the range (0-5) of actual array, 27 is stored at 2nd position covering the range(3-5) of actual array, 7 is stored at 11th position covering the range(3) of actual array.

Querying part :

1. Now we want to query some range(it is of the form left,right), we will start from root node of that tree and check if the given range(query) is a part of the root node, so we will recursively call left and right childs, whenever there is a complete overlap of our query with the current node, we will simply return the value present at that node, otherwise if it lies outside completely then we will return 0, or else we will do recursion on both childs.

2. It looks simple and easy to understand as well, each query works in logn time since the height of the tree is logn, also in each recursion it will stop after some point of time and if we analyse it closely then it hardly goes 2*logn time, so giving us an asymptotic time of logn for each query.

For example in the above tree, if we want to know the sum of range (2,5). Remember its 0 based indexing.

Then our function would go like as follows,

1. Starting with root node, our query lies in it, so we will call left and right child

2. Left child contains partial range of our query, call left and right child of node 1

3. Left of node 1 lies completely outside so we will return 0.

4. Right of node 1 lies completely in our query so we will return this value(5).

5. Now right child od node 0 which is 2 contains partial range so we call left and right childs of 2

6. Left child of 2 completely overlaps the query range so we will return this value(16).

7. Right child of 2 lies outside the query range so we will return 0.

8. Now for root node, we got the values from its left child as 16 and from right child as 5, so we will add them which is 21 and return this as our answer.

Updating part:

Update can be either a point update or range update, range update will be covered later because it involves another concept of lazy propagation, and adding it here will make this article huge, so we will be discussing point update.

We will start from the root node and find the leaf node recursively just like we did in querying part, and while coming back from the recursion we will update all the upper nodes. It is straight O(logn) complexity.

Lets take an example to better understand this,

Take the tree which was build above, and now we want to update the value at index 3 which is 7 now to 10.

1. Start from root node, index 3 is present in root nodes’ range so we call left and right, left child of root node doesn’t contain 3, right child of root node does.

2. So we call left and right child of node 2.

3. Again right of node 2 doesn’t contain this 3, but left one does.

4. So, we call left and right of node 5.

5. Again, right child of 5 doesn’t contain 3, but left child of 5 is 3.

6. So we got our point, we will update this value to 10.

7. Now our recursion will go back by changing the results,

8. First it will update the value of node 5 to 19.

9. Then it will update the value at 2 to 30.

10. Then finally the root node to 39 by adding their left and child values.

Now it is also clear that time complexity would remain logn.

Implementation:

Practice Problems