Front-end data structure and algorithm (11): A binary search algorithm that seems simple and maddening

Front-end data structure and algorithm (11): A binary search algorithm that seems simple and maddening

Preface

The binary search method is an efficient search algorithm. Its idea is very easy to understand, but it is not simple to write a correct binary search. This chapter starts with the most basic implementation of binary search, explains its writing skills, and then introduces the compilation of its four common variants, and finally applies binary search to solve real interview questions, so that you can apply what you have learned and have a deeper understanding of it. Really master it.

What is binary search?

This is an efficient algorithm for quickly finding an element in an ordered array . For example, the example given in Chapter 1: You borrowed a stack of books to go out of the bookstore, and one of them forgot to register. How to find that book quickly? You can try it out one by one, or you can directly detect half of a stack of books each time, and repeat the process for the remaining half stack of books. In this way, you only need to use 4 books for 12 books. The search algorithm that divides the data in two every time like this is the binary search method.

Limitations of binary search

Although the search efficiency is very high, like all algorithms, it also has its limitations.

1. Must use array

You need to use the array to access only O(1)the characteristics of the element through the subscript . This linked list cannot be achieved, because access to an element must be traversed, which will increase the complexity.

2. The data must be in order

Because each time is compared with the intermediate element, the result is to discard half of the search range, so this intermediate element needs to play the role of a critical value.

3. The amount of data is too small to be used

If the amount of data is too small, it is not much different from the speed of traversal, so there is no need to use binary search. There are exceptions. If the comparison operation is time-consuming, binary search can effectively reduce the number of comparisons, and binary search can be used even if the amount of data is small.

Basics: the realization of binary search

Every time we compare the intermediate value of the sequence array with the value to be searched, if the intermediate value is greater than the value to be searched, it is searched in the cell, and vice versa. After the idea is clear, then go to the code, follow the code to better understand:

function binarySearch(arr, val) {
  let l = 0;//left border
  let r = arr.length-1;//right border
 //Find in the range of [l ... r-1]
  
  while (r >= l) {//termination condition
    const mid = (l + (r-l)/2) | 0;//take the middle value
    if (arr[mid] === val) {//just found
      return mid;//return the corresponding subscript
    } else if (arr[mid]> val) {//If the current value is greater than the search value, go to the left edge to find
      r = mid-1;//Redefine the right boundary to mid-1
    } else {
      l = mid + 1;//redefine the left boundary
    }
    
  }
  return -1;//return -1 if not found
}

const arr = [1, 2, 3, 4, 5, 6, 9, 10];
binarySearch(arr, 10);//7

Here, including the intermediate values ​​in the previous chapters are all adopted l + (r - l)/2 | 0, but not adopted (r + l)/2 | 0, because if the lsum ris very large, the addition will cause an integer overflow bug.

First need mid, so that it is centered in the array into two, the left is the [l ... mid - 1]inter-cell right is [mid + 1 ... r]large range.

Because it is the order of the array, provided that when arr[mid] > valthe time, you can simply ignore a large range, the value you're looking for affirmation between the district, so let the right side of the boundary is r = mid - 1, why here - 1, because before the judgment has shown that this time arr[mid]is not equal to valthe , So the current element needs to be midexcluded from the scope of the next search. Vice versa, half of the range can be ignored for each search. In the end l > r, it said that it was not found.

Pay attention to the difference between open and closed intervals

Why is the binary search brain-burning? The fundamental reason lies in its boundary problem, so at the beginning, you must define the meaning of the boundary.

The binary search in the above version uses a closed interval definition, and the right boundary is defined as r = arr.length - 1, that is [l ... r - 1], to search within the range, so the termination condition of the loop is r >= lbecause there is an element in the array that does not participate in the search when they are equal.

You can also use the open interval definition method, that is, the right boundary is to search r === arr.lengthin [l ... r )the range. At this time, the cut-off condition of the loop cannot be r >= l, because ras a subscript, there is a situation where the array is out of bounds, and the condition needs to be adjusted to r > l. And the redefinition of the right boundary cannot be mid - 1, the right boundary must be greater than the subscript of the element to be searched. The open interval binary search code is as follows:

function binarySearch(arr, val) {
  let l = 0;
  let r = arr.length;//open interval definition
 //Find in the range of (l...r)
  
  while (r> l) {//pay attention to the end condition
    const mid = (l + (r-l)/2) | 0;
    if (arr[mid] === val) {
      return mid;
    } else if (arr[mid]> val) {
      r = mid;//Redefine no longer-1
    } else {
      l = mid + 1;
    }
  }
  return -1;
}

Therefore, it is not difficult to write a good definition of the boundary, and be familiar with its meaning, and then write an accurate binary search.

Recursive binary search

In fact, the recursive version is better understood, because the repeatability of the sub-process is clear at a glance, but the performance will be slightly worse than the loop, but they are all at a level of complexity, and the difference is small. code show as below:

function binarySearch(arr, val) {
  const _helper = (arr, l, r) => {
    if (l> r) {//Because it is a closed interval, there are elements to compare when equal
      return -1;
    }
    const mid = (l + (r-l)/2) | 0;
    if (arr[mid] === val) {
      return mid;
    } else if (arr[mid]> val) {
      return _helper(arr, l, mid-1);
    } else {
      return _helper(arr, mid + 1, r);
    }
  };
  return _helper(arr, 0, arr.length-1);//use closed interval
}

Advanced: a variant of binary search

The above binary search is based on the case that there is no duplicate data in the array, but if you want to return the first matching element in an array of repeated data, the binary search implemented above will not be suitable. This is the maddening variant problem of binary search. Basic binary search needs to be modified to meet demand. There are four common variants.

1. Find the first matching element

This search rule is established after the matched element has been found. Because the first one is found, it is first to determine whether this element is the first element of the array, and then the previous element of the found element does not match. Just work. The modified code is as follows:

function binarySearch(arr, val) {
  let l = 0;
  let r = arr.length-1;//closed interval
  while (r >= l) {
    const mid = (l + (r-l)/2) | 0;
    if (arr[mid] <val) {
      l = mid + 1;
    } else if (arr[mid]> val) {
      r = mid-1;
    } else {//If a matching element has been found
      if (mid === 0 || arr[mid-1] !== val) { 
       //The first or previous element of the array does not match, indicating that it was found
        return mid;
      } else {
       //Otherwise the previous element continues as the right interval
        r = mid-1
      }
    }
  }
  return -1;
}

const arr = [1, 2, 2, 2, 2, 3, 4];
binarySearch(arr, 2)//1

2. Find the last matching element

Similar to the previous variant, after finding the matching element, you need to determine whether the element is the last bit of the array, and you also need to determine whether the last bit of the matching element does not match. The transformation code is as follows:

function binarySearch(arr, val) {
  let l = 0;
  let r = arr.length-1;//closed interval
  while (r >= l) {
    const mid = (l + (r-l)/2) | 0;
    if (arr[mid] <val) {
      l = mid + 1;
    } else if (arr[mid]> val) {
      r = mid-1;
    } else {//If a matching element has been found
      if (mid === arr.length-1 || arr[mid + 1] !== val) {
       //is the last element of the array or the mismatch after it
        return mid;
      } else {
       //Otherwise the next element continues as the left interval
        l = mid + 1
      }
    }
  }
  return -1;
}

const arr = [1, 2, 2, 2, 2, 3, 4];
binarySearch(arr, 2);//4

3. Find the first element that is greater than or equal to the match

It needs to be judged at the position greater than or equal to that has been found. If this element is the first or the element before it is less than the element to be matched, it means that it has been found. code show as below:

function binarySearch(arr, val) {
  let l = 0;
  let r = arr.length-1;//closed interval
  while (r >= l) {
    const mid = (l + (r-l)/2) | 0;
    if (arr[mid] >= val) {//The element that is greater than or equal to has been found
      if (mid === 0 || arr[mid-1] <val) { 
     //If it is the first element of the array or the previous element does not match 
        return mid;
      } else {
        r = mid-1;//shrink the right margin
      }
    } else {
      l = mid + 1;
    }
  }
  return -1;
}

const arr = [1, 5, 5, 7, 8, 9];
console.log(binarySearch(arr, 4));//1

4. Find the last element that is less than or equal to the match

Similar to the previous variant problem, just rewrite the set of elements less than or equal to, the code is as follows:

function binarySearch(arr, val) {
  let l = 0;
  let r = arr.length-1;//closed interval
  while (r >= l) {
    const mid = (l + (r-l)/2) | 0;
    if (arr[mid] >= val) {//The element that is greater than or equal to has been found
      if (mid === 0 || arr[mid-1] <val) { 
     //If it is the first element of the array or the previous element does not match 
        return mid;
      } else {
        r = mid-1;//shrink the right margin
      }
    } else {
      l = mid + 1;
    }
  }
  return -1;
}

const arr = [1, 2, 2, 7, 8, 9];
console.log(binarySearch(arr, 5));//2

Practice: Solve real problems

Do you think you have mastered the above two binary search methods? The details of the real binary search are more particular, let's feel why the binary search makes people crazy.

35-Search for insertion position

Given a sorted array and a target value, find the target value in the array and return its index.
If the target value does not exist in the array, return the position where it will be inserted in order.

You can assume that there are no duplicate elements in the array.

Example 1:
Input: [1, 3, 5, 6], 5
Output: 2

Example 2:
Input: [1, 3, 5, 6], 2
Output: 1

Example 3:
Input: [1, 3, 5, 6], 7
Output: 4

Example 4:
Input: [1, 3, 5, 6], 0
Output: 0

The example of this question has been described very clearly. If it happens to be the right element, the subscript of this element is returned. Ordered, search, and no repetition , these keywords are completely tailored for binary search, but there is a difference from regular binary search. When there is no such element, the index of the previous element is returned. The code is as follows:

var searchInsert = function (nums, target) {
  let l = 0;
  let r = nums.length-1;
  while (r >= l) {
    const mid = (l + (r-l)/2) | 0;
    if (target> nums[mid]) {
      l = mid + 1;
    } else if (target <nums[mid]) {
      r = mid-1;
    } else {
      return mid;//If it is equal then it is just right
    }
  }
  return l;//Otherwise return the left element
};

540 - A single element in an ordered array

Given an ordered array containing only integers, each element will appear twice, and only one number will appear only once. Find this number.

Example 1:
Input: [1,1,2,3,3,4,4,8,8]
Output: 2

Example 2:
Input: [3,3,7,7,10,11,11]
Output: 10

Note: Your solution should run in O(log n) time complexity and O(1) space complexity.

If you use a violent solution, then this intermediate problem is too simple, and the final note has an emphasis, to ensure that the solution is at the O(logn)level. The question explains that it is an ordered array , so this question naturally thinks of using the binary search method, but the binary search written above does not seem to be applicable. The trouble is how to determine which side of the element to be searched, and how to throw half of it every time. The data to be used.

After observation, this array has a characteristic, the length is odd, because only one number appears once, and the other numbers appear twice. Also, even if they are all odd numbers, if they are divided equally by the middle subscript, it can be divided into two situations:

1. The length of the arrays on both sides of the split is odd

For example, an array with an array length of 7is divided into two length 3arrays. At this time, we need to treat two identical numbers as one number. There are two situations:

1.1-The middle element is the same as the previous one. At this time, the only different element must be on the right side. The left side is negligible, and the next starting index mid + 1can be from the beginning.

1.2-The middle element is the same as the latter one, the only different element must be on the left side, the right side can be ignored, and the next cut-off position is mid - 1.

2. The length of the arrays on both sides of the split is even

At this time, it is divided into two cases to deal with, which is exactly the opposite of the previous split into odd numbers:

2.1-When the middle element is the same as the previous element, the only different element is on the left side, because since they are all even numbers, which side has the same element, which side is an odd array, the next search cutoff position is mid - 2.

2.2-In the same way, the middle element is the same as the next element, the only different element is on the right, and the next start position is mid + 2.

Knowing how to divide the array equally, and how to redefine the boundary, this solution is easy to write. But one more note is that when there is only one element in the array, there is no need to compare, it must be that element. The complete code is as follows:

var singleNonDuplicate = function (nums) {
  let l = 0;
  let r = nums.length-1;
  while (r> l) {//keep the last element in the closed interval
    const mid = (l + (r-l)/2) | 0;
    const splitIntoEven = (mid-l)% 2 === 0;//Is it even length after split
    if (nums[mid] === nums[mid + 1]) {//if the same as the previous one
      if (splitIntoEven) {//And left and right are even length
        l = mid + 2;//in line with 2.2
      } else {
        r = mid-1;//meet 1.2
      }
    } else if (nums[mid] === nums[mid-1]) {//same as the latter
      if (splitIntoEven) {//is an even length
        r = mid-2;//meet 2.1
      } else {
        l = mid + 1;//meet 1.1
      }
    } else {
      return nums[mid];//The current element and before and after are not the same, return to this heterogeneous
    }
  }
  return nums[l];//return the last element
};

436-Find the right interval

There is a two-dimensional array, in which each individual array i represents an interval value, and the first element is the starting point of the interval.
The second element is the end of the interval. Ask if there is another interval j whose starting point is greater than or equal to the end point of interval i,
Such an interval can be said to be the right interval of i.

For each interval i, you need to find the smallest right interval of its subscript value, and return -1 if there is none.

note:
You can assume that the end of the interval is always greater than its starting point.
You can assume that none of these intervals have the same starting point.

Example 1:
Input: [[3,4], [2,3], [1,2]]
Output: [-1, 0, 1]

Explanation:
For [3,4], there is no "right" interval that satisfies the condition.
For [2,3], the interval [3,4] has the smallest "right" starting point;
For [1,2], the interval [2,3] has the smallest "right" starting point.

For each interval i, another interval needs to be found j, and jthe starting point of this interval must be greater than or equal to ithe ending point. However, there may be multiple such intervals, and we need to find the one that meets this requirement and has the smallest subscript value.

We can use sorting, sorting in ascending order according to the starting point, and when traversing in sequence, the first item that meets the condition is definitely the item with the smallest subscript value. However, another problem will be encountered at this time, that is, the subscript after sorting is different from the initial subscript. At this time, we can use the maprecord of the initial subscript value. After finding the matching interval, check the initial subscript of the interval and add it to the returned set. code show as below:

var findRightInterval = function (intervals) {
  const res = [];//the final result returned
  const map = new Map();//record the initial subscript
  for (let i = 0; i <intervals.length; i++) {
    map.set(intervals[i], i);//use interval as key
  }
  intervals.sort((a, b) => a[0]-b[0]);//sort in ascending order according to the starting point

  for (let i = 0; i <intervals.length; i++) {
    let minIndex = -1;
    let l = i;
    let r = intervals.length-1;
    while (r >= l) {//Closed interval, the last element should also be compared
      const mid = (l + (r-l)/2) | 0;
      if (intervals[mid][0] >= intervals[i][1]) {//if it matches
        minIndex = map.get(intervals[mid]);//get the original index
        r = mid-1//change to a smaller starting point to continue
      } else {
        l = mid + 1//does not meet the starting point of the change to continue
      }
    }
    res[map.get(intervals[i])] = minIndex; 
   //intervals[i] is the position after sorting
   //And map.get(intervals[i]) finds its original subscript position again
   //minIndex corresponds to the smallest right interval of its original item
  }
  return res;//return to the collection
};

Because binary search can only work on ordered arrays, when the array is unordered, we can sort it first. If the binary search is not used, the second-level loop is still traversed, and the overall complexity will become O(n²).

At last

In this chapter, we have implemented binary search and its four variants by hand. For how to write the correct binary search, we also explained the problem of opening and closing intervals when defining boundaries. There is also that when you encounter the two keywords in order and search , it is easy to think of using the binary search method to try first. What if it is disorderly? It's the same as the last topic, just look for it after sorting.

Reference: https://cloud.tencent.com/developer/article/1743941 Front-end data structure and algorithm (11): A seemingly simple and crazy binary search algorithm-Cloud + Community-Tencent Cloud