06. Top K Elements Technique

2024. 8. 6. 06:17

Top K Elements Technique

The "Top K Elements" technique involves finding the k largest (or smallest) elements in a dataset. This is a common problem in computer science with applications in data analysis, search engines, recommendation systems, and more.

Key Concepts

Priority Queue (Min-Heap or Max-Heap): A heap data structure that allows efficient extraction of the minimum or maximum element.
Quickselect Algorithm: A selection algorithm to find the kth largest (or smallest) element in an unordered list.
Sorting: Sorting the entire dataset and then selecting the top k elements.
Bucket Sort or Counting Sort: Special techniques for integer data with a limited range.

Methods to Find Top K Elements

Heap (Priority Queue) Method: Efficient for dynamic data and large datasets.
Quickselect Algorithm: Efficient for static data with good average-case performance.
Sorting: Simple but less efficient for large datasets.
Bucket Sort or Counting Sort: Efficient for specific types of data.

1. Heap (Priority Queue) Method

Use a min-heap (for the k largest elements) or a max-heap (for the k smallest elements) to maintain the current top k elements.

Code:

import heapq

def top_k_elements(nums, k):
    # Use a min-heap for the largest k elements
    if k == 0:
        return []

    min_heap = nums[:k]
    heapq.heapify(min_heap)

    for num in nums[k:]:
        if num > min_heap[0]:
            heapq.heappushpop(min_heap, num)

    return min_heap

# Example usage:
nums = [3, 2, 1, 5, 6, 4]
k = 2
print(top_k_elements(nums, k))  # Output: [5, 6]

Explanation:

Initialize: Create a min-heap with the first k elements.
Process: For each remaining element, if it is larger than the smallest element in the heap, replace the smallest element.
Result: The heap contains the k largest elements, in heap order rather than sorted order.
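
For one-off queries, Python's standard library already wraps this pattern in `heapq.nlargest`. The sketch below is a minimal illustration; the function name `top_k_with_nlargest` is made up for this example and is not part of the original post.

```python
import heapq

def top_k_with_nlargest(nums, k):
    # heapq.nlargest returns the k largest items in descending order.
    return heapq.nlargest(k, nums)

print(top_k_with_nlargest([3, 2, 1, 5, 6, 4], 2))  # [6, 5]
```

Unlike the hand-written heap above, the result comes back sorted from largest to smallest.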

2. Quickselect Algorithm

Quickselect is a selection algorithm that finds the kth largest (or smallest) element in an unordered list; that element can then be used to collect the top k elements.

Code:

import random

def quickselect(nums, k):
    def partition(left, right, pivot_index):
        pivot_value = nums[pivot_index]
        nums[pivot_index], nums[right] = nums[right], nums[pivot_index]
        store_index = left
        for i in range(left, right):
            if nums[i] < pivot_value:
                nums[store_index], nums[i] = nums[i], nums[store_index]
                store_index += 1
        nums[right], nums[store_index] = nums[store_index], nums[right]
        return store_index

    def select(left, right, k_smallest):
        if left == right:
            return nums[left]
        pivot_index = random.randint(left, right)
        pivot_index = partition(left, right, pivot_index)
        if k_smallest == pivot_index:
            return nums[k_smallest]
        elif k_smallest < pivot_index:
            return select(left, pivot_index - 1, k_smallest)
        else:
            return select(pivot_index + 1, right, k_smallest)

    n = len(nums)
    return select(0, n - 1, n - k)

# Example usage:
nums = [3, 2, 1, 5, 6, 4]
k = 2
print(quickselect(nums, k))  # Output: 5

Explanation:

Partition: Use the partitioning step from the QuickSort algorithm to position the pivot element correctly.
Select: Recursively narrow the search to the side of the pivot that contains index n - k. In ascending order, the element at index n - k (0-based) is the kth largest.
Result: The algorithm returns the kth largest element. After it finishes, the k largest elements occupy indices n - k through n - 1 of the list, in no particular order.
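
To return the top k elements themselves rather than only the kth largest, the same partitioning can be run iteratively and the tail of the array sliced off. This is a sketch under stated assumptions: the name `top_k_quickselect` is hypothetical, and the function works on a copy so the caller's list is not rearranged.

```python
import random

def top_k_quickselect(nums, k):
    a = list(nums)              # copy, so the input is left untouched
    n = len(a)
    if k <= 0:
        return []
    if k >= n:
        return a
    target = n - k              # ascending index of the kth largest
    left, right = 0, n - 1
    while left < right:
        # Lomuto partition around a randomly chosen pivot.
        p = random.randint(left, right)
        a[p], a[right] = a[right], a[p]
        pivot = a[right]
        store = left
        for i in range(left, right):
            if a[i] < pivot:
                a[store], a[i] = a[i], a[store]
                store += 1
        a[store], a[right] = a[right], a[store]
        if store == target:
            break
        if store < target:
            left = store + 1
        else:
            right = store - 1
    # Everything from target onward is >= a[target]; order is arbitrary.
    return a[target:]

print(sorted(top_k_quickselect([3, 2, 1, 5, 6, 4], 2)))  # [5, 6]
```

The result is sorted only for display; the slice itself comes back in arbitrary order.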

3. Sorting

Sort the array and select the top k elements.

Code:

def top_k_elements_sort(nums, k):
    nums.sort(reverse=True)
    return nums[:k]

# Example usage:
nums = [3, 2, 1, 5, 6, 4]
k = 2
print(top_k_elements_sort(nums, k))  # Output: [6, 5]

Explanation:

Sort: Sort the array in descending order.
Select: Take the first k elements from the sorted array.
Result: These are the k largest elements.

4. Bucket Sort or Counting Sort

For integer data with a limited range, bucket sort or counting sort can be efficient.

Code (Counting Sort for a limited range):

def top_k_elements_counting_sort(nums, k):
    # Assumes nums is non-empty and contains only non-negative integers.
    max_val = max(nums)
    count = [0] * (max_val + 1)

    for num in nums:
        count[num] += 1

    result = []
    for num in range(max_val, -1, -1):
        while count[num] > 0 and len(result) < k:
            result.append(num)
            count[num] -= 1

    return result

# Example usage:
nums = [3, 2, 1, 5, 6, 4]
k = 2
print(top_k_elements_counting_sort(nums, k))  # Output: [6, 5]

Explanation:

Count Frequency: Count the frequency of each element.
Select: Traverse the count array from the highest value, selecting elements until k elements are selected.
Result: These are the k largest elements.
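
The version above indexes the count array directly by value, so it only handles non-negative integers. One possible extension, sketched here with the hypothetical name `top_k_counting_sort_offset`, shifts every value by the minimum so that negative numbers and empty inputs are handled as well.

```python
def top_k_counting_sort_offset(nums, k):
    if not nums or k <= 0:
        return []
    lo, hi = min(nums), max(nums)
    count = [0] * (hi - lo + 1)        # one slot per value in [lo, hi]
    for num in nums:
        count[num - lo] += 1

    result = []
    for value in range(hi, lo - 1, -1):
        # Take as many copies of this value as are available and still needed.
        take = min(count[value - lo], k - len(result))
        result.extend([value] * take)
        if len(result) == k:
            break                      # stop scanning once k elements are collected
    return result

print(top_k_counting_sort_offset([-3, -1, -2, -1], 3))  # [-1, -1, -2]
```

Memory still grows with the value range (hi - lo + 1), so this remains a technique for data with a small range.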

To find the largest K numbers and the smallest K numbers of an array in a single pass, you can combine a Min-Heap and a Max-Heap:

Approach:

Initialization:
- Create a Min-Heap to keep track of the largest K elements.
- Create a Max-Heap to keep track of the smallest K elements. (Since Python's heapq module only provides a Min-Heap, you can simulate a Max-Heap by pushing negative values.)
Iterate through the array:
- For each element in the array:
  - Add it to the Min-Heap if it's larger than the smallest element in the heap (the root). If the Min-Heap exceeds size K, remove the smallest element.
  - Add it to the Max-Heap (with a negative sign) if it's smaller than the largest element (the root). If the Max-Heap exceeds size K, remove the largest element.
Results:
- After processing all elements, the Min-Heap contains the largest K elements, and the Max-Heap (with the negative signs removed) contains the smallest K elements.

This method maintains both the largest and smallest K elements in a single pass over the array, using O(n log K) time and O(K) extra space.

Code:

import heapq

def find_largest_and_lowest_k_numbers(nums, k):
    # Min-Heap for largest K numbers
    min_heap = nums[:k]
    heapq.heapify(min_heap)

    # Max-Heap for smallest K numbers (using negative values)
    max_heap = [-num for num in nums[:k]]
    heapq.heapify(max_heap)

    # Process the remaining elements in the array
    for num in nums[k:]:
        # Maintain the largest K numbers
        if num > min_heap[0]:
            heapq.heapreplace(min_heap, num)

        # Maintain the smallest K numbers
        if -num > max_heap[0]:
            heapq.heapreplace(max_heap, -num)

    # Convert the Max-Heap back to positive numbers
    lowest_k = [-x for x in max_heap]

    return min_heap, lowest_k

# Example usage
nums = [3, 2, 1, 5, 6, 4, 8, 7]
k = 3
largest_k, lowest_k = find_largest_and_lowest_k_numbers(nums, k)
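# The heaps are returned in heap order, so sort them for display.
print("Largest K:", sorted(largest_k))  # Output: Largest K: [6, 7, 8]
print("Lowest K:", sorted(lowest_k))    # Output: Lowest K: [1, 2, 3]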

215. Kth Largest Element in an Array

Problem: Given an integer array nums and an integer k, return the kth largest element in the array.

Approach:

Min-Heap: Use a min-heap to keep track of the largest k elements in the array. The top element of the heap will be the kth largest element.
Quickselect: Another approach is to use the Quickselect algorithm, which is based on the partition method used in QuickSort. This method is efficient with an average time complexity of O(n).

Min-Heap Code:

import heapq

def find_kth_largest(nums, k):
    heap = nums[:k]
    heapq.heapify(heap)
    for num in nums[k:]:
        if num > heap[0]:
            heapq.heappushpop(heap, num)
    return heap[0]

# Example usage:
nums = [3, 2, 1, 5, 6, 4]
k = 2
print(find_kth_largest(nums, k))  # Output: 5

Quickselect Code:

def find_kth_largest(nums, k):
    k = len(nums) - k

    def quickselect(l, r):
        pivot, p = nums[r], l
        for i in range(l, r):
            if nums[i] <= pivot:
                nums[p], nums[i] = nums[i], nums[p]
                p += 1
        nums[p], nums[r] = nums[r], nums[p]
        if p > k:
            return quickselect(l, p - 1)
        elif p < k:
            return quickselect(p + 1, r)
        else:
            return nums[p]

    return quickselect(0, len(nums) - 1)

# Example usage:
nums = [3, 2, 1, 5, 6, 4]
k = 2
print(find_kth_largest(nums, k))  # Output: 5

23. Merge k Sorted Lists

Problem: You are given an array of k linked-lists, each linked-list is sorted in ascending order. Merge all the linked-lists into one sorted linked-list and return it.

Approach:

Min-Heap: Use a min-heap to efficiently merge the k sorted linked lists. Each time, extract the smallest element from the heap and add it to the merged list.

Code:

import heapq

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

    def __lt__(self, other):
        return self.val < other.val

def merge_k_lists(lists):
    min_heap = []
    for root in lists:
        if root:
            heapq.heappush(min_heap, root)

    dummy = ListNode()
    curr = dummy

    while min_heap:
        smallest_node = heapq.heappop(min_heap)
        curr.next = smallest_node
        curr = curr.next
        if smallest_node.next:
            heapq.heappush(min_heap, smallest_node.next)

    return dummy.next

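# Example usage:
list1 = ListNode(1, ListNode(4, ListNode(5)))
list2 = ListNode(1, ListNode(3, ListNode(4)))
list3 = ListNode(2, ListNode(6))
lists = [list1, list2, list3]
result = merge_k_lists(lists)

# Prints: 1 1 2 3 4 4 5 6
while result:
    print(result.val, end=" ")
    result = result.next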

Explanation:

Min-Heap: Each list's head is added to the heap. The heap helps efficiently find and add the smallest element across all lists to the final merged list.
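
The code above relies on `ListNode` defining `__lt__`. When the node class cannot be modified (an assumption about the environment, for example a judge that supplies its own node type), one common workaround is to push tuples onto the heap instead. The names `Node`, `merge_k_lists_with_tuples`, `build`, and `to_list` below are made up for this sketch.

```python
import heapq

class Node:
    # Plain node without __lt__, standing in for a type you cannot modify.
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def merge_k_lists_with_tuples(lists):
    heap = []
    for i, node in enumerate(lists):
        if node:
            # (value, list index, node): the index breaks ties, so two
            # nodes are never compared with each other directly.
            heapq.heappush(heap, (node.val, i, node))

    dummy = Node()
    curr = dummy
    while heap:
        _, i, node = heapq.heappop(heap)
        curr.next = node
        curr = node
        if node.next:
            heapq.heappush(heap, (node.next.val, i, node.next))
    return dummy.next

def build(values):
    # Helper for the example: Python list -> linked list.
    dummy = Node()
    curr = dummy
    for v in values:
        curr.next = Node(v)
        curr = curr.next
    return dummy.next

def to_list(node):
    # Helper for the example: linked list -> Python list.
    out = []
    while node:
        out.append(node.val)
        node = node.next
    return out

merged = merge_k_lists_with_tuples([build([1, 4, 5]), build([1, 3, 4]), build([2, 6])])
print(to_list(merged))  # [1, 1, 2, 3, 4, 4, 5, 6]
```

The heap holds at most one node per input list at any time, so the pair (value, list index) is always unique and the node itself is never reached during comparison.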

703. Kth Largest Element in a Stream

Problem: Design a class to find the kth largest element in a stream. Implement the KthLargest class:

KthLargest(int k, int[] nums) Initializes the object with the integer k and the stream of integers nums.
int add(int val) Appends the integer val to the stream and returns the element representing the kth largest element in the stream.

Approach:

Min-Heap: Maintain a min-heap of size k. The smallest element in the heap is the kth largest element in the stream.

Code:

import heapq

class KthLargest:
    def __init__(self, k, nums):
        self.k = k
        self.min_heap = list(nums)  # copy so the caller's list is not modified
        heapq.heapify(self.min_heap)
        while len(self.min_heap) > k:
            heapq.heappop(self.min_heap)

    def add(self, val):
        heapq.heappush(self.min_heap, val)
        if len(self.min_heap) > self.k:
            heapq.heappop(self.min_heap)
        return self.min_heap[0]

# Example usage:
k = 3
nums = [4, 5, 8, 2]
kthLargest = KthLargest(k, nums)
print(kthLargest.add(3))  # Output: 4
print(kthLargest.add(5))  # Output: 5
print(kthLargest.add(10))  # Output: 5
print(kthLargest.add(9))  # Output: 8
print(kthLargest.add(4))  # Output: 8

Summary

Heap (Priority Queue) Method:
- Pros: O(n log k) time and O(k) extra space; works on streaming data because elements are processed one at a time.
- Cons: Slightly more involved to implement than sorting.
Quickselect Algorithm:
- Pros: O(n) average time; well suited to static data that fits in memory.
- Cons: O(n^2) worst case when pivots are chosen badly; rearranges the input in place.
Sorting:
- Pros: Simple to implement.
- Cons: O(n log n) time, which is wasteful when k is much smaller than n.
Bucket Sort or Counting Sort:
- Pros: O(n + m) time for integer data whose value range m is small.
- Cons: Not general-purpose; memory grows with the value range.

Choosing the right method depends on the characteristics of the dataset and the specific requirements of the problem at hand.

05. Modified Binary Search Technique

2024. 8. 6. 06:15

Modified Binary Search Technique

Modified Binary Search techniques are variations of the traditional binary search algorithm that are adapted to solve specific problems or work on specialized data structures. The traditional binary search algorithm is used to find the position of a target value within a sorted array in O(log n) time. Modified versions extend this principle to handle more complex scenarios.

Key Concepts

Binary Search: A divide-and-conquer algorithm that repeatedly divides the search interval in half.
Modification: Adjusting the standard binary search to handle variations in data or specific problem requirements.

Applications

Finding the First or Last Occurrence of an Element.
Searching in a Rotated Sorted Array.
Finding the Peak Element in an Array.
Finding the Square Root of a Number.
Searching in a Nearly Sorted Array.

1. Finding the First or Last Occurrence of an Element

Problem: Given a sorted array with duplicate elements, find the first or last occurrence of a target value.

Code:

def find_first_occurrence(arr, target):
    left, right = 0, len(arr) - 1
    result = -1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            result = mid
            right = mid - 1  # continue to search on the left side
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return result

def find_last_occurrence(arr, target):
    left, right = 0, len(arr) - 1
    result = -1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            result = mid
            left = mid + 1  # continue to search on the right side
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return result

# Example usage:
arr = [1, 2, 2, 2, 3, 4, 5]
target = 2
print(find_first_occurrence(arr, target))  # Output: 1
print(find_last_occurrence(arr, target))   # Output: 3

Explanation:

Initialization: Start with the left and right pointers at the ends of the array.
Midpoint Calculation: Calculate the midpoint of the current search interval.
Adjust Search Range: If the target is found, update the result and adjust the search range to continue looking for the first or last occurrence.
Result: Return the index of the first or last occurrence.
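
The standard library's `bisect` module offers the same two searches ready-made. A minimal sketch, assuming the array is sorted in ascending order; `first_and_last` is a name chosen for this example:

```python
from bisect import bisect_left, bisect_right

def first_and_last(arr, target):
    lo = bisect_left(arr, target)    # first index with arr[i] >= target
    hi = bisect_right(arr, target)   # first index with arr[i] > target
    if lo == hi:
        return -1, -1                # target is absent
    return lo, hi - 1

print(first_and_last([1, 2, 2, 2, 3, 4, 5], 2))  # (1, 3)
```

Because `hi - lo` is the number of occurrences, the same two calls also count duplicates in O(log n) time.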

2. Searching in a Rotated Sorted Array

Problem: Given a rotated sorted array, find the index of a target value.

Code:

def search_rotated_sorted_array(nums, target):
    left, right = 0, len(nums) - 1
    while left <= right:
        mid = (left + right) // 2
        if nums[mid] == target:
            return mid
        if nums[left] <= nums[mid]:  # left half is sorted
            if nums[left] <= target < nums[mid]:
                right = mid - 1
            else:
                left = mid + 1
        else:  # right half is sorted
            if nums[mid] < target <= nums[right]:
                left = mid + 1
            else:
                right = mid - 1
    return -1

# Example usage:
nums = [4, 5, 6, 7, 0, 1, 2]
target = 0
print(search_rotated_sorted_array(nums, target))  # Output: 4

Explanation:

Initialization: Start with the left and right pointers at the ends of the array.
Midpoint Calculation: Calculate the midpoint of the current search interval.
Determine Sorted Half: Check which half of the array is sorted.
Adjust Search Range: Narrow down the search range to the half that might contain the target.
Result: Return the index of the target if found, otherwise return -1.

3. Finding the Peak Element in an Array

Problem: Given an array, find the index of a peak element. A peak element is greater than its neighbors.

Code:

def find_peak_element(nums):
    left, right = 0, len(nums) - 1
    while left < right:
        mid = (left + right) // 2
        if nums[mid] > nums[mid + 1]:
            right = mid
        else:
            left = mid + 1
    return left

# Example usage:
nums = [1, 2, 3, 1]
print(find_peak_element(nums))  # Output: 2 (index of peak element 3)

Explanation:

Initialization: Start with the left and right pointers at the ends of the array.
Midpoint Calculation: Calculate the midpoint of the current search interval.
Compare Midpoint with Neighbor: Compare the midpoint with its right neighbor to determine the direction of the peak.
Adjust Search Range: Move the search range towards the peak.
Result: Return the index of the peak element.

4. Finding the Square Root of a Number

Problem: Find the integer square root of a non-negative integer x.

Code:

def my_sqrt(x):
    if x < 2:
        return x
    left, right = 2, x // 2
    while left <= right:
        mid = (left + right) // 2
        num = mid * mid
        if num == x:
            return mid
        elif num < x:
            left = mid + 1
        else:
            right = mid - 1
    return right

# Example usage:
x = 8
print(my_sqrt(x))  # Output: 2

Explanation:

Initialization: Start with the left pointer at 2 and the right pointer at x // 2.
Midpoint Calculation: Calculate the midpoint of the current search interval.
Square Comparison: Compare the square of the midpoint with ( x ).
Adjust Search Range: Narrow down the search range based on the comparison.
Result: Return the integer part of the square root.
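
A quick way to gain confidence in a hand-written integer square root is to compare it with `math.isqrt`, which is available in Python 3.8 and later. The sketch below uses a slightly different but equivalent formulation ("largest r with r * r <= x"); `int_sqrt` is a name chosen for this example.

```python
import math

def int_sqrt(x):
    # Largest r with r * r <= x, found by binary search over [0, x].
    left, right = 0, x
    while left <= right:
        mid = (left + right) // 2
        if mid * mid <= x:
            left = mid + 1
        else:
            right = mid - 1
    return right

# Cross-check against the standard library on a range of inputs.
assert all(int_sqrt(x) == math.isqrt(x) for x in range(10_000))
print(int_sqrt(8))  # 2
```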

5. Searching in a Nearly Sorted Array

Problem: Given a nearly sorted array (where each element may be misplaced by at most one position), find the index of a target value.

Code:

def search_nearly_sorted_array(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        if mid - 1 >= left and arr[mid - 1] == target:
            return mid - 1
        if mid + 1 <= right and arr[mid + 1] == target:
            return mid + 1
        if arr[mid] < target:
            left = mid + 2
        else:
            right = mid - 2
    return -1

# Example usage:
arr = [10, 3, 40, 20, 50, 80, 70]
target = 40
print(search_nearly_sorted_array(arr, target))  # Output: 2

Explanation:

Initialization: Start with the left and right pointers at the ends of the array.
Midpoint Calculation: Calculate the midpoint of the current search interval.
Check Neighboring Elements: Check the midpoint and its immediate neighbors.
Adjust Search Range: Narrow down the search range based on the comparison.
Result: Return the index of the target if found, otherwise return -1.

1004. Max Consecutive Ones III

This problem is typically solved using the sliding window technique, not binary search. However, if you wanted to use binary search to determine the maximum window size, you could explore different window sizes and use a helper function to check the validity, but this would be less efficient than the sliding window method.

Problem:

Given a binary array nums and an integer k, find the maximum number of consecutive 1s in the array if you can flip at most k 0s.

Code:
Binary search is not a natural fit here; the standard solution is a sliding window:

from typing import List

def longestOnes(nums: List[int], k: int) -> int:
    left = 0
    max_length = 0
    zeros_count = 0

    for right in range(len(nums)):
        if nums[right] == 0:
            zeros_count += 1

        while zeros_count > k:
            if nums[left] == 0:
                zeros_count -= 1
            left += 1

        max_length = max(max_length, right - left + 1)

    return max_length

Explanation:

We expand the window to the right; whenever it holds more than k zeros, we shrink it from the left until it is valid again.
Each element enters and leaves the window at most once, so the scan takes O(n) time.
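
For comparison, the binary-search alternative mentioned at the start of this problem can be sketched as follows. It searches over the window size and uses a prefix-sum helper to test each candidate, which costs O(n log n) overall instead of O(n). The names `longest_ones_binary_search` and `feasible` are made up for this illustration.

```python
from typing import List

def longest_ones_binary_search(nums: List[int], k: int) -> int:
    n = len(nums)
    # prefix[i] = number of zeros in nums[:i]
    prefix = [0] * (n + 1)
    for i, v in enumerate(nums):
        prefix[i + 1] = prefix[i] + (1 if v == 0 else 0)

    def feasible(size: int) -> bool:
        # True if some window of this size contains at most k zeros.
        return any(prefix[i + size] - prefix[i] <= k
                   for i in range(n - size + 1))

    lo, hi = 0, n                   # size 0 is always feasible
    while lo < hi:
        mid = (lo + hi + 1) // 2    # round up so the loop always makes progress
        if feasible(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo

print(longest_ones_binary_search([1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0], 2))  # 6
```

The search is valid because feasibility is monotonic: if some window of size s has at most k zeros, so does a window of size s - 1.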

33. Search in Rotated Sorted Array

Problem:
Given a rotated sorted array and a target value, find the index of the target. If not found, return -1.

Code:

from typing import List

def search(nums: List[int], target: int) -> int:
    left, right = 0, len(nums) - 1

    while left <= right:
        mid = (left + right) // 2

        if nums[mid] == target:
            return mid

        if nums[left] <= nums[mid]:  # Left half is sorted
            if nums[left] <= target < nums[mid]:
                right = mid - 1
            else:
                left = mid + 1
        else:  # Right half is sorted
            if nums[mid] < target <= nums[right]:
                left = mid + 1
            else:
                right = mid - 1

    return -1

Explanation:

We use binary search to identify which half of the array is sorted.
Depending on whether the target lies within the sorted half, we adjust our search range.

34. Find First and Last Position of Element in Sorted Array

Problem:
Given a sorted array of integers nums and a target value, find the starting and ending position of the target.

Code:

from typing import List

def searchRange(nums: List[int], target: int) -> List[int]:
    def findLeft():
        left, right = 0, len(nums) - 1
        while left <= right:
            mid = (left + right) // 2
            if nums[mid] < target:
                left = mid + 1
            else:
                right = mid - 1
        return left

    def findRight():
        left, right = 0, len(nums) - 1
        while left <= right:
            mid = (left + right) // 2
            if nums[mid] <= target:
                left = mid + 1
            else:
                right = mid - 1
        return right

    left_index = findLeft()
    right_index = findRight()

    if left_index <= right_index:
        return [left_index, right_index]
    return [-1, -1]

Explanation:

Two binary searches are conducted: one for the first occurrence of the target (findLeft), and one for the last occurrence (findRight).
This ensures both the start and end positions of the target are found efficiently.

69. Sqrt(x)

Problem:
Given a non-negative integer x, compute and return the square root of x. Since the return type is an integer, only the integer part of the result is returned.

Code:

def mySqrt(x: int) -> int:
    left, right = 0, x

    while left <= right:
        mid = (left + right) // 2
        if mid * mid == x:
            return mid
        elif mid * mid < x:
            left = mid + 1
        else:
            right = mid - 1

    return right

Explanation:

The binary search finds the integer square root by narrowing down the possible values.
If mid * mid is too small, the left boundary is adjusted; if too large, the right boundary is adjusted.

162. Find Peak Element

Problem:
A peak element is an element that is greater than its neighbors. Given an array of integers, find a peak element and return its index.

Code:

from typing import List

def findPeakElement(nums: List[int]) -> int:
    left, right = 0, len(nums) - 1

    while left < right:
        mid = (left + right) // 2

        if nums[mid] > nums[mid + 1]:
            right = mid
        else:
            left = mid + 1

    return left

Explanation:

Binary search finds a peak by comparing the middle element with its right neighbor.
If the middle element is larger than the next one, a peak lies at mid or to its left; otherwise, a peak lies to the right.

825. Friends Of Appropriate Ages

Problem:
Given an array of ages, count the friend requests sent. Person x does not send a request to person y (x != y) if age[y] <= 0.5 * age[x] + 7, if age[y] > age[x], or if age[y] > 100 and age[x] < 100; otherwise x sends a request to y.

Code:
Sort the ages, then for each person use binary search (the bisect module) to locate the range of ages that person sends requests to.

import bisect
from typing import List

def numFriendRequests(ages: List[int]) -> int:
    ages.sort()
    requests = 0

    for i in range(len(ages)):
        if ages[i] <= 14:
            continue

        # First index with age > 0.5 * ages[i] + 7; integer arithmetic keeps odd ages correct.
        left = bisect.bisect_left(ages, ages[i] // 2 + 8)
        right = bisect.bisect_right(ages, ages[i])

        requests += right - left - 1

    return requests

Explanation:

The ages are sorted, and for each person, valid friend requests are counted by using binary search (bisect_left and bisect_right) to find the valid age range.
This ensures that the number of friend requests is calculated efficiently.
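
A brute-force reference is useful for checking the binary-search version on small inputs. The sketch below assumes the rules from the LeetCode statement: x does not send a request to y when age[y] <= 0.5 * age[x] + 7 or age[y] > age[x] (the third rule, age[y] > 100 and age[x] < 100, is already implied by the second). The function name is made up for this example.

```python
from typing import List

def num_friend_requests_bruteforce(ages: List[int]) -> int:
    requests = 0
    for i, x in enumerate(ages):
        for j, y in enumerate(ages):
            if i == j:
                continue            # nobody sends a request to themselves
            if y <= 0.5 * x + 7 or y > x:
                continue            # y is too young for x, or older than x
            requests += 1
    return requests

print(num_friend_requests_bruteforce([16, 16]))                  # 2
print(num_friend_requests_bruteforce([16, 17, 18]))              # 2
print(num_friend_requests_bruteforce([20, 30, 100, 110, 120]))   # 3
```

This runs in O(n^2) time, so it is only meant as a test oracle, not as a replacement for the sorted approach.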

Binary search is an essential technique for several of these problems, particularly when dealing with sorted arrays or when trying to minimize/maximize a certain condition within a range of values.

Summary

Modified binary search techniques adapt the basic binary search algorithm to solve a variety of problems more efficiently. By leveraging the divide-and-conquer strategy, these techniques can handle different data structures and problem constraints. Understanding and implementing these variations can significantly improve the performance of your algorithms for specific use cases.

04. Binary Tree BFS Techniques

2024. 8. 6. 06:11

Binary Tree BFS Techniques

Breadth-First Search (BFS) is a fundamental algorithm used to traverse or search tree or graph data structures. In the context of binary trees, BFS is often referred to as level-order traversal because it visits all nodes at each level of the tree before moving to the next level.

Key Concepts

Queue: BFS uses a queue to keep track of nodes at the current level before moving to the next level.
Level by Level Traversal: Nodes are processed level by level, starting from the root.
First In, First Out (FIFO): Nodes are added to the queue in the order they are encountered and processed in the same order.

Steps

Initialize: Start with a queue containing the root node.
Process: Dequeue a node, process it, and enqueue its children.
Repeat: Continue until the queue is empty, indicating that all nodes have been processed.
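
The three steps above map directly onto a short loop. This is a minimal sketch; the `Node` class and the function name `bfs_values` are assumptions made for the example.

```python
from collections import deque

class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def bfs_values(root):
    # Visit nodes in FIFO order and return their values.
    if root is None:
        return []
    order = []
    queue = deque([root])            # Initialize: queue holds the root
    while queue:                     # Repeat: until the queue is empty
        node = queue.popleft()       # Process: dequeue the oldest node
        order.append(node.val)
        for child in (node.left, node.right):
            if child is not None:
                queue.append(child)  # enqueue its children
    return order

root = Node(3, Node(9), Node(20, Node(15), Node(7)))
print(bfs_values(root))  # [3, 9, 20, 15, 7]
```

`collections.deque` is used because `popleft()` is O(1), whereas `list.pop(0)` shifts every remaining element.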

Applications

Level Order Traversal: Printing or collecting nodes level by level.
Finding the Depth/Height: Calculating the maximum depth or height of the tree.
Shortest Path in Unweighted Trees: Finding the shortest path in terms of number of edges from the root to any node.
Checking Completeness: Determining if a binary tree is complete.

Example: Level Order Traversal

Problem: Traverse the binary tree level by level and print each level.

Code:

from collections import deque

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def level_order_traversal(root):
    if not root:
        return []

    result = []
    queue = deque([root])

    while queue:
        level_size = len(queue)
        level = []

        for _ in range(level_size):
            node = queue.popleft()
            level.append(node.val)

            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)

        result.append(level)

    return result

# Example usage:
# Constructing the binary tree
#       3
#      / \
#     9   20
#        /  \
#       15   7
tree = TreeNode(3)
tree.left = TreeNode(9)
tree.right = TreeNode(20, TreeNode(15), TreeNode(7))

print(level_order_traversal(tree))  # Output: [[3], [9, 20], [15, 7]]

Explanation:

Initialize: A queue is initialized with the root node.
Process: Nodes are dequeued one by one. Their values are collected, and their children are enqueued.
Level by Level: The process repeats until the queue is empty, resulting in a list of lists where each sublist contains nodes at the same level.

Example: Finding Maximum Depth

Problem: Calculate the maximum depth (height) of a binary tree.

Code:

from collections import deque

def max_depth_bfs(root):
    if not root:
        return 0

    queue = deque([root])
    depth = 0

    while queue:
        depth += 1
        level_size = len(queue)

        for _ in range(level_size):
            node = queue.popleft()

            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)

    return depth

# Example usage:
print(max_depth_bfs(tree))  # Output: 3

Explanation:

Initialize: A queue is initialized with the root node and depth is set to 0.
Process: Nodes are dequeued level by level, and the depth counter is incremented once for each level.
Depth Calculation: The process repeats until the queue is empty, resulting in the maximum depth of the tree.

Summary of BFS Techniques in Binary Trees

Level Order Traversal: Collecting or printing nodes level by level using a queue.
Finding Maximum Depth: Using BFS to determine the maximum depth by processing nodes level by level.
Shortest Path in Unweighted Trees: BFS naturally finds the shortest path in terms of edges from the root to any node.
Checking Completeness: Ensuring that all levels of the tree are fully filled except possibly the last level, which should be filled from left to right.
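
The completeness check is the one item in this list without an example elsewhere in the post, so here is one possible sketch. It enqueues missing children as `None` and relies on the observation that, in level order, a complete tree never shows a real node after a gap. The `Node` class and `is_complete` name are assumptions for this illustration.

```python
from collections import deque

class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def is_complete(root):
    queue = deque([root])
    seen_gap = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True          # remember that a position was empty
            continue
        if seen_gap:
            return False             # a real node appeared after a gap
        queue.append(node.left)      # enqueue children even when missing
        queue.append(node.right)
    return True

complete = Node(1, Node(2, Node(4), Node(5)), Node(3, Node(6)))
incomplete = Node(1, Node(2, Node(4), Node(5)), Node(3, None, Node(7)))
print(is_complete(complete), is_complete(incomplete))  # True False
```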

Advantages of BFS

Simple and Intuitive: Easy to understand and implement using a queue.
Level by Level Processing: Ideal for problems that require processing nodes level by level.
Shortest Path: Naturally finds the shortest path in unweighted trees or graphs.

When to Use BFS

When you need to process nodes level by level.
When finding the shortest path in an unweighted graph or tree.
When calculating metrics that depend on levels, such as the maximum depth or checking the completeness of a tree.

1. 200. Number of Islands

Problem:

Given a 2D grid of '1's (land) and '0's (water), count the number of islands. An island is surrounded by water and is formed by connecting adjacent lands horizontally or vertically.

Approach (BFS):

Use Breadth-First Search (BFS) to explore each island. Start BFS whenever you encounter '1', mark all connected '1's as visited (by changing them to '0'), and increment the island count.

Python Code:

from collections import deque
from typing import List

def numIslands(grid: List[List[str]]) -> int:
    if not grid:
        return 0

    rows, cols = len(grid), len(grid[0])
    island_count = 0

    # BFS function to traverse the island
    def bfs(r, c):
        queue = deque([(r, c)])
        while queue:
            row, col = queue.popleft()
            # Check boundaries and if the current cell is part of an island
            if 0 <= row < rows and 0 <= col < cols and grid[row][col] == '1':
                grid[row][col] = '0'  # Mark as visited by setting to '0'
                # Add all neighboring cells (up, down, left, right) to the queue
                queue.extend([(row-1, col), (row+1, col), (row, col-1), (row, col+1)])

    # Iterate over each cell in the grid
    for r in range(rows):
        for c in range(cols):
            # If the cell is a '1', start BFS to mark the entire island
            if grid[r][c] == '1':
                bfs(r, c)
                island_count += 1  # Increase island count for each BFS

    return island_count  # Return the total number of islands found

Example:

grid = [
  ["1","1","1","1","0"],
  ["1","1","0","1","0"],
  ["1","1","0","0","0"],
  ["0","0","0","0","0"]
]
print(numIslands(grid))  # Output: 1

Explanation:

We start BFS whenever we find an unvisited '1' and mark all connected lands as visited by changing them to '0'.
The BFS continues until all parts of the island are explored.
This continues for each island found, resulting in the total count.
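
The version above marks a cell when it is dequeued, so the same cell can sit in the queue several times. A common variant, sketched here under the same input assumptions (a grid of '1' and '0' strings that may be modified in place), marks cells when they are enqueued instead. The function name is made up for this example.

```python
from collections import deque
from typing import List

def num_islands_mark_on_enqueue(grid: List[List[str]]) -> int:
    if not grid or not grid[0]:
        return 0
    rows, cols = len(grid), len(grid[0])
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != '1':
                continue
            count += 1
            grid[r][c] = '0'                  # mark before enqueueing
            queue = deque([(r, c)])
            while queue:
                row, col = queue.popleft()
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = row + dr, col + dc
                    if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == '1':
                        grid[nr][nc] = '0'    # each cell enters the queue once
                        queue.append((nr, nc))
    return count

grid = [
    ["1", "1", "0", "0", "0"],
    ["1", "1", "0", "0", "0"],
    ["0", "0", "1", "0", "0"],
    ["0", "0", "0", "1", "1"],
]
print(num_islands_mark_on_enqueue(grid))  # 3
```

Both versions visit every cell a constant number of times, so the running time stays O(rows * cols); the difference is only in how large the queue can grow.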

2. 339. Nested List Weight Sum

Problem:

Given a nested list of integers, return the sum of all integers in the list weighted by their depth. For example, [1,[4,[6]]] would return 1*1 + 4*2 + 6*3 = 27.

Approach (BFS):

Use a Breadth-First Search (BFS) approach to traverse the nested list and calculate the weighted sum by tracking the depth level of each element.

Python Code:

from collections import deque
from typing import List

def depthSum(nestedList: List[NestedInteger]) -> int:
    queue = deque([(nestedList, 1)])  # Queue stores (list, depth) pairs
    total_sum = 0

    # BFS loop to process each element in the queue
    while queue:
        current_list, depth = queue.popleft()
        for element in current_list:
            if element.isInteger():
                # If it's an integer, add to the total sum weighted by depth
                total_sum += element.getInteger() * depth
            else:
                # If it's a list, add it to the queue with increased depth
                queue.append((element.getList(), depth + 1))

    return total_sum  # Return the weighted sum of all integers

Example:

nestedList = [NestedInteger(1), NestedInteger([NestedInteger(4), NestedInteger([NestedInteger(6)])])]
print(depthSum(nestedList))  # Output: 27

Explanation:

The BFS traverses the nested list level by level.
Each integer’s contribution to the sum is calculated based on its depth, which is tracked as we traverse each level.
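The NestedInteger type is supplied by the LeetCode judge, so the example above does not run on its own. To try the BFS locally, a minimal stand-in can be sketched as below. The class here is an assumption modeled on the three methods the solution calls (isInteger, getInteger, getList), not the judge's actual implementation, and depth_sum is a hypothetical name for the same algorithm.

```python
from collections import deque
from typing import List, Union

class NestedInteger:
    """Hypothetical minimal stand-in for the judge-provided NestedInteger."""

    def __init__(self, value: Union[int, List["NestedInteger"]]):
        self._value = value

    def isInteger(self) -> bool:
        return isinstance(self._value, int)

    def getInteger(self) -> int:
        return self._value

    def getList(self) -> List["NestedInteger"]:
        return self._value

def depth_sum(nested_list: List[NestedInteger]) -> int:
    queue = deque([(nested_list, 1)])  # Queue stores (list, depth) pairs
    total = 0
    while queue:
        current, depth = queue.popleft()
        for element in current:
            if element.isInteger():
                total += element.getInteger() * depth
            else:
                queue.append((element.getList(), depth + 1))
    return total

# [1, [4, [6]]] -> 1*1 + 4*2 + 6*3
nested = [NestedInteger(1), NestedInteger([NestedInteger(4), NestedInteger([NestedInteger(6)])])]
print(depth_sum(nested))  # Output: 27
```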

3. 827. Making A Large Island

Problem:

You are given an n x n binary grid. You may change at most one 0 to 1. Find the size of the largest island you can have after making this change.

Approach (BFS):

First, identify all islands using BFS, and assign unique island identifiers.
Then, for each 0 in the grid, check its neighboring islands and calculate the possible new island size by merging these islands.
Track the maximum possible island size.

Python Code:

from collections import deque, defaultdict
from typing import List

def largestIsland(grid: List[List[int]]) -> int:
    n = len(grid)
    island_size = defaultdict(int)  # Dictionary to store island sizes
    island_id = 2  # Start with an island id of 2 to distinguish from 1 and 0

    # BFS function to determine the size of each island
    def bfs(r, c, island_id):
        queue = deque([(r, c)])
        size = 0
        while queue:
            row, col = queue.popleft()
            if 0 <= row < n and 0 <= col < n and grid[row][col] == 1:
                grid[row][col] = island_id  # Mark as part of the current island
                size += 1  # Increment the size of the island
                # Add neighboring cells to the queue
                queue.extend([(row-1, col), (row+1, col), (row, col-1), (row, col+1)])
        return size

    # First pass: assign ids to islands and record sizes
    for r in range(n):
        for c in range(n):
            if grid[r][c] == 1:  # If it's part of an island
                island_size[island_id] = bfs(r, c, island_id)
                island_id += 1

    # Second pass: check every 0 cell and calculate potential island size
    max_island = max(island_size.values(), default=0)  # Start with the largest existing island
    for r in range(n):
        for c in range(n):
            if grid[r][c] == 0:  # Consider flipping this 0 to a 1
                seen_islands = set()  # Track unique neighboring islands
                new_size = 1  # Start with size 1 for the flipped cell
                for nr, nc in [(r-1, c), (r+1, c), (r, c-1), (r, c+1)]:
                    if 0 <= nr < n and 0 <= nc < n and grid[nr][nc] > 1:
                        seen_islands.add(grid[nr][nc])  # Add neighboring island ids
                new_size += sum(island_size[i] for i in seen_islands)  # Add sizes of all unique neighboring islands
                max_island = max(max_island, new_size)  # Update max island size if necessary

    return max_island  # Return the size of the largest possible island

Example:

grid = [
  [1, 0],
  [0, 1]
]
print(largestIsland(grid))  # Output: 3

Explanation:

The BFS first identifies all islands and calculates their sizes.
We then evaluate the effect of converting each 0 to 1, merging adjacent islands, and keeping track of the maximum island size.

4. 1091. Shortest Path in Binary Matrix

Problem:

Given a binary matrix, return the length of the shortest clear path from the top-left corner to the bottom-right corner. If such a path does not exist, return -1. The path can only move in 8 possible directions.

Approach (BFS):

Use BFS to explore all possible paths starting from the top-left corner. The first time you reach the bottom-right corner, the length of the path is the answer.

Python Code:

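from collections import deque
from typing import List

def shortestPathBinaryMatrix(grid: List[List[int]]) -> int:
    n = len(grid)
    # The start and end cells must both be clear (0)
    if grid[0][0] != 0 or grid[n - 1][n - 1] != 0:
        return -1

    # All 8 possible directions of movement
    directions = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    queue = deque([(0, 0, 1)])  # Queue stores (row, col, path length so far)
    grid[0][0] = 1  # Mark as visited by setting to 1

    while queue:
        row, col, length = queue.popleft()
        if row == n - 1 and col == n - 1:
            return length  # The first arrival is the shortest path
        for dr, dc in directions:
            nr, nc = row + dr, col + dc
            if 0 <= nr < n and 0 <= nc < n and grid[nr][nc] == 0:
                grid[nr][nc] = 1  # Mark as visited when enqueued
                queue.append((nr, nc, length + 1))

    return -1  # No clear path exists

Example:

grid = [
  [0, 1],
  [1, 0]
]
print(shortestPathBinaryMatrix(grid))  # Output: 2

Explanation:

BFS explores all paths level by level. The first time the BFS reaches the bottom-right corner, it returns the path length as the answer.
Because every move has the same cost, this guarantees the shortest path is found.

5. 199. Binary Tree Right Side View

Problem:

Given a binary tree, return the values of the nodes you can see when looking at the tree from the right side.

Approach (BFS):

Use BFS to traverse the tree level by level, and take the last node of each level, as it represents the rightmost node visible from the right side.

Python Code:

from collections import deque
from typing import List

def rightSideView(root: TreeNode) -> List[int]:
    if not root:
        return []

    view = []  # List to store the rightmost nodes
    queue = deque([root])  # Start BFS with the root

    while queue:
        level_length = len(queue)  # Number of nodes at the current level
        for i in range(level_length):
            node = queue.popleft()
            if i == level_length - 1:  # If it's the last node in the current level
                view.append(node.val)  # Add it to the view

            if node.left:
                queue.append(node.left)  # Add left child to the queue
            if node.right:
                queue.append(node.right)  # Add right child to the queue

    return view  # Return the list of rightmost nodes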

6. 133. Clone Graph

from collections import deque

def cloneGraph(node: 'Node') -> 'Node':
    if not node:
        return None

    clones = {node: Node(node.val)}  # Dictionary to keep track of cloned nodes
    queue = deque([node])  # BFS queue starting with the original node

    # BFS loop to clone the graph
    while queue:
        current = queue.popleft()
        for neighbor in current.neighbors:
            if neighbor not in clones:  # If the neighbor hasn't been cloned yet
                clones[neighbor] = Node(neighbor.val)  # Clone the neighbor
                queue.append(neighbor)  # Add the original neighbor to the queue
            clones[current].neighbors.append(clones[neighbor])  # Link the clone of the current node to the clone of the neighbor

    return clones[node]  # Return the clone of the original node
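The Node class is provided by the LeetCode judge, so the function above cannot be run as-is. The sketch below assumes a minimal Node with val and neighbors attributes (a hypothetical stand-in) and repeats the same BFS under the name clone_graph, then checks that the copy is structurally identical but shares no objects with the original.

```python
from collections import deque

class Node:
    """Hypothetical minimal graph node, modeled on the judge-provided class."""

    def __init__(self, val=0, neighbors=None):
        self.val = val
        self.neighbors = neighbors if neighbors is not None else []

def clone_graph(node):
    if not node:
        return None
    clones = {node: Node(node.val)}  # Maps each original node to its clone
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbor in current.neighbors:
            if neighbor not in clones:
                clones[neighbor] = Node(neighbor.val)
                queue.append(neighbor)
            clones[current].neighbors.append(clones[neighbor])
    return clones[node]

# Build a 3-node cycle: 1 - 2 - 3 - 1
a, b, c = Node(1), Node(2), Node(3)
a.neighbors = [b, c]
b.neighbors = [a, c]
c.neighbors = [a, b]

copy = clone_graph(a)
print(copy is a)                         # Output: False
print([n.val for n in copy.neighbors])   # Output: [2, 3]
print(copy.neighbors[0].neighbors[0] is copy)  # Output: True
```

The dictionary doubles as the visited set, which is why a node is enqueued only the first time it is cloned.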

7. 102. Binary Tree Level Order Traversal

from collections import deque
from typing import List

def levelOrder(root: TreeNode) -> List[List[int]]:
    if not root:
        return []

    result = []  # List to store the level order traversal
    queue = deque([root])  # Start BFS with the root

    # BFS loop to traverse the tree level by level
    while queue:
        level = []
        level_length = len(queue)  # Number of nodes at the current level
        for _ in range(level_length):
            node = queue.popleft()
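            level.append(node.val)  # Add the node's value to the current level

            if node.left:
                queue.append(node.left)  # Add left child to the queue
            if node.right:
                queue.append(node.right)  # Add right child to the queue

        result.append(level)  # Store the completed level

    return result  # Return the list of levels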

03. Binary Tree DFS Techniques

2024. 8. 6. 06:09

Binary Tree DFS Techniques

Depth-First Search (DFS) is a fundamental algorithm used to traverse or search tree or graph data structures. In the context of binary trees, DFS can be implemented in three primary ways: Inorder, Preorder, and Postorder traversal. Each of these traversals has a distinct order of visiting nodes.

1. Inorder Traversal (Left, Root, Right)

In Inorder traversal, the nodes are visited in the following order: left subtree, root, right subtree. This traversal is particularly useful for binary search trees (BSTs) because it visits the nodes in ascending order.

Algorithm:

Traverse the left subtree.
Visit the root node.
Traverse the right subtree.

Code:

def inorder_traversal(root):
    if root is not None:
        inorder_traversal(root.left)
        print(root.val, end=' ')
        inorder_traversal(root.right)

# Example usage:
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

# Constructing the binary tree
#       3
#      / \
#     9   20
#        /  \
#       15   7
tree = TreeNode(3)
tree.left = TreeNode(9)
tree.right = TreeNode(20, TreeNode(15), TreeNode(7))

inorder_traversal(tree)  # Output: 9 3 15 20 7
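The example tree above is not a binary search tree, so its inorder output is not sorted. To illustrate the ascending-order property, the sketch below builds a small BST and collects the inorder values into a list. The helper name inorder_values and the sample tree are assumptions made for this illustration.

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def inorder_values(root):
    # Collect values (left, root, right) instead of printing them
    if root is None:
        return []
    return inorder_values(root.left) + [root.val] + inorder_values(root.right)

# A small BST:
#         8
#        / \
#       3   10
#      / \    \
#     1   6    14
bst = TreeNode(8, TreeNode(3, TreeNode(1), TreeNode(6)), TreeNode(10, None, TreeNode(14)))

print(inorder_values(bst))  # Output: [1, 3, 6, 8, 10, 14]
```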

2. Preorder Traversal (Root, Left, Right)

In Preorder traversal, the nodes are visited in the following order: root, left subtree, right subtree. This traversal is useful for creating a copy of the tree or for prefix expression evaluation.

Algorithm:

Visit the root node.
Traverse the left subtree.
Traverse the right subtree.

Code:

def preorder_traversal(root):
    if root is not None:
        print(root.val, end=' ')
        preorder_traversal(root.left)
        preorder_traversal(root.right)

# Example usage:
preorder_traversal(tree)  # Output: 3 9 20 15 7

3. Postorder Traversal (Left, Right, Root)

In Postorder traversal, the nodes are visited in the following order: left subtree, right subtree, root. This traversal is useful for deleting nodes in a tree or for postfix expression evaluation.

Algorithm:

Traverse the left subtree.
Traverse the right subtree.
Visit the root node.

Code:

def postorder_traversal(root):
    if root is not None:
        postorder_traversal(root.left)
        postorder_traversal(root.right)
        print(root.val, end=' ')

# Example usage:
postorder_traversal(tree)  # Output: 9 15 7 20 3

Iterative Approaches

While the above examples use recursion, DFS traversals can also be implemented iteratively using a stack.

Iterative Inorder Traversal

Code:

def iterative_inorder_traversal(root):
    stack, result = [], []
    current = root
    while current or stack:
        while current:
            stack.append(current)
            current = current.left
        current = stack.pop()
        result.append(current.val)
        current = current.right
    return result

# Example usage:
print(iterative_inorder_traversal(tree))  # Output: [9, 3, 15, 20, 7]

Iterative Preorder Traversal

Code:

def iterative_preorder_traversal(root):
    if root is None:
        return []
    stack, result = [root], []
    while stack:
        node = stack.pop()
        result.append(node.val)
        if node.right:
            stack.append(node.right)
        if node.left:
            stack.append(node.left)
    return result

# Example usage:
print(iterative_preorder_traversal(tree))  # Output: [3, 9, 20, 15, 7]

Iterative Postorder Traversal

Code:

def iterative_postorder_traversal(root):
    if root is None:
        return []
    stack1, stack2, result = [root], [], []
    while stack1:
        node = stack1.pop()
        stack2.append(node)
        if node.left:
            stack1.append(node.left)
        if node.right:
            stack1.append(node.right)
    while stack2:
        node = stack2.pop()
        result.append(node.val)
    return result

# Example usage:
print(iterative_postorder_traversal(tree))  # Output: [9, 15, 7, 20, 3]

Summary

Inorder Traversal: Visits nodes in ascending order for BSTs (Left, Root, Right).
Preorder Traversal: Useful for copying the tree and prefix expression (Root, Left, Right).
Postorder Traversal: Useful for deleting nodes and postfix expression (Left, Right, Root).

Each traversal method has its unique applications and can be implemented both recursively and iteratively, depending on the problem requirements and constraints.

02. Sliding Window Technique

2024. 8. 6. 06:08

Sliding Window Technique

The sliding window technique is an efficient way to solve problems involving subarrays or substrings of a given size within an array or string. It is particularly useful for problems that involve finding a subset of contiguous elements that meet a specific condition.

Key Concepts

Window: A subarray or substring that represents a portion of the original array or string.
Fixed or Variable Size: The window can be of a fixed size (static) or can change size dynamically (dynamic).
Sliding: The window slides over the array or string by moving its starting and ending indices to cover all possible positions.

Steps

Initialize: Start with a window at the beginning of the array or string.
Expand/Contract: Move the window by adjusting its starting and ending indices.
Update: Calculate the desired value or check conditions for the current window.
Slide: Move the window one step forward and repeat the process until the end of the array or string is reached.

Examples Covered:

By understanding and applying the sliding window technique, you can significantly improve the efficiency of algorithms for problems involving contiguous data segments.

Advantages of Sliding Window Technique

Efficiency: Reduces the time complexity from (O(n^2)) to (O(n)) for many problems by avoiding unnecessary repeated calculations.
Simplicity: Provides a straightforward approach to handle problems involving contiguous subarrays or substrings.
Flexibility: Can be adapted to both fixed-size and variable-size window problems.

Applications

Fixed-Size Window:
- Maximum sum of a subarray of size k.
- Finding the average of each subarray of size k.
Variable-Size Window:
- Longest substring with at most k distinct characters.
- Smallest subarray with a sum greater than or equal to S.
Max Consecutive Ones III
- Finding the longest subarray with at most k zeroes.
Longest Substring Without Repeating Characters
- Finding the longest substring without repeating characters.
Find K Closest Elements
- Finding the k closest elements to a given value in a sorted array.

Examples

Example 1: Maximum Sum of Subarray of Size K

Problem: Given an array of integers and a number k, find the maximum sum of a subarray of size k.

Code:

def max_sum_subarray(arr, k):
    n = len(arr)
    if n < k:
        return -1

    window_sum = sum(arr[:k])  # Sum of the first window
    max_sum = window_sum

    for i in range(n - k):
        window_sum = window_sum - arr[i] + arr[i + k]
        max_sum = max(max_sum, window_sum)

    return max_sum

# Example usage:
arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
k = 3
print(max_sum_subarray(arr, k))  # Output: 27 (sum of subarray [8, 9, 10])

Explanation:

Initialize: Calculate the sum of the first window of size k.
Expand/Contract: Slide the window by removing the first element of the previous window and adding the next element in the array.
Update: Keep track of the maximum sum encountered.
Slide: Repeat the process until the end of the array is reached.
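The same fixed-size pattern covers the other fixed-window application listed earlier, the average of each subarray of size k. A minimal sketch follows; the function name averages_of_subarrays is hypothetical, and returning an empty list for an invalid k is a choice made here, not something the original specifies.

```python
def averages_of_subarrays(arr, k):
    # Returns the average of every contiguous window of size k
    if k <= 0 or len(arr) < k:
        return []

    window_sum = sum(arr[:k])  # Sum of the first window
    result = [window_sum / k]

    for i in range(k, len(arr)):
        # Add the element entering the window, drop the one leaving it
        window_sum += arr[i] - arr[i - k]
        result.append(window_sum / k)

    return result

# Example usage:
print(averages_of_subarrays([1, 2, 3, 4], 2))  # Output: [1.5, 2.5, 3.5]
```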

Example 2: Longest Substring with K Unique Characters

Problem: Given a string and an integer k, find the length of the longest substring that contains exactly k unique characters.

Code:

def longest_substring_k_unique(s, k):
    from collections import defaultdict

    n = len(s)
    if n == 0 or k == 0:
        return 0

    left = 0
    max_length = 0
    char_count = defaultdict(int)

    for right in range(n):
        char_count[s[right]] += 1

        while len(char_count) > k:
            char_count[s[left]] -= 1
            if char_count[s[left]] == 0:
                del char_count[s[left]]
            left += 1

        if len(char_count) == k:
            max_length = max(max_length, right - left + 1)

    return max_length

# Example usage:
s = "araaci"
k = 2
print(longest_substring_k_unique(s, k))  # Output: 4 ("araa")

Explanation:

Initialize: Use two pointers (left and right) to represent the window bounds and a dictionary to count character frequencies.
Expand/Contract: Expand the window by moving the right pointer and adding the new character to the count. Contract the window by moving the left pointer and updating the count when the number of unique characters exceeds k.
Update: Update the maximum length of the window that meets the condition.
Slide: Continue expanding and contracting the window until the end of the string is reached.
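The other variable-size application listed earlier, the smallest subarray with a sum greater than or equal to S, follows the same expand/contract shape with the roles reversed: the window shrinks while the condition holds. The sketch below assumes all elements are positive and s > 0 (otherwise shrinking the window is not guaranteed to reduce the sum). The function name is hypothetical.

```python
def smallest_subarray_with_sum(arr, s):
    # Assumes positive integers in arr and s > 0
    left = 0
    window_sum = 0
    min_length = float('inf')

    for right, value in enumerate(arr):
        window_sum += value  # Expand the window to the right

        # Contract from the left while the window still satisfies the condition
        while window_sum >= s:
            min_length = min(min_length, right - left + 1)
            window_sum -= arr[left]
            left += 1

    return 0 if min_length == float('inf') else min_length

# Example usage:
print(smallest_subarray_with_sum([2, 1, 5, 2, 3, 2], 7))  # Output: 2 ([5, 2])
```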

1004. Max Consecutive Ones III

https://leetcode.com/problems/max-consecutive-ones-iii/description/?envType=company&envId=facebook&favoriteSlug=facebook-three-months

Approach

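LeetCode 1004 asks for the maximum number of consecutive 1s in a binary array when at most k 0s may be flipped. The walkthrough below fixes k = 1 (at most one flip) to keep the steps concrete; the general case replaces the limit of 1 with k. Both can be solved using the sliding window (two-pointer) approach.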

Key Steps:

Initialize:
- Maintain two pointers, left and right, representing the sliding window.
- Keep a count of the number of zeros in the current window (zero_count), since you can flip only one 0.
- Initialize max_len to store the maximum length of consecutive 1s after flipping at most one 0.
Expand Window:
- Traverse the array using the right pointer.
- Every time you encounter a 0, increment the zero_count because you can flip only one 0.
Shrink Window:
- If the zero_count exceeds 1 (meaning you have encountered more than one 0), move the left pointer forward until zero_count is reduced to 1 again. This maintains the validity of having at most one 0 in the current window.
Continue & Update:
- At each step, update max_len with the length of the current window (calculated as right - left + 1) if the zero_count is within the limit of one flip.

Python Code:

def findMaxConsecutiveOnes(nums):
    left = 0
    zero_count = 0
    max_len = 0

    for right in range(len(nums)):
        if nums[right] == 0:
            zero_count += 1

        # If there are more than one 0, shrink the window from the left
        while zero_count > 1:
            if nums[left] == 0:
                zero_count -= 1
            left += 1

        # Update the maximum length of the window
        max_len = max(max_len, right - left + 1)

    return max_len

Explanation:

Initialize:
- left = 0: Marks the beginning of the window.
- zero_count = 0: Counts the number of zeros in the current window.
- max_len = 0: Stores the maximum consecutive 1s after flipping one 0.
Expand Window:
- Traverse the array with right pointer. If the current element is 0, increment zero_count.
Shrink Window:
- If zero_count > 1, increment the left pointer to shrink the window from the left, ensuring at most one 0 remains in the window.
Continue & Update:
- For each valid window (with at most one 0), calculate the length (right - left + 1) and update max_len.

Test Cases:

Example 1:
- Input: nums = [1, 0, 1, 1, 0]
- Output: 4
- Explanation: Flipping the first 0 results in [1, 1, 1, 1, 0], which gives 4 consecutive ones.
Example 2:
- Input: nums = [1, 0, 1, 1, 0, 1]
- Output: 4
- Explanation: Flipping either 0 results in a maximum of 4 consecutive ones.
Edge Case:
- Input: nums = [1, 1, 1, 1]
- Output: 4
- Explanation: No flip is needed, as there are already 4 consecutive ones.
Edge Case 2:
- Input: nums = [0, 0, 0]
- Output: 1
- Explanation: The maximum number of consecutive ones after flipping one 0 is just 1.

This sliding window approach ensures an optimal solution with a time complexity of O(n), where n is the length of the array, because each element is processed at most twice (once by right and once by left).
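The code above hard-codes a limit of one flipped 0. For the general form of the problem, where up to k zeros may be flipped, the only change is the shrink condition. A minimal sketch, using the hypothetical name longest_ones:

```python
def longest_ones(nums, k):
    left = 0
    zero_count = 0
    max_len = 0

    for right, value in enumerate(nums):
        if value == 0:
            zero_count += 1

        # Shrink from the left until the window holds at most k zeros
        while zero_count > k:
            if nums[left] == 0:
                zero_count -= 1
            left += 1

        max_len = max(max_len, right - left + 1)

    return max_len

# Example usage:
print(longest_ones([1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0], 2))  # Output: 6
print(longest_ones([1, 0, 1, 1, 0], 1))                    # Output: 4
```

With k = 1 this behaves the same as findMaxConsecutiveOnes, and with k = 0 it returns the longest existing run of 1s.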

3. Longest Substring Without Repeating Characters

Problem: Given a string s, find the length of the longest substring without repeating characters.

Approach:

Use a sliding window to maintain the longest substring without repeating characters.

Optimized Python Code:

def lengthOfLongestSubstring(s):
    char_map = {}
    left = 0
    max_len = 0

    for right, char in enumerate(s):
        if char in char_map:
            left = max(left, char_map[char] + 1)
        char_map[char] = right
        max_len = max(max_len, right - left + 1)

    return max_len

Explanation of the Optimized Approach:

Initialize:
- char_map = {}: Stores the most recent index of each character.
- left = 0: Points to the start of the current valid window.
- max_len = 0: Tracks the length of the longest substring without repeating characters.
Expand Window:
- The right pointer iterates through the string (for right, char in enumerate(s)).
- For each character char, check if it has been seen before. If so, adjust left to the right of the previous occurrence (left = max(left, char_map[char] + 1)). This ensures that left only moves forward, preventing it from moving backward.
Update:
- Store the current index of the character in char_map.
- Calculate the length of the current window and update max_len if necessary.

Key Points:

Forward-Only Left Pointer: Updating left with max(left, char_map[char] + 1) ensures it only moves forward, even when the previous occurrence of char lies outside the current window.
Conciseness: enumerate() supplies the index and the character together, so the loop needs no separate index bookkeeping.
Efficiency: The left pointer jumps directly past the previous occurrence instead of advancing one step at a time, so each character is processed once.

Example:

Input: s = "abcabcbb"
Output: 3

Test Cases:

Example 1: s = "abcabcbb", Output: 3
Example 2: s = "bbbbb", Output: 1
Example 3: s = "pwwkew", Output: 3
Edge Case 1: s = "", Output: 0
Edge Case 2: s = "a", Output: 1

This solution runs in O(n) time, where n is the length of the input string.
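If the substring itself is needed rather than only its length, the same window can record where the best window starts. The sketch below is one way to do that; the function name longest_unique_substring is hypothetical, and when several substrings share the maximum length it returns the first one found.

```python
def longest_unique_substring(s):
    last_seen = {}  # Most recent index of each character
    left = 0
    best_start, best_len = 0, 0

    for right, char in enumerate(s):
        # Only jump if the previous occurrence is inside the current window
        if char in last_seen and last_seen[char] >= left:
            left = last_seen[char] + 1
        last_seen[char] = right

        if right - left + 1 > best_len:
            best_start, best_len = left, right - left + 1

    return s[best_start:best_start + best_len]

# Example usage:
print(longest_unique_substring("abcabcbb"))  # Output: abc
print(longest_unique_substring("pwwkew"))    # Output: wke
```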

658. Find K Closest Elements

Problem: Given a sorted array arr, two integers k and x, find the k closest elements to x in the array.

Approach:

Because the array is sorted, the k closest elements always form a contiguous window. Start with a window that covers the whole array and shrink it from whichever end is farther from x until exactly k elements remain.

Key Steps:

Two-Pointer Narrowing:
- Place left at the start and right at the end of the array. While the window holds more than k elements, compare |arr[left] - x| and |arr[right] - x| and drop the element that is farther from x.
Tie-Breaking:
- When both ends are equally far from x, drop the right element, because the problem prefers the smaller value.
Returning the Result:
- The remaining window is a slice of the sorted array, so it is already in ascending order and needs no extra sorting.

Optimized Python Code:

def findClosestElements(arr, k, x):
    left, right = 0, len(arr) - 1

    # Narrow the window to k elements
    while right - left >= k:
        if abs(arr[left] - x) > abs(arr[right] - x):
            left += 1
        else:
            right -= 1

    # Return the k closest elements sorted in ascending order
    return arr[left:left + k]

Explanation:

Two-Pointer Narrowing:
- Initially, left is set to the start of the array, and right is set to the end.
- The while loop continues narrowing the window until only k elements remain. In each iteration, we compare the absolute difference between arr[left] and x with arr[right] and x. We discard the element that is farther from x by adjusting the left or right pointer.
Return the Result:
- Once the loop finishes, the left pointer will point to the start of the k closest elements, so we return the subarray arr[left:left + k].

Example 1:

Input: arr = [1, 2, 3, 4, 5], k = 4, x = 3
Output: [1, 2, 3, 4]
- Explanation: The 4 closest numbers to 3 are [1, 2, 3, 4].

Example 2:

Input: arr = [1, 1, 2, 3, 4, 5], k = 4, x = -1
Output: [1, 1, 2, 3]
- Explanation: The 4 closest numbers to -1 are [1, 1, 2, 3].

Test Cases:

Example 1:
- Input: arr = [1, 2, 3, 4, 5], k = 4, x = 3
- Output: [1, 2, 3, 4]
Example 2:
- Input: arr = [1, 1, 2, 3, 4, 5], k = 4, x = -1
- Output: [1, 1, 2, 3]
Edge Case 1:
- Input: arr = [1, 2, 3, 4], k = 2, x = 5
- Output: [3, 4]
- Explanation: Since x = 5, the closest numbers are the two largest numbers in the array, 3 and 4.
Edge Case 2:
- Input: arr = [1], k = 1, x = 0
- Output: [1]
- Explanation: The array only contains one element.

Time Complexity:

Time complexity: O(n). The narrowing loop removes one element per iteration and runs n - k times, and slicing out the k results takes O(k).
Space complexity: O(k) for the output result.
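When n is much larger than k, a common alternative is to binary search for the left edge of the window, which takes O(log(n - k) + k) time. The sketch below is that alternative, not the code shown above. It assumes 1 <= k <= len(arr), and the name find_closest_elements_bsearch is hypothetical.

```python
def find_closest_elements_bsearch(arr, k, x):
    # Search for the left index of the best window among 0 .. len(arr) - k
    lo, hi = 0, len(arr) - k

    while lo < hi:
        mid = (lo + hi) // 2
        # Compare the window starting at mid with the one starting at mid + 1:
        # if arr[mid] is farther from x than arr[mid + k], shift right
        if x - arr[mid] > arr[mid + k] - x:
            lo = mid + 1
        else:
            hi = mid  # Ties keep the left window (smaller values preferred)

    return arr[lo:lo + k]

# Example usage:
print(find_closest_elements_bsearch([1, 2, 3, 4, 5], 4, 3))       # Output: [1, 2, 3, 4]
print(find_closest_elements_bsearch([1, 1, 2, 3, 4, 5], 4, -1))   # Output: [1, 1, 2, 3]
```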

01. Two Pointers Technique

2024. 8. 6. 06:06

Two Pointers Technique

The two pointers technique is a common and efficient method used to solve various problems involving arrays or linked lists. It involves using two pointers that traverse the data structure in a specific way to achieve the desired result.

Key Concepts

Initialization: Typically, the two pointers are initialized at different positions, such as the start and end of the array, or both at the start but with different purposes.
Movement: The pointers move towards each other or in a specific pattern based on the problem's conditions.
Termination: The process continues until the pointers meet or cross each other, or until a specific condition is met.

Example Walkthrough (Removing Duplicates from a Sorted Array):

Initialize pointer left at the beginning of the array.
Traverse the array with pointer right starting from the second element.
If the element at right is different from the element at left, increment left and copy the element at right to left.
Continue until right traverses the entire array.
The array up to and including index left will contain the unique elements.

Advantages of Two Pointers Technique

Efficiency: Often reduces the time complexity of problems from (O(n^2)) to (O(n)) by avoiding nested loops.
Simplicity: Provides a straightforward approach to solving problems involving arrays or lists.

Applications

Finding Pair with Given Sum:
- In a sorted array, use two pointers starting at the beginning and end. Adjust pointers based on the sum of elements they point to.
Removing Duplicates from Sorted Array:
- Use two pointers to track the position of unique elements and overwrite duplicates.
Reversing a String or Array:
- Use two pointers at the beginning and end, swapping elements and moving pointers towards each other.
Container with Most Water:
- Use two pointers to find the maximum area by moving towards each other and updating the maximum area found.
Merging Two Sorted Arrays:
- Use two pointers to traverse each array and merge them into a single sorted array.
Valid Palindrome II
Valid Word Abbreviation
Merge Sorted Array
Lowest Common Ancestor of a Binary Tree III
Valid Palindrome
3Sum
Next Permutation
Move Zeroes
Dot Product of Two Sparse Vectors
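As an illustration of the converging-pointers pattern, the Container with Most Water application can be sketched as follows. The function name max_area and the sample input are assumptions made for this example.

```python
def max_area(heights):
    left, right = 0, len(heights) - 1
    best = 0

    while left < right:
        width = right - left
        # The shorter line limits how much water the container holds
        best = max(best, width * min(heights[left], heights[right]))

        # Moving the taller line cannot help, so move the shorter one inward
        if heights[left] < heights[right]:
            left += 1
        else:
            right -= 1

    return best

# Example usage:
print(max_area([1, 8, 6, 2, 5, 4, 8, 3, 7]))  # Output: 49
```

Each step discards one line, so the loop runs at most n - 1 times instead of checking all O(n^2) pairs.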

Examples

Example 1: Finding Pair with Given Sum in a Sorted Array

Problem: Given a sorted array and a target sum, find if there exists a pair of elements that add up to the target sum.

Code:

def find_pair_with_sum(arr, target):
    left, right = 0, len(arr) - 1
    while left < right:
        current_sum = arr[left] + arr[right]
        if current_sum == target:
            return (arr[left], arr[right])
        elif current_sum < target:
            left += 1
        else:
            right -= 1
    return None

# Example usage:
arr = [1, 2, 3, 4, 6]
target = 6
print(find_pair_with_sum(arr, target))  # Output: (2, 4)

Explanation:

Initialize pointers left at the beginning and right at the end of the array.
Calculate the sum of elements at these pointers.
If the sum matches the target, return the pair.
If the sum is less than the target, move the left pointer to the right to increase the sum.
If the sum is greater than the target, move the right pointer to the left to decrease the sum.
Continue until the pointers meet or cross each other.

Example 2: Removing Duplicates from Sorted Array

Problem: Given a sorted array, remove the duplicates in-place such that each element appears only once and return the new length.

Code:

def remove_duplicates(arr):
    if not arr:
        return 0
    left = 0
    for right in range(1, len(arr)):
        if arr[right] != arr[left]:
            left += 1
            arr[left] = arr[right]
    return left + 1

# Example usage:
arr = [1, 1, 2, 2, 3, 4, 4]
new_length = remove_duplicates(arr)
print(arr[:new_length])  # Output: [1, 2, 3, 4]

680. Valid Palindrome II

Problem: Given a non-empty string s, you may delete at most one character. Judge whether you can make it a palindrome.

Approach:

Use two pointers, left starting from the beginning and right from the end.
Compare characters at left and right.
If they don't match, skip one character and check if the remaining substring is a palindrome.

Code:

def valid_palindrome(s):
    def is_palindrome_range(i, j):
        while i < j:
            if s[i] != s[j]:
                return False
            i += 1
            j -= 1
        return True

    left, right = 0, len(s) - 1
    while left < right:
        if s[left] != s[right]:
            # If there's a mismatch, check by skipping one character
            return is_palindrome_range(left + 1, right) or is_palindrome_range(left, right - 1)
        left, right = left + 1, right - 1
    return True

Explanation:

is_palindrome_range Function:
- This helper uses a while loop to check whether the substring from i to j is a palindrome.
- The loop compares characters at both ends (s[i] and s[j]), incrementing i and decrementing j to move towards the center of the substring.
- If any characters don't match, it returns False.
- If all characters match, it returns True.
Main Function:
- If a mismatch is found between s[left] and s[right], the function checks both options: skipping the character at left or skipping the character at right.

Example Usage:

# Example 1:
s1 = "abca"
print(valid_palindrome(s1))  # Output: True

# Example 2:
s2 = "racecar"
print(valid_palindrome(s2))  # Output: True

# Example 3:
s3 = "abc"
print(valid_palindrome(s3))  # Output: False

Time Complexity:

O(n) where n is the length of the string s. The while loop in is_palindrome_range checks each character at most once.

Space Complexity:

O(1) since only a few variables are used. The algorithm operates in constant space.

408. Valid Word Abbreviation

Problem: Given a non-empty string word and an abbreviation abbr, check if it is a valid abbreviation of the word.

Approach:

Use two pointers, one for word and one for abbr.
Traverse both strings, expanding numbers in abbr to check against word.

Python Code:

def valid_word_abbreviation(word, abbr):
    i, j = 0, 0
    while i < len(word) and j < len(abbr):
        # If abbr contains a digit
        if abbr[j].isdigit():
            if abbr[j] == '0':  # No leading zeros allowed
                return False
            num = 0
            # Convert the series of digits into a number
            while j < len(abbr) and abbr[j].isdigit():
                num = num * 10 + int(abbr[j])
                j += 1
            i += num  # Move i forward by the number
        else:
            # Characters must match
            if word[i] != abbr[j]:
                return False
            i += 1
            j += 1

    # Both i and j should reach the end of their respective strings
    return i == len(word) and j == len(abbr)

Explanation of the Approach:

Two Pointers:
- Use two pointers i and j, where i tracks the position in word and j tracks the position in abbr.
- The goal is to traverse both word and abbr, expanding numbers in abbr to check if the corresponding characters match in word.
Handling Digits:
- If the current character in abbr[j] is a digit, we need to expand it to a number representing the number of characters to skip in word.
- If a digit starts with '0', it's invalid since abbreviations cannot have leading zeros.
Handling Characters:
- If abbr[j] is a character (not a digit), it must exactly match word[i].
- If they don't match, return False.
End Condition:
- After the loop, both i and j must reach the ends of their respective strings for the abbreviation to be valid.

Example Usage:

# Example 1:
word1 = "internationalization"
abbr1 = "i12iz4n"
print(valid_word_abbreviation(word1, abbr1))  # Output: True

# Example 2:
word2 = "apple"
abbr2 = "a2e"
print(valid_word_abbreviation(word2, abbr2))  # Output: False

# Example 3:
word3 = "substitution"
abbr3 = "s10n"
print(valid_word_abbreviation(word3, abbr3))  # Output: True

Explanation of Test Cases:

Test Case 1:
- Input: word = "internationalization", abbr = "i12iz4n"
- Output: True
- Explanation: "i12iz4n" keeps "i", skips 12 characters ("nternational"), keeps "iz", skips 4 characters ("atio"), and keeps "n", which matches "internationalization".
Test Case 2:
- Input: word = "apple", abbr = "a2e"
- Output: False
- Explanation: "a2e" describes a 4-character word (a, two skipped characters, e), but "apple" has 5 characters. After skipping "pp", the next character in word is "l", not "e".
Test Case 3:
- Input: word = "substitution", abbr = "s10n"
- Output: True
- Explanation: "s10n" expands to "substitution", which matches.
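
A quick way to sanity-check the test cases above is to compute the word length that an abbreviation implies. The sketch below is only an illustration: abbreviated_length is a hypothetical helper, not part of the solution, and it assumes the abbreviation has no leading zeros (it does not validate them).

```python
def abbreviated_length(abbr):
    """Length of any word that abbr could describe (hypothetical helper)."""
    total = 0
    j = 0
    while j < len(abbr):
        if abbr[j].isdigit():
            # Read the whole run of digits as one skip count
            num = 0
            while j < len(abbr) and abbr[j].isdigit():
                num = num * 10 + int(abbr[j])
                j += 1
            total += num
        else:
            # A letter stands for exactly one character of the word
            total += 1
            j += 1
    return total


print(abbreviated_length("i12iz4n"))  # 20, same as len("internationalization")
print(abbreviated_length("a2e"))      # 4, but len("apple") is 5
print(abbreviated_length("s10n"))     # 12, same as len("substitution")
```

A matching length is necessary but not sufficient: the letters in abbr must also line up with the letters of word, which is what the two-pointer loop checks.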

Time Complexity:

O(n + m), where n is the length of word and m is the length of abbr. Each character of abbr is read once, and the pointer into word only ever moves forward.

Space Complexity:

O(1), since we are using only a few variables to track indices and counts, without using additional space proportional to the input size.

88. Merge Sorted Array

Problem: Merge two sorted arrays nums1 and nums2 into nums1 as one sorted array.

Explanation of the Approach:

The problem requires merging two sorted arrays nums1 and nums2 in-place into nums1, where nums1 has enough space to hold both arrays. The goal is to merge them in sorted order.

Key Observations:

We are given m as the number of valid elements in nums1 and n as the number of elements in nums2.
nums1 has additional space at the end, which can accommodate the elements from nums2.

Two-pointer Approach (starting from the end):

Instead of merging the arrays from the front, we can merge them starting from the end. This way, we avoid overwriting the elements in nums1 that are yet to be compared.
We use two pointers: one for nums1 (starting at m-1) and one for nums2 (starting at n-1), and a third pointer for the end of nums1 (at m + n - 1).
At each step, we compare the elements at the two pointers and place the larger element at the current position in nums1.

Code:

def merge(nums1, m, nums2, n):
    # Start merging from the end
    while m > 0 and n > 0:
        if nums1[m - 1] > nums2[n - 1]:
            nums1[m + n - 1] = nums1[m - 1]
            m -= 1
        else:
            nums1[m + n - 1] = nums2[n - 1]
            n -= 1

    # If there are any remaining elements in nums2, place them in nums1
    nums1[:n] = nums2[:n]

Explanation:

Merging from the End:
- We start from the last elements of nums1 and nums2.
- If nums1[m - 1] is larger than nums2[n - 1], we place nums1[m - 1] in the current position (m + n - 1), and move the pointer m to the left.
- Otherwise, we place nums2[n - 1] in the current position, and move the pointer n to the left.
Handling Remaining Elements in nums2:
- If nums2 has remaining elements after all comparisons, copy them into the front of nums1. This is needed because if all elements in nums1 are larger, some elements in nums2 will remain unplaced.
No Need to Copy nums1:
- There's no need to handle any remaining elements from nums1 explicitly, as they are already in their correct positions if not overwritten.
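
The description above mentions three pointers, while the code folds them into m and n. As a sketch of the same idea with the pointers written out, the variant below uses the illustrative names i, j, and k (these names and merge_explicit are not from the original solution).

```python
def merge_explicit(nums1, m, nums2, n):
    i = m - 1      # last valid element of nums1
    j = n - 1      # last element of nums2
    k = m + n - 1  # next position to write in nums1

    # Once nums2 is exhausted, the rest of nums1 is already in place
    while j >= 0:
        if i >= 0 and nums1[i] > nums2[j]:
            nums1[k] = nums1[i]
            i -= 1
        else:
            nums1[k] = nums2[j]
            j -= 1
        k -= 1


a = [1, 2, 3, 0, 0, 0]
merge_explicit(a, 3, [2, 5, 6], 3)
print(a)  # [1, 2, 2, 3, 5, 6]
```

Looping on j alone removes the need for a separate copy step at the end, at the cost of one extra bounds check on i per iteration.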

Example Usage:

# Example 1:
nums1 = [1, 2, 3, 0, 0, 0]
m = 3
nums2 = [2, 5, 6]
n = 3
merge(nums1, m, nums2, n)
print(nums1)  # Output: [1, 2, 2, 3, 5, 6]

# Example 2:
nums1 = [4, 5, 6, 0, 0, 0]
m = 3
nums2 = [1, 2, 3]
n = 3
merge(nums1, m, nums2, n)
print(nums1)  # Output: [1, 2, 3, 4, 5, 6]

Time Complexity:

O(m + n): Each step places one element into nums1, and there are at most m + n placements in total.

Space Complexity:

O(1): The algorithm is performed in-place, so no additional space is required besides a few variables for tracking indices.

1650. Lowest Common Ancestor of a Binary Tree III

Problem: Find the lowest common ancestor of two nodes in a binary tree, where each node has a pointer to its parent.

Approach:

Use two pointers to traverse upwards from each node until they meet.

Explanation of the Approach:

The problem asks to find the Lowest Common Ancestor (LCA) of two nodes in a binary tree where each node has a pointer to its parent.

Key Idea:

Since each node has a pointer to its parent, we can trace back from each node to the root of the tree.
The first point where both nodes meet is their lowest common ancestor (LCA).
We use two pointers, one for each node (p and q), and move them up to their parent nodes. If a pointer moves past the root (i.e., becomes None), we restart it at the other node. Each pointer then covers both root paths, so both have travelled the same number of steps by the time they meet.

Approach:

Two Pointers:
- We initialize two pointers, a starting at p and b starting at q.
- Both pointers move upwards to their respective parents at each step.
Switch Pointers:
- If one pointer moves past the root and becomes None, restart it at the other node.
- This equalizes the distance travelled: each pointer walks its own path to the root and then the other node's path, so any difference in depth between p and q cancels out.
Meeting Point:
- The pointers meet at the lowest common ancestor. If the nodes were in different trees, both pointers would end at None together, but within a single tree they always meet at the LCA.
End Condition:
- The loop continues until both pointers point to the same node (a == b), which will be the LCA.

Code:

class Node:
    def __init__(self, val=0, parent=None):
        self.val = val
        self.parent = parent

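def lowest_common_ancestor(p, q):
    a, b = p, q
    # Compare by identity: the pointers must land on the same node object
    while a is not b:
        # Step to the parent; after moving past the root (None), restart at the other node
        a = a.parent if a else q
        b = b.parent if b else p
    return a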

Example Usage:

# Example tree setup:
#       3
#      / \
#     5   1
#    / \
#   6   2
#      / \
#     7   4

root = Node(3)
node5 = Node(5, root)
node1 = Node(1, root)
node6 = Node(6, node5)
node2 = Node(2, node5)
node7 = Node(7, node2)
node4 = Node(4, node2)

# Find LCA of node 6 and node 4
lca = lowest_common_ancestor(node6, node4)
print(lca.val)  # Output: 5

# Find LCA of node 7 and node 4
lca = lowest_common_ancestor(node7, node4)
print(lca.val)  # Output: 2

Explanation of Test Cases:

Test Case 1:
- Input: node6 and node4
- Output: 5
- Explanation: The LCA of node 6 and node 4 is node 5.
Test Case 2:
- Input: node7 and node4
- Output: 2
- Explanation: The LCA of node 7 and node 4 is node 2.

Time Complexity:

O(h), where h is the height of the binary tree.
- Each pointer walks at most the root path of p followed by the root path of q, so the loop runs at most depth(p) + depth(q) + 2 times, which is O(h).
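
To make the step count concrete, here is a sketch that instruments the same pointer-switching walk with a counter. The helpers depth and lca_with_steps are hypothetical names added for illustration, and the tree is a trimmed copy of the example tree above.

```python
class Node:
    def __init__(self, val=0, parent=None):
        self.val = val
        self.parent = parent


def depth(node):
    """Number of edges from node up to the root."""
    d = 0
    while node.parent is not None:
        node = node.parent
        d += 1
    return d


def lca_with_steps(p, q):
    """Same walk as the solution, but also returns the iteration count."""
    a, b = p, q
    steps = 0
    while a is not b:
        a = a.parent if a else q
        b = b.parent if b else p
        steps += 1
    return a, steps


root = Node(3)
node5 = Node(5, root)
node6 = Node(6, node5)
node2 = Node(2, node5)
node4 = Node(4, node2)

lca, steps = lca_with_steps(node6, node4)
print(lca.val, steps)                            # 5 6
print(steps <= depth(node6) + depth(node4) + 2)  # True
```

Here node6 is at depth 2 and node4 at depth 3, so the walk needs one switch per pointer before the two line up at node 5.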

Space Complexity:

O(1), as the solution uses a constant amount of space, regardless of the input size, except for the space needed for the input nodes.

125. Valid Palindrome

Problem: Given a string, determine if it is a palindrome, considering only alphanumeric characters and ignoring cases.

Explanation of the Approach:

The problem requires determining if a given string is a valid palindrome, considering only alphanumeric characters and ignoring case differences. A palindrome is a string that reads the same forward and backward.

Key Steps:

Two-Pointer Approach:
- Use two pointers: left starting from the beginning of the string and right starting from the end. These pointers will move towards the center of the string.
Skip Non-Alphanumeric Characters:
- Since only alphanumeric characters are considered, if a character at left or right is not alphanumeric (checked using isalnum()), move the pointer until it points to a valid alphanumeric character.
Case-Insensitive Comparison:
- Convert characters to lowercase using lower() before comparing them to ignore case differences.
Continue Until Pointers Meet:
- If all characters match, the string is a palindrome. If at any point s[left].lower() != s[right].lower(), the string is not a palindrome, and we return False.
End Condition:
- If the loop completes and no mismatches are found, return True to indicate the string is a palindrome.

Python Code:

def is_palindrome(s):
    left, right = 0, len(s) - 1
    while left < right:
        # Skip non-alphanumeric characters from the left
        while left < right and not s[left].isalnum():
            left += 1
        # Skip non-alphanumeric characters from the right
        while left < right and not s[right].isalnum():
            right -= 1
        # Compare the alphanumeric characters case-insensitively
        if s[left].lower() != s[right].lower():
            return False
        # Move pointers inward
        left, right = left + 1, right - 1
    return True

Example Usage:

# Example 1:
s1 = "A man, a plan, a canal: Panama"
print(is_palindrome(s1))  # Output: True

# Example 2:
s2 = "race a car"
print(is_palindrome(s2))  # Output: False

# Example 3:
s3 = " "
print(is_palindrome(s3))  # Output: True

Explanation of Test Cases:

Test Case 1:
- Input: "A man, a plan, a canal: Panama"
- Output: True
- Explanation: After ignoring non-alphanumeric characters and case differences, the string becomes "amanaplanacanalpanama", which is a palindrome.
Test Case 2:
- Input: "race a car"
- Output: False
- Explanation: After ignoring non-alphanumeric characters and case differences, the string becomes "raceacar", which is not a palindrome.
Test Case 3:
- Input: " "
- Output: True
- Explanation: An empty string or a string with only spaces is trivially a palindrome.

Time Complexity:

O(n), where n is the length of the string. We traverse the string with two pointers, each pointer moving once through the string.

Space Complexity:

O(1), since we use only a few extra variables for the two pointers and comparisons, without any additional space proportional to the input size.

15. 3Sum

Problem: Given an array nums of n integers, find all unique triplets in the array which gives the sum of zero.

Explanation of the Approach:

The problem requires finding all unique triplets in an array that sum to zero. The triplet (a, b, c) is considered unique if no other triplet contains the same set of numbers. The approach used here is efficient, combining sorting with a two-pointer technique to find such triplets.

Key Steps:

Sort the Array:
- Sorting the array helps in avoiding duplicates and simplifies the process of finding the required triplets.
Iterate with a Fixed Element:
- For each element in the array nums[i], we fix this element and then use a two-pointer approach to find the remaining two elements (nums[left] and nums[right]) such that their sum with nums[i] equals zero.
Two-Pointer Technique:
- After fixing nums[i], the problem reduces to finding two numbers in the sorted array (from i+1 to end) whose sum equals -nums[i]. This can be done using two pointers:
  - left pointer starts right after i (i + 1).
  - right pointer starts at the end of the array.
  - Move left and right pointers inward based on the sum of the three numbers.
Avoid Duplicates:
- To ensure the uniqueness of triplets, skip duplicates of nums[i], nums[left], and nums[right] by moving the pointers past consecutive duplicate values.
Termination:
- The loop continues until i reaches the third-to-last element, and the two-pointer search is carried out for each element.
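
The inner two-pointer search can be looked at on its own. The sketch below isolates it as a hypothetical helper, pairs_with_sum, which assumes its input is already sorted; in 3Sum it would be called with start = i + 1 and target = -nums[i].

```python
def pairs_with_sum(sorted_nums, start, target):
    """Unique pairs from sorted_nums[start:] that add up to target."""
    pairs = []
    left, right = start, len(sorted_nums) - 1
    while left < right:
        total = sorted_nums[left] + sorted_nums[right]
        if total < target:
            left += 1       # need a larger sum
        elif total > target:
            right -= 1      # need a smaller sum
        else:
            pairs.append((sorted_nums[left], sorted_nums[right]))
            left += 1
            right -= 1
            # Skip values equal to the ones just used
            while left < right and sorted_nums[left] == sorted_nums[left - 1]:
                left += 1
            while left < right and sorted_nums[right] == sorted_nums[right + 1]:
                right -= 1
    return pairs


nums = sorted([-1, 0, 1, 2, -1, -4])  # [-4, -1, -1, 0, 1, 2]
# Fixed element nums[1] = -1, so the remaining pair must sum to 1
print(pairs_with_sum(nums, 2, 1))     # [(-1, 2), (0, 1)]
```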

Python Code:

def three_sum(nums):
    nums.sort()  # Step 1: Sort the array
    result = []

    for i in range(len(nums) - 2):  # Step 2: Iterate with fixed element
        if i > 0 and nums[i] == nums[i - 1]:  # Skip duplicates for i
            continue

        left, right = i + 1, len(nums) - 1  # Two-pointer approach
        while left < right:
            total = nums[i] + nums[left] + nums[right]
            if total < 0:
                left += 1  # Move left pointer right if the sum is less than zero
            elif total > 0:
                right -= 1  # Move right pointer left if the sum is greater than zero
            else:
                result.append([nums[i], nums[left], nums[right]])  # Found a triplet

                # Skip duplicates for left
                while left < right and nums[left] == nums[left + 1]:
                    left += 1
                # Skip duplicates for right
                while left < right and nums[right] == nums[right - 1]:
                    right -= 1

                left += 1
                right -= 1  # Move both pointers inward to find new triplets

    return result

Example Usage:

# Example 1:
nums1 = [-1, 0, 1, 2, -1, -4]
print(three_sum(nums1))  
# Output: [[-1, -1, 2], [-1, 0, 1]]

# Example 2:
nums2 = [0, 1, 1]
print(three_sum(nums2))  
# Output: []

# Example 3:
nums3 = [0, 0, 0]
print(three_sum(nums3))  
# Output: [[0, 0, 0]]

Explanation of Test Cases:

Test Case 1:
- Input: [-1, 0, 1, 2, -1, -4]
- Output: [[-1, -1, 2], [-1, 0, 1]]
- Explanation: The two triplets that sum to zero are [-1, -1, 2] and [-1, 0, 1].
Test Case 2:
- Input: [0, 1, 1]
- Output: []
- Explanation: There is no triplet that sums to zero in this array.
Test Case 3:
- Input: [0, 0, 0]
- Output: [[0, 0, 0]]
- Explanation: The triplet [0, 0, 0] sums to zero and is the only valid triplet.

Time Complexity:

O(n^2): Sorting the array takes O(n log n), and then for each element, the two-pointer search takes O(n). Thus, the total time complexity is O(n^2).

Space Complexity:

O(1) (excluding the output space): The two-pointer search itself needs only a few index variables. The sort may use extra memory depending on the implementation; Python's built-in list sort can need up to O(n) temporary space.

31. Next Permutation

Problem: Implement the next permutation, which rearranges numbers into the lexicographically next greater permutation of numbers.

Explanation of the Approach:

The goal of the Next Permutation problem is to rearrange the given numbers into the next lexicographically larger permutation. If no such permutation exists (i.e., the array is in descending order), the array should be rearranged as the smallest possible permutation (i.e., sorted in ascending order).

Key Observations:

Lexicographical Order:
- The next permutation is the smallest permutation that is lexicographically larger than the current one. If no such permutation exists, we return the smallest permutation by sorting.
Reverse Sorted Suffix:
- The key insight is that the next permutation must change the smallest possible portion of the array to get the next largest number. This portion will be the longest suffix that is in decreasing order.
Steps to Solve:
- Find the first decreasing element (from the end): Starting from the end of the array, find the first element nums[i] that is smaller than the element after it (nums[i + 1]). This element marks the point where the next larger permutation can be generated.
- Find the next largest element: Scanning from the right end of the array, find the first element nums[j] that is larger than nums[i] and swap them. Because the suffix is in non-increasing order, this is the smallest element in the suffix that is larger than nums[i].
- Reverse the suffix: After swapping, reverse the portion of the array after index i. This ensures that the suffix is sorted in ascending order, which gives us the smallest possible lexicographical order.

Python Code:

def next_permutation(nums):
    # Step 1: Find the first decreasing element from the end
    i = len(nums) - 2
    while i >= 0 and nums[i] >= nums[i + 1]:
        i -= 1

    # Step 2: If we found a decreasing element, swap it with the next largest element
    if i >= 0:
        j = len(nums) - 1
        while nums[j] <= nums[i]:
            j -= 1
        nums[i], nums[j] = nums[j], nums[i]

    # Step 3: Reverse the part of the array after the swapped element
    nums[i + 1:] = reversed(nums[i + 1:])

Step-by-Step Example:

Example 1:

Input: nums = [1, 2, 3]
Step 1: Find the first decreasing element from the end. Here, nums[1] = 2 is smaller than nums[2] = 3, so i = 1.
Step 2: Find the smallest element larger than nums[1]. Here, nums[2] = 3 is the smallest element larger than nums[1], so we swap nums[1] and nums[2].
- Array after swap: [1, 3, 2]
Step 3: Reverse the part of the array after index 1. In this case, only one element remains, so no actual change is made.
- Final result: [1, 3, 2]

Example 2:

Input: nums = [3, 2, 1]
Step 1: Find the first decreasing element. There is none, because the array is in descending order.
Step 2: Since no such element exists, skip the swap step.
Step 3: Reverse the entire array to get the smallest permutation.
- Final result: [1, 2, 3]

Example 3:

Input: nums = [1, 1, 5]
Step 1: Find the first decreasing element from the end. Here, nums[1] = 1 is smaller than nums[2] = 5, so i = 1.
Step 2: Find the smallest element larger than nums[1]. Here, nums[2] = 5 is the smallest element larger than nums[1], so we swap nums[1] and nums[2].
- Array after swap: [1, 5, 1]
Step 3: Reverse the part of the array after index 1. In this case, only one element remains, so no actual change is made.
- Final result: [1, 5, 1]

Time Complexity:

O(n), where n is the length of the array. The process involves:
- A single pass to find the first decreasing element (O(n)).
- A single pass to find the next largest element (O(n)).
- A reverse operation on the suffix, which is also at most O(n).

Space Complexity:

O(1), as the rearrangement is done in place with constant extra space.

This solution efficiently finds the next lexicographically greater permutation by modifying the input array in-place.
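
One way to check the algorithm is to apply it repeatedly, starting from the sorted order, and compare the sequence with itertools.permutations, which yields tuples in lexicographic order when its input is sorted. The sketch below repeats the function so that it runs on its own; all_orderings is a hypothetical helper and assumes distinct elements.

```python
from itertools import permutations


def next_permutation(nums):
    i = len(nums) - 2
    while i >= 0 and nums[i] >= nums[i + 1]:
        i -= 1
    if i >= 0:
        j = len(nums) - 1
        while nums[j] <= nums[i]:
            j -= 1
        nums[i], nums[j] = nums[j], nums[i]
    nums[i + 1:] = reversed(nums[i + 1:])


def all_orderings(items):
    """Collect permutations by stepping until the sequence wraps around."""
    current = sorted(items)
    first = list(current)
    seen = []
    while True:
        seen.append(list(current))
        next_permutation(current)
        if current == first:  # wrapped from the last permutation to the first
            break
    return seen


expected = [list(p) for p in permutations([1, 2, 3])]
print(all_orderings([3, 1, 2]) == expected)  # True
```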

283. Move Zeroes

Problem: Given an array nums, move all 0's to the end while maintaining the relative order of the non-zero elements.

Explanation of the Approach:

The problem asks to move all zeros in the array to the end, while maintaining the relative order of the non-zero elements. The idea is to rearrange the array in-place with minimal operations.

Key Observations:

Two Pointers:
- Use two pointers (left and right) to keep track of the position where the next non-zero element should be placed.
- left tracks the position where a non-zero element should go, and right iterates over the array.
Swap Non-zero Elements:
- As right traverses the array, whenever a non-zero element is encountered, swap it with the element at the left index.
- After the swap, increment the left pointer to point to the next position where a non-zero element should be placed.
Zeros are Naturally Moved to the End:
- By the end of the process, all non-zero elements will be shifted to the front, and all zeros will be moved to the end.

Python Code:

def move_zeroes(nums):
    left = 0  # Pointer to place the next non-zero element
    for right in range(len(nums)):  # Traverse the array with right pointer
        if nums[right] != 0:  # If current element is non-zero
            # Swap the non-zero element with the left pointer's element
            nums[left], nums[right] = nums[right], nums[left]
            left += 1  # Move the left pointer forward

Explanation of the Code:

Initialize left:
- The left pointer keeps track of where the next non-zero element should be placed. Initially, it is set to 0.
Iterate with right:
- The right pointer traverses the entire array. For each non-zero element found at nums[right], we swap it with the element at nums[left].
Swap Non-zero Elements:
- If nums[right] is non-zero, swap it with nums[left] to bring the non-zero element to the front. After the swap, move left to the next position.
End of Iteration:
- By the end of the loop, all non-zero elements will be at the front of the array, and the zeros will be pushed to the end, as they were swapped with elements in the front.

Example Usage:

# Example 1:
nums1 = [0, 1, 0, 3, 12]
move_zeroes(nums1)
print(nums1)  # Output: [1, 3, 12, 0, 0]

# Example 2:
nums2 = [0, 0, 1]
move_zeroes(nums2)
print(nums2)  # Output: [1, 0, 0]

Explanation of Test Cases:

Test Case 1:
- Input: [0, 1, 0, 3, 12]
- Output: [1, 3, 12, 0, 0]
- Explanation: The zeros are moved to the end while the relative order of 1, 3, and 12 is maintained.
Test Case 2:
- Input: [0, 0, 1]
- Output: [1, 0, 0]
- Explanation: The single non-zero element 1 is moved to the front, and the zeros are moved to the end.

Time Complexity:

O(n), where n is the length of the array. We traverse the array once with the right pointer and perform constant-time swaps.

Space Complexity:

O(1), since the rearrangement is done in place and no extra space is used apart from a few variables for tracking indices.

This approach efficiently solves the problem by moving zeros to the end of the array while maintaining the order of the non-zero elements in a single pass.

1570. Dot Product of Two Sparse Vectors

Problem: Given two sparse vectors, compute their dot product. Implement the SparseVector class:

SparseVector(nums) initializes the object with the vector nums.
dotProduct(vec) computes the dot product with another SparseVector.

Explanation of the Approach:

The problem asks to compute the dot product of two sparse vectors. In a sparse vector, most elements are zero, so iterating through every element in the vector (as we would in a dense vector) would be inefficient. Instead, we can optimize this by storing only the non-zero elements and their indices.

Key Observations:

Sparse Vectors:
- Sparse vectors contain a large number of zeros, so it’s wasteful to store all elements.
- We can store only the non-zero elements along with their indices in the form of pairs (index, value).
Dot Product:
- The dot product of two vectors is the sum of the products of their corresponding elements.
- For sparse vectors, we only need to consider the indices where both vectors have non-zero values.
Two-Pointer Approach:
- Since we are storing only the non-zero elements, we can use a two-pointer technique to traverse through both vectors efficiently.
- Both pointers start from the beginning of the non-zero pairs, and we move the pointers based on the comparison of indices:
  - If the indices are equal, compute the product of the values and move both pointers forward.
  - If one index is smaller, move the corresponding pointer forward.

Python Code:

class SparseVector:
    def __init__(self, nums):
        # Store pairs of (index, value) where value is non-zero
        self.pairs = [(i, num) for i, num in enumerate(nums) if num != 0]

    def dotProduct(self, vec):
        result = 0
        p1, p2 = 0, 0

        # Two-pointer approach
        while p1 < len(self.pairs) and p2 < len(vec.pairs):
            index1, value1 = self.pairs[p1]
            index2, value2 = vec.pairs[p2]

            if index1 == index2:
                result += value1 * value2  # Multiply when indices are the same
                p1 += 1
                p2 += 1
            elif index1 < index2:
                p1 += 1  # Move pointer in the first vector
            else:
                p2 += 1  # Move pointer in the second vector

        return result

Explanation of Code:

Initialization:
- In the constructor __init__, we store the non-zero elements and their indices in a list called pairs. This is done using a list comprehension that iterates through the input nums and includes only non-zero values with their corresponding indices.
dotProduct Method:
- In the dotProduct method, we use two pointers p1 and p2 to iterate through the pairs list of both vectors.
- If the indices at the two pointers match, we calculate the product of the values and add it to the result.
- If the indices don't match, we move the pointer with the smaller index forward.
- The loop continues until we exhaust one of the lists of non-zero pairs.

Example Usage:

# Example 1:
nums1 = [1, 0, 0, 2, 3]
nums2 = [0, 3, 0, 4, 0]

v1 = SparseVector(nums1)
v2 = SparseVector(nums2)
print(v1.dotProduct(v2))  # Output: 8 (2*4)

# Example 2:
nums3 = [0, 1, 0, 0, 2, 0, 0]
nums4 = [1, 0, 0, 0, 3, 0, 4]

v3 = SparseVector(nums3)
v4 = SparseVector(nums4)
print(v3.dotProduct(v4))  # Output: 6 (1*0 + 2*3 = 6)

Explanation of Test Cases:

Test Case 1:
- nums1 = [1, 0, 0, 2, 3], nums2 = [0, 3, 0, 4, 0]
- The only common index with non-zero values is at index 3, where 2 * 4 = 8. Thus, the dot product is 8.
Test Case 2:
- nums3 = [0, 1, 0, 0, 2, 0, 0], nums4 = [1, 0, 0, 0, 3, 0, 4]
- The only index where both vectors are non-zero is index 4, where 2 * 3 = 6. Thus, the dot product is 6.

Time Complexity:

O(n1 + n2) for dotProduct, where n1 and n2 are the number of non-zero elements in nums1 and nums2, respectively. The two pointers visit only the stored non-zero pairs, so the cost depends on the number of non-zero elements, not the length of the input arrays. Building each SparseVector still takes one pass over its full input list.

Space Complexity:

O(n1 + n2) for storing the non-zero elements of both vectors. The storage is proportional to the number of non-zero elements in both vectors, rather than the size of the full arrays.

Explanation:

Initialization:
- The SparseVector constructor processes the input list nums and stores non-zero elements along with their indices in the pairs list.
Dot Product:
- Use two pointers, p1 and p2, to iterate through the non-zero elements of both vectors.
- When indices match, multiply the corresponding values and add to the result.
- If indices do not match, advance the pointer with the smaller index.
- Continue until one of the pointers exceeds the length of its list.

Summary of Two Pointers Technique

The Two Pointers technique is highly versatile and effective for solving a wide range of problems, including:

String and array manipulation (valid palindromes, word abbreviations, merging arrays).
Graph and tree traversal (lowest common ancestor).
Combinatorial problems (3Sum).
Sequence generation and modification (next permutation).
Array partitioning (move zeroes).
Efficient computations with sparse data structures (dot product of sparse vectors).

By maintaining two pointers that traverse the data structure in a strategic manner, you can often achieve significant performance improvements over more naive approaches.


Differences between subarrays, substrings, subsequences, and subsets

2024. 8. 6. 05:27

Subarray: Contiguous segment of an array.
- [1, 2, 3] in [0, 1, 2, 3, 4]
Substring: Contiguous segment of a string.
- "ell" in "hello"
Subsequence: Sequence derived by deleting some or no elements without changing the order.
- [1, 3] in [1, 2, 3, 4]
- "hlo" in "hello"
Subset: Any combination of elements from a set, order does not matter.
- {1, 3} in {1, 2, 3}
- {}, {1, 2}, {3, 2} in {1, 2, 3}

1. Subarray

Definition: A subarray is a contiguous segment of an array. It includes elements that are consecutive in the original array.

Characteristics:

Must be contiguous.
Order of elements must be preserved.
Length can range from 0 to the length of the array.

Example:

Original Array: [1, 2, 3, 4]
Possible Subarrays: [1, 2], [2, 3, 4], [3], [1, 2, 3, 4], [] (empty subarray)
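
As a small sketch of the definition, every subarray is a slice arr[i:j] of the original array. The helper name all_subarrays is only illustrative.

```python
def all_subarrays(arr):
    """Every contiguous slice arr[i:j], plus the empty subarray."""
    result = [[]]
    for i in range(len(arr)):
        for j in range(i + 1, len(arr) + 1):
            result.append(arr[i:j])
    return result


subs = all_subarrays([1, 2, 3, 4])
print(len(subs))          # 11 (10 non-empty slices plus the empty one)
print([2, 3, 4] in subs)  # True
print([1, 3] in subs)     # False, because 1 and 3 are not adjacent
```

The same slicing works for substrings, since Python strings support s[i:j] as well.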

2. Substring

Definition: A substring is a contiguous sequence of characters within a string. Like subarrays, substrings consist of consecutive characters.

Characteristics:

Must be contiguous.
Order of characters must be preserved.
Length can range from 0 to the length of the string.

Example:

Original String: "hello"
Possible Substrings: "he", "ell", "hello", "o", "" (empty substring)

3. Subsequence

Definition: A subsequence is a sequence derived from another sequence (array or string) by deleting some or no elements without changing the order of the remaining elements.

Characteristics:

Does not need to be contiguous.
Order of elements/characters must be preserved.
Length can range from 0 to the length of the sequence.

Example:

Original Array: [1, 2, 3, 4]
Possible Subsequences: [1, 3], [2, 4], [1, 2, 3, 4], [] (empty subsequence)
Original String: "hello"
Possible Subsequences: "hlo", "el", "hello", "" (empty subsequence)
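
Checking whether one sequence is a subsequence of another is itself a two-pointer scan. The sketch below uses the illustrative name is_subsequence and works for both strings and lists.

```python
def is_subsequence(candidate, sequence):
    """True if candidate can be obtained from sequence by deleting elements."""
    i = 0  # position of the next unmatched element in candidate
    for item in sequence:
        if i < len(candidate) and candidate[i] == item:
            i += 1
    return i == len(candidate)


print(is_subsequence("hlo", "hello"))         # True
print(is_subsequence([1, 3], [1, 2, 3, 4]))   # True
print(is_subsequence("leh", "hello"))         # False, the order is not preserved
print(is_subsequence("", "hello"))            # True, the empty subsequence
```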

4. Subset

Definition: A subset is any combination of elements from a set, where the order does not matter. Subsets include all possible combinations, regardless of order or contiguity.

Characteristics:

Does not need to be contiguous.
Order of elements/characters does not matter.
Length can range from 0 to the length of the set.
Includes all possible combinations of the elements.

Example:

Original Set: {1, 2, 3}
Possible Subsets: {}, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}
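
A set with n elements has 2^n subsets. As a sketch, they can be listed with itertools.combinations by choosing every possible size from 0 to n; all_subsets is an illustrative name.

```python
from itertools import combinations


def all_subsets(items):
    """Every subset of items, grouped by size from 0 up to len(items)."""
    ordered = sorted(items)
    return [set(c) for r in range(len(ordered) + 1)
            for c in combinations(ordered, r)]


subsets = all_subsets({1, 2, 3})
print(len(subsets))       # 8
print({3, 2} in subsets)  # True, order does not matter for sets
print(set() in subsets)   # True, the empty subset
```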


Python 2 vs 3 Difference

2024. 4. 12. 11:22

Here are some key differences between Python 2 and Python 3:

Division: In Python 2, dividing two integers with / performs floor division and returns an integer. In Python 3, / always performs true division and returns a float (even 4 / 2 gives 2.0); use // for floor division.

print(7 / 5)
print(-7 / 5)
'''
Output in Python 2.x:
1
-2

Output in Python 3.x:
1.4
-1.4
'''
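
For comparison, here is a short Python 3 sketch of the two division operators side by side. The // operator floors toward negative infinity, which reproduces the Python 2 integer results shown above.

```python
print(7 / 5)    # 1.4
print(4 / 2)    # 2.0 (a float, even though the result is whole)
print(7 // 5)   # 1
print(-7 // 5)  # -2 (floors toward negative infinity)
print(divmod(-7, 5))  # (-2, 3), since -7 == 5 * -2 + 3
```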

Annotations: Python 3 supports type annotations, which can help with code clarity and maintainability. Python 2 does not support type annotations.
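
A minimal sketch of what annotations look like in Python 3. The function find_index is only an illustration; the annotations document the expected types and are stored on the function, but Python does not enforce them at runtime.

```python
from typing import List, Optional


def find_index(values: List[int], target: int) -> Optional[int]:
    """Return the first index of target in values, or None if absent."""
    for i, value in enumerate(values):
        if value == target:
            return i
    return None


print(find_index([3, 5, 7], 5))  # 1
print(find_index([3, 5, 7], 4))  # None
# The annotations are available for tools such as type checkers
print(find_index.__annotations__["target"])  # <class 'int'>
```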

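Exceptions: In Python 2, the exception argument and the caught exception variable could be written after a comma (raise IOError, "file error" and except IOError, e). In Python 3, the argument goes in parentheses (raise IOError("file error")) and the caught exception is bound with as (except IOError as e).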
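
The Python 3 forms of raise and except can be sketched as follows. The function parse_port is a made-up example; the raise ... from err form, which chains the original error, is also Python 3 only.

```python
def parse_port(text):
    """Convert text to a port number, raising ValueError on bad input."""
    try:
        port = int(text)
    except ValueError as err:  # the caught exception is bound with "as"
        # The message goes in parentheses; "from err" keeps the original cause
        raise ValueError("invalid port: {!r}".format(text)) from err
    if not 0 < port < 65536:
        raise ValueError("port out of range: {}".format(port))
    return port


print(parse_port("8080"))  # 8080
try:
    parse_port("http")
except ValueError as err:
    print(err)  # invalid port: 'http'
```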

Storage of Strings: In Python 2, the default str type is a byte string (ASCII is the default encoding), and Unicode text needs the separate unicode type, written with a u prefix. In Python 3, str is Unicode by default, so you can use non-ASCII characters in string literals directly, and raw bytes use the separate bytes type.

print(type('default string'))
print(type(u'string with u prefix'))
'''
Output in Python 2.x (unicode and str are different):
<type 'str'>
<type 'unicode'>

Output in Python 3.x (unicode and str are the same):
<class 'str'>
<class 'str'>
'''

Print statement: In Python 2, print is a statement (print "hello"). In Python 3, print() is a function (print("hello")). As a function it accepts keyword arguments such as sep, end, and file, so you can control the separator, the line ending, and the output stream.
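
A short Python 3 sketch of those keyword arguments. The io.StringIO buffer stands in for any file-like object and is used here only so the result can be inspected.

```python
import io

# sep controls what goes between values, end what follows the last one
print("2024", "08", "06", sep="-")  # 2024-08-06
print("no newline", end="")
print(" ... continued")             # no newline ... continued

# file redirects the output to any object with a write() method
buffer = io.StringIO()
print("a", "b", "c", sep="-", end="!\n", file=buffer)
print(buffer.getvalue() == "a-b-c!\n")  # True
```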

Range function: In Python 2, range() builds a full list in memory, while xrange() produces its numbers lazily. In Python 3, xrange() was removed and range() behaves like the old xrange(): it is a lazy sequence that takes constant memory regardless of its length. A range is always finite; for an unbounded counter, use itertools.count().
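
A small Python 3 sketch of this behavior, assuming a 64-bit build for the very large range. The range object supports len(), indexing, and membership tests without creating a list, and itertools.count() covers the unbounded case.

```python
from itertools import count, islice

r = range(10**12)   # created instantly; no list is built
print(len(r))       # 1000000000000
print(r[5])         # 5
print(10**11 in r)  # True, computed arithmetically rather than by scanning

# range is always bounded; count() keeps going until you stop consuming it
print(list(islice(count(0, 3), 4)))  # [0, 3, 6, 9]
```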


Notes