Introduction
Welcome to this article on the Longest Increasing Subsequence (LIS) problem.
The LIS problem consists of finding the longest subsequence in a given sequence such that all elements of the subsequence are sorted in increasing order.
In this article, we will discuss the concept of subproblem formulation and how to combine them, the recurrence relation and memoization technique, the code for a bottom-up dynamic programming solution in topological order, the problem graph for the LIS problem, and the time complexity of the solution.
Subproblem formulation and how to combine them
The variables for the subproblems in the longest increasing subsequence (LIS) problem are 2 which are:
- The index of the element in the list
- The value of the element at that index.
These variables are used to determine the length of the LIS at a given index in the list.
The DP[i][j]
array would represent the LIS length at index i in the list, where j is the value of the element at that index.
For example, consider the list [10, 9, 2, 5, 3, 7, 101, 18].
In a DP solution to the LIS problem, the DP[i][j]
array would represent the LIS length at each index i
in the list, where j
is the value of the element at that index.
For the first element (index 0, value 10), DP[0][10] would be 1
, since there is only one element in the list.
For the second element (index 1, value 9), DP[1][9] would also be 1
, since there are no elements in the list that are less than 9 and come after it.
In general, the i
dimension in DP[i][j]
represents the index of the element in the list, and the j
dimension represents the value of the element at that index.
These dimensions are used to determine the LIS length at a given index in the list.
The LIS_at_index
function uses recursive calls to calculate the LIS length at a given index in the list.
The LIS_at_index
function is called recursively in a loop, where the loop iterates over the previous elements in the list. For each previous element, the LIS_at_index
function is called to calculate the LIS length at that index. This value is then used to update the LIS length at the current index, if necessary.
The recurrence relation and Memoization concept
The recurrence relation for the LIS problem can then be defined as:
DP[i][j]=1 ,i=0
DP[i][j]=max(DP[i][j],DP[k][j]+1 , j>k,for all k<i
In this recurrence relation, the LIS length at index i
is initially set to 1, since there is at least one element in the list.
The LIS length at index i
is then updated to be the maximum of the previously calculated LIS lengths at indices less than i
plus 1, if the value of the element at index i
is greater than the value of the element at those previous indices.
This recurrence relation can be used to calculate the LIS length at each index in the list, which can then be used to determine the overall solution to the LIS problem.
In the case of the LIS problem, the memo
dictionary is used to store the LIS lengths at each index in the list. When the LIS_at_index
function is called on a given index, it first checks the memo
dictionary to see if the LIS length at that index has already been calculated.
If it has, the stored value is used instead of recalculating it. This can help to improve the performance of the algorithm by avoiding repeated calculations.
In summary, the memo
dictionary in the solution to the LIS problem is used to implement memoization.
Code for the Bottom-up DP formulation in topological order
Problem graph for the LIS problem
The problem graph for the longest increasing subsequence (LIS) problem can be represented as a directed acyclic graph (DAG), which is constructed as follows:
1. Each element is added as a node
2. For each element node, add outgoing edges with a weight of 1 to all the greater elements after it
3. Add node B that has outgoing edge of with a weight 0 to all element nodes.
4. Add node E that has incoming edges with a weight 0 from all the element nodes.
So for the example array of [1,5,4,8], we have:
The Longest Increasing Subsequence solution is now the longest path from B to E.
Time complexity for the solution
The LIS_at_index
function, which is used to calculate the LIS length at a given index in the list, has a time complexity of O(n)
, since it loops over the previous elements in the list and calls itself recursively on each of them. Since this function is called on each element in the list, the overall time complexity of the algorithm is O(n²), since O(n) * O(n) = O(n²).
The memo
dictionary, which is used to store the previously calculated LIS lengths, does not affect the time complexity of the algorithm, since it is only used to look up previously calculated values and does not involve any additional calculations.
In summary, the solution to the LIS problem provided earlier has a time complexity of O(n²), where n
is the length of the list.
This means that the algorithm’s performance will scale quadratically with the size of the input, which means that it may not be efficient for very large inputs.
Conclusion:
In this article, we have learned about the Longest Increasing Subsequence (LIS) problem and its dynamic programming solutions.
The LIS problem has various applications in modern solutions, such as data analysis, machine learning, and natural language processing.
I hope that this article has provided a clear understanding of the LIS problem and its dynamic programming solutions, and that you will be able to apply these concepts to solve similar problems and create innovative solutions in the future.
Thank you for reading!
You can refer to this GitHub repository for the codes used: