1.7「Stanford Algorithms」Merge So

作者: 墨小匠 | 来源:发表于2019-10-03 13:13 被阅读0次

Okay, so let's move on, and actually discuss the pseudo-code for the merge sort algorithm.

First, let me just tell you the pseudo-code, leaving aside exactly how the merging subroutine is implemented.

And thus, high levels should be very simple and clear at this point.

So there's gonna be two recursive calls, and then there's gonna be a merging step.

Now, I owe you a few comments, 'cause I'm being a little sloppy.

Again, as I promised, this isn't something you would directly translate into code, although it's pretty close.

But so what are the couple of the ways that I'm being sloppy? Well, first of all, there's, [inaudible], you know, in any recursive algorithm, you gotta have some base cases.

You gotta have this idea that when the input's sufficient.

Really small you don't do any recursion, you just return some trivial answer.

So in the sorting problem the base case would be if your handed an array that has either zero or an elements, well it's already sorted, there's nothing to do, so you just return it without any recursion.

Okay, so to be clear, I haven't written down the base cases.

Although of course you would if you were actually implementing, a merge short.

Some of you, make a note of that.

A couple of other things I'm ignoring.

I'm ignoring what the, what to do if the array has odd lengths, so if it has say nine elements, obviously you have to somehow break that into five and four or four and five, so you would do that just in either way and that would fine.

And then secondly, I'm ignoring the details or what it really means to sort of recursively sort, so for example, I'm not discussing exactly how you would pass these subarrays onto the recursive calls.

That's something that would really depend somewhat on what, on the programming language, so that's exactly what I want to avoid.

I really want to talk about the concepts which transcend any particular programming language implementation.

So that's why I'm going to describe algorithms at this level okay.

Alright, so the hard part relatively speaking, that is.

How do you implement the merge depth? The recursive calls have done their work.

We have these two sort of separated half the numbers.

The left half and the right half.

How do we combine them into one? And in English, I already told you on the last slide.

The idea is you just populate the output array in a sorted order, by traversing pointers or just traversing through the two, sorted sub-arrays in parallel.

So let's look at that in some more detail.

Okay, so here is the pseudo-code for the merge step.

[sound] So let me begin by, introducing some names for the, characters in the, what we're about to discuss.

So let's use C.

To denote the output array.

So this is what we're suppose to spit out with the numbers in sorted order.

And then, I'm gonna use a and b to denote the results of the two recursive calls, okay? So, the first recursive call has given us array a, which contains the left half of the input array in sorted order.

Similarly, b contains the right half of the input array, again, in sorted order.

So, as I said, we're gonna need to traverse the two, sorted sub-arrays, a and b, in parallel.

So, I'm gonna introduce a counter, i, to traverse through a, j to traverse through b.

I and j will both be initialized to one, to be at the beginning of their respective arrays.

And now we're gonna do.

We're going to do a single pass of the output array copying it in an increasing order.

Always taking the smallest from the union of the two sorted sub arrays.

And if you, if there's one idea in this merge step it's just the realization that.

The minimum element that you haven't yet looked at in A and B has to be at the front of one or the two lists right so for example at the very beginning of the algorithm where is the minimum element over all.

Well, which ever of the two arrays it lands in -- A or B -- it has to be the smallest one there okay.

So the smallest element over all is either the smallest element A or it's the smallest element B.

So you just check both places, the smaller one is the smallest you copy it over and you repeat.

That's it.

So the purpose of K is just to traverse the output array from left to right.

That's the order we're gonna populate it.

Currently looking at position I, and the first array of position J and the second array.

So that's how far we've gotten, how deeply we've probed in the both of those two arrays.

We look at which one has the current smallest, and we copy the smallest one over.

Okay? So if the, if, the entry in the i position of A is smaller, we copy that one over.

Of course, we have to increment i.

We probe one deeper into the list A, and symmeterically for the case where the current position in B has the smaller element.

Now again, I'm being a little bit sloppy, so that we can focus on the forest, and not sort of, And not get bogged down with the trees.

I'm ignoring some end cases, so if you really wanted to implement this, you'd have to add a little bit, to keep track of when you fall off, either, either A or B.

Because you have additional checks for when i or j reaches the end of the array, at which point you copy over all the remaining elements into C.

Alright, so I'm gonna give you a cleaned up version, of, that pseudo-code so that you don't have to tolerate my questionable handwriting any longer than is absolutely necessary.

This again, is just the same thing that we wrote on the last slide, okay? The pseudo-code for the merge step.

Now, so that's the Merge Sort algorithm.

Now let's get to the meaty part of this lecture, which is, okay, so merge sort produces a sorted array.

What makes it, if anything, better than much simpler non divide and conquer algorithms, like say, insertion sort? Other words, what is the running time of the merge sort algorithm? Now I'm not gonna give you a completely precise definition, definition of what I mean by running time and there's good reason for that, as we'll discuss shortly.

But intuitively, you should think of the running time of an algorithm, you should imagine that you're just running the algorithm in a debugger.

Then, every time you press enter, you advance with one line of the program through the debugger.

And then basically, the running time is just a number of operations executed, the number of lines of code executed.

So the question is, how many times you have to hit enter on the debugger before the, program finally terminates.

So we're interested in how many such, lines of code get executed for Merge Short when an input array has n numbers.

Okay, so that's a fairly complicated question.

So let's start with a more modest school.

Rather than thinking about the number of operations executed by Merge Sort, which is this crazy recursive algorithm, which is calling itself over and over and over again.

Let's just think about how many operations are gonna get executed when we do a single merge of two sorted sub arrays.

That seems like it should be an easier place to start.

So let me remind you, the pseudo code of the merge subroutine, here it is.

So let's just go and count up how many operations that are gonna get used.

So there's the initialization step.

So let's say that I'm gonna charge us one operation for each of these two initializations.

So let's call this two operations, just set i equal to one and j equal to one then we have this four loop executes a total number of end times so each of these in iterations of this four loop how many instructions get executed, well we have one here we have a comparison so we compare A(i) to B(j) and either way the comparison comes up we then do two more operations, we do an assignment.

Here or here.

And then we do an increment of the relevent variable either here or here.

So that's gonna be three operations per iteration.

And then maybe I'll also say that in order to increment K we're gonna call it a fourth iteration.

Okay? So for each of these N iterations of the four loop we're gonna do four operations.

All right? So putting it all together, what do we have is the running time for merge.

So let's see the upshot.

So the upshot is that the running time of the merge subroutine, given an array of M numbers, is at most four M plus two.

So a couple of comments.

First of all, I've changed a letter on you so don't get confused.

In the previous slide we were thinking about an input size of N.

Here I've just made it.

See I've changed the name of the variable to M.

That's gonna be convenient once we think about merge sort, which is recursing on smaller sub-problems.

But it's exactly the same thing and, and whatever.

So an array of M entries does as most four M plus two.

Lines of code.

The second thing is, there's some ambiguity in exactly how we counted lines of code on the previous slide.

So maybe you might argue that, you know, really, each loop iteration should count as two operations, not just one.

'Cause you don't just have to increment K, but you also have to compare it to the, upper bound of N.

Eh, maybe.

Would have been 5M+2 instead of 4M+2.

So it turns out these small differences in how you count up.

The number of lines of code executed are not gonna matter, and we'll see why shortly.

So, amongst friends, let's just agree, let's call it 4M plus two operations from merge, to execute on array on exactly M entries.

So, let me abuse our friendship now a little bit further with an, an inequality which is true, but extremely sloppy.

But I promise it'll make our lives just easier in some future calculations.

So rather than 4m+2, 'cause 2's sorta getting on my nerves.

Let's just call this.

Utmost six N.

Because m is at least one.

[sound] Okay, you have to admit it's true, 6MO is at least 4M plus two.

It's very sloppy, these numbers are not anything closer to each other for M large but, let's just go ahead and be sloppy in the interest of future simplicity.

Okay.

Now I don't expect anyone to be impressed with this rather crude upper bound, the number of lines of code that the merge subroutine needs to finish, to execute.

The key question you recall was how many lines of code does merge sort require to correctly sort the input array, not just this subroutine.

And in fact, analyzing Merge Sort seems a lot more intimidating, because if it keeps spawning off these recursive versions of itself.

So the number of recursive calls, the number of things we have to analyze, is blowing up exponentially as we think about various levels of the recursion.

Now, if there's one thing we have going for us, it's that every time we make a recursive call.

It's on a quite a bit smaller input then what we started with, it's on an array only half the size of the input array.

So there's some kind of tension between on the one hand explosion of sub problems, a proliferation of sub problems and the fact that successive subproblems only have to solve smaller and smaller subproblems.

And resolute resolving these two forces is what's going to drive our analysis of Merge Short.

So, the good news is, is I'll be able to show you a complete analysis of exactly how many lines of code Merge Sort takes.

And I'll be able to give you, and, in fact, a very precise upper bound.

And so here's gonna be the claim that we're gonna prove in the remainder of this lecture.

So the claim is that Merge Short never needs than more than six times N.

Times the logarithm of N log base two if you're keeping track plus an extra six N operations to correctly sort an input array of N numbers, okay so lets discuss for a second is this good is this a win, knowing that this is an upper bound of the number of lines of code the merger takes well yes it is and it shows the benefits of the divide and conquer paradigm.

Recall.

In the simpler sorting methods that we briefly discussed like insertion sort, selection sort, and bubble sort, I claimed that their performance was governed by the quadratic function of the input size.

That is they need a constant times in the squared number of operations to sort an input array of length N.

Merge sort by contrast needs at most a constant times N times log N, not N squared but N times log N lines of code to correctly sort an input array.

So to get a feel for what kind of win this is let me just remind you for those of you who are rusty, or for whatever reason have lived in fear of a logarithm, just exactly what the logarithm is.

Okay? So.

The way to think about the logarithm is as follows.

So you have the X axis, where you have N, which is going from one up to infinity.

And for comparison let's think about just the identity function, okay? So, the function which is just.

F(n)=n.

Okay, and let's contrast this with a logarithm.

So what is the logorithm? Well, for our purposes, we can just think of a logorithm as follows, okay? So the log of n, log base 2 of n is, you type the number N into your calculator, okay? Then you hit divide by two.

And then you keep repeating dividing by two and you count how many times you divide by two until you get a number that drops below one okay.

So if you plug in 32 you got to divide five times by two to get down to one.

Log base two of 32 is five.

You put in 1024 you have to divide by two, ten times till you get down to one.

So log base two of 1024 is ten and so on, okay.

So the point is you already see this if a log of a 1000 roughly is something like ten then the logarithm is much, much smaller than the input.

So graphically, what the logarithm is going to look like is it's going to look like.

A curve becomes very flat very quickly, as N grows large, okay? So F(n) being log base 2 of n.

And I encourage you to do this, perhaps a little bit more precisely on the computer or a graphing calculator, at home.

But log is running much, much, much slower than the identity function.

And as a result, sorting algorithm which runs in time proportional to n times log n is much, much faster, especially as n grows large, than a sorting algorithm with a running time that's a constant times n squared.

好的，让我们继续，然后实际讨论合并排序算法的伪代码。

首先，让我告诉您伪代码，而确切地讲如何合并子例程。

因此，此时高水平应该非常简单明了。

因此，将有两个递归调用，然后将有一个合并步骤。

现在，我欠您一些评论，因为我有点草率。

再次，正如我所承诺的那样，尽管它非常接近，但这不是您可以直接将其转换为代码的东西。

但是，我草率的几种方式是什么？好吧，首先，[听不清]，您知道，在任何递归算法中，都有一些基本情况。

您必须有这样的想法，即输入足够时。

真的很小，您不进行任何递归，只需返回一些简单的答案即可。

因此，在排序问题中，基本情况将是：如果您递交的数组具有零个或元素，那么它已经被排序了，无事可做，因此您无需进行任何递归就可以将其返回。

好的，很明显，我没有写下基本案例。

当然，如果您实际上要实现，则合并很短。

你们中的一些人，记下了这一点。

我忽略了其他几件事。

我忽略了数组具有奇数长度的情况，该怎么办，因此，如果数组具有9个元素，显然您必须以某种方式将其分解为5和4或4和5，因此您可以采用任何一种方式这样就可以了

其次，我忽略了细节或进行递归排序的真正含义，因此，例如，我没有讨论要如何将这些子数组传递给递归调用。

这实际上取决于某种语言和编程语言，所以这正是我要避免的事情。

我真的很想谈谈超越任何特定编程语言实现的概念。

这就是为什么我要在这个级别上描述算法的原因。

好吧，相对而言，最困难的部分是。

您如何实现合并深度？递归调用已完成其工作。

我们将这两种数字分开一半。

左半部分和右半部分。

我们如何将它们组合成一个整体？在上一张幻灯片中，我已经用英语告诉过您。

想法是，您可以通过遍历指针或仅并行遍历两个已排序的子数组，以有序的顺序填充输出数组。

因此，让我们更详细地了解一下。

好的，这是合并步骤的伪代码。

[声音]因此，让我开始介绍一下即将讨论的主题名称。

因此，让我们使用C。

表示输出数组。

因此，这就是我们想按排序顺序吐出的数字。

然后，我将使用a和b表示两个递归调用的结果，好吗？因此，第一个递归调用给了我们一个数组a，该数组按排序顺序包含了输入数组的左半部分。

类似地，b仍按排序顺序包含输入数组的右半部分。

因此，正如我所说，我们将需要并行遍历两个排序的子数组a和b。

因此，我要介绍一个计数器i，以遍历a，j遍历b。

I和j都将被初始化为1，位于它们各自数组的开头。

现在我们要做。

我们将对输出数组进行一次遍历，并以递增顺序复制它。

始终从两个排序的子数组的并集中取最小的一个。

而且，如果您在此合并步骤中有一个想法，那就仅仅是实现了。

您尚未在A和B中查看过的最小元素必须在一个或两个列表的最前面，例如，在算法的最开始，这里是所有元素中的最小元素。

好吧，它落入的两个阵列中的任何一个-A或B-它必须是那里最小的阵列。

因此，总体上最小的元素是最小的元素A或最小的元素B。

因此，您只需检查两个地方，较小的地方就是将其复制到最小的地方，然后重复。

而已。

因此，K的目的仅仅是从左到右遍历输出数组。

这就是我们要填充的顺序。

当前正在查看位置I，以及位置J的第一个数组和第二个数组。

这就是我们所走的路，我们对这两个数组的探究有多深。

我们查看哪一个当前最小，然后复制最小的那一个。

好的？因此，如果A的i位置中的条目较小，则将其复制过来。

当然，我们必须增加i。

我们将更深入地研究列表A，并对称地针对列表B中的当前位置具有较小元素的情况进行研究。

再说一次，我有点草率，所以我们可以专注于森林，而不是那种，而不会陷入树木的泥潭。

我忽略了一些最终情况，因此，如果您真的想实现此目的，则必须添加一些内容，以跟踪A或B何时脱落。

因为您需要额外检查i或j何时到达数组的末尾，此时您会将所有剩余的元素复制到C中。

好了，因此，我将为您提供该伪代码的清理版本，以使您不必忍受超出绝对必要时间的我可疑的笔迹。

同样，这与我们在上一张幻灯片中写的一样，好吗？合并步骤的伪代码。

现在，这就是合并排序算法。

现在，让我们进入本讲座的重点，好吧，所以归并排序产生一个排序数组。

是什么使它比简单的非分而治之算法（如插入排序）更好呢？换句话说，合并排序算法的运行时间是多少？现在，我不会给您一个完全精确的定义，即运行时间的定义，这有充分的理由，我们将在稍后进行讨论。

但是从直觉上讲，您应该考虑算法的运行时间，应该想象您只是在调试器中运行算法。

然后，每按一次Enter键，便通过调试器前进一行程序。

然后基本上，运行时间只是执行的操作数，执行的代码行数。

所以问题是，在程序最终终止之前，您必须在调试器上按回车几次。

因此，我们对输入数组有n个数字时为Merge Short执行多少行这样的代码感兴趣。

好的，这是一个相当复杂的问题。

因此，让我们从一个更谦虚的学校开始。

而不是考虑由Merge Sort执行的操作数，Merge Sort是这种疯狂的递归算法，它会一遍又一遍地调用自己。

让我们考虑一下当我们对两个排序的子数组进行一次合并时将执行多少个操作。

似乎应该更容易开始。

让我提醒您，这里是合并子例程的伪代码。

因此，让我们去计算一下将要使用的操作数。

因此，有初始化步骤。

因此，假设我要为这两个初始化的每个操作收取一个操作。

因此，我们将这两个操作称为“ i”，将“ i”设为“ 1”，“ j”等于“ 1”，然后使这四个循环执行总的结束时间，因此在这四个循环的迭代中，每个执行了多少条指令，这里有一个比较，因此我们将A（i）与B（j）进行比较，无论哪种方式出现，我们都会再进行两次运算，然后进行赋值。

在这里或这里。

然后，我们在此处或此处对relevent变量进行增量。

这样，每次迭代将需要进行三个操作。

然后也许我也要说，为了增加K，我们将其称为第四次迭代。

好的？因此，对于这四个循环的N个迭代中的每一个，我们将执行四个操作。

好吧？因此，将所有内容放在一起，就是合并的运行时间。

因此，让我们看看结果。

因此，结果是，给定M个数的数组，merge子例程的运行时间最多为4个M加2。

所以有几点评论。

首先，我给您写了一封信，所以不要感到困惑。

在上一张幻灯片中，我们正在考虑输入大小N。

在这里，我就做到了。

请参阅我已将变量名称更改为M。

一旦我们考虑合并排序（这将在较小的子问题上重复出现），那将很方便。

但这是完全一样的，而且，无论如何。

因此，由M个条目组成的数组最多可包含四个M加两个。

代码行。

第二件事是，在上一张幻灯片中我们如何计算代码行数方面存在一些歧义。

因此，也许您可能会争辩说，您知道，实际上，每个循环迭代应计为两个操作，而不仅仅是一个。

因为不仅要增加K，还必须将其与N的上限进行比较。

嗯，也许吧。

应该是5M + 2而不是4M + 2。

因此，事实证明您在计算方式上存在这些细微的差异。

执行的代码行数无关紧要，我们很快就会明白为什么。

因此，在朋友当中，让我们同意吧，我们称其为4M加上合并中的两个操作，以便在M个条目上的数组上执行。

所以，现在让我进一步滥用我们的友谊，这是不平等的，但确实很草率。

但我保证，在以后的一些计算中，这将使我们的生活更加轻松。

因此，不是4m + 2，而是2引起了我的不安。

我们就这样称呼它。

最多六个N。

因为m至少为1。

[声音]好的，您必须承认这是事实，6MO至少为4M加2M。

这很草率，对于M大来说，这些数字彼此之间并不太接近，但是，为了未来的简单性，让我们继续草率地做些草率。

好的。

现在，我不希望任何人会对这个相当粗糙的上限（执行合并子例程需要完成的代码行数）印象深刻。

您回想的关键问题是合并排序需要多少行代码才能对输入数组进行正确排序，而不仅仅是此子例程。

实际上，对Merge Sort的分析似乎更令人生畏，因为如果它不断产生于它们自己的这些递归版本中，那么它就会变得更加令人生畏。

因此，当我们考虑递归的各个级别时，递归调用的数量（我们必须分析的事物）呈指数级增长。

现在，如果有一件事情要我们去做，那就是每次我们进行递归调用时。

它的输入要比我们开始时小很多，它的输入数组只有输入数组大小的一半。

因此，一方面是子问题的激增，子问题的泛滥与连续子问题只需要解决越来越小的子问题之间存在某种张力。

坚决解决这两种力量将推动我们对“合并短缺”的分析。

因此，好消息是，我将向您展示Merge Sort需要多少行代码的完整分析。

而且，我将能够为您提供一个非常精确的上限。

因此，这就是我们将在本讲座的其余部分中证明的主张。

因此，声称“合并短”的需求永远不会超过N的六倍。

如果您要跟踪，则对N个对数以2为底的对数加另外6个N运算，以正确地对N个数字的输入数组进行排序，好吧，让我们再讨论一下，这是一次胜利，知道这是一个胜利是的，合并很好地达到了代码行数的上限，它显示了分而治之范式的好处。

召回。

在我们简要讨论过的更简单的排序方法中，例如插入排序，选择排序和冒泡排序，我声称它们的性能由输入大小的二次函数决定。

也就是说，它们需要操作次数平方的常数来对长度为N的输入数组进行排序。

通过对比合并排序最多需要一个恒定的时间N倍N，而不是N平方，而是N倍N代码行来正确排序输入数组。

因此，让我想一想这是一场什么样的胜利，让我提醒那些生锈的人，或者出于某种原因而生活在对数中的人们，正是对数。

好的？所以。

考虑对数的方式如下。

因此，您拥有X轴，您拥有N轴，该轴从一个到无限大。

为了进行比较，让我们考虑一下身份函数，好吗？所以，这只是功能。

F（n）＝ n。

好的，让我们将此与对数进行对比。

那么，徽标是什么？好吧，出于我们的目的，我们可以想到以下徽标，好吗？因此，n的对数（n的对数以2为底）是，您在计算器中键入数字N，好吗？然后您将除以二。

然后您继续重复除以2，然后计算除以2的次数，直到得到的数字降到1以下。

因此，如果您插入32，则必须将五分之二除以1。

32的对数底数是5。

您输入1024，则必须除以2、10倍，直到下降到1。

因此，以1024为底的对数二是10，依此类推，好的。

因此，要点是，如果1000的对数大约为10，则对数远比输入小得多。

因此，从图形上看，对数看起来像是对数。

随着N变大，曲线很快变得非常平坦，好吗？因此F（n）是n的对数底数2。

我鼓励您这样做，也许可以在家中计算机或图形计算器上更精确一些。

但是日志运行的速度比身份功能慢得多。

结果，与运行时间为常数乘以n平方的排序算法相比，按n倍于log n的时间运行的排序算法要快得多，尤其是当n变大时。

1.7「Stanford Algorithms」Merge So

相关文章

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读