Sep 13, 2024
For a Long Time, I Misunderstood Amortized Analysis
For a long time, I thought amortized analysis is simply "given a sequence of costs, take their average". It was yesterday that I realized there is a subtle flaw there.
A bit on why I wrote this post: previously I thought that if we have costs, where of them is but the rest is , then the amortized cost will be . After some discussions, I realized that this is not true, as we will see in this post.
Let's go with a semi-formal definition of the true amortized analysis. Suppose we have a sequence of operations, where the -th operation's cost is . If the amortized cost is , then for any (and inputs), . Well, looks similar to what I had before, no?
To better understand the context, amortized analysis typically appears in online algorithms. In online algorithms, we can think of the operations as events, and when an event arrive, we (typically) have to deal with it immediately. A classic example is Online Connectivity, where given a graph of vertices, we have to deal with possible events: , , and . In online setting, when arrives, we have to immediately output whether and are connected or not.
In any case, there is an important point of an online algorithm: it does not know the future. It does not even know whether there will be any new events or not. The latter will be crucial to understand what went wrong with my previous understanding of amortized analysis. Now, suppose that we have the following two sequences of costs:
and
Following my previous understanding, both will have amortized cost of .
But, does it make sense?
Recall that in online algorithms, the algorithm do not know the future. Hence, during the first event, the algorithm cannot know whether it is the last event, or there will be any new events. Thus, if the first event's cost is , there must be an execution where, it is the only event, and the amortized cost is . To put a more extreme example, imagine having events, where the first one's cost is but the rest is . Then, for simplicity, let's assume that for any possible events, the cost sequence will be like that. By my previous understanding, the amortized cost will be . But in reality, there must exists execution where, based on my previous understanding, the amortized cost will be (i.e. only the first event arrive). Hence, saying it as is wrong.
What went wrong is that I forgot the for any part. So, rather than
it should be
In other words, if the amortized cost is , then at any point in time, the current average cost must be .
In a way, this is why in the accounting method and potential method we do not want to have a debt. Simply because we do not know whether we will be able pay it or not.
Remark: technically, we have to consider all executions (i.e. all possible sequences) rather than a single execution. In that case, the statement would be "take the maximum average from all possible sequences". (Note that this would also capture the maximum prefix average, as prefix is also a sequence.) I refrain from explaining that way because when I was taught of amortized analysis, I was given an example of a sequence, in which the subtle part that the prefix is also a possible sequence is omitted. Which, in turn, gives me an illusion that we just need to take the average. In particular, it's not immediately clear to me that amortized analysis implies we cannot have huge terms at the start. (Thanks athin for pointing this out.)
Coming back to why I wrote this post: previously I thought that if we have costs, where of them is but the rest is , then the amortized cost will be . However, when we talk about amortization, order matters, thus I cannot say that the amortized cost will be if the "expensive thing" can happen upfront.
Note to self: if I have to teach amortized analysis, perhaps I have to show a "negative" example where just taking average on the whole sequence is incorrect. (Previously I was taught just via "positive" examples, e.g. how array doubling is analyzed.)