244. Shortest Word Distance II (M)

Design a class which receives a list of words in the constructor, and implements a method that takes two words word1 and word2 and return the shortest distance between these two words in the list. Your method will be called repeatedly many times with different parameters.

Example:
Assume that words = ["practice", "makes", "perfect", "coding", "makes"].

Input: word1 = “coding”, word2 = “practice”
Output: 3
Input: word1 = "makes", word2 = "coding"
Output: 1
Note:
You may assume that word1 does not equal to word2, and word1 and word2 are both in the list.


我的答案:offline把每个word的index数组都存起来,然后online的时候找最近

class WordDistance {
private:
    unordered_map<string, vector<int>> word2index;
public:
    WordDistance(vector<string>& words) {
        int len = words.size();
        string word;
        for (int i=0; i<len; ++i) {
            word = words[i];
            if (word2index.find(word) == word2index.end())
                word2index[word] = {i};
            else
                word2index[word].push_back(i);
        }
    }
    
    int shortest(string word1, string word2) {
        int ans = INT_MAX;
        
        vector<int> indexes1 = word2index[word1];
        vector<int> indexes2 = word2index[word2];
        int len1 = indexes1.size();
        int len2 = indexes2.size();
        
        int i1 = 0, i2 = 0;
        while (i1 < len1 and i2 < len2) {
            int ans_temp = abs(indexes1[i1] - indexes2[i2]);
            if (ans_temp < ans) { //check 
                ans = ans_temp;
            }
            
            if (indexes1[i1] < indexes2[i2])
                ++i1;
            else
                ++i2;
        }
        
        return ans;
    }
};

/**
 * Your WordDistance object will be instantiated and called as such:
 * WordDistance* obj = new WordDistance(words);
 * int param_1 = obj->shortest(word1,word2);
 */

Runtime: 96 ms, faster than 5.76% of C++ online submissions for Shortest Word Distance II.
Memory Usage: 20.3 MB, less than 25.08% of C++ online submissions for Shortest Word Distance II.

很慢!
看了下其他答案和标准答案,意思是一样的,可以改进的地方是,constructor里面不需要判断是否存在words[i],直接push_back就行了


Solution

Before looking at the solution for this problem, let's look at what the problem asks us to do in simpler terms. We have to design a class which receives a list of words as input in the constructor. The class has a function which we need to implement and that function is shortest which takes two words as input and returns the minimum distance between the two as the output.

When the problem talks about the distance between two words, it essentially means the absolute gap between the indices of the two words in the list. For e.g. if the first word occurs at a location i and the second word occurs at the location j, then the distance between the two would be abs(i - j).

The question asks us to find the minimum such different between words which clearly indicates that the words can occur at multiple locations. If we have K occurrences for the word1 and L occurrences for the word2, then iteratively checking every pair of indices will give us a <math><semantics><annotation encoding="application/x-tex">O(N^2)</annotation></semantics></math>O(N2) algorithm which won't be optimal at all. We won't discuss that algorithm here since it is very straightforward.

The brute-force algorithm would simple consider all possible pairs of indices for (word1_location, word2_location) and see which one produces the minimum distance. Let's try and build on this idea and see if some pre-processing can help us out reduce the complexity of the brute-force algorithm.


Approach 1: Using Preprocessed Sorted Indices

Intuition

A given word can occur multiple times in the original word list. Let's suppose the first word, word1 in the input to the function shortest occurs at the indices [i1, i2, i3, i4] in the original list. Similarly, let's assume that the second word, word2, appears at the following locations inside the word list [j1, j2, j3].

Now, given these list of indices, we are to simply find the pair of indices (i, j) such that their absolute difference is minimum.

The main idea for this approach is that if the list of these indices is in sorted order, we can find such a pair in linear time.

The idea is to use a two pointer approach. Let's say we have a pointer i for the sorted list of indices of word1 and j for the sorted list of indices of word2. At every iteration, we record the difference of indices i.e. abs(word1[i] - word2[j]). Once we've done that, we have two possible choices for progressing the two pointers.

<pre style="box-sizing: border-box; font-family: SFMono-Regular, Consolas, "Liberation Mono", Menlo, Courier, monospace; font-size: 13px; margin-top: 0px; margin-bottom: 1em; overflow: auto; background: var(--dark-bg) !important; border-color: rgb(51, 51, 51); color: rgb(170, 170, 170); padding: 10px 15px; line-height: 1.6; border-radius: 3px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">word1[i] < word2[j]
</pre>

If this is the case, that means there is no point in moving the j pointer forward. The location indices for the words are in a sorted order. We know that word2[j + 1] > word2[j] because these indices are sorted. So, if we move j forward, then the difference abs(word1[i] - word2[j + 1]) would be even greater than abs(word1[i] - word2[j]). That doesn't help us since we want to find the minimum possible distance (difference) overall.

So, if we have (word1[i] < word2[j]), we move the pointer 'i' one step forward i.e. (i + 1) in the hopes that abs(word1[i + 1] - word2[j]) would give us a lower distance than abs(word1[i] - word2[j]). We say "hopes" because it is not certain this improvement would happen.

Let's look at two different examples. In the first example we will see that moving i forward gave us the best difference overall (0). In the second example we see that moving i forward leads us to our second case (yet to discuss) but doesn't lead to any improvement in the difference.

Example-1

word1_locations = [2,4,5,9]
word2_locations = [4,10,11]

i, j = 0, 0
min_diff = 2 (abs(2 - 4))
word1[i] < word2[j] i.e. 2 < 4
  move i one step forward

i, j = 1, 0 (abs(4 - 4))
min_diff = 0 (We hit the jackpot!)  

Example-2

word1_locations = [2,7,15,16]
word2_locations = [4,10,11]

i, j = 0, 0
min_diff = 2 (abs(2 - 4))
word1[i] < word2[j] i.e. 2 < 4
  move i one step forward

i, j = 1, 0
min_diff = 2 (2 < abs(7 - 4))

Here, we did not update out global minimum difference.
That is why we said earlier, moving 'i' forward may or
may not give a lower difference. But moving 'j' forward in
our case would definitely worsen the difference (or keep it same!).

Let's move onto our second scenario.

<pre style="box-sizing: border-box; font-family: SFMono-Regular, Consolas, "Liberation Mono", Menlo, Courier, monospace; font-size: 13px; margin-top: 0px; margin-bottom: 1em; overflow: auto; background: var(--dark-bg) !important; border-color: rgb(51, 51, 51); color: rgb(170, 170, 170); padding: 10px 15px; line-height: 1.6; border-radius: 3px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration-style: initial; text-decoration-color: initial;">word1[i] > word2[j]
</pre>

If this is the case, that means there is no point in moving the i pointer forward. We know that word1[i + 1] > word2[j] because these indices are sorted. So, if we move i forward, then the difference abs(word1[i + 1] - word2[j]) would be even greater than abs(word1[i] - word2[j]). That doesn't help us since we want to find the minimum possible distance (difference) overall.

So, along the similar lines of thought as the previous case, if we have (word1[i] > word2[j]), we move the pointer 'j' one step forward i.e. (j + 1) in the hopes that abs(word1[i] - word2[j + 1]) would give us a lower distance than abs(word1[i] - word2[j]). We say "hopes" because as showcased in the previous scenario, it is not certain this improvement would happen.

Now let's formally look at the algorithm for solving this problem.

Algorithm

  1. In the constructor of the class, we simply iterate over the given list of words and prepare a dictionary, mapping a word to all it's locations in the array.
  2. Since we process all the words from left to right, we will get all the indices in a sorted order by default for all the words. So, we don't have to sort the indices ourselves.
  3. Let's call the dictionary that we build, locations.
  4. For a given pair of words, obtain the list of indices (appearances inside the original list/array of words). Let's call the two arrays loc1 and loc2.
  5. Initialize two pointer variables l1 = 0 and l2 = 0.
  6. For a given l1 and l2, we first update (if possible) the minimum difference (distance) till now i.e. dist = min(dist, abs(loc1[l1] - loc2[l2])). Then, we check if loc1[l1] < loc2[l2] and if this is the case, we move l1 one step forward i.e. l1 = l1 + 1. Otherwise, we move l2 one step forward i.e. l2 = l2 + 1.
  7. We keep doing this until all the elements in the smaller of the two location arrays are processed.
  8. Return the global minimum distance between the words.

<center style="box-sizing: border-box; color: rgb(170, 170, 170); font-family: -apple-system, system-ui, "Segoe UI", "PingFang SC", "Hiragino Sans GB", "Microsoft YaHei", "Helvetica Neue", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol"; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(40, 40, 40); text-decoration-style: initial; text-decoration-color: initial;">
image

</center>

This represents the locations dictionary that we should build given the original words list in the constructor. The key represents the word and the value is a list containing indices in ascending order of occurrences throughout the array. Let's look at the minimum distance between the words apple and football in the array. So, we will be considering the two sorted lists of indices: [3, 6, 8, 12] and [2, 7, 9].

<center style="box-sizing: border-box; color: rgb(170, 170, 170); font-family: -apple-system, system-ui, "Segoe UI", "PingFang SC", "Hiragino Sans GB", "Microsoft YaHei", "Helvetica Neue", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol"; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(40, 40, 40); text-decoration-style: initial; text-decoration-color: initial;">

image.png

13 / 14

</center>

<iframe src="https://leetcode.com/playground/9kJr9mNg/shared" frameborder="0" width="100%" height="500" name="9kJr9mNg" style="box-sizing: border-box; margin: 20px 0px; color: rgb(170, 170, 170); font-family: -apple-system, system-ui, "Segoe UI", "PingFang SC", "Hiragino Sans GB", "Microsoft YaHei", "Helvetica Neue", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol"; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(40, 40, 40); text-decoration-style: initial; text-decoration-color: initial;"></iframe>

Complexity analysis

  • Time complexity : The time complexity of the constructor of our class is <math><semantics><annotation encoding="application/x-tex">O(N)</annotation></semantics></math>O(N) considering there were <math><semantics><annotation encoding="application/x-tex">N</annotation></semantics></math>N words in the original list. We iterate over them and prepare a mapping from key to list of indices as described before. Then, for the function that finds the minimum distance between the two words, the complexity would be <math><semantics><annotation encoding="application/x-tex">O(max(K, L))</annotation></semantics></math>O(max(K,L)) where <math><semantics><annotation encoding="application/x-tex">K</annotation></semantics></math>K and <math><semantics><annotation encoding="application/x-tex">L</annotation></semantics></math>L represent the number of occurrences of the two words. However, <math><semantics><annotation encoding="application/x-tex">K = O(N)</annotation></semantics></math>K=O(N) and also <math><semantics><annotation encoding="application/x-tex">L = O(N)</annotation></semantics></math>L=O(N). Therefore, the overall time complexity would also be <math><semantics><annotation encoding="application/x-tex">O(N)</annotation></semantics></math>O(N). The reason the complexity is <math><semantics><annotation encoding="application/x-tex">O(max(K, L))</annotation></semantics></math>O(max(K,L)) and not <math><semantics><annotation encoding="application/x-tex">O(min(K, L))</annotation></semantics></math>O(min(K,L)) is because of the scenario where the minimum element of the smaller list is larger than all the elements of the larger list. In that scenario, the pointer for the smaller list will not progress at all and the one for the longer list will reach to the very end.
  • Space complexity: <math><semantics><annotation encoding="application/x-tex">O(N)</annotation></semantics></math>O(N) for the dictionary that we prepare in the constructor. The keys represent all the unique words in the input and the values represent all of the indices from <math><semantics><annotation encoding="application/x-tex">0 ... N</annotation></semantics></math>0...N.
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 211,376评论 6 491
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 90,126评论 2 385
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 156,966评论 0 347
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 56,432评论 1 283
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 65,519评论 6 385
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 49,792评论 1 290
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 38,933评论 3 406
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 37,701评论 0 266
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 44,143评论 1 303
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 36,488评论 2 327
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 38,626评论 1 340
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 34,292评论 4 329
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 39,896评论 3 313
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 30,742评论 0 21
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 31,977评论 1 265
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 46,324评论 2 360
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 43,494评论 2 348

推荐阅读更多精彩内容