一、问题描述
给定一个字符串 s 和一个字符串数组 words,找出 s 中所有恰好由 words 中所有单词串联形成的子串的起始索引。
注意:words 中的单词可以以任意顺序串联。
示例 1:
1 2
| 输入:s = "barfoothefoobarman", words = ["foo","bar"] 输出:[0,9]
|
示例 2:
1 2
| 输入:s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"] 输出:[]
|
示例 3:
1 2
| 输入:s = "barfoofoobarthefoobarman", words = ["bar","foo","the"] 输出:[6,9,12]
|
二、核心概念
2.1 问题分析
这道题的核心是滑动窗口 + 哈希表的组合应用。
关键观察:
words 中的所有单词长度相同
- 需要找到所有包含
words 中所有单词的子串
2.2 滑动窗口思想
滑动窗口是一种常用的双指针技巧,用于在一维数据结构中维护一个动态的窗口。
2.3 哈希表统计
使用哈希表(字典)来统计 words 中每个单词的出现次数。
三、解题思路
3.1 滑动窗口优化法
优化思路:
- 单词长度相同,记为
word_len
- 窗口大小固定为
window_len = word_len * len(words)
- 根据起始位置对
word_len 取模,将问题分解为 word_len 个子问题
四、代码实现
4.1 方法一:滑动窗口(推荐)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
| from collections import Counter from typing import List
class Solution: def findSubstring(self, s: str, words: List[str]) -> List[int]: if not s or not words: return [] word_len = len(words[0]) word_count = len(words) window_len = word_len * word_count s_len = len(s) if s_len < window_len: return [] target_cnt = Counter(words) result = [] for offset in range(word_len): left = offset right = offset curr_cnt = Counter() matched = 0 while right + word_len <= s_len: word = s[right:right+word_len] right += word_len if word in target_cnt: curr_cnt[word] += 1 if curr_cnt[word] == target_cnt[word]: matched += 1 while right - left > window_len: left_word = s[left:left+word_len] left += word_len if left_word in target_cnt: if curr_cnt[left_word] == target_cnt[left_word]: matched -= 1 curr_cnt[left_word] -= 1 if matched == len(target_cnt): result.append(left) return result
|
4.2 方法二:滑动窗口(简化版)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
| from collections import Counter from typing import List
class Solution: def findSubstring(self, s: str, words: List[str]) -> List[int]: if not s or not words: return [] word_len = len(words[0]) word_count = len(words) window_len = word_len * word_count s_len = len(s) if s_len < window_len: return [] target_cnt = Counter(words) result = [] for i in range(s_len - window_len + 1): curr_cnt = Counter() valid = True for j in range(word_count): word = s[i + j * word_len : i + (j + 1) * word_len] if word not in target_cnt: valid = False break curr_cnt[word] += 1 if curr_cnt[word] > target_cnt[word]: valid = False break if valid: result.append(i) return result
|
五、复杂度分析
| 方法 |
时间复杂度 |
空间复杂度 |
| 滑动窗口(推荐) |
O(n × k) |
O(m × k) |
| 滑动窗口(简化版) |
O(n × m × k) |
O(m × k) |
六、测试用例
1 2 3 4 5 6 7 8 9 10 11 12 13 14
| s = "barfoothefoobarman" words = ["foo", "bar"]
s = "wordgoodgoodgoodbestword" words = ["word", "good", "best", "word"]
s = "barfoofoobarthefoobarman" words = ["bar", "foo", "the"]
|