The DNA sequence is composed of a series of nucleotides abbreviated as 'A', 'C', 'G', and 'T'. For example, "ACGAATTCCG" is a DNA sequence.

When studying DNA, it is useful to identify repeated sequences within the DNA. Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

You may return the answer in any order.

Input & Output

Example 1 — Basic Repeated Sequences
$ Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC","CCCCCAAAAA"]
💡 Note: AAAAACCCCC appears at positions 0 and 10. CCCCCAAAAA appears at positions 5 and 15. Both sequences are 10 characters long and occur more than once.
Example 2 — No Repeats
$ Input: s = "AAAAAAAAAA"
Output: []
💡 Note: The string has exactly 10 characters, so there is only one possible 10-letter sequence "AAAAAAAAAA" starting at position 0. Since it appears only once (not more than once), the result is an empty array.
Example 3 — Short String
$ Input: s = "ACGT"
Output: []
💡 Note: String is too short (4 characters) to contain any 10-letter sequences, so return empty array.

Constraints

  • 1 ≤ s.length ≤ 105
  • s[i] is either 'A', 'C', 'G', or 'T'

Visualization

Tap to expand
Repeated DNA Sequences INPUT DNA Sequence String s: A A A A A C C C C C A A A A A C C C C C C A A A A A G G G T T T Nucleotide Legend: A C G T Task: Find all 10-letter sequences that appear more than once in the DNA string Length: 32 characters ALGORITHM STEPS 1 Initialize Hash Map Create empty map to store substring counts 2 Sliding Window Extract each 10-char window from index 0 to n-10 3 Hash and Count Add substring to map, increment count each time 4 Filter Results Return substrings with count greater than 1 Hash Map State: AAAAACCCCC 2 CCCCCAAAAA 2 AAAACCCCCA 1 OK OK skip FINAL RESULT Repeated 10-letter sequences: "AAAAACCCCC" Found at index 0 and 5 "CCCCCAAAAA" Found at index 5 and 10 Output Array: ["AAAAACCCCC", "CCCCCAAAAA"] Solution Verified Both sequences appear exactly 2 times each Time: O(n) | Space: O(n) Key Insight: Using a hash map allows O(1) lookup for each 10-character substring. We slide a window of size 10 across the string, hashing each substring. When we see a substring for the second time, we add it to the result. This avoids O(n^2) brute force comparison and achieves linear time complexity. TutorialsPoint - Repeated DNA Sequences | Hash Map Approach
Asked in
LinkedIn 8 Amazon 6 Microsoft 4
87.5K Views
Medium Frequency
~25 min Avg. Time
3.2K Likes
Ln 1, Col 1
Smart Actions
💡 Explanation
AI Ready
💡 Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen