DNA Sequence Analyzer - Problem
You are given a DNA sequence represented as a string containing only the characters 'A', 'T', 'G', and 'C', and an integer k. Find all contiguous substrings of length k that appear more than once in the sequence. Occurrences may overlap.
For each repeated subsequence, return:
- The subsequence string
- Its frequency (number of occurrences)
- All starting positions (0-indexed, in ascending order) where it appears
Return format: An array of objects, where each object has three properties: sequence (string), frequency (integer), and positions (array of integers).
Note: Return results sorted by frequency in descending order, then by sequence lexicographically if frequencies are equal.
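One way to approach this, offered as a minimal sketch rather than a reference solution: slide a window of width k across the string and record each window's starting index in a hash map keyed by the substring. The function name find_repeated_sequences is hypothetical, and Python dictionaries stand in for the "objects" in the return format.

```python
from collections import defaultdict


def find_repeated_sequences(dna: str, k: int) -> list[dict]:
    """Return every length-k substring of dna that occurs more than once."""
    positions = defaultdict(list)
    # There are len(dna) - k + 1 windows; overlapping occurrences all count.
    for start in range(len(dna) - k + 1):
        positions[dna[start:start + k]].append(start)

    repeated = [
        {"sequence": seq, "frequency": len(starts), "positions": starts}
        for seq, starts in positions.items()
        if len(starts) > 1
    ]
    # Highest frequency first; ties broken lexicographically by sequence.
    repeated.sort(key=lambda item: (-item["frequency"], item["sequence"]))
    return repeated
```

For instance, find_repeated_sequences("ATATAT", 2) yields 'AT' with frequency 3 at positions [0, 2, 4], followed by 'TA' with frequency 2 at positions [1, 3]. Because each window is sliced out as a new string, this sketch runs in O(n·k) time for a string of length n; a rolling hash could reduce that, at the cost of extra complexity.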
Input & Output
Example 1 — Basic Repeated Patterns
Input:
dna = "AGATCGATCGA", k = 3
Output:
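[{"sequence":"ATC","frequency":2,"positions":[2,6]},{"sequence":"CGA","frequency":2,"positions":[4,8]},{"sequence":"GAT","frequency":2,"positions":[1,5]},{"sequence":"TCG","frequency":2,"positions":[3,7]}]
💡 Note:
Four patterns repeat: 'GAT' at positions 1 and 5, 'ATC' at positions 2 and 6, 'TCG' at positions 3 and 7, and 'CGA' at positions 4 and 8. All have frequency 2, so they are sorted lexicographically: ATC, CGA, GAT, TCG.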
Example 2 — Single Character Repeats
Input:
dna = "AAAAAAAAAA", k = 2
Output:
[{"sequence":"AA","frequency":9,"positions":[0,1,2,3,4,5,6,7,8]}]
💡 Note:
In a string of 10 A's, the pattern 'AA' appears at every position from 0 to 8, giving it a frequency of 9.
Example 3 — No Repeated Patterns
Input:
dna = "ATCG", k = 2
Output:
[]
💡 Note:
The patterns are 'AT', 'TC', and 'CG', and each appears only once. No pattern has a frequency greater than 1, so the result is an empty array.
Constraints
- 1 ≤ dna.length ≤ 10^4
- 1 ≤ k ≤ dna.length
- dna contains only characters 'A', 'T', 'G', 'C'
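If a solution should reject malformed input rather than assume it is well formed, the constraints above could be checked up front. The following is only a sketch: validate_input and VALID_BASES are hypothetical names, and the default max_length of 10_000 is an assumption about the intended upper bound on the sequence length.

```python
VALID_BASES = frozenset("ATGC")


def validate_input(dna: str, k: int, max_length: int = 10_000) -> None:
    """Raise ValueError if dna or k violates the stated constraints."""
    # max_length is an assumed bound; adjust it if the real limit differs.
    if not 1 <= len(dna) <= max_length:
        raise ValueError(f"dna length must be between 1 and {max_length}")
    if not 1 <= k <= len(dna):
        raise ValueError("k must be between 1 and len(dna)")
    invalid = set(dna) - VALID_BASES
    if invalid:
        raise ValueError(f"dna contains invalid characters: {sorted(invalid)}")
```

Calling validate_input("ATCG", 2) returns without error, while validate_input("ATXG", 2) raises a ValueError naming the unexpected character.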