Given a list paths of directory info, including the directory path, and all the files with contents in this directory, return all the duplicate files in the file system in terms of their paths.
You may return the answer in any order.
A group of duplicate files consists of at least two files that have the same content.
A single directory info string in the input list has the following format:
"root/d1/d2/.../dm f1.txt(f1_content) f2.txt(f2_content) ... fn.txt(fn_content)"
It means there are n files (f1.txt, f2.txt ... fn.txt) with content (f1_content, f2_content ... fn_content) respectively in the directory "root/d1/d2/.../dm".
Note that n >= 1 and m >= 0. If m = 0, it means the directory is just the root directory.
The output is a list of groups of duplicate file paths. For each group, it contains all the file paths of the files that have the same content.
A file path is a string that has the following format: "directory_path/file_name.txt"
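To make the input format concrete, here is a minimal sketch of splitting one directory info string into (path, content) pairs. It assumes the layout "directory file_name(content) file_name(content) ...", with single spaces as separators and no spaces inside file names or contents. The names FileEntry, parse_dir_info, PATH_CAP, CONTENT_CAP and MAX_ENTRIES are hypothetical and chosen only for this illustration.

```c
#include <stdio.h>
#include <string.h>

#define PATH_CAP 256      /* hypothetical limits, chosen for this sketch */
#define CONTENT_CAP 128
#define MAX_ENTRIES 64

typedef struct {
    char path[PATH_CAP];        /* "directory_path/file_name" */
    char content[CONTENT_CAP];  /* text between '(' and ')' */
} FileEntry;

/* Splits one directory info string into (path, content) pairs.
   Returns the number of entries written to out (at most cap).
   Malformed or oversized tokens are skipped. */
int parse_dir_info(const char *info, FileEntry *out, int cap) {
    const char *sp = strchr(info, ' ');
    if (!sp) return 0;                        /* no files listed */
    size_t dir_len = (size_t)(sp - info);
    int n = 0;
    const char *p = sp;
    while (*p && n < cap) {
        while (*p == ' ') p++;                /* skip separators */
        if (!*p) break;
        size_t tok_len = strcspn(p, " ");     /* token runs to next space or end */
        const char *lpar = memchr(p, '(', tok_len);
        const char *rpar = lpar
            ? memchr(lpar, ')', tok_len - (size_t)(lpar - p))
            : NULL;
        if (lpar && rpar) {
            size_t name_len = (size_t)(lpar - p);
            size_t content_len = (size_t)(rpar - lpar - 1);
            if (dir_len + 1 + name_len < PATH_CAP && content_len < CONTENT_CAP) {
                /* Build "directory/file_name" */
                snprintf(out[n].path, PATH_CAP, "%.*s/%.*s",
                         (int)dir_len, info, (int)name_len, p);
                memcpy(out[n].content, lpar + 1, content_len);
                out[n].content[content_len] = '\0';
                n++;
            }
        }
        p += tok_len;
    }
    return n;
}
```

Calling parse_dir_info("root/a 1.txt(abcd) 2.txt(efgh)", entries, MAX_ENTRIES) fills two entries, pairing "root/a/1.txt" with "abcd" and "root/a/2.txt" with "efgh".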
The key insight is to use a hash map whose keys are file contents and whose values are lists of file paths, so files with identical content land in the same list. Parse each directory string to extract file paths and contents, then return only the groups that contain more than one file. The best approach is hash map grouping, with Time: O(n) and Space: O(n), taking n as the total size of the input and assuming constant-time hash map operations on average.
Common Approaches
Brute Force Comparison
⏱️ Time: O(n²)
Space: O(n)
Parse all files first, then compare each file's content with every other file's content to identify duplicates. Group files with matching content together.
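A rough sketch of the brute-force comparison, assuming the files have already been parsed into an array of content strings. The function name group_brute_force and the leader array are hypothetical; they are just one way to record which files belong together.

```c
#include <string.h>

/* For each file i, stores in leader[i] the smallest index whose content
   equals contents[i]. Files that share a leader form one group.
   Returns the number of groups that contain at least two files.
   Performs O(n^2) string comparisons in the worst case. */
int group_brute_force(const char *contents[], int n, int leader[]) {
    int dup_groups = 0;
    for (int i = 0; i < n; i++) leader[i] = -1;   /* -1 means "not grouped yet" */
    for (int i = 0; i < n; i++) {
        if (leader[i] != -1) continue;            /* already placed in a group */
        leader[i] = i;
        int size = 1;
        for (int j = i + 1; j < n; j++) {
            if (leader[j] == -1 && strcmp(contents[i], contents[j]) == 0) {
                leader[j] = i;
                size++;
            }
        }
        if (size >= 2) dup_groups++;
    }
    return dup_groups;
}
```

The paths of one duplicate group are then the paths of all files with the same leader index, provided that index appears at least twice.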
Hash Map Grouping
⏱️ Time: O(n)
Space: O(n)
Parse all directory strings to extract files, then use a hash map in which the content is the key and the list of file paths is the value. Files with the same content are automatically grouped together.
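The grouping idea can be sketched with a small chained hash table keyed by content. This is an illustration under stated assumptions rather than the page's own implementation: ContentMap, Group, PathNode, map_add and the other names are hypothetical, the bucket count is fixed, and the map stores the caller's string pointers without copying them, so those strings must outlive the map.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BUCKETS 1024  /* hypothetical table size for this sketch */

typedef struct PathNode {
    const char *path;
    struct PathNode *next;
} PathNode;

typedef struct Group {
    const char *content;   /* key */
    PathNode *paths;       /* value: list of paths with this content */
    int count;
    struct Group *next;    /* next group in the same bucket */
} Group;

typedef struct {
    Group *buckets[BUCKETS];
} ContentMap;

/* FNV-1a, a simple 32-bit string hash. */
static uint32_t hash_str(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

void map_init(ContentMap *m) {
    for (int i = 0; i < BUCKETS; i++) m->buckets[i] = NULL;
}

/* Returns the group stored under content, or NULL if there is none. */
const Group *map_find(const ContentMap *m, const char *content) {
    const Group *g = m->buckets[hash_str(content) % BUCKETS];
    while (g && strcmp(g->content, content) != 0) g = g->next;
    return g;
}

/* Adds path to the group for content, creating the group if needed.
   Returns the new size of that group, or -1 if allocation fails. */
int map_add(ContentMap *m, const char *content, const char *path) {
    uint32_t b = hash_str(content) % BUCKETS;
    Group *g = m->buckets[b];
    while (g && strcmp(g->content, content) != 0) g = g->next;
    if (!g) {
        g = malloc(sizeof *g);
        if (!g) return -1;
        g->content = content;
        g->paths = NULL;
        g->count = 0;
        g->next = m->buckets[b];
        m->buckets[b] = g;
    }
    PathNode *node = malloc(sizeof *node);
    if (!node) return -1;
    node->path = path;
    node->next = g->paths;   /* prepend; any output order is acceptable */
    g->paths = node;
    return ++g->count;
}

/* Counts the groups that hold at least two paths. */
int count_duplicate_groups(const ContentMap *m) {
    int dups = 0;
    for (int i = 0; i < BUCKETS; i++)
        for (const Group *g = m->buckets[i]; g; g = g->next)
            if (g->count >= 2) dups++;
    return dups;
}

/* Releases every node the map allocated and resets the buckets. */
void map_free(ContentMap *m) {
    for (int i = 0; i < BUCKETS; i++) {
        Group *g = m->buckets[i];
        while (g) {
            Group *next_group = g->next;
            for (PathNode *p = g->paths; p; ) {
                PathNode *next_path = p->next;
                free(p);
                p = next_path;
            }
            free(g);
            g = next_group;
        }
        m->buckets[i] = NULL;
    }
}
```

With a reasonable hash and enough buckets, each map_add call does a constant number of string comparisons on average, which is where the O(n) bound comes from. The answer is every group whose count is at least two.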
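Algorithm Steps
1. For each directory info string, split on spaces: the first token is the directory path, and every remaining token has the form file_name(content).
2. For each file token, take the text before "(" as the file name and the text between the parentheses as the content, then build the full path as directory_path/file_name.
3. Append the full path to the group stored under that content.
4. Return every group that holds at least two paths.
Code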
solution.c — C
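#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_FILES 1000
#define MAX_PATH_LEN 500
#define MAX_CONTENT_LEN 100

// One group per distinct file content
typedef struct {
    char content[MAX_CONTENT_LEN];
    char files[MAX_FILES][MAX_PATH_LEN];
    int count;
} ContentGroup;

// Note: this table reserves roughly 500 MB of static storage
// (MAX_FILES groups x MAX_FILES paths x MAX_PATH_LEN bytes)
static ContentGroup groups[MAX_FILES];
static int groupCount = 0;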
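// Parses an array of quoted strings such as ["a","b"] into result
void parseString(const char* input, char result[][MAX_PATH_LEN], int* count) {
    *count = 0;
    int len = (int)strlen(input);
    // Drop the trailing newline left by fgets, plus any trailing spaces
    while (len > 0 && (input[len - 1] == '\n' || input[len - 1] == '\r' || input[len - 1] == ' ')) len--;
    int i = 1; // Skip opening bracket
    while (i < len - 1 && *count < MAX_FILES) { // Stop before closing bracket
        // Skip whitespace and commas
        while (i < len - 1 && (input[i] == ' ' || input[i] == ',')) i++;
        if (i >= len - 1) break;
        // Skip opening quote
        if (input[i] == '"') i++;
        // Read until closing quote
        int start = i;
        while (i < len && input[i] != '"') i++;
        // Copy the string, truncating anything too long for the buffer
        int length = i - start;
        if (length >= MAX_PATH_LEN) length = MAX_PATH_LEN - 1;
        memcpy(result[*count], &input[start], length);
        result[*count][length] = '\0';
        (*count)++;
        // Skip closing quote
        if (i < len && input[i] == '"') i++;
    }
}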
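// Returns the index of the group with this content, or -1 if none exists.
// This is a linear scan over all groups, not a hashed lookup, so each call
// costs one string comparison per existing group.
int findContentGroup(const char* content) {
    for (int i = 0; i < groupCount; i++) {
        if (strcmp(groups[i].content, content) == 0) {
            return i;
        }
    }
    return -1;
}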
void addToGroup(const char* content, const char* filepath) {
    int groupIdx = findContentGroup(content);
    if (groupIdx == -1) {
        if (groupCount >= MAX_FILES) return; // Group table is full
        // Create new group
        strcpy(groups[groupCount].content, content);
        strcpy(groups[groupCount].files[0], filepath);
        groups[groupCount].count = 1;
        groupCount++;
    } else if (groups[groupIdx].count < MAX_FILES) {
        // Add to existing group
        strcpy(groups[groupIdx].files[groups[groupIdx].count], filepath);
        groups[groupIdx].count++;
    }
}
void solution(char paths[][MAX_PATH_LEN], int pathCount) {
    groupCount = 0;
    for (int p = 0; p < pathCount; p++) {
        char* pathInfo = paths[p];
        // Find first space to separate directory from files
        char* firstSpace = strchr(pathInfo, ' ');
        if (!firstSpace) continue;
        // Extract directory
        char directory[MAX_PATH_LEN];
        int dirLen = (int)(firstSpace - pathInfo);
        memcpy(directory, pathInfo, dirLen);
        directory[dirLen] = '\0';
        // Process files
        char* current = firstSpace + 1;
        while (*current) {
            // Skip spaces
            while (*current == ' ') current++;
            if (!*current) break;
            // Find the file info end (next space or end of string)
            char* nextSpace = strchr(current, ' ');
            char fileInfo[MAX_PATH_LEN];
            if (nextSpace) {
                int len = (int)(nextSpace - current);
                memcpy(fileInfo, current, len);
                fileInfo[len] = '\0';
                current = nextSpace + 1;
            } else {
                strcpy(fileInfo, current);
                current += strlen(current);
            }
            // Parse filename and content
            char* parenStart = strchr(fileInfo, '(');
            if (!parenStart) continue;
            char filename[MAX_PATH_LEN];
            char content[MAX_CONTENT_LEN];
            // Extract filename
            int filenameLen = (int)(parenStart - fileInfo);
            memcpy(filename, fileInfo, filenameLen);
            filename[filenameLen] = '\0';
            // Extract content (skip opening paren, stop at closing paren)
            char* parenEnd = strchr(parenStart + 1, ')');
            if (!parenEnd) continue;
            int contentLen = (int)(parenEnd - parenStart - 1);
            // Skip content that would overflow the buffer
            if (contentLen >= MAX_CONTENT_LEN) continue;
            memcpy(content, parenStart + 1, contentLen);
            content[contentLen] = '\0';
            // Create full path (snprintf never writes past the buffer)
            char fullPath[MAX_PATH_LEN];
            snprintf(fullPath, sizeof(fullPath), "%s/%s", directory, filename);
            // Add to content group
            addToGroup(content, fullPath);
        }
    }
}
int main(void) {
    char input[10000];
    // Treat missing input as an empty list
    if (!fgets(input, sizeof(input), stdin)) {
        printf("[]\n");
        return 0;
    }
    // Static storage keeps this 500 KB array off the stack
    static char paths[MAX_FILES][MAX_PATH_LEN];
    int pathCount;
    parseString(input, paths, &pathCount);
    solution(paths, pathCount);
    // Print only the groups that contain at least two files
    printf("[");
    int first = 1;
    for (int i = 0; i < groupCount; i++) {
        if (groups[i].count >= 2) {
            if (!first) printf(",");
            first = 0;
            printf("[");
            for (int j = 0; j < groups[i].count; j++) {
                if (j > 0) printf(",");
                printf("\"%s\"", groups[i].files[j]);
            }
            printf("]");
        }
    }
    printf("]\n");
    return 0;
}
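Time & Space Complexity
Time Complexity: O(n), linear growth. This bound is for hash map grouping; the C listing above locates a group with a linear scan, so its running time also grows with the number of distinct contents.
Space Complexity: O(n), linear space.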