Duplicate Invoice Detector - Problem

You work in the accounting department and need to detect potential duplicate invoices to prevent double payments. You're given a list of invoices, where each invoice contains three fields:

  • amount - the invoice amount (decimal)
  • vendor - the vendor name (string)
  • date - the invoice date (string in YYYY-MM-DD format)

Two invoices are considered potential duplicates if they match on any two of the three fields. For example:

  • Same amount + same vendor (different dates)
  • Same amount + same date (different vendors)
  • Same vendor + same date (different amounts)

Return a list of all invoices that have at least one potential duplicate in the dataset. Each invoice should appear only once in the result, even if it matches multiple other invoices.

Note: Each invoice is represented as [amount, vendor, date] where amount is a float.

Input & Output

Example 1 — Basic Duplicate Detection
$ Input: invoices = [[1000.0, "Apple", "2024-01-15"], [1000.0, "Google", "2024-01-15"], [500.0, "Apple", "2024-01-16"]]
Output: [[1000.0, "Apple", "2024-01-15"], [1000.0, "Google", "2024-01-15"]]
💡 Note: First two invoices share same amount ($1000) and same date (2024-01-15), so both are flagged as potential duplicates. Third invoice shares no 2+ fields with others.
Example 2 — Multiple Duplicate Groups
$ Input: invoices = [[800.0, "Microsoft", "2024-01-10"], [800.0, "Microsoft", "2024-01-11"], [600.0, "Oracle", "2024-01-10"], [700.0, "Oracle", "2024-01-10"]]
Output: [[800.0, "Microsoft", "2024-01-10"], [800.0, "Microsoft", "2024-01-11"], [600.0, "Oracle", "2024-01-10"], [700.0, "Oracle", "2024-01-10"]]
💡 Note: First two invoices match on amount+vendor. Last two invoices match on vendor+date. All four invoices have duplicates.
Example 3 — No Duplicates Found
$ Input: invoices = [[1000.0, "Apple", "2024-01-15"], [2000.0, "Google", "2024-01-16"], [1500.0, "Microsoft", "2024-01-17"]]
Output: []
💡 Note: No two invoices share 2+ matching fields, so no duplicates detected.

Constraints

  • 1 ≤ invoices.length ≤ 1000
  • -106 ≤ amount ≤ 106
  • 1 ≤ vendor.length ≤ 50
  • date is in YYYY-MM-DD format

Visualization

Tap to expand
INPUT$1000, Apple, 2024-01-15$1000, Google, 2024-01-15$500, Apple, 2024-01-16Find invoices sharing2+ matching fields:• Amount + Vendor• Amount + Date• Vendor + DateALGORITHM1Create FingerprintsGenerate field pair keys2Hash Map LookupO(1) fingerprint collision check3Mark DuplicatesFlag matching invoicesHash Map Example:"$1000_2024-01-15" → collision!Both invoices are duplicatesRESULTDuplicate Invoices:$1000, Apple, 2024-01-15$1000, Google, 2024-01-15✓ Found 2 potential duplicatesThird invoice has no matchesKey Insight:Field pair fingerprints enable O(1) duplicate detection instead of O(n²) brute force comparisonTutorialsPoint - Duplicate Invoice Detector | Hash Map Fingerprints
Asked in
Google 35 Amazon 28 Microsoft 22 Apple 18
23.4K Views
Medium Frequency
~15 min Avg. Time
892 Likes
Ln 1, Col 1
Smart Actions
💡 Explanation
AI Ready
💡 Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen