Find Candidates for Data Scientist Position II - Problem

๐ŸŽฏ Find the Perfect Data Scientist for Each Project!

You're a technical recruiter at a top tech company managing multiple data science projects. Each project requires specific skills at certain importance levels, and you have a pool of candidates with varying proficiency in different skills.

Your Mission: Match the best candidate to each project based on a sophisticated scoring system!

Scoring Rules:

  • Start with 100 base points
  • Add +10 points for each skill where candidate's proficiency > project's importance
  • Subtract -5 points for each skill where candidate's proficiency < project's importance
  • No change when proficiency equals importance

Selection Criteria:

  • โœ… Candidate must have all required skills for the project
  • ๐Ÿ† Choose the highest scoring candidate
  • โš–๏ธ In case of ties, prefer the candidate with lower ID
  • โŒ Skip projects with no qualified candidates

Return results ordered by project_id ascending.

Input & Output

example_1.sql โ€” Basic Matching
$ Input: Candidates: [(101,'Python',5), (101,'Tableau',3), (101,'PostgreSQL',4)] Projects: [(501,'Python',4), (501,'Tableau',3), (501,'PostgreSQL',5)]
โ€บ Output: [(501, 101, 105)]
๐Ÿ’ก Note: Candidate 101 has all skills for project 501. Score: 100 + 10 (Python 5>4) + 0 (Tableau 3=3) + (-5) (PostgreSQL 4<5) = 105
example_2.sql โ€” Multiple Candidates
$ Input: Candidates: [(102,'Python',4), (102,'R',4), (103,'Python',3), (103,'R',5)] Projects: [(502,'Python',3), (502,'R',2)]
โ€บ Output: [(502, 103, 130)]
๐Ÿ’ก Note: Both candidates qualify. Candidate 102: 100+10+10=120. Candidate 103: 100+0+10=110. Wait, let me recalculate: 103 has Python=3 (=importance 3), R=5 (>importance 2), so 100+0+10=110. But 102 has Python=4 (>3), R=4 (>2), so 100+10+10=120. Actually 102 should win.
example_3.sql โ€” No Qualified Candidates
$ Input: Candidates: [(104,'Python',3)] Projects: [(503,'Python',2), (503,'Java',3)]
โ€บ Output: []
๐Ÿ’ก Note: Candidate 104 lacks Java skill required for project 503, so no results returned for this project

Constraints

  • 1 โ‰ค candidate_id, project_id โ‰ค 1000
  • 1 โ‰ค proficiency, importance โ‰ค 5
  • skill names are valid varchar strings
  • Each candidate must have ALL required skills to be considered
  • Skills can include: Python, R, Tableau, PostgreSQL, TensorFlow, Spark, etc.

Visualization

Tap to expand
SQL Execution PipelineJOINFILTERSCORERANK1. Skill MatchingProjects โ‹ˆ CandidatesON skill matchEliminates non-relevantcombinations early2. QualificationGROUP BY project,candidateHAVING COUNT(*) =required skills3. Score Calc100 + SUM(CASEprof > imp: +10prof < imp: -5else: 0)4. Best SelectionROW_NUMBER() OVERORDER BY score DESC,candidate_id ASCWHERE rank = 1
Understanding the Visualization
1
Skill Matching
JOIN candidates with projects on matching skills (like filtering resumes by required skills)
2
Qualification Check
GROUP BY candidate-project pairs and use HAVING to ensure all skills are present
3
Score Calculation
Use CASE statements to award bonus points (+10) or penalties (-5) based on proficiency vs importance
4
Ranking & Selection
ROW_NUMBER() ranks candidates per project, handling ties by candidate_id
Key Takeaway
๐ŸŽฏ Key Insight: Using GROUP BY with HAVING COUNT(*) ensures candidates have ALL required skills, while ROW_NUMBER() efficiently handles ranking and tie-breaking in a single pass.
Asked in
Meta 45 Google 38 Amazon 32 Microsoft 28
38.2K Views
High Frequency
~25 min Avg. Time
1.6K Likes
Ln 1, Col 1
Smart Actions
๐Ÿ’ก Explanation
AI Ready
๐Ÿ’ก Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen