How to remove non-word characters in JavaScript?

To remove non-word characters in JavaScript, call replace() with a regular expression that matches the unwanted characters and substitute an empty string for each match. Non-word characters are everything other than ASCII letters, digits, and underscores: symbols, punctuation, whitespace, and also accented or non-Latin letters.

What are Non-Word Characters?

A non-word character is any character that does not match the \w pattern, which is equivalent to [A-Za-z0-9_]. Its complement, \W, matches symbols like !@#$%^&*(), punctuation, whitespace, and non-ASCII letters such as é. It does not match letters (a-z, A-Z), digits (0-9), or the underscore (_).
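
To make the definition concrete, here is a minimal sketch that sorts the characters of a sample string into word and non-word groups. The sample string and variable names are made up for illustration.

```javascript
// Classify each character of a sample string using \w.
// '\u00E9' is the precomposed letter é.
const sample = 'a_9 \u00E9!-';
const wordChars = [];
const nonWordChars = [];

for (const ch of sample) {
    if (/\w/.test(ch)) {
        wordChars.push(ch);      // ASCII letter, digit, or underscore
    } else {
        nonWordChars.push(ch);   // everything else, including é and the space
    }
}

console.log(wordChars);    // [ 'a', '_', '9' ]
console.log(nonWordChars); // [ ' ', 'é', '!', '-' ]
```

Note that é lands in the non-word group, because \w only covers ASCII letters.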

Using a Simple Regex Pattern

The most straightforward approach uses the \W pattern with the global (g) flag, which matches every non-word character, including spaces:

<html>
<body>
<script>
function removeNonWordChars(str) {
    if (!str) return '';
    return str.replace(/\W/g, '');
}

let text = 'Hello! @World# 123$%^';
let cleaned = removeNonWordChars(text);
document.write('Original: ' + text + '<br>');
document.write('Cleaned: ' + cleaned);
</script>
</body>
</html>
Original: Hello! @World# 123$%^
Cleaned: HelloWorld123

Preserving Spaces and Hyphens

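If you want to keep spaces and hyphens while removing other non-word characters, use a negated character class that lists what to keep. The pattern [^\w\s-] matches any character that is not a word character, whitespace (spaces, tabs, line breaks), or a hyphen: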

<html>
<body>
<script>
function removeNonWordKeepSpaces(str) {
    if (!str) return '';
    // Keep word characters, spaces, and hyphens
    return str.replace(/[^\w\s-]/g, '');
}

let text = 'Tutorix is the ~!@^&";\'/>#$%*()+`={}[]|\\:<.,best e-learning platform';
let cleaned = removeNonWordKeepSpaces(text);
document.write('Original: ' + text + '<br>');
document.write('Cleaned: ' + cleaned);
</script>
</body>
</html>
Original: Tutorix is the ~!@^&";'/>#$%*()+`={}[]|\:<.,best e-learning platform
Cleaned: Tutorix is the best e-learning platform
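
When a removed symbol sits between two spaces, both spaces survive and the result contains a double space. The sketch below shows one possible way to tidy that up. The helper name removeNonWordAndTidy is hypothetical, and collapsing whitespace is an extra step that goes beyond the example above.

```javascript
// Hypothetical helper: strip symbols, then tidy the spaces left behind.
function removeNonWordAndTidy(str) {
    if (!str) return '';
    return str
        .replace(/[^\w\s-]/g, '')  // keep word characters, whitespace, hyphens
        .replace(/\s+/g, ' ')      // collapse each whitespace run into one space
        .trim();                   // drop leading and trailing whitespace
}

console.log(removeNonWordAndTidy('  Rock & Roll!  ')); // "Rock Roll"
```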

Advanced Pattern with Unicode Support

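Because \w is ASCII-only, the patterns above also strip accented letters such as é. To keep the accented letters used in Western European languages, list the allowed characters explicitly, including the Latin-1 range \xC0-\xFF without the × and ÷ signs at \xD7 and \xF7: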

<html>
<body>
<script>
function removeNonWordAdvanced(str) {
    if (!str) return '';
    // Keep space (\x20), hyphen (\x2D), digits, ASCII letters, underscore (\x5F),
    // and the accented Latin-1 letters (\xC0-\xFF except × and ÷)
    const pattern = /[^\x20\x2D0-9A-Z\x5Fa-z\xC0-\xD6\xD8-\xF6\xF8-\xFF]/g;
    return str.replace(pattern, '');
}

let text = 'Café & Résumé! @#$% 123';
let cleaned = removeNonWordAdvanced(text);
document.write('Original: ' + text + '<br>');
document.write('Cleaned: ' + cleaned);
</script>
</body>
</html>
Original: Café & Résumé! @#$% 123
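Cleaned: Café Résumé 123

Note that the returned string contains two consecutive spaces wherever a removed symbol sat between spaces; the browser collapses them into one when rendering the output.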
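
The Latin-1 ranges do not cover scripts such as Cyrillic, Greek, or CJK. As an alternative, the following sketch uses Unicode property escapes (\p{L} for letters, \p{N} for numbers), which require the u flag and an ES2018-capable engine. The function name removeNonWordUnicode is an assumption for illustration, not part of the examples above.

```javascript
// Sketch: keep letters and digits from any script, plus underscore,
// space, and hyphen. The `u` flag is required for \p{...} escapes.
function removeNonWordUnicode(str) {
    if (!str) return '';
    return str.replace(/[^\p{L}\p{N}_ -]/gu, '');
}

console.log(removeNonWordUnicode('Привет, мир! 123')); // "Привет мир 123"
console.log(removeNonWordUnicode('東京 2024!'));        // "東京 2024"
```

Letters written with separate combining marks fall under \p{M} rather than \p{L}, so text in decomposed form may need normalize('NFC') first.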

Comparison of Methods

Method        Pattern                  Preserves Spaces   Letters Kept
Simple        /\W/g                    No                 ASCII only
With Spaces   /[^\w\s-]/g              Yes                ASCII only
Advanced      Custom character class   Yes                ASCII plus Latin-1 accented letters
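
To see the differences side by side, this sketch runs the three patterns on one input. The input string and variable names are chosen here for illustration.

```javascript
// Same input, three patterns. '\u00E9' is the precomposed letter é.
const input = 'Caf\u00E9 & e-mail #42!';

const simple = input.replace(/\W/g, '');
const keepSpaces = input.replace(/[^\w\s-]/g, '');
const latin1 = input.replace(
    /[^\x20\x2D0-9A-Z\x5Fa-z\xC0-\xD6\xD8-\xF6\xF8-\xFF]/g, '');

console.log(simple);     // "Cafemail42"
console.log(keepSpaces); // "Caf  e-mail 42"  (é removed, double space left)
console.log(latin1);     // "Café  e-mail 42" (é kept, double space left)
```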

Common Use Cases

Removing non-word characters is useful for:

  • Sanitizing user input for usernames or IDs
  • Cleaning text data for processing
  • Creating URL-friendly strings
  • Preparing text for search or comparison
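
As an illustration of the URL-friendly use case, here is a minimal sketch of a slug helper built on the same pattern. The name toSlug and the exact rules (lowercasing, hyphens as separators) are assumptions, not a standard.

```javascript
// Hypothetical helper for illustration; not a full slug library.
function toSlug(title) {
    if (!title) return '';
    return title
        .toLowerCase()
        .replace(/[^\w\s-]/g, '')   // drop symbols and punctuation
        .trim()
        .replace(/[\s_]+/g, '-')    // whitespace and underscores become hyphens
        .replace(/-+/g, '-')        // collapse repeated hyphens
        .replace(/^-+|-+$/g, '');   // strip hyphens at either end
}

console.log(toSlug('Hello, World! 10 Tips & Tricks')); // "hello-world-10-tips-tricks"
```

Because \w is ASCII-only, accented letters are dropped by this sketch rather than transliterated.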

Conclusion

Use /\W/g to strip everything except ASCII letters, digits, and underscores, or /[^\w\s-]/g to also keep whitespace and hyphens. If the text contains accented or non-Latin letters, extend the character class explicitly, because \w treats them as non-word characters.

Updated on: 2026-03-15T23:18:59+05:30
