How to remove non-word characters in JavaScript?
To remove non-word characters in JavaScript, call the string's replace() method with a regular expression that matches them and an empty string as the replacement. A non-word character is anything other than an ASCII letter, a digit, or an underscore: symbols, punctuation, and whitespace.
What are Non-Word Characters?
Non-word characters are those that do not match the \w pattern, which in JavaScript is equivalent to [A-Za-z0-9_]. The complementary pattern \W therefore matches symbols such as !@#$%^&*(), punctuation, and whitespace. Because \w covers only ASCII letters, \W also matches accented and non-Latin letters such as é.
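As a quick check of that definition, the sketch below probes a handful of characters against \w. The helper name isWordChar and the sample list are illustrative assumptions, not part of any library.

```javascript
// Hypothetical helper: true when the single character ch matches \w,
// i.e. it is an ASCII letter, a digit, or an underscore.
function isWordChar(ch) {
  return /^\w$/.test(ch);
}

// '\u00E9' is the precomposed letter é.
const samples = ['a', 'Z', '7', '_', ' ', '-', '!', '\u00E9'];
for (const ch of samples) {
  console.log(JSON.stringify(ch), isWordChar(ch) ? 'word' : 'non-word');
}
```

The first four samples are reported as word characters; the space, hyphen, exclamation mark, and é are reported as non-word.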
Using Simple Regex Pattern
The most straightforward approach uses the \W pattern, which matches any non-word character:
<html>
<body>
<script>
function removeNonWordChars(str) {
  if (!str) return '';
  return str.replace(/\W/g, '');
}
let text = 'Hello! @World# 123$%^';
let cleaned = removeNonWordChars(text);
document.write('Original: ' + text + '<br>');
document.write('Cleaned: ' + cleaned);
</script>
</body>
</html>
Original: Hello! @World# 123$%^
Cleaned: HelloWorld123
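The g (global) flag on the pattern matters: without it, replace() stops after the first match. A minimal sketch of the difference, using a made-up input string:

```javascript
const input = 'a!b@c#';

// Without the g flag only the first non-word character is removed.
const onlyFirst = input.replace(/\W/, '');

// With the g flag every non-word character is removed.
const all = input.replace(/\W/g, '');

console.log(onlyFirst); // ab@c#
console.log(all);       // abc
```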
Preserving Spaces and Hyphens
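To keep spaces and hyphens while removing other non-word characters, use a negated character class that lists everything to keep. Note that \s matches all whitespace, including tabs and line breaks, not only spaces: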
<html>
<body>
<script>
function removeNonWordKeepSpaces(str) {
  if (!str) return '';
  // Keep word characters, whitespace, and hyphens
  return str.replace(/[^\w\s-]/g, '');
}
// The embedded single quote and backslash are escaped so the string literal parses
let text = 'Tutorix is the ~!@^&";\'/>#$%*()+`={}[]|\\:<.,best e-learning platform';
let cleaned = removeNonWordKeepSpaces(text);
document.write('Original: ' + text + '<br>');
document.write('Cleaned: ' + cleaned);
</script>
</body>
</html>
Original: Tutorix is the ~!@^&";'/>#$%*()+`={}[]|\:<.,best e-learning platform
Cleaned: Tutorix is the best e-learning platform
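When a removed symbol sat between two spaces, the cleaned string keeps both spaces. A browser hides this when the text is rendered as HTML, but the extra whitespace is still in the string. One possible follow-up step, sketched here with the hypothetical helper name removeAndCollapse, is to collapse whitespace runs and trim the ends:

```javascript
// Hypothetical helper: remove non-word characters (keeping whitespace and
// hyphens), then collapse each run of whitespace into a single space.
function removeAndCollapse(str) {
  if (!str) return '';
  return str
    .replace(/[^\w\s-]/g, '') // drop symbols and punctuation
    .replace(/\s+/g, ' ')     // collapse whitespace runs
    .trim();                  // remove leading and trailing whitespace
}

console.log(removeAndCollapse('Hello & World!  @2024')); // Hello World 2024
```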
Advanced Pattern with Unicode Support
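To keep accented Latin letters, list the allowed characters explicitly. The pattern below keeps spaces, hyphens, digits, ASCII letters, underscores, and the Latin-1 accented letters (U+00C0 to U+00FF, excluding × and ÷). It does not cover letters outside that range, such as Greek, Cyrillic, or CJK characters: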
<html>
<body>
<script>
function removeNonWordAdvanced(str) {
  if (!str) return '';
  // Keep space (\x20), hyphen (\x2D), digits, ASCII letters, underscore (\x5F),
  // and Latin-1 accented letters (\xC0-\xFF except \xD7 and \xF7)
  const pattern = /[^\x20\x2D0-9A-Z\x5Fa-z\xC0-\xD6\xD8-\xF6\xF8-\xFF]/g;
  return str.replace(pattern, '');
}
let text = 'Café & Résumé! @#$% 123';
let cleaned = removeNonWordAdvanced(text);
document.write('Original: ' + text + '<br>');
document.write('Cleaned: ' + cleaned);
</script>
</body>
</html>
Original: Café & Résumé! @#$% 123
Cleaned: Café Résumé 123
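The hand-written range above stops at U+00FF. As an alternative technique (not the one used in the example above), engines that implement ES2018 Unicode property escapes can match letters and numbers in any script with \p{L} and \p{N} under the u flag. The sketch below assumes such an engine, and the function name removeNonWordUnicode is hypothetical:

```javascript
// Assumption: the runtime supports Unicode property escapes (ES2018).
function removeNonWordUnicode(str) {
  if (!str) return '';
  // Keep letters (\p{L}), combining marks (\p{M}), numbers (\p{N}),
  // underscores, whitespace, and hyphens; remove everything else.
  return str.replace(/[^\p{L}\p{M}\p{N}_\s-]/gu, '');
}

console.log(removeNonWordUnicode('Łódź, 東京 #123!')); // Łódź 東京 123
```

Including \p{M} keeps combining accents, so letters written in decomposed form (a base letter followed by a separate accent mark) are not stripped of their accents.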
Comparison of Methods
| Method | Pattern | Preserves Spaces | Letters Kept |
|---|---|---|---|
| Simple | /\W/g | No | ASCII only |
| With Spaces | /[^\w\s-]/g | Yes | ASCII only |
| Advanced | Custom character class | Yes | ASCII plus Latin-1 accented letters |
Common Use Cases
Removing non-word characters is useful for:
- Sanitizing user input for usernames or IDs
- Cleaning text data for processing
- Creating URL-friendly strings
- Preparing text for search or comparison
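As an illustration of the URL-friendly use case, here is a minimal sketch that builds a slug from a title. The name toSlug and the exact rules (lowercase, hyphen separators) are assumptions made for this example. Because it relies on \w, it drops accented letters rather than transliterating them.

```javascript
// Hypothetical helper: turn a title into a lowercase, hyphen-separated slug.
function toSlug(title) {
  if (!title) return '';
  return title
    .toLowerCase()
    .replace(/[^\w\s-]/g, '')  // drop non-word characters, keep whitespace and hyphens
    .trim()
    .replace(/[\s_-]+/g, '-')  // turn runs of whitespace, underscores, hyphens into one hyphen
    .replace(/^-+|-+$/g, '');  // strip leading and trailing hyphens
}

console.log(toSlug('How to remove non-word characters in JavaScript?'));
// how-to-remove-non-word-characters-in-javascript
```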
Conclusion
Use /\W/g for simple non-word character removal, or /[^\w\s-]/g to preserve whitespace and hyphens. Both treat accented letters as non-word characters, so list those letters explicitly in the character class when they must be kept.
