How to Test a URL for a 404 Error in PHP?


PHP: PHP (a recursive acronym for "PHP: Hypertext Preprocessor") is a widely used open-source server-side scripting language designed for web development. It was originally created by Rasmus Lerdorf in 1994 and has since evolved into a powerful language used by millions of developers worldwide.

PHP is primarily used to develop dynamic web pages and web applications. It allows developers to embed PHP code within HTML, making it easy to mix server-side logic with the presentation layer. PHP scripts are executed on the server, and the resulting HTML is sent to the client's browser.

There are several ways to test a URL for a 404 error in PHP. This article covers three of them:

  • Using file_get_contents

  • Using get_headers and strpos with stream_context_create

  • Using curl_exec with CURLOPT_NOBODY

Using file_get_contents to test a URL for a 404 error in PHP

<?php
   function isUrlValid($url) {
      // Return the response body even for 4xx/5xx statuses instead of failing
      $context = stream_context_create(['http' => ['ignore_errors' => true]]);
      // Fetch the URL content; @ silences the warning raised if the request itself fails
      $content = @file_get_contents($url, false, $context);
      // No response at all (DNS failure, refused connection, timeout)
      if ($content === false || empty($http_response_header)) {
         return false;
      }
      // Check every status line (one per redirect hop) for a 404 code
      foreach ($http_response_header as $header) {
         if (preg_match('#^HTTP/\S+\s+404\b#', $header)) {
            return false; // URL returns a 404 error
         }
      }
      return true; // URL responded and did not return 404
   }
   // Usage
   $url = "http://example.com";
   if (isUrlValid($url)) {
      echo "URL is valid.";
   } else {
      echo "URL is invalid or returns a 404 error.";
   }
?>

In this approach, the isUrlValid function takes a URL as a parameter. It creates a stream context with ignore_errors set to true. By default, file_get_contents returns false and raises a warning when the server answers with a failure status such as 404. With ignore_errors enabled, it returns the response body anyway, so the status line can be inspected.

The file_get_contents function is then used to fetch the content of the URL, passing the stream context as the third argument. The function returns the content as a string, or false if the request itself fails (for example, because the host name cannot be resolved). In that case no headers are available, and isUrlValid returns false.

The response headers are stored in the $http_response_header variable, which the HTTP wrapper creates automatically in the local scope after the request. PHP 8.4 also provides the http_get_last_response_headers() function for the same purpose.

The function then iterates through the headers and uses preg_match to look for a status line with the code 404, such as "HTTP/1.1 404 Not Found" or "HTTP/1.0 404 Not Found". Matching any protocol version matters because a search for the literal string "HTTP/1.1 404" would miss servers that answer with a different version. If such a line is found, the function returns false.

If no 404 status line is found, the function returns true. This only means that the URL responded with something other than 404; other errors, such as 500, still count as valid here.

You can replace "http://example.com" with the URL you want to test. Keep in mind that file_get_contents can open URLs only when the allow_url_fopen setting is enabled in php.ini, and that this method downloads the whole response body just to read the status.
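Because each redirect hop adds its own status line to the header array, it can be more informative to read the code of the last status line than to search for 404 anywhere in the list. The following is a minimal sketch of that idea; the helper name finalStatusCode and the sample header array are illustrative assumptions, not part of PHP or of the example above.

```php
<?php
// Hypothetical helper: returns the status code of the LAST status line in a
// header array such as $http_response_header, or null if none is found.
function finalStatusCode(array $headers): ?int {
    $code = null;
    foreach ($headers as $header) {
        // A status line looks like "HTTP/1.1 404 Not Found"
        if (preg_match('#^HTTP/\S+\s+(\d{3})\b#', $header, $m)) {
            $code = (int) $m[1]; // later status lines overwrite earlier ones
        }
    }
    return $code;
}

// Made-up header array for a redirect that ends in a 404
$sample = [
    'HTTP/1.1 301 Moved Permanently',
    'Location: http://example.com/missing',
    'HTTP/1.1 404 Not Found',
    'Content-Type: text/html',
];
var_dump(finalStatusCode($sample) === 404); // bool(true)
```

Returning the code as an integer (or null when no status line exists) lets the caller decide how to treat 404, other errors, and missing responses separately.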

Using get_headers and strpos with stream_context_create

The following example uses get_headers and strpos with stream_context_create to test a URL for a 404 error in PHP:

<?php
   function isUrlValid($url) {
      // Create a stream context with "ignore_errors" set to true
      $context = stream_context_create(['http' => ['ignore_errors' => true]]);
      // Fetch the URL headers; @ silences the warning raised if the request fails
      $headers = @get_headers($url, false, $context);
      // get_headers returns false when no response was received
      if ($headers === false) {
         return false;
      }
      // Check if the first status line contains the code 404
      if (strpos($headers[0], ' 404') !== false) {
         return false; // URL returns a 404 error
      }
      return true; // URL responded and did not return 404
   }
   // Usage
   $url = "http://example.com";
   if (isUrlValid($url)) {
      echo "URL is valid.";
   } else {
      echo "URL is invalid or returns a 404 error.";
   }
?>

In this approach, the isUrlValid function takes a URL as a parameter. It creates a stream context using stream_context_create with the option ignore_errors set to true. The option tells the HTTP wrapper to treat failure status codes such as 404 as ordinary responses rather than errors.

The get_headers function is then called with the URL, false (to get a numerically indexed array), and the stream context. The context parameter is available since PHP 7.1. The function returns an array containing the response headers, or false if the request fails, in which case isUrlValid returns false.

The function checks the first element of the headers array ($headers[0]), which holds the status line, and uses strpos to search for the string " 404" within it. The leading space keeps the search focused on the status code field. If the string is found, the URL returns a 404 error, and the function returns false.

If the string is not found, the function returns true, indicating that the URL responded with a status other than 404. Note that get_headers follows redirects by default and returns the headers of every response, so $headers[0] is the status line of the first response, not necessarily the final one.

You can replace "http://example.com" with the URL you want to test. Both get_headers and stream_context_create are part of the PHP core, but get_headers can access URLs only when the allow_url_fopen setting is enabled in php.ini. By default it sends a GET request.
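Since get_headers sends a GET request by default, the stream context can be used to ask for headers only and to bound the waiting time. The sketch below builds such a context; the helper name headRequestContext and the timeout values are assumptions chosen for illustration, and some servers answer HEAD requests differently from GET requests.

```php
<?php
// Hypothetical helper: builds a stream context that sends a HEAD request.
function headRequestContext(float $timeoutSeconds = 5.0) {
    return stream_context_create([
        'http' => [
            'method'        => 'HEAD',          // request headers only, no body
            'timeout'       => $timeoutSeconds, // read timeout in seconds
            'ignore_errors' => true,            // treat 4xx/5xx as normal responses
        ],
    ]);
}

// Usage (performs a real network request, so it is left commented out):
// $headers = @get_headers('http://example.com', false, headRequestContext());

// Inspect the options stored in the context
$options = stream_context_get_options(headRequestContext(2.5));
echo $options['http']['method'], "\n"; // prints "HEAD"
```

Keeping the context in a small factory function makes it easy to reuse the same settings for get_headers and file_get_contents.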

Using curl_exec with CURLOPT_NOBODY

The following example uses curl_exec with the CURLOPT_NOBODY option to test a URL for a 404 error in PHP:

<?php
   function isUrlValid($url) {
      // Initialize cURL session
      $ch = curl_init($url);

      // Set the CURLOPT_NOBODY option to send a HEAD request
      curl_setopt($ch, CURLOPT_NOBODY, true);

      // Set CURLOPT_RETURNTRANSFER so that nothing is printed directly
      curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

      // Execute the request; the result is false if the transfer fails
      $result = curl_exec($ch);

      // Get the response code (0 if no response was received)
      $responseCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);

      // Close cURL session (has no effect since PHP 8.0)
      curl_close($ch);

      // A failed transfer has no HTTP status, so treat it as invalid
      if ($result === false) {
         return false;
      }

      // Check if the response code is 404
      return $responseCode !== 404;
   }

   // Usage
   $url = "http://example.com";
   if (isUrlValid($url)) {
      echo "URL is valid.";
   } else {
      echo "URL is invalid or returns a 404 error.";
   }
?>

In this approach, the isUrlValid function takes a URL as a parameter. It initializes a cURL session using curl_init with the URL.

The curl_setopt function is used to set the CURLOPT_NOBODY option to true, which sends a HEAD request instead of a GET request. This way, only the response headers are retrieved, not the entire response body.

The CURLOPT_RETURNTRANSFER option is set to true so that curl_exec returns the result as a string instead of printing it.

Next, curl_exec is called to execute the cURL request. It returns false if the transfer fails, for example when the host cannot be reached.

After the request is executed, curl_getinfo is used to retrieve the HTTP response code from the cURL session using the CURLINFO_HTTP_CODE option. The code is 0 if no response was received.

Finally, curl_close is called to close the cURL session. Since PHP 8.0 this call has no effect, because the handle is freed automatically; it matters only on older versions.

The function returns false if the transfer failed. Without that check, a response code of 0 would pass the "not 404" test, and an unreachable URL would be reported as valid. Otherwise the function checks whether the response code is not equal to 404. If it is not 404, the function returns true. If it is 404, the function returns false.

You can replace "http://example.com" with the URL you want to test. Make sure you have the cURL extension enabled in your PHP configuration for this approach to work. Redirects are not followed in this example, so a URL that redirects to a missing page reports the redirect status (for example 301), not 404.
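A single true/false result hides the difference between a 404, another HTTP error, and a request that never received a response (where CURLINFO_HTTP_CODE is 0). The sketch below separates those cases; the function names classifyResult and checkUrl, the result labels, and the timeout values are illustrative assumptions rather than part of the cURL API.

```php
<?php
// Hypothetical helper: turns the outcome of a cURL transfer into a label.
// $transferOk is false when curl_exec() returned false.
function classifyResult(bool $transferOk, int $httpCode): string {
    if (!$transferOk || $httpCode === 0) {
        return 'unreachable'; // no HTTP response was received
    }
    if ($httpCode === 404) {
        return 'not_found';
    }
    return $httpCode >= 400 ? 'other_error' : 'ok';
}

// Hypothetical wrapper: sends a HEAD request and classifies the outcome.
// Requires the cURL extension; it is defined here but not called.
function checkUrl(string $url): string {
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true, // HEAD request
        CURLOPT_RETURNTRANSFER => true, // do not print the result
        CURLOPT_FOLLOWLOCATION => true, // report the status of the final response
        CURLOPT_CONNECTTIMEOUT => 5,    // seconds allowed for connecting
        CURLOPT_TIMEOUT        => 10,   // seconds allowed for the whole request
    ]);
    $result = curl_exec($ch);
    $code = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return classifyResult($result !== false, $code);
}

echo classifyResult(true, 404), "\n"; // prints "not_found"
echo classifyResult(false, 0), "\n";  // prints "unreachable"
```

Keeping the classification in a pure function means it can be tested without any network access, while checkUrl stays a thin wrapper around the request.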

Conclusion

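All three methods can test a URL for a 404 error in PHP. file_get_contents needs no extension but downloads the whole response body. get_headers returns only the headers, although it sends a GET request unless the stream context says otherwise. cURL with CURLOPT_NOBODY sends a HEAD request and reports the status code as an integer, but it requires the cURL extension. The choice depends on your requirements and on what is available in your environment.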

Updated on: 17-Aug-2023
