Scrapy - Logging
Logging means tracking events as a program runs. Scrapy uses Python's built-in logging system, which defines functions and classes that applications and libraries can use to emit log messages. It works out of the box and can be configured with the Scrapy logging settings described later in this chapter.
Scrapy sets some defaults and applies those settings through scrapy.utils.log.configure_logging() when running commands.
Python defines five standard levels of severity for a log message. The following list shows them in ascending order −
logging.DEBUG − for debugging messages (lowest severity)
logging.INFO − for informational messages
logging.WARNING − for warning messages
logging.ERROR − for regular errors
logging.CRITICAL − for critical errors (highest severity)
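As a quick check of this ordering, the sketch below uses only the standard library. Each level constant is a plain integer, and a logger set to a given level ignores anything lower. The logger name `levels_demo` is an illustrative choice, not something Scrapy defines.

```python
import logging

# The five standard levels are plain integers, so they sort by severity.
levels = [logging.DEBUG, logging.INFO, logging.WARNING,
          logging.ERROR, logging.CRITICAL]
level_names = [logging.getLevelName(level) for level in levels]

# A logger set to WARNING ignores DEBUG and INFO messages.
demo_logger = logging.getLogger("levels_demo")  # hypothetical name
demo_logger.setLevel(logging.WARNING)
accepted = [name for level, name in zip(levels, level_names)
            if demo_logger.isEnabledFor(level)]

print(levels)    # [10, 20, 30, 40, 50]
print(accepted)  # ['WARNING', 'ERROR', 'CRITICAL']
```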
How to Log Messages
The following code logs a message at the logging.INFO level −
    import logging
    logging.info("This is an information")
The same message can be logged by passing the level as an argument to logging.log, as follows −
    import logging
    logging.log(logging.INFO, "This is an information")
The logging helpers above use the root logger behind the scenes. You can also get that logger explicitly with logging.getLogger() and call its methods, which is equivalent −
    import logging
    logger = logging.getLogger()
    logger.info("This is an information")
There can be multiple loggers. Each one is identified by a name and is retrieved by passing that name to the logging.getLogger function, as follows −
    import logging
    logger = logging.getLogger('mycustomlogger')
    logger.info("This is an information")
A custom logger can be created for any module using the __name__ variable, which contains the module path, as follows −
    import logging
    logger = logging.getLogger(__name__)
    logger.info("This is an information")
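To make the naming behaviour concrete, here is a small standard-library sketch. It shows that asking for the same name returns the same logger, and that dotted names form a parent/child hierarchy. The `.pipelines` suffix is a made-up example name.

```python
import logging

# Asking for the same name twice returns the same logger object.
first = logging.getLogger("mycustomlogger")
second = logging.getLogger("mycustomlogger")
same_object = first is second

# Dotted names form a hierarchy: "mycustomlogger.pipelines" is a child
# of "mycustomlogger" (the ".pipelines" suffix is a hypothetical example).
child = logging.getLogger("mycustomlogger.pipelines")
parent_name = child.parent.name

# Calling getLogger() with no name returns the root logger.
root_name = logging.getLogger().name

print(same_object, parent_name, root_name)
```

Because module paths are dotted, loggers created with `__name__` fall into this hierarchy automatically, so a whole package can be configured through its top-level logger.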
Logging from Spiders
Every spider instance has a logger, which can be used as follows −
    import scrapy


    class LogSpider(scrapy.Spider):
        name = 'logspider'
        start_urls = ['http://dmoz.com']

        def parse(self, response):
            # self.logger is named after the spider
            self.logger.info('Parse function called on %s', response.url)
In the above code, the logger is created using the spider's name, but you can use any custom Python logger, as shown in the following code −
    import logging
    import scrapy

    # Module-level logger with a custom name
    logger = logging.getLogger('customizedlogger')


    class LogSpider(scrapy.Spider):
        name = 'logspider'
        start_urls = ['http://dmoz.com']

        def parse(self, response):
            logger.info('Parse function called on %s', response.url)
Loggers do not display the messages sent to them on their own. They require "handlers", which redirect those messages to their destinations, such as files, emails, and standard output.
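The relationship between a logger and a handler can be sketched with the standard library alone. This is only an illustration: it writes to an in-memory buffer rather than a real destination, and the logger name `handler_demo` is hypothetical.

```python
import io
import logging

logger = logging.getLogger("handler_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo independent of the root logger

# A handler decides where records go; a formatter decides how they look.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
logger.addHandler(handler)

logger.info("Parse function called on %s", "http://dmoz.com")

output = buffer.getvalue()
print(output, end="")  # INFO: Parse function called on http://dmoz.com
```

Swapping `logging.StreamHandler` for `logging.FileHandler` or `logging.handlers.SMTPHandler` would send the same records to a file or by email instead.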
Scrapy configures the handler for the root logger based on the following settings −
- LOG_FILE and LOG_ENABLED decide the destination for log messages.
- LOG_ENCODING sets the encoding of the log file. When LOG_ENABLED is set to False, no log output is displayed.
- LOG_LEVEL sets the minimum severity to log; messages with lower severity are filtered out.
- LOG_FORMAT and LOG_DATEFORMAT specify the layout of all messages.
- When LOG_STDOUT is set to True, all standard output and error messages of your process are redirected to the log.
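As a rough illustration, these settings might appear together in a project's settings.py like this. The values are examples only, not recommendations, and the log file name is a hypothetical path.

```python
# settings.py (sketch; the values are illustrative, not recommendations)

LOG_ENABLED = True        # set to False to suppress all log output
LOG_FILE = "scrapy.log"   # hypothetical path; leave unset to log to stderr
LOG_ENCODING = "utf-8"    # encoding used for the log file
LOG_LEVEL = "INFO"        # messages below INFO (i.e. DEBUG) are dropped
LOG_FORMAT = "%(asctime)s [%(name)s] %(levelname)s: %(message)s"
LOG_DATEFORMAT = "%Y-%m-%d %H:%M:%S"
LOG_STDOUT = False        # True redirects the process's standard output to the log
```

The format and date format strings use the same placeholders as Python's logging.Formatter.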
Scrapy settings can be overridden by passing command-line arguments as shown in the following table −
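|Sr.No|Command & Description|
|---|---|
|1|--logfile FILE: Overrides LOG_FILE|
|2|--loglevel/-L LEVEL: Overrides LOG_LEVEL|
|3|--nolog: Sets LOG_ENABLED to False|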
The scrapy.utils.log.configure_logging() function can be used to initialize logging defaults for Scrapy. Its signature is −
    scrapy.utils.log.configure_logging(settings=None, install_root_handler=True)
|Sr.No|Parameter & Description|
|---|---|
|1|settings (dict, None): Settings used to create and configure the handler for the root logger. By default, it is None.|
|2|install_root_handler (bool): Specifies whether to install the root logging handler. By default, it is True.|
The above function −
- Routes warnings and Twisted logging through Python standard logging.
- Assigns the DEBUG level to Scrapy loggers and the ERROR level to Twisted loggers.
- Routes stdout to the log if the LOG_STDOUT setting is true.
Default options can be overridden using the settings argument; when settings is not specified, the defaults are used. When install_root_handler is set to True, a handler is created for the root logger. If it is set to False, no log output is configured. When you use Scrapy commands, configure_logging is called automatically; when running custom scripts, you should call it explicitly.
To configure logging output manually, you can use logging.basicConfig(), as follows −
    import logging
    from scrapy.utils.log import configure_logging

    # Skip Scrapy's root handler so that basicConfig() can install its own.
    configure_logging(install_root_handler=False)
    logging.basicConfig(
        filename='logging.txt',
        # %(message)s is the standard LogRecord attribute for the message text
        format='%(levelname)s: %(message)s',
        level=logging.INFO
    )
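To see what a basicConfig() setup of this kind produces, here is a sketch that leaves Scrapy out and uses only the standard library. It assumes Python 3.8 or later for the `force=True` argument, writes to a temporary directory rather than the working directory, and uses a format string built from the standard LogRecord attributes `%(levelname)s` and `%(message)s`. The log messages themselves are invented for the demonstration.

```python
import logging
import os
import tempfile

# Write to a throwaway directory so the demo leaves no file behind in the project.
log_path = os.path.join(tempfile.mkdtemp(), "logging.txt")

logging.basicConfig(
    filename=log_path,
    format="%(levelname)s: %(message)s",
    level=logging.INFO,
    force=True,  # assumption: Python 3.8+; replaces any existing root handlers
)

logging.debug("not written: below the INFO threshold")
logging.info("spider opened")
logging.warning("retrying request")

# The file handler flushes after each record, so the file can be read back now.
with open(log_path) as log_file:
    lines = log_file.read().splitlines()

print(lines)  # ['INFO: spider opened', 'WARNING: retrying request']
```

The DEBUG message is missing from the file because `level=logging.INFO` filters out anything less severe.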