 
Scrapy - Other Settings
The following table lists additional Scrapy settings:
| Sr.No | Setting & Description | 
|---|---|
| 1 | AJAXCRAWL_ENABLED It enables the AjaxCrawlMiddleware, which is recommended for broad crawls. Default value: False | 
| 2 | AUTOTHROTTLE_DEBUG When enabled, it displays stats on every received response, so you can see how the throttling parameters are adjusted in real time. Default value: False | 
| 3 | AUTOTHROTTLE_ENABLED It is used to enable AutoThrottle extension. Default value: False | 
| 4 | AUTOTHROTTLE_MAX_DELAY It sets the maximum download delay (in seconds) to be used in case of high latencies. Default value: 60.0 | 
| 5 | AUTOTHROTTLE_START_DELAY It sets the initial download delay (in seconds). Default value: 5.0 | 
| 6 | AUTOTHROTTLE_TARGET_CONCURRENCY It defines the average number of requests Scrapy should send in parallel to each remote site. Default value: 1.0 | 
| 7 | CLOSESPIDER_ERRORCOUNT It defines the number of errors that must be received before the spider is closed. A value of 0 disables this limit. Default value: 0 | 
| 8 | CLOSESPIDER_ITEMCOUNT It defines the number of scraped items after which the spider is closed. A value of 0 disables this limit. Default value: 0 | 
| 9 | CLOSESPIDER_PAGECOUNT It defines the maximum number of responses to crawl before the spider is closed. A value of 0 disables this limit. Default value: 0 | 
| 10 | CLOSESPIDER_TIMEOUT It defines the number of seconds a spider may stay open before it is closed. A value of 0 disables this limit. Default value: 0 | 
| 11 | COMMANDS_MODULE It is used when you want to add custom commands in your project. Default value: '' | 
| 12 | COMPRESSION_ENABLED It indicates that the compression middleware is enabled. Default value: True | 
| 13 | COOKIES_DEBUG If set to true, all the cookies sent in requests and received in responses are logged. Default value: False | 
| 14 | COOKIES_ENABLED It indicates whether the cookies middleware is enabled. If disabled, no cookies are sent to web servers. Default value: True | 
| 15 | FILES_EXPIRES It defines the number of days after which a downloaded file is considered expired. Default value: 90 days | 
| 16 | FILES_RESULT_FIELD It is set when you want to use a different field name for your processed files. | 
| 17 | FILES_STORE It defines the location where downloaded files are stored. The files pipeline stays disabled until this is set to a valid value. | 
| 18 | FILES_STORE_S3_ACL It is used to modify the ACL policy for the files stored in Amazon S3 bucket. Default value: private | 
| 19 | FILES_URLS_FIELD It is set when you want to use a different field name for your file URLs. | 
| 20 | HTTPCACHE_ALWAYS_STORE If enabled, pages are cached unconditionally. Default value: False | 
| 21 | HTTPCACHE_DBM_MODULE It is the database module used by the DBM storage backend. Default value: 'anydbm' ('dbm' on Python 3) | 
| 22 | HTTPCACHE_DIR It is the directory used to store the HTTP cache. Default value: 'httpcache' | 
| 23 | HTTPCACHE_ENABLED It indicates that HTTP cache is enabled. Default value: False | 
| 24 | HTTPCACHE_EXPIRATION_SECS It sets the expiration time (in seconds) for cached requests. A value of 0 means cached requests never expire. Default value: 0 | 
| 25 | HTTPCACHE_GZIP If set to true, all the cached data is compressed with gzip. Default value: False | 
| 26 | HTTPCACHE_IGNORE_HTTP_CODES It is a list of HTTP status codes; responses with these codes are not cached. Default value: [] | 
| 27 | HTTPCACHE_IGNORE_MISSING If enabled, requests not found in the cache are ignored instead of being downloaded. Default value: False | 
| 28 | HTTPCACHE_IGNORE_RESPONSE_CACHE_CONTROLS It is a list of Cache-Control directives in responses to be ignored. Default value: [] | 
| 29 | HTTPCACHE_IGNORE_SCHEME It is a list of URI schemes; responses with these schemes are not cached. Default value: ['file'] | 
| 30 | HTTPCACHE_POLICY It defines a class implementing cache policy. Default value: 'scrapy.extensions.httpcache.DummyPolicy' | 
| 31 | HTTPCACHE_STORAGE It is a class implementing the cache storage. Default value: 'scrapy.extensions.httpcache.FilesystemCacheStorage' | 
| 32 | HTTPERROR_ALLOWED_CODES It is a list of non-200 status codes; responses with these codes are passed to the spider. Default value: [] | 
| 33 | HTTPERROR_ALLOW_ALL When enabled, all responses are passed to the spider regardless of their status codes. Default value: False | 
| 34 | HTTPPROXY_AUTH_ENCODING It defines the default encoding for proxy authentication on HttpProxyMiddleware. Default value: "latin-1" | 
| 35 | IMAGES_EXPIRES It defines the number of days after which a downloaded image is considered expired. Default value: 90 days | 
| 36 | IMAGES_MIN_HEIGHT It defines the minimum image height; images smaller than this are dropped. | 
| 37 | IMAGES_MIN_WIDTH It defines the minimum image width; images smaller than this are dropped. | 
| 38 | IMAGES_RESULT_FIELD It is set when you want to use a different field name for your processed images. | 
| 39 | IMAGES_STORE It defines the location where downloaded images are stored. The images pipeline stays disabled until this is set to a valid value. | 
| 40 | IMAGES_STORE_S3_ACL It is used to modify the ACL policy for the images stored in Amazon S3 bucket. Default value: private | 
| 41 | IMAGES_THUMBS It defines the thumbnails to create for the downloaded images. | 
| 42 | IMAGES_URLS_FIELD It is set when you want to use a different field name for your image URLs. | 
| 43 | MAIL_FROM It defines the sender address used when sending emails. Default value: 'scrapy@localhost' | 
| 44 | MAIL_HOST It is the SMTP host used to send emails. Default value: 'localhost' | 
| 45 | MAIL_PASS It is the password used for SMTP authentication. Default value: None | 
| 46 | MAIL_PORT It is the SMTP port used to send emails. Default value: 25 | 
| 47 | MAIL_SSL When enabled, it forces the connection to use SSL encryption. Default value: False | 
| 48 | MAIL_TLS When enabled, it forces connection using STARTTLS. Default value: False | 
| 49 | MAIL_USER It is the user name used for SMTP authentication. Default value: None | 
| 50 | METAREFRESH_ENABLED It indicates that meta refresh middleware is enabled. Default value: True | 
| 51 | METAREFRESH_MAXDELAY It is the maximum meta-refresh delay (in seconds) for which the redirect is still followed. Default value: 100 | 
| 52 | REDIRECT_ENABLED It indicates that the redirect middleware is enabled. Default value: True | 
| 53 | REDIRECT_MAX_TIMES It defines the maximum number of redirects that are followed for a single request. Default value: 20 | 
| 54 | REFERER_ENABLED It indicates that the referer middleware is enabled. Default value: True | 
| 55 | RETRY_ENABLED It indicates that the retry middleware is enabled. Default value: True | 
| 56 | RETRY_HTTP_CODES It defines which HTTP codes are to be retried. Default value: [500, 502, 503, 504, 408] | 
| 57 | RETRY_TIMES It defines the maximum number of retries, in addition to the first download. Default value: 2 | 
| 58 | TELNETCONSOLE_HOST It defines an interface on which the telnet console must listen. Default value: '127.0.0.1' | 
| 59 | TELNETCONSOLE_PORT It defines the port range to be used for the telnet console. Default value: [6023, 6073] |
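
These settings are usually overridden in the project's `settings.py` file. The fragment below is a minimal sketch of how a few of them might be combined. The specific values (500 pages, a one-hour timeout, a one-day cache lifetime, three retries) are illustrative assumptions, not recommendations from this tutorial.

```python
# settings.py (fragment) -- illustrative values only, chosen for this sketch

# Let the AutoThrottle extension adapt the download delay to server latency.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 5.0         # initial download delay, in seconds
AUTOTHROTTLE_MAX_DELAY = 60.0          # upper bound used under high latency
AUTOTHROTTLE_TARGET_CONCURRENCY = 2.0  # average parallel requests per remote site

# Close the spider when either limit is reached (0 would disable a limit).
CLOSESPIDER_PAGECOUNT = 500
CLOSESPIDER_TIMEOUT = 3600             # seconds

# Cache responses on disk and treat them as stale after one day.
HTTPCACHE_ENABLED = True
HTTPCACHE_DIR = "httpcache"
HTTPCACHE_EXPIRATION_SECS = 24 * 60 * 60
HTTPCACHE_IGNORE_HTTP_CODES = [500, 502, 503, 504]  # do not cache server errors

# Retry transient failures up to three times after the first attempt.
RETRY_ENABLED = True
RETRY_TIMES = 3
RETRY_HTTP_CODES = [500, 502, 503, 504, 408]
```

Because a settings file is an ordinary Python module, values such as the cache lifetime can be written as expressions, which keeps the unit (seconds) easy to verify.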