Scrapy - Telnet Console



Description

The telnet console is a Python shell that runs inside the Scrapy process and is used to inspect and control a running Scrapy process.

Access Telnet Console

The telnet console can be accessed using the following command −

telnet localhost 6023

The telnet console listens on the TCP port defined by the TELNETCONSOLE_PORT setting, which defaults to the range [6023, 6073].
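Because the console may end up on any free port within the configured range, it can help to check which port is actually accepting connections before running telnet. The following is a minimal sketch using only the Python standard library. The helper name find_listening_port is hypothetical and not part of Scrapy, and the check only tells you that something accepts TCP connections on the port, not that it is a Scrapy console.

```python
import socket

# 6023..6073 inclusive, mirroring the default TELNETCONSOLE_PORT range.
DEFAULT_PORT_RANGE = range(6023, 6074)


def find_listening_port(host="127.0.0.1", ports=DEFAULT_PORT_RANGE, timeout=0.5):
    """Return the first port in `ports` that accepts a TCP connection, or None.

    Hypothetical helper for illustration; Scrapy does not provide it.
    """
    for port in ports:
        try:
            # create_connection raises OSError if nothing is listening.
            with socket.create_connection((host, port), timeout=timeout):
                return port
        except OSError:
            continue
    return None
```

Calling find_listening_port() while a crawl is running would return the first port in the range that accepts a connection, which you can then pass to the telnet command.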

Variables

The telnet console defines the following default variables as shortcuts −

Sr.No  Shortcut     Description
1      crawler      The Scrapy Crawler (scrapy.crawler.Crawler) object.
2      engine       The Crawler.engine attribute.
3      spider       The currently active spider.
4      slot         The engine slot.
5      extensions   The Extension Manager (Crawler.extensions attribute).
6      stats        The Stats Collector (Crawler.stats attribute).
7      settings     The Scrapy settings object (Crawler.settings attribute).
8      est          Prints a report of the engine status.
9      prefs        Used for memory debugging.
10     p            A shortcut to the pprint.pprint function.
11     hpy          Used for memory debugging.
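To make the relationship between the shortcuts and the crawler object concrete, here is a rough sketch of how such a namespace could be assembled. It assumes that spider and slot are read from the engine, and that the other shortcuts are attributes of the crawler as described above. build_telnet_vars and the stand-in crawler are hypothetical, used only so that the example runs without Scrapy. The est, prefs and hpy shortcuts are left out.

```python
from pprint import pprint
from types import SimpleNamespace


def build_telnet_vars(crawler):
    """Map shortcut names to objects reachable from `crawler` (illustrative only)."""
    return {
        "crawler": crawler,
        "engine": crawler.engine,
        "spider": crawler.engine.spider,   # assumption: the active spider lives on the engine
        "slot": crawler.engine.slot,       # assumption: the slot lives on the engine
        "extensions": crawler.extensions,
        "stats": crawler.stats,
        "settings": crawler.settings,
        "p": pprint,
    }


# Hypothetical stand-in for a real scrapy.crawler.Crawler object.
fake_engine = SimpleNamespace(spider="followall-spider", slot="engine-slot")
fake_crawler = SimpleNamespace(
    engine=fake_engine,
    extensions="extension-manager",
    stats={"item_scraped_count": 0},
    settings={},
)

telnet_vars = build_telnet_vars(fake_crawler)
```

In a real console session these names are already defined, so typing p(settings) or engine at the prompt works without any setup.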

Examples

The following examples show common tasks performed from the telnet console.

Pause, Resume and Stop the Scrapy Engine

To pause the Scrapy engine, use the following command −

telnet localhost 6023
>>> engine.pause()
>>>

To resume the Scrapy engine, use the following command −

telnet localhost 6023
>>> engine.unpause()
>>>

To stop the Scrapy engine, use the following command −

telnet localhost 6023
>>> engine.stop()
Connection closed by foreign host.

View Engine Status

In the telnet console, call the est() shortcut to print the status of the Scrapy engine, as shown in the following session −

telnet localhost 6023
>>> est()
Execution engine status

time()-engine.start_time                        : 8.62972998619
engine.has_capacity()                           : False
len(engine.downloader.active)                   : 16
engine.scraper.is_idle()                        : False
engine.spider.name                              : followall
engine.spider_is_idle(engine.spider)            : False
engine.slot.closing                             : False
len(engine.slot.inprogress)                     : 16
len(engine.slot.scheduler.dqs or [])            : 0
len(engine.slot.scheduler.mqs)                  : 92
len(engine.scraper.slot.queue)                  : 0
len(engine.scraper.slot.active)                 : 0
engine.scraper.slot.active_size                 : 0
engine.scraper.slot.itemproc_size               : 0
engine.scraper.slot.needs_backout()             : False

Telnet Console Signals

You can use the update_telnet_vars signal to add, update, or delete variables in the telnet console's local namespace. To do so, modify the telnet_vars dict that is passed to your signal handler.

scrapy.extensions.telnet.update_telnet_vars(telnet_vars)

Parameters −

telnet_vars (dict) − A dictionary containing the telnet variables.
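A handler for this signal could look like the sketch below. The extension class, the handler name and the project_name variable are hypothetical. The scrapy.extensions.telnet.update_telnet_vars signal and the telnet_vars argument come from the description above, and the handler is registered with crawler.signals.connect.

```python
def add_custom_vars(telnet_vars, **kwargs):
    """Signal handler: mutate the dict that becomes the telnet local namespace."""
    telnet_vars["project_name"] = "demo"   # add a variable (hypothetical name and value)
    telnet_vars.pop("hpy", None)           # delete a variable if it is present


class TelnetVarsExtension:
    """Hypothetical extension that registers the handler above."""

    @classmethod
    def from_crawler(cls, crawler):
        # Imported here so the handler can be tried without Scrapy installed.
        from scrapy.extensions.telnet import update_telnet_vars

        ext = cls()
        crawler.signals.connect(add_custom_vars, signal=update_telnet_vars)
        return ext
```

To take effect, a class like this would have to be enabled through the EXTENSIONS setting of the project. After that, project_name would be available at the telnet prompt.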

Telnet Settings

The following table shows the settings that control the behavior of Telnet Console −

Sr.No  Setting             Default Value  Description
1      TELNETCONSOLE_PORT  [6023, 6073]   The port range to use for the telnet console. If set to None or 0, a dynamically assigned port is used.
2      TELNETCONSOLE_HOST  '127.0.0.1'    The interface on which the telnet console should listen.
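As an illustration, both settings could be written out explicitly in a project's settings.py. The fragment below simply restates the default values listed above.

```python
# settings.py (fragment)

# Use the first free port between 6023 and 6073 (the default range).
TELNETCONSOLE_PORT = [6023, 6073]

# Listen only on the loopback interface (the default).
TELNETCONSOLE_HOST = "127.0.0.1"
```

Keeping the host on the loopback interface means the console is reachable only from the machine that runs the crawl.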