Collectl: An Advanced All-in-One Performance Monitoring Tool for Linux


The primary job of a Linux system administrator is to make sure that the system stays in good health. Collectl collects performance data that describes the current state of the system. Unlike most other monitoring tools, collectl does not focus on a limited set of system metrics. Instead, it can gather information on many different types of system resources, such as CPU, disk, memory, network, sockets, TCP, inodes, InfiniBand, Lustre, NFS, processes, Quadrics, slabs and buddyinfo. This article explains how to install and use collectl.

Features

  • It runs interactively, as a daemon or both.
  • It displays the output in many formats.
  • It has the capability to monitor almost any subsystem.
  • It plays the role of many other utilities such as ps, top, iotop, vmstat.
  • It has the ability to record and playback the captured data.
  • It exports the data in various file formats (this is very useful when you want to analyse the data with external tools).
  • It can run as a service to monitor remote machines or an entire server cluster.
  • It displays the data in the terminal, or writes it to a file or a socket.

Installing collectl in Linux

Collectl requires Perl. To install Perl, use the following command –

$ sudo apt-get install perl

To install collectl, use the following command –

$ sudo apt-get install collectl

The sample output should be like this –

Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  aglfn apache2 apache2-bin apache2-data apache2-utils colplot gnuplot
  gnuplot-tex gnuplot5-data gnuplot5-qt libapr1 libaprutil1
  libaprutil1-dbd-sqlite3 libaprutil1-ldap libgetopt-simple-perl liblua5.1-0
  libwxbase3.0-0v5 libwxgtk3.0-0v5 perl-tk
Suggested packages:
  apache2-doc apache2-suexec-pristine | apache2-suexec-custom feedgnuplot
  gnuplot-doc libgnuplot-iostream-dev python-gnuplot gnuplot5-doc
The following NEW packages will be installed:
  aglfn apache2 apache2-bin apache2-data apache2-utils collectl colplot
  gnuplot gnuplot-tex gnuplot5-data gnuplot5-qt libapr1 libaprutil1
  libaprutil1-dbd-sqlite3 libaprutil1-ldap libgetopt-simple-perl liblua5.1-0
  libwxbase3.0-0v5 libwxgtk3.0-0v5 perl-tk
0 upgraded, 20 newly installed, 0 to remove and 20 not upgraded.
Need to get 10.5 MB of archives.
After this operation, 41.4 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://in.archive.ubuntu.com/ubuntu xenial/main amd64 libapr1 amd64 1.5.2-3 [86.0 kB]
Get:2 http://in.archive.ubuntu.com/ubuntu xenial/main amd64 libaprutil1 amd64 1.5.4-1build1 [77.1 kB]
Get:3 http://in.archive.ubuntu.com/ubuntu xenial/main amd64 libaprutil1-dbd-sqlite3 amd64 1.5.4-1build1 [10.6 kB]
Get:4 http://in.archive.ubuntu.com/ubuntu xenial/main amd64 libaprutil1-ldap amd64 1.5.4-1build1 [8,720 B]
Get:5 http://in.archive.ubuntu.com/ubuntu xenial/main amd64 liblua5.1-0 amd64 5.1.5-8ubuntu1 [102 kB]
Get:6 http://in.archive.ubuntu.com/ubuntu xenial/main amd64 apache2-bin amd64 2.4.18-2ubuntu3 [918 kB]
Get:7 http://in.archive.ubuntu.com/ubuntu xenial/main amd64 apache2-utils amd64 2.4.18-2ubuntu3 [81.1 kB]
Get:8 http://in.archive.ubuntu.com/ubuntu xenial/main amd64 apache2-data all 2.4.18-2ubuntu3 [162 kB]
................................................................................................
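
Once both packages are installed, it can be worth confirming that the commands are actually on the PATH before going further. The following is a minimal sketch, not part of collectl itself; check_cmds is a hypothetical helper name chosen for this example.

```shell
# check_cmds is a hypothetical helper (not shipped with collectl).
# It reports whether each named command is on PATH and returns
# non-zero if any of them is missing.
check_cmds() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found: $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Check the two commands this article depends on.
check_cmds perl collectl || echo "install the missing packages first"
```

If either command is reported as missing, repeat the matching apt-get step above.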

To run Collectl, use the following command –

$ collectl

The sample output should be like this –

waiting for 1 second sample...
#
#cpu sys inter  ctxsw KBRead  Reads KBWrit Writes   KBIn  PktIn  KBOut  PktOut 
   3   1   576    952      0      0   1428    175      0      0      0       0 
   1   1   372    998      0      0      0      0      0      2      0       0 
   2   1   477    896      0      0      0      0      0      1      0       0 
   2   1   437   1103      0      0      0      0      0      0      0       0 
   1   1   426   1045      0      0    160     14      0      4      0       0 
   2   1   392    962      0      0      0      0      0      6      0       0 
   2   1   358    920      0      0      0      0      0      2      0       0 
   2   1   421   1067      0      0      0      0      0      4      0       3 
   2   1   413    892      0      0      0      0      0      0      0       0 
   2   1   511   1771      0      0      0      0      0      0      0       0 
   6   2   759   3749      0      0     52      4      0      0      0       0 
   5   1   621   3251      0      0      0      0      0      0      0       0 
   6   2   771   4380      0      0      0      0      0      0      0       0 
  12   4  1080   8043      0      0      0      0      0      0      0       0 
  15   4  1215   8517      0      0    176     14      0      5      2       6 
   5   1   545   2512      0      0    212     20      1      1      0       1 
   5   1   502   2433      0      0      0      0      1     10      0       0 
   2   1   389   1173      0      0      0      0      0      4      0       0 
  20   3  1024   6732      0      0      0      0      0      1      0       0 
  27   5  1301   8061      0      0      0      0      0      0      0       0 
   2   1   749   3351      0      0  33932     53      0      2      0       1 
   8   2   791   4740      0      0   7788     31      0      0      0       0 
#
#cpu sys inter  ctxsw KBRead  Reads KBWrit Writes   KBIn  PktIn  KBOut  PktOut 
  10   3   856   5916      0      0      0      0      0      2      0       1 
  26   4  1176   6580      0      0      0      0      0      0      0       0 
   4   1   654   2984      0      0      0      0      0      0      0       0 
   3   1   451   1714      0      0    136     14      0      0      0       0 
   7   3   569   3232      0      0      0      0      0      0      0       0 
   4   2   726   1872      0      0     64      3      0      1      0       1 
   8   2   674   2750      0      0      0      0      0      6      0       0 
   7   2   610   3043      0      0      0      0      1      8      7       8 
  13   2   704   4142      0      0      0      0      2      6      0       3 
  10   2   831   3708      0      0      0      0      4     17      5      20 
   8   1   671   4091      0      0      0      0      0      4      0       0 
   5   1   614   3133      0      0     76      2      0      1      0       0 
   8   1   588   3025      0      0      0      0      0      1      0       0 
   8   1   676   3929      0      0    172     35      0      0      0       0 
   6   2   613   3100      0      0     48      2      0      1      0       0 
   3   1   757   3726      0      0    248     33      0      1      0       0 
   8   2   768   4200      0      0    156      3      0      0      0       0 
   2   1   449   1765      0      0      0      0      1      3      0       0 
   2   1   590   2032      0      0      0      0      0      0      0       0 
   2   1   583   1872      0      0      0      0      0      0      0       0 
   2   0   505   1445      0      0      0      0      2      9      0       0 
   3   1   609   2071      0      0     68      2      0      2      0       0 
#
#cpu sys inter  ctxsw KBRead  Reads KBWrit Writes   KBIn  PktIn  KBOut  PktOut 
   4   1   659   2958      0      0      0      0      1      7      2       4 
   2   1   618   2653      0      0     32      5      1      3      0       1 
  10   2   815   3966      0      0      0      0      1      6      0       0 
   4   1   480   1817      0      0      0      0      0      6      0       1 
   6   1   552   2502      0      0    264      5      1      8      0       0 
   6   2   612   3497      0      0      0      0      2     12      0       0 
   6   1   543   2670      0      0      0      0      1      2      0       0 
   5   1   519   2311      0      0      0      0      0      1      0       0 
   6   2   515   2463      0      0      0      0      1      7      0       0 
   7   2   616   3141      0      0    216     23      1     11      0       0 
   7   1   630   3578      0      0      0      0      1     15      0       0 
   8   2   641   3617      0      0      4      1      1      9      0       0 
   6   1   539   2369      0      0      0      0      1      8      0       0 
   9   1   680   3740      0      0    124     13      0      4      0       0 
   9   1   655   3997      0      0      0      0      1      7      0       0 
   4   1   595   1856      0      0      0      0      2     17      2       4 
  10   3   886   6428      0      0     20      5      0      4      0       0 
  10   2  1006   5959      0      0      0      0      0      6      0       0 
   3   2   746   1798      0      0      0      0      2     25      0       2 
   2   1   676   2194      0      0     32      2      1      7      0       0 
   4   1   713   2295      0      0      0      0      0      1      0       0 
   3   1   778   2313      0      0      4      1      0      3      0       0 
#
#cpu sys inter  ctxsw KBRead  Reads KBWrit Writes   KBIn  PktIn  KBOut  PktOut 
   2   1   523   1520      0      0      0      0      0      3      0       0 
   3   2   747   2625      0      0      0      0      0      5      0       0 
   3   1   688   2005      0      0     36      2      0      0      0       0 
   2   1   590   1180      0      0    168     21      0      1      0       0 
   1   1   464   1043      0      0    156     27      0      0      0       0 
   3   1   486   1202      0      0    188      3      0      0      0       0 
   3   1   674   2351      0      0      0      0      1     10      0       1 
   5   2   964   3355      0      0      0      0      1      6      0       0 
.....................................................................................
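
Because every sample in the output above is a single line of whitespace-separated columns, it can be post-processed with standard tools. The sketch below averages the first column (cpu) with awk. It assumes the default column layout shown above; avg_cpu is a hypothetical helper name, and the -c option (number of samples) is the same one used later in this article.

```shell
# avg_cpu is a hypothetical helper: it averages the first column (cpu)
# of collectl's default output read from standard input.
# Header lines start with '#' and the startup notice starts with
# "waiting", so only rows whose first field is a number are counted.
avg_cpu() {
    awk '$1 ~ /^[0-9]+$/ { sum += $1; n++ }
         END { if (n) printf "%.1f\n", sum / n; else print "no samples" }'
}

# Example (left commented out because it needs collectl installed):
# take 10 samples, then print their average cpu value.
# collectl -c 10 | avg_cpu
```

The same pattern works for any other column by changing the field number in the awk program.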

As the above output shows, the metric values are easy to work with because each sample appears on a single line. You can also display statistics for all subsystems except slabs by adding the --all option, as shown in the following command –

$ sudo collectl --all

The sample output should be like this –

waiting for 1 second sample...
#
#cpu sys inter  ctxsw Cpu0 Cpu1 Free Buff Cach Inac Slab  Map   Fragments KBRead  Reads KBWrit Writes   KBIn  PktIn  KBOut  PktOut   IP  Tcp  Udp Icmp  Tcp  Udp  Raw Frag Handle Inodes  Reads Writes Meta Comm 
  16   3   817   1542  430  390   1G 175M   1G 683M 193M   1G nsslkjjebbk      0      0     24      3      1      1      0       1    0    0    0    0  623    0    0    0   8160 240829      0      0    0    0 
  11   1   745   1324  316  426   1G 175M   1G 683M 193M   1G nsslkjjebbk      0      0      0      0      0      3      0       2    0    0    0    0  622    0    0    0   8160 240828      0      0    0    0 
  15   2   793   1683  371  424   1G 175M   1G 683M 193M   1G ssslkjjebbk      0      0      0      0      1      1      0       1    0    0    0    0  622    0    0    0   8160 240829      0      0    0    0 
  16   2   872   1875  427  446   1G 175M   1G 683M 193M   1G ssslkjjebbk      0      0     24      3      1      1      0       1    0    0    0    0  622    0    0    0   8160 240828      0      0    0    0 
  24   2   842   1383  473  368   1G 175M   1G 683M 193M   1G ssslkjjebbk      0      0    168      6      1      1      0       1    0    0    0    0  622    0    0    0   8160 240828      0      0    0    0 
  27   3   844   1099  478  365   1G 175M   1G 683M 193M   1G nsslkjjebbk      0      0      0      0      1      6      1       9    0    0    0    0  622    0    0    0   8160 240828      0      0    0    0 
  26   5   823   1238  396  428   1G 175M   1G 683M 193M   1G ssslkjjebbk      0      0      0      0      2     11      3       9    0    0    0    0  622    0    0    0   8160 240828      0      0    0    0 
  15   1   753   1276  361  391   1G 175M   1G 683M 193M   1G ssslkjjebbk      0      0     40      3      1      2      0       3    0    0    0    0  623    0    0    0   8160 240829      0      0    0    0

To collect data about memory, use the following command –

$ sudo collectl -sm

The sample output should be like this –

waiting for 1 second sample...
#
#Free Buff Cach Inac Slab  Map 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
   1G 174M   2G 825M 170M   3G 
...................................

To get data on TCP, use the following command –

$ sudo collectl -st

The sample output should be like this –

#
#  IP  Tcp  Udp Icmp 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
    0    0    0    0 
............................

Collectl can also work like the top utility. Run the following command in your terminal –

$ sudo collectl --top

The sample output should be like this –

# TOP PROCESSES sorted by time (counters are /sec) 11:42:40
# PID  User     PR  PPID THRD S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
30526  root     20 30525    0 R   74M   27M  1  0.01  0.08   9  00:03.41    0    0    0    0 /usr/bin/perl 
13185  netdata  39 21351    0 S   20M    4M  0  0.01  0.03   4  02:56.04    0    8    0 3982 /bin/bash 
  871  root     20   787    1 S  369M   52M  0  0.01  0.01   2  26:54.20    0    0    0    2 /usr/lib/xorg/Xorg 
 1438  linux    20  1230    7 S    1G  225M  1  0.00  0.01   1  24:47.99    0    0    0    0 compiz 
13055  netdata  39 21351    0 S   16M    3M  3  0.00  0.01   1  01:18.21    0    0    0    0 /usr/libexec/netdata/plugins.d/apps.plugin 
17289  linux    20  1823    9 S  977M  236M  2  0.00  0.01   1  01:10.33    0    0    0    2 /opt/google/chrome/chrome 
21577  netdata  39 21351    0 S   19M    3M  2  0.01  0.00   1  00:01.54    0    0    0  142 /bin/bash 
29665  root     20     2    0 S     0     0  2  0.01  0.00   1  00:00.08    0    0    0    0 kworker/2:1 
    1  root     20     0    0 S  181M    5M  2  0.00  0.00   0  00:02.43    0    0    0    0 /sbin/init 
    2  root     20     0    0 S     0     0  0  0.00  0.00   0  00:00.04    0    0    0    0 kthreadd 
    3  root     20     2    0 S     0     0  0  0.00  0.00   0  00:05.39    0    0    0    0 ksoftirqd/0 
    5  root      0     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 kworker/0:0H 
    7  root     20     2    0 S     0     0  0  0.00  0.00   0  01:11.84    0    0    0    0 rcu_sched 
    8  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 rcu_bh 
    9  root     RT     2    0 S     0     0  0  0.00  0.00   0  00:01.40    0    0    0    0 migration/0 
   10  root     RT     2    0 S     0     0  0  0.00  0.00   0  00:00.18    0    0    0    0 watchdog/0 
   11  root     RT     2    0 S     0     0  1  0.00  0.00   0  00:00.15    0    0    0    0 watchdog/1 
   12  root     RT     2    0 S     0     0  1  0.00  0.00   0  00:01.59    0    0    0    0 migration/1 
   13  root     20     2    0 S     0     0  1  0.00  0.00   0  00:01.57    0    0    0    0 ksoftirqd/1 
   15  root      0     2    0 S     0     0  1  0.00  0.00   0  00:00.00    0    0    0    0 kworker/1:0H 
   16  root     RT     2    0 S     0     0  2  0.00  0.00   0  00:00.15    0    0    0    0 watchdog/2 
   17  root     RT     2    0 S     0     0  2  0.00  0.00   0  00:02.60    0    0    0    0 migration/2

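To use collectl like the ps tool, run the following command in your terminal. Here -c1 takes a single sample, -sZ selects process data, and -i:1 sets the process sampling interval to one second –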

$ sudo collectl -c1 -sZ -i:1

The sample output should be like this –

waiting for 1 second sample...

### RECORD    1 >>> linux <<< (1461824164.001) (Thu Apr 28 11:46:04 2016) ###

# PROCESS SUMMARY (counters are /sec)
# PID  User     PR  PPID THRD S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
    1  root     20     0    0 S  181M    5M  0  0.00  0.00   0  00:02.43    0    0    0    0 /sbin/init 
    2  root     20     0    0 S     0     0  3  0.00  0.00   0  00:00.04    0    0    0    0 kthreadd 
    3  root     20     2    0 S     0     0  0  0.00  0.00   0  00:05.40    0    0    0    0 ksoftirqd/0 
    5  root      0     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 kworker/0:0H 
    7  root     20     2    0 S     0     0  0  0.00  0.00   0  01:12.13    0    0    0    0 rcu_sched 
    8  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 rcu_bh 
    9  root     RT     2    0 S     0     0  0  0.00  0.00   0  00:01.41    0    0    0    0 migration/0 
   10  root     RT     2    0 S     0     0  0  0.00  0.00   0  00:00.18    0    0    0    0 watchdog/0 
   11  root     RT     2    0 S     0     0  1  0.00  0.00   0  00:00.15    0    0    0    0 watchdog/1 
   12  root     RT     2    0 S     0     0  1  0.00  0.00   0  00:01.60    0    0    0    0 migration/1 
   13  root     20     2    0 S     0     0  1  0.00  0.00   0  00:01.57    0    0    0    0 ksoftirqd/1 
   15  root      0     2    0 S     0     0  1  0.00  0.00   0  00:00.00    0    0    0    0 kworker/1:0H 
   16  root     RT     2    0 S     0     0  2  0.00  0.00   0  00:00.15    0    0    0    0 watchdog/2 
   17  root     RT     2    0 S     0     0  2  0.00  0.00   0  00:02.61    0    0    0    0 migration/2 
   18  root     20     2    0 S     0     0  2  0.00  0.00   0  00:01.38    0    0    0    0 ksoftirqd/2 
   20  root      0     2    0 S     0     0  2  0.00  0.00   0  00:00.00    0    0    0    0 kworker/2:0H 
   21  root     RT     2    0 S     0     0  3  0.00  0.00   0  00:00.14    0    0    0    0 watchdog/3 
   22  root     RT     2    0 S     0     0  3  0.00  0.00   0  00:02.30    0    0    0    0 migration/3 
   23  root     20     2    0 S     0     0  3  0.00  0.00   0  00:01.08    0    0    0    0 ksoftirqd/3 
   25  root      0     2    0 S     0     0  3  0.00  0.00   0  00:00.00    0    0    0    0 kworker/3:0H 
   26  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.00    0    0    0    0 kdevtmpfs 
   27  root      0     2    0 S     0     0  2  0.00  0.00   0  00:00.00    0    0    0    0 netns 
   28  root      0     2    0 S     0     0  2  0.00  0.00   0  00:00.00    0    0    0    0 perf 
   29  root     20     2    0 S     0     0  0  0.00  0.00   0  00:00.03    0    0    0    0 khungtaskd 
............................................................................................
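
The process listing above is also plain columns, so a small filter can narrow it down. The following is a sketch that prints the PID and command of every process owned by one user. It assumes the column layout shown in the sample (PID in field 1, User in field 2, Command starting at field 18); procs_for_user is a hypothetical helper name.

```shell
# procs_for_user is a hypothetical helper: it prints "PID command" for
# each process row owned by the given user.
# Assumes the layout in the sample output: PID is field 1, User is
# field 2, and the command (which may contain spaces) starts at field 18.
procs_for_user() {
    awk -v user="$1" '
        $1 ~ /^[0-9]+$/ && $2 == user {
            cmd = $18
            for (i = 19; i <= NF; i++) cmd = cmd " " $i
            print $1, cmd
        }'
}

# Example (left commented out because it needs collectl installed):
# one snapshot of all processes, filtered to those owned by root.
# sudo collectl -c1 -sZ -i:1 | procs_for_user root
```

If a different collectl version prints a different set of columns, the field numbers need to be adjusted to match its header line.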

To get more information about Collectl, use the following command –

$ man collectl

Congratulations! You now know how to install and use collectl on Linux. We’ll cover more commands like this in our next Linux post. Keep reading!
