How to Install and Configure Cloudera Manager on CentOS/RHEL 8?


Cloudera Manager is an enterprise-level software solution for managing Apache Hadoop clusters. It provides a web-based interface for deploying, configuring, and monitoring them. Cloudera Manager is proprietary software: older releases were offered as a free Express edition alongside the licensed Enterprise edition, while recent releases generally require a Cloudera subscription to download. In this article, we will discuss how to install and configure Cloudera Manager on CentOS/RHEL 8.

Prerequisites

Before we proceed with the installation, make sure that the following prerequisites are met −

  • A fresh installation of CentOS/RHEL 8

  • A user with sudo privileges

  • A stable internet connection
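The first prerequisite can be checked from a shell before going further. The following is a minimal sketch, not part of any Cloudera tooling: os_major_version is a hypothetical helper name, and it assumes the standard /etc/os-release file with a VERSION_ID field.

```shell
# Hypothetical helper (not a Cloudera tool): print the major OS version
# read from an os-release style file. Defaults to /etc/os-release.
os_major_version() {
    release_file="${1:-/etc/os-release}"
    # VERSION_ID looks like VERSION_ID="8.5" on RHEL or VERSION_ID="8" on CentOS;
    # keep only the leading digits.
    sed -n 's/^VERSION_ID="\{0,1\}\([0-9]*\).*/\1/p' "$release_file"
}

# Report whether this host matches the release assumed by this article.
if [ -r /etc/os-release ]; then
    if [ "$(os_major_version)" = "8" ]; then
        echo "OS major version is 8"
    else
        echo "warning: this article assumes CentOS/RHEL 8"
    fi
fi
```

The helper takes the file path as an optional argument so that it can be pointed at a copy of os-release from another host.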

Step 1: Install Java

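Cloudera Manager requires Java to be installed on the system. A minimal CentOS/RHEL 8 installation does not necessarily include a JDK, and the supported JDKs (Oracle JDK or OpenJDK, and which versions) depend on the Cloudera Manager release, so check the support matrix for your release first. To install Oracle JDK on CentOS/RHEL 8, follow the steps below −

Download an Oracle JDK version supported by your Cloudera Manager release from the official website.

Extract the downloaded file using the following command −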

$ tar zxvf jdk-<version>-linux-x64.tar.gz

Move the extracted directory to /usr/local using the following command −

$ sudo mv jdk-<version> /usr/local

Set the JAVA_HOME environment variable and put its bin directory on PATH by adding the following lines to the /etc/profile file −

export JAVA_HOME=/usr/local/jdk-<version>
export PATH=$JAVA_HOME/bin:$PATH

Reload the profile file using the following command −

$ source /etc/profile

Verify the installation by running the following command −

$ java -version
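If java -version fails or reports a different JDK, a likely cause is that JAVA_HOME or PATH does not point at the extracted directory. A small sketch for checking this follows; check_java_home is a hypothetical helper name rather than a standard command.

```shell
# Hypothetical helper: report whether a directory looks like a usable JDK home.
# Argument: the directory to check (typically the value of JAVA_HOME).
check_java_home() {
    if [ -z "$1" ]; then
        echo "JAVA_HOME is not set"
        return 1
    fi
    if [ ! -x "$1/bin/java" ]; then
        echo "no executable found at $1/bin/java"
        return 1
    fi
    echo "ok: $1"
}

# Check the current shell's JAVA_HOME; a failure is reported but not fatal.
check_java_home "${JAVA_HOME:-}" || true
```

The check only confirms that a java binary exists under the given directory; it does not verify the JDK version.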

Step 2: Install Cloudera Manager Server

To install Cloudera Manager Server, follow the steps below −

Download the latest version of Cloudera Manager Server from the official website.

Install the required dependencies using the following command −

$ sudo yum install -y postgresql-server postgresql-jdbc

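Install Cloudera Manager Server using the following command. Because rpm does not resolve dependencies, any packages that the server RPM requires (such as cloudera-manager-daemons) must be installed first or passed to the same command −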

$ sudo rpm -ivh cloudera-manager-server-<version>.rpm

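Cloudera Manager Server needs a database that has been prepared for it before its first start; Cloudera's installation guide covers this step with the scm_prepare_database.sh script. Once that is in place, start Cloudera Manager Server using the following command −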

$ sudo systemctl start cloudera-scm-server

Enable Cloudera Manager Server to start at boot using the following command −

$ sudo systemctl enable cloudera-scm-server

Step 3: Install Cloudera Manager Agent

To install Cloudera Manager Agent, follow the steps below −

Download the latest version of Cloudera Manager Agent from the official website.

Install Cloudera Manager Agent using the following command −

$ sudo rpm -ivh cloudera-manager-agent-<version>.rpm

Edit the /etc/cloudera-scm-agent/config.ini file and set the hostname or IP address of the Cloudera Manager Server using the following line −

server_host=<hostname_or_IP_address>
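The same edit can be scripted, which is convenient when many agent hosts have to point at one server. The sketch below assumes GNU sed (as shipped with CentOS/RHEL) and that config.ini already contains a server_host= line; set_server_host is a hypothetical helper name, and cm-server.example.com is a placeholder.

```shell
# Hypothetical helper: point an agent config file at a Cloudera Manager Server.
# Arguments: path to config.ini, hostname or IP address of the server.
set_server_host() {
    config_file="$1"
    new_host="$2"
    # Rewrite only the server_host line in place; all other lines are untouched.
    # Relies on the -i option of GNU sed.
    sed -i "s|^server_host=.*|server_host=${new_host}|" "$config_file"
}

# Example (run as root on each agent host; the hostname is a placeholder):
#   set_server_host /etc/cloudera-scm-agent/config.ini cm-server.example.com
```

If the file has no server_host= line, the helper leaves it unchanged, so it is worth confirming the result with grep server_host afterwards.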

Start Cloudera Manager Agent using the following command −

$ sudo systemctl start cloudera-scm-agent

Enable Cloudera Manager Agent to start at boot using the following command −

$ sudo systemctl enable cloudera-scm-agent

Step 4: Accessing Cloudera Manager Web UI

To access the Cloudera Manager Web UI, follow the steps below −

Open a web browser and go to http://<hostname_or_IP_address>:7180

Log in with the default credentials (username admin, password admin), since the installation steps above do not prompt for any, and change the password after the first login.
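The server can take a few minutes after starting before the web UI answers, so the first attempt to load the page may fail. A minimal sketch of a polling loop is shown here, assuming curl is installed; wait_for_cm is a hypothetical helper name, and the attempt count and delay are arbitrary defaults.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Arguments: URL, number of attempts (default 30), delay in seconds (default 10).
wait_for_cm() {
    url="$1"
    attempts="${2:-30}"
    delay="${3:-10}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # --fail makes curl return non-zero on HTTP errors (status 400 and above).
        if curl --silent --fail --output /dev/null --max-time 5 "$url"; then
            echo "reachable after $i attempt(s)"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "not reachable after $attempts attempt(s)"
    return 1
}

# Example (replace the placeholder with the server's hostname or IP address):
#   wait_for_cm "http://<hostname_or_IP_address>:7180" 30 10
```

If the loop times out, the server log at /var/log/cloudera-scm-server/cloudera-scm-server.log is usually the first place to look.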

Step 5: Deploying Hadoop Cluster

To deploy a Hadoop cluster using Cloudera Manager, follow the steps below −

  • Click on the Clusters tab and then click on the Create Cluster button.

  • Follow the instructions on the screen to configure the cluster.

  • After configuring the cluster, click on the Continue button.

  • Cloudera Manager will start deploying the cluster. This process may take some time depending on the size and complexity of the cluster.

Step 6: Monitoring Hadoop Cluster

Once the cluster is deployed, you can use Cloudera Manager to monitor its health and performance. To monitor the cluster, follow the steps below −

  • Click on the Clusters tab and then click on the name of the cluster that you want to monitor.

  • Click on the Services tab to see the list of services running in the cluster.

  • Click on the name of a service to see the status and performance metrics of that service.

  • Click on the Charts tab to see graphs of the performance metrics for the selected service.
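The information shown in the UI is also available through the Cloudera Manager REST API, which is useful for scripted checks. The sketch below only builds request URLs: cm_api_url is a hypothetical helper, cm-server.example.com is a placeholder, and the admin credentials in the commented example are an assumption (the defaults on a fresh installation). The /api/version endpoint returns the highest API version that the server supports.

```shell
# Hypothetical helper: build a Cloudera Manager REST API URL.
# Arguments: server hostname, API path relative to /api/.
# The port defaults to 7180 and can be overridden with CM_PORT.
cm_api_url() {
    printf 'http://%s:%s/api/%s\n' "$1" "${CM_PORT:-7180}" "$2"
}

# Example request (hostname and credentials are placeholders):
#   curl -u admin:admin "$(cm_api_url cm-server.example.com version)"
cm_api_url cm-server.example.com version
```

Once the version is known, it becomes the prefix for the other endpoints, such as the cluster list under /api/<version>/clusters.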

In addition to the basic installation and deployment of Hadoop clusters, Cloudera Manager provides a wide range of features for managing and optimizing your Hadoop environment. Some of these features include −

  • Configuration Management − Cloudera Manager allows you to manage the configuration of Hadoop components and services across your entire cluster. You can change the configuration settings of one or more services and propagate those changes to all nodes in the cluster.

  • Health Monitoring − Cloudera Manager provides a centralized dashboard that displays the health of your Hadoop cluster in real time. You can monitor the status of services and components, check for alerts and warnings, and diagnose issues.

  • Resource Management − Cloudera Manager allows you to manage the resources (CPU, memory, and disk) consumed by your Hadoop applications. You can allocate resources to different applications based on their priority, and ensure that all applications receive a fair share of resources.

  • Backup and Recovery − Cloudera Manager provides a backup and recovery solution for your Hadoop cluster. You can take backups of the metadata, configuration, and data stored in Hadoop, and restore them after a disaster or failure.

  • Security Management − Cloudera Manager allows you to manage the security of your Hadoop cluster. You can enable authentication and authorization, set up SSL encryption, and manage Kerberos principals and keytabs.

Overall, Cloudera Manager is a comprehensive tool for managing Hadoop clusters. With its easy-to-use interface and powerful features, it can help you optimize the performance, reliability, and security of your Hadoop environment.

Beyond these core capabilities, Cloudera Manager offers several advanced features. Some of these features include −

  • Custom Metrics − Cloudera Manager allows you to monitor and collect custom metrics specific to your Hadoop applications. You can define custom metrics using JMX or the Cloudera Manager API, and create custom charts to visualize them.

  • Role-Based Access Control − Cloudera Manager provides role-based access control (RBAC) to manage the permissions of users and groups. You can assign different roles to users and groups, such as administrator, operator, or viewer, and control their access to different parts of the Cloudera Manager interface.

  • Rolling Upgrades − Cloudera Manager provides a rolling upgrade feature that allows you to upgrade your Hadoop components and services with minimal downtime. Upgrades are performed on a rolling basis, where one node at a time is upgraded while the rest of the nodes continue to run.

  • Auto-Tuning − Cloudera Manager can automatically set the configuration of Hadoop services based on the resources available on each host. This helps keep your Hadoop cluster tuned for performance and resource utilization, although workload-specific settings may still need manual review.

  • Integration with Other Tools − Cloudera Manager integrates with other tools and services such as Apache Kafka, Apache Spark, and Apache Impala. You can easily deploy and manage these tools using Cloudera Manager, and monitor their performance and health.

Conclusion

Cloudera Manager is a powerful tool for managing Hadoop clusters. It provides a user-friendly interface for deploying, configuring, and monitoring Hadoop clusters. In this article, we have discussed how to install and configure Cloudera Manager on CentOS/RHEL 8. By following these steps, you can easily set up a Hadoop cluster and manage it using Cloudera Manager.

Updated on: 12-May-2023
