Server Side Programming Articles


Drop rows containing specific value in pyspark dataframe

Devesh Chauhan
Updated on 27-Mar-2026 1K+ Views

When dealing with large datasets, PySpark provides powerful tools for data processing and manipulation. PySpark is Apache Spark's Python API, which lets you write distributed data processing jobs in Python. In this tutorial, we'll learn how to drop rows containing specific values from a PySpark DataFrame using different methods. Removing such rows is a routine part of data cleaning and keeps the dataset relevant.

Creating a Sample PySpark DataFrame

First, let's create a sample DataFrame to demonstrate the row-dropping techniques:

    from pyspark.sql import SparkSession
    # Create SparkSession
    spark = SparkSession.builder.appName("DropRowsDemo").getOrCreate()
    ...


Drop One or Multiple Columns From PySpark DataFrame

Devesh Chauhan
Updated on 27-Mar-2026 1K+ Views

A PySpark DataFrame is a distributed data structure built on Apache Spark that provides powerful data processing capabilities. Sometimes you need to remove unnecessary columns to optimize performance or focus on specific data. PySpark offers several methods to drop one or multiple columns from a DataFrame.

Creating a PySpark DataFrame

First, let's create a sample DataFrame to demonstrate column dropping operations:

    from pyspark.sql import SparkSession
    import pandas as pd
    # Create SparkSession
    spark = SparkSession.builder.appName("DropColumns").getOrCreate()
    # Sample dataset
    dataset = {
        "Device name": ["Laptop", "Mobile phone", "TV", "Radio"],
    ...


Drop Empty Columns in Pandas

Devesh Chauhan
Updated on 27-Mar-2026 11K+ Views

Pandas DataFrames often contain empty columns filled with NaN values that can clutter your data analysis. pandas provides several efficient methods to identify and remove these empty columns, leaving a cleaner, more relevant dataset.

What Are Empty Columns?

In pandas, a column is considered empty when it contains only NaN (Not a Number) values. Note that columns with empty strings, zeros, or spaces are not considered empty, since these values may carry meaningful information about your dataset.

Creating a DataFrame with Empty Columns

Let's start by creating a sample DataFrame that includes an empty column filled ...
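
As a minimal sketch of the definition above (the data and column names are hypothetical, not the article's), `DataFrame.dropna` with `axis=1, how="all"` removes only the columns in which every value is NaN. A column of empty strings survives, which matches the distinction drawn above:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: "notes" holds only NaN, "blank" holds empty strings
df = pd.DataFrame({
    "name": ["Ann", "Ben", "Cy"],
    "score": [90.0, np.nan, 75.0],      # partly missing, so not "empty"
    "notes": [np.nan, np.nan, np.nan],  # entirely NaN, so "empty"
    "blank": ["", "", ""],              # empty strings are not NaN
})

# how="all" drops a column only when every value in it is NaN
cleaned = df.dropna(axis=1, how="all")
print(list(cleaned.columns))  # ['name', 'score', 'blank']
```

With `how="any"` instead, the partly filled `score` column would be dropped as well, so the choice of `how` matters.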


Drop duplicate rows in PySpark DataFrame

Devesh Chauhan
Updated on 27-Mar-2026 589 Views

PySpark is the Python API for Apache Spark, designed to process large-scale data using distributed computing. Unlike pandas DataFrames, PySpark DataFrames distribute data across a cluster and follow a strict schema for optimized processing. In this article, we'll explore different methods to drop duplicate rows from PySpark DataFrames using the distinct() and dropDuplicates() methods.

Installation

Install PySpark using pip:

    pip install pyspark

Creating a PySpark DataFrame

First, let's create a sample DataFrame with duplicate rows to demonstrate the deduplication methods:

    from pyspark.sql import SparkSession
    import pandas as ...


Drop columns in DataFrame by label Names or by Index Positions

Devesh Chauhan
Updated on 27-Mar-2026 275 Views

A pandas DataFrame is a 2D data structure for storing tabular data. When working with DataFrames, you often need to remove unwanted columns. This can be done by specifying column names or their index positions using the drop() method. In this tutorial, we'll explore different methods to drop columns from a pandas DataFrame, including dropping by names, index positions, and ranges.

Creating the Sample DataFrame

Let's start by creating a sample DataFrame to work with:

    import pandas as pd
    dataset = {
        "Employee ID": ["CIR45", "CIR12", "CIR18", "CIR50", "CIR28"],
    ...
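
As a rough sketch of the three approaches named above (by name, by index position, by range): only the "Employee ID" column comes from the excerpt, and the other columns and values are hypothetical. Because `drop()` works on labels, positions are first translated to labels through `df.columns`:

```python
import pandas as pd

# Only "Employee ID" appears in the excerpt; the other columns are made up
df = pd.DataFrame({
    "Employee ID": ["CIR45", "CIR12"],
    "Name": ["A", "B"],
    "Age": [30, 41],
    "City": ["X", "Y"],
})

# 1. By label name
by_label = df.drop(columns=["Age"])

# 2. By index positions: look up the labels at positions 0 and 2
by_position = df.drop(columns=df.columns[[0, 2]])

# 3. By a range of positions: slice df.columns (positions 1 and 2 here)
by_range = df.drop(columns=df.columns[1:3])

print(list(by_label.columns))     # ['Employee ID', 'Name', 'City']
print(list(by_position.columns))  # ['Name', 'City']
print(list(by_range.columns))     # ['Employee ID', 'City']
```

Each call returns a new DataFrame and leaves `df` unchanged.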


Drop a list of rows from a Pandas DataFrame

Devesh Chauhan
Updated on 27-Mar-2026 586 Views

The pandas library in Python is widely popular for representing data in tabular structures called DataFrames. When working with data analysis, you often need to remove specific rows from your DataFrame. This article demonstrates three effective methods for dropping multiple rows from a Pandas DataFrame.

Creating a Sample DataFrame

Let's start by creating a DataFrame with student marks data:

    import pandas as pd
    dataset = {
        "Aman": [98, 92, 88, 90, 91],
        "Raj": [78, 62, 90, 71, 45],
        "Saloni": [82, ...
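
The excerpt does not show which three methods the article uses. As an illustrative sketch only, here are two common ways to drop a list of rows by index label, using the two columns that appear in full above (the list `rows_to_drop` is a hypothetical choice):

```python
import pandas as pd

# The two complete columns from the excerpt; the truncated one is left out
df = pd.DataFrame({
    "Aman": [98, 92, 88, 90, 91],
    "Raj": [78, 62, 90, 71, 45],
})

rows_to_drop = [1, 3]  # hypothetical index labels to remove

# Option A: drop() with a list of index labels
dropped = df.drop(index=rows_to_drop)

# Option B: keep the rows whose index label is not in the list
masked = df[~df.index.isin(rows_to_drop)]

print(dropped.index.tolist())  # [0, 2, 4]
```

Both options return a new DataFrame. `drop()` raises a `KeyError` if a label is missing (unless `errors="ignore"` is passed), while the `isin` mask silently skips labels that are not present.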


How to Locate Elements using Selenium Python?

Saba Hilal
Updated on 27-Mar-2026 1K+ Views

Selenium is a powerful web automation tool that can be used with Python to locate and extract elements from web pages. This is particularly useful for web scraping, testing, and automating browser interactions. In this tutorial, we'll explore different methods to locate HTML elements using Selenium with Python.

Setting Up Selenium

Before locating elements, you need to set up Selenium with a WebDriver. Here's a basic setup:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.chrome.service import Service
    import time
    # Setup Chrome driver
    driver = webdriver.Chrome()
    driver.get("https://example.com")
    time.sleep(2)
    # Always close ...


How to iterate through a nested List in Python?

Saba Hilal
Updated on 27-Mar-2026 6K+ Views

A nested list in Python is a list that contains other lists as elements. Iterating through nested lists requires different approaches depending on the structure and your specific needs.

What is a Nested List?

Here are common examples of nested lists:

    # List with mixed data types
    people = [["Alice", 25, ["New York", "NY"]],
              ["Bob", 30, ["Los Angeles", "CA"]],
              ["Carol", 28, ["Chicago", "IL"]]]
    # 3-dimensional nested list
    matrix = [ ...
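
To illustrate how the approach depends on the structure, here is a small sketch using the `people` list from the excerpt. The helper name `flatten` is hypothetical and not taken from the article. When the depth is fixed and known, each record can be unpacked directly; when the depth is arbitrary, a recursive generator is one option:

```python
people = [["Alice", 25, ["New York", "NY"]],
          ["Bob", 30, ["Los Angeles", "CA"]],
          ["Carol", 28, ["Chicago", "IL"]]]


def flatten(items):
    """Yield every non-list element, descending into nested lists."""
    for item in items:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item


# Fixed, known structure: unpack each record in the loop header
cities = [city for _name, _age, (city, _state) in people]

# Arbitrary depth: walk the structure recursively
flat = list(flatten(people))

print(cities)    # ['New York', 'Los Angeles', 'Chicago']
print(flat[:4])  # ['Alice', 25, 'New York', 'NY']
```

Unpacking fails with a `ValueError` if a record has a different shape, so the recursive version is the safer choice for irregular data.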


How to invert the elements of a boolean array in Python?

Saba Hilal
Updated on 27-Mar-2026 1K+ Views

Boolean array inversion is a common operation when working with data that contains True/False values. NumPy offers several ways to invert boolean arrays: the np.invert() function, the bitwise operator ~, and np.logical_not().

Using NumPy's invert() Function

The np.invert() function performs a bitwise NOT, which on a boolean array flips each element:

    import numpy as np
    # Create a boolean array
    covid_negative = np.array([True, False, True, False, True])
    print("Original array:", covid_negative)
    # Invert using np.invert()
    covid_positive = np.invert(covid_negative)
    print("Inverted array:", covid_positive)

Output:

    Original array: [ True False True False True]
    Inverted array: [False ...
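
As a short sketch comparing the three approaches mentioned above: on a boolean array they give the same result, but on an integer array (the hypothetical `ints` below) `np.invert` and `~` operate on bits while `np.logical_not` treats non-zero values as True:

```python
import numpy as np

covid_negative = np.array([True, False, True, False, True])

# On boolean arrays all three spellings agree
via_invert = np.invert(covid_negative)
via_operator = ~covid_negative
via_logical = np.logical_not(covid_negative)
print(via_invert.tolist())  # [False, True, False, True, False]

# On integer arrays they differ: bitwise NOT versus truthiness
ints = np.array([0, 1, 2])  # hypothetical example data
bitwise = np.invert(ints)
logical = np.logical_not(ints)
print(bitwise.tolist())  # [-1, -2, -3]
print(logical.tolist())  # [True, False, False]
```

For data that may not have a boolean dtype, `np.logical_not` is usually the safer way to express "flip True and False".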


How to Make a Bell Curve in Python?

Saba Hilal
Updated on 27-Mar-2026 2K+ Views

A bell curve (normal distribution) is a fundamental concept in statistics that appears when we plot many random observations. The Plotly library for Python provides excellent tools for creating these visualizations. This article demonstrates three practical methods to create bell curves using different datasets.

Understanding Bell Curves

The normal distribution emerges naturally when summing or averaging many independent observations. For example, rolling two dice and summing their values already gives a peaked pattern: the sum of 7 occurs most frequently, while extreme values (2 or 12) are rare. With only two dice the shape is strictly triangular, and it approaches a true bell curve as more dice are added.

Example 1: Bell Curve from Dice Roll Simulation

Let's simulate 2000 dice rolls ...
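
The article's Plotly code is not shown in the excerpt. As a minimal standard-library sketch of the simulation step only, assuming each "roll" means throwing two dice and summing them, the following counts how often each total occurs and prints a rough text histogram. The seed and the bar scaling are arbitrary choices:

```python
import random
from collections import Counter

random.seed(0)  # arbitrary fixed seed so repeated runs give the same counts

rolls = 2000
# Each roll is the sum of two independent six-sided dice
sums = [random.randint(1, 6) + random.randint(1, 6) for _ in range(rolls)]
counts = Counter(sums)

# Text histogram: one '#' per 10 occurrences of each total
for total in range(2, 13):
    print(f"{total:2d} {'#' * (counts[total] // 10)}")
```

The `counts` mapping could then be passed to a plotting library, for instance as the x and y values of a bar chart.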
