Articles on Trending Technologies
Technical articles with clear explanations and examples
Cleaning Data with Dropna in Pyspark
Data cleaning is a crucial step in any data analysis or data science project to ensure accuracy and reliability. PySpark's dropna() function provides powerful capabilities for removing rows containing missing or null values from DataFrames, making it essential for big data processing. The dropna() function allows you to specify conditions for removing rows based on missing values, with flexible parameters for different cleaning strategies.
Syntax
df.dropna(how="any", thresh=None, subset=None)
Parameters
how − Determines when to drop rows. Use "any" to drop rows with any null values, or "all" to drop only rows where ...
Modelling Steady Flow Energy Equation in Python
The Steady Flow Energy Equation (SFEE) applies conservation of energy to open systems where fluid flows continuously through a control volume. This equation is fundamental in analyzing turbomachines, nozzles, diffusers, and other fluid flow devices.
[Figure: control volume with inlet (i) state p_i, V_i, h_i, z_i and exit (e) state p_e, V_e, h_e, z_e; Q̇ ...]
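As a rough illustration of the energy balance described above, the sketch below evaluates one common form of the SFEE, Q̇ − Ẇ = ṁ[(h_e − h_i) + (V_e² − V_i²)/2 + g(z_e − z_i)]. The function names, the sign convention (heat added to and work done by the control volume are positive), the SI units, and the nozzle numbers are all assumptions made for this example, not taken from the article.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed constant)


def sfee_heat_minus_work(m_dot, h_i, h_e, V_i, V_e, z_i=0.0, z_e=0.0):
    """Return Q_dot - W_dot in watts for a steady-flow control volume.

    m_dot is in kg/s, enthalpies in J/kg, velocities in m/s, elevations in m.
    """
    delta_h = h_e - h_i                   # enthalpy change per unit mass
    delta_ke = (V_e**2 - V_i**2) / 2.0    # kinetic energy change per unit mass
    delta_pe = G * (z_e - z_i)            # potential energy change per unit mass
    return m_dot * (delta_h + delta_ke + delta_pe)


def nozzle_exit_velocity(h_i, h_e, V_i=0.0):
    """Exit velocity of an adiabatic nozzle with no work and no elevation change."""
    return math.sqrt(V_i**2 + 2.0 * (h_i - h_e))


# Hypothetical nozzle: enthalpy drops by 200 kJ/kg, inlet velocity is 50 m/s.
V_e = nozzle_exit_velocity(h_i=3_000_000.0, h_e=2_800_000.0, V_i=50.0)
balance = sfee_heat_minus_work(1.0, 3_000_000.0, 2_800_000.0, 50.0, V_e)
print(f"exit velocity: {V_e:.1f} m/s")
print(f"|Q_dot - W_dot|: {abs(balance):.3f} W")
```

For an adiabatic nozzle with no shaft work, the balance should come out as essentially zero, which is a quick sanity check on the computed exit velocity.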
Introduction to NSE Tools Module in Python
The NSE (National Stock Exchange of India Limited) is India's leading stock exchange, established in 1992 as the country's first dematerialized exchange. Python's nsetools library provides easy access to NSE data for real-time stock market analysis.
What is NSE Tools Module?
The nsetools library is a Python package that allows developers to fetch live stock market data from the National Stock Exchange. It provides real-time quotes, stock prices, indices, and market statistics without requiring complex API authentication.
Key Features
- Works instantly without complex setup requirements
- Provides real-time data from NSE at high speed
- Covers all ...
Finding Words Lengths in String using Python
Finding the lengths of individual words in a string is a common task in text processing and data analysis. Python provides several approaches to accomplish this, from simple loops to more advanced techniques using regular expressions and dictionaries.
Methods Used
- Using a loop and the split() function
- Using the map() function with len and split()
- Using the re.split() method from the re module
- Using a Dictionary to store word lengths
Using a Loop and the split() Function
This is the most straightforward ...
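The four approaches named above can be sketched roughly as follows. The function names and the sample sentence are made up for illustration, and the regular expression used with re.split() is one reasonable choice rather than necessarily the one the article uses.

```python
import re


def word_lengths_loop(text):
    """Collect word lengths with an explicit loop over split()."""
    lengths = []
    for word in text.split():
        lengths.append(len(word))
    return lengths


def word_lengths_map(text):
    """Apply len to every word with map()."""
    return list(map(len, text.split()))


def word_lengths_regex(text):
    """Split on runs of non-word characters, skipping empty pieces."""
    return [len(word) for word in re.split(r"\W+", text) if word]


def word_lengths_dict(text):
    """Map each distinct word to its length."""
    return {word: len(word) for word in text.split()}


sample = "Python makes text processing simple"
print(word_lengths_loop(sample))
print(word_lengths_map(sample))
print(word_lengths_regex(sample))
print(word_lengths_dict(sample))
```

The regex variant differs from the split() variants when the text contains punctuation: re.split(r"\W+", ...) drops it, while split() keeps punctuation attached to the neighbouring word.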
Change Value in Excel Using Python
In this article, we will learn different approaches to change values in Excel files using Python. We'll explore two main libraries: openpyxl for modern Excel formats and xlwt/xlrd/xlutils for legacy formats.
Using Openpyxl
Openpyxl is a Python library designed for working with Excel spreadsheets. It supports modern Excel file formats including:
- XLSX (Microsoft Excel Open XML Spreadsheet)
- XLSM (Microsoft Excel Open XML Macro-Enabled Spreadsheet)
- XLTM (Microsoft Excel Open XML Macro-Enabled Template)
- XLTX (Microsoft Excel Open XML Template)
Key Features
- Reading and Writing: Create, modify, and save Excel files
- Data Manipulation: Sort, filter, ...
Cluster Sampling in Pandas
In this article, we will learn how to perform cluster sampling in Pandas. Before diving into that, let's look at what sampling is in Pandas and how it helps us analyze data efficiently.
Sampling in Pandas
In Pandas, sampling refers to the process of selecting a subset of rows or columns from a DataFrame or Series object. Sampling can be useful in many data analysis tasks, such as data exploration, testing, and validation. Pandas provides several methods for sampling data, including:
- DataFrame.sample(): This method returns a random sample of rows from a ...
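One way to sketch cluster sampling with sample() is to draw whole clusters at random and keep every row that belongs to them. The column names, the toy data, and the cluster_sample helper below are hypothetical and only illustrate the idea; the article's own approach may differ.

```python
import pandas as pd

# Toy data: each row belongs to one of four clusters (hypothetical values).
df = pd.DataFrame({
    "cluster": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "value": [10, 12, 20, 22, 30, 32, 40, 42],
})


def cluster_sample(frame, cluster_col, n_clusters, seed=None):
    """Pick n_clusters cluster labels at random and return all of their rows."""
    labels = frame[cluster_col].drop_duplicates()
    chosen = labels.sample(n=n_clusters, random_state=seed)
    return frame[frame[cluster_col].isin(chosen)]


sampled = cluster_sample(df, "cluster", n_clusters=2, seed=0)
print(sampled)
```

Unlike calling df.sample() directly on the rows, this keeps each selected cluster intact, which is the defining property of cluster sampling.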
Clear LRU Cache in Python
In this article, we will learn how to clear an LRU cache implemented in Python. An LRU (Least Recently Used) cache is a data structure that improves application performance by storing frequently used data and removing the least recently used items when the cache becomes full. It is particularly useful in applications with high-cost data retrieval operations, such as disk I/O or network access. By caching frequently used data in memory, applications can significantly reduce expensive operations and improve performance.
Understanding LRU Cache in Python
Python's functools module provides the @lru_cache decorator to implement LRU caching. This ...
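A minimal sketch of clearing such a cache, using the cache_info() and cache_clear() methods that functools attaches to a function decorated with @lru_cache. The square function and the maxsize value are placeholders chosen for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def square(n):
    """Stand-in for an expensive computation."""
    return n * n


square(4)  # first call: computed and stored (a cache miss)
square(4)  # second call: served from the cache (a cache hit)
info_before = square.cache_info()

square.cache_clear()  # empty the cache and reset its statistics
info_after = square.cache_info()

print(info_before)
print(info_after)
```

After cache_clear(), the next call to square(4) is computed again, so clearing is the usual way to force fresh results when the underlying data has changed.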
Check if a String is Present in a Pdf File in Python
In today's digital world, PDF files have become an essential medium for storing and sharing information. Python provides several libraries that allow us to interact with PDF files and extract information from them. One common task is to search for a particular string within a PDF file. However, the simple text-based approach shown below has significant limitations. Opening a PDF file as plain text will not work properly because PDFs contain binary data, formatting, and metadata. For real PDF processing, you should use specialized libraries like PyPDF2 or pdfplumber.
Basic Text Search Approach (Limited)
This approach treats ...
Change the View of Tensor in PyTorch
PyTorch tensors support the view() method to reshape tensor dimensions without copying data. This is essential for deep learning operations where you need to transform tensor shapes for different layers.
What is tensor.view()?
The view() method returns a new tensor with the same data but different shape. It's memory-efficient because it creates a new view of the existing data rather than copying it.
Syntax
tensor.view(*shape)
tensor.view(rows, columns)
The total number of elements must remain constant. For a tensor with 12 elements, valid shapes include (12,), (3, 4), (2, 6), etc.
Basic ...
How to remove axes spine from the plot in seaborn?
The axes spines, also known as the axis lines, are the lines that define the borders or boundaries of a plot's coordinate system. In a two-dimensional plot, there are typically four axes spines: top, bottom, left, and right. These spines create the framework for the plot and serve as reference lines for the data points. Each spine represents one of the four sides of the plot. The top spine runs horizontally across the top, the bottom spine runs horizontally across the bottom, the left spine runs vertically along the left side, and the right spine runs vertically along the ...