Select Subsets of Data in SQL Query Style in Pandas

Kiran P
Updated on 10-Nov-2020 06:52:12

371 Views

Introduction: In this post, I will show you how to perform data analysis with SQL-style filtering in Pandas. Most corporate data is stored in databases that require SQL to retrieve and manipulate it. For instance, companies like Oracle, IBM, and Microsoft have their own databases with their own SQL implementations. Data scientists have to deal with SQL at some stage of their career, as data is not always stored in CSV files. I personally prefer to use Oracle, as the majority of my company's data is stored in Oracle. Scenario 1: Suppose we are given a ... Read More

Select Subset of Data with Index Labels in Python Pandas

Kiran P
Updated on 10-Nov-2020 06:32:47

1K+ Views

Introduction: Pandas has a dual selection capability: you can select a subset of data by index position or by index label. In this post, I will show you how to select a subset of data using index labels. Remember, Python dictionaries and lists are built-in data structures that select their data either by label (dictionary keys) or by position (list indices). A dictionary's key must be a hashable object such as a string, integer, or tuple, while a list must use either integers (the position) or slice objects for selection. Pandas provides the .loc and .iloc attributes, which perform index operations in their own unique ways. ... Read More
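
As a minimal sketch of label-based versus position-based selection, the following uses a small made-up DataFrame. The player names, column names, and values are illustrative assumptions, not data from the article. It shows .loc selecting by index label and .iloc selecting by integer position.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame(
    {"titles": [20, 20, 17], "country": ["Switzerland", "Spain", "Serbia"]},
    index=["Federer", "Nadal", "Djokovic"],
)

# Label-based selection: look up a cell by row label and column label.
by_label = df.loc["Nadal", "titles"]

# Position-based selection: the same cell by row and column position.
by_position = df.iloc[1, 0]

# Label slices include both endpoints; position slices exclude the stop.
label_slice = df.loc["Federer":"Nadal"]
position_slice = df.iloc[0:1]
```

Note the difference in slicing: the label slice returns two rows, while the position slice returns only the first row.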

Use Same Positional Arguments in Python

Kiran P
Updated on 10-Nov-2020 06:22:40

868 Views

Introduction: If we were writing a program that performs arithmetic operations on two numbers, we could define them as two positional arguments. But since both arguments are of the same kind and Python data type, it might make more sense to use the nargs option to tell argparse that you want exactly two values of the same type. How to do it: 1. Let's write a program to subtract two numbers (both arguments are of the same type). Example: import argparse def get_args(): """ Function : get_args parameters used in .add_argument 1. metavar - Provide a hint to the user about the data type. - By default, all ... Read More
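
The idea described above can be sketched roughly as follows. This is not the article's full listing, which is truncated here. The get_args name comes from the teaser; the optional argv parameter, the subtract helper, and the help strings are assumptions added so the sketch can run without a real command line.

```python
import argparse


def get_args(argv=None):
    """Parse exactly two integers supplied as a single positional argument."""
    parser = argparse.ArgumentParser(description="Subtract two numbers")
    parser.add_argument(
        "numbers",
        metavar="int",  # hint shown in the usage and help text
        type=int,       # each value is converted to int
        nargs=2,        # require exactly two values
        help="two integers to subtract",
    )
    return parser.parse_args(argv)


def subtract(argv=None):
    """Hypothetical helper: return first minus second."""
    first, second = get_args(argv).numbers
    return first - second


# Passing a list stands in for command-line input such as: prog 10 4
result = subtract(["10", "4"])
```

With nargs=2, argparse stores the values as a list and rejects any call that supplies fewer or more than two numbers.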

Plot Pie Chart with a Single Slice Highlighted in Python Matplotlib

Kiran P
Updated on 10-Nov-2020 06:16:11

943 Views

Introduction: What is your favorite chart type? If you ask management or a business analyst this question, the immediate answer is the pie chart! It is a very common way of presenting percentages. How to do it: 1. Install matplotlib with the following command. pip install matplotlib 2. Import matplotlib. import matplotlib.pyplot as plt 3. Prepare temporary data. tennis_stats = (('Federer', 20), ('Nadal', 20), ('Djokovic', 17), ('Murray', 3), ) 4. The next step is to prepare the data. titles = [title for player, title in tennis_stats] players = [player for player, title in tennis_stats] 5. Create the pie chart with the values as titles and labels as player names. autopct parameter - ... Read More
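
One way to highlight a single slice is matplotlib's explode parameter, which offsets chosen wedges from the center. The sketch below reuses the data from the steps above. The choice of Murray as the highlighted slice, the 0.2 offset, and the Agg backend are assumptions for illustration, and the article's own code may differ.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

tennis_stats = (('Federer', 20), ('Nadal', 20), ('Djokovic', 17), ('Murray', 3))
titles = [title for player, title in tennis_stats]
players = [player for player, title in tennis_stats]

# Offset only the highlighted player's wedge; all others stay at 0.
highlight = 'Murray'  # hypothetical choice of slice to emphasize
explode = [0.2 if player == highlight else 0 for player in players]

fig, ax = plt.subplots()
# autopct formats each wedge's percentage label with one decimal place.
ax.pie(titles, explode=explode, labels=players, autopct='%1.1f%%')
ax.set_title('Grand Slam titles')
```

To view the result, call fig.savefig with a file name, or switch to an interactive backend and use plt.show().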

Plot 4D Scatter Plot with Custom Colours and Area Size in Python Matplotlib

Kiran P
Updated on 10-Nov-2020 06:12:56

891 Views

Introduction: Scatter plots are very useful for representing data in two dimensions to check whether there is any relationship between two variables. A scatter plot is a chart where the data is represented as dots with X and Y values. How to do it: 1. Install matplotlib with the following command. pip install matplotlib 2. Import matplotlib. import matplotlib.pyplot as plt tennis_stats = (('Federer', 20), ('Nadal', 20), ('Djokovic', 17), ('Sampras', 14), ('Emerson', 12), ('laver', 11), ('Murray', 3), ('Wawrinka', 3), ('Zverev', 0), ('Theim', 1), ('Medvedev', 0), ('Tsitsipas', 0), ('Dimitrov', 0), ('Rublev', 0)) 3. The next step is to prepare the data in any array format. We can also read the data from ... Read More

Extract Required Data from Structured Strings in Python

Kiran P
Updated on 10-Nov-2020 06:08:49

2K+ Views

Introduction: I will show you a couple of methods to extract required data/fields from structured strings. These approaches help when the input follows a known format. How to do it: 1. Let us create a dummy format to understand the approach. Report: - Time: - Player: - Titles: - Country: Report: Daily_Report - Time: 2020-10-16T01:01:01.000001 - Player: Federer - Titles: 20 - Country: Switzerland report = 'Report: Daily_Report - Time: 2020-10-10T12:30:59.000000 - Player: Federer - Titles: 20 - Country: Switzerland' 2. The first thing I noticed in the report is the separator, which is "-". We will go ahead ... Read More
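
A minimal sketch of the separator-based approach, using the report string from step 1. Splitting on " - " with the surrounding spaces, rather than on a bare "-", is an assumption made here so that the hyphens inside the date are left alone. The fields dictionary is a hypothetical name, and the article's own steps are truncated and may differ.

```python
report = (
    "Report: Daily_Report - Time: 2020-10-10T12:30:59.000000 - "
    "Player: Federer - Titles: 20 - Country: Switzerland"
)

# Split on " - " (with spaces) so the hyphens inside the date survive.
fields = {}
for part in report.split(" - "):
    # Split only once on ": " because the time value itself contains colons.
    key, value = part.split(": ", 1)
    fields[key] = value

# Values are still strings; convert where a number is expected.
titles = int(fields["Titles"])
```

After the loop, fields maps each label (Report, Time, Player, Titles, Country) to its value as a string.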

Create Microsoft Word Paragraphs and Insert Images in Python

Kiran P
Updated on 10-Nov-2020 06:00:18

486 Views

Introduction: As a data engineering specialist, I often receive test results from testers in Microsoft Word. Sigh! They put so much information into a Word document, from screenshots to very long paragraphs. The other day, the testing team asked me to help them with a program to insert tool-generated text and images (taken by automatic screenshots, which are not covered in this article). Unlike other formats, an MS Word document doesn't have the concept of a page, as it unfortunately works in paragraphs. So we need to use breaks and sections to properly divide a document. How to do it: 1. Go ahead and install ... Read More

Save HTML Table Data to CSV in Python

Kiran P
Updated on 10-Nov-2020 05:53:33

2K+ Views

Problem: One of the most challenging tasks for a data scientist is collecting data. There is plenty of data available on the web; the hard part is extracting it through automation. Introduction: I wanted to extract the basic operations data, which is embedded in HTML tables, from https://www.tutorialspoint.com/python/python_basic_operators.htm. The data is scattered across many HTML tables. If there were only one HTML table, I could obviously copy and paste it into a .csv file. However, if there are more than 5 tables on a single page, that is obviously a pain, isn't it? How to do it: 1. I will ... Read More
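
The article's own steps are truncated here, so the following is only a sketch of one possible approach, using nothing but the Python standard library (html.parser and csv). The TableParser class, the tables_to_csv function, and the sample HTML are all hypothetical and written for illustration. Nested tables are not handled.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect every <table> in a document as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per table seen so far
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def tables_to_csv(html):
    """Return one CSV-formatted string per table found in the HTML text."""
    parser = TableParser()
    parser.feed(html)
    outputs = []
    for table in parser.tables:
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerows(table)
        outputs.append(buffer.getvalue())
    return outputs


# Made-up sample with two tables, standing in for a downloaded page.
sample = (
    "<table><tr><th>Operator</th><th>Description</th></tr>"
    "<tr><td>+</td><td>Addition</td></tr></table>"
    "<table><tr><th>Operator</th></tr><tr><td>==</td></tr></table>"
)
csv_texts = tables_to_csv(sample)
```

Each string in csv_texts can then be written to its own .csv file, one file per table on the page.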

Add Legends to Charts in Python

Kiran P
Updated on 10-Nov-2020 05:46:55

4K+ Views

Introduction: The main purpose of charts is to make data easy to understand. "A picture is worth a thousand words" means that complex ideas that cannot be expressed in words can be conveyed by a single image or chart. When drawing graphs with a lot of information, a legend can help display relevant information and improve the understanding of the data presented. How to do it: In matplotlib, legends can be presented in multiple ways. Annotations that draw attention to specific points also help the reader understand the information displayed on the graph. 1. Install matplotlib by opening up the Python command prompt and firing ... Read More

Visualize API Results with Python

Kiran P
Updated on 10-Nov-2020 05:45:39

1K+ Views

Introduction: One of the biggest advantages of using an API is the ability to extract current, live data: even when the data is changing rapidly, an API will always return up-to-date data. API programs use very specific URLs to request certain information, e.g. the top 100 most played songs of 2020 on Spotify or YouTube Music. The requested data is returned in an easily processed format, such as JSON or CSV. Python lets you write API calls to almost any URL you can think of. In this example, I will show how to extract API results from GitHub and visualize ... Read More
