
Build a Data Analysis Library from Scratch in Python

Teddy Petrou



Immerse yourself in a long, comprehensive project that teaches advanced Python concepts to build an entire library

Updated on Apr, 2024

Language - English


Category - Python, Development, Programming Languages

Lectures - 58

Duration - 7.5 hours




Course Description

Build a Data Analysis Library from Scratch in Python targets those who want to immerse themselves in a single, long, and comprehensive project that covers several advanced Python concepts. By the end of the project you will have built a fully functioning Python library that can complete many common data analysis tasks. The library will be titled Pandas Cub and will have functionality similar to the popular pandas library.

This course focuses on developing software within the massive ecosystem of tools available in Python. There are 40 detailed steps that you must complete in order to finish the project. During each step, you will be tasked with writing some code that adds functionality to the library. To complete a step, your code must pass the unit tests that have already been written for it. Once you pass all the unit tests, the project is complete. The nearly 100 unit tests give you immediate feedback on whether or not your code completes the steps correctly.
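As a rough illustration of the test-driven workflow described above, a single step might pair a small piece of library code with a pytest-style test. The names `column_sum` and `test_column_sum` are hypothetical and are not taken from the course's actual test suite; this is only a sketch of the pattern.

```python
def column_sum(values):
    """Return the sum of a list of numbers (hypothetical step function)."""
    total = 0
    for v in values:
        total += v
    return total


def test_column_sum():
    # pytest collects functions named test_* and reports any failing assert
    assert column_sum([1, 2, 3]) == 6
    assert column_sum([]) == 0
```

Saved in a file whose name starts with `test_`, this would be picked up by running `pytest` in that directory; a failing assert tells you the step is not finished yet.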

There are many important concepts that you will learn while building Pandas Cub.

  • Creating a development environment with conda

  • Using test-driven development to ensure code quality
  • Using the Python data model to allow your objects to work seamlessly with built-in Python functions and operators

  • Building a DataFrame class with the following functionality:

    • Select subsets of data with the brackets operator
    • Aggregation methods - sum, min, max, mean, median, etc.
    • Non-aggregation methods such as isna, unique, rename, drop
    • Group by one or two columns to create pivot tables
    • Specific methods for handling string columns
    • Read in data from a comma-separated value file
    • A nicely formatted display of the DataFrame in the notebook
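To give a feel for how the Python data model connects a class to built-in functions and operators, here is a minimal sketch. The class `MiniFrame` and its `to_dict` helper are hypothetical stand-ins written in pure Python; they are not the Pandas Cub DataFrame and are only an assumption about what the general approach could look like.

```python
class MiniFrame:
    """Toy column store; a hypothetical stand-in, not the Pandas Cub DataFrame."""

    def __init__(self, data):
        # data: dict mapping column name -> list of values
        lengths = {len(v) for v in data.values()}
        if len(lengths) > 1:
            raise ValueError("All columns must have the same length")
        self._data = {name: list(values) for name, values in data.items()}

    def __len__(self):
        # Lets the built-in len() report the number of rows
        for values in self._data.values():
            return len(values)
        return 0

    def __getitem__(self, key):
        # frame["a"] selects one column; frame[["a", "b"]] selects several
        if isinstance(key, str):
            return MiniFrame({key: self._data[key]})
        if isinstance(key, list):
            return MiniFrame({name: self._data[name] for name in key})
        raise TypeError("Key must be a column name or a list of column names")

    def _agg(self, func):
        # Apply func to every column, producing a one-row MiniFrame
        return MiniFrame({name: [func(values)] for name, values in self._data.items()})

    def sum(self):
        return self._agg(sum)

    def max(self):
        return self._agg(max)

    def to_dict(self):
        # Return a copy of the underlying columns for inspection
        return {name: list(values) for name, values in self._data.items()}


df = MiniFrame({"a": [1, 2, 3], "b": [10, 20, 30]})
print(len(df))               # 3
print(df[["b"]].to_dict())   # {'b': [10, 20, 30]}
print(df.sum().to_dict())    # {'a': [6], 'b': [60]}
```

Routing every aggregation through one private helper keeps each public method to a single line, which is one possible way to add min, mean, median, and similar methods without repeating the loop.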

It is my experience that many people learn just enough of a programming language like Python to complete basic tasks, but do not possess the skills to complete larger projects or build entire libraries. This course is intended for students looking for a challenging and exciting project that will take serious effort and a long time to complete.

This course is taught by expert instructor Ted Petrou, author of Pandas Cookbook, Master Data Analysis with Python, and Exercise Python.

Who this course is for:

  • Students who understand the fundamentals of Python and are looking for a longer, more comprehensive project covering advanced topics that they can immerse themselves in.


What will you learn in this course:

  • Build a fully-functioning Python library similar to pandas that you can use to do data analysis
  • Complete a large, comprehensive project
  • Test-driven development with pytest
  • Environment creation with conda
  • Advanced Python topics such as special methods and property decorators
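As an example of the property decorators mentioned in the list above, the sketch below exposes column names as an attribute with validation on assignment. The class `Table` and its attributes are hypothetical and chosen only for illustration; they are not the course's actual code.

```python
class Table:
    """Hypothetical class used only to illustrate @property."""

    def __init__(self, data):
        # data: dict mapping column name -> list of values
        self._data = dict(data)

    @property
    def columns(self):
        # Read access: table.columns (no parentheses needed)
        return list(self._data)

    @columns.setter
    def columns(self, new_names):
        # Write access: table.columns = [...] runs this validation first
        if len(new_names) != len(self._data):
            raise ValueError("Number of new names must match number of columns")
        if len(set(new_names)) != len(new_names):
            raise ValueError("Column names must be unique")
        self._data = dict(zip(new_names, self._data.values()))

    @property
    def shape(self):
        # Read-only property: (number of rows, number of columns)
        n_rows = len(next(iter(self._data.values()), []))
        return n_rows, len(self._data)


t = Table({"a": [1, 2], "b": [3, 4]})
t.columns = ["x", "y"]
print(t.columns)  # ['x', 'y']
print(t.shape)    # (2, 2)
```

Because `shape` has no setter, an assignment such as `t.shape = (1, 1)` raises `AttributeError`, which is a simple way to keep derived values read-only.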


What are the prerequisites for this course?

  • Students must know the fundamentals of Python. This is an intermediate/advanced course.
  • Must feel comfortable using and iterating through lists, tuples, sets, and dictionaries
  • Exposure to numpy and pandas is helpful


Check out the detailed breakdown of what’s inside the course

Project Genesis
3 Lectures
  • Project Overview 09:37
  • Pandas Cub Examples 13:43
  • Downloading the Material from GitHub 02:27
Environment Setup
4 Lectures
Getting Ready to Code
4 Lectures
DataFrame Construction
3 Lectures
Basic Properties and Visual Representation
7 Lectures
Subset Selection
10 Lectures
Basic Methods
6 Lectures
Value Counts
2 Lectures
Other Methods and Operators
8 Lectures
Pivot Tables
5 Lectures
Documentation, Strings, and Reading CSVs
6 Lectures

Instructor Details

Teddy Petrou


I am the author of Pandas Cookbook and Master Data Analysis with Python, highly rated texts on performing real-world data analysis with Pandas.

I am the founder of Dunder Data, a company that teaches the fundamentals of data science and machine learning. I enjoy discovering best practices on how to use and teach data analysis with Python.

I also enjoy creating open-source Python libraries and am the author of Dexplot, Bar Chart Race, DataFrame Image, and Jupyter to Medium.
