Data Science Articles


Parallel Programming in R

Bhuwanesh Nainwal
Updated on 17-Jan-2023 5K+ Views

Parallel programming is a software development practice that divides a computation or task into smaller parts that can be executed concurrently. It can improve the performance and efficiency of your R code by using multiple processors or cores in a computer or cluster. The core idea is that if an operation takes S seconds on a single processor, it should ideally take about S / N seconds when N processors are involved.

Need for Parallel Programming in R

Most of the time the code in ...
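The S / N idea above can be sketched in a few lines. The article itself concerns R; this is only an illustrative Python sketch, and the function names (`ideal_parallel_time`, `split_into_chunks`, `parallel_sum_of_squares`) are hypothetical, not taken from the article. It shows the decomposition step (split the work, run the parts concurrently, combine the results) rather than a benchmark; in CPython, threads do not speed up CPU-bound work, so real speedups usually need processes or native code, and S / N is an upper bound that overhead rarely lets you reach.

```python
from concurrent.futures import ThreadPoolExecutor


def ideal_parallel_time(serial_seconds, n_workers):
    """Ideal run time S / N; real programs are slower because of overhead."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return serial_seconds / n_workers


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous parts of near-equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The per-chunk work; each call is independent of the others."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, n_workers=4):
    """Divide the task, run the parts concurrently, then combine."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    return sum(partial_sums)
```

For example, `ideal_parallel_time(12, 4)` gives the ideal 3 seconds for a 12-second task on four workers, while `parallel_sum_of_squares(list(range(10)), 3)` splits ten numbers into three chunks and combines the partial sums.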


Manipulating Time Series Data in R with xts & zoo

Bhuwanesh Nainwal
Updated on 17-Jan-2023 1K+ Views

xts and zoo are two R packages that provide tools and functions for manipulating time series data. Both packages offer functions for reading, writing, and manipulating time series data stored in various formats, such as CSV, Excel, and other data sources. We start by introducing the xts and zoo classes, then cover basic manipulations and merging and modifying time series, and finish with applying and aggregating by time.

XTS and Zoo class Syntax

In R, xts extends the zoo class. An xts object is similar to a matrix of observations that are indexed by a time object. ...


Joining Data in R with data.table

Bhuwanesh Nainwal
Updated on 17-Jan-2023 2K+ Views

In this article, we will discuss joining data in R using the data.table package. By “joining data” we mean performing different types of join operations, such as INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN, between two or more tables. The main purpose of a join is to access data from multiple tables on the basis of a condition on some attribute (or column). R provides the data.table package, which handles tabular data (having rows and columns) very efficiently. This package was launched as an alternative ...
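To make the join types named above concrete, here is a minimal sketch in plain Python rather than R's data.table, so it does not show that package's syntax. The helper names (`inner_join`, `left_join`) and the row layout (a table as a list of dicts) are assumptions made for illustration only. An inner join keeps only rows whose key appears in both tables; a left outer join keeps every row of the left table and fills missing right-hand columns with `None`.

```python
def _index_by_key(rows, key):
    """Group rows by their key value so lookups are fast."""
    index = {}
    for row in rows:
        index.setdefault(row[key], []).append(row)
    return index


def inner_join(left, right, key):
    """Keep only the rows whose key value occurs in both tables."""
    index = _index_by_key(right, key)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def left_join(left, right, key):
    """Keep every left row; unmatched rows get None for right-hand columns."""
    index = _index_by_key(right, key)
    right_columns = {col for row in right for col in row if col != key}
    result = []
    for l in left:
        matches = index.get(l[key])
        if matches:
            result.extend({**l, **r} for r in matches)
        else:
            result.append({**l, **{col: None for col in right_columns}})
    return result


# Hypothetical example tables, joined on the "id" column.
employees = [
    {"id": 1, "name": "Ana"},
    {"id": 2, "name": "Raj"},
    {"id": 3, "name": "Li"},
]
salaries = [
    {"id": 1, "salary": 50},
    {"id": 3, "salary": 70},
    {"id": 4, "salary": 90},
]
```

With these tables, `inner_join(employees, salaries, "id")` returns the rows for ids 1 and 3 only, while `left_join(employees, salaries, "id")` also keeps id 2 with a `None` salary. A right outer join is the same operation with the two tables swapped.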


Defensive R Programming

Bhuwanesh Nainwal
Updated on 17-Jan-2023 521 Views

Defensive programming is a software development practice that involves designing and implementing code in a way that anticipates and prevents errors and vulnerabilities. In R, defensive programming means using techniques and strategies to ensure that your code is robust, reliable, and secure. The word “defensive” may suggest writing code that never fails at all, but defensive programming actually means writing code that fails properly. By “failing properly”, we mean: if the code fails, then it should be ...
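The excerpt is cut off before it spells out what “failing properly” requires, so the sketch below rests on one common reading, stated here as an assumption: validate inputs early and stop with a specific, readable error instead of returning a misleading result. It is written in Python rather than R, and the function name `mean_of` is hypothetical.

```python
def mean_of(values):
    """Return the arithmetic mean, failing early with a clear message."""
    # Assumption: "failing properly" means rejecting bad input up front.
    if not isinstance(values, (list, tuple)):
        raise TypeError(
            f"values must be a list or tuple, got {type(values).__name__}"
        )
    if len(values) == 0:
        raise ValueError("values must contain at least one number")
    for position, value in enumerate(values):
        # bool is a subclass of int, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(
                f"values[{position}] is not a number: {value!r}"
            )
    return sum(values) / len(values)
```

Calling `mean_of([1, 2, 3])` returns the mean, whereas `mean_of([])` raises a `ValueError` that names the problem instead of surfacing a division-by-zero error from deep inside the function.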


Introduction to Data Science in Python

Prabhdeep Singh
Updated on 11-Jan-2023 1K+ Views

As the world entered the era of big data in recent decades, the demand for more effective and efficient data storage greatly expanded. Businesses that use big data invest a lot of time and energy in creating frameworks that can hold a lot of information. Frameworks like Hadoop then made it possible to store vast amounts of data. With the storage problem addressed by these frameworks, the next problem is processing the data that has already been stored. The solution to processing the data and getting the useful information ...


Python Data Science using List and Iterators

Prabhdeep Singh
Updated on 11-Jan-2023 449 Views

Data science is the process of organizing, processing, and analyzing vast amounts of data in order to extract knowledge and insights from them. It involves a number of different fields, including statistical and mathematical modelling, data extraction from its source, and methods for data visualization. It commonly requires working with big data technology to gather both structured and unstructured data. In the sections that follow, we'll examine several applications of data science and how Python can be useful in them. Python is a widely used high-level, general-purpose, object-oriented, and interpreted language. To utilize Python for a task, one only needs to ...


Introduction to Git for Data Science

Prabhdeep Singh
Updated on 11-Jan-2023 2K+ Views

The data science and engineering fields are interacting more and more because data scientists are working on production systems and joining R&D teams. We want to make it simpler for data scientists without prior engineering experience to understand the core engineering best practices. We are building a manual on the engineering subjects that data science practitioners tell us they think about, such as Git, Docker, cloud infrastructure, and model serving.

Introduction to Git

Git is a version control system designed to keep track of changes made to source code over time. Without a version control system, collaboration between multiple people ...


Software Engineering for Data Scientists in Python

Prerna Tiwari
Updated on 09-Jan-2023 386 Views

Data science integrates math and statistics, specialized programming, advanced analytics, machine learning, and artificial intelligence (AI) with specific subject matter expertise to reveal actionable insights hidden in an organization’s data. Data science is one of the fastest-growing fields across all industries, a result of the increasing number of data sources and the volume of data they produce. As a field, data science has generated controversy among other disciplines ever since it began to gain recognition. In this article we will learn about the fundamentals of software engineering, why it ...


Parallel Computing with Dask

Prerna Tiwari
Updated on 09-Jan-2023 672 Views

Dask is a flexible open-source Python library used for parallel computing. In this article, we will learn about parallel computing and why we should choose Dask for this purpose. We will compare it with other libraries such as Spark, Ray, and Modin, and we also discuss use cases of Dask.

Parallel Computing

Parallel computing is a type of computation that carries out several computations or processes simultaneously. Large problems are typically divided into manageable pieces that can be solved separately. The four categories of parallel computing are bit-level, instruction-level, data-level, and job parallelism. ...
