
Found 160 Articles for Data Science

2K+ Views
Data manipulation is a crucial step in the data analysis process because it lets us prepare and organize data in a form suited to a specific analysis or visualization. There are many tools and techniques for data manipulation, depending on the type and structure of the data and on the goals of the manipulation. data.table is an R package that provides an enhanced version of R's data.frame class. Its syntax and features make it easier and faster to manipulate and work with large datasets. The data.table is one ... Read More

869 Views
As the world entered the era of big data in recent decades, the demand for more effective and efficient data storage grew sharply. Businesses that use big data invest a lot of time and energy in building frameworks that can hold large amounts of information. Frameworks like Hadoop then made it possible to store vast amounts of data. With the storage problem addressed, the next problem is processing the data that has already been stored. The solution to processing the data and getting the useful information ... Read More

2K+ Views
The data science and engineering fields increasingly overlap because data scientists are working on production systems and joining R&D teams. We want to make it simpler for data scientists without prior engineering experience to understand core engineering best practices. We are building a manual on the engineering subjects that data science practitioners tell us they think about, such as Git, Docker, cloud infrastructure, and model serving. Introduction to Git: Git is a version control system designed to keep track of changes made to source code over time. Without a version control system, collaboration between multiple people ... Read More

327 Views
Data science is the process of organizing, processing, and analyzing vast amounts of data in order to extract knowledge and insights from them. It draws on a number of different fields, including statistical and mathematical modelling, data extraction from the source, and data visualization methods. It commonly requires working with big data technology to gather both structured and unstructured data. In the sections that follow, we'll examine several applications of data science and how Python can be useful in each. Python is a widely used high-level, general-purpose, object-oriented, interpreted language. To utilize Python for a task, one only needs to ... Read More

372 Views
Python is a general-purpose, object-oriented, interpreted, high-level language, and it is very popular. Python has a rich set of libraries that contain pre-defined code for almost every purpose, so to use Python for a task one mostly needs to supply the logic, since the libraries handle much of the coding. Python also has a large community of developers, which benefits newcomers and experienced users alike because help with bugs is easy to find. Before moving on to the introduction of Python for data science, let's see some basics of data science. What is Data Science? ... Read More

324 Views
Data science integrates math and statistics, specialized programming, advanced analytics, machine learning, and artificial intelligence (AI) with specific subject matter expertise to reveal actionable insights hidden in an organization's data. Data science is one of the fastest-growing fields across all industries. This is a result of the growing number of data sources and the volume of data they produce. As a field, data science has generated controversy among other disciplines ever since it began to gain recognition. In this article we will be learning about the fundamentals of software engineering, why it ... Read More

585 Views
Dask is a flexible open-source Python library used for parallel computing. In this article, we will learn about parallel computing and why we should choose Dask for this purpose. We will compare it with other libraries such as Spark, Ray, and Modin. We will also discuss use cases of Dask. Parallel Computing: Parallel computing is a type of computation that carries out several computations or processes simultaneously. Large problems are typically divided into manageable pieces that can be solved separately. The four categories of parallel computing are bit-level, instruction-level, data-level, and job (task) parallelism. ... Read More
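The idea of dividing a large problem into pieces that are solved separately can be sketched with Python's standard library alone. This is only an illustration of data-level parallelism, not Dask's API: the function names below are hypothetical, and a thread pool is used for simplicity. For CPU-bound work in CPython, one would more likely swap in `ProcessPoolExecutor` or a library such as Dask.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Divide a list into n_chunks contiguous pieces of roughly equal size."""
    size, remainder = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `remainder` chunks each take one extra element.
        end = start + size + (1 if i < remainder else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """Per-piece work: each chunk is processed independently of the others."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, n_workers=4):
    """Split the input, process the pieces concurrently, then combine the results."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    return sum(partial_sums)
```

Calling `parallel_sum_of_squares(list(range(10)))` splits the ten numbers into four chunks, sums the squares of each chunk in a worker, and adds the four partial sums together.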

836 Views
Cleansing, transforming, and analyzing raw data is the first step in obtaining useful, pertinent information that can help businesses make informed decisions. By offering relevant information and facts, usually presented as charts, pictures, tables, and graphs, data analysis helps to lower the risks associated with decision-making. Data analysis is concerned with converting unprocessed data into pertinent statistics, knowledge, and explanations. It is a crucial competence that can support better decision-making. Spreadsheets are the most commonly used tools for data analysis, and built-in pivot tables are the most popular analytical tool. ... Read More

323 Views
Pandas is a Python library designed for data manipulation and analysis. It provides two main data structures. Series: a one-dimensional labelled array (like a column in a spreadsheet). DataFrame: a two-dimensional labelled data structure (like a table) that can store multiple columns with different data types. With these data structures, Pandas lets us perform complex data manipulations. It can work with different file formats such as CSV and Excel. In this article, we will learn how to analyze data activity using ... Read More
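As a minimal sketch of the two structures described above, the following builds a Series and a DataFrame from small made-up values. The item names, prices, and column names are invented for illustration and do not come from the article.

```python
import pandas as pd

# A Series: one-dimensional, with a label (index) attached to each value.
prices = pd.Series(
    [1.5, 0.25, 3.0],
    index=["apple", "banana", "cherry"],
    name="price",
)

# A DataFrame: two-dimensional, with columns that may hold different data types.
inventory = pd.DataFrame(
    {
        "item": ["apple", "banana", "cherry"],  # strings
        "price": [1.5, 0.25, 3.0],              # floats
        "in_stock": [10, 0, 4],                 # integers
    }
)

# Column-wise arithmetic produces a new column without an explicit loop.
inventory["stock_value"] = inventory["price"] * inventory["in_stock"]

# Boolean indexing keeps only the rows that satisfy a condition.
available = inventory[inventory["in_stock"] > 0]
```

Values in the Series are looked up by label, for example `prices["banana"]`, while `available` keeps only the rows of `inventory` whose `in_stock` value is greater than zero.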

2K+ Views
Data mining and data science are two important concepts in information technology. Data mining is the process of extracting useful information, trends, and patterns from large databases so that they can be used to solve business problems. Data science, on the other hand, is the process of obtaining important insights from structured and unstructured data by using different analysis tools. Data science is a modern, emerging field of computer science and information technology concerned with large-scale data analysis. Read this article to learn more about Data Mining and Data Science ... Read More