ETL stands for Extract, Transform, and Load. It is the process data-driven organizations use to gather data from multiple sources and then bring it together to support discovery, reporting, analysis, and decision-making. It is tempting to think that creating a data warehouse is simply a matter of extracting data from multiple sources and loading it into the database of a data warehouse. In fact, the ETL process is technically difficult and needs active input from multiple stakeholders, including developers, analysts, testers, and top administration. For a data warehouse to keep its value as a tool for decision-makers, the system needs to change with business developments. ETL is a constant activity ... Read More
ELT stands for Extract, Load, and Transform. It is a data integration process for transferring raw data from a source server to a data system (such as a data warehouse or data lake) on a target server and then preparing the data for downstream uses. The extract and load procedure can be isolated from the transformation phase; isolating the load from the transformation removes an inherent dependency between these phases. Beyond the data necessary for the current transformations, the extract and load process can include data that may become essential in the future. The load ... Read More
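A minimal sketch of the ELT pattern in Python, using the built-in sqlite3 module as a stand-in for the target system; the table and column names are hypothetical. Raw rows are loaded first, and the transformation runs later as SQL inside the target, decoupled from the load:

```python
import sqlite3

# Hypothetical raw rows extracted from a source system (values are illustrative).
raw_rows = [("2024-01-05", "  Alice ", "120.50"), ("2024-01-06", "Bob", "80.00")]

con = sqlite3.connect(":memory:")  # stand-in for the target warehouse
con.execute("CREATE TABLE raw_sales (sale_date TEXT, customer TEXT, amount TEXT)")

# Load phase: copy the raw data as-is, with no cleaning yet.
con.executemany("INSERT INTO raw_sales VALUES (?, ?, ?)", raw_rows)

# Transform phase: runs later, inside the target, independent of the load.
con.execute("""
    CREATE TABLE sales AS
    SELECT sale_date, TRIM(customer) AS customer, CAST(amount AS REAL) AS amount
    FROM raw_sales
""")
print(con.execute("SELECT * FROM sales").fetchall())
```

Because the raw table keeps everything that was extracted, new transformations can be added later without touching the source again.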
ETL stands for Extract, Transform, and Load. It is the process data-driven organizations use to gather data from multiple sources and then bring it together to support discovery, reporting, analysis, and decision-making. The data sources can be divergent in type, format, volume, and reliability, so the data must be processed to be useful when delivered together. The target data stores can be databases, data warehouses, or data lakes, depending on the objectives and the technical implementation. ETL consists of the following steps −Extract − During extraction, ETL identifies the data and copies it from its sources ... Read More
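The three steps can be sketched as plain functions; a minimal, hypothetical Python example (the source records and cleaning rules are invented for illustration):

```python
# Extract: identify and copy data from a source (an in-memory stand-in here).
def extract():
    return [{"name": " Alice ", "age": "34"}, {"name": "Bob", "age": "29"}]

# Transform: clean and convert the records so they are useful together.
def transform(records):
    return [{"name": r["name"].strip(), "age": int(r["age"])} for r in records]

# Load: write the transformed records into the target store (a list here).
def load(records, target):
    target.extend(records)

warehouse = []
load(transform(extract()), warehouse)
print(warehouse)  # [{'name': 'Alice', 'age': 34}, {'name': 'Bob', 'age': 29}]
```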
Data mining is the process of discovering useful new correlations, patterns, and trends by sifting through large amounts of records saved in repositories, using pattern recognition technologies including statistical and mathematical techniques. It is the analysis of factual datasets to discover unsuspected relationships and to summarize the records in novel ways that are both understandable and useful to the data owner. It is the process of selecting, exploring, and modeling large quantities of data to find regularities or relations that are at first unknown, in order to obtain clear and beneficial results for the owner of the database. Data mining is similar ... Read More
Association rule learning is an unsupervised learning technique that tests for the dependency of one data element on another and maps them accordingly, so that the relationship can be exploited cost-effectively. It tries to discover interesting relations or associations between the variables of a dataset, relying on various rules to find such relations between variables in the database. Association rule learning is an important approach in machine learning, and it is employed in market basket analysis, web usage mining, continuous production, etc. In market basket analysis, it is an approach used by several big retailers ... Read More
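A minimal market-basket sketch in Python, computing the support and confidence of one candidate rule by brute force; the transactions are invented for illustration (real miners such as Apriori additionally prune candidates by support):

```python
# Hypothetical transactions: each set is one shopping basket.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
]

def support(itemset):
    """Fraction of baskets containing every item in the itemset."""
    return sum(itemset <= b for b in baskets) / len(baskets)

def confidence(antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the baskets."""
    return support(antecedent | consequent) / support(antecedent)

# Rule {bread} -> {milk}: how often milk co-occurs with bread.
print(support({"bread", "milk"}))       # 0.5  (2 of 4 baskets)
print(confidence({"bread"}, {"milk"}))  # 0.666... (2 of the 3 bread baskets)
```

A rule is reported as "interesting" when both its support and its confidence clear user-chosen thresholds.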
Statistics is the science of learning from data. It covers everything from planning the collection of records and subsequent data administration to end-of-the-line activities such as drawing inferences from numerical facts called data and presenting the results. Statistics addresses one of the most essential human needs: the need to find out more about the world and how it works in the face of variability and uncertainty. Information is the communication of knowledge. Data are raw facts and are not knowledge by themselves. The sequence from data to knowledge is as follows: from data to information (data develop into information ... Read More
Model-based clustering is a statistical approach to data clustering. The observed (multivariate) data are assumed to have been generated from a finite mixture of component models. Each component model is a probability distribution, generally a parametric multivariate distribution. For instance, in a multivariate Gaussian mixture model, each component is a multivariate Gaussian distribution. The component responsible for generating a particular observation determines the cluster to which the observation belongs. Model-based clustering attempts to optimize the fit between the given data and some mathematical model, based on the assumption that the data are generated by a mixture of underlying ... Read More
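A minimal sketch of model-based clustering with a Gaussian mixture, assuming scikit-learn and NumPy are available; the two-component data is synthetic:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic data: two Gaussian components in 2-D.
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])

# Fit a mixture of two multivariate Gaussians via EM.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

labels = gmm.predict(X)           # hard cluster assignment per observation
probs = gmm.predict_proba(X[:3])  # soft (probabilistic) memberships
print(gmm.means_)                 # estimated component means
```

Unlike distance-based methods, the fitted model also yields soft memberships: each point gets a probability of having been generated by each component.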
The grid-based clustering methods use a multi-resolution grid data structure. They quantize the object space into a finite number of cells that form a grid structure, on which all of the clustering operations are performed. The benefit of the approach is its fast processing time, which is typically independent of the number of data objects and depends only on the number of cells in each dimension of the quantized space. Grid-based clustering uses dense grid cells to form clusters. Several interesting methods exist, such as STING, WaveCluster, and CLIQUE. STING − A statistical ... Read More
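A toy Python sketch of the core idea shared by STING/CLIQUE-style methods: quantize points into grid cells and keep the dense ones (the grid resolution and density threshold are arbitrary choices here; a full method would also merge adjacent dense cells into clusters):

```python
from collections import defaultdict

# Hypothetical 2-D points.
points = [(0.1, 0.2), (0.15, 0.25), (0.12, 0.22), (0.9, 0.8), (2.0, 2.1)]
cell_size, min_density = 0.5, 2  # arbitrary grid resolution and threshold

# Quantize each point into a grid cell.
cells = defaultdict(list)
for x, y in points:
    cells[(int(x // cell_size), int(y // cell_size))].append((x, y))

# Dense cells (enough points) become cluster candidates; later operations run
# on cells, so their cost depends on the number of cells, not of points.
dense = {c: pts for c, pts in cells.items() if len(pts) >= min_density}
print(dense)  # {(0, 0): [(0.1, 0.2), (0.15, 0.25), (0.12, 0.22)]}
```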
Advantages of High Transmission Voltage

Electric power is transmitted at very high voltages for technical and economic reasons, which are described as follows −

1. Reduces the Volume of Conductor Material

Consider electric power being transmitted through a three-phase, three-wire transmission system.

Let,

P = power transmitted (in watts)

V = line voltage (in volts)

$\cos\phi$ = load power factor

R = resistance per conductor (in ohms)

$\rho$ = resistivity of the conductor material

l = length of the transmission line (in meters)

a = cross-sectional area of the conductor

Therefore, the load current is given by,

$$I=\frac{P}{\sqrt{3}\,V\cos\phi}$$

And the resistance per conductor is

$$R=\rho\,\frac{l}{a}$$

Thus, the total power loss in ... Read More
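The excerpt cuts off at the power-loss step; completing it from the two expressions above (a standard derivation, stated here as a sketch rather than quoted from the article), the total loss in the three conductors is

$$W=3I^{2}R=3\left(\frac{P}{\sqrt{3}\,V\cos\phi}\right)^{2}\frac{\rho l}{a}=\frac{P^{2}\rho l}{a\,V^{2}\cos^{2}\phi}$$

so for a fixed allowable loss W the required cross-section is $a=\frac{P^{2}\rho l}{W\,V^{2}\cos^{2}\phi}$, and the conductor volume $3al$ varies inversely as $V^{2}$. This is why raising the transmission voltage reduces the volume of conductor material.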
There are two types of partitional algorithms, which are as follows −

K-means clustering − K-means clustering is the most common partitioning algorithm. K-means reassigns each record in the dataset to exactly one of the newly formed clusters. A record or data point is assigned to the nearest cluster using a measure of distance or similarity. The following steps are used in K-means clustering:

Select K initial cluster centroids c1, c2, c3, ..., ck.

Assign each instance x to the cluster whose centroid is nearest to x.

For each cluster, recompute its centroid based on which ... Read More
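The steps in the excerpt map almost line-for-line to code; a minimal NumPy sketch (the data is synthetic, and K and the iteration count are arbitrary choices):

```python
import numpy as np

def kmeans(X, k, iters=10, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: select k initial centroids (random data points here).
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Step 2: assign each instance to its nearest centroid (Euclidean).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: recompute each centroid as the mean of its cluster
        # (assumes no cluster becomes empty, which holds for this toy data).
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(6, 1, (50, 2))])
labels, centroids = kmeans(X, k=2)
print(centroids)  # should land near (0, 0) and (6, 6)
```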