What are the criteria for selecting the data sources?
The main criteria for selecting the data sources are as follows −
Data accessibility − Suppose two possible feeds exist for the data. One is stored in binary files maintained by a set of programs written before the youngest project team member was born. The other is a system that already reads those binary files and performs additional processing. The decision is then obvious: use the second feed, because its data is far easier to reach.
Data accuracy − As data is passed from system to system, many modifications are made. Sometimes data elements from other systems are added, sometimes existing elements are processed to create new elements, and sometimes elements are dropped.
Each system performs its own function well. However, it may become difficult or impossible to recognize the original data, and in some cases the data no longer represents what the business wants for analysis. If you supply data from these downstream systems, the users may question its accuracy, so it is usually safer to go back to the system where the data originates.
Project scheduling − In many organizations, the data warehouse project begins as part of a rewrite of an existing OLTP system. As the new system development project starts to unfold, the business users, now firmly convinced of the value of a data warehouse, often begin to insist that it be implemented sooner rather than later.
To provide historical data, you need to include the data from the existing system in your data warehouse anyway. If the rewrite of the old system is held up, the data warehouse can continue to use the current system. Once the new system is released for production, the data feeds can be switched to it. In many cases, it is possible to deliver the data warehouse before the new operational system is completed.
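The feed switch described above can be sketched in Python. This is a minimal illustration, not a prescribed design: the names `legacy_feed`, `new_system_feed`, `FEEDS`, and `load_warehouse`, along with the sample rows, are all hypothetical. The idea is only that the load step depends on a named feed, so the source can be changed without touching the load logic.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
Feed = Callable[[], Iterable[Row]]


def legacy_feed() -> Iterable[Row]:
    # Hypothetical stand-in for an extract from the existing OLTP system.
    return [{"order_id": 1, "amount": 120.0}, {"order_id": 2, "amount": 75.5}]


def new_system_feed() -> Iterable[Row]:
    # Hypothetical stand-in for an extract from the rewritten system.
    return [{"order_id": 3, "amount": 42.0}]


# Registry of available feeds; the active one is chosen by name.
FEEDS: Dict[str, Feed] = {"legacy": legacy_feed, "new": new_system_feed}


def load_warehouse(active_feed: str, warehouse: List[Row]) -> int:
    """Append rows from the selected feed and return how many were loaded."""
    rows = list(FEEDS[active_feed]())
    warehouse.extend(rows)
    return len(rows)


warehouse: List[Row] = []
load_warehouse("legacy", warehouse)  # history comes from the current system
load_warehouse("new", warehouse)     # later loads switch to the new system
```

Because both feeds return rows in the same shape, the warehouse keeps its history from the old system while later loads come from the new one.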
Some dimensional information usually comes with the transaction or fact data, but it is usually minimal and often only in the form of codes. The additional attributes that the users want and need are fed from other systems or from shared master files.
In many instances, there can be multiple master files, especially for the customer dimension. Separate files are often used across an organization: Sales, Marketing, and Finance may each have their own customer master file.
This raises two difficult issues. First, the customers included in these files may differ, and so may the attributes kept about each customer. Second, the information the files have in common may not match. With unlimited time and money, you could pull rich data from all the sources and combine it into a single comprehensive view of customers.
In most cases, there is not enough time or money to do that all at once. It is then recommended that the users prioritize the information, so that you start with what you can deliver and expand in the future.
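A small Python sketch can make the two master-file issues concrete. It is illustrative only. The sample records, the `compare_master_files` helper, and especially the assumption that both files share a common customer ID are hypothetical. In practice, matching customers across files is often the hardest part.

```python
from typing import Dict, List

# Each master file maps a customer ID to that customer's attributes.
MasterFile = Dict[str, Dict[str, str]]

# Hypothetical sample data for two departments.
sales: MasterFile = {
    "C001": {"name": "Acme Corp", "city": "Boston"},
    "C002": {"name": "Globex", "city": "Denver"},
}
finance: MasterFile = {
    "C001": {"name": "ACME Corporation", "credit_terms": "NET30"},
    "C003": {"name": "Initech", "credit_terms": "NET60"},
}


def compare_master_files(a: MasterFile, b: MasterFile) -> Dict[str, object]:
    """Report differing customers, differing attributes, and mismatched values."""
    attribute_gaps: Dict[str, List[str]] = {}
    value_conflicts: Dict[str, List[str]] = {}
    for cid in sorted(a.keys() & b.keys()):
        attrs_a, attrs_b = a[cid], b[cid]
        # Attributes kept by only one of the two files.
        gaps = sorted(attrs_a.keys() ^ attrs_b.keys())
        if gaps:
            attribute_gaps[cid] = gaps
        # Attributes both files keep, but with different values.
        conflicts = sorted(
            k for k in attrs_a.keys() & attrs_b.keys() if attrs_a[k] != attrs_b[k]
        )
        if conflicts:
            value_conflicts[cid] = conflicts
    return {
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
        "attribute_gaps": attribute_gaps,
        "value_conflicts": value_conflicts,
    }


report = compare_master_files(sales, finance)
```

A report like this gives the users something specific to prioritize: which customers to reconcile first, and which conflicting attributes matter most for analysis.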