NoSQL and Dataflow programming
Data is not always available in relational format, and in such cases we need to keep it transactional with the help of NoSQL databases.
In this chapter, we will focus on the dataflow of NoSQL. We will also learn how it works in combination with agile and data science.
One of the major reasons to use NoSQL with agile is to gain speed in the face of market competition. The following reasons show why NoSQL is a good fit for agile software methodology −
Changing the data model mid-stream has real costs, even in agile development. With NoSQL, users work with aggregate data instead of spending time normalizing it. The main point is to get something done and working, with a perfected data model as the eventual goal.
Whenever an organization creates a product, it places great emphasis on scalability. NoSQL is known for its scalability, and it works best when designed for horizontal scaling.
Ability to leverage data
NoSQL uses a schema-less data model that lets the user readily work with large volumes of data with high variability and velocity. When choosing a technology, prefer the one that leverages the data at a greater scale.
Dataflow of NoSQL
Let us consider the following example, which shows how a data model is designed around an RDBMS schema.
Following are the requirements of the schema −
User Identification should be listed.
Every user must have at least one skill.
The details of every user’s experience should be maintained properly.
The user data is normalized into three separate tables, for example one each for users, skills, and experience.
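A three-table layout for these requirements can be sketched with Python's built-in `sqlite3` module. This is only a minimal illustration: the table names, column names, and sample rows are hypothetical and do not come from the original schema.

```python
import sqlite3

# Hypothetical names throughout; the source only states the three requirements.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE skills (
        skill_id INTEGER PRIMARY KEY,
        user_id  INTEGER NOT NULL REFERENCES users(user_id),
        skill    TEXT NOT NULL
    );
    CREATE TABLE experience (
        experience_id INTEGER PRIMARY KEY,
        user_id       INTEGER NOT NULL REFERENCES users(user_id),
        company       TEXT NOT NULL,
        years         INTEGER NOT NULL
    );
""")

# Sample data (made up for illustration).
conn.execute("INSERT INTO users VALUES (1, 'Asha')")
conn.executemany(
    "INSERT INTO skills (user_id, skill) VALUES (?, ?)",
    [(1, "Python"), (1, "Spark")],
)
conn.execute(
    "INSERT INTO experience (user_id, company, years) VALUES (1, 'Acme', 3)"
)

# Reading one complete profile requires joining all three tables.
rows = conn.execute(
    """
    SELECT u.name, s.skill, e.company, e.years
    FROM users u
    JOIN skills s ON s.user_id = u.user_id
    JOIN experience e ON e.user_id = u.user_id
    WHERE u.user_id = ?
    ORDER BY s.skill
    """,
    (1,),
).fetchall()
```

Note that the "at least one skill" rule is not expressed by these table definitions; in this sketch it would have to be enforced separately, for instance in application code.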
Query complexity and time consumption both increase with normalization, which does not suit agile methodology. The same schema can be designed in a NoSQL database as a single document per user.
NoSQL maintains the structure in JSON format, which is lightweight. With JSON, applications can store objects with nested data as single documents.
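As a rough sketch of such a nested document, the snippet below builds one user record as a Python dictionary and round-trips it through JSON. The field names, the sample values, and the `validate_user` helper are assumptions made for illustration; they are not tied to any particular NoSQL database.

```python
import json

# One user stored as a single nested document (hypothetical field names).
user_document = {
    "user_id": 1,
    "name": "Asha",
    "skills": ["Python", "Spark"],
    "experience": [
        {"company": "Acme", "years": 3},
    ],
}


def validate_user(doc):
    """Check the three schema requirements in application code."""
    if "user_id" not in doc:
        raise ValueError("user identification is required")
    if not doc.get("skills"):
        raise ValueError("every user needs at least one skill")
    if "experience" not in doc:
        raise ValueError("experience details are required")
    return doc


# Serialize to JSON text and load it back without any joins.
serialized = json.dumps(validate_user(user_document))
restored = json.loads(serialized)
```

Because the store itself is schema-less, the requirements listed earlier are checked in application code here. The whole profile is then read or written as one unit.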