 
Agile Data Science - SQL versus NoSQL
This tutorial aims to follow the agile methodology with fewer steps and more useful tools. To that end, it is important to understand the difference between SQL and NoSQL databases.
Most users are familiar with SQL databases and have a good working knowledge of MySQL, Oracle, or another SQL database. Over the last several years, NoSQL databases have been widely adopted to address a variety of business problems and project requirements.
 
The following table summarizes the differences between SQL and NoSQL databases:
| SQL | NoSQL | 
|---|---|
| SQL databases are generally relational database management systems (RDBMS). | NoSQL databases are non-relational and usually distributed. Document-oriented databases, the focus of this tutorial, are one common type. |
| SQL databases store data in tables with rows and columns. A database is a collection of tables and other schema structures. | A document database stores documents as its main structure; a group of documents is called a collection. |
| SQL databases use a predefined schema. | NoSQL databases use a dynamic schema and can store unstructured data. |
| SQL databases typically scale vertically. | NoSQL databases typically scale horizontally. |
| SQL databases are a good fit for complex query environments. | NoSQL databases do not have a standard interface for complex queries. |
| SQL databases are a poor fit for hierarchical data storage. | NoSQL databases are a better fit for hierarchical data storage. |
| SQL databases are a good fit for transaction-heavy applications. | NoSQL databases are generally still considered less suitable for complex transactional applications under high load. |
| SQL databases generally have strong vendor support. | NoSQL databases rely more on community support, and fewer experts are available to set up and run large-scale deployments. |
| SQL databases focus on the ACID properties: atomicity, consistency, isolation, and durability. | NoSQL databases are usually described in terms of the CAP properties: consistency, availability, and partition tolerance. |
| SQL databases may be open source or closed source, depending on the vendor. | NoSQL databases are classified by storage type (for example document, key-value, column-family, or graph). Many are open source. |
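The schema rows of the table can be illustrated with a minimal sketch. It assumes Python's built-in `sqlite3` module as a stand-in for an SQL database and a plain list of dictionaries as a stand-in for a document collection; the `users` table, the field names, and the sample values are hypothetical and do not come from this tutorial.

```python
import sqlite3

# Relational side: the schema is declared before any data is stored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "Asha"))

# A field that the predefined schema does not know about is rejected.
try:
    conn.execute(
        "INSERT INTO users (id, name, tags) VALUES (?, ?, ?)",
        (2, "Ben", "admin"),
    )
    sql_accepted_extra_field = True
except sqlite3.OperationalError:
    sql_accepted_extra_field = False

sql_row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

# Document side (simulated): a collection is modelled as a list of dicts.
# Each document may carry different fields, including nested (hierarchical) ones.
collection = []
collection.append({"_id": 1, "name": "Asha"})
collection.append(
    {"_id": 2, "name": "Ben", "tags": ["admin"], "address": {"city": "Pune"}}
)

# Hierarchical data is read by walking the document, without a join.
city_of_ben = collection[1]["address"]["city"]
```

In a real document database the collection would live on a server rather than in a Python list, but the contrast is the same: the relational insert fails until the schema is altered, while the document with extra and nested fields is simply stored.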
Why NoSQL for agile?
The comparison above suggests why a NoSQL document database suits agile development. It is schema-less, so data modelling does not have to be completed up front. Instead, modelling is deferred to the applications and services, and developers get a better idea of how the data should be modeled as they build them. In effect, NoSQL treats the application model as the data model.
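The idea that the application model doubles as the data model can be sketched as follows. This is an illustration under stated assumptions, not a MongoDB API: the `Customer` and `Address` classes and the `to_document` helper are hypothetical names, and the resulting dictionary only represents the kind of document a document database would store.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Address:
    city: str
    country: str


@dataclass
class Customer:
    # The application's own classes define the shape of the stored data.
    name: str
    address: Address
    interests: list = field(default_factory=list)


def to_document(customer):
    """Convert the application object into a nested, JSON-ready document."""
    return asdict(customer)


customer = Customer("Asha", Address("Pune", "India"), ["spark"])
doc = to_document(customer)

# The document serializes to JSON as-is; no separate table design is needed.
serialized = json.dumps(doc, sort_keys=True)
```

If a later iteration adds a field to `Customer`, new documents simply carry the extra field while older documents remain valid, which is what makes this style convenient for short agile iterations.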
 
MongoDB Installation
Throughout this tutorial, the examples focus on MongoDB, a widely used NoSQL document database.
 
 
 
 
