Tuesday, September 19, 2017

NoSQL screwing up

        This post is about my current company (previous, in case I quit before you see this). The company has used an RDBMS for years, and it is trusted. The architects here started using Couchbase simply because NoSQL offers options to scale out. But the user permissions for each control live in the RDBMS; they are pulled at login and saved in the access token in Couchbase. Another problem arose: the admins can grant and revoke the permissions of any user or user group on any content or any specific collection of contents, let's call it document X. Catching those changes with triggers would have meant applying triggers to about 25 tables. Syncing with the RDBMS on every query was a big problem. They decided to keep a copy of the permissions in Couchbase and sync only the permission documents. Syncing through triggers or a scheduled job was also too much effort. This was the least feasible option to take when the database structure is already so big. Let's look at the other architectural decisions that were made.
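        To make the sync problem concrete, here is a minimal sketch of the login flow described above. It is not the company's code: the `permission` table, the `login` and `is_allowed` helpers, and the plain dict standing in for the token store in Couchbase are all assumptions made for illustration, with SQLite playing the role of the RDBMS.

```python
import sqlite3
import uuid


def make_db():
    # Hypothetical permission table; the real schema spans about 25 tables.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE permission (user_id TEXT, control TEXT)")
    db.executemany(
        "INSERT INTO permission VALUES (?, ?)",
        [("u1", "edit"), ("u1", "view")],
    )
    return db


def login(db, token_store, user_id):
    # Pull the permissions once at login and copy them into the token.
    rows = db.execute(
        "SELECT control FROM permission WHERE user_id = ? ORDER BY control",
        (user_id,),
    ).fetchall()
    token = uuid.uuid4().hex
    token_store[token] = {"user_id": user_id, "controls": [r[0] for r in rows]}
    return token


def is_allowed(token_store, token, control):
    # Checks only the copy in the token; the RDBMS is never consulted.
    return control in token_store[token]["controls"]


db = make_db()
tokens = {}  # stand-in for the token documents kept in Couchbase
old_token = login(db, tokens, "u1")

# An admin revokes "edit" in the RDBMS after the user has logged in.
db.execute(
    "DELETE FROM permission WHERE user_id = ? AND control = ?", ("u1", "edit")
)

stale_answer = is_allowed(tokens, old_token, "edit")   # copy is out of date
new_token = login(db, tokens, "u1")
fresh_answer = is_allowed(tokens, new_token, "edit")   # reflects the revoke
```

        In this sketch the old token keeps granting "edit" until something refreshes it, which is the gap that triggers, a scheduled job, or a synced permission copy would each have to close.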
        The document X in Couchbase is very big. Each such document contains multiple users and multiple contents, duplicated in each. Though there is no problem with the duplication, the architects decided to break it down into several small documents. User documents and Content documents were pulled out, and Mapping documents of three types were made: between User and Content, between X and User, and between X and Content. But Couchbase, used through keys and views, doesn't support joins. NoSQL stores are not really meant to support joins. I made many views to get the data. But that multiplied the interactions with Couchbase three times.
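        The cost of the split can be sketched with a toy key-value store that counts lookups. This is only an illustration: `CountingStore`, the key names such as `map:x-user:...`, and the document fields are made up, and the User-Content mapping is left out to keep it short. It does not use the Couchbase SDK.

```python
class CountingStore:
    """Hypothetical in-memory stand-in for a bucket that counts lookups."""

    def __init__(self, docs):
        self._docs = dict(docs)
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1
        return self._docs[key]


def load_x_denormalized(store, x_id):
    # One big document X: a single lookup by key returns everything.
    return store.get(x_id)


def load_x_normalized(store, x_id):
    # Split layout: one lookup per mapping document, then one lookup per
    # referenced user and content. The "join" happens in the client.
    user_ids = store.get(f"map:x-user:{x_id}")["user_ids"]
    content_ids = store.get(f"map:x-content:{x_id}")["content_ids"]
    users = [store.get(uid) for uid in user_ids]
    contents = [store.get(cid) for cid in content_ids]
    return {"id": x_id, "users": users, "contents": contents}


users = {"user:1": {"name": "alice"}, "user:2": {"name": "bob"}}
contents = {"content:1": {"title": "a"}, "content:2": {"title": "b"}}

big = CountingStore({
    "x:1": {
        "id": "x:1",
        "users": list(users.values()),
        "contents": list(contents.values()),
    },
})
small = CountingStore({
    **users,
    **contents,
    "map:x-user:x:1": {"user_ids": ["user:1", "user:2"]},
    "map:x-content:x:1": {"content_ids": ["content:1", "content:2"]},
})

doc_big = load_x_denormalized(big, "x:1")
doc_small = load_x_normalized(small, "x:1")
```

        Both calls return the same data, but the split layout needs one lookup per mapping plus one per referenced document, while the big document needs exactly one. Views or batched reads can reduce that number, yet the extra interactions do not disappear.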
        So, to replace the views, the architects decided to use another data source, which we will come back to. If Couchbase is used only for lookups by key, then why do we need Couchbase? Can't we use Memcached directly? Couchbase internally uses a pool of Memcached. But no, the architects want to keep the NoSQL scaling option open. Then, to solve our problem, the architects introduced Elasticsearch.
        I strongly opposed the use of Elasticsearch. If there is no foreseeable requirement for search in this project, then using Elasticsearch for id lookups is under-utilization. The plan was to keep the big document X in Elasticsearch, index only the ids, and retrieve only the required document fields. But id lookup is hardly 5% of what Elasticsearch can do. The option is still under consideration because of the firm opposition, and also because adding a third layer to the system for a single project is not preferred. Though other projects of the company could use the same documents in the future, the added third layer and the problem of syncing the three data sources are a big headache.
        Now, coming to the point. If we don't use views in Couchbase, Couchbase is useless; it is just a cache then. And if we are going to put many different types of documents in Couchbase, views made for one type of document will be applied to all types of documents. It's preferable not to use more than 10 views on each bucket. This makes Couchbase too restricted. Also, views work well only if documents are not added or modified frequently; otherwise the views have to be updated for every changed document, along with the cached stale data.
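        Here is a rough Python simulation of why a view written for one document type still touches every document in the bucket. Real Couchbase map functions are written in JavaScript; `build_view`, `users_by_name`, and the `type` field on each document are assumptions made only for this sketch.

```python
def build_view(bucket, map_fn):
    """Run map_fn over every document in the bucket, as a view build does."""
    index = []
    calls = 0
    for key, doc in bucket.items():
        calls += 1
        index.extend(map_fn(key, doc))
    index.sort(key=lambda row: row[0])  # view rows are ordered by key
    return index, calls


def users_by_name(key, doc):
    # The map function sees every document, so it must filter by type itself.
    if doc.get("type") != "user":
        return []
    return [(doc["name"], key)]


bucket = {
    "user:1": {"type": "user", "name": "bob"},
    "user:2": {"type": "user", "name": "alice"},
    "content:1": {"type": "content", "title": "a"},
    "content:2": {"type": "content", "title": "b"},
    "content:3": {"type": "content", "title": "c"},
}

rows, calls = build_view(bucket, users_by_name)
```

        Only two rows come out, but the map function ran five times, once per document. With ten such views on a mixed bucket, every added or modified document would be pushed through all ten map functions.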
        The same goes for most NoSQL stores. They are not meant to support relational capabilities. Maybe I am thinking too narrowly. But instead of developing NoSQL to have relational capabilities, it's better to develop RDBMSs to have speed and scaling capabilities. An RDBMS could also offer a less structured data model, something like column families in Cassandra. It could also have Solr- or Elasticsearch-like indexing, to support text search as well.
