Tuesday 9 April 2019

CAP theorem as an architectural tool

When we write an application that needs to scale in the cloud, putting an architecture in place that can scale horizontally becomes extremely important. Let's look at one example. The diagram below captures a very early draft of our product's top-level use case diagram.

The diagram consists of five different actors and six different top-level use cases.  Here we provide a short explanation for each of these.
  1. Authentication And Authorization -- This use case captures the capability of authenticating a user and authorizing them for the appropriate access levels in the application.
  2. Admission -- This use case captures the processes associated with the admission of students within an institution. 
  3. Profile -- This use case captures the profile capability of the system. Different actors might have profiles which look very different.
  4. Lesson -- This use case captures the lesson capability of the system. It consists of the management of lesson plans for classes by faculty and the interfaces that make these available to students and parents. It will also provide a mechanism to track the completion of a lesson plan.
  5. Standard -- This use case captures the abstraction of the standard. It will consist of the different standards in a school, their sections, the combinations of subjects that are offered, the tracking of students in each section, etc.
  6. Evaluation -- This use case captures the expected functionality related to the evaluation of students. It will track things like assignments, examinations, test papers, submissions, and the grading of these.
Now we look at the five actors of the system.
  1. Student -- This actor represents a student in the system. A student will be authorized to play the Student role and will be able to perform the operations that role is authorized for.
  2. TenantAdmin -- Our system is a multi-tenant system. Each educational institution will become a tenant in the system, and the TenantAdmin role will be authorized to perform all the administrative functions related to a tenant.
  3. Faculty -- A faculty user in the system will have access to the school-related activities of students.
  4. Guardian -- A parent user will be authorized with the Guardian role and will have access to the complete set of activities of each Student for whom (s)he is a guardian.
  5. SuperAdmin -- This is a user with unrestricted access to perform a variety of housekeeping operations across multiple tenants.
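The role descriptions above can be sketched as a small access check. This is only an illustration under stated assumptions: the `User` fields, the `ward_ids` set, and the rule that a Guardian's access follows the student rather than the tenant are hypothetical choices, not part of the product design described here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Role(Enum):
    STUDENT = "Student"
    TENANT_ADMIN = "TenantAdmin"
    FACULTY = "Faculty"
    GUARDIAN = "Guardian"
    SUPER_ADMIN = "SuperAdmin"


@dataclass(frozen=True)
class User:
    user_id: str
    role: Role
    # Assumption: SuperAdmin is not bound to a tenant, so this may be None.
    tenant_id: Optional[str] = None
    # Assumption: a Guardian carries the ids of the students they look after.
    ward_ids: FrozenSet[str] = frozenset()


def can_view_student(user: User, student_id: str, student_tenant: str) -> bool:
    """Hypothetical rule set derived from the actor descriptions."""
    if user.role is Role.SUPER_ADMIN:
        return True  # unrestricted, across tenants
    if user.role is Role.GUARDIAN:
        return student_id in user.ward_ids  # only their own wards
    if user.tenant_id != student_tenant:
        return False  # every other role is confined to its tenant
    if user.role in (Role.TENANT_ADMIN, Role.FACULTY):
        return True
    # Remaining role is Student: a student sees only their own record.
    return user.user_id == student_id


if __name__ == "__main__":
    guardian = User("g1", Role.GUARDIAN, ward_ids=frozenset({"s1"}))
    print(can_view_student(guardian, "s1", "school-a"))
    print(can_view_student(guardian, "s2", "school-a"))
```

Keeping the check in one function makes it easy to see which roles are tenant-scoped, which matters later when deciding which data can be stored per tenant.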
With the top-level use cases above as an example, let's look at what data storage requirements we might have. The following is one way to define the database requirements of our system.
  1. Authentication Database -- A database that will store authentication information for all users. It must include a field that defines the roles a user is authorized for.
  2. Admission Database -- A database that will maintain assets related to the admission process.
  3. Profile Database -- A database that will store profile information related to a user.
  4. Lesson Database -- A database that will store lessons.
  5. Standard Database -- A database that will store the different standards and sections, and the mapping of students to each of these for a particular year.
  6. Evaluation Database -- A database that will store assets related to the evaluation of students.
With the databases identified above, we now analyze each of them based on its requirements with respect to the CAP theorem.
CAP Theorem
  1. Authentication Database -- This database requires consistent data, since it holds passwords and authentication tokens. Availability is also an important requirement. It doesn't need to be partition tolerant, since the data will be split across multiple tenants and we may follow a scheme of manual sharding. So an appropriate solution might be a C-A database. As we can see in the picture to the right, any relational database would be a good solution for this data store. Since most of the authentication data doesn't change very often, we will front the RDBMS with a cache, probably Redis.
  2. Admission Database -- This data store would contain parent registrations, admission forms, comments on admission forms, and call letters to parents. None of this information requires ACID capabilities. Since we are building a multi-tenant system, information like parent registration might be shared across multiple tenants. The primary requirement of this data store is availability, so either an A-P or a C-A data store might be a good solution. Depending on our scale requirements, we can choose Dynamo/Cassandra/Couchbase or any RDBMS.
  3. Profile Database -- This data store would contain profile information for all the users in all the roles. Since the total number of students across all the tenants might be very large, partition tolerance becomes an important factor here. We can live with eventual consistency for profile information. Any product on the A-P axis would be a good solution for this.
  4. Lesson, Standard, Evaluation Database -- These data stores are kept on a per-tenant basis, and the only requirement is availability, so either an A-P or a C-A data store might be a good solution. Depending on our scale requirements, we can choose Dynamo/Cassandra/Couchbase or any RDBMS.
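The analysis above can be written down as a small decision table. The sketch below is a minimal illustration: the `StoreRequirements` type, its two boolean fields, and the `classify` function are hypothetical names, and the only inputs are the requirements stated for each data store. Availability is assumed to be required by every store, as in the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class StoreRequirements:
    name: str
    needs_strong_consistency: bool
    needs_partition_tolerance: bool


# Candidate products per CAP category, as named in the analysis.
CANDIDATES: Dict[str, List[str]] = {
    "C-A": ["any RDBMS"],
    "A-P": ["Dynamo", "Cassandra", "Couchbase"],
}


def classify(req: StoreRequirements) -> List[str]:
    """Return the CAP categories that satisfy a store's requirements."""
    if req.needs_strong_consistency and req.needs_partition_tolerance:
        # With availability also required, CAP rules this combination out.
        raise ValueError(f"{req.name}: cannot have C, A and P together")
    if req.needs_strong_consistency:
        return ["C-A"]
    if req.needs_partition_tolerance:
        return ["A-P"]
    # Only availability matters, so either category is acceptable.
    return ["A-P", "C-A"]


STORES = [
    StoreRequirements("Authentication", True, False),
    StoreRequirements("Admission", False, False),
    StoreRequirements("Profile", False, True),
    StoreRequirements("Lesson, Standard, Evaluation", False, False),
]

if __name__ == "__main__":
    for store in STORES:
        categories = classify(store)
        products = [p for c in categories for p in CANDIDATES[c]]
        print(f"{store.name}: {' or '.join(categories)} -> {', '.join(products)}")
```

Writing the requirements as data makes it cheap to revisit the choice later: if, say, admission data ever needs strict consistency, flipping one flag shows which product categories remain.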
As shown by the analysis above, depending on the need for scale, we can have either of the following two data store configurations.
  • MySQL with a Redis frontend for authentication and Couchbase for everything else, for a large number of users.
  • MySQL with a Redis frontend for everything, for a moderate number of users.
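One common way to put a cache in front of a relational database is the cache-aside pattern: read from the cache first, fall back to the database on a miss, and store the result with an expiry. The sketch below shows the idea only. A plain dictionary stands in for Redis and a loader function stands in for the MySQL query; the class name, the TTL value, and the record shape are all assumptions made for the illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class CacheAsideStore:
    """Cache-aside reads: try the cache, then the database, then fill the cache."""

    def __init__(
        self,
        load_from_db: Callable[[str], Optional[Any]],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load_from_db  # stand-in for a MySQL lookup
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expires_at, value); stand-in for Redis with per-key expiry
        self._cache: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit, still fresh
        value = self._load(key)  # miss or expired: go to the database
        if value is not None:
            self._cache[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Call after a write (e.g. a password change) so stale data is not served."""
        self._cache.pop(key, None)


if __name__ == "__main__":
    fake_db = {"alice": {"roles": ["Student"]}}
    store = CacheAsideStore(fake_db.get, ttl_seconds=60)
    print(store.get("alice"))  # first read goes to the database
    print(store.get("alice"))  # second read is served from the cache
```

Because authentication data must stay consistent, the `invalidate` step on every write matters more here than the TTL: the expiry is only a safety net for entries that were never explicitly invalidated.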
As we have shown above, keeping the CAP theorem in mind while analyzing the architecture gives us helpful insight into the choice of products to use for data storage.
