Scaling and Securing Your Data with Microservices|Arturo Klie
A common concern in the Oil and Gas industry is how to optimize cost, maintain security, and increase agility. Additionally, many firms are looking into data analytics as a way to expose new business opportunities and gain a competitive edge.
In order to deliver accurate and timely analytics solutions, a robust platform is needed. Fortunately, there are many lessons we can take from the tech industry to help solve this problem. One of them is the concept of microservices. In this article, we will go over what microservices are and how they can enable us to run a more nimble data-driven business.
Services are essentially standalone programs that run on a computer or server, and microservices are a miniaturization of services. Large applications like Google Search have thousands of such services running in the background. They are usually designed to communicate with other services or client apps (e.g. your browser, mobile apps, or desktop apps) via an API (Application Programming Interface). Examples of services in Oil and Gas include SQL databases, ETL platforms, and BI platforms. Fig. 1 below provides a visual reference for the anatomy of a Service-Oriented Architecture (SOA), and will be used as a reference for other illustrations later on.
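To make the idea of a service exposing an API more concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration: the `/wells/<id>` route, the `WELLS` dictionary, and the well record are hypothetical and not taken from any real system. A production service would also need authentication, a real data store, and a proper web framework.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory data; a real service would query a database.
WELLS = {"W-001": {"name": "Example Well 1", "status": "producing"}}


class WellHandler(BaseHTTPRequestHandler):
    """Tiny read-only service: GET /wells/<id> returns one record as JSON."""

    def do_GET(self):
        parts = self.path.strip("/").split("/")
        if len(parts) == 2 and parts[0] == "wells" and parts[1] in WELLS:
            self._send(200, WELLS[parts[1]])
        else:
            self._send(404, {"error": "not found"})

    def _send(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def start_service(port=0):
    """Start the service on localhost in a background thread (port 0 = any free port)."""
    server = HTTPServer(("127.0.0.1", port), WellHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_well(port, well_id):
    """Client side: call the API and return (HTTP status, decoded JSON body)."""
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", f"/wells/{well_id}")
        response = conn.getresponse()
        return response.status, json.loads(response.read())
    finally:
        conn.close()


if __name__ == "__main__":
    server = start_service()
    print(fetch_well(server.server_address[1], "W-001"))
    server.shutdown()
    server.server_close()
```

The client only knows the URL and the JSON shape, not how the data is stored. That separation is what lets a service be replaced or scaled without touching the apps that call it.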
In the 1990s and early 2000s, many tech companies implemented their SOA following a common pattern:
As these monolithic services grow, they become increasingly hard to scale. Their complexity makes it increasingly expensive to deploy new features and can cause a company to stagnate in terms of innovation. As a result, over the past decade tech companies have learned many ways to beat Moore's Law by exploiting parallelism.
Tech companies have come to learn that scaling horizontally is usually more efficient than scaling vertically. In other words, upgrading your CPU, RAM, and storage is less efficient than adding more machines to your cluster. This is because computer hardware is priced much like cars: among mid-range models, performance in single-threaded applications grows roughly linearly with price, but as you push the hardware to its limit, the cost-to-value ratio degrades rapidly. Additionally, new hardware depreciates fast, and many parts break down after a few years. Moreover, there is a baseline cost for every machine: you need to buy a power supply, motherboard, case, and other parts that don’t significantly contribute to overall performance. To complicate matters, semiconductor miniaturization is quickly approaching a physical limit.
These physical constraints have prompted software companies to invest heavily in building highly distributed applications on “commodity hardware” (e.g. Hadoop, Spark, etc.). In tandem, hardware vendors have been shifting their focus from building chips with a few powerful cores to chips with many specialized cores. This is evident in the upward trend in GPU computing, AI accelerators, and multi-core CPUs.
The premise behind microservices is to break down problems into manageable parts. Since these services are well encapsulated and loosely coupled, it is easier for IT operations teams to deploy and orchestrate them to get the most out of each computer. There are several benefits to taking a microservice approach, chief among them:
1) Security: By distributing and scoping code and data access for each service, it becomes harder for an attacker to gain access to large parts of the system. Security can be hardened by hiding each service behind layers of private networks, firewalls, TLS encryption, and address whitelisting between services.
2) Nimbleness: A proper microservice should be very light, only a few megabytes in size. It is good practice to package code with mock services for its dependencies so that tests can be run efficiently on a developer’s machine. This model makes it possible for Data Science and Software Engineering teams to work in a more agile way and ship features regularly. Organizing your code this way also makes it easier to train new hires.
3) Scalability: Given the atomicity of microservices, you can make the most out of each server by stacking services like Tetris pieces. There are powerful orchestration tools available that allow you to provision hardware and saturate it with microservices. For instance, you can train a machine learning model in the cloud by scaling up during training and back down when idle to save IT costs. Since the services are lightweight, it is also possible to scale up and down in seconds rather than minutes or hours.
4) Resilience: Since you’re able to distribute and interlace similar services across many small machines in different regions, it is harder to take down an application, and easier to recover data from extensive failures. Servers die all the time, and having such an architecture has given tech companies the confidence to guarantee Service-Level Agreements (SLAs) with uptimes higher than 99.99%.
5) Flexibility: Microservices can be language and platform agnostic. This is important for avoiding vendor lock-in and for configuring the most cost-effective and robust IT stack. It also makes it easier to combine libraries and solutions exclusive to certain platforms.
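The scalability point about stacking services like Tetris pieces can be sketched with a simple packing heuristic. The code below is an illustration only, not how any particular orchestration tool works: it uses first-fit decreasing bin packing on memory alone, and the service names, memory figures, and the 1000 MB server size are hypothetical. Real schedulers weigh many more constraints, such as CPU, network, and placement across regions.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """One machine with a fixed memory budget and the services placed on it."""
    capacity_mb: int
    services: list = field(default_factory=list)

    @property
    def free_mb(self):
        return self.capacity_mb - sum(mem for _, mem in self.services)


def pack_services(demands, capacity_mb):
    """First-fit decreasing: take services from largest to smallest and put
    each one on the first server that still has room, adding a new server
    only when none fits."""
    servers = []
    for name, mem in sorted(demands.items(), key=lambda kv: kv[1], reverse=True):
        if mem > capacity_mb:
            raise ValueError(f"{name} needs {mem} MB, more than one server holds")
        target = next((s for s in servers if s.free_mb >= mem), None)
        if target is None:
            target = Server(capacity_mb)
            servers.append(target)
        target.services.append((name, mem))
    return servers


if __name__ == "__main__":
    # Hypothetical memory demands in MB for five small services.
    demands = {"etl": 600, "api": 300, "auth": 200, "reports": 500, "cache": 400}
    for i, server in enumerate(pack_services(demands, capacity_mb=1000), start=1):
        names = [name for name, _ in server.services]
        print(f"server {i}: {names}, free {server.free_mb} MB")
```

With these made-up numbers the five services fit on two servers with no memory left idle. A single monolith with the same 2000 MB footprint would not fit on either machine, which is the practical payoff of keeping services small.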
Microservices are a new concept for many developers. Adopting them requires exposure to novel concepts and technologies that are typically not covered, or are often misused, outside the tech industry. There are two ways to embrace microservices: