This article is split into two parts:
If you are more interested in the decisions and thinking behind the ecosystem, check out Part I of this article. Meanwhile, the story continues here with defining the core architecture.
A modern fintech bank is a marketplace with a banking license, a core platform, KYC, CRM, an API and a few core products. The products directly offered by the Fintech bank may be limited to funds holding, bank accounts, cards and a wallet for payments.
However, our task wasn’t quite typical. From the outset, our main goal was to design and architect a financial platform that could scale horizontally, integrate with multiple external service providers, expose an open API and be modular enough to grow across multiple product lines and features.
That is why it was important to start with the ground principles of the platform.
We need a single canonical API definition that can be used across all channels, or at least one that allows us to verify the correctness of the clients.
We standardize on gRPC to define service APIs inside the “logical” data center, then use grpc-gateway to create REST adapters that clients can call. Any client-specific features are handled in this adapter component.
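To make this concrete, here is a minimal sketch of such a REST adapter in Go. The accountpb package and the RegisterAccountServiceHandlerFromEndpoint call stand in for code that protoc and grpc-gateway would generate; the service and addresses are illustrative, not the platform’s real ones.

```go
package main

import (
	"context"
	"log"
	"net/http"

	"github.com/grpc-ecosystem/grpc-gateway/v2/runtime"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	accountpb "example.com/platform/gen/account/v1" // hypothetical generated stubs
)

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// The gateway mux translates incoming REST/JSON requests into gRPC
	// calls against the internal service.
	mux := runtime.NewServeMux()
	opts := []grpc.DialOption{grpc.WithTransportCredentials(insecure.NewCredentials())}

	// Register the REST adapter for a hypothetical Account service running
	// inside the "logical" data center.
	if err := accountpb.RegisterAccountServiceHandlerFromEndpoint(ctx, mux, "account-svc:9090", opts); err != nil {
		log.Fatalf("failed to register gateway: %v", err)
	}

	// Only this adapter (or the Gateway app in front of it) faces the public.
	log.Fatal(http.ListenAndServe(":8080", mux))
}
```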
We do not want to spend extra money on over-built systems to cope with the load, nor do we want additional capacity to be burdensome. Everything should scale horizontally.
Therefore, we want to move everything to Kubernetes and containerize all services, which lets us cope with the load while increasing efficiency.
We standardize on Bigtable as our persistent store because of its cost-to-performance profile.
All code must have a standard way of logging, with log-search support. This is critically important.
All log lines are written to standard output in JSON format and then collected by the EFK stack (Elasticsearch + Fluentd + Kibana).
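As a minimal illustration, a service only has to emit JSON to stdout and the EFK pipeline takes care of the rest. The sketch below uses Go’s standard log/slog JSON handler; any structured logger writing JSON to stdout would do, and the field names are made up.

```go
package main

import (
	"log/slog"
	"os"
)

func main() {
	// Every service writes structured JSON log lines to stdout; Fluentd
	// tails the container output and ships it to Elasticsearch/Kibana.
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))

	logger.Info("payment processed",
		"service", "wallet",
		"user_id", 42,
		"amount_cents", 1999,
	)
	// Emits: {"time":"...","level":"INFO","msg":"payment processed","service":"wallet",...}
}
```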
We need a single standard way of exposing metrics for dashboards, as well as for alerting.
Since we have already standardized on Go for development, it makes sense to expose metrics using expvars, which can then be scraped by Datadog or Prometheus. As an added bonus, since we standardized on gRPC, it was easy enough to make sure that all gRPC services expose a standard set of metrics, such as response-time percentiles per RPC endpoint.
Developers can also easily define additional, service-specific metrics using the same approach.
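A minimal sketch of what that looks like in a Go service. The counter names are illustrative; anything published through expvar becomes visible as JSON under /debug/vars, where a Datadog or Prometheus agent can scrape it.

```go
package main

import (
	"expvar"
	"net/http"
	"time"
)

// Service-specific indicators published via expvar.
var (
	rpcRequests = expvar.NewInt("rpc_requests_total")
	lastLatency = expvar.NewFloat("rpc_last_latency_ms")
)

func handle(w http.ResponseWriter, r *http.Request) {
	start := time.Now()
	rpcRequests.Add(1)

	// ... actual handler work would go here ...

	lastLatency.Set(float64(time.Since(start)) / float64(time.Millisecond))
	w.WriteHeader(http.StatusOK)
}

func main() {
	http.HandleFunc("/ping", handle)
	// Importing expvar registers /debug/vars on the default mux automatically.
	http.ListenAndServe(":8080", nil)
}
```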
The Gateway app should be the only service exposed to the public. It should satisfy the following criteria:
When selecting our cloud service provider, we decided on Google Cloud Platform. We were driven by three major factors: isolation, performance and cost, and, lastly, Kubernetes out of the box.
We knew that we wanted to experiment with a more advanced geo store and other techniques, and felt that we would definitely hit bumps early on. The best way to make sure we would not impact other services in those cases was to have a new, completely separate environment.
There was a plan to store and serve a lot of geo-tagged and aggregated data, and we needed a low-cost, persistent store that would lend itself well to this use case. This is where Google Bigtable really shone and made the migration more appealing to us. For example, you can stand up the default three-node cluster, which supports around 30,000 QPS, for about $1,500 per month including storage. Since capacity is provisioned at the cluster level and not at the table level, you don’t need to worry about wasting money over-provisioning use-case-specific tables.
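As a rough illustration of working against such a cluster from Go, here is a sketch using the official cloud.google.com/go/bigtable client. The project, instance, table, column-family and row-key naming are made up for the example and are not the platform’s actual schema.

```go
package main

import (
	"context"
	"log"

	"cloud.google.com/go/bigtable"
)

func main() {
	ctx := context.Background()

	// Project, instance, table and column-family names are illustrative.
	client, err := bigtable.NewClient(ctx, "my-gcp-project", "geo-instance")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	tbl := client.Open("geo_events")

	// Write a geo-tagged event keyed by geohash + timestamp so that rows
	// for the same area sort together.
	mut := bigtable.NewMutation()
	mut.Set("event", "payload", bigtable.Now(), []byte(`{"lat":54.69,"lng":25.28}`))
	if err := tbl.Apply(ctx, "u4h9#20180601T120000Z", mut); err != nil {
		log.Fatal(err)
	}

	// Read the row back.
	row, err := tbl.ReadRow(ctx, "u4h9#20180601T120000Z")
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("read %d column families", len(row))
}
```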
This one is very simple. At the time of writing, AWS already provides “out of the box” Kubernetes services, so this point carries less weight today. However, at the time we were architecting the platform, it was one of the critical factors in the decision.
This is the high-level overview of the GCP stack. All blockchain-related data is out of scope here and is not discussed yet, whereas all data stored in a centralized manner falls under the current stack.
In this specific case, the most efficient approach was to use Microservices. The decision was driven by the goal of creating a core set of services while keeping the dependency index under control.
Number of services running on the platform (besides network layers)
A quick note about Microservices. Microservices are independently built systems, each running in its own process and often communicating via REST APIs. Representing different parts of your application, they are separately deployable, and each part can be written in any language.
You can easily see how, by addressing the problems of a monolithic system, Microservices have become a requirement for any state-of-the-art software.
I strongly recommend reading Microservices (by James Lewis) and On Monoliths and Microservices if you want to understand in more depth what the key concepts of this architectural style are.
Kubernetes was designed from the ground up to be the ideal platform to build, manage, and orchestrate distributed applications using containers. It includes primitives for replication and service discovery as part of its core (these are added via frameworks in Mesos and require some know-how to configure properly), and the vision for Kubernetes is to develop a system that allows enterprises to manage scalable application deployments with maximum efficiency, security, and ease.
Personally, I like to describe Kubernetes as “A special kind of operating system for the cloud” — It’s an operating system which allows developers to treat an arbitrary number of machines as if they were a single, very powerful machine. In my view, K8s is to the cloud what Linux or Windows is to the computer.
In the same way that an Operating System provides developers with an abstraction layer from the hardware which makes up a computer, K8s provides developers with an abstraction layer from the computers which make up a cloud.
The more you think about this comparison, the more parallels you should notice. For example, one of the main purposes of an OS is to schedule processes such that they can efficiently share hardware resources across a single machine — Similarly, one of the main purposes of K8s is to schedule containers such that they can efficiently share computational resources across a cloud (potentially made up of multiple machines).
Once you think of K8s as being an “operating system for the cloud” the business opportunities start to materialize. We’re no longer just talking about a system to improve deployment efficiency, we’re talking about an entirely new platform for building cloud native applications.
There are three layers of blockchains that the hybrid system is built upon: the Ethereum mainnet, the Master internal blockchain, and country-specific blockchains. Each internal blockchain is a modified version of the Ethereum protocol, with zero transaction fees and no emission. Inside them, a token acts as the main currency, just like ETH on the mainnet.
Note that there is no emission on the internal blockchains, as they all begin with the maximum available supply of tokens already existing in a ‘Master Wallet’. When tokens leave the network, they are sent to a second ‘Clearance Wallet’, which in effect ‘burns’ or removes them.
Three critical scenarios must be handled by the system: when a user creates an account; when tokens are sent from the ecosystem to the outside world; and, finally, when tokens are sent from the outside world into the hybrid ecosystem.
When a user registers a new account, a service is responsible for creating two identical wallets through the blockchain APIs, re-using the same private key. In this way, a user has exactly the same address on both the ETH Mainnet and the Private Chain. The information is then returned to the Tapatybe service (responsible for holding user identities) and tied to the account.
In response, the TokenRef service is called, which performs the required emission on both the ETH Mainnet wallet (by calling the contract and minting the tokens), and the Private chain: at this point, the tokens are transferred from the Master/Clearance wallet to the user’s internal wallet.
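For illustration, this is roughly how re-using a single secp256k1 key yields the same address on both chains, using go-ethereum’s crypto package. The surrounding wallet-service and TokenRef plumbing is omitted; this is a sketch, not the actual service code.

```go
package main

import (
	"fmt"
	"log"

	"github.com/ethereum/go-ethereum/crypto"
)

func main() {
	// Generate one secp256k1 key for the new user.
	key, err := crypto.GenerateKey()
	if err != nil {
		log.Fatal(err)
	}

	// Both the ETH mainnet and the private chain use the same address
	// derivation, so re-using the private key yields identical addresses.
	addr := crypto.PubkeyToAddress(key.PublicKey)

	fmt.Println("mainnet address:       ", addr.Hex())
	fmt.Println("private-chain address: ", addr.Hex()) // identical by construction
}
```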
When a user initiates a transaction, there are two possible scenarios: either the destination wallet belongs to the network, or it belongs to the ETH Mainnet.
As the transaction commences, a service is called that determines whether the destination address exists in the database. If it does, it simply proxies the transaction to the Blockchain API for the internal chain. If the address is indeed external, then the transaction is added to a queue that is picked up by a second service.
This service then proxies the transaction to the ETH Mainnet, broadcasting it from the transit address. Once that is complete, it calls the Blockchain API to create a transaction that ‘burns’ the tokens on the user’s internal balance, sending them to the Clearance Wallet.
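A rough sketch of that routing decision in Go. The types below are hypothetical stand-ins for the address database, the queue and the internal Blockchain API; they only show the shape of the logic described above.

```go
package router

import "context"

// Hypothetical stand-ins for the real services.

type Transfer struct {
	From, To string
	Amount   uint64
}

type AddressStore interface {
	Exists(ctx context.Context, addr string) (bool, error)
}

type Queue interface {
	Publish(ctx context.Context, topic string, t Transfer) error
}

type InternalChainAPI interface {
	Send(ctx context.Context, t Transfer) error
}

// RouteTransfer decides whether an outgoing transfer stays on the private
// chain or is handed off to the mainnet worker via a queue.
func RouteTransfer(ctx context.Context, db AddressStore, q Queue, chain InternalChainAPI, tx Transfer) error {
	internal, err := db.Exists(ctx, tx.To)
	if err != nil {
		return err
	}
	if internal {
		// Destination belongs to the ecosystem: proxy straight to the
		// internal Blockchain API.
		return chain.Send(ctx, tx)
	}
	// External destination: queue it. A second service broadcasts from the
	// transit address and then burns the tokens on the user's internal
	// balance by sending them to the Clearance Wallet.
	return q.Publish(ctx, "mainnet-transfers", tx)
}
```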
Given that an external transaction can arrive at any time, a blockchain service must ‘listen’ on all of the wallets for incoming transactions. Whenever a new transaction is detected, a second service initiates a transfer of funds from the user’s mainnet address to the Transit Wallet, grouping transactions to keep costs as low as possible.
A third service is then called, which orders the internal Blockchain API to distribute the tokens from the Master Wallet to the user’s internal balance.
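As a sketch of that batching behaviour, the loop below drains detected deposits on a timer so that sweeps to the Transit Wallet are grouped into as few mainnet transactions as possible, then mirrors them on the internal chain. It collapses the second and third services into one loop for brevity, and all types are hypothetical.

```go
package deposits

import (
	"context"
	"time"
)

// Hypothetical stand-ins for the services described above.

type Deposit struct {
	UserAddr string
	Amount   uint64
}

type MainnetAPI interface {
	// SweepToTransit moves funds from users' mainnet addresses to the
	// Transit Wallet in a single grouped transaction where possible.
	SweepToTransit(ctx context.Context, batch []Deposit) error
}

type InternalChainAPI interface {
	// Distribute credits each user's internal balance from the Master Wallet.
	Distribute(ctx context.Context, batch []Deposit) error
}

// CollectAndProcess batches incoming deposits every interval to keep
// mainnet fees low, then distributes the tokens on the internal chain.
func CollectAndProcess(ctx context.Context, incoming <-chan Deposit, interval time.Duration, mainnet MainnetAPI, internal InternalChainAPI) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	var batch []Deposit
	for {
		select {
		case d := <-incoming:
			batch = append(batch, d)
		case <-ticker.C:
			if len(batch) == 0 {
				continue
			}
			if err := mainnet.SweepToTransit(ctx, batch); err != nil {
				return err
			}
			if err := internal.Distribute(ctx, batch); err != nil {
				return err
			}
			batch = nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}
```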
The private blockchains work in parallel, meaning that we can add as many of them as necessary to multiply Ethereum’s TPS. Of course, the presence of inter-blockchain operations implies some loss in efficiency, but these will be a small part of the overall transaction pool.
If a service uses an identifier already listed in a typical contact card (phone number or email), it’s simple to quickly display which contacts of a user are also registered with the service and immediately make social features available to that user. This means friends don’t have to “discover” each other on a service if they already have each other as contacts.
The problem is that the simplest way to calculate the intersection of registered users and device contacts is to upload all the contacts in the address book to the service, index them, reverse index them, send the client the intersection, and subsequently notify the client when any of those contacts later register.
There’s an entire field of study dedicated to problems like this one, known as “private information retrieval” (PIR). The simplest form of PIR is for the server to send the client the entire list of registered users, which the client can then query locally. Basically, if the client has its own copy of the entire database, it won’t leak its database queries to the server.
One can make this slightly more network efficient by transmitting the list of registered users in a bloom filter tuned for a low false positive rate. To avoid leaking the list of all registered users, it’s even possible to build a “symmetric PIR” system using “encrypted bloom filters” by doing the following:
It’s also possible to compress “updates” to the bloom filter. The server just needs to calculate the XOR of the version the client has and the updated version, then run that through LZMA (the input will be mostly zeros), and transmit the compressed diff to the client.
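A small sketch of that diffing step. Gzip from the Go standard library is used here instead of LZMA purely to keep the example self-contained, and the filter contents are toy data; the principle is the same since the XOR of two similar filters is mostly zeros.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"log"
)

// xorDiff returns the byte-wise XOR of two equally sized bloom filters.
// The result is mostly zeros when few bits changed, so it compresses well.
func xorDiff(prev, curr []byte) []byte {
	diff := make([]byte, len(prev))
	for i := range prev {
		diff[i] = prev[i] ^ curr[i]
	}
	return diff
}

func main() {
	// Two versions of a toy 1 KiB bloom filter that differ in a few bits.
	oldFilter := make([]byte, 1024)
	newFilter := make([]byte, 1024)
	copy(newFilter, oldFilter)
	newFilter[10] |= 0x04
	newFilter[700] |= 0x81

	diff := xorDiff(oldFilter, newFilter)

	// Compress the sparse diff before sending it to the client.
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(diff); err != nil {
		log.Fatal(err)
	}
	if err := zw.Close(); err != nil {
		log.Fatal(err)
	}

	fmt.Printf("raw diff: %d bytes, compressed: %d bytes\n", len(diff), buf.Len())
	// The client applies the update with the same XOR: new = old ^ diff.
}
```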
Facial recognition is used in two places on the platform:
All API methods require simple token-based HTTP authentication. In order to authenticate, put the word "Token" and a key into the Authorization HTTP header, separated by whitespace:
Authorization: Token yfT8ftheVqnDLS3Q0yCiTH3E8YY_cm4p
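For example, a client call might set the header like this; the endpoint URL is a placeholder and only the header format matters.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Hypothetical endpoint; the token is the one from the example above.
	req, err := http.NewRequest("GET", "https://api.example.com/v1/faces", nil)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Authorization", "Token yfT8ftheVqnDLS3Q0yCiTH3E8YY_cm4p")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```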
**Face** Represents a human face. Note that there may be several faces in a single photo, and different photos of the same person are also considered to be different faces.
"id" (number)
: unique identifier of the face generated by the services."timestamp" (string)
: time of face object creation as ISO8601 string."photo_hash" (string)
: Hash of the original photo. Note that identical photos will always have the same hash, and different photos will most certainly have different hashes. Don't interpret this value and don't make assumptions about particular hash function used for hash calculation."x1" (number)
: x coordinate of the top-left corner of face's bounding box on the original photo."y1" (number)
: y coordinate of the top-left corner of face's bounding box on the original photo."x2" (number)
: x coordinate of the bottom-right corner of face's bounding box on the original photo."y2" (number)
: y coordinate of the bottom-right corner of face's bounding box on the original photo."meta" (string)
: metadata string that you can use to store any information associated with the face."galleries" (string[])
: array of galleries names that have this face."photo" (string)
: URL of file name of a photo that had been used to create the face object."thumbnail" (string)
: URL of face thumbnail stored in the service cache.
**Create face** Processes the uploaded image or the provided URL, detects faces and adds the detected faces to the searchable dataset. If there are multiple faces in a photo, only the biggest face is added by default.
Optionally, you can add a custom string meta, such as a name or an ID, which uniquely identifies a person. Multiple face objects may have the same meta. We recommend that you don’t assign the same meta to different persons; thus, when using a person’s name as a meta, make sure that all names are unique.
A sample response for this method looks as follows:
{ "results": [ { "age": 40, "emotions": [ "neutral", "surprised" ], "galleries": [ "default", "ppl" ], "gender": "male", "id": 2333, "meta": "Sam Berry", "photo_hash": "dc7ac54590729669ca869a18d92cd05e", "timestamp": "2016-06-13T11:06:42.075754", "x1": 225, "x2": 307, "y1": 345, "y2": 428 } ]}
**Detect faces** This method detects faces in the provided image. You can either upload the image file as multipart/form-data or provide a URL, which the API will use to fetch the image.
A sample response for this method looks as follows:
{ "faces": [ { "age": 36, "emotions": [ "neutral", "happy" ], "gender": "female", "x1": 236, "x2": 311, "y1": 345, "y2": 419 } ], "orientation": 1}
**Verify face** This method verifies that two faces belong to the same person or, alternatively, measures the similarity between the two faces. You can choose between these two modes by setting the threshold parameter.
When a binary decision is required, the user can pass a value for the threshold parameter. We provide three preset values for the threshold: strict, medium and low, with the former aimed at minimizing the false accept rate and the latter being somewhat more permissive. The client can also override these presets with a fixed threshold value.
A sample response for this method looks as follows:
{ "results": [ { "bbox1": { "x1": 610, "x2": 796, "y1": 157, "y2": 342 }, "bbox2": { "x1": 584, "x2": 807, "y1": 163, "y2": 386 }, "confidence": 0.9222600758075714, "verified": true } ], "verified": true}
Since the launch of the app, the demand on the infrastructure has increased tenfold, utilising 96 GB of memory across nodes and reaching peak traffic figures of hundreds of megabytes per second. The infrastructure is coping effectively with the load, with an average SLA of 99.4%. All builds and updates are rolled out automatically using a continuous delivery system that allows transparent upgrading of services.
Our current stack is based on the following technologies and languages: Java (native Android application) + native Android libraries, Go, Python, PostgreSQL, MySQL, RabbitMQ, Redis, MongoDB, Kubernetes, Docker, TensorFlow (for the AI-based assistant bot ML), Google Cloud, Sentry, Grafana, various analytics SDKs (Firebase and others), BigQuery, Apache Zeppelin, the MQTT protocol for real-time communication, and Node.js for web services.
The development team works using the Scrum methodology with weekly sprints and includes backend and frontend developers, a DevOps engineer, a data scientist, a report and event engineer, a blockchain developer, QA engineers and an administrator/scrum master. Before June 2018, bi-weekly sprints were used; they were subsequently shortened to speed up the response to the changing environment. All releases are covered by manual smoke tests and, when needed, complete regression tests, to ensure the proper functioning of the service.
This is what supports 500,000 users.