The bank moved to the cloud: what's next? Raif's cloud solutions architect on the challenges and solutions that simplify cloud infrastructure management after migration
With the onset of the full-scale invasion, Ukrainian businesses, in particular financial institutions, have been actively migrating to the cloud in order to secure user data and preserve the functionality of the financial system. A vivid example of fast and high-quality migration was shown by Raiffeisen Bank, which in 2022 managed to migrate its infrastructure without disruption to customers, for $0, and in just three months.
However, migration requires considerable effort even after it is complete. Pavlo Klets, architect of cloud solutions at Raif, told dev.ua about the challenges that await businesses after a successful migration, and how to prepare for them and respond correctly. Raif's more than two years of work in the cloud have allowed him to formulate useful insights and practical advice.
REFERENCE
On March 9, 2022, the National Bank of Ukraine published on its website Resolution 42 «On the Use of Cloud Services by Banks under Martial Law in Ukraine,» which radically changed views on what business continuity should look like now. The engineers of all banks in the country were faced with the task of building a backup of all systems as quickly as possible and ensuring their stability in wartime and inaccessibility to the enemy.
Mass flight into the cloud
In the first months of the great war, all major market players successfully completed the migration task and moved to the clouds. Due to constant shelling and power outages, cloud versions of banks became not a temporary backup, as originally planned, but the main infrastructure that needs to be refined, updated, and developed.
And here it is important to think about such parameters as the cost of cloud hosting, ease of maintenance, 100% compliance with regulatory requirements, and readiness for the rapid implementation of global changes in the financial market, such as SEP 4.0, SEPA, etc.
According to Pavlo, Raif has developed a whole range of solutions that helped to rationally use funds and provide customers with a decent level of service in the first year after the completion of the Cloud migration.
FinOps as a must, or financial literacy for engineers
It is important to understand what must be done in the first months so that cloud costs do not become prohibitive. «Cloud providers’ business models are built so that the most conservative and inflexible will pay for the fast and adaptive. So, a bank that has just migrated from the ground to the cloud „as is“ usually appears so heavy and clumsy from the point of view of the latest technologies that it resembles a big monster that will bring huge revenues to the provider, who in turn will be able to offer lower prices to more adaptive users of its services,» he says.
«At Raif, we could not allow this and immediately moved on to optimizing our technological solutions for the latest approaches to infrastructure management,» adds Pavlo.
But don’t take this as a disaster: it’s just a set of common engineering problems. There are a few simple rules that give quick results:
Low-hanging fruit. The most powerful tool in Raif’s experience was Instance Scheduler. This is a ready-made solution built on Amazon EventBridge and AWS Lambda, with Amazon DynamoDB as its database. This solution allows you to automatically turn cloud resources on and off according to the schedules specified in the database. To use it, you don’t need to have any experience with the above services at all: just run the installer, click «Next» several times, and get the created Lambda functions and configured EventBridge rules. «The only creative work left is to go into the DynamoDB database and create several records with schedules, maybe even come up with names for them, no more. That’s it, then just hang the scheduler=<schedule name> tag on each resource, and Amazon will do the rest for you,» says Pavlo.
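To make the tagging idea concrete, here is a minimal Python sketch of the kind of decision such a scheduler makes. It is not the Instance Scheduler code: the Schedule record, the "office-hours" schedule with its hours, and the should_run function are all hypothetical, and only the scheduler tag name comes from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class Schedule:
    """A hypothetical schedule record: run on these weekdays between start and stop."""
    name: str
    weekdays: frozenset  # 0 = Monday ... 6 = Sunday
    start: time
    stop: time


# Hypothetical schedule table, standing in for the records kept in DynamoDB.
SCHEDULES = {
    "office-hours": Schedule("office-hours", frozenset(range(5)), time(9, 0), time(18, 0)),
}


def should_run(tags: dict, now: datetime, schedules: dict = SCHEDULES) -> bool:
    """Return True if a resource with these tags should be running at `now`.

    Resources without a `scheduler` tag, or with an unknown schedule name,
    are left alone and treated as always on.
    """
    schedule = schedules.get(tags.get("scheduler"))
    if schedule is None:
        return True
    in_window = schedule.start <= now.time() < schedule.stop
    return now.weekday() in schedule.weekdays and in_window
```

With these assumed hours, a test server tagged scheduler=office-hours would be up on a Wednesday morning and down on Wednesday night or on a Saturday, while untagged production resources are never touched.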
He gives an example of real cost savings.
Imagine that you have a test copy of your entire infrastructure, on which you test the implementation of all new solutions. It can cost as much as your entire production environment or half as much, but even at half it adds 50% to your costs. And you’re a modern technology bank at the stage of digital transformation, so you also have a development environment that simulates all the components of your infrastructure and likewise costs half as much as production. With these two additional environments, the cost of the infrastructure has already almost doubled: together, the development and test environments cost at least as much as the rest of your infrastructure. They run 24 hours a day, 7 days a week, which is 168 hours every week, while the developers and testers who use them work only 40 hours a week, roughly a quarter of that time.
This means that your internal environments, which until recently added 100% to the infrastructure cost, will add only about 25% with Instance Scheduler. And this is without any software adaptation to all sorts of Cloud-native features, just the ability to turn on test environments at the beginning of the working day and turn them off at the end.
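The arithmetic behind this estimate can be checked in a few lines of Python. This is a rough sketch that assumes cost scales linearly with running hours; in practice some charges, such as storage attached to stopped machines, continue around the clock. The function name and its parameters are illustrative only.

```python
HOURS_PER_WEEK = 24 * 7  # 168


def scheduled_overhead(overhead_always_on: float, active_hours_per_week: float) -> float:
    """Overhead of non-production environments once they run only when used.

    `overhead_always_on` is the extra cost as a fraction of production cost
    when the environments run 24/7 (1.0 means +100%).
    """
    return overhead_always_on * active_hours_per_week / HOURS_PER_WEEK


overhead = scheduled_overhead(1.0, 40)
print(f"+{overhead:.0%}")  # prints "+24%"
```

The exact ratio is 40/168, or about 24%, which is in line with the rounded figure of +25% given above.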
And let’s also mention sales systems that are not needed 24/7. Those that are used only during the opening hours of the branches or perform night calculations. At one time, just thanks to the implementation of Instance Scheduler, some of our departments received about 60% savings.
Audits and regulatory requirements
Usually, as part of an audit, a bank is required to provide a detailed description of the processes and technologies that ensure business continuity and physical data security. «We all understand that it will hardly be a problem for a bank’s Solution Architect to provide a description of the number of copies of each application, the scaling and balancing method. But there can really be problems with answering questions about how continuity of power, communication, cooling of equipment is achieved, and how physical security of premises is ensured. Because this is the information that Cloud providers consider to be their trade secret, which gives them an advantage over competitors, and they do not seek to disclose it in detail, even to their clients,» says Pavlo.
The question arises as to how to correctly fulfill the requirements of auditors, both external and internal, with such input data. It should be borne in mind that appeals to Amazon support and managers will be useless: technical details are protected by numerous NDAs. So how do you remain transparent to the regulator or auditor?
Pavlo explains that, in Amazon’s case, the AWS Artifact service greatly helped Raif in solving this issue. The service allows users to download compliance documents for standards such as ISO, PCI, and SOC, which demonstrate that AWS adheres to industry standards and regulatory requirements. It contains an archive of about 900 documents with the conclusions of external audits and certificates of compliance with dozens of popular standards for all AWS technical platforms.
«That is, you really can’t know how your Cloud provider achieves compliance with the standards, but you know for sure that it complies with them and you always have a set of relevant documents on hand,» says Raif’s solutions architect.
According to Pavlo, by understanding the regulatory requirements for your infrastructure and the list of standards it must meet, in AWS Artifact you can quickly select a package of documents that confirms that at the physical level your Cloud provider has long passed all the necessary checks, and you personally only have to close the issue of exclusively logical connections between systems. It is also important to allocate enough time for the audit and not rush.
Data access and infrastructure management
At the beginning of the migration, a fixed group of Cloud engineers and BCM specialists with fairly broad powers could manage the infrastructure. But the more systems move to the cloud, the less likely it is that this team will be able to cope with all operational tasks: resource monitoring, software updates, network access management, and the many routine tasks that bank IT professionals face every day.
«Once a large number of systems have been migrated to the cloud, they need to be handed over to new processes and technologies and to the teams assigned to them. And for this, a flexible access model is needed,» says Klets.
He adds a list of technologies that allow the bank to combine the flexibility of cloud technologies and the rigor of industry requirements.
AWS IAM allows you to create users in the AWS console and grant them rights to manage certain resources, both through the web interface and using various tools such as Terraform and Terragrunt. But manually managing users without the ability to audit and centralize is definitely not the way for a large corporation, which a bank usually is. AWS IAM Roles are, to some extent, templates for users, and SAML federation allows you to combine Active Directory with these templates.
AWS Key Management Service allows you not only to create encryption keys for all data, but also to separate access to these keys from access to the resources that use them. «It works like this: if you give one of the engineers maximum rights to a pool of virtual machines on which critical databases live, but do not give that engineer access to the encryption keys for their disks, then he will be able to create new copies of these virtual machines, reboot existing ones, manage network access to them, in general, do anything, except for one thing: he will never be able to see real client data,» explains Pavlo. This is exactly what the regulator requires.
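The separation Pavlo describes can be illustrated with a toy permission check in Python. This is only a sketch of the principle, not AWS's real policy evaluation, which also involves explicit denies, key policies, and grants. The action names follow the usual AWS service:Action form, while the engineer's permission set and both helper functions are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical permission set for an infrastructure engineer: broad EC2
# rights, but no KMS actions at all.
ENGINEER_ALLOWED_ACTIONS = ["ec2:*"]


def is_allowed(action: str, allowed_patterns: list) -> bool:
    """Toy check: an action is allowed only if some pattern matches it."""
    return any(fnmatchcase(action, pattern) for pattern in allowed_patterns)


def can_read_encrypted_volume(allowed_patterns: list) -> bool:
    """Reading data from an encrypted disk needs the key as well as the disk."""
    return is_allowed("ec2:AttachVolume", allowed_patterns) and is_allowed(
        "kms:Decrypt", allowed_patterns
    )


print(is_allowed("ec2:RebootInstances", ENGINEER_ALLOWED_ACTIONS))  # True
print(can_read_encrypted_volume(ENGINEER_ALLOWED_ACTIONS))          # False
```

In this model the engineer can reboot, copy, and reconfigure machines, but any path to the data itself fails at the key, which is the property the regulator is looking for.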
According to Raif’s cloud solutions architect, it is also possible to replace SSH and RDP access with AWS Session Manager and get all the benefits of centralized access to operating systems, with logging of all actions of your administrators and automatic package updates according to pre-defined rules.
For a more detailed dive into the processes involved in supporting migrated infrastructure, Pavlo recommends using the AWS Well-Architected Framework. This is a large document that best describes how Amazon sees the use of its services in an ideal world.
«However, applying all these approaches to a large and dynamic organization, like any bank, can take years. Therefore, of all the areas for development, it was the above-listed options that gave us the most tangible effect at the time,» the specialist states.
Raif’s experience, according to him, has shown that the practices and approaches described in this document are very harmoniously applied to the infrastructure of a financial institution in the realities of modern Ukraine.
And finally: before making a decision to migrate to the cloud, it is worth weighing all the prospects and risks, calculating possible costs, and carefully studying the documentation so that the process is as comfortable and optimal as possible for all parties involved.