Part 1 - Cloud Cost Optimisation
It is well established that cloud technologies can be an important driver of improvement in an organisation's business agility and service delivery. But like many other good things, this comes with a set of conditions that should be met. If not implemented the right way, services running on cloud can end up being costly, add management overhead and introduce additional risk to your environment.
To overcome these challenges, it is important to adhere to certain best practices that can serve as guidelines when using cloud technologies. The goal of this write-up is to equip readers with enough information to optimise their cloud datacenter, not only in monetary terms but also with respect to implementing proper governance and security standards.
Overview
The following areas need to be optimised in order to run an efficient and secure cloud practice in an organisation:
Cloud Spend
Reliability
IT Compliance
Security
Operational Excellence
Building the Right Team
This article is part of a series on cloud optimisation and focuses on techniques that can be used to reduce spend on cloud.
Cloud Cost Optimisation
Cost optimisation on cloud is a continuous process: new resources are provisioned in the environment on an ongoing basis, and cloud providers keep launching new offerings and modifying existing services. To stay on top of the expenses you incur to run your cloud, some of the best practices are:
Term based commitments – Reserved instances, savings plans
Key takeaway: Discounts for long-term commitments
Best suited for: Production Environments
Detailed description:
Cloud vendors provide discounts in exchange for long-term commitments. These discounts can be a big contributing factor in managing your cloud expenses effectively. This model usually works best for production environments, where servers run around the clock. Identifying servers which will be required for at least a year and purchasing reserved instances (RIs) or savings plans for them is an effective cost-cutting exercise.
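Whether a commitment pays off depends on how many hours the server actually runs, since a commitment is typically billed for the whole term regardless of usage. The following is a minimal sketch of that break-even calculation. The function names and the hourly rates in the example are illustrative assumptions, not any vendor's actual pricing.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours in a non-leap year


def break_even_utilisation(on_demand_hourly: float, committed_hourly: float) -> float:
    """Fraction of the year a server must run before a commitment is cheaper.

    Assumes the committed rate is charged for every hour of the term,
    whether or not the server is running.
    """
    if on_demand_hourly <= 0 or committed_hourly < 0:
        raise ValueError("hourly rates must be positive")
    return min(committed_hourly / on_demand_hourly, 1.0)


def annual_saving(on_demand_hourly: float, committed_hourly: float,
                  expected_hours: float) -> float:
    """Yearly saving from committing; a negative result means a loss."""
    on_demand_cost = on_demand_hourly * expected_hours
    committed_cost = committed_hourly * HOURS_PER_YEAR
    return on_demand_cost - committed_cost


# Hypothetical rates: 0.10 per hour on demand, 0.06 per hour committed.
print(break_even_utilisation(0.10, 0.06))
print(annual_saving(0.10, 0.06, expected_hours=HOURS_PER_YEAR))
print(annual_saving(0.10, 0.06, expected_hours=4000))
```

With these assumed rates the server has to run for roughly 60% of the year before the commitment wins, which is why always-on production servers are the natural candidates.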
Intelligent Shutdown
Key takeaway: Environment-based and utilisation-based shutdowns can be used to manage cost effectively on cloud
Best suited for: Non-Prod environments (development, QA, UAT)
Detailed description:
There are numerous techniques that can be used to reduce costs on the cloud if you mark your resources clearly with the environment they serve, such as Development, QA or Production. Based on this classification, you can create schedules which shut down servers when they are not in use.
For example, a typical development server is not required once the developer using it has stopped working for the day. On weekdays, the schedule could shut the server down at 6 pm and start it automatically at 8 am the next morning. Over the weekend the server could stay off completely, shutting down at 6 pm on Friday and starting at 8 am on Monday.
This kind of schedule saves close to 120 hours per server per week (118 of the 168 hours in a week). This can lead to 60%-70% cost savings on your development servers, which could mean a few thousand dollars saved per week depending on how large your development environment is.
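The arithmetic behind that figure can be checked with a few lines of code. This is a small sketch with hypothetical function names; it only counts hours and assumes the server runs during working hours on working days and is off the rest of the week.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def weekly_running_hours(start_hour: int, stop_hour: int, working_days: int = 5) -> int:
    """Hours per week a server runs if it is on from start_hour to stop_hour
    on each working day and off at all other times."""
    if not 0 <= start_hour < stop_hour <= 24:
        raise ValueError("expected 0 <= start_hour < stop_hour <= 24")
    if not 0 <= working_days <= 7:
        raise ValueError("working_days must be between 0 and 7")
    return (stop_hour - start_hour) * working_days


def weekly_saving(start_hour: int = 8, stop_hour: int = 18, working_days: int = 5):
    """Return (hours switched off per week, fraction of the week switched off)."""
    stopped = HOURS_PER_WEEK - weekly_running_hours(start_hour, stop_hour, working_days)
    return stopped, stopped / HOURS_PER_WEEK


hours_off, fraction_off = weekly_saving()  # 8 am to 6 pm, Monday to Friday
print(f"{hours_off} hours off per week, {fraction_off:.0%} of the week")
```

The fraction applies to charges that stop while a server is off. Storage attached to a stopped server is usually still billed, so the saving on the total bill tends to be somewhat lower than the fraction of hours.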
Similarly, a QA environment can be optimised by shutting it down when not needed and bringing it up only when a test is scheduled to occur.
Further optimisations can be achieved if you tag your servers with the applications they serve. Once this is done, you can apply specific shutdown schedules to the different environments of each application based on:
Criticality of the application
Working hours of the development and test teams
Any jobs scheduled to run on any of those servers
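A tag-driven schedule lookup could be sketched as follows. The tag keys ("application", "environment"), the application name "billing" and the schedules themselves are hypothetical; a real setup would read tags from the cloud provider and would also need to account for scheduled jobs, which this sketch leaves out.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Schedule:
    start_hour: int             # hour of day the server is started
    stop_hour: int              # hour of day the server is shut down
    weekdays_only: bool = True  # keep the server off on Saturday and Sunday


# Hypothetical schedules keyed by (application tag, environment tag).
SCHEDULES = {
    ("billing", "dev"): Schedule(start_hour=8, stop_hour=18),
    ("billing", "qa"): Schedule(start_hour=9, stop_hour=17),
}


def should_be_running(tags: dict, now: datetime) -> bool:
    """Decide whether a server with the given tags should be up at `now`."""
    schedule = SCHEDULES.get((tags.get("application"), tags.get("environment")))
    if schedule is None:
        # No schedule defined (for example production): never shut down.
        return True
    if schedule.weekdays_only and now.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return False
    return schedule.start_hour <= now.hour < schedule.stop_hour


dev_server = {"application": "billing", "environment": "dev"}
print(should_be_running(dev_server, datetime(2024, 1, 8, 10)))  # a Monday morning
print(should_be_running(dev_server, datetime(2024, 1, 6, 10)))  # a Saturday morning
```

Defaulting to "keep running" when no schedule matches is a deliberate choice here: an untagged or mis-tagged server is never shut down by accident.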
Unused / Unwanted Resource Cleanup
Key takeaway: Eliminate waste by identifying and deleting unused resources
Best suited for: Across all environments
Detailed description:
Identifying unused resources and cleaning them up in a timely manner can prove to be a cost-saving exercise. A number of unwanted resources can pile up on cloud. On AWS, for instance, resources like unattached EBS volumes, unassociated Elastic IPs (EIPs), old snapshots and load balancers (ELBs) with nothing behind them tend to fall off the radar over time and add to the expenses associated with your account. Automating the identification of these resources and defining policies for cleaning them up regularly is important to cut down wasteful expenditure.
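The identification step can be sketched as a simple filter over a resource inventory. The record layout used here ("id", "kind", "attached_to", "created") is a hypothetical export format, not a cloud provider API, and the 90-day snapshot age is an assumed retention policy. Treat the result as a list of candidates for review rather than something to delete automatically.

```python
from datetime import datetime, timedelta, timezone


def find_unused(resources: list, now: datetime, snapshot_max_age_days: int = 90) -> list:
    """Return the ids of resources that look unused.

    Each resource is a dict with "id" and "kind". Volumes, Elastic IPs and
    load balancers carry "attached_to" (empty or None when nothing uses
    them); snapshots carry a "created" timestamp.
    """
    cutoff = now - timedelta(days=snapshot_max_age_days)
    unused = []
    for resource in resources:
        kind = resource["kind"]
        if kind in ("volume", "elastic_ip", "load_balancer"):
            if not resource.get("attached_to"):
                unused.append(resource["id"])
        elif kind == "snapshot":
            if resource["created"] < cutoff:
                unused.append(resource["id"])
    return unused


inventory = [
    {"id": "vol-1", "kind": "volume", "attached_to": "server-a"},
    {"id": "vol-2", "kind": "volume", "attached_to": None},
    {"id": "eip-1", "kind": "elastic_ip", "attached_to": None},
    {"id": "snap-1", "kind": "snapshot",
     "created": datetime(2024, 1, 1, tzinfo=timezone.utc)},
]
print(find_unused(inventory, now=datetime(2024, 6, 1, tzinfo=timezone.utc)))
```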
Right Sizing / Right Fit
Key takeaway: Understand the ideal size for your server upfront and review it on an ongoing basis
Best suited for: Across all environments
Detailed description:
When new infrastructure is requested, the configuration of the requested hardware is often based on peak usage. This is not a wise move, as the infrastructure tends to be underutilised for most of its lifetime and results in unnecessary expenditure.
Defining a process which runs periodically to look at the usage patterns of the servers and right-sizes them by vertical scaling (down-sizing or up-sizing) is an effective way to make sure you are neither overspending on your infrastructure nor under-provisioning resources.
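A periodic right-sizing check could look like the sketch below. It assumes a hypothetical ladder of server sizes in which each step doubles capacity, uses the 95th percentile of CPU utilisation as the usage signal, and picks 40% and 80% as thresholds. All of these are illustrative choices; a real review would also look at memory, disk and network.

```python
import math

# Hypothetical size ladder: each step is assumed to double capacity.
SIZES = ["small", "medium", "large", "xlarge"]


def percentile(samples: list, pct: int) -> float:
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def recommend_size(current: str, cpu_samples: list,
                   low: float = 40.0, high: float = 80.0) -> str:
    """Suggest one step down, one step up, or the current size.

    cpu_samples are CPU utilisation percentages collected over the review
    period. With capacity doubling per step, a p95 below 40% should stay
    below 80% after moving one size down.
    """
    if not cpu_samples:
        raise ValueError("no utilisation data for this server")
    p95 = percentile(cpu_samples, 95)
    index = SIZES.index(current)
    if p95 < low and index > 0:
        return SIZES[index - 1]   # consistently idle: down-size
    if p95 > high and index < len(SIZES) - 1:
        return SIZES[index + 1]   # consistently busy: up-size
    return current


print(recommend_size("large", [12.0, 18.5, 9.0, 25.0, 14.0]))
```

Using a high percentile rather than the average keeps short bursts in view, so a server is only down-sized when it is idle almost all of the time.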
Implementing auto scaling (horizontal scaling) is also an effective way to manage your cost and application performance based on load. For this approach, you first need to confirm that the application can be distributed across multiple servers.
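To illustrate the idea, here is a simplified proportional scaling rule of the kind used by target-based autoscalers: the server count is scaled by the ratio of observed load to target load, then clamped to a minimum and maximum. The function name, the 60% CPU target and the limits are assumptions for this sketch, and managed auto scaling services add cooldowns and other safeguards that are not shown.

```python
import math


def desired_servers(current: int, avg_cpu: float, target_cpu: float = 60.0,
                    minimum: int = 1, maximum: int = 10) -> int:
    """Number of servers needed to bring average CPU back to the target.

    current: servers running now
    avg_cpu: average CPU utilisation (percent) across those servers
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    if target_cpu <= 0:
        raise ValueError("target_cpu must be positive")
    wanted = math.ceil(current * avg_cpu / target_cpu)
    return max(minimum, min(maximum, wanted))


print(desired_servers(current=4, avg_cpu=90.0))  # load above target: scale out
print(desired_servers(current=4, avg_cpu=30.0))  # load below target: scale in
```

The same rule that adds servers under load also removes them when load drops, which is where the cost saving comes from.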
Bulk usage Discounts
Key takeaway: Discounts based on heavy usage
Best suited for: Enterprises and organisations with large cloud consumption
Detailed description:
Cloud vendors offer discounts on some of their resources depending on the volume of usage. This can result in savings of 5%-10% on certain kinds of resources. Determining the right timelines and values to commit to needs a fair amount of data engineering and can be a cumbersome process. It usually helps to have a dedicated cloud financial governance person or team looking at this on a continuous basis and ensuring all areas are covered.
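Volume discounts are commonly structured as tiers, where each additional block of usage is charged at a lower unit price. The sketch below shows how the effective discount can be estimated for a given usage level. The tier boundaries and prices are made up for illustration and do not reflect any vendor's price list.

```python
# Hypothetical tiers: (upper bound of the tier in units, price per unit).
TIERS = [
    (1_000, 0.10),          # first 1,000 units
    (10_000, 0.09),         # next 9,000 units
    (float("inf"), 0.08),   # everything above 10,000 units
]


def tiered_cost(units: float, tiers: list = TIERS) -> float:
    """Total cost when each tier's price applies only to usage within it."""
    cost = 0.0
    lower = 0
    for upper, price in tiers:
        if units <= lower:
            break
        cost += (min(units, upper) - lower) * price
        lower = upper
    return cost


def effective_discount(units: float, tiers: list = TIERS) -> float:
    """Saving relative to paying the first-tier price for every unit."""
    if units <= 0:
        return 0.0
    flat_cost = units * tiers[0][1]
    return 1 - tiered_cost(units, tiers) / flat_cost


print(f"{effective_discount(5_000):.1%}")
```

With these assumed tiers, 5,000 units work out to a discount of about 8% compared with the first-tier price, in line with the range quoted above.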