Analysis Summary
Direct appeal
Explicitly telling you what to do — subscribe, donate, vote, share. Unlike subtler techniques, it works through clarity and urgency. Most effective when preceded by emotional buildup that makes the action feel like a natural next step.
Compliance literature (Cialdini & Goldstein, 2004); foot-in-the-door (Freedman & Fraser, 1966)
Worth Noting
Positive elements
- This video provides a clear technical explanation of the 'Shared Responsibility Model' in cloud computing, specifically regarding the difference between VM uptime and application health.
Be Aware
Cautionary elements
- The guest uses technical jargon like 'split-brain' and 'data inconsistency' to create a sense of inevitable failure that only their specific category of software can prevent.
Influence Dimensions
About this analysis
Knowing about these techniques makes them visible, not powerless. The ones that work best on you are the ones that match beliefs you already hold.
This analysis is a tool for your own thinking — what you do with it is up to you.
Transcript
Enterprise cloud migrations are at an all-time high. But here's the problem: most IT teams assume the cloud provider's infrastructure resilience means their applications are automatically highly available. That's not the case, and that gap is costing companies millions in unplanned downtime. Cloud providers give you resilient infrastructure, but your application? That's still your problem. So how do you bridge that gap and actually achieve true high availability in the cloud? That's exactly what we're going to find out in this episode of Data Driven with Philip Merry, Solutions Engineer at SIOS Technology. Philip, it's great to have you on the show.

>> Thank you for having me again.

>> Today's topic is of real interest to a lot of folks, because there are a lot of misconceptions when people talk about high availability. A lot of IT teams think that just because they're running in AWS or Azure or whatever cloud they prefer, their applications are automatically highly available. Is that true or is that wrong? And if it's wrong, why?

>> I think that's part of the picture. The systems, the virtual machines, the infrastructure that's hosted in the cloud: that is highly available, and it's managed by the cloud. However, you also need availability in the cloud. While AWS or your cloud provider is responsible for your storage devices and the hardware for the virtual machines, you're ultimately responsible for your applications and the services that run on that infrastructure. High availability is a joint effort. You work with your cloud provider to make sure your infrastructure is available, but you also have to take your own steps toward providing resilience for the applications within the cloud. And that requires an application-aware failover mechanism and application-aware disaster recovery.
That persists whether you're running on-prem or in the cloud. One way or another, you still have to make sure those applications are running, accessible, and serving the purpose you've set for them.

>> So if cloud providers deliver resilient infrastructure, where does the responsibility gap exist? What are IT teams missing when they assume the cloud's got them covered?

>> Suppose I'm hosting a database in the cloud. My cloud provider, whether that's AWS or Azure, is going to make sure the virtual machine I'm running that database on is available. However, just because that machine is operational doesn't mean my database is actually running on it. There could have been an application-level crash, or a drive could have filled up with data and be preventing my application from running at full capacity. In that gap, I'm ultimately responsible for making sure my database is running on the infrastructure the cloud has provided.

>> Let's talk about multi-availability-zone and multi-region deployments. Companies are spending significant amounts of money on these setups thinking they're protected. Walk us through what they're actually getting versus what they think they're getting.

>> A lot of the draw toward multi-availability-zone or multi-region deployments is to gain some extra protection, some extra risk reduction in the environment. And that does get provided. But it doesn't ensure seamless failover for applications between those regions.
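The gap described here, a healthy VM hosting an unhealthy application, is why availability monitoring has to probe the application itself rather than the instance. Below is a minimal sketch in Python; the port, path, and free-space threshold are illustrative assumptions, not any vendor's actual check:

```python
import shutil
import socket

def app_is_healthy(host: str, port: int, data_path: str,
                   min_free_bytes: int = 1 * 1024**3) -> bool:
    """Probe the application itself, not just the VM it runs on."""
    # 1. Is the service actually accepting connections? A running VM
    #    tells you nothing about an application-level crash.
    try:
        with socket.create_connection((host, port), timeout=2):
            pass
    except OSError:
        return False
    # 2. Has the data volume filled up? A full drive is a classic
    #    "VM up, application down" failure mode.
    return shutil.disk_usage(data_path).free >= min_free_bytes

# Hypothetical usage: a database listening on localhost:5432 with its
# data volume mounted at /var/lib/db.
# healthy = app_is_healthy("localhost", 5432, "/var/lib/db")
```

A real application-aware check would go further, for example by running a test query, but the point stands: the probe targets the workload, not the infrastructure underneath it.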
To use my database example: if I have failover between the US East and US West regions, I still have to make sure my application can promptly come into service when switching over from one region to the other. That introduces some considerations. To ensure seamless failover, you need to make sure your prerequisite applications, resources, services, and data are all available. If my US West region doesn't have all of the data my US East region has, it doesn't do me much good to bring the database into service over there, because it won't have the up-to-date data I need to remain operational. Additionally, with multi-availability-zone and multi-region deployments, you have to consider the possibility of a split brain, where the systems in both regions believe they are the active, operational copy of the environment. So you need a high availability solution that can manage failover and switchover between those regions and help avoid those conditions. And, as I touched on briefly, there's also the concern of data consistency, or data inconsistency. Your solution needs to make sure your data is replicated and present on the standby systems, so that if you do need to switch over to another region, you continue to operate with up-to-date data and you're not losing anything against your recovery point objective by bringing applications into service at another site.

>> Excellent. I want to go a bit deeper into split brain and data inconsistency. Can you break down what those mean in practical terms, and why multi-availability zones alone don't protect against them?
>> A split brain, in simplest terms, is when you have two systems, one meant to be active and one meant to be standby, but both systems say they are active. Continuing with my database analogy: even if both systems claim to be active, clients are still only connecting to one of those databases, so only one of them will have up-to-date data. And with both claiming to be the active copy, the source, there's the potential for data not to get copied from one system to the other, because neither is in the role of accepting incoming replicated data. Knowing what a split brain is, you don't really gain any protection against it from a multi-availability-zone or multi-region deployment, because split brain operates at the application level. If there's an issue communicating between regions, an issue replicating between regions, or any situation where one system can't reach its peer in the other region, there's the possibility for both systems to try to operate as the source. They're then competing to provide availability rather than working together. The result is inconsistent data, and difficulty figuring out which copy is the quote-unquote good copy you want to continue operating with once the situation is resolved. The other leg of your question was data consistency between regions. Any time you're working with a business application and managing customer data or business-critical data, you always want to be running with the most current copy of that data. When you introduce replication of data from one site to another, there are obviously concerns about how long it takes to copy that data across.
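A common way HA clusters prevent the split-brain condition described above is quorum: a node may only act as the active copy if it can reach a strict majority of voters, usually the two data nodes plus a third witness. A toy sketch of that majority rule follows; the node counts are illustrative:

```python
def may_become_active(reachable_voters: int, total_voters: int) -> bool:
    """A node may promote itself to active only if it can reach a
    strict majority of the cluster's voters (itself included)."""
    return reachable_voters > total_voters // 2

# Two data nodes plus a witness: three voters in total.
# A partitioned node that sees only itself holds 1 of 3 votes.
print(may_become_active(1, 3))  # False: the partitioned node stands down
print(may_become_active(2, 3))  # True: this node still reaches the witness
```

Because at most one partition can ever hold a majority, the two regions can never both claim the active role at the same time.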
There's also the concern of how current the remote site is. Did node one get writes one, two, and three, while node two only got write one? When you spread applications out across regions, you aren't really mitigating the factors that contribute to data inconsistency. The time it takes to write data increases a little as latency grows with geographic distance, the relay of data from one system to another still has to occur, and synchrony still has to be achieved, regardless of how far apart the regions are. So you aren't gaining any mechanisms that aid data consistency from a multi-availability-zone or multi-region deployment. What you are gaining is peace of mind and risk reduction: if there were ever an issue in one of those regions, say a fire in a data center or a natural disaster, the services and infrastructure in the other region would not be impacted. You do gain a little resilience that way, but you still have to make sure everything is getting copied from one region to another before you can really rely on it as a risk-reduction measure.

>> Now, there's this idea that cloud-native high availability is simple and one-size-fits-all, but you're dealing with everything from databases to ERP systems to custom applications. What makes high availability so different across these workloads?

>> This comes down to what we at SIOS call application awareness.
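The "node one got writes one, two, and three, node two only got write one" situation can be framed as a replication-lag check against a recovery point objective. A toy sketch, where the sequence numbers and allowed lag are illustrative assumptions:

```python
def within_rpo(primary_last_write: int, standby_last_applied: int,
               max_lag_writes: int) -> bool:
    """Is the standby close enough behind the primary to fail over
    without violating the recovery point objective?"""
    lag = primary_last_write - standby_last_applied
    return 0 <= lag <= max_lag_writes

# Primary has applied writes 1..3; the standby has only applied write 1,
# so it lags by 2 writes. With an allowed lag of 1, failover is unsafe.
print(within_rpo(3, 1, max_lag_writes=1))  # False
print(within_rpo(3, 3, max_lag_writes=1))  # True: fully caught up
```

Real replication systems track positions as log sequence numbers rather than write counts, but the failover decision has the same shape: compare the standby's position to the primary's before promoting it.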
When you have an application, and again I'll take a database as the example because it's easiest for me to work with, before that database can really become operational and serve its business purpose, you need its data to be accessible. You may need a floating or virtual IP to be accessible so clients can connect to that database. You may need your ERP systems, say an SAP environment, to be ready to manage transactions across that database. There are dependencies the environment has to provide before everything can be fully operational. A lot of times, cloud-native tools lack the deep application awareness to know what all of those dependent services are and to get them started and prepared for the keystone application to take over and provide its full operation. This is where specialized high availability solutions become necessary to minimize downtime. It's one thing to start a database in another region or another site and have it serve the data that's required. But a specialized high availability tool can make sure your volumes are accessible, that storage is shared appropriately and mounted on the systems where your database might run, and that your ERP systems are operational in the other region and ready to manage transactions across the database. That way, when the database does come online, it is poised and ready to be fully functional. That minimizes downtime and reduces the need for administrators to manually intervene to get the environment up and running, which not only reduces the possibility of human error but also reduces the time it takes for the environment to become fully operational.
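The dependency ordering described here, storage and virtual IP first, then the SAP services, then the database, is essentially a topological sort of a resource graph. A sketch using Python's standard library; the resource names are purely illustrative:

```python
from graphlib import TopologicalSorter

# Hypothetical resource graph: each resource lists its prerequisites.
dependencies = {
    "storage-mount": [],
    "virtual-ip": [],
    "sap-services": ["storage-mount"],
    "database": ["storage-mount", "virtual-ip", "sap-services"],
}

def startup_order(deps: dict[str, list[str]]) -> list[str]:
    """Return an order in which every resource starts only after all
    of its prerequisites have started."""
    return list(TopologicalSorter(deps).static_order())

order = startup_order(dependencies)
# The database comes last, with every one of its dependencies met first.
print(order)
```

Failover is the same walk in reverse on the old site and forward on the new one, which is why encoding the dependency graph, rather than just "start the database", is what makes a tool application-aware.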
One other element, as I've been tagging onto all of these, is data. By making sure your high availability solution not only considers the services your keystone application relies upon, but also makes sure the data it requires is accessible, replicated, and up to date on the other sites, you're making sure that if something happens to your primary environment, the secondary environment is ready to take over and provide seamless continuation of business operations.

>> Can we also talk about what's missing from cloud-native tools when it comes to application awareness, and why this matters for enterprises running critical workloads?

>> This is going to vary with whichever tool we're looking at. Again, I'll use the database example, simply because it's what I'm most familiar with and can talk most readily about. Suppose I have my database running on a managed service via my cloud provider. That may provide the database operation and accessibility, and it may even provide data redundancy. But what's not being tracked is my SAP environment: something to manage and track the transactions on the database, to handle a lock table, and to make sure clients using that database are using it appropriately and getting the correct data. The cloud-native tool doesn't intrinsically have awareness of that SAP environment, which I might need to be fully operational for my applications.
That's where a tool like LifeKeeper, or one of the SIOS products, comes into play. We don't just take into consideration that the database needs to be started with consistent data; we also take into consideration that the SAP environment is a prerequisite to that database starting, so the entire environment can be fully operational. The key distinction is mostly in the prerequisites: the knowledge of what needs to be in place before this other application, so you can achieve full operation.

>> For those IT teams watching this show who are running mission-critical applications in the cloud, what should they be evaluating when they look at HA solutions beyond what their cloud providers offer?

>> The advice I would give any IT team is to consider the ease of use of your high availability solution. When you really need to interact with it, it's going to be during downtime or while you're performing maintenance or updates. One way or another, it's going to be at a time when things need to operate smoothly. You don't want any unexpected issues or errors, and you want it to be an easy part of your process, whether that's maintenance or working to recover from an outage. So the first thing I'd recommend is to look for that ease of use.
The other element, which may sound contradictory but I really believe the two go hand in hand, is deep application awareness in your high availability tool: making sure it's not just checking the box of starting the application, but starting it with all of its dependencies already met. That way, when your high availability tool says your applications are running, you can be assured that not only are they running, they're running in a capacity where they're available and serving the purpose the business requires.

>> Philip, thank you so much for joining me and breaking down these myths around high availability in the cloud. The key takeaway here is that cloud infrastructure resilience doesn't equal application availability, and enterprises need application-aware HA solutions like SIOS to truly protect against downtime. As usual, thank you for sharing these insights, and I look forward to our next conversation.

>> Yeah, thank you. Thank you for giving me the opportunity to share those insights, and thank you for having me on yet again. It's always a pleasure.

>> And for those watching, if you're facing similar challenges with application availability in the cloud, make sure to check out SIOS Technology and its solutions. And don't forget to subscribe to TFIR, like this video, and share it. Thanks for watching.
Video description
Enterprise cloud migrations are at an all-time high, but most IT teams assume that the cloud provider's infrastructure resilience means their applications are automatically highly available. However, that's not the case, and this gap is costing companies millions in unplanned downtime. In this episode of Data Driven, Philip Merry, Solutions Engineer at SIOS Technology, breaks down the critical distinction between infrastructure resilience and application availability. Philip explains why multi-availability zones and multi-region deployments don't automatically protect against split-brain scenarios and data inconsistency—and what enterprises need to do to achieve true high availability in the cloud. Read the full story at www.tfir.io #CloudComputing #HighAvailability #DisasterRecovery #AWS #Azure #EnterpriseIT #ApplicationResilience #CloudMigration #DataProtection #SIOS