What led Netflix to shut down its own data centers and migrate to AWS?
Netflix began as a DVD-by-mail business, shipping the titles you selected to your home. This went well until 2008, when a major database corruption left the company unable to ship DVDs to customers for three days. That was when senior management at Netflix realized they had to move away from continuous vertical scaling, which creates single points of failure, toward a more reliable, horizontally scalable system. They chose Amazon Web Services even though Amazon is a competitor (Amazon runs its own streaming service, Amazon Prime Video) because AWS offered the greatest scaling capabilities and the broadest set of features. It took seven years for Netflix to complete the migration, shut down its last remaining data centers, and move completely to the cloud.
Moving to the cloud has allowed Netflix to keep its existing members engaged while overall viewing has grown dramatically.
Netflix itself has continued to evolve rapidly, adding many new features and relying on ever-growing volumes of data. Supporting such fast growth would not have been possible with its own in-house data centers: Netflix could not have racked servers fast enough to keep up. The cloud, by contrast, brings elasticity, which allows Netflix to add thousands of virtual servers and petabytes of storage within minutes.
In January 2016, Netflix expanded into 130 new countries. It uses multiple AWS regions spread around the world to create a better and more enjoyable streaming experience for Netflix members wherever they are.
Netflix relies on the cloud for all of its scalable computing and storage needs, not only video streaming: business logic, distributed databases, big data processing, analytics, recommendations, transcoding, and hundreds of other functions all run on its cloud infrastructure. Netflix also has its own Content Delivery Network (CDN), Netflix Open Connect, which delivers video efficiently around the globe.
When Netflix ran its own data centers, it suffered many outages. Cloud computing is not perfect either: Netflix has hit some rough patches in the cloud, but its overall availability has increased steadily. Failures are ultimately unavoidable in any large-scale distributed system, even a cloud-based one. However, a cloud-based system lets you build in redundancy, which proves very helpful when individual components fail. Cloud computing has made it possible to survive failures without impacting the member experience.
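The idea of surviving a component failure through redundancy can be illustrated with a minimal sketch. It is not Netflix's implementation; the replica functions and the `call_with_failover` helper are hypothetical names invented for this example. The caller tries each redundant replica in turn and only fails if every one of them fails.

```python
from typing import Callable, List, Sequence


class AllReplicasFailed(Exception):
    """Raised when no redundant replica could serve the request."""


def call_with_failover(replicas: Sequence[Callable[[], str]]) -> str:
    """Try each replica in order and return the first successful response."""
    errors: List[Exception] = []
    for replica in replicas:
        try:
            return replica()
        except Exception as exc:  # a failed replica must not fail the request
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replica(s) failed")


# Hypothetical replicas: one is down, one is healthy.
def broken_replica() -> str:
    raise ConnectionError("replica unavailable")


def healthy_replica() -> str:
    return "ok"


# The request still succeeds even though the first replica is down.
result = call_with_failover([broken_replica, healthy_replica])
```

With a single replica, the same outage would reach the member directly; the second replica is what absorbs the failure.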
Why did Netflix shift to the cloud even though it took so long?
Netflix did not move to the cloud to cut costs, but its cloud costs ended up being a fraction of its data center costs, which was a pleasant surprise. This was due to the elasticity of cloud computing, which lets Netflix continuously optimize its instances and grow or shrink its footprint as required, without maintaining large amounts of spare capacity. Economies of scale also help Netflix here.
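A rough worked example shows why elasticity can lower cost. The load figures and per-instance capacity below are made-up illustrative numbers, not Netflix data, and `instances_needed` and `instance_hours` are hypothetical helpers. The sketch compares a fleet sized for peak load all day with a fleet that grows and shrinks each hour.

```python
import math
from typing import List


def instances_needed(load_rps: float, capacity_rps: float) -> int:
    """Smallest number of instances that can serve the given load."""
    return max(1, math.ceil(load_rps / capacity_rps))


def instance_hours(hourly_load_rps: List[float], capacity_rps: float,
                   elastic: bool) -> int:
    """Total instance-hours consumed over the given hours."""
    per_hour = [instances_needed(load, capacity_rps) for load in hourly_load_rps]
    if elastic:
        # Elastic fleet: pay only for what each hour actually needs.
        return sum(per_hour)
    # Fixed fleet: provisioned for the peak hour, all the time.
    return max(per_hour) * len(per_hour)


# Assumed load in requests per second for six consecutive hours.
hourly_load = [200, 200, 200, 400, 800, 1000]

fixed_hours = instance_hours(hourly_load, capacity_rps=100, elastic=False)
elastic_hours = instance_hours(hourly_load, capacity_rps=100, elastic=True)
```

Under these assumptions the fixed fleet uses 60 instance-hours (10 instances for 6 hours) and the elastic fleet uses 28, so the saving comes entirely from not paying for idle peak capacity during quiet hours.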
The benefits are very clear, but it still took seven years for Netflix to complete the migration. Moving to the cloud is a lot of work, and many factors need to be considered. Netflix could easily have moved all of its existing systems to AWS unchanged, but doing so would also have carried over all the problems and limitations of the data center. Instead, Netflix took the cloud-native approach: it rebuilt virtually all of its technology and fundamentally changed the way the whole company operates. Netflix migrated from a single application to thousands of microservices.
Netflix Realizes Multi-Region Resiliency Using Amazon Route 53
What happens when you need to move 89 million viewers to a different AWS region? Netflix’s infrastructure, built on AWS, makes it possible to be extremely resilient, even when the company is running services in many AWS Regions simultaneously.
In this episode of This is My Architecture, Coburn Watson, director of performance and reliability engineering at Netflix, walks through the company’s DNS architecture — built on Amazon Route 53 and augmented with Netflix’s Zuul — that allows the team to evacuate an entire region in less than 40 minutes.
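To make the idea of evacuating a region concrete, here is a minimal sketch of the traffic-shifting arithmetic only. It does not call Route 53 or Zuul and is not Netflix's actual mechanism; the `evacuate_region` helper and the traffic shares are assumptions made for illustration. The evacuated region's weight drops to zero and its share is redistributed proportionally across the remaining regions.

```python
from typing import Dict


def evacuate_region(weights: Dict[str, float], region: str) -> Dict[str, float]:
    """Return new traffic weights with `region` drained to zero.

    The remaining regions keep their relative proportions and are
    rescaled so that the weights still sum to 1.0.
    """
    if region not in weights:
        raise KeyError(f"unknown region: {region}")
    remaining = {r: w for r, w in weights.items() if r != region}
    total = sum(remaining.values())
    if total <= 0:
        raise ValueError("no other region is available to receive traffic")
    new_weights = {r: w / total for r, w in remaining.items()}
    new_weights[region] = 0.0
    return new_weights


# Assumed traffic split across three AWS regions before the evacuation.
before = {"us-east-1": 0.5, "us-west-2": 0.25, "eu-west-1": 0.25}
after = evacuate_region(before, "us-east-1")
```

In this example the two remaining regions each end up with half of the traffic. A real evacuation would also need the receiving regions to scale up before the weights change, which this sketch leaves out.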
Netflix is the world’s leading internet television network, with more than 100 million members in more than 190 countries enjoying 125 million hours of TV shows and movies each day. Netflix uses AWS for nearly all its computing and storage needs, including databases, analytics, recommendation engines, video transcoding, and more — hundreds of functions that in total use more than 100,000 server instances on AWS.