About Flutter UKI and Their Data Challenge
Flutter UKI, a division of Flutter Entertainment, leads the sports betting and gaming industry with brands like Paddy Power, Sky Betting and Gaming, and Tombola. Their data team faced significant challenges managing a monolithic Apache Airflow deployment on Amazon EC2 instances that grew increasingly complex as their digital footprint expanded.
By 2022, Flutter UKI reached a critical decision point: re-architect their service on Amazon EKS or embrace Amazon Managed Workflows for Apache Airflow (Amazon MWAA). After extensive proof-of-concept testing and collaboration with AWS Enterprise Support, they chose MWAA to reduce operational complexity and accelerate innovation.
Strategic Migration to Amazon MWAA
The migration followed a methodical approach with multiple POCs and data-driven decisions. Starting with a small subset of directed acyclic graphs (DAGs), they gradually expanded to thousands of workflows while validating performance and reliability.
The data team managed over 3,500 dynamically generated DAGs by distributing them across multiple MWAA environments. This isolated workloads and prevented any single environment from being overloaded. Each environment was configured with its own DAG_FOLDER path, and DAGs were assigned to environments with a round-robin strategy, which maintained high performance while orchestrating workflows efficiently.
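The round-robin assignment described above can be illustrated with a minimal sketch. This is not Flutter UKI's implementation: the environment names, the bucket name, and the folder layout are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical environment names; the source only says "multiple MWAA environments".
ENVIRONMENTS = ["mwaa-env-1", "mwaa-env-2", "mwaa-env-3", "mwaa-env-4"]


def assign_round_robin(dag_ids, environments):
    """Map each environment to the DAG ids it should host.

    DAG ids are sorted first so the assignment is deterministic between runs.
    """
    assignment = defaultdict(list)
    for index, dag_id in enumerate(sorted(dag_ids)):
        env = environments[index % len(environments)]
        assignment[env].append(dag_id)
    return dict(assignment)


def dag_folder_for(environment, bucket="example-dag-bucket"):
    """Build a per-environment DAG folder path (bucket and layout are assumed)."""
    return f"s3://{bucket}/{environment}/dags"


dag_ids = [f"dag_{n:04d}" for n in range(10)]
assignment = assign_round_robin(dag_ids, ENVIRONMENTS)
for env, dags in assignment.items():
    print(dag_folder_for(env), len(dags))
```

One trade-off of assigning by sorted position is that adding or removing a DAG shifts the environment of every DAG after it in the ordering. A generator that needs stable placement might instead hash the DAG id.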
Advanced Architecture Design
Flutter UKI’s architecture delegated DAG execution to Amazon EKS for optimal performance. Key components included:
- KubernetesPodOperator (KPO) for tasks: Keeping the architecture simple and enabling per-task resource allocation
- Custom KPO wrapper (KPOw): Abstracting complexity and minimizing the impact of version changes
- Monthly image updates: Ensuring code remained current and preventing security vulnerabilities
- Continuous Airflow updates: Implementing new versions with careful testing protocols
- Comprehensive monitoring with CloudWatch metrics: Providing early warning signals for potential issues
- CI/CD integration: Streamlining development with GitLab, code reviews, and Argo Workflows
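The idea behind a custom KPO wrapper can be sketched as a function that merges centrally managed defaults with per-task overrides, so a version change is absorbed in one place. The sketch below is an assumption about how such a wrapper might look, not the KPOw code itself: `PodTaskSpec`, `build_pod_kwargs`, and every default value are hypothetical. It deliberately avoids importing Airflow and only returns a dictionary of keyword arguments.

```python
from dataclasses import dataclass, field

# Defaults a platform team might centralise; every value here is illustrative.
DEFAULTS = {
    "namespace": "airflow-tasks",
    "image": "registry.example.com/tasks:latest",
    "cpu": "500m",
    "memory": "512Mi",
}


@dataclass
class PodTaskSpec:
    """Hypothetical per-task settings a DAG author passes to the wrapper."""
    task_id: str
    cmds: list
    arguments: list = field(default_factory=list)
    overrides: dict = field(default_factory=dict)


def build_pod_kwargs(spec):
    """Merge shared defaults with per-task overrides into operator kwargs.

    The keys task_id, name, namespace, image, cmds and arguments follow
    KubernetesPodOperator parameter names. "resources" is a plain dict here
    and would need translating to whatever the installed provider expects.
    """
    unknown = set(spec.overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unsupported overrides: {sorted(unknown)}")
    settings = {**DEFAULTS, **spec.overrides}
    return {
        "task_id": spec.task_id,
        "name": spec.task_id.replace("_", "-"),  # pod names cannot contain underscores
        "namespace": settings["namespace"],
        "image": settings["image"],
        "cmds": spec.cmds,
        "arguments": spec.arguments,
        "resources": {
            "requests": {"cpu": settings["cpu"], "memory": settings["memory"]},
        },
    }


# Usage: only memory is overridden; everything else comes from DEFAULTS.
kwargs = build_pod_kwargs(
    PodTaskSpec("load_bets", ["python"], ["load.py"], {"memory": "2Gi"})
)
print(kwargs["name"], kwargs["resources"]["requests"])
```

Rejecting unknown overrides keeps the wrapper's surface small, which is what lets the team change the underlying operator call without touching every DAG.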
Performance Optimization Techniques
Flutter UKI implemented several key optimization strategies:
- DAGs generated dynamically from database metadata, with DAG generation running through its own CI/CD pipeline
- Parameters and secrets stored in AWS Secrets Manager and retrieved at runtime
- DAG schedules staggered to distribute execution times evenly
- Task code and common modules hosted on Amazon S3 and retrieved at runtime
- Amazon EFS volumes mounted to task pods for larger codebases
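Two of the strategies above, metadata-driven DAG generation and evenly distributed execution times, can be combined in a small sketch. The metadata rows, field names, and the choice to spread start minutes across the hour are assumptions for illustration. The source does not describe how Flutter UKI computes its schedules.

```python
# Hypothetical metadata rows, standing in for records read from a database.
METADATA_ROWS = [
    {"name": "bets_hourly"},
    {"name": "payments_hourly"},
    {"name": "odds_hourly"},
    {"name": "accounts_hourly"},
]


def staggered_cron(index, total, hour_field="*"):
    """Return a cron expression whose minute is spread evenly across the hour."""
    if not 0 <= index < total:
        raise ValueError("index must be in range(total)")
    minute = (index * 60) // total
    return f"{minute} {hour_field} * * *"


def build_dag_specs(rows):
    """Turn metadata rows into DAG specs with staggered hourly schedules.

    Rows are sorted by name so each DAG keeps the same slot between runs
    as long as the set of rows does not change.
    """
    ordered = sorted(rows, key=lambda row: row["name"])
    return [
        {"dag_id": row["name"], "schedule": staggered_cron(i, len(ordered))}
        for i, row in enumerate(ordered)
    ]


specs = build_dag_specs(METADATA_ROWS)
for spec in specs:
    print(spec["dag_id"], spec["schedule"])
```

With more than 60 DAGs, several share the same start minute, but the load per minute still stays roughly even. A real generator would pass each spec to an Airflow DAG definition rather than printing it.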
Impressive Results
Today, Flutter UKI operates four Amazon MWAA clusters executing tasks on dedicated Amazon EKS node groups. They manage approximately 5,500 DAGs encompassing over 30,000 tasks, handling more than 60,000 DAG runs daily with concurrency exceeding 450 simultaneous tasks.
During major events like Cheltenham and the Grand National, when data load increases by 30%, their service remains stable and scalable, achieving a 100% success rate for critical processes – a significant improvement over previous years.
Key Benefits Realized
The migration to Amazon MWAA delivered multiple benefits:
- Enhanced stability, scalability, and resilience
- Improved security with managed updates and built-in encryption
- Reduced operational overhead allowing teams to focus on business-critical tasks
- Accelerated innovation in core business operations
If you want to reduce operational overhead by migrating to a fully managed Airflow service on AWS, Amazon MWAA offers a compelling option. Contact your Technical Account Manager or Solutions Architect to discuss your specific use case.