Tag: ETL
-

Capture Data Lineage from dbt, Apache Airflow, and Apache Spark with Amazon SageMaker
Amazon SageMaker now offers enhanced data lineage capabilities compatible with OpenLineage, allowing users to track data flow from tools like dbt, Apache Airflow, and Apache Spark. This integration creates transparency, builds trust, and centralizes governance of data assets.
-

Amazon PackScan: Revolutionizing Real-Time Sort Center Analytics with AWS Services
Discover how Amazon transformed its logistics operations with PackScan, an AWS-powered platform that reduced data latency from 1 hour to under 1 minute. This real-time analytics solution processes 500,000 scan events per second across 80 sort centers, resulting in a 25% increase in throughput and a 12% reduction in labor hours.
-

How Flutter UKI Optimized Data Pipelines with Amazon MWAA
Discover how Flutter UKI transformed their data pipelines by migrating from EC2-based Airflow to Amazon MWAA, managing 5,500 DAGs and 60,000 daily runs with improved stability and reduced operational overhead.
-

Unify Data Analytics: Integrating Amazon S3 Tables with SageMaker Lakehouse
Amazon SageMaker Lakehouse now integrates with Amazon S3 Tables, offering unified access to data across S3, Redshift warehouses, and other sources. This integration enables seamless analytics using preferred tools while maintaining security through fine-grained permissions, helping organizations derive insights from distributed data without duplication or complex connectors.
-

Accelerate Data to AI Innovation with Amazon SageMaker Unified Studio
AWS announces the general availability of Amazon SageMaker Unified Studio, bringing together analytics and AI capabilities in a single development environment. This integrated platform enables teams to discover data, collaborate on projects, and build advanced applications with built-in governance, dramatically reducing time-to-value for data-driven initiatives.
-

Cross-Account Data Collaboration with Amazon DataZone and AWS Analytics Tools
Amazon DataZone enables secure cross-account data collaboration for AWS services. This solution streamlines data sharing between producer and consumer accounts while maintaining governance. Learn how to set up, publish, and consume shared data assets across accounts using AWS Glue and Amazon Redshift.
-

Revolutionizing Data Engineering: How Gemini in BigQuery Transforms Data Management
Discover how Gemini in BigQuery is revolutionizing data engineering through automated schema management, enhanced data quality control, and sophisticated data generation capabilities. Learn practical implementations and best practices for modern data solutions.
-

How Open Universities Australia Reduced ETL Costs Using AWS Cloud Services
Discover how Open Universities Australia revolutionized their data infrastructure by transitioning from costly third-party ETL tools to AWS services, achieving significant cost savings and improved efficiency in just 5 months.
-

Amazon Q Data Integration: Enhanced DataFrame Support and Context-Aware ETL Development
Discover how Amazon Q data integration has evolved with DataFrame support and context-aware development, revolutionizing ETL workflows. Learn about its enhanced capabilities, multiple data source support, and seamless integration with AWS services.
