Tag: Spark
-

Capture Data Lineage from dbt, Apache Airflow, and Apache Spark with Amazon SageMaker
Amazon SageMaker now offers OpenLineage-compatible data lineage capabilities, letting users track data flow from tools such as dbt, Apache Airflow, and Apache Spark. The integration improves transparency, builds trust, and centralizes governance of data assets.
-

Implementing Hybrid Analytics with Amazon EMR on AWS Outposts
Discover how Amazon EMR on AWS Outposts enables hybrid analytics that combine cloud scalability with on-premises control. Learn to process sensitive data locally while still drawing on cloud resources for big data workloads.
-

Integrating AWS Glue with Amazon OpenSearch Service for Streamlined Data Ingestion
Discover how to integrate AWS Glue with Amazon OpenSearch Service for streamlined data ingestion. Learn about three integration methods, best practices, and the infrastructure setup for building robust data pipelines.
-

Streamlining Spark Debugging: AWS Glue Introduces Generative AI Troubleshooting Feature
AWS Glue introduces a generative AI troubleshooting feature for Apache Spark applications. It automates root cause analysis and provides actionable recommendations, aiming to cut debugging time from hours to minutes.
