Job Url: https://www.linkedin.com/jobs/search/?currentJobId=4341792966&f_AL=true&f_TPR=r86400&f_WT=2&keywords=machine%20learning%20engineer&origin=JOB_SEARCH_PAGE_JOB_FILTER&spellCorrectionEnabled=true&start=75

Job Description: AWS Data Engineer
Tredence Inc. · United States (Remote) · Full-time
Posted 1 hour ago · 63 applicants · Job poster: Manish M., Associate Manager @ Tredence

About the job
We are seeking a highly skilled AWS Data Engineer with 10+ years of experience to design, build, and optimize large-scale data pipelines and cloud-based data solutions. The ideal candidate will have deep expertise in AWS data services, strong analytical skills, and the ability to work cross-functionally to deliver complex data engineering initiatives. This role will also contribute to architectural decisions, best practices, and mentorship across the data engineering team.

Key Responsibilities

Data Pipeline & ETL Development
- Design, develop, and maintain scalable, reliable, and high-performance data pipelines using AWS services (Glue, EMR, Redshift, Lambda, Kinesis, Step Functions, etc.).
- Build and manage ETL/ELT workflows for structured, semi-structured, and unstructured data using Glue, Spark, Python, and SQL.
- Implement data ingestion frameworks, including real-time streaming (Kinesis) and batch data processing.
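As a rough illustration of the extract-transform-load pattern this role centers on, here is a minimal pure-Python sketch. All function and field names are hypothetical; a real pipeline for this role would read from S3 and write to Redshift via Glue or Spark on EMR rather than working in memory:

```python
# Minimal ETL sketch: extract -> transform -> load over in-memory records.
# Names are illustrative stand-ins, not any specific Glue job.

def extract(raw_rows):
    """Parse semi-structured CSV-like rows into dicts (the 'extract' step)."""
    return [dict(zip(("user_id", "amount"), r.split(","))) for r in raw_rows]

def transform(rows):
    """Cast types and drop invalid records (the 'transform' step)."""
    out = []
    for r in rows:
        try:
            out.append({"user_id": int(r["user_id"]), "amount": float(r["amount"])})
        except (KeyError, ValueError):
            continue  # drop malformed rows; a real pipeline would route these to a dead-letter store
    return out

def load(rows, sink):
    """Append clean rows to a sink (stand-in for a Redshift COPY/INSERT)."""
    sink.extend(rows)
    return len(rows)

sink = []
loaded = load(transform(extract(["1,9.99", "2,abc", "3,4.50"])), sink)
```

The same three-stage shape carries over directly to PySpark, where each stage becomes a DataFrame read, a chain of transformations, and a write.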
Data Integration & Transformation
- Integrate data from diverse sources (RDBMS, APIs, streaming sources, on-prem systems) into cloud-based data platforms.
- Develop complex data transformation logic using PySpark, Glue ETL, SQL, and EMR jobs.
- Work with AWS data migration tools such as DMS, DataSync, SCT, and MWAA to support migration and modernization initiatives.

Data Architecture & Modeling
- Collaborate with data architects to design scalable data lake and data warehouse architectures on AWS.
- Apply strong knowledge of data modeling (star/snowflake schemas), dimensional modeling, and data warehousing concepts.
- Create optimized table structures and schemas in Redshift and Delta Lake formats.

Data Quality, Governance & Security
- Implement and automate data quality checks, validation rules, and reconciliation frameworks.
- Ensure data governance and security best practices, including IAM permissions, encryption, access controls, and compliance with regulatory standards.
- Maintain data lineage, metadata, and documentation for auditability.

Monitoring & Optimization
- Monitor data pipelines using CloudWatch, Glue job metrics, EMR logs, and custom observability dashboards.
- Troubleshoot pipeline failures, performance bottlenecks, and data inconsistencies.
- Optimize data pipelines for performance, reliability, and cost-efficiency, leveraging AWS best practices.

Collaboration & Leadership
- Work closely with data scientists, analysts, architects, and business stakeholders to understand data needs and propose technical solutions.
- Provide technical leadership and mentorship to junior engineers through reviews, training sessions, and knowledge sharing.
- Contribute to continuous improvement of engineering processes, CI/CD practices, and DevOps automation for data pipelines.

Qualifications & Skills

Required Qualifications
- 10+ years of hands-on experience in data engineering, with at least 4 years working extensively in AWS.
- Proven expertise with AWS services:
  - Core data services: Glue, Redshift, EMR, Athena, Lake Formation
  - Storage & compute: S3, Lambda, EC2, Step Functions
  - Migration tools: DMS, DataSync, MWAA
- Strong proficiency in Python, PySpark, SQL, and data pipeline frameworks.
- Experience building distributed data processing systems using Apache Spark on EMR or Glue.
- Strong understanding of data modeling, data warehousing, ETL frameworks, and big data ecosystems.
- Hands-on experience with CI/CD pipelines (CodePipeline, GitHub Actions, Jenkins) for data workloads.
- Solid understanding of security best practices, IAM policies, encryption (KMS), and network configuration in AWS.
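The automated data-quality and reconciliation work described above can be sketched in a few lines of plain Python. The rules and tolerance below are illustrative assumptions, not any actual validation framework:

```python
# Sketch of two common data-quality checks: a not-null rule on required
# columns, and a source-vs-target row-count reconciliation.

def check_not_null(rows, column):
    """Return indices of rows where a required column is missing or None."""
    return [i for i, r in enumerate(rows) if r.get(column) is None]

def reconcile_counts(source_count, target_count, tolerance=0.0):
    """True if loaded row count matches the source within a relative tolerance."""
    if source_count == 0:
        return target_count == 0
    return abs(source_count - target_count) / source_count <= tolerance

rows = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": None}]
bad = check_not_null(rows, "amount")              # indices of failing rows
ok = reconcile_counts(1000, 998, tolerance=0.005)  # 0.2% drift, within 0.5%
```

In a production pipeline, checks like these would typically run as a post-load step, with failures surfaced through CloudWatch alarms or a similar alerting channel.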
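The dimensional-modeling skills listed above (star schemas, surrogate keys) can also be shown concretely. This is a minimal pure-Python sketch with hypothetical table and column names; in Redshift the dimension and fact would be physical tables loaded by the pipeline:

```python
# Star-schema sketch: assign surrogate keys to a customer dimension and
# rewrite fact rows to reference them instead of the natural key.

def build_dimension(rows, natural_key):
    """Map each distinct natural key to a stable integer surrogate key."""
    dim = {}
    for r in rows:
        dim.setdefault(r[natural_key], len(dim) + 1)
    return dim

def to_fact(rows, dim, natural_key):
    """Replace the natural key with the dimension's surrogate key."""
    return [{"customer_sk": dim[r[natural_key]], "amount": r["amount"]}
            for r in rows]

sales = [{"customer_id": "C7", "amount": 5.0},
         {"customer_id": "C2", "amount": 3.0},
         {"customer_id": "C7", "amount": 1.0}]
dim = build_dimension(sales, "customer_id")
facts = to_fact(sales, dim, "customer_id")
```

Keeping facts keyed on compact surrogate integers rather than natural keys is what makes the star-schema joins cheap and lets dimension attributes change without rewriting history.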