We are seeking a highly skilled Cloud – MLOps Engineer to design, implement, and manage scalable cloud-based machine learning infrastructure on AWS. The ideal candidate will have strong expertise in AWS cloud architecture, MLOps pipeline automation, and Large Language Models (LLMs), along with solid DevOps and CI/CD capabilities. This role involves developing automated, secure, and efficient workflows to support AI-driven applications, enabling continuous training, deployment, and monitoring of models in production environments.

Roles and Responsibilities

- Design, develop, and maintain robust AWS cloud infrastructure to support large-scale, data-intensive applications.
- Architect and implement scalable solutions using AWS services including EC2, S3, Lambda, RDS, IAM, VPC, and CloudFormation.
- Develop and manage MLOps pipelines to automate model training, validation, deployment, and monitoring processes.
- Ensure seamless integration between data pipelines and machine learning workflows for efficient model lifecycle management.
- Implement and optimize LLM-based solutions, including fine-tuning pre-trained models and deploying AI-driven services in production.
- Build and maintain CI/CD pipelines using Jenkins, GitLab CI, CircleCI, or AWS CodePipeline to automate testing and deployment.
- Adopt Infrastructure as Code (IaC) practices with Terraform, AWS CDK, or CloudFormation for consistent, repeatable infrastructure provisioning.
- Ensure high standards of security, scalability, and performance across all cloud environments.
- Monitor and analyze cloud environments using AWS CloudWatch, ELK Stack, Prometheus, or Grafana to ensure operational stability.
- Collaborate with data scientists, developers, and architects to deliver reliable and automated MLOps solutions.
- Mentor and guide junior engineers, promoting best practices and continuous improvement within the engineering team.
- Drive technical innovation, exploring new tools, frameworks, and AWS features to enhance AI and cloud capabilities.

Requirements

- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
- 5+ years of experience in cloud engineering or DevOps, with at least 2 years in MLOps and model deployment environments.
- Proven hands-on experience with AWS cloud infrastructure and core services (EC2, S3, RDS, Lambda, VPC, IAM, CloudFormation).
- Strong proficiency in MLOps frameworks and pipeline orchestration for training, validation, and deployment automation.
- Experience working with Large Language Models (LLMs), including fine-tuning, optimization, and production deployment.
- Solid understanding of CI/CD processes using Jenkins, GitLab CI, or AWS CodePipeline.
- Practical experience with Infrastructure as Code (IaC) tools such as Terraform, AWS CDK, or CloudFormation.
- Knowledge of monitoring and observability tools (CloudWatch, ELK, Prometheus, Grafana).
- Familiarity with security best practices in AWS and DevOps environments.
- Excellent problem-solving, analytical, and communication skills with a passion for automation and AI innovation.