Job Information
Cisco Principal Software Engineer, ML in Bangalore, India
Overview
As a Principal Software Engineer in the Artificial Intelligence group, you will play a crucial role in building and optimizing the core software infrastructure that powers AI-driven solutions. You will focus on architecting and deploying highly scalable, production-ready backend systems that support AI assistants, intelligent agents, and foundational AI services. Collaborating with machine learning engineers and cross-functional teams, you will drive best practices in software engineering, DevOps, Kubernetes-based deployments, and backend service development. Your expertise will be instrumental in accelerating AI innovation by ensuring robust, reliable, and efficient system operations.
Responsibilities
Design and implement high-performance backend architectures that seamlessly integrate with AI-powered products. Focus on building modular, fault-tolerant, and efficient services that support large-scale AI workloads while ensuring low-latency interactions between data pipelines, inference engines, and enterprise applications.
Develop robust model-serving APIs and containerized microservices that enable real-time AI inference and batch processing with high throughput and low latency.
Implement end-to-end monitoring, logging, and alerting solutions to ensure AI systems operate reliably at scale.
Improve scalability by designing distributed systems that efficiently handle AI workloads and inference pipelines.
Own Kubernetes-based deployments by developing and maintaining Helm charts, Kubernetes operators, and cloud-native workflows to streamline AI model deployment.
Automate infrastructure management using Infrastructure as Code (IaC) tools like Terraform or CloudFormation.
Optimize CI/CD pipelines for AI applications, ensuring smooth model retraining, testing, and deployment cycles.
Improve security and compliance by implementing best practices in access control, container security, and vulnerability management.
Partner closely with AI/ML teams to ensure seamless model integration into production environments.
Lead architecture discussions and provide strategic technical guidance on AI platform evolution.
Mentor and guide engineers to enhance team skills in backend development, DevOps, and cloud technologies.
Requirements
Strong backend development experience in Python (preferred) or Java, with expertise in building RESTful APIs, microservices, and event-driven architectures.
Deep understanding of Kubernetes and container orchestration, with experience in deploying AI/ML workloads at scale.
Expertise in DevOps and CI/CD pipelines, including experience with Jenkins, GitHub Actions, ArgoCD, or similar tools.
Cloud expertise (AWS/GCP/Azure), including hands-on experience with cloud-native services for AI workloads (e.g., S3, Lambda, EKS/GKE/AKS, DynamoDB, RDS).
Experience in performance tuning and system optimization for large-scale AI/ML workloads.
Proven ability to collaborate with ML engineers, data scientists, data engineers, and product teams to deliver AI-powered solutions efficiently.
Experience in technical leadership, driving architectural decisions, and mentoring engineers.
Strong problem-solving skills, with the ability to balance trade-offs between scalability, maintainability, and performance.
Preferred Experience
Prior experience working with AI/ML pipelines, model serving frameworks, or distributed AI workloads.
Experience in AI observability, monitoring model drift, and optimizing inference latency.
Understanding of cybersecurity, observability, or related domains to enhance AI-driven decision-making.
Splunk, a Cisco company, is an Equal Opportunity Employer and all qualified applicants will receive consideration for employment without regard to race, colour, religion, gender, sexual orientation, national origin, genetic information, age, disability, veteran status, or any other legally protected basis.