Machine Learning Triage Engineer
3 days ago
WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond.
Together, we advance your career.
The Role
Join AMD as a Machine Learning Triage Engineer and become the first line of defense in maintaining the quality, reliability, and performance of AMD's ROCm software stack. In this critical role, you'll perform initial assessment, diagnosis, root cause analysis, and assignment of defects reported across different devices and operating systems. You'll collaborate closely with kernel/compiler/runtime developers, driver teams, and CI/CD infrastructure engineers to ship high-quality releases that meet our correctness, conformance, and performance goals.
Key Responsibilities
- Initial investigation: Analyze incoming bug reports, failed tests, customer feedback, and other issues to understand the problem.
- Issue diagnosis: Determine whether the problem is a product defect, an automation issue, or a test environment issue.
- Root cause analysis: Perform in-depth investigation to identify the root cause of a problem.
- Assignment: Route issues to the appropriate development team or engineer.
- Automation: Develop tools and scripts to automate parts of the triage and debugging process.
- Collaboration: Work with support, QA, and development teams to ensure efficient problem-solving.
Preferred Experience/Skills
- System Programming: Strong background in C, C++ or other system programming languages
- Python Proficiency: Demonstrated expertise in Python for scripting and automation
- Neural Networks: Familiarity with neural network architectures, preferably generative AI models (Transformers, Diffusion models)
- ML Frameworks: Familiarity with machine learning frameworks such as PyTorch, TensorFlow, ONNX, JAX, or similar
- Debugging Tools: Experience with debugging, logging, and issue-tracking tools
- Linux Development: Practical experience with testing and development in Linux environments
Bonus
- Scripting: Knowledge of Bash, PowerShell, or similar scripting languages
- GPU Programming: Familiarity with GPU programming (CUDA, HIP, OpenCL, Vulkan) or high-performance computing
- Open-source: Experience contributing to open-source projects
- Windows Development: Experience with testing and development on Windows platforms
Academic Credentials
- Bachelor's or Master's degree in Computer Science, ML, Computer Engineering, or a related field
Benefits offered are described: AMD benefits at a glance.
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.