Company: AMD
Location: Secaucus, NJ
Career Level: Mid-Senior Level
Industries: Technology, Software, IT, Electronics

Description



WHAT YOU DO AT AMD CHANGES EVERYTHING 

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond.  Together, we advance your career.  



THE ROLE:

The Quality Engineering team is looking for an experienced GPU ASIC and PCBA Debug and Failure Analysis Engineering Manager to lead and develop a team of FA engineers. This role is intended for a proven people manager with prior experience building, mentoring, and guiding high-performing engineering teams, while also serving as a strong technical lead in GPU ASIC and board-level (PCBA) failure analysis. The individual will oversee customer and factory failure investigations for GPU accelerators, help drive failure reproduction and isolation, and work closely with cross-functional teams including design, validation, FW, and manufacturing to accelerate root cause analysis and corrective actions. Your contributions will directly impact team effectiveness, product quality, reliability, and customer satisfaction.

 

THE PERSON:

The ideal candidate is a strong people leader and technical expert who leads by example and is passionate about building, teaching, and mentoring a growing team of high-performing FA engineers. They bring prior experience managing, hiring, and developing engineers, creating an environment of accountability, collaboration, and continuous learning, while remaining hands-on enough to guide complex debug and failure analysis efforts in a fast-paced, time-to-market-driven environment. This person is a clear communicator and a trusted technical leader who can elevate team capability, help others grow in their careers, and drive strong execution. They combine deep analytical problem-solving skills with a practical, hands-on approach, and continuously look for ways to improve team effectiveness, technical depth, and overall quality outcomes.

 

KEY RESPONSIBILITIES:

  • Provide technical leadership for triage and debug of complex GPU and PCBA failures across power, ASIC, firmware, and thermals, guiding the FA team to root cause.
  • Lead failure reproduction and triage by defining debug plans, directing investigations, and guiding experiments and escalation paths for complex issues.
  • Drive debug automation, diagnostic tools, and data analysis methods that improve triage efficiency and consistency across failure domains.
  • Lead cross-functional triage with manufacturing partners and AMD teams to align on failure hypotheses, reproduction, and root cause.
  • Guide board-level debug using schematics, layouts, and design documentation to direct analysis and mentor engineers through the process.
  • Ensure clear documentation of failure analysis results, root cause findings, and corrective actions for customer and internal use.
  • Present technical findings, triage updates, risks, and recovery plans to stakeholders and senior leadership.
  • Drive continuous improvement of FA methods, triage processes, and best practices across power, ASIC, firmware, and thermal debug.
  • Manage and develop a team of FA engineers by setting priorities, providing technical guidance, and coaching through complex investigations.

PREFERRED EXPERIENCE:

  • Experience leading and developing engineering teams, with a strong track record of hiring, coaching, mentoring, and growing FA engineers.
  • Deep expertise in GPU ASIC debug, validation, and functional or stress test development.
  • Strong background in PCBA diagnostics, failure analysis, and board-level debug from NPI through production.
  • Experience leading triage across power, ASIC, firmware, and thermal failure domains.
  • Strong hands-on lab experience with oscilloscopes, logic analyzers, and custom debug tools.
  • Solid understanding of firmware, drivers, and hardware interactions in complex system debug.
  • Extensive experience in hardware verification, system integration, and failure reproduction.
  • Proficient in Python, shell scripting, and working across Windows and Linux environments.
  • Strong leadership, communication, and presentation skills, with the ability to teach, mentor, and lead by example.
  • Able to read schematics, interpret datasheets, identify components, and support board-level debug and rework.
  • Knowledge of high-speed digital design, HBM or GDDR memory, PCIe, and GPU data center systems is a plus.

 

ACADEMIC CREDENTIALS:

  • Bachelor's degree in Electrical Engineering, Computer Engineering, or a related field.
  • 3+ years of management experience.

 

LOCATION:

  • Secaucus, NJ


This role is not eligible for visa sponsorship.

 

#LI-AP2



Benefits offered are described in: AMD benefits at a glance.

 

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law.   We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.

 

AMD may use Artificial Intelligence to help screen, assess or select applicants for this position.  AMD's “Responsible AI Policy” is available here.

 

This posting is for an existing vacancy.

