Description
WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.
THE ROLE:
Do you want to develop communication libraries that enable high-performance computing and machine learning workloads at exascale? AMD is searching for talented and motivated mathematicians, scientists, and engineers to develop GPU libraries as part of the AMD Radeon Open Ecosystem (ROCm).
THE PERSON:
You are accustomed to working in a dynamic, geographically distributed agile team, where partnership and collaboration are paramount. You possess excellent written and verbal communication skills, strong attention to detail, and the ability to express your work in a clear, cohesive fashion. You are results-oriented and accustomed to tight deadlines and changing priorities. Most importantly, you are constantly thinking of ways to improve performance of software and hardware.
KEY RESPONSIBILITIES:
- Support AMD's RCCL, an open source, GPU-accelerated collective communication middleware, and related technologies
- Design, implement, and test algorithms for multi-GPU and multi-node communication libraries.
- Benchmark, profile and optimize code to maximize throughput on single-GPU, multi-GPU and clustered systems
- Deliver high-quality code and documentation following best practices for open source software development
- Work with key technical experts across AMD and with our partners and customers to improve ROCm applications, libraries, and tools
PREFERRED EXPERIENCE:
- Strong background developing applications and libraries in C, C++, and Python
- GPU software development using HIP, CUDA, or OpenCL
- Experience with communication middleware
- Experience with data transfer technologies, such as RDMA, InfiniBand, and libfabric
- Understanding of CPU and GPU architectures and low-level optimization techniques including assembly programming and/or vectorization
- Parallel programming experience using OpenMP, MPI
- In-depth knowledge of best practices in software development, including testing, profiling, debugging, documentation, version control, issue tracking, and planning
- Contributions to open source libraries and applications
ACADEMIC CREDENTIALS:
- B.Sc. or B.Eng. degree in Computer Science, Software Engineering, Electrical Engineering, or equivalent
- Advanced degrees, such as M.Sc., M.Eng., or Ph.D., are preferred
This role is not eligible for visa sponsorship.
THE ROLE:
In this role, you will provide quality support to our development team for a library enabling GPU and multicore operations that power AI, LLM, and deep learning applications. You will be responsible for developing and executing comprehensive test strategies for our open-source, C++-based library, leveraging your expertise in test automation, continuous integration, and quality assurance processes. You will work closely with developers to ensure the stability, reliability, and performance of the library through both automated tests and hands-on testing.
THE PERSON:
We are seeking a talented and motivated Developer with an eye for Quality to join our team. If you're passionate about high-quality code and test-driven development, this is an excellent opportunity to make a significant impact.
KEY RESPONSIBILITIES:
- Test Automation Development: Design, implement, and maintain automated test suites using Google Test (gtest) for an open-source, C++-based library.
- CI/CD Integration: Integrate test automation frameworks into the Jenkins pipeline, ensuring seamless execution of tests and rapid feedback for developers.
- Performance Testing: Conduct performance testing to ensure the library meets necessary performance benchmarks and can scale as needed. Investigate performance regressions, and help establish baseline performance tests.
- Bug Detection & Reporting: Identify, isolate, and report defects found during testing and work with developers to prioritize and resolve issues.
- Continuous Improvement: Continuously improve the test infrastructure and methodologies, proposing tools or techniques that can improve the testability of the codebase.
- Collaboration & Documentation: Work with cross-functional teams, document test results, and assist in creating user-friendly reports that communicate the quality status of the project.
- Test Planning: Collaborate with developers and product teams to define test strategies, test cases, and acceptance criteria for new features and enhancements in the library.
- Code Coverage: Develop and analyze code coverage solutions, identify gaps, and drive improvements in both test coverage and quality.
PREFERRED EXPERIENCE:
- Proven experience as an SDET or in a similar role with a focus on C++ development and testing.
- Strong experience with GoogleTest (gtest) for unit and integration testing in C++ environments.
- Hands-on experience with Jenkins for automating test execution and integrating tests into the CI/CD pipeline.
- In-depth knowledge of software testing methodologies, frameworks, and tools for automated testing.
- Proficiency in C++ programming, with a strong understanding of memory management, data structures, and algorithms.
- Experience working in Linux-based environments for development and testing.
- Familiarity with Docker and containerization technologies for managing test environments and ensuring consistent test execution.
- Familiarity with version control systems like Git, as well as development tools and practices used in open-source communities.
- Solid understanding of performance testing, including profiling, benchmarking, and analyzing results.
- Excellent problem-solving skills and a proactive approach to testing and debugging.
- Strong written and verbal communication skills with the ability to collaborate effectively with both technical and non-technical teams.
ACADEMIC CREDENTIALS:
- B.Sc. or B.Eng. degree in Computer Science, Software Engineering, Electrical Engineering, Applied Mathematics, or equivalent
LOCATION: Seattle, Washington
#LI-DR1
#LI-HYBRID
Benefits offered are described in AMD benefits at a glance.
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.
AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD's “Responsible AI Policy” is available here.
This posting is for an existing vacancy.