Carnegie Mellon University is a private, global research university that stands among the world’s most renowned educational institutions. With ground-breaking brain science, path-breaking performances, creative start-ups, big data, big ambitions, hands-on learning, and a whole lot of robots, CMU doesn’t imagine the future; we invent it. If you’re passionate about joining a community that challenges the curious to deliver work that matters, your journey starts here!

The Pittsburgh Supercomputing Center (PSC), a joint research center of Carnegie Mellon University and the University of Pittsburgh, was established in 1986. For over 30 years, it has provided university, government, and industrial researchers with access to several of the most powerful systems for advanced computational research, communications, and data storage available to scientists, engineers, and scholars nationwide for unclassified research. PSC advances science across a wide spectrum of fields, including artificial intelligence and machine learning, medical imaging, weather modeling, cell biology, and genomics.

PSC is seeking a creative and capable individual to join our highly experienced Advanced Systems and Operations team as our new Cyber Infrastructure Engineer. The successful candidate will collaborate with the research community and key IT staff to create robust solutions by designing and implementing next-generation infrastructure. They will be responsible for planning and managing automation while also maintaining current PSC resources. This is an exciting opportunity to join a growing team at the nexus of technology, research, and software development dedicated to helping the scientific community solve challenging and complex problems.

The Advanced Systems and Operations group within PSC is responsible for the integration and operations of computational assets central to this pursuit. We are looking for creative and capable individuals to join an experienced team and continue our part in pushing forward the boundaries of science.

Core Responsibilities:

  • Conceive, design, implement, administer, optimize, and monitor existing and future HPC systems and the cyber infrastructure that supports them.
  • Collaborate with research teams and lead technology responsibilities for grant implementation.
  • Build relationships with the research community and external vendors.
  • Assess and provide advanced technical support to the research community for ongoing and future systems and/or application needs.
  • Contribute to best practices, documentation, and published papers.
  • Perform proactive and reactive performance analysis, monitoring, troubleshooting, and resolution of issues.
  • Engineer tools and automations to assist with maintenance tasks.
  • Research and explore next-generation technologies for future implementation.
  • Perform operating system software upgrades, deployments, and troubleshooting for project servers and desktop workstations.
  • Other related duties as assigned.

Adaptability, excellence, and passion are vital qualities within Carnegie Mellon University. We are in search of a team member who can effectively interact with a varied population of internal and external partners at a high level of integrity. We are looking for someone who shares our values and who will support the mission of the university through their work.

Qualifications:

  • Bachelor’s degree or equivalent experience.
  • Experience with Linux systems administration.
  • Experience with file system administration.
  • Experience with network switching and routing.
  • Experience with scripting languages (e.g. Python, Bash).
  • Desire to teach, learn, and lead for continuing team development.
  • A combination of education and relevant experience from which comparable knowledge is demonstrated may be considered.

Preferred Skills and Experience:

  • Experience writing and/or extending systems administration software (e.g. utilities, libraries, plugins).
  • Experience with parallel file systems (e.g. Lustre, GPFS).
  • Experience in a high-performance computing (HPC) environment.
  • Experience with configuration management software such as Puppet, Chef, or Ansible for systems and networking.
  • Experience with virtualization management infrastructure such as oVirt, VMware, or KVM.
  • Experience with cloud services such as AWS, GCP, OpenStack, or Azure.
  • Experience with containerized execution such as Singularity, Docker, and Kubernetes.

Requirements:

  • Successful background check

Are you interested in this exciting opportunity? Please apply