HPN-SSH: High performance SSH/SCP
HPN-SSH is a research project based at the Pittsburgh Supercomputing Center
Chris Rapier (PI), PSC; Michael Stevens, CMU; Benjamin Bennett, PSC; Mike Tasota, PSC/CMU
What is HPN-SSH?
HPN-SSH is a series of modifications to OpenSSH, the predominant implementation of the ssh protocol. It was originally developed to address performance issues when using ssh on high speed, long distance networks (also known as Long Fat Networks, or LFNs). By taking advantage of automatically optimized receive buffers, HPN-SSH could improve performance dramatically on these paths. Later advances include disabling encryption after authentication to transport non-sensitive bulk data, modifying the AES-CTR cipher to use multiple CPU cores, more detailed connection logging, and peak throughput values in the scp progress bar. More information can be found on the HPN-SSH page on the PSC website.
Developing the next version of HPN-SSH
The developers of HPN-SSH at the Pittsburgh Supercomputing Center (PSC) have recently received a grant from the National Science Foundation to develop and incorporate new features and optimizations. This grant will provide direct support to developers at PSC for two years. The goal of this grant (NSF Award#: 2004012) is to provide HPN-SSH with the level of performance required in modern high performance computing.
What are you working on?
We’ve identified six areas where we would like to focus our efforts. This is not meant to be an exhaustive list; it is a starting point for our deliverables. Depending on community input, the list may change so that we work on the advances of highest interest. The six initial proposed areas of work are:
- Automatically resume failed transfers: There is nothing quite as frustrating as having scp or sftp fail in the middle of a large transfer. Currently ssh does not have a mechanism that allows failed transfers to restart from the point of failure. HPN-SSH is proposing to develop a mechanism to reliably resume failed transfers. We expect to do this by computing a hash of the partial file and comparing it to the hash of the corresponding byte range of the original file. If these match, HPN-SSH will append the missing data to the partial file. If they do not match, the entire file will be transferred again.
- Incorporate AES-NI into the AES-CTR cipher: AES-NI is a set of on-die instructions that use hardware acceleration to increase the performance of common AES functions. As a result, on CPUs that support AES-NI the default AES-CTR cipher is faster than HPN-SSH’s multithreaded cipher. We will work on incorporating AES-NI into the multithreaded cipher. We expect that this will allow for faster transfers when ssh is CPU bound.
- Parallelization of CHACHA20 cipher: CHACHA20 is a fast secure cipher that is the current default for OpenSSH. Initial investigation indicates that CHACHA20 can be transformed into a multithreaded cipher. This will allow the workload to be distributed across more CPU cores and should allow for faster transfers. We believe this will be important in situations where multiple users are simultaneously transferring files to the same host.
- Inline Network Telemetry: Figuring out why an ssh connection is underperforming can be a difficult task. To help with diagnostics, HPN-SSH will deploy network telemetry. In this diagnostic mode both the client and server will periodically query network statistics (such as retransmits, out of order packets, time spent buffer limited, and so forth) and store this data for analysis. This data may also be periodically displayed to the user. Initially this will be limited to Linux installations, where we have access to the TCP_INFO struct.
- Pipelining HMAC generation: The Hash-based Message Authentication Code (HMAC) is a keyed cryptographic hash used by ssh to ensure that a packet has not been modified en route between the hosts. This protects the data against tampering, for example in a man-in-the-middle attack. In OpenSSH this is one step in a very linear process: no other work (such as encrypting other data) can be done while the HMAC is being computed. In many cases this can act as a bottleneck on throughput. HPN-SSH is proposing to pipeline this process in order to mitigate this bottleneck as much as possible.
- Packaging and Distribution: HPN-SSH was, for a very long time, only available as a series of patches. Later it became a GitHub repo. This turned out to be a non-optimal method of distributing HPN-SSH to the public. With this in mind, we will be working to provide precompiled packages for a variety of operating systems and Linux distributions and to create canonical package repositories (such as PPAs). We will also be reaching out to distribution maintainers to make HPN-SSH an option for all of their users.
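The resume check proposed in the "Automatically resume failed transfers" item can be illustrated with a minimal local sketch. This is not HPN-SSH code: the function names (hash_prefix, resume_copy), the choice of SHA-256, and the chunk size are all assumptions made for illustration, and both files are on the same machine here. In a real transfer each host would hash its own copy and only the digests would need to be exchanged.

```python
import hashlib
import os

CHUNK = 1 << 20  # read size in bytes (arbitrary choice for this sketch)


def hash_prefix(path, length):
    """Return the SHA-256 hex digest of the first `length` bytes of `path`."""
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            block = f.read(min(CHUNK, remaining))
            if not block:
                break
            digest.update(block)
            remaining -= len(block)
    return digest.hexdigest()


def resume_copy(source, partial):
    """Complete `partial` from `source`; return the offset copying started at.

    Hypothetical helper: resumes only when the partial file is a true
    prefix of the source, otherwise restarts from byte 0.
    """
    have = os.path.getsize(partial) if os.path.exists(partial) else 0
    total = os.path.getsize(source)

    # Compare the partial file against the same byte range of the source.
    if 0 < have <= total and hash_prefix(partial, have) == hash_prefix(source, have):
        offset, mode = have, "ab"   # hashes match: append the missing bytes
    else:
        offset, mode = 0, "wb"      # mismatch: transfer the entire file

    with open(source, "rb") as src, open(partial, mode) as dst:
        src.seek(offset)
        while True:
            block = src.read(CHUNK)
            if not block:
                break
            dst.write(block)
    return offset
```

Calling resume_copy("big.dat", "big.dat.part") returns a nonzero offset when the existing bytes were kept and 0 when the file had to be sent again from the start. The file names are placeholders.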
What can I do to help?
Join our HPN-SSH community mailing list
Stay up-to-date on progress and improvements to HPN-SSH by joining this list, intended for both developers and users.
Make a donation
If you care about HPN-SSH, there is no better way to show your support than making a donation to the Pittsburgh Supercomputing Center. I do not personally receive any money from these donations, but they do end up supporting our work. Any amount is worthwhile; even a dollar will show PSC and CMU your support for our work. Seriously, show your support in order to both keep HPN-SSH current and fund new improvements.
To support HPN-SSH, go to the PSC giving page at https://www.psc.edu/giving/ and click the “Give online” button. In the next window, choose “Add a designation” and note that it is to support HPN-SSH. Thank you!
Notes and News
All patch sets from 4.4p1 to 8.1p1 are now available on SourceForge at https://sourceforge.net/projects/hpnssh/. The entire codebase (merged with OpenSSH) is also available as a git repo from https://github.com/rapier1/openssh-portable. The SourceForge location now...
We are proud to announce that the HPN-SSH development team has received a grant from the National Science Foundation (Award#: 2004012) to continue development on HPN-SSH. This grant will be used to develop and incorporate new features and optimizations. This grant...
This work was made possible in part by grants from Cisco Systems, Inc., the National Science Foundation, and the National Library of Medicine.