Developing the next version of HPN-SSH:
The developers of HPN-SSH at the Pittsburgh Supercomputing Center (PSC) have recently received a grant from the National Science Foundation to develop and incorporate new features and optimizations. This grant will provide direct support to developers at PSC for two years. The goal of this grant (NSF Award#: 2004012) is to provide HPN-SSH with the level of performance required in modern high performance computing.
What is HPN-SSH?
HPN-SSH is a series of modifications to OpenSSH, the predominant implementation of the ssh protocol. It was originally developed to address performance issues when using ssh on high speed long distance networks (also known as Long Fat Networks: LFNs). By taking advantage of automatically optimized receive buffers HPN-SSH could improve performance dramatically on these paths. Later advances include; disabling encryption after authentication to transport non-sensitive bulk data, modifying the AES-CTR cipher to use multiple CPU cores, more detailed connection logging, and peak throughput values in the scp progress bar. More information can be found on HPN-SSH page on the PSC website.
What are you working on?
We’ve identified six different areas where we would like to focus our efforts. This is not meant to be an exhaustive list but is more of a starting point for our deliverables. Depending on community input this list may change to develop advances of highest interest. The six initial proposed areas of work are:
- Automatically resume failed transfers: There is nothing quite as frustrating as having scp or sftp fail in the middle of a large transfer. Currently ssh does not have a mechanism to allow for failed transfers to restart from the point of failure. HPN-SSH is proposing to develop a mechanism to reliably resume failed transfers. We expect to do this by computing a hash of the partial file and compare it to a corresponding byte range of the original file. If these match then HPN-SSH will append the missing information to the partial file. If they do not match then the entire file will be transferred.
- Incorporate AES-NI into the AES-CTR cipher: The AES-NI instruction set is a set of on die instructions that use hardware acceleration to increase the performance of common AES functions. The result is that on CPUs that support AES-NI the default AES-CTR cipher is faster than HPN-SSH’s multithreaded cipher. We will work on incorporating AES-NI into the multithreaded cipher. We expect that this will allow for faster transfers when ssh is CPU bound.
- Parallelization of CHACHA20 cipher: CHACHA20 is a fast secure cipher that is the current default for OpenSSH. Initial investigation indicates that CHACHA20 can be transformed into a multithreaded cipher. This will allow the workload to be distributed across more CPU cores and should allow for faster transfers. We believe this will be important in situations where multiple users are simultaneously transferring files to the same host.
- Inline Network Telemetry: Sometimes figuring out why a ssh connection is underperforming is a difficult task. To help with diagnostics HPN-SSH will deploy network telemetry. In this diagnostic mode both the client and server will periodically query network statistics (such as retransmits, out of order packets, time spent buffer limited, and so forth) and store this data for analysis. This data may also be periodically displayed to the user. Initially this will be limited to linux installations where we have access to the TCP_INFO struct.
- Pipelining HMAC generation: The Hash-based Message Authentication Code (HMAC) is a one way cryptographic hash used by ssh to ensure that a datagram has not been modified en-route between the hosts. This ensures that the data has not been subjected to a man in the middle attack. In OpenSSH this is a step in a very linear process. No other work can be conducted (such as encrypting other data) while the HMAC is being computed. In many cases this can act a bottleneck on throughput. HPN-SSH is proposing to pipeline this process in order to mitigate this bottleneck as much as possible.
- Packaging and Distribution: HPN-SSH was, for a very long time, only available as a series of patches. Later it became a github repo. This turned out to be a non-optimal method of distributing HPN-SSH to the public. With this in mind we will be working to provide precompiled packages for a variety of operating systems and Linux distributions and the creation of canonical package repositories (such as PPAs). We will also be reaching out to distribution maintainers to make HPN-SSH an option for all of their users.
What can I do to help?
Please start by taking the HPN-SSH User survey.
First and foremost we would like to have the community help us figure out what our priorities should be. While we will address each of the above areas we don’t have a fixed schedule of what we should work on first. So if the community tells us (via our survey) that resuming failed transfers would have the biggest impact we will work on that first. Second, we need people who are willing to try the new versions of HPN-SSH and give us feedback on how they are working by filing bug reports, giving us suggestions for improvements, and so forth. We really want this next phase of HPN-SSH development to be a joint and iterative process between the developers and the users.
Also, please join our HPN-SSH community mailing list. This is a list for both developers and users.
This work has been made possible by a grant from the Nation Science Foundation