Rewrite distant-ssh2 using russh (native Rust) #193

Open
chipsenkbeil opened this issue Jun 2, 2023 · 8 comments
Labels
refactor Refactor portions of codebase

@chipsenkbeil
Owner

There are a lot of problems with the ssh libraries we're using today. They're unreliable, error-prone, and inconsistent. And that is before counting the build complexity they introduce.

russh is a Rust-native implementation of ssh, which should ideally work as a client to other SSHD implementations. The core library is lacking sftp support, but we could use russh-sftp for inspiration, even though it only supports the server-side portion of sftp.

The specification for sftp (version 3) doesn't seem that complex to implement, so this could be worth pursuing.
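To give a feel for how small the wire format is, here is a rough sketch of sftp v3 packet framing: a uint32 length, a one-byte packet type, then the payload. The type codes (`SSH_FXP_INIT` = 1, `SSH_FXP_READ` = 5) come from the sftp v3 draft; the function names `frame_packet`, `encode_init`, and `encode_read` are made up for illustration and are not part of russh or russh-sftp.

```rust
// Packet type codes as listed in the sftp v3 draft.
const SSH_FXP_INIT: u8 = 1;
const SSH_FXP_READ: u8 = 5;

/// Wraps a packet type and payload in sftp framing:
/// uint32 length (type byte + payload), the type byte, then the payload.
/// (Hypothetical helper, not an existing library function.)
fn frame_packet(packet_type: u8, payload: &[u8]) -> Vec<u8> {
    let len = (payload.len() + 1) as u32;
    let mut out = Vec::with_capacity(4 + 1 + payload.len());
    out.extend_from_slice(&len.to_be_bytes());
    out.push(packet_type);
    out.extend_from_slice(payload);
    out
}

/// SSH_FXP_INIT carries only the client's protocol version.
fn encode_init(version: u32) -> Vec<u8> {
    frame_packet(SSH_FXP_INIT, &version.to_be_bytes())
}

/// SSH_FXP_READ: request id, file handle (string), offset (uint64),
/// and the maximum number of bytes to read (uint32).
fn encode_read(id: u32, handle: &[u8], offset: u64, len: u32) -> Vec<u8> {
    let mut payload = Vec::new();
    payload.extend_from_slice(&id.to_be_bytes());
    payload.extend_from_slice(&(handle.len() as u32).to_be_bytes());
    payload.extend_from_slice(handle);
    payload.extend_from_slice(&offset.to_be_bytes());
    payload.extend_from_slice(&len.to_be_bytes());
    frame_packet(SSH_FXP_READ, &payload)
}
```

Calling `encode_init(3)` would produce the first packet a client sends after opening the sftp subsystem. A real client also needs response parsing and request-id tracking, which this sketch leaves out.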

@chipsenkbeil
Owner Author

@baoyachi if you are asking if this is planned, then yes, it is. It is in the 1.0 milestone, meaning that it could be worked on at any point between now and the release of 1.0.

@chipsenkbeil
Owner Author

Also look at https://www.rfc-editor.org/rfc/rfc4254#page-14, the connection protocol portion of the ssh v2 spec. It covers things like extended data, which is how stderr is carried over a channel.
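As a rough illustration of the extended data mechanism, the sketch below separates stdout from stderr when parsing channel messages. The message numbers (`SSH_MSG_CHANNEL_DATA` = 94, `SSH_MSG_CHANNEL_EXTENDED_DATA` = 95) and the stderr type code (`SSH_EXTENDED_DATA_STDERR` = 1) come from RFC 4254; the `ChannelOutput` enum and the parsing helpers are hypothetical names for this example, not russh types. It assumes the message has already been decrypted and unpacked from the transport layer.

```rust
// Message numbers and the stderr type code from RFC 4254.
const SSH_MSG_CHANNEL_DATA: u8 = 94;
const SSH_MSG_CHANNEL_EXTENDED_DATA: u8 = 95;
const SSH_EXTENDED_DATA_STDERR: u32 = 1;

/// Hypothetical result type for this sketch.
#[derive(Debug, PartialEq)]
enum ChannelOutput {
    Stdout { channel: u32, data: Vec<u8> },
    Stderr { channel: u32, data: Vec<u8> },
}

/// Reads a big-endian uint32 at `pos`, advancing `pos` on success.
fn read_u32(buf: &[u8], pos: &mut usize) -> Option<u32> {
    let end = pos.checked_add(4)?;
    let bytes = buf.get(*pos..end)?;
    let mut arr = [0u8; 4];
    arr.copy_from_slice(bytes);
    *pos = end;
    Some(u32::from_be_bytes(arr))
}

/// Reads a length-prefixed ssh string at `pos`, advancing `pos` on success.
fn read_string(buf: &[u8], pos: &mut usize) -> Option<Vec<u8>> {
    let len = read_u32(buf, pos)? as usize;
    let end = pos.checked_add(len)?;
    let bytes = buf.get(*pos..end)?;
    *pos = end;
    Some(bytes.to_vec())
}

/// Returns None for truncated input, other message types,
/// or extended data with a type code other than stderr.
fn parse_channel_output(msg: &[u8]) -> Option<ChannelOutput> {
    let mut pos = 1; // skip the message-type byte
    match *msg.first()? {
        SSH_MSG_CHANNEL_DATA => {
            let channel = read_u32(msg, &mut pos)?;
            let data = read_string(msg, &mut pos)?;
            Some(ChannelOutput::Stdout { channel, data })
        }
        SSH_MSG_CHANNEL_EXTENDED_DATA => {
            let channel = read_u32(msg, &mut pos)?;
            let code = read_u32(msg, &mut pos)?;
            let data = read_string(msg, &mut pos)?;
            if code == SSH_EXTENDED_DATA_STDERR {
                Some(ChannelOutput::Stderr { channel, data })
            } else {
                None
            }
        }
        _ => None,
    }
}
```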

@chipsenkbeil
Owner Author

Also see https://github.com/Miyoshi-Ryota/async-ssh2-tokio/blob/main/src/client.rs as an example of authentication, server validation, and process execution. If we can extend this to support sftp, it should cover all of our needs.

@chipsenkbeil
Owner Author

chipsenkbeil commented Aug 31, 2023

Check out https://www.rfc-editor.org/rfc/rfc4251#page-8 for data type formats.

Data Type Representations Used in the SSH Protocols

byte

  A byte represents an arbitrary 8-bit value (octet).  Fixed length
  data is sometimes represented as an array of bytes, written
  byte[n], where n is the number of bytes in the array.

boolean

  A boolean value is stored as a single byte.  The value 0
  represents FALSE, and the value 1 represents TRUE.  All non-zero
  values MUST be interpreted as TRUE; however, applications MUST NOT
  store values other than 0 and 1.

uint32

  Represents a 32-bit unsigned integer.  Stored as four bytes in the
  order of decreasing significance (network byte order).  For
  example: the value 699921578 (0x29b7f4aa) is stored as 29 b7 f4
  aa.

uint64

  Represents a 64-bit unsigned integer.  Stored as eight bytes in
  the order of decreasing significance (network byte order).

string

  Arbitrary length binary string.  Strings are allowed to contain
  arbitrary binary data, including null characters and 8-bit
  characters.  They are stored as a uint32 containing its length
  (number of bytes that follow) and zero (= empty string) or more
  bytes that are the value of the string.  Terminating null
  characters are not used.

  Strings are also used to store text.  In that case, US-ASCII is
  used for internal names, and ISO-10646 UTF-8 for text that might
  be displayed to the user.  The terminating null character SHOULD
  NOT normally be stored in the string.  For example: the US-ASCII
  string "testing" is represented as 00 00 00 07 t e s t i n g.  The
  UTF-8 mapping does not alter the encoding of US-ASCII characters.

mpint

  Represents multiple precision integers in two's complement format,
  stored as a string, 8 bits per byte, MSB first.  Negative numbers
  have the value 1 as the most significant bit of the first byte of
  the data partition.  If the most significant bit would be set for
  a positive number, the number MUST be preceded by a zero byte.
  Unnecessary leading bytes with the value 0 or 255 MUST NOT be
  included.  The value zero MUST be stored as a string with zero
  bytes of data.

  By convention, a number that is used in modular computations in
  Z_n SHOULD be represented in the range 0 <= x < n.

     Examples:

     value (hex)        representation (hex)
     -----------        --------------------
     0                  00 00 00 00
     9a378f9b2e332a7    00 00 00 08 09 a3 78 f9 b2 e3 32 a7
     80                 00 00 00 02 00 80
     -1234              00 00 00 02 ed cc
     -deadbeef          00 00 00 05 ff 21 52 41 11

name-list

  A string containing a comma-separated list of names.  A name-list
  is represented as a uint32 containing its length (number of bytes
  that follow) followed by a comma-separated list of zero or more
  names.  A name MUST have a non-zero length, and it MUST NOT
  contain a comma (",").  As this is a list of names, all of the
  elements contained are names and MUST be in US-ASCII.  Context may
  impose additional restrictions on the names.  For example, the
  names in a name-list may have to be a list of valid algorithm
  identifiers (see Section 6 below), or a list of RFC3066 language
  tags.  The order of the names in a name-list may or may not be
  significant.  Again, this depends on the context in which the list
  is used.  Terminating null characters MUST NOT be used, neither
  for the individual names, nor for the list as a whole.

   Examples:

   value                      representation (hex)
   -----                      --------------------
   (), the empty name-list    00 00 00 00
   ("zlib")                   00 00 00 04 7a 6c 69 62
   ("zlib,none")              00 00 00 09 7a 6c 69 62 2c 6e 6f 6e 65
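A minimal sketch of encoders for the types quoted above, checked against the RFC's own examples. The function names (`put_u32`, `put_string`, `put_name_list`, `put_mpint`) are made up for this example, and `put_mpint` only handles values that fit in an `i64`, which is an assumption of the sketch rather than anything the RFC requires.

```rust
/// uint32: four bytes, network byte order.
fn put_u32(out: &mut Vec<u8>, value: u32) {
    out.extend_from_slice(&value.to_be_bytes());
}

/// string: uint32 length followed by the raw bytes, no terminating null.
fn put_string(out: &mut Vec<u8>, bytes: &[u8]) {
    put_u32(out, bytes.len() as u32);
    out.extend_from_slice(bytes);
}

/// name-list: a string holding comma-separated names.
/// This sketch does not validate that names are non-empty, comma-free US-ASCII.
fn put_name_list(out: &mut Vec<u8>, names: &[&str]) {
    put_string(out, names.join(",").as_bytes());
}

/// mpint: two's complement, MSB first, with unnecessary leading
/// 0x00 / 0xff bytes removed. Zero is a string with no data bytes.
fn put_mpint(out: &mut Vec<u8>, value: i64) {
    if value == 0 {
        put_string(out, &[]);
        return;
    }
    let bytes = value.to_be_bytes();
    let mut start = 0;
    while start + 1 < bytes.len() {
        // A leading 0x00 is redundant when the next byte's top bit is clear;
        // a leading 0xff is redundant when the next byte's top bit is set.
        let redundant_zero = bytes[start] == 0x00 && (bytes[start + 1] & 0x80) == 0;
        let redundant_ff = bytes[start] == 0xff && (bytes[start + 1] & 0x80) != 0;
        if !(redundant_zero || redundant_ff) {
            break;
        }
        start += 1;
    }
    put_string(out, &bytes[start..]);
}
```

Each function appends to a shared `Vec<u8>`, so a full message can be built by calling them in field order.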

@chipsenkbeil
Owner Author

chipsenkbeil commented Aug 31, 2023

Some parts of sftp were wrapped to be more convenient to use, particularly reading a file, which sftp itself exposes as a read request that takes an offset and a maximum length to read. Two things:

  1. I'd been considering refactoring reading to be a stream, so maybe that's something we do to support reading subsets of a file. See [Investigate] Switch directory retrieval and file reading to be streams #164.
  2. If we want to support reading everything, we need to wrap this.

See how libssh does it by checking out sftp_read and sftp_readdir.

Our own wrapper around wezterm-ssh relies on its AsyncRead implementation: it calls read_to_string, which keeps reading until EOF is reached, pulling up to the buffer's length in bytes on each read. We can probably keep that logic as long as our new wrapper also supports AsyncRead.
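The "read everything" wrapper boils down to a loop over offset-based reads. Below is a synchronous sketch of that loop, kept free of async so it stands alone; the same logic could sit behind an AsyncRead implementation. The `ReadAt` trait and `InMemoryFile` type are hypothetical stand-ins for an sftp file handle, and modelling EOF as `Ok(None)` is an assumption of this example (on the wire, sftp v3 reports EOF through a status response).

```rust
use std::io;

/// Hypothetical stand-in for an sftp file handle: returns up to `max_len`
/// bytes starting at `offset`, or `None` once the offset is at or past EOF.
trait ReadAt {
    fn read_at(&mut self, offset: u64, max_len: u32) -> io::Result<Option<Vec<u8>>>;
}

/// Reads the whole file by issuing chunked reads until EOF.
/// Short reads are fine: the offset advances by what was actually returned.
fn read_to_end<R: ReadAt>(file: &mut R, chunk_len: u32) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    let mut offset = 0u64;
    loop {
        match file.read_at(offset, chunk_len)? {
            None => return Ok(out),
            // Guard against looping forever on a source that makes no progress.
            Some(chunk) if chunk.is_empty() => {
                return Err(io::Error::new(
                    io::ErrorKind::InvalidData,
                    "empty chunk without EOF",
                ));
            }
            Some(chunk) => {
                offset += chunk.len() as u64;
                out.extend_from_slice(&chunk);
            }
        }
    }
}

/// In-memory source that returns at most 3 bytes per call to mimic short reads.
struct InMemoryFile(Vec<u8>);

impl ReadAt for InMemoryFile {
    fn read_at(&mut self, offset: u64, max_len: u32) -> io::Result<Option<Vec<u8>>> {
        let start = offset as usize;
        if start >= self.0.len() {
            return Ok(None);
        }
        let end = (start + (max_len as usize).min(3)).min(self.0.len());
        Ok(Some(self.0[start..end].to_vec()))
    }
}
```

Reading a subset of a file, as in the stream idea from #164, would use the same `read_at` call with a caller-chosen starting offset and a byte budget instead of looping to EOF.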

@baoyachi

baoyachi commented Sep 1, 2023

Good. 👍

@AspectUnk

AspectUnk/russh-sftp#4 (comment) if you need any help with russh integration let me know, I'm very interested in your project

@chipsenkbeil
Owner Author

> AspectUnk/russh-sftp#4 (comment) if you need any help with russh integration let me know, I'm very interested in your project

Thanks! Still haven't gotten to it yet. Was reading how to handle a pty using russh since I also need to support that.
