
Re: GSOC Project discussion : Async sftp implementation

Subject: Re: GSOC Project discussion : Async sftp implementation
From: Jakub Jelen <jjelen@xxxxxxxxxx>
Reply-to: libssh@xxxxxxxxxx
Date: Tue, 7 Mar 2023 12:22:45 +0100
To: Eshan Kelkar <eshankelkar@xxxxxxxxxxxxx>, libssh@xxxxxxxxxx

On 3/7/23 10:58, Eshan Kelkar wrote:

Hello Jakub,

    I would suggest starting from the benchmark code
    we have in tests/benchmarks/, which has an example of async download
    (but there is only the download part -- the upload part does not
    exist now).
As per your suggestion, I have read that example and have gone through the source code to understand how the current async read is working using sftp_async_read_begin and sftp_async_read. Please correct me if I am wrong, but the way I understand how the api currently works is like this.
First let me go through the synchronous read function, which will help us to better understand how the async read api works:
[sync_read refers to the sftp_read function in normal (blocking) mode in src/sftp.c]
sync_read(file_handle, buffer_to_read_into, count)
{
NOTE : In the text, "sftp session" refers to the session corresponding to the
file_handle received in the parameter.

//Phase-1 (Registering the request)
//----------------------------------------------
Step-1 : Get an id for this read request with respect to the sftp session
corresponding to the file whose handle is received in the parameter.
Step-2 : Pack an ssh_buffer with the id, file handle, offset and count of
bytes to read.
Step-3 : Write a packet using sftp_packet_write to send a read request along
with this ssh_buffer.
Step-4 : After a successful Step-3, free this ssh_buffer as it is no longer
needed and its data has been sent along with the packet.

//Phase-2 (Waiting for the request to get processed)
//----------------------------------------------
Now the packet gets sent along with the request id. The server processes it
and sends a packet back containing the same id as in the request.
The id is kept the same to match the request and response. The incoming
packets from the server are read using sftp_packet_read_and_dispatch, which
reads a packet and adds it to a queue of messages from the server
corresponding to this sftp session.
The waiting is done like this:
msg=NULL;
while(msg==NULL)
{
read_and_dispatch - read a response packet and add the message to the queue
sftp_dequeue - try to dequeue a message from the queue corresponding to
the id we got in Phase-1 and assign whatever it returns to the msg variable.
In case the queue doesn't contain any message with the id of Phase-1, the
dequeue function returns NULL, which gets assigned to msg, and the loop runs
again.
}//while(msg==NULL) ends

After we receive the message corresponding to the sent request:
If the message contains some data (i.e. msg->packet_type is SSH_FXP_DATA),
write that data to the location whose address has been received in the
parameter.
So the loop of Phase-2 is responsible for the waiting, which is essentially
reading the response packet, adding the message received in it to a queue
(corresponding to an sftp session), and then checking whether the response
received corresponds to the request sent (matching of request and response
based on the id of Phase-1). If not, the loop runs again.
}//sync_read ends


This sounds correct.
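
The id matching done by the Phase-2 loop described above can be illustrated with a small self-contained C sketch. It does not use libssh: `msg_queue`, `queue_push` and `queue_take_by_id` are hypothetical names standing in for the per-session message queue, for the enqueueing done after a packet is read, and for the lookup done by the dequeue step. Responses are pushed by hand here rather than read from a socket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one response message from the server. */
struct msg {
    uint32_t id;      /* request id echoed back by the server */
    int payload;      /* placeholder for the packet contents */
    struct msg *next;
};

/* Hypothetical per-session queue of responses nobody has claimed yet. */
struct msg_queue {
    struct msg *head;
};

/* Append a response to the queue (the role of the read-and-dispatch step).
 * Returns 0 on success, -1 if memory allocation fails. */
static int queue_push(struct msg_queue *q, uint32_t id, int payload)
{
    struct msg *m = malloc(sizeof(*m));
    struct msg **tail;

    assert(q != NULL);
    if (m == NULL) {
        return -1;
    }
    m->id = id;
    m->payload = payload;
    m->next = NULL;
    for (tail = &q->head; *tail != NULL; tail = &(*tail)->next) {
        /* walk to the end of the list */
    }
    *tail = m;
    return 0;
}

/* Remove and return the message carrying the given id, or NULL if no such
 * response has arrived yet (the role of the dequeue step). The caller
 * frees the returned message. */
static struct msg *queue_take_by_id(struct msg_queue *q, uint32_t id)
{
    struct msg **link;

    assert(q != NULL);
    for (link = &q->head; *link != NULL; link = &(*link)->next) {
        if ((*link)->id == id) {
            struct msg *found = *link;
            *link = found->next;
            found->next = NULL;
            return found;
        }
    }
    return NULL;
}
```

Because the lookup is by id rather than by position, a response that arrives before an earlier request's response simply stays in the queue until its own caller asks for it.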

This explanation of synchronous read is important to understand how the asynchronous read currently works. The asynchronous read api is divided into two functions, sftp_async_read_begin and sftp_async_read:
sftp_async_read_begin(file_handle, count)
{
Phase-1's code (registering the request by sending a packet) of sync_read comes here and returns the id for the request to the user, which he'll pass while calling sftp_async_read
}

sftp_async_read(file_handle, where_to_store, bytes_to_read, request_id)
{
Phase-2's code comes here: wait for the response message corresponding to the request with the request_id received in the parameter, then write the data which came in that message
}


Correct.

Now one problem I notice in this approach is that Phase-1's code for sending/registering a request involves the use of sftp_packet_write, and sftp_packet_write internally uses ssh_channel_write (defined in src/channels.c), and this ssh_channel_write is a synchronous write (it may block if unable to write to the channel).

The fact that the code is blocking does not imply it is synchronous, and vice versa. These are two separate things.

Blocking is "just" a "property" of the socket we are using. Moreover, it looks like SFTP does not work in non-blocking mode according to this issue (but as you can see in the functions sftp_async_read and sftp_read, there are some stubs for working with the non-blocking mode):


https://gitlab.com/libssh/libssh-mirror/-/issues/58
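
To illustrate that blocking is a property of the socket rather than of the API style, here is a minimal POSIX sketch that is independent of libssh. It assumes a Unix-like system, and `fill_until_would_block` is a hypothetical helper name. It switches one end of a socketpair to non-blocking mode and writes until the kernel send buffer is full, at which point write() fails with EAGAIN/EWOULDBLOCK instead of putting the caller to sleep.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write to a non-blocking socket that nobody reads
 * from until the kernel refuses more data. Returns the number of bytes
 * accepted before write() reported EAGAIN/EWOULDBLOCK, or -1 on any
 * other error. */
static long fill_until_would_block(void)
{
    int sv[2];
    char chunk[4096];
    long total = 0;
    int flags;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        return -1;
    }

    /* Non-blocking is a per-descriptor flag, set independently of how the
     * code above the socket is structured. */
    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags == -1 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    memset(chunk, 'x', sizeof(chunk));
    for (;;) {
        ssize_t n = write(sv[0], chunk, sizeof(chunk));

        if (n > 0) {
            total += n;
            continue;
        }
        if (n == -1 && errno == EINTR) {
            continue; /* interrupted by a signal, just retry */
        }
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            break; /* a blocking socket would have slept here instead */
        }
        total = -1; /* unexpected error */
        break;
    }

    close(sv[0]);
    close(sv[1]);
    return total;
}
```

With the O_NONBLOCK flag removed, the same loop would never return, because the last write() would wait for a reader that never comes. The calling code is identical in both cases; only the descriptor flag differs.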

The concept of a synchronous process (or polling process) is that it is based on calling poll from userspace, sending the requests and waiting for responses. Both of these involve writing to sockets, reading from sockets and polling the socket, which needs to switch context to the kernel space to handle these operations, switching back once the operation is ready, and then a lot of waiting, which will rapidly increase in case the remote host is far away. This is not very suitable if we are striving for high throughput and speeds for transfers such as in SFTP.

If you build the benchmarks under libssh, you can test the speeds yourself (against localhost it is not as informative as transferring the data across half of the country or the Earth, but it should give you the idea):


[jjelen@t490s obj (poll-block)]$ ./tests/benchmarks/benchmarks  -h localhost
ping RTT : 0.065000 ms
SSH request times : 0.134000 ms ; 0.094000 ms ; 0.063000 ms
SSH RTT : 0.097000 ms. Theoretical max BW (win=128K) : 1.319588 Gbps
parse error :
localhost : benchmark_raw_download : 746.649597 Mbps
localhost : benchmark_sync_sftp_upload : 9.228348 Mbps
localhost : benchmark_sync_sftp_download : 94.978531 Mbps
localhost : benchmark_async_sftp_download : 6.836124 Gbps
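
As a rough model of why the synchronous numbers above lag so far behind the asynchronous one (an assumption for illustration, not how the benchmark computes its figures): if each request must wait one round trip before the next one is sent, throughput is bounded by the chunk size divided by the RTT, and keeping N requests in flight raises that bound roughly N-fold until the link or the CPU saturates. The function name and the numbers in the note below are hypothetical.

```c
#include <assert.h>

/* Upper bound on throughput in bits per second when `in_flight` requests of
 * `chunk_bytes` each are kept outstanding and every response takes one
 * round trip of `rtt_seconds`. Ignores link bandwidth, protocol overhead
 * and CPU cost, so it is an upper bound only. */
static double pipelined_bound_bps(double chunk_bytes, double rtt_seconds,
                                  unsigned in_flight)
{
    assert(rtt_seconds > 0.0);
    return (chunk_bytes * 8.0 * (double)in_flight) / rtt_seconds;
}
```

For example, with a hypothetical 32 KiB chunk and a 50 ms round trip, one request at a time is capped at about 5.2 Mbps, while 64 requests in flight raise the cap to about 335 Mbps.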

Hence, according to me, sftp_async_read_begin may also block if, say, the channel corresponding to that sftp session is too saturated/dirty with pending writes.

Correct. But given that for the download we write just the requests, this is a very unlikely case, but certainly worth investigating. This is much more likely to happen on the sending side (for example with the async upload).

So technically this operation is not an asynchronous one; it is synchronous in the sense that the control will return only after the packet for registering the read request has been written to the underlying send buffer (corresponding to the channel).

So is this current state of async read acceptable ?

Yes, but the blocking properties would be worth investigating and improving if time permits anyway.

One may argue that the chances of blocking in case of registering a read request are less because we're sending less info in the packet: request id, from where we have to read and from what offset. But a scope for blocking still exists, and certainly this kind of approach won't work for async write, as in the write request we also send the data to write using sftp_packet_write, and in this case the chances of blocking are significant if too much data is to be sent.


Correct (as I mentioned above before I finished reading your whole message).

Kindly comment on my interpretation of the code, and answer whether the current state of async read is as desired or not. If not, please give a rough overview of how it should be and what is expected out of the async libssh api.

This async API exists only for the download, so the upload speeds (sftp_write) are several orders of magnitude slower. And we need to support this direction too, which is the main part of the project.

(For example - my interpretation of the async read is that the user issues a read request using an api function call, the function returns and he continues to do what he wants to do while the read request is handled by the api and the data received from the server is written to the buffer supplied by the user. The API has some means to communicate [via callbacks or some data structure to which the user has access] to the user about the state of the operation)

I do not think we want async to the extent that the user would be able to do anything and the stuff would just happen in the "background". The calling application still needs to drive the uploads/downloads, either via callbacks or by being in control of how many "concurrent" write requests are issued.
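
One way to picture an application that drives the transfer itself while bounding the number of outstanding requests is the sketch below. It is a self-contained simulation, not libssh code: `mock_file`, `mock_begin`, `mock_finish`, `download_pipelined` and `MAX_IN_FLIGHT` are hypothetical, and the mock simply records a length per request instead of talking to a server. For downloads, the libssh calls discussed in this thread that play the roles of `mock_begin` and `mock_finish` are sftp_async_read_begin and sftp_async_read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_IN_FLIGHT 4

/* Hypothetical mock of a remote file. */
struct mock_file {
    size_t size;                        /* total bytes available remotely */
    size_t requested;                   /* bytes already asked for */
    uint32_t next_id;                   /* id handed to the next request */
    size_t pending_len[MAX_IN_FLIGHT];  /* length promised per id slot */
};

/* Register a request for the next chunk and return its id. */
static uint32_t mock_begin(struct mock_file *f, size_t chunk)
{
    size_t left = f->size - f->requested;
    size_t len = left < chunk ? left : chunk;
    uint32_t id = f->next_id++;

    f->pending_len[id % MAX_IN_FLIGHT] = len;
    f->requested += len;
    return id;
}

/* "Receive" the response for a request id; returns the bytes delivered. */
static size_t mock_finish(struct mock_file *f, uint32_t id)
{
    return f->pending_len[id % MAX_IN_FLIGHT];
}

/* The application drives the transfer: it tops the window up to
 * MAX_IN_FLIGHT outstanding requests, collects the oldest response, and
 * repeats until everything requested has been received. */
static size_t download_pipelined(struct mock_file *f, size_t chunk,
                                 unsigned *peak_in_flight)
{
    uint32_t ids[MAX_IN_FLIGHT];
    unsigned head = 0;
    unsigned count = 0;
    size_t received = 0;

    assert(chunk > 0);
    *peak_in_flight = 0;
    while (f->requested < f->size || count > 0) {
        /* Issue new requests while there is room in the window. */
        while (count < MAX_IN_FLIGHT && f->requested < f->size) {
            ids[(head + count) % MAX_IN_FLIGHT] = mock_begin(f, chunk);
            count++;
        }
        if (count > *peak_in_flight) {
            *peak_in_flight = count;
        }
        /* Collect the oldest outstanding response. */
        received += mock_finish(f, ids[head]);
        head = (head + 1) % MAX_IN_FLIGHT;
        count--;
    }
    return received;
}
```

Nothing happens in the background here: every request and every response is the result of a call made by the application, and the window size is the knob that controls how many requests are "concurrent".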

Moving the whole logic to io_uring will certainly add some additional complexity and make it less compatible (as mentioned by others), so if we can do without that, that would probably be my preference.


Hope it helps. Let me know if you need any more clarification.

Regards,
Jakub

Thanks,
Eshan Kelkar

On Sun, Mar 5, 2023 at 5:30 PM Jakub Jelen <jjelen@xxxxxxxxxx> wrote:


    On 3/3/23 06:36, Eshan Kelkar wrote:
     > Hi, I am Eshan Kelkar and would like to create the async sftp for libssh
     > as the GSOC project. I have gone through liburing to understand what
     > async i/o is and how it is implemented using io_uring. So in this async
     > sftp implementation I believe we can place calls to the liburing api
     > functions from inside of the async sftp api functions so that things
     > occur asynchronously.
     >
     > Another approach that comes to my mind is that on a call to async sftp
     > api function a separate thread gets created which does the waiting and
     > all and on completion places a call to the callback function notifying
     > that the operation has occurred. The second approach is async
     > conceptually as the user of api can continue his job after the call as
     > the waiting occurs on the separate thread but this approach seems a bit
     > "naive" as for each api call a new thread gets created which is resource
     > expensive.
     >
     > Kindly comment on these two approaches and suggest any other approach
     > which you have in mind to implement the async sftp api, those
     > suggestions will help me prepare better before sending in the proposal.

    Hello Eshan,
    first of all, sorry for the late reply. I saw your message on IRC, but
    before I got back to reply, you were already away, so thank you for your
    patience in reaching out to us on other channels.

    The async SFTP implementation is one of our priorities and one of the
    more complicated tasks. I would suggest starting from the benchmark code
    we have in tests/benchmarks/, which has an example of async download
    (but there is only the download part -- the upload part does not
    exist now).

    I do not think it is a good idea to spawn more threads as it would
    require a lot of synchronization. The example in the benchmarks can run
    several download requests from a single thread, which can help saturate
    the network connection without the need for threads.

    I did not read much about io_uring yet, but it sounds like it solves the
    issues we have with speed of synchronous writes/reads caused by context
    switching, so this would be our preference. There are already some
    hints/comments in the following issues, so if you have more questions
    or comments, feel free to ask here or in either of the following
    issues:

    https://gitlab.com/libssh/libssh-mirror/-/issues/65
    https://gitlab.com/libssh/libssh-mirror/-/issues/124

    Regards,

    --
    Jakub Jelen
    Crypto Team, Security Engineering
    Red Hat, Inc.


--
Jakub Jelen
Crypto Team, Security Engineering
Red Hat, Inc.
