
RE: ssh_connect blocking vs. non-blocking?


Thanks for the reference.  The app does not currently implement the thread callback.  I'll give this a try.

Normally, we receive libssh logs.  In the cases where the app crashes during an ssh_connect attempt, we receive nothing at all.

Thanks,
Jonathan

-----Original Message-----
From: Aris Adamantiadis [mailto:aris@xxxxxxxxxxxx]
Sent: Monday, November 19, 2012 10:10 AM
To: libssh@xxxxxxxxxx
Subject: Re: ssh_connect blocking vs. non-blocking?

Hi Jonathan,

Before looking any further: do you initialize threading in libssh as described here?

http://api.libssh.org/master/libssh_tutor_threads.html
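
To illustrate the ordering that page asks for, here is a minimal sketch of "initialize the library once, before any worker thread uses it". It does not call libssh: library_global_init() is a hypothetical stand-in for the one-time setup (as I understand the tutorial, that is ssh_threads_set_callbacks() followed by ssh_init()), and worker(), run_workers() and the counters are illustrative names only.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the library's one-time setup. In a real libssh
// application, the calls from the threading tutorial would go here.
static std::atomic<int> g_init_calls{0};
static void library_global_init() { ++g_init_calls; }

// Runs the one-time setup exactly once, however many threads ask for it.
static std::once_flag g_init_once;
void ensure_library_initialized() {
    std::call_once(g_init_once, library_global_init);
}

// Hypothetical worker: stands in for "create a session and connect".
static void worker(std::atomic<int>& connected) {
    ensure_library_initialized();  // no-op after the first successful call
    ++connected;                   // placeholder for the real session work
}

// Starts n workers, waits for them, and returns how many completed.
int run_workers(int n) {
    assert(n >= 0);
    ensure_library_initialized();  // preferred: set up before any thread exists
    std::atomic<int> connected{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) {
        threads.emplace_back(worker, std::ref(connected));
    }
    for (auto& t : threads) {
        t.join();
    }
    return connected.load();
}
```

Calling run_workers(11) mirrors the 11 simultaneous sessions discussed in this thread. The std::call_once guard is a design choice for the sketch: it keeps the setup safe even if a worker is reached before the main thread has initialized anything.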

I still want to emphasize the need for a backtrace. Without it we cannot really understand what happened.

By the way, which platform are you using libssh on (Linux, Windows, ...)?

Thanks,

Aris

On 19/11/12 16:55, Jonathan Walker wrote:
> We synced up to and built the code base from the master branch on 8-22-2012, because we needed a fix that was made but no official release was yet available.  Our application is calling "ssh_connect" from multiple threads.
>
> On an attempt to start 11 simultaneous sessions, our application crashed.  I started the application from the command line and received no libssh logs (even with the FUNCTIONS level enabled).  This suggests that something is going badly wrong, since ssh_connect should be printing logs (and usually does when it succeeds).
>
> Our application log shows the following:
>
> 08:34:08.493 ( 4136: 2348) [Agent] start
> 08:34:08.495 ( 4136: 1768) [Agent] start
> 08:34:08.497 ( 4136: 7152) [Agent] start
> 08:34:08.498 ( 4136: 7616) [Agent] start
> 08:34:08.499 ( 4136: 7056) [Agent] start
> 08:34:08.499 ( 4136: 6036) [Agent] start
> 08:34:08.499 ( 4136: 7532) [Agent] start
> 08:34:08.499 ( 4136: 8160) [Agent] start
> 08:34:08.500 ( 4136: 3156) [Agent] start
> 08:34:08.500 ( 4136: 5520) [Agent] start
> 08:34:08.501 ( 4136: 6320) [Agent] start
> 08:34:08.517 ( 4136: 2348) [Agent::Connect] start
> 08:34:08.517 ( 4136: 5520) [Agent::Connect] start
> 08:34:08.517 ( 4136: 3156) [Agent::Connect] start
> 08:34:08.517 ( 4136: 6320) [Agent::Connect] start
> 08:34:08.703 ( 4136: 7152) [Agent::Connect] start
> 08:34:08.703 ( 4136: 6036) [Agent::Connect] start
> 08:34:08.703 ( 4136: 8160) [Agent::Connect] start
> 08:34:08.706 ( 4136: 7056) [Agent::Connect] start
> 08:34:08.890 ( 4136: 1768) [Agent::Connect] start
> 08:34:08.890 ( 4136: 7532) [Agent::Connect] start
> 08:34:08.896 ( 4136: 7616) [Agent::Connect] start <EOF>
>
> Our Agent() constructor simply creates a new session and sets up some options:
>
>         DBG ("start");
>
>         // Create ssh session
>         m_session = ssh_new();
>         if (m_session == NULL)
>         {
>                 ERR ("Failed to allocate new ssh session");
>                 return; // do not set options on a NULL session
>         }
>
>         // Set the remote host (IP address) to connect to
>         if (ssh_options_set (m_session, SSH_OPTIONS_HOST, IPAddr)
>                 != SSH_OK)
>         {
>                 ERR ("Failed to set ssh host IP");
>         }
>
>         // set verbose, for debug purpose
>         int verbosity = SSH_LOG_FUNCTIONS;
>         if (ssh_options_set (m_session, SSH_OPTIONS_LOG_VERBOSITY, &verbosity)
>                 != SSH_OK)
>         {
>                 ERR ("Failed to set ssh verbose");
>         }
>
> Our Connect() function attempts the "ssh_connect" call:
>
>         DBG ("start");
>
>         // Connect to server
>         if (ssh_connect(m_session) != SSH_OK)
>         {
>                 ERR ("Error connecting to target");
>                 ssh_free (m_session);
>                 m_session = 0;
>
>                 return INIT_ERROR;
>         }
>
>         DBG ("end");
>         return SUCCESS;
>
> Matching the code against the logs, you can see that the constructor finishes on each of the 11 threads, and that each thread then enters Connect().  However, ssh_connect never appears to return: no thread logs anything after its Connect() "start" line.  The application crashes after the 11th thread calls ssh_connect, since that thread's Connect() "start" is the last line in the log.
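
The log-matching argument above can be checked mechanically. The following is a rough sketch, assuming the log line layout shown in this message ("HH:MM:SS.mmm ( pid: tid) [tag] message"); unfinished_threads() is a hypothetical helper, not part of libssh or of the application. It reports every thread ID that logged a "start" for a given tag without a later "end".

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Returns the thread IDs that logged "[tag] start" but never "[tag] end".
// Lines that do not match the assumed layout are skipped. std::stoi will
// throw if the text between ':' and ')' is not a number.
std::set<int> unfinished_threads(const std::vector<std::string>& lines,
                                 const std::string& tag) {
    assert(!tag.empty());
    std::set<int> open;
    const std::string marker = "[" + tag + "] ";
    for (const std::string& line : lines) {
        const std::size_t open_paren = line.find('(');
        if (open_paren == std::string::npos) continue;
        const std::size_t colon = line.find(':', open_paren);
        const std::size_t close_paren = line.find(')', open_paren);
        if (colon == std::string::npos || close_paren == std::string::npos ||
            colon > close_paren) {
            continue;
        }
        const std::size_t tag_pos = line.find(marker, close_paren);
        if (tag_pos == std::string::npos) continue;

        // The thread ID sits between the ':' and the ')'.
        const int tid =
            std::stoi(line.substr(colon + 1, close_paren - colon - 1));
        const std::string msg = line.substr(tag_pos + marker.size());

        // Prefix match, so trailing text after "start" or "end" is tolerated.
        if (msg.compare(0, 5, "start") == 0) {
            open.insert(tid);
        } else if (msg.compare(0, 3, "end") == 0) {
            open.erase(tid);
        }
    }
    return open;
}
```

Fed the lines of the application log with the tag "Agent::Connect", the returned set contains each thread that entered Connect() and never logged its "end" line.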
>
> Perhaps our best option is to sync up to the newest official release, 0.5.2, and try that?  Is it considered fairly stable so far?  I'm not sure what other options we have to resolve this issue.
>
> Thanks,
> Jonathan
>
>
> -----Original Message-----
> From: Aris Adamantiadis [mailto:aris@xxxxxxxxxxxx]
> Sent: Monday, November 19, 2012 6:17 AM
> To: libssh@xxxxxxxxxx
> Subject: Re: ssh_connect blocking vs. non-blocking?
>
> Hi,
>
> By default, sessions are blocking. We have made a lot of progress on non-blocking connections, and they are believed to work.
> However, blocking doesn't mean 100% CPU. Could you provide a backtrace?
> Also, which version of libssh are you using?
>
> Thanks,
>
> Aris
>
> On 14/11/12 19:09, Jonathan Walker wrote:
>> Hello,
>>
>>
>>
>> I noticed the following note under the doc describing the function
>> */ssh_set_blocking/*
>>
>>
>>
>> *Bug: <http://api.libssh.org/stable/bug.html#_bug000003>*
>>
>> nonblocking code is in development and won't work as expected
>>
>>
>>
>> Are SSH sessions by default in blocking or non-blocking mode?  I
>> notice in rare cases that the */ssh_connect/* function is getting
>> stuck and taking up 100% CPU on my system.  I was wondering if the
>> above note may have anything to do with this?  Presumably blocking
>> mode should not mean that the function gets stuck in a loop?  I'll
>> try to capture logs, but the issue is not reproducible with a known set of steps.
>>
>>
>>
>> Thanks,
>>
>> Jonathan
>>
>>
>> ----------------------------------------------------------------------
>> This e-mail and any files transmitted with it are ShoreTel
>> property, are confidential, and are intended solely for the use of
>> the individual or entity to whom this e-mail is addressed. If you are
>> not one of the named
>> recipient(s) or otherwise have reason to believe that you have
>> received this message in error, please notify the sender and delete
>> this message immediately from your computer. Any other use,
>> retention, dissemination, forwarding, printing, or copying of this
>> e-mail is strictly prohibited
>
>
> This e-mail and any attachments are confidential.  If it is not intended for you, please notify the sender, and please erase and ignore the contents.
>


