Re: Infinite loop with concurrent ssh_connect() on CentOS 6


Hi Doug,

This is correct. Also, for linking with libssh_threads.so to be
useful, you would have had to include this snippet:
#include <libssh/callbacks.h>
...
ssh_threads_set_callbacks(ssh_threads_get_pthread());
ssh_init();
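
As a rough illustration of the ordering only (this is not libssh code): the
global setup has to run once, before any thread starts connecting. The
simplest way is to do it in main() before spawning threads; where there is no
single entry point, a std::call_once guard gives the same effect. In the
sketch below, global_init() is a hypothetical stand-in for the two calls in
the snippet, and run_workers() is a name made up for the example.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the library's global setup (in this thread:
// ssh_threads_set_callbacks() followed by ssh_init()). It only counts calls.
static std::atomic<int> g_init_calls{0};
static void global_init() { ++g_init_calls; }

static std::once_flag g_init_once;

// Every worker goes through this guard, so global_init() runs exactly once
// even if several workers reach it at the same time.
static void ensure_initialized() { std::call_once(g_init_once, global_init); }

// Spawns n workers that all try to initialize, then returns how many times
// global_init() has run in total.
int run_workers(int n) {
  assert(n >= 0);
  std::vector<std::thread> workers;
  for (int i = 0; i < n; ++i)
    workers.emplace_back([] {
      ensure_initialized();
      // A real worker would open its connection here.
    });
  for (auto &t : workers) t.join();
  return g_init_calls.load();
}
```

Calling run_workers() with any number of workers leaves the init counter at
one; the guard does not replace the threading callbacks, it only makes sure
the setup is not entered concurrently.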

Boost and C++11 threads on Linux certainly use pthreads, but if you
target Windows as well, your solution is the better choice, since Boost
already provides cross-platform threading primitives. Windows is another
story, as there are competing threading techniques.
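
For reference, threading hooks built on portable primitives can be quite
small. The following is only a sketch: lock_hooks is a hypothetical struct
loosely modeled on the kind of callback table libssh takes, not the real one
(see libssh/callbacks.h for the actual fields and signatures), and std::mutex
is used here where boost::mutex would work the same way.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical hook table, for illustration only. It is NOT the libssh
// struct; the real field names and signatures are in libssh/callbacks.h.
struct lock_hooks {
  int (*mutex_init)(void **lock);
  int (*mutex_destroy)(void **lock);
  int (*mutex_lock)(void **lock);
  int (*mutex_unlock)(void **lock);
  unsigned long (*thread_id)();
};

// Allocate a std::mutex and hand it back through the opaque pointer.
static int hook_init(void **lock) {
  assert(lock != nullptr);
  *lock = new std::mutex();
  return 0;
}

// Free the mutex and clear the pointer so it cannot be reused by accident.
static int hook_destroy(void **lock) {
  delete static_cast<std::mutex *>(*lock);
  *lock = nullptr;
  return 0;
}

static int hook_lock(void **lock) {
  static_cast<std::mutex *>(*lock)->lock();
  return 0;
}

static int hook_unlock(void **lock) {
  static_cast<std::mutex *>(*lock)->unlock();
  return 0;
}

// Map the portable thread id to an integer; stable within one thread.
static unsigned long hook_thread_id() {
  return static_cast<unsigned long>(
      std::hash<std::thread::id>{}(std::this_thread::get_id()));
}

// Bundle the hooks into one table that could be registered at startup.
lock_hooks make_std_hooks() {
  return lock_hooks{hook_init, hook_destroy, hook_lock, hook_unlock,
                    hook_thread_id};
}
```

Because nothing in it names pthreads directly, the same table compiles
unchanged wherever the C++ standard library (or Boost) provides a mutex.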

Aris

On 28/10/14 23:50, Doug Judd wrote:
> Correct me if I'm wrong, but looking at the verbose build output, it
> looks like the only difference between libssh.so and libssh_threads.so
> is that the latter includes thread/pthread.c.  pthread.c looks like
> the threading hooks for pthreads.  Since I've implemented my own
> threading hooks, I think I should be ok just linking with libssh.so. 
> Does this sound correct to you?
>
>
> On Tue, Oct 28, 2014 at 3:33 PM, Doug Judd <doug@xxxxxxxxxxxxxx> wrote:
>
>     Hi Aris,
>
>     Thanks for the reply.  I did read that section, but I may have
>     misinterpreted it.  My understanding is that I only need to link
>     with libssh_threads if using pthreads.  Since my application uses
>     Boost threads and C++11 threads, I assumed that I was not using
>     pthreads, so I followed the advice of bullet point #3 and
>     implemented my own threading hooks and did not link with
>     libssh_threads.  However, grep'ing through the Boost header
>     files, there are over 500 lines that reference pthread, so maybe
>     Boost threads use pthreads under the hood.  What do you recommend
>     for my situation (combination of Boost threads and C++11
>     threads)?  Do I still need to link with libssh_threads, or is
>     implementing my own threading hooks sufficient?
>
>     - Doug
>      
>
>     On Tue, Oct 28, 2014 at 2:12 PM, Aris Adamantiadis
>     <aris@xxxxxxxxxxxx> wrote:
>
>         On 28/10/14 17:12, Doug Judd wrote:
>         > It looks like I can get around this last problem by calling ssh_init()
>         > in the beginning of the program before spawning any threads.  From an
>         > API design standpoint, however, if a call to ssh_init() is a
>         > prerequisite for calling ssh_connect() concurrently, then
>         > ssh_connect() should verify that ssh_init() has been called.  If it
>         > hasn't been called it should fail with an informative error
>         > code/message.
>         >
>         > - Doug
>         Hi Doug,
>
>         Thanks for your feedback.
>         While this could be a solution, 99% of developers using libssh are not
>         using threads and this issue is already extensively covered in the
>         documentation
>         (http://api.libssh.org/master/libssh_tutor_threads.html).
>         Calling ssh_init() in the beginning of your program is not enough; you
>         must explicitly link with libssh_threads and use
>         ssh_threads_set_callbacks, or implement your own threading hooks.
>         I have looked at other alternatives, like doing this automatically, and
>         they were not satisfactory from a performance/dependency point of view.
>         We basically copied the libcrypto model (providing a shared lib for
>         pthread for convenience).
>
>         Regards,
>
>         Aris
>
>         >
>         >
>         > On Tue, Oct 28, 2014 at 8:36 AM, Doug Judd <doug@xxxxxxxxxxxxxx> wrote:
>         >
>         >     Here's another infinite loop problem when calling ssh_connect()
>         >     concurrently.  This time it looks like it's in the libcrypto
>         >     initialization code:
>         >
>         >     Thread 13 (Thread 0x7f833a002700 (LWP 24506)):
>         >     #0  0x00007f83420af3e9 in lh_insert () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
>         >     #1  0x00007f834200e7e5 in OBJ_NAME_add () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
>         >     #2  0x00007f83420c0176 in OpenSSL_add_all_ciphers () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
>         >     #3  0x00007f83420bfd0e in OPENSSL_add_all_algorithms_noconf () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
>         >     #4  0x00007f8342675515 in ssh_crypto_init () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
>         >     #5  0x00007f83426776a2 in ssh_init () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
>         >     #6  0x00007f8342673929 in ssh_connect () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
>         >     #7  0x0000000000435614 in Hypertable::SshSocketHandler::handle(int, int) ()
>         >     #8  0x0000000000484efa in Hypertable::IOHandlerRaw::handle_event(epoll_event*, long) ()
>         >     #9  0x0000000000492e24 in Hypertable::ReactorRunner::operator()() ()
>         >     #10 0x00007f83415bace3 in thread_proxy () from /opt/hypertable/doug/0.9.8.3/lib/libboost_thread.so.1.54.0
>         >     #11 0x0000003ee42077f1 in start_thread () from /lib64/libpthread.so.0
>         >     #12 0x0000003ee3ae5ccd in clone () from /lib64/libc.so.6
>         >     Thread 12 (Thread 0x7f8339601700 (LWP 24507)):
>         >     #0  0x00007f83420af3e9 in lh_insert () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
>         >     #1  0x00007f834200e7e5 in OBJ_NAME_add () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
>         >     #2  0x00007f83420c0176 in OpenSSL_add_all_ciphers () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
>         >     #3  0x00007f83420bfd0e in OPENSSL_add_all_algorithms_noconf () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
>         >     #4  0x00007f8342675515 in ssh_crypto_init () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
>         >     #5  0x00007f83426776a2 in ssh_init () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
>         >     #6  0x00007f8342673929 in ssh_connect () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
>         >     #7  0x0000000000435614 in Hypertable::SshSocketHandler::handle(int, int) ()
>         >     #8  0x0000000000484efa in Hypertable::IOHandlerRaw::handle_event(epoll_event*, long) ()
>         >     #9  0x0000000000492e24 in Hypertable::ReactorRunner::operator()() ()
>         >     #10 0x00007f83415bace3 in thread_proxy () from /opt/hypertable/doug/0.9.8.3/lib/libboost_thread.so.1.54.0
>         >     #11 0x0000003ee42077f1 in start_thread () from /lib64/libpthread.so.0
>         >     #12 0x0000003ee3ae5ccd in clone () from /lib64/libc.so.6
>         >     Thread 11 (Thread 0x7f8338c00700 (LWP 24508)):
>         >     #0  0x00007f83420af3e9 in lh_insert () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
>         >     #1  0x00007f834200e7e5 in OBJ_NAME_add () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
>         >     #2  0x00007f83420c0176 in OpenSSL_add_all_ciphers () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
>         >     #3  0x00007f83420bfd0e in OPENSSL_add_all_algorithms_noconf () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
>         >     #4  0x00007f8342675515 in ssh_crypto_init () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
>         >     #5  0x00007f83426776a2 in ssh_init () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
>         >     #6  0x00007f8342673929 in ssh_connect () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
>         >     #7  0x0000000000435614 in Hypertable::SshSocketHandler::handle(int, int) ()
>         >     #8  0x0000000000484efa in Hypertable::IOHandlerRaw::handle_event(epoll_event*, long) ()
>         >     #9  0x0000000000492e24 in Hypertable::ReactorRunner::operator()() ()
>         >     #10 0x00007f83415bace3 in thread_proxy () from /opt/hypertable/doug/0.9.8.3/lib/libboost_thread.so.1.54.0
>         >     #11 0x0000003ee42077f1 in start_thread () from /lib64/libpthread.so.0
>         >     #12 0x0000003ee3ae5ccd in clone () from /lib64/libc.so.6
>         >     ...
>         >
>         >     The process is stuck at 100% CPU utilization.  The version of
>         >     openssl that the program is linked with is 1.0.2-beta3.
>         >
>         >     - Doug
>         >
>         >
>         >     On Tue, Oct 21, 2014 at 10:32 AM, Doug Judd <doug@xxxxxxxxxxxxxx> wrote:
>         >
>         >         I'm developing a multi-host ssh tool using libssh 0.6.3.  The
>         >         tool establishes connections asynchronously and in parallel.
>         >         Intermittently, the tool will get stuck in a busy loop with a
>         >         stack trace such as the following:
>         >
>         >         Thread 13 (Thread 0x7f65f9b41700 (LWP 13549)):
>         >         #0  0x0000003ee3adc613 in poll () from /lib64/libc.so.6
>         >         #1  0x0000003ee3b0fe3c in clntudp_call () from /lib64/libc.so.6
>         >         #2  0x0000003ee76058bb in do_ypcall () from /lib64/libnsl.so.1
>         >         #3  0x0000003ee76060ab in yp_match () from /lib64/libnsl.so.1
>         >         #4  0x00007f65f1f14f79 in _nss_nis_getpwuid_r () from /lib64/libnss_nis.so.2
>         >         #5  0x0000003ee3aaa4ed in getpwuid_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
>         >         #6  0x00007f6601b0152e in ssh_path_expand_tilde () from /opt/hypertable/doug/0.9.8.2/lib/libssh.so.4
>         >         #7  0x00007f6601b02bc3 in ssh_options_set () from /opt/hypertable/doug/0.9.8.2/lib/libssh.so.4
>         >         #8  0x00007f6601b036fb in ssh_options_apply () from /opt/hypertable/doug/0.9.8.2/lib/libssh.so.4
>         >         #9  0x00007f6601af68fe in ssh_connect () from /opt/hypertable/doug/0.9.8.2/lib/libssh.so.4
>         >         #10 0x000000000043547a in Hypertable::SshSocketHandler::handle(int, int) ()
>         >         #11 0x0000000000484b9a in Hypertable::IOHandlerRaw::handle_event(epoll_event*, long) ()
>         >         #12 0x0000000000492ac4 in Hypertable::ReactorRunner::operator()() ()
>         >         #13 0x00007f66010f8ce3 in thread_proxy () from /opt/hypertable/doug/0.9.8.2/lib/libboost_thread.so.1.54.0
>         >         #14 0x0000003ee42077f1 in start_thread () from /lib64/libpthread.so.0
>         >         #15 0x0000003ee3ae5ccd in clone () from /lib64/libc.so.6
>         >
>         >         I did a little digging around and came across a ticket filed
>         >         against sssd <https://fedorahosted.org/sssd/ticket/640> which
>         >         I believe is the source of the problem.  It appears that
>         >         getpwuid_r() is not thread safe under certain circumstances.
>         >
>         >         From what I can tell, ssh_connect() will use ~/.ssh as the ssh
>         >         directory if one is not explicitly supplied.  It's during the
>         >         expansion of the ~ character that getpwuid_r() gets called.
>         >         The workaround is to explicitly set the ssh directory using a
>         >         path that does not include the ~ character, for example:
>         >
>         >         char *home = getenv("HOME");
>         >         if (home == nullptr)
>         >           error("Environment variable HOME is not set");
>         >         string ssh_dir(home);
>         >         ssh_dir.append("/.ssh");
>         >         ssh_options_set(m_ssh_session, SSH_OPTIONS_SSH_DIR,
>         >         ssh_dir.c_str());
>         >
>         >         Attached is a patch to libssh that eliminates the ~ expansion
>         >         for the default case (~/.ssh).  In my test environment, the
>         >         problem is very intermittent and I don't have a reproducible
>         >         test case, so I'm not 100% sure this solution solves the
>         >         problem.  However, given the evidence, I think it's a safe bet.
>         >
>         >         - Doug
>         >
>         >         --
>         >         Doug Judd
>         >         www.hypertable.com
>         >
>         >
>         >
>         >
>         >     --
>         >     Doug Judd
>         >     CEO, Hypertable Inc.
>         >
>         >
>         >
>         >
>         > --
>         > Doug Judd
>         > CEO, Hypertable Inc.
>
>
>
>
>
>
>     -- 
>     Doug Judd
>     CEO, Hypertable Inc.
>
>
>
>
> -- 
> Doug Judd
> CEO, Hypertable Inc.

