Re: Infinite loop with concurrent ssh_connect() on CentOS 6
- Subject: Re: Infinite loop with concurrent ssh_connect() on CentOS 6
- From: Aris Adamantiadis <aris@xxxxxxxxxxxx>
- Reply-to: libssh@xxxxxxxxxx
- Date: Wed, 29 Oct 2014 08:21:12 +0100
- To: libssh@xxxxxxxxxx
Hi Doug,

This is correct. Also, for linking with libssh_threads.so to be useful,
you would have had to include this snippet:

    #include <libssh/callbacks.h>
    ...
    ssh_threads_set_callbacks(ssh_threads_get_pthread());
    ssh_init();  /* <http://api.libssh.org/master/group__libssh.html#ga3ebf8d6920e563f3b032e3cd5277598e> */

Boost and C++11 threads on Linux certainly use pthreads, but if you
target Windows as well, your solution scales much better, since Boost
already provides cross-platform threading primitives. Windows is
another story, as there are competing threading techniques there.

Aris

On 28/10/14 23:50, Doug Judd wrote:
> Correct me if I'm wrong, but looking at the verbose build output, it
> looks like the only difference between libssh.so and libssh_threads.so
> is that the latter includes thread/pthread.c. pthread.c looks like
> the threading hooks for pthreads. Since I've implemented my own
> threading hooks, I think I should be ok just linking with libssh.so.
> Does this sound correct to you?
>
> On Tue, Oct 28, 2014 at 3:33 PM, Doug Judd <doug@xxxxxxxxxxxxxx> wrote:
> > Hi Aris,
> >
> > Thanks for the reply. I did read that section, but I may have
> > misinterpreted it. My understanding is that I only need to link
> > with libssh_threads if using pthreads. Since my application uses
> > Boost threads and C++11 threads, I assumed that I was not using
> > pthreads, so I followed the advice of bullet point #3 and
> > implemented my own threading hooks and did not link with
> > libssh_threads. However, grepping through the Boost header
> > files, there are over 500 lines that reference pthread, so maybe
> > Boost threads use pthreads under the hood. What do you recommend
> > for my situation (combination of Boost threads and C++11
> > threads)? Do I still need to link with libssh_threads, or is
> > implementing my own threading hooks sufficient?
> >
> > - Doug
> >
> > On Tue, Oct 28, 2014 at 2:12 PM, Aris Adamantiadis <aris@xxxxxxxxxxxx> wrote:
> > > On 28/10/14 17:12, Doug Judd wrote:
> > > > It looks like I can get around this last problem by calling ssh_init()
> > > > in the beginning of the program before spawning any threads. From an
> > > > API design standpoint, however, if a call to ssh_init() is a
> > > > prerequisite for calling ssh_connect() concurrently, then
> > > > ssh_connect() should verify that ssh_init() has been called. If it
> > > > hasn't been called, it should fail with an informative error
> > > > code/message.
> > > >
> > > > - Doug
> > >
> > > Hi Doug,
> > >
> > > Thanks for your feedback.
> > > While this could be a solution, 99% of developers using libssh are not
> > > using threads and this issue is already extensively covered in the
> > > documentation
> > > (http://api.libssh.org/master/libssh_tutor_threads.html).
> > > Calling ssh_init() in the beginning of your program is not enough; you
> > > must explicitly link with libssh_threads and use
> > > ssh_threads_set_callbacks, or implement your own threading hooks.
> > > I have looked at other alternatives, like doing this automatically, and
> > > they were not satisfying from a performance/dependency point of view. We
> > > basically copied the libcrypto model (providing a shared lib for
> > > pthread for convenience).
> > >
> > > Regards,
> > >
> > > Aris
> > >
> > > > On Tue, Oct 28, 2014 at 8:36 AM, Doug Judd <doug@xxxxxxxxxxxxxx> wrote:
> > > > > Here's another infinite loop problem when calling ssh_connect()
> > > > > concurrently. This time it looks like it's in the libcrypto
> > > > > initialization code:
> > > > >
> > > > > Thread 13 (Thread 0x7f833a002700 (LWP 24506)):
> > > > > #0  0x00007f83420af3e9 in lh_insert () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
> > > > > #1  0x00007f834200e7e5 in OBJ_NAME_add () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
> > > > > #2  0x00007f83420c0176 in OpenSSL_add_all_ciphers () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
> > > > > #3  0x00007f83420bfd0e in OPENSSL_add_all_algorithms_noconf () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
> > > > > #4  0x00007f8342675515 in ssh_crypto_init () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
> > > > > #5  0x00007f83426776a2 in ssh_init () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
> > > > > #6  0x00007f8342673929 in ssh_connect () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
> > > > > #7  0x0000000000435614 in Hypertable::SshSocketHandler::handle(int, int) ()
> > > > > #8  0x0000000000484efa in Hypertable::IOHandlerRaw::handle_event(epoll_event*, long) ()
> > > > > #9  0x0000000000492e24 in Hypertable::ReactorRunner::operator()() ()
> > > > > #10 0x00007f83415bace3 in thread_proxy () from /opt/hypertable/doug/0.9.8.3/lib/libboost_thread.so.1.54.0
> > > > > #11 0x0000003ee42077f1 in start_thread () from /lib64/libpthread.so.0
> > > > > #12 0x0000003ee3ae5ccd in clone () from /lib64/libc.so.6
> > > > >
> > > > > Thread 12 (Thread 0x7f8339601700 (LWP 24507)):
> > > > > #0  0x00007f83420af3e9 in lh_insert () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
> > > > > #1  0x00007f834200e7e5 in OBJ_NAME_add () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
> > > > > #2  0x00007f83420c0176 in OpenSSL_add_all_ciphers () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
> > > > > #3  0x00007f83420bfd0e in OPENSSL_add_all_algorithms_noconf () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
> > > > > #4  0x00007f8342675515 in ssh_crypto_init () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
> > > > > #5  0x00007f83426776a2 in ssh_init () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
> > > > > #6  0x00007f8342673929 in ssh_connect () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
> > > > > #7  0x0000000000435614 in Hypertable::SshSocketHandler::handle(int, int) ()
> > > > > #8  0x0000000000484efa in Hypertable::IOHandlerRaw::handle_event(epoll_event*, long) ()
> > > > > #9  0x0000000000492e24 in Hypertable::ReactorRunner::operator()() ()
> > > > > #10 0x00007f83415bace3 in thread_proxy () from /opt/hypertable/doug/0.9.8.3/lib/libboost_thread.so.1.54.0
> > > > > #11 0x0000003ee42077f1 in start_thread () from /lib64/libpthread.so.0
> > > > > #12 0x0000003ee3ae5ccd in clone () from /lib64/libc.so.6
> > > > >
> > > > > Thread 11 (Thread 0x7f8338c00700 (LWP 24508)):
> > > > > #0  0x00007f83420af3e9 in lh_insert () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
> > > > > #1  0x00007f834200e7e5 in OBJ_NAME_add () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
> > > > > #2  0x00007f83420c0176 in OpenSSL_add_all_ciphers () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
> > > > > #3  0x00007f83420bfd0e in OPENSSL_add_all_algorithms_noconf () from /opt/hypertable/doug/0.9.8.3/lib/libcrypto.so.1.0.0
> > > > > #4  0x00007f8342675515 in ssh_crypto_init () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
> > > > > #5  0x00007f83426776a2 in ssh_init () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
> > > > > #6  0x00007f8342673929 in ssh_connect () from /opt/hypertable/doug/0.9.8.3/lib/libssh.so.4
> > > > > #7  0x0000000000435614 in Hypertable::SshSocketHandler::handle(int, int) ()
> > > > > #8  0x0000000000484efa in Hypertable::IOHandlerRaw::handle_event(epoll_event*, long) ()
> > > > > #9  0x0000000000492e24 in Hypertable::ReactorRunner::operator()() ()
> > > > > #10 0x00007f83415bace3 in thread_proxy () from /opt/hypertable/doug/0.9.8.3/lib/libboost_thread.so.1.54.0
> > > > > #11 0x0000003ee42077f1 in start_thread () from /lib64/libpthread.so.0
> > > > > #12 0x0000003ee3ae5ccd in clone () from /lib64/libc.so.6
> > > > > ...
> > > > >
> > > > > The process is stuck at 100% CPU utilization. The version of
> > > > > openssl that the program is linked with is 1.0.2-beta3.
> > > > >
> > > > > - Doug
> > > > >
> > > > > On Tue, Oct 21, 2014 at 10:32 AM, Doug Judd <doug@xxxxxxxxxxxxxx> wrote:
> > > > > > I'm developing a multi-host ssh tool using libssh 0.6.3. The
> > > > > > tool establishes connections asynchronously and in parallel.
> > > > > > Intermittently, the tool will get stuck in a busy loop with a
> > > > > > stack trace such as the following:
> > > > > >
> > > > > > Thread 13 (Thread 0x7f65f9b41700 (LWP 13549)):
> > > > > > #0  0x0000003ee3adc613 in poll () from /lib64/libc.so.6
> > > > > > #1  0x0000003ee3b0fe3c in clntudp_call () from /lib64/libc.so.6
> > > > > > #2  0x0000003ee76058bb in do_ypcall () from /lib64/libnsl.so.1
> > > > > > #3  0x0000003ee76060ab in yp_match () from /lib64/libnsl.so.1
> > > > > > #4  0x00007f65f1f14f79 in _nss_nis_getpwuid_r () from /lib64/libnss_nis.so.2
> > > > > > #5  0x0000003ee3aaa4ed in getpwuid_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
> > > > > > #6  0x00007f6601b0152e in ssh_path_expand_tilde () from /opt/hypertable/doug/0.9.8.2/lib/libssh.so.4
> > > > > > #7  0x00007f6601b02bc3 in ssh_options_set () from /opt/hypertable/doug/0.9.8.2/lib/libssh.so.4
> > > > > > #8  0x00007f6601b036fb in ssh_options_apply () from /opt/hypertable/doug/0.9.8.2/lib/libssh.so.4
> > > > > > #9  0x00007f6601af68fe in ssh_connect () from /opt/hypertable/doug/0.9.8.2/lib/libssh.so.4
> > > > > > #10 0x000000000043547a in Hypertable::SshSocketHandler::handle(int, int) ()
> > > > > > #11 0x0000000000484b9a in Hypertable::IOHandlerRaw::handle_event(epoll_event*, long) ()
> > > > > > #12 0x0000000000492ac4 in Hypertable::ReactorRunner::operator()() ()
> > > > > > #13 0x00007f66010f8ce3 in thread_proxy () from /opt/hypertable/doug/0.9.8.2/lib/libboost_thread.so.1.54.0
> > > > > > #14 0x0000003ee42077f1 in start_thread () from /lib64/libpthread.so.0
> > > > > > #15 0x0000003ee3ae5ccd in clone () from /lib64/libc.so.6
> > > > > >
> > > > > > I did a little digging around and came across a ticket filed
> > > > > > against sssd <https://fedorahosted.org/sssd/ticket/640> which
> > > > > > I believe is the source of the problem. It appears that
> > > > > > getpwuid_r() is not thread safe under certain circumstances.
> > > > > >
> > > > > > From what I can tell, ssh_connect() will use ~/.ssh as the ssh
> > > > > > directory if one is not explicitly supplied. It's during the
> > > > > > expansion of the ~ character that getpwuid_r() gets called.
> > > > > > The workaround is to explicitly set the ssh directory using a
> > > > > > path that does not include the ~ character, for example:
> > > > > >
> > > > > >     char *home = getenv("HOME");
> > > > > >     if (home == nullptr)
> > > > > >       error("Environment variable HOME is not set");
> > > > > >     string ssh_dir(home);
> > > > > >     ssh_dir.append("/.ssh");
> > > > > >     ssh_options_set(m_ssh_session, SSH_OPTIONS_SSH_DIR,
> > > > > >                     ssh_dir.c_str());
> > > > > >
> > > > > > Attached is a patch to libssh that eliminates the ~ expansion
> > > > > > for the default case (~/.ssh). In my test environment, the
> > > > > > problem is very intermittent and I don't have a reproducible
> > > > > > test case, so I'm not 100% sure this solution solves the
> > > > > > problem. However, given the evidence, I think it's a safe bet.
> > > > > >
> > > > > > - Doug
> > > > > >
> > > > > > --
> > > > > > Doug Judd
> > > > > > www.hypertable.com <http://www.hypertable.com>
> > > > >
> > > > > --
> > > > > Doug Judd
> > > > > CEO, Hypertable Inc.
> > > >
> > > > --
> > > > Doug Judd
> > > > CEO, Hypertable Inc.
> >
> > --
> > Doug Judd
> > CEO, Hypertable Inc.
>
> --
> Doug Judd
> CEO, Hypertable Inc.
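
For readers who want to apply the workaround quoted above (setting the ssh directory from HOME so that no ~ expansion, and hence no getpwuid_r() call, is needed), here is a minimal sketch of the path-building step on its own. The helper name ssh_dir_from_home is hypothetical and not part of libssh; the sketch assumes a POSIX-style HOME variable and only illustrates the idea from the thread.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not a libssh function): builds "<HOME>/.ssh" from
// the environment so the resulting path contains no '~' character.
// Returns false and leaves `out` untouched when HOME is unset or empty.
bool ssh_dir_from_home(std::string &out) {
    const char *home = std::getenv("HOME");
    if (home == nullptr || *home == '\0')
        return false;

    std::string dir(home);
    // Avoid a doubled separator when HOME already ends with '/'.
    if (dir.back() != '/')
        dir.push_back('/');
    dir += ".ssh";

    out = dir;
    return true;
}
```

On success, the resulting string would be passed to ssh_options_set() with SSH_OPTIONS_SSH_DIR, as in the quoted snippet, before calling ssh_connect(). Returning a bool rather than aborting leaves the error handling to the caller.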