SUPPORT-140

Signer segfaulting after SoftHSM access and pthreading problems

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: OpenDNSSEC 1.4.5
    • Fix Version/s: None
    • Component/s: PKCS#11 Interface, Signer
    • Labels:
      None
    • Environment:

      opendnssec-1.4.5-2.el6.x86_64
      (recompiled from source RPM to include support for MySQL back-end)
      on CentOS 6.5

      Description

      OpenDNSSEC initially runs without functional problems, but after a few days the signer starts failing to open the SoftHSM token file. pthread errors follow, and eventually ods-signerd crashes with a segfault:

      Sep 1 21:09:35 services ods-signerd: SoftHSM: init: Could not open token database. Probably wrong privileges: /var/softhsm/slot0.db
      Sep 1 21:09:35 services ods-signerd: [hsm] idle libhsm connection, trying to reopen
      Sep 1 21:09:35 services ods-signerd: [hsm] libhsm connection opened succesfully
      Sep 1 21:09:35 services ods-signerd: [hsm] error signing rrset with libhsm
      Sep 1 21:09:35 services ods-signerd: [rrset] unable to sign RRset[6]: lhsm_sign() failed
      Sep 1 21:09:35 services ods-signerd: [hsm] error signing rrset with libhsm
      Sep 1 21:09:35 services ods-signerd: [rrset] unable to sign RRset[50]: lhsm_sign() failed
      Sep 1 21:09:35 services ods-signerd: [hsm] error signing rrset with libhsm
      Sep 1 21:09:35 services ods-signerd: [worker[2]] sign zone chiplist.com failed: 2 RRsets failed
      Sep 1 21:09:35 services ods-signerd: [rrset] unable to sign RRset[6]: lhsm_sign() failed
      Sep 1 21:09:35 services ods-signerd: [worker[2]] CRITICAL: failed to sign zone chiplist.com: General error
      Sep 1 21:09:35 services ods-signerd: [worker[2]] backoff task [sign] for zone chiplist.com with 60 seconds
      Sep 1 21:09:35 services ods-signerd: [hsm] error signing rrset with libhsm
      Sep 1 21:09:35 services ods-signerd: [rrset] unable to sign RRset[28]: lhsm_sign() failed
      Sep 1 21:09:35 services ods-signerd: [worker[3]] sign zone processor-portal.com failed: 2 RRsets failed
      Sep 1 21:09:35 services ods-signerd: [worker[3]] CRITICAL: failed to sign zone processor-portal.com: General error
      Sep 1 21:09:35 services ods-signerd: [worker[3]] backoff task [sign] for zone processor-portal.com with 60 seconds
      ...
      Sep 1 21:10:02 services ods-signerd: SoftHSM: init: Could not open token database. Probably wrong privileges: /var/softhsm/slot0.db
      Sep 1 21:10:02 services ods-signerd: [hsm] idle libhsm connection, trying to reopen
      Sep 1 21:10:02 services ods-signerd: daemon/engine.c at 442 could not pthread_join(engine->drudgers[i]->thread_id, NULL): Invalid argument
      Sep 1 21:10:04 services ods-signerd: daemon/engine.c at 442 could not pthread_join(engine->drudgers[i]->thread_id, NULL): Invalid argument
      Sep 1 21:10:04 services ods-signerd: daemon/engine.c at 442 could not pthread_join(engine->drudgers[i]->thread_id, NULL): No such process
      Sep 1 21:10:04 services ods-signerd: daemon/engine.c at 442 could not pthread_join(engine->drudgers[i]->thread_id, NULL): No such process
      Sep 1 21:10:04 services ods-signerd: [hsm] libhsm connection opened succesfully
      Sep 1 21:10:05 services ods-signerd: [hsm] error signing rrset with libhsm
      Sep 1 21:10:05 services ods-signerd: [rrset] unable to sign RRset[6]: lhsm_sign() failed
      Sep 1 21:10:05 services ods-signerd: [worker[3]] sign zone techamatic.net failed: 1 RRsets failed
      Sep 1 21:10:05 services ods-signerd: [worker[3]] CRITICAL: failed to sign zone techamatic.net: General error
      ...
      Sep 3 15:59:03 services ods-signerd: SoftHSM: init: Could not open token database. Probably wrong privileges: /var/softhsm/slot0.db
      Sep 3 15:59:03 services ods-signerd: [hsm] idle libhsm connection, trying to reopen
      Sep 3 15:59:04 services ods-signerd: SoftHSM: init: Could not open token database. Probably wrong privileges: /var/softhsm/slot0.db
      Sep 3 15:59:04 services ods-signerd: [zone] unable to prepare signing keys for zone processor-portal.org: error creating libhsm context
      Sep 3 15:59:04 services ods-signerd: [worker[1]] CRITICAL: failed to sign zone processor-portal.org: HSM error
      Sep 3 15:59:04 services ods-signerd: [worker[1]] backoff task [sign] for zone processor-portal.org with 60 seconds
      Sep 3 15:59:05 services ods-signerd: SoftHSM: OSDestroyMutex: Failed to destroy POSIX mutex
      Sep 3 15:59:06 services kernel: ods-signerd[17589]: segfault at 25 ip 0000000000000025 sp 00007ff072da99d8 error 4
      ...
      Sep 4 14:53:41 services ods-signerd: SoftHSM: init: Could not open token database. Probably wrong privileges: /var/softhsm/slot0.db
      Sep 4 14:53:41 services ods-signerd: [hsm] idle libhsm connection, trying to reopen
      Sep 4 14:53:41 services ods-signerd: daemon/engine.c at 442 could not pthread_join(engine->drudgers[i]->thread_id, NULL): Invalid argument
      Sep 4 14:53:43 services ods-signerd: [hsm] idle libhsm connection, trying to reopen
      Sep 4 14:53:43 services ods-signerd: daemon/engine.c at 442 could not pthread_join(engine->drudgers[i]->thread_id, NULL): Invalid argument
      Sep 4 14:53:43 services ods-signerd: daemon/engine.c at 442 could not pthread_join(engine->drudgers[i]->thread_id, NULL): Invalid argument
      ...
      Sep 4 14:53:44 services ods-signerd: SoftHSM: init: Could not open token database. Probably wrong privileges: /var/softhsm/slot0.db
      Sep 4 14:53:44 services ods-signerd: [hsm] idle libhsm connection, trying to reopen
      Sep 4 14:53:44 services ods-signerd: daemon/engine.c at 442 could not pthread_join(engine->drudgers[i]->thread_id, NULL): Invalid argument
      Sep 4 14:53:44 services ods-signerd: daemon/engine.c at 442 could not pthread_join(engine->drudgers[i]->thread_id, NULL): Invalid argument
      Sep 4 14:53:44 services ods-signerd: daemon/engine.c at 442 could not pthread_join(engine->drudgers[i]->thread_id, NULL): Invalid argument
      Sep 4 14:53:44 services ods-signerd: daemon/engine.c at 442 could not pthread_join(engine->drudgers[i]->thread_id, NULL): Invalid argument
      Sep 4 14:53:44 services ods-signerd: daemon/engine.c at 442 could not pthread_join(engine->drudgers[i]->thread_id, NULL): Invalid argument
      Sep 4 14:53:44 services ods-signerd: daemon/engine.c at 442 could not pthread_join(engine->drudgers[i]->thread_id, NULL): Invalid argument
      Sep 4 14:53:44 services kernel: ods-signerd[5052]: segfault at 7f4f035fe9d0 ip 00007f4f0cd0d143 sp 00007f4f0939ace0 error 4 in libpthread-2.12.so[7f4f0cd05000+17000]
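
To make the order of failures in an excerpt like the one above easier to check, the following is a minimal sketch that records when each kind of message first appears. The category labels and the `first_seen` helper are hypothetical names chosen for illustration; the matched substrings are copied from the log lines in this report, and the timestamp pattern assumes the year-less syslog prefix shown here.

```python
import re

# Syslog-style prefix as it appears in this report: "Sep 1 21:09:35 host rest-of-line".
PREFIX = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+(.*)$")

# Substrings taken from the log excerpt, mapped to short (hypothetical) labels.
CATEGORIES = [
    ("Could not open token database", "token-db"),
    ("pthread_join", "pthread_join"),
    ("lhsm_sign() failed", "sign-failed"),
    ("segfault", "segfault"),
]


def first_seen(lines):
    """Map each category label to the timestamp of its first occurrence."""
    seen = {}
    for line in lines:
        match = PREFIX.match(line.strip())
        if not match:
            continue  # skip "..." elisions and wrapped fragments
        timestamp, rest = match.groups()
        for needle, label in CATEGORIES:
            if needle in rest:
                seen.setdefault(label, timestamp)  # keep only the first hit
                break
    return seen


sample = [
    "Sep 3 15:59:03 services ods-signerd: SoftHSM: init: Could not open token database. Probably wrong privileges: /var/softhsm/slot0.db",
    "Sep 3 15:59:05 services ods-signerd: SoftHSM: OSDestroyMutex: Failed to destroy POSIX mutex",
    "Sep 3 15:59:06 services kernel: ods-signerd[17589]: segfault at 25 ip 0000000000000025 sp 00007ff072da99d8 error 4",
]
print(first_seen(sample))
```

Run against the full syslog rather than the three sample lines, this would show whether the token database errors always precede the pthread_join errors and the segfault. It does not explain why they occur.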

      The token file /var/softhsm/slot0.db has mode 640 and is owned by root:ods.
      Both the Signer and the Enforcer run as ods:ods.
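
As a quick sanity check of what those mode bits grant, here is a minimal sketch of the classic Unix owner/group/other evaluation. The `token_access` helper and the numeric ID 495 are hypothetical stand-ins, not values from this system. The sketch ignores ACLs, SELinux, and the permissions of the containing directory, and it does not establish whether SoftHSM needs write access to the token file.

```python
import stat


def token_access(mode, file_uid, file_gid, proc_uid, proc_gids):
    """Return 'rw', 'r', 'w' or '' according to classic Unix mode bits."""
    if proc_uid == 0:
        return "rw"  # root bypasses the read/write mode bits
    if proc_uid == file_uid:
        read_bit, write_bit = stat.S_IRUSR, stat.S_IWUSR
    elif file_gid in proc_gids:
        read_bit, write_bit = stat.S_IRGRP, stat.S_IWGRP
    else:
        read_bit, write_bit = stat.S_IROTH, stat.S_IWOTH
    return ("r" if mode & read_bit else "") + ("w" if mode & write_bit else "")


# Hypothetical numeric IDs: root is 0; 495 stands in for the ods user and group.
ODS = 495
print(token_access(0o640, 0, ODS, ODS, {ODS}))  # prints "r"
```

Under these assumptions, a process running as ods:ods gets read-only access to a 640 root:ods file. On the affected host, the real IDs from `id ods` and the output of `ls -l /var/softhsm/` would replace the placeholders.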

            People

            Assignee:
            Unassigned
            Reporter:
            adrian Adrian