SUPPORT-278

Concurrency problem when signer calls hsm_get_dnskey(), results in "[hsm] hsm_get_dnskey(): Got NULL key"


    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: OpenDNSSEC 2.1
    • Fix Version/s: None
    • Component/s: PKCS#11 Interface
    • Labels:
      None
    • Environment:

      RHEL9 x86_64, OpenDNSSEC 2.1.12 built with rpmbuild, SoftHSM 2.6.1 from the RHEL9 AppStream repo, mysqld 10.5.16-MariaDB, 4-core VM in ESXi, signer config is the default 4 WorkerThreads, 1 SignerThread

      Description

      Hello,

      We run ods for .sa and .xn--mgberp4a5d4ar with MySQL/MariaDB as the backend. After attempting a migration from our current 1.4.7 to 2.1.x (tried 2.1.6 and 2.1.12), I believe we hit a concurrency issue when the signer calls hsm_get_dnskey() in src/signer/hsm.c:112. Many, but not all, zones fail with:

      [hsm] unable to get key: key af0e063c291c05b20d57c74f156dee465101837d not found
      [hsm] hsm_get_dnskey(): Got NULL key
      [hsm] unable to get key: hsm failed to create dnskey
      [zone] unable to prepare signing keys for zone gov.sa: error getting dnskey

      ...but the keys are visible with 'sudo -u ods ods-enforcer key list --verbose' and 'sudo -u ods ods-hsmutil list -v'.

      I spent a few hours with gdb breakpoints and saw that, although the key lookups succeed, the second argument (key) to hsm_get_dnskey() was 0x0:

      #0  hsm_get_dnskey (ctx=0x7fffd8005050, key=0x0, sign_params=0x555559da70a0) at ../../libhsm/src/lib/libhsm.c:3375

      I put a mutex around the hsm_get_dnskey() call and the problem went away. No more 'Got NULL key'.

      I later tried making the key struct non-static, and also storing the lookup result in a separate variable, like 'located_key = keylookup(ctx, key_id->locator)', and passing that to the hsm_get_dnskey() call, but neither change helped.

      I'm not expert enough to look deeper into the HSM code, and the naive approach that seems to work for me might be incorrect.
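
      As a rough illustration of the mutex workaround described above (not the attached patch, and not libhsm code), the sketch below serialises a non-thread-safe lookup with a pthread mutex. Everything in it is a hypothetical stand-in: fake_key_t, fake_lookup(), locked_lookup() and run_workers() are invented for the example, and the shared static slot is only an assumed source of a race, since the actual cause of the NULL key has not been identified.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a key handle; not a libhsm type. */
typedef struct { char locator[41]; } fake_key_t;

/* Shared state that makes fake_lookup() unsafe without locking (assumption). */
static fake_key_t shared_slot;

/* Hypothetical non-thread-safe lookup: returns a pointer into shared state. */
static const fake_key_t *fake_lookup(const char *locator)
{
    strncpy(shared_slot.locator, locator, sizeof shared_slot.locator - 1);
    shared_slot.locator[sizeof shared_slot.locator - 1] = '\0';
    return &shared_slot;
}

static pthread_mutex_t lookup_lock = PTHREAD_MUTEX_INITIALIZER;

/* Serialised wrapper: the lookup and the copy-out both happen under the lock,
 * so no other thread can overwrite the shared slot in between. */
static int locked_lookup(const char *locator, fake_key_t *out)
{
    int ok = 0;
    pthread_mutex_lock(&lookup_lock);
    const fake_key_t *k = fake_lookup(locator);
    if (k != NULL) {
        *out = *k;
        ok = 1;
    }
    pthread_mutex_unlock(&lookup_lock);
    return ok;
}

struct worker_arg { const char *locator; int mismatches; };

/* Each worker repeatedly looks up its own key and counts wrong results. */
static void *worker(void *p)
{
    struct worker_arg *a = p;
    for (int i = 0; i < 10000; i++) {
        fake_key_t k;
        if (!locked_lookup(a->locator, &k) ||
            strcmp(k.locator, a->locator) != 0)
            a->mismatches++;
    }
    return NULL;
}

/* Runs four workers (mirroring the 4 WorkerThreads setup) and returns the
 * total number of failed or mismatched lookups. */
int run_workers(void)
{
    struct worker_arg args[4] = {
        { "aaaa", 0 }, { "bbbb", 0 }, { "cccc", 0 }, { "dddd", 0 }
    };
    pthread_t t[4];
    int total = 0;

    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, &args[i]);
    for (int i = 0; i < 4; i++) {
        pthread_join(t[i], NULL);
        total += args[i].mismatches;
    }
    return total;
}
```

      Calling run_workers() from a small main() and checking that it returns 0 exercises the wrapper. A lock like this only hides the symptom: it says nothing about which shared state is actually being raced on in the signer, and it serialises all workers on that call.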

      In any case, I'll attach a patch and the .spec file for our build. The migration script in .spec needs to be fixed for the MySQL backend case, but it's somewhat rewritten for the general SQLite3 backend case (and should work).

       

      Best regards,

      --

      Mikko Rantanen


            People

            Assignee:
            Unassigned
            Reporter:
            dogo Mikko Rantanen
            Votes:
            1
            Watchers:
            3
