SUPPORT-181

OpenDNSSEC claims saved ixfr files are corrupt when restarting

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: OpenDNSSEC 1.4.7
    • Fix Version/s: None
    • Component/s: Signer
    • Labels: None
    • Environment: NetBSD/amd64 6.1_STABLE

      Description

      I'm using OpenDNSSEC in the "zone transfer in, zone transfer
      out" mode of operation. Judging from the lack of response to
      my mailing list query, it doesn't look like many others
      do...

      When restarting OpenDNSSEC, the signer all too often complains
      bitterly about corrupted ixfr files. The code right after
      detecting this error unlinks the file and proceeds.
      I've created a patch which instead renames the supposedly
      corrupt file, so that it can be inspected rather than discarded.
      The aim is to get rid of the supposed corruption by fixing either
      the "writer" or the "reader" part for these files, because the
      file contents appear to be undamaged by any external events.
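
      For illustration, the rename-instead-of-unlink idea can be sketched
      as follows. This is not the attached patch: the function name
      quarantine_journal and the "-bad" suffix are hypothetical, chosen
      only to match the ".ixfr-bad" name seen in the log messages.

```python
import os


def quarantine_journal(path, suffix="-bad"):
    """Rename a suspect journal file instead of deleting it.

    Returns the new path, or None if the file did not exist.
    Both the function name and the suffix are illustrative assumptions.
    """
    bad_path = path + suffix
    try:
        # os.replace overwrites any previously saved copy with the same name.
        os.replace(path, bad_path)
    except FileNotFoundError:
        return None
    return bad_path
```

      The saved copy stays next to the original location, so a later
      restart does not trip over it and the operator can still examine it.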

      The typical log messages are now:

      Oct 13 10:55:55 hugin ods-signerd: [zone] corrupted journal file zone 2.39.128.in-addr.arpa, skipping (General error)
      Oct 13 10:55:55 hugin ods-signerd: [zone] corrupted journal for zone 2.39.128.in-addr.arpa saved as 2.39.128.in-addr.arpa.ixfr-bad
      Oct 13 10:55:55 hugin ods-signerd: [backup] bad ixfr journal: trailing RRs after final SOA

      The ixfr cache files I've found all contain a number of SOA records,
      and if I read the code correctly, that's the way incremental changes
      are represented. So the format of the ixfr journal file isn't
      supposed to be the same as an AXFR, with one SOA at the front and
      one at the end. However, I'm also not quite able to decipher
      from the code what the actual format of the ixfr journal files is
      supposed to be.

      So the first gripe is that the error message being logged is
      probably misleading: it can lead the operator to think that the
      file is supposed to use the AXFR format, with only two SOAs and
      no RRs after the final SOA.

      I attach below the patch I have to save the supposedly-bad ixfr
      files, and a copy of the ixfr file corresponding to the log
      message above. The supposedly-bad ixfr file has no fewer than
      8 SOA records, and I'm not able to decipher what's wrong with it.
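
      One way to inspect a saved file is to list the serial of each SOA
      record it contains. The sketch below assumes the journal is plain
      text in zone presentation format with each SOA record on a single
      line; that layout is an assumption, not something confirmed from
      the OpenDNSSEC code, and the function name soa_serials is
      hypothetical.

```python
def soa_serials(lines):
    """Return the serial of every SOA record found, in file order.

    Assumes one RR per line in presentation format, e.g.
    "example. 3600 IN SOA ns.example. host.example. 1 3600 600 86400 300".
    """
    serials = []
    for line in lines:
        # Drop anything after ';' (treated as a comment in this sketch).
        fields = line.split(";", 1)[0].split()
        if "SOA" not in fields:
            continue
        # SOA RDATA order: mname rname serial refresh retry expire minimum
        rdata = fields[fields.index("SOA") + 1:]
        if len(rdata) >= 3 and rdata[2].isdigit():
            serials.append(int(rdata[2]))
    return serials
```

      In an RFC 1995 IXFR response, each difference sequence carries
      both the old and the new SOA, so a journal modeled on that layout
      would be expected to hold many SOA records rather than two.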

      This is part of my push towards the goal of "OpenDNSSEC should be
      restartable and start up and run without scary-as-hell error
      messages, without first removing the temporary files it itself
      has written and left behind". If error messages can be made less
      misleading and/or more informative to the operator in the
      process, that would be a nice bonus. Getting rid of what appear
      to be "self-inflicted" errors (which this seems to be) should
      also be a goal.

      Regards,

      Håvard

            People

            Assignee:
            Unassigned
            Reporter:
            he Håvard Eidnes