Discussion:
certutil -D corrupting NSS database...
Michael H. Warfield
2011-01-25 21:07:58 UTC
Hey all!

I'm posting this from one of my alternate accounts since the Mozilla
server (notorious.mozilla.org) strangely doesn't seem to like my SPF
record and for some reason thinks that my server at 130.205.32.3 does
not match the ipv4:130.205.32.0/20 criterion in my spf record. Last I
looked 130.205.32.3 was contained within 130.205.32.0/20. Go figure.
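
The containment claim is easy to check mechanically. Below is a minimal sketch using Python's standard ipaddress module; the variable names are illustrative only. As an aside, the SPF mechanism for IPv4 ranges is spelled "ip4:", so if the published record literally reads "ipv4:", that spelling rather than the range might be what the receiving server objects to. The message may simply have paraphrased the record.

```python
import ipaddress

# Values taken from the message above.
server = ipaddress.ip_address("130.205.32.3")
spf_range = ipaddress.ip_network("130.205.32.0/20")

# A /20 starting at 130.205.32.0 spans 130.205.32.0 through 130.205.47.255.
in_range = server in spf_range
print(in_range, spf_range[0], spf_range[-1])
```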

IAC

I wanted to raise an issue on this list before opening a Bugzilla
ticket on it. I seem to have run into a circumstance in which
deleting a certificate from the NSS database does the wrong thing,
and the resulting confusion looks like a corrupted or bad database
(but seems to be just a poor error message).

The scenario is Openswan with NSS for the peer certificates. The host
certificates are imported through pk12util after being converted from
their OpenSSL cert and key. The peer certificates have been imported
directly using "certutil -A" since they don't have a private key.
Everything was fine until someone on the Openswan list happened to ask
why I didn't use pk12 for the peer certificates by using the -nokeys
option when creating them from openssl. So I tried that and didn't get
an error, but the import did something strange and didn't give me the
name I had specified in the openssl command. Instead of having a cert in
the database with the name I specified in creating the .p12 file, I ended
up with a cert in the database named with the E-Mail address in the
cert. Not sure where that problem is (openssl or the pk12util import).
But I went to delete that certificate and that's when the fun began.
"certutil -D -n ***@wittsend.com" ran without error but the cert
was still there. Run it again and you get this error:

[***@romulus ipsec.d]# certutil -D -n ***@wittsend.com -d .
certutil: could not find certificate named "***@wittsend.com":
security library: bad database.

That's also when I noticed I was missing at least one other cert. It
appears that the first delete removed the wrong cert, and afterwards
certutil could not find the cert that still shows up in the list under
that name. It turned out it had deleted the first .p12 cert that I had
imported and could no longer find the other cert. Looking a little
closer, it looks like certutil -D will give that same "bad database"
error any time it cannot find a named cert, so it may not actually be
corrupting the database per se, but it is deleting the wrong certificate
and then refusing to find the right certificate at all afterwards. You
can't look it up by name even though it's still there in the list.

Sequence of things I did and the results are below my signature block
with a few comments in square brackets... I figure this one is heading
for bugzilla one way or the other but wanted to hear others' thoughts on
it first.

Oh... This is on Fedora 13 with nss-util 3.12.8 as well as Fedora 14
with nss-util 3.12.9.

Regards,
Mike
--
Michael H. Warfield (AI4NB) | Desk: (404) 236-2807
Senior Researcher - X-Force | Cell: (678) 463-0932
IBM Security Services | ***@linux.vnet.ibm.com ***@wittsend.com
6303 Barfield Road | http://www.iss.net/
Atlanta, Georgia 30328 | http://www.wittsend.com/mhw/
| PGP Key: 0x674627FF
--
[***@romulus ipsec.d]# certutil -L -d .

Certificate Nickname          Trust Attributes
                              SSL,S/MIME,JAR/XPI

remus.wittsend.com            u,u,u
gorgon8.wittsend.com          ,,
romulus.wittsend.com          u,u,u
complex.wittsend.com          ,,
gorgon9.wittsend.com          ,,
wittsendCA                    C,C,C
WittsEndCA                    C,C,C
[***@romulus ipsec.d]# openssl pkcs12 -export -in certs/gorgon10.wittsend.com.crt -nokeys -name gorgon10.wittsend.com -out gorgon10.wittsend.com.p12
Enter Export Password:
Verifying - Enter Export Password:
[***@romulus ipsec.d]# pk12util -i gorgon10.wittsend.com.p12 -d .
Enter password for PKCS12 file:
pk12util: PKCS12 IMPORT SUCCESSFUL
[***@romulus ipsec.d]# certutil -L -d .

Certificate Nickname          Trust Attributes
                              SSL,S/MIME,JAR/XPI

remus.wittsend.com            u,u,u
gorgon8.wittsend.com          ,,
romulus.wittsend.com          u,u,u
complex.wittsend.com          ,,
gorgon9.wittsend.com          ,,
wittsendCA                    C,C,C
WittsEndCA                    C,C,C
***@wittsend.com              ,,
***** [^^^ Note wrong name ^^^]
[***@romulus ipsec.d]# certutil -D -n ***@wittsend.com -d .
[***@romulus ipsec.d]# certutil -L -d .

Certificate Nickname          Trust Attributes
                              SSL,S/MIME,JAR/XPI

gorgon8.wittsend.com          ,,
romulus.wittsend.com          u,u,u
complex.wittsend.com          ,,
gorgon9.wittsend.com          ,,
wittsendCA                    C,C,C
WittsEndCA                    C,C,C
***@wittsend.com              ,,
***** [Note "remus" cert is gone. The "***@wittsend.com" cert is still there!]
[***@romulus ipsec.d]# certutil -L -n ***@wittsend.com -d .
certutil: Could not find cert: ***@wittsend.com
: File not found.
***** [Oh really... It's there in the listing.]
[***@romulus ipsec.d]# certutil -D -n ***@wittsend.com -d .
certutil: could not find certificate named "***@wittsend.com": security library: bad database.
***** [That's kind of a scary message to get back that the database is bad.]
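
The comparison made by eye in the transcript above (which nicknames disappeared between two "certutil -L" runs) can also be scripted. This is a minimal sketch, assuming the listing layout shown above (nickname, whitespace, trust flags). The helper parse_listing is hypothetical and not part of NSS; the sample text is copied from the transcript.

```python
def parse_listing(text):
    """Return {nickname: trust} parsed from `certutil -L` style output."""
    certs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the two header lines.
        if not line or line.startswith("Certificate Nickname"):
            continue
        if line == "SSL,S/MIME,JAR/XPI":
            continue
        # Nicknames may contain spaces, so split off only the last field.
        parts = line.rsplit(None, 1)
        if len(parts) == 2:
            certs[parts[0]] = parts[1]
    return certs

before = """
Certificate Nickname Trust Attributes
SSL,S/MIME,JAR/XPI

remus.wittsend.com u,u,u
gorgon8.wittsend.com ,,
romulus.wittsend.com u,u,u
complex.wittsend.com ,,
gorgon9.wittsend.com ,,
wittsendCA C,C,C
WittsEndCA C,C,C
***@wittsend.com ,,
"""

after = """
Certificate Nickname Trust Attributes
SSL,S/MIME,JAR/XPI

gorgon8.wittsend.com ,,
romulus.wittsend.com u,u,u
complex.wittsend.com ,,
gorgon9.wittsend.com ,,
wittsendCA C,C,C
WittsEndCA C,C,C
***@wittsend.com ,,
"""

# Nicknames present before the delete but absent afterwards.
missing = sorted(set(parse_listing(before)) - set(parse_listing(after)))
print(missing)
```

Saving each listing to a file before and after every step would give the same diff without retyping anything.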
Nelson B Bolyard
2011-01-30 12:12:23 UTC
Michael,
Can you make available to me the cert8.db file and the "nokey" p12 files
exactly as they were before you did the fateful certutil -D step?
If so, I'm interested in trying to track this down.

I have a test for you to try that *MAY* (or may not) prove to be a
solution for you. I believe you're using cert8.db and key3.db files.
Let me suggest that you start over with cert9 and key4.db files,
which are sqlite3 NSS DB files. It's easy to do with NSS 3.12.x.
You can simply set the environment variable NSS_DEFAULT_DB_TYPE to "sql"
(without the quotes), or you can prefix "sql:" to the DB directory name
argument (usually -d) EVERYWHERE it is found, in every program that uses
the DB. I think you may find that the problems you had just go away
when using cert9.db files ... or maybe not.
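
Both ways of selecting the sqlite-backed databases can be sketched as follows. The code only builds an argument list and an environment; it does not run anything. The helper names certutil_cmd and sql_env are hypothetical; the "sql:" prefix and NSS_DEFAULT_DB_TYPE are the ones named in the paragraph above.

```python
import os

def certutil_cmd(args, db_dir, use_sql_prefix=True):
    """Build a certutil argv, optionally prefixing the DB directory with 'sql:'."""
    target = f"sql:{db_dir}" if use_sql_prefix else db_dir
    return ["certutil", *args, "-d", target]

def sql_env(base_env=None):
    """Return a copy of the environment with NSS_DEFAULT_DB_TYPE set to sql."""
    env = dict(os.environ if base_env is None else base_env)
    env["NSS_DEFAULT_DB_TYPE"] = "sql"
    return env

# Option 1: prefix the directory argument.
list_cmd = certutil_cmd(["-L"], ".")
print(list_cmd)

# Option 2: leave the directory alone and rely on the environment variable.
plain_cmd = certutil_cmd(["-L"], ".", use_sql_prefix=False)
env = sql_env({})
print(plain_cmd, env["NSS_DEFAULT_DB_TYPE"])
```

Either result could be handed to subprocess.run. Whichever option is chosen has to be applied to every program that opens the database, as the message says.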

You're absolutely right that the "bad database" errors don't necessarily,
or even usually, mean the database is corrupt. They usually mean that a read
or delete attempt failed because the record was not found, or that a write
attempt failed because a matching record WAS found and doing the
write would have created a duplicate. Most of the time, it just means
"record not found". But databases have been known to become corrupt and
when they do, those "not found" errors happen a LOT, which is how they
came to be (mis)identified as "bad database" errors.

In your case, I think you did achieve database corruption. When you get
to the state where certutil -L shows a cert with a nickname (say "server")
but certutil -L -n server says "not found: bad database", that really is
a bad database. I think the pk12util step to import the "nokeys" p12
file may have caused that corruption, and if so, then I'm very interested
in fixing it.
Post by Michael H. Warfield
I wanted to raise an issue on this list before opening a Bugzilla
ticket on it. I seem to have run into a circumstance in which
deleting a certificate from the NSS database does the wrong thing,
and the resulting confusion looks like a corrupted or bad database
(but seems to be just a poor error message).
If you file the bug now, the most accurate thing you can really say is
"cert8 DB seems corrupt after pk12util import of p12 file with no keys".

NSS requires every cert in the DB to have a nickname or email address
that it can list out in certutil -L. For cert8 DB, these are actually
stored in separate DB tables. Every cert's subject name should appear in
exactly one table, nickname or email-address. The command certutil -A
adds a record to the nickname table. Certutil -E adds a record to the
email address table. Looks like you got a cert whose subject name appears
in both tables. Bad news. When it was deleted, one of those two table
entries (the nickname) was deleted with it, and the other (email) became
an orphan.
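
The orphaned-entry scenario described above can be illustrated with a toy model. This is not NSS's actual storage code, only a sketch under stated assumptions: plain dictionaries stand in for the nickname and email tables, the subject name and email address are made up, and delete_by_nickname is a hypothetical function that removes the cert and its nickname entry only, as the paragraph describes.

```python
# Toy model only: dictionaries stand in for the cert8 tables.
certs = {"CN=gorgon10": "<cert bytes>"}        # subject -> certificate
nicknames = {"gorgon10": "CN=gorgon10"}        # nickname -> subject
emails = {"user@example.com": "CN=gorgon10"}   # email -> subject (the inconsistent extra entry)

def delete_by_nickname(nick):
    """Delete the cert and its nickname entry, leaving any email entry behind."""
    subject = nicknames.pop(nick)
    certs.pop(subject)

def orphans():
    """Index entries whose subject no longer has a certificate."""
    index = {**nicknames, **emails}
    return sorted(name for name, subject in index.items() if subject not in certs)

delete_by_nickname("gorgon10")
print(orphans())
```

An orphan of this kind would still appear in a listing built from the index tables while failing a lookup by name, which would be consistent with the symptom reported.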

pk12util has code that will create a nickname to complete a cert import.
It will do this if either (a) a cert in a p12 file has no nickname, or
(b) the cert being imported has a "name collision" with a cert already
in the DB. I can't be certain which of those applies to you without
seeing the files.

Another question is: where did the email table entry come from?
As I recall, pk12util resolves the above issues by creating a nickname,
not by creating an email record, but something created an email record.

In any case, cert9 may just clear this all up, so give it a try.
Post by Michael H. Warfield
Sequence of things I did and the results are below my signature block
with a few comments in square brackets... I figure this one is heading
for bugzilla one way or the other but wanted to hear others thoughts on
it first.
Oh... This is on Fedora 13 with nss-util 3.12.8 as well as Fedora 14
with nss-util 3.12.9.
Thanks for all the great details.
--
/Nelson Bolyard
--
dev-tech-crypto mailing list
dev-tech-***@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-crypto
Michael H. Warfield
2011-01-30 17:18:38 UTC
Warning: This message has had one or more attachments removed
Warning: (gorgon10.wittsend.com.p12).
Warning: Please read the "WittsEnd-Attachment-Warning.txt" attachment(s) for more information.

Hey hey...
Post by Nelson B Bolyard
Michael,
Can you make available to me the cert8.db file and the "nokey" p12 files
exactly as they were before you did the fateful certutil -D step?
If so, I'm interested in trying to track this down.
Attached. Did two runs. Same p12 file. One with a cert8.db and one
with a cert9.db. The other certs in the cert?.db files were imported
with pk12util if they had keys, or with this if I did not have a private
key:

certutil -A -n ${BASENAME} -t u,u,u -i ${CERT} -d ${TARGETDIR}

It was only the gorgon10 cert that I was importing using a .p12 file
just to test the methodology.
Post by Nelson B Bolyard
I have a test for you to try that *MAY* (or may not) prove to be a
solution for you. I believe you're using cert8.db and key3.db files.
Let me suggest that you start over with cert9 and key4.db files,
which are sqlite3 NSS DB files. It's easy to do with NSS 3.12.x.
You can simply set the environment variable NSS_DEFAULT_DB_TYPE to "sql"
(without the quotes), or you can prefix "sql:" to the DB directory name
argument (usually -d) EVERYWHERE it is found, in every program that uses
the DB. I think you may find that the problems you had just go away
when using cert9.db files ... or maybe not.
This got a little strange. Working with my full database of certs, I
was able to reproduce the problem with both. So, it did not fix the
problem. I then built a test setup with just a handful of certs and a
non-production test "private key" (gorgon8) and reran the tests. This
time, cert8 still failed, but cert9, while it still imported the cert
with the wrong name, seemed to delete the errant cert properly. So
that's a bit confusing, but I don't believe it's behaving properly
either.

If you want, I can also supply you with the cert?.db files from the full
run.
--
Michael H. Warfield (AI4NB) | (770) 985-6132 | ***@WittsEnd.com
/\/\|=mhw=|\/\/ | (678) 463-0932 | http://www.wittsend.com/mhw/
NIC whois: MHW9 | An optimist believes we live in the best of all
PGP Key: 0x674627FF | possible worlds. A pessimist is sure of it!
Nelson B Bolyard
2011-02-13 06:33:49 UTC
Post by Michael H. Warfield
[...] Instead of having a cert in the
database with the name I specified in creating the .p12 file, I ended up
with a cert in the database with the name of the E-Mail address in the
cert. Not sure where that problem is (openssl or the pk12util import).
But, I went to delete that certificate and that's when the fun began.
security library: bad database.
That's also when I noticed I was missing at least one other cert.
I was unable to reproduce any of this with the cert DB you sent me.
Before I deleted the cert with that command above, the cert DB was OK,
not corrupted, and after I deleted it, it was also OK. The cert I had
specified, and its nickname record AND its email record were all deleted
from the DB, leaving it in a consistent state. A second delete attempt
produced the same error message you saw, but didn't modify the DB at all.
I tried with both certutil and libs from NSS 3.11.latest and 3.12.latest
and got the same results both ways.

I have these thoughts about the different behaviors that you and I
experienced.

1) Maybe you had another program that was also holding the DB files open
at the same time you did the certutil -D command.

2) IINM, you had the private key for some certs in your key3.db by virtue
of having used pk12util to import one or more, and I didn't. That might
have made a difference.

3) It's possible that the original cert DB you had was in some state of
corruption, and the cert DB you reconstructed for my testing was not
corrupted.

Unless and until I can reproduce the behavior you saw, I won't be of much
help in resolving it. Sorry. :-/
--
/Nelson Bolyard
Michael H. Warfield
2011-03-13 05:21:57 UTC
Hey,

I've been massively distracted by other projects, so I'm way behind on
this issue...
Post by Nelson B Bolyard
[...] Instead of having a cert in the
database with the name I specified in creating the .p12 file, I ended up
with a cert in the database with the name of the E-Mail address in the
cert. Not sure where that problem is (openssl or the pk12util import).
But, I went to delete that certificate and that's when the fun began.
security library: bad database.
That's also when I noticed I was missing at least one other cert.
I was unable to reproduce any of this with the cert DB you sent me.
Before I deleted the cert with that command above, the cert DB was OK,
not corrupted, and after I deleted it, it was also OK. The cert I had
specified, and its nickname record AND its email record were all deleted
from the DB, leaving it in a consistent state. A second delete attempt
produced the same error message you saw, but didn't modify the DB at all.
I tried with both certutil and libs from NSS 3.11.latest and 3.12.latest
and got the same results both ways.
I have these thoughts about the different behaviors that you and I
experienced.
1) Maybe you had another program that was also holding the DB files open
at the same time you did the certutil -D command.
Nope. Absolutely not. This was done in an isolated test directory
which nothing else even referenced.
Post by Nelson B Bolyard
2) IINM, you had the private key for some certs in your key3.db by virtue
of having used pk12util to import one or more, and I didn't. That might
have made a difference.
Ah... Now we may have something here. I'm doing a pure import.
Nothing original was created in the NSS database. Everything was
imported into it. Take that as a premise. It is fundamental. Start
with a database and import everything. Create nothing new within NSS
itself.

This is, unfortunately, a result of Fedora's decision to convert
Openswan to use NSS. It's caused a number of people a great deal of
grief. People are trying to import existing configurations into NSS and
the documentation, quite frankly, doesn't work in many cases. People
with established X.509 cert setups that have been working for years
don't want to recreate everything from scratch. It's unfortunate that
this was poorly planned, with abysmal documentation and no concrete
upgrade / transition mechanism in place before the switch.
Post by Nelson B Bolyard
3) It's possible that the original cert DB you had was in some state of
corruption, and the cert DB you reconstructed for my testing was not
corrupted.
No. I took snapshots of the database as I performed those actions. You
literally got what I had at each step of the process. Nothing was a
reconstruction. I'm a practiced diagnostician and forensic
investigator.
Post by Nelson B Bolyard
Unless and until I can reproduce the behavior you saw, I won't be of much
help in resolving it. Sorry. :-/
Sounds like I need to drop back a couple of steps then and give you the
whole process, step by step, from the very beginning of the creation of
the database to the corruption. This I had not done. I'll go back and
set those scenarios back up and reproduce them from the "mkdir" forward
and provide you with the full import data (including the creation of the
pk12 files using openssl). It may take some time. My schedule is full
and I tend to get distracted.
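
A from-scratch reproduction of the kind described above might be scripted roughly as follows. This is a sketch under assumptions: the directory name, the peer cert entry, and the nickname passed to the delete are placeholders, and "certutil -N" (create a new database, which prompts for a password) is the only command here that does not already appear in the thread. The script only prints the commands unless run=True is passed.

```python
import subprocess

def repro_commands(db_dir, peer_certs, cert_pem, name, p12_file, bad_nick):
    """Return the command sequence from database creation to the failing delete."""
    cmds = [
        ["mkdir", "-p", db_dir],
        ["certutil", "-N", "-d", db_dir],  # create an empty DB (interactive password prompt)
    ]
    # Peer certs without private keys, imported as in the earlier message.
    for nick, cert_file in peer_certs:
        cmds.append(["certutil", "-A", "-n", nick, "-t", "u,u,u",
                     "-i", cert_file, "-d", db_dir])
    cmds += [
        ["openssl", "pkcs12", "-export", "-in", cert_pem, "-nokeys",
         "-name", name, "-out", p12_file],
        ["pk12util", "-i", p12_file, "-d", db_dir],
        ["certutil", "-L", "-d", db_dir],  # listing before the delete
        ["certutil", "-D", "-n", bad_nick, "-d", db_dir],
        ["certutil", "-L", "-d", db_dir],  # listing after the delete
    ]
    return cmds

def replay(commands, run=False):
    """Print each command; execute it only when run=True."""
    for cmd in commands:
        print(" ".join(cmd))
        if run:
            # check=False so the later listings still run after a failure.
            subprocess.run(cmd, check=False)

steps = repro_commands(
    "testdb",
    [("gorgon8.wittsend.com", "certs/gorgon8.wittsend.com.crt")],  # placeholder path
    "certs/gorgon10.wittsend.com.crt",
    "gorgon10.wittsend.com",
    "gorgon10.wittsend.com.p12",
    "NICKNAME-SHOWN-BY-LISTING",  # placeholder for the name certutil -L reports
)
replay(steps)
```

Copying the database directory after each step, as was done for the earlier snapshots, would let someone else pick up from any point in the sequence.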
Post by Nelson B Bolyard
--
/Nelson Bolyard
Regards,
Mike
--
Michael H. Warfield (AI4NB) | (770) 985-6132 | ***@WittsEnd.com
/\/\|=mhw=|\/\/ | (678) 463-0932 | http://www.wittsend.com/mhw/
NIC whois: MHW9 | An optimist believes we live in the best of all
PGP Key: 0x674627FF | possible worlds. A pessimist is sure of it!