Ssl_handshake code gives insufficient feedback

I changed a configuration in my running Elixir system. I’m now getting a notice (which really seems more like an error or at least warning):

16:52:14.545 [notice] ssl_handshake.path_validation_alert/1:2076 TLS :client: In state :wait_cert_cr at ssl_handshake.erl:2076 generated CLIENT ALERT: Fatal - Handshake Failure
 - {:bad_cert, :hostname_check_failed}

It would be incredibly more helpful if the message indicated the hostname that was causing this problem. Telling me that a hostname check failed, without telling me which hostname, is very unhelpful. I don’t know the typical response codes in Erlang, but somehow including the hostname with the problem is pretty much essential.

I’ve seen similar problems in other ssl issues. Please print more data. Or tell me how to turn on a debug flag that would tell me the host name. (Please be specific - I’m a new Elixir guy and know next to nothing about Erlang.)

I have a similar problem in my production logs, which periodically contain groups of entries like this:
Jul 02 19:23:19.669 [warn] Description: 'Authenticity is not established by certificate path validation' Reason: 'Option {verify, verify_peer} and cacertfile/cacerts is missing'
This log also fails to tell me anything about the client or certificate with the issue, so I’ve been unable in over a year to find this problem. An error message that simply says “this error occurred” without giving any context, isn’t worth printing. How do I get these errors improved?

I’m on OTP 24.3 if anyone can tell me that a higher version has improved things.

We can look into including the Server Name Indication in the hostname check failed message as an enhancement. If no Server Name Indication is supplied to the client, it defaults to the first argument to ssl:connect. This is checked against the information provided in the server's certificate.

The warning is no longer part of OTP-26, as you will instead get an error if you do not provide any trusted certs to verify the connection with. If you still want to connect without verifying the server's certificates, you can specify the option {verify, verify_none}.

Before that, the warning message was reworded to be clearer for people not so familiar with TLS: “Server authenticity is not verified since certificate path validation is not enabled. The option {verify, verify_peer} and one of the options ‘cacertfile’ or ‘cacerts’ are required to enable this.”
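For Elixir readers, here is a minimal sketch of what passing those options might look like. The module name `TlsOpts`, the function `verified/2`, and the CA bundle path in the usage comment are hypothetical and only for illustration; the option names (`verify`, `cacertfile`, `server_name_indication`) are the ones discussed above. It assumes you have a CA bundle file on disk, which should also work on OTP 24.3.

```elixir
defmodule TlsOpts do
  # Hypothetical helper, not part of OTP or of any library.
  # Builds client options that enable certificate path validation.
  def verified(host, cacertfile) when is_binary(host) and is_binary(cacertfile) do
    [
      # Validate the server's certificate chain against the trusted certs.
      verify: :verify_peer,
      # Trusted CA certificates; without this (or :cacerts) the warning appears.
      cacertfile: String.to_charlist(cacertfile),
      # Name checked against the server's certificate. If omitted, the
      # first argument to :ssl.connect is used instead.
      server_name_indication: String.to_charlist(host)
    ]
  end
end

# Usage (opens a real network connection, so it is left commented out;
# the bundle path is an assumption and differs between systems):
# opts = TlsOpts.verified("example.com", "/etc/ssl/certs/ca-certificates.crt")
# :ssl.connect(~c"example.com", 443, opts, 5_000)
```

If the connection is made by a dependency rather than your own code, these are the options to look for in whatever configuration that dependency forwards to `:ssl.connect`.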

I’d love such an enhancement. I have an application which I would judge quite complicated with many potential connections. If the error messages about connection issues are all just prepared text with no contextual parameters it is literally impossible to figure out where the problem is.

Possible solutions could include phrases like
Fatal - Handshake failure for your_server
or
Authenticity of your_server is not established...
These additions to the error text would at least give us a clue about what server to try connecting with.

If possible please let me know if these get incorporated in an upp-issue, and which issue that is.

Thanks.

ssl: Enhance error message by IngelaAndin · Pull Request #8714 · erlang/otp (github.com)

The warning message that you said was clarified to people not so familiar with TLS:
09:56:55.407 [warning] ssl.maybe_client_warn_no_verify/2:2918 Description: 'Server authenticity is not verified since certificate path validation is not enabled' Reason: 'The option {verify, verify_peer} and one of the options \'cacertfile\' or \'cacerts\' are required to enable this.'
doesn’t actually clarify anything for me. In my own chat app right now I’m getting this error message in a single channel and not in others, and I’m not at all sure why. It might be helpful if this error was also enhanced to read “Server authenticity for your_server is not verified...”. It might also help in locating the problem if the options passed were listed. An option to enable a debug feature or two might also help; a stacktrace, for instance, would help a lot here. Chances are that this problem is buried in an open source helper package somewhere, and to trace through that borrowed but unknown code I need all the help I can get.

Thanks for the first fix in any case.

The error you’re getting states that {verify, verify_peer} and either cacertfile or cacerts config values must be passed to do validation. It states that validation was disabled because not all the options are present.

To figure it out you would therefore need to look at the options you pass when establishing connections and look whether you specify verify (you likely do), and then what certificate bundle is passed through to cover them (this is likely not configured).

The new information added in the PR is specifically reported when the hostname check fails because of a host mismatch, but your specific failure did not get that far because there was no set of certificates with which to compare the remote end’s certificate. But the error message as it stands explains that fields are missing and what they might be.

What I’m trying to figure out is that when I connect to one chat channel it gives this error and another chat channel it doesn’t. Which means that something is different for one specific host and it would be helpful to know the host name that is being connected to with the wrong options. Most of the connection logic is in open source dependencies, so I’m not familiar with that code, and trying to find configuration differences needs some more pointers to the data at my level, like a hostname, or the configured options that I’m using incorrectly.

It does not matter which host is to be authenticated if you do not provide trusted certs to perform the authentication with. The chat channels will of course supply arguments to the Erlang/OTP ssl library, both the host to connect to and the verification options. So if it works for one chat channel and not another, you need to look at the arguments each chat channel supplies to our library functions to understand why one configuration works and not the other.

What I’m suggesting is that the error message could help in that debugging effort if it gave the parameters that are being used in the error message, so that I would have something to look for in my code. Printing an error message that is completely static, without any context, is almost never useful. I’ve spent much of my time in support adding context to such uninformative messages. Would love some help with this one.

You can always try tracing, it is great for debugging.

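%% Start the default tracer, which prints trace messages to the shell.
dbg:tracer().
%% Enable call tracing in all processes.
dbg:p(all, [call]).
%% Trace every ssl:connect call: caller, arguments, and return value or exception.
dbg:tpl(ssl, connect, cx).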

Try your channel that is failing.
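Since the question came from the Elixir side, the same tracing session can be sketched in IEx syntax. This is a direct translation of the three Erlang calls above, not anything extra. It assumes `dbg` (part of the `runtime_tools` application) is available on the node, and that you run it in an IEx shell attached to the node making the failing connection. The variable name `tpl_result` is only there to make the return value visible.

```elixir
# Start the default tracer process; trace output is printed to the shell.
:dbg.tracer()

# Turn on call tracing for all processes.
:dbg.p(:all, [:call])

# Trace all calls to :ssl.connect (any arity). The :cx shorthand adds the
# calling function and the return value or exception to each trace message.
tpl_result = :dbg.tpl(:ssl, :connect, :cx)
```

Each traced call should show the arguments passed to `:ssl.connect`, including the host and the option list, which is the context missing from the warning. When you are done, `:dbg.stop()` stops the tracer.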
