Azure IoT Device SDKs with x509 certificates backed by HSM
More and more folks are becoming concerned about the security of their IoT devices. As a result, many consider using hardware security modules or chips such as TPM 2.0 to keep the device identity secure. One of the recommended approaches for IoT device security is x509 certificate-based mutual authentication with private keys backed by such a hardware security module or chip.
Both Azure IoT Hub and DPS support mTLS with x509 certificates as one of the authentication schemes. The device side is where it gets more complex. The variety of OSes, HSMs and programming languages makes HSM integration a non-trivial task. In the following sections I’ll use the C# and JavaScript IoT device SDKs with TPM 2.0 on Linux to demonstrate the concept. There are also some C SDK samples. The concept can definitely be applied to SDKs in other programming languages as well as to other HSMs on Linux. I haven’t done much investigation on Windows yet.
The Concept
In a nutshell: OpenSSL for the win! OpenSSL supports so-called dynamic engines:
“Typically engines are dynamically loadable modules that are registered with libcrypto and use the available hooks to provide cryptographic algorithm implementations. Usually these are alternative implementations of algorithms already provided by libcrypto (e.g. to enable hardware acceleration of the algorithm), but they may also include algorithms not implemented in default OpenSSL (e.g. the GOST engine implements the GOST algorithm family). Some engines are provided as part of the OpenSSL distribution, and some are provided by external third parties (again, GOST).”
This basically means that there is no need to interact with an HSM directly. All cryptographic operations can be delegated to OpenSSL, provided that the programming language and runtime in use can interact with OpenSSL. You will find samples using TPM 2.0 as secure storage with Azure IoT Device SDKs in my GitHub repo.
These samples use the PKCS#11 engine to interact with the TPM 2.0 chip. The GitHub repo contains more details on how to set up a TPM 2.0 chip with PKCS#11. It is of course possible to use other OpenSSL engines as well.
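With the PKCS#11 engine, a key is typically referenced by a PKCS#11 URI (RFC 7512) rather than by a file path. Here is a minimal sketch of composing such a URI in Node.js. The helper name `buildPkcs11Uri`, the token label `iot-tpm` and the object label `device-key` are hypothetical placeholders of mine, not values taken from the samples:

```javascript
'use strict';

// Builds a PKCS#11 URI (RFC 7512) that identifies a private key on a token.
// All labels passed in are illustrative; use the ones you chose when
// provisioning your TPM.
function buildPkcs11Uri({ token, object, type = 'private', pin }) {
  const path = [
    ['token', token],
    ['object', object],
    ['type', type],
  ]
    .filter(([, value]) => value !== undefined)
    .map(([name, value]) => `${name}=${encodeURIComponent(value)}`)
    .join(';');

  // The PIN is optional and goes into the query part of the URI.
  const query = pin !== undefined ? `?pin-value=${encodeURIComponent(pin)}` : '';
  return `pkcs11:${path}${query}`;
}

const keyUri = buildPkcs11Uri({ token: 'iot-tpm', object: 'device-key' });
console.log(keyUri);
```

The resulting string is what gets handed to the engine as the key identifier, so the private key itself never shows up in application code or configuration.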

Looking at the code of the different SDK samples, you will discover that the implementation of the concept varies depending on the capabilities of the programming language and the runtime in use.
In the C# sample there was no need to touch the SDK source code itself. It was just a matter of initializing the OpenSSL engine and using it with an X509Certificate2 instance to load the private key handle from the TPM.
In the Node.js sample I actually copy-pasted the implementation of MqttBase from the original SDK and changed it so that it respects the OpenSSL engine settings instead of expecting the actual private key as a string.
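The core of that idea can be sketched independently of the SDK. The Node.js `tls` module accepts the `privateKeyEngine` and `privateKeyIdentifier` options in place of `key`, so the private key never has to be read into the process. The helper name `buildEngineTlsOptions`, the engine name, the certificate placeholder and the key URI below are assumptions for illustration, not code from the SDK or from the samples. Note also that OpenSSL 3 deprecates the engine API, so check whether your runtime still supports these options:

```javascript
'use strict';

// Hypothetical helper: builds TLS options that point at a key held by an
// OpenSSL engine instead of embedding the private key as a PEM string.
function buildEngineTlsOptions({ certPem, engine, keyIdentifier }) {
  if (!certPem || !engine || !keyIdentifier) {
    throw new Error('certPem, engine and keyIdentifier are all required');
  }
  return {
    cert: certPem,                       // public certificate, not secret
    privateKeyEngine: engine,            // OpenSSL engine name, e.g. "pkcs11"
    privateKeyIdentifier: keyIdentifier, // engine-specific key reference
    // Deliberately no `key` property: the key stays inside the TPM.
  };
}

const tlsOptions = buildEngineTlsOptions({
  // Placeholder only; a real device certificate in PEM format goes here.
  certPem: '-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n',
  engine: 'pkcs11',
  // Assumed token and object labels.
  keyIdentifier: 'pkcs11:token=iot-tpm;object=device-key;type=private',
});

console.log(Object.keys(tlsOptions));
```

These options would then be passed on to `tls.connect()`, or to whatever MQTT client the transport uses, together with the hub's host name. That step needs a provisioned TPM and an installed engine, so it is left out here.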
Conclusion
There seem to be a number of moving parts, such as the OS (Linux vs. Windows), the programming language and runtime, and the actual implementation of the Azure IoT Device SDKs, all of which greatly impact the amount of effort required to implement this concept (if it is possible at all). It may or may not be feasible to do so. In one of the next blog posts I will introduce another approach to solving this challenge.