Question about SHA1 with crypto - need help trying to code the algo defined in FIPS 186-2 change 1 document section 3.1

riccardomanfrin · December 6, 2024, 2:19am

Is there something equivalent to this in :crypto?

I’m particularly concerned about the magical sha1_transform defined here, where they seem to hack the initial SHA1 context state by loading with those magical numbers coming from the below references… I’m so tempted to make a nif to just not play their game…

TLDR

I’m trying to code the algo defined in FIPS 186-2 change 1 document section 3.1

In the doc, G(t,c) is said to be defined in sect. 3.3, where they say (quote):

G(t,c) may be constructed using steps (a) - (e) in section 7 of the Specifications for the Secure Hash Standard

Aside from a long list of mistakes and missing things (like a browsable ToC), the document doesn’t even seem to provide a proper list of references at the end. The closest I found appears to be NIST-FIPS-180-4, although I can’t find the above steps in it, and its section 7… well, is rather short.

vkatsuba · December 7, 2024, 8:14am

Hey!

This is an interesting challenge! The algorithm in question, based on FIPS 186-2 Change 1, does involve intricate steps that reference earlier cryptographic standards in a somewhat convoluted way.

Regarding your concern about the sha1_transform function and its context hacking: Erlang’s crypto module doesn’t expose low-level internals like modifying SHA-1’s initial state directly. The crypto module focuses on providing high-level APIs for cryptographic functions, which makes it unsuitable for implementing the exact behavior described in your reference.

If you’re determined to replicate the precise steps involving the “magical” state manipulation, a NIF (Native Implemented Function) is probably your best bet. This allows you to write the critical cryptographic logic in C, leveraging libraries like OpenSSL, and then interface it with Erlang. It’s a bit of extra work but gives you full control over how the algorithm is implemented while maintaining high performance.

Regarding the references in the FIPS documents: yes, navigating these standards can be frustrating due to their lack of clarity and of consistent references. FIPS 180-4 defines the Secure Hash Standard (SHS) and lays out the SHA family algorithms in detail, but it doesn’t contain the steps (a) - (e) of section 7 that FIPS 186-2 Change 1 points to. You may need to piece together the process by combining the FIPS 180-4 specifications with the implementation details in the code you’re studying, such as the wpa_supplicant implementation.
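
As a rough illustration of that piecing-together, here is a minimal, self-contained C sketch of G(t,c). It assumes (from my reading of FIPS 186-2 section 3.3, so treat it as unverified) that t supplies the initial H0..H4, that c is zero-padded on the right to one 512-bit block, and that steps (a) - (e) are the single-block SHA-1 compression steps as they were lettered in the older FIPS 180-1 edition of the Secure Hash Standard. The names `sha1_compress` and `fips186_g` are made up for this example and are not part of OpenSSL or wpa_supplicant.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32 - n));
}

/* One SHA-1 compression over a single 64-byte block.
 * state holds H0..H4 on entry and the updated H0..H4 on exit.
 * No padding and no length field are added here. */
static void sha1_compress(uint32_t state[5], const uint8_t block[64])
{
    uint32_t w[80];

    /* (a) split the block into sixteen big-endian 32-bit words */
    for (int t = 0; t < 16; t++)
        w[t] = ((uint32_t)block[4 * t] << 24) |
               ((uint32_t)block[4 * t + 1] << 16) |
               ((uint32_t)block[4 * t + 2] << 8) |
               (uint32_t)block[4 * t + 3];

    /* (b) expand the sixteen words to eighty */
    for (int t = 16; t < 80; t++)
        w[t] = rotl32(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1);

    /* (c) load the working variables from the current state */
    uint32_t a = state[0], b = state[1], c = state[2];
    uint32_t d = state[3], e = state[4];

    /* (d) eighty mixing steps, with f and K changing every twenty */
    for (int t = 0; t < 80; t++) {
        uint32_t f, k;
        if (t < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999u; }
        else if (t < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1u; }
        else if (t < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDCu; }
        else             { f = b ^ c ^ d;                   k = 0xCA62C1D6u; }
        uint32_t tmp = rotl32(a, 5) + f + e + k + w[t];
        e = d; d = c; c = rotl32(b, 30); b = a; a = tmp;
    }

    /* (e) add the working variables back into the state */
    state[0] += a; state[1] += b; state[2] += c;
    state[3] += d; state[4] += e;
}

/* Hypothetical G(t, c): t is the 5-word initial state, c is c_len bytes
 * (at most 64) that get zero-padded on the right to one full block.
 * ASSUMPTION: this padding rule is my reading of FIPS 186-2 sect. 3.3. */
static void fips186_g(const uint32_t t[5], const uint8_t *c, size_t c_len,
                      uint32_t out[5])
{
    uint8_t block[64] = {0};

    assert(c_len <= sizeof(block));
    memcpy(block, c, c_len);
    memcpy(out, t, 5 * sizeof(uint32_t));
    sha1_compress(out, block);
}
```

A handy sanity check: passing the standard SHA-1 initial values as t and the already-padded one-block message "abc" as c should reproduce the well-known SHA-1("abc") digest, since that is exactly one compression with no extra padding step.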

If you’re finding the references too ambiguous, you might consider reaching out to NIST for clarification or consulting existing cryptographic libraries (like OpenSSL) to see how they’ve implemented this functionality. Often, their implementations provide practical interpretations of these standards.

Let us know if you decide to go the NIF route or find a way to achieve this purely with existing libraries - it would be great to hear how you tackle this!

riccardomanfrin · December 8, 2024, 9:15pm

Thanks for the interest in my “journey” so to call it…

In the end I came to the conclusion that a NIF is the fastest and safest way to approach it, and I have it running all right and for good.

Since you took the effort to share the pain with me, I’m giving away a couple of “gems” I might have found along the path

Calling SHA1_Init ACTUALLY does load h0, h1, etc. with the exact values from the wpa_supplicant code, so I can at least avoid the memcpy hack.
The very few man pages I found for SHA1_Transform say (quote):

The SHA1Transform() function is used by SHA1Update() to hash 512-bit blocks and forms the core of the algorithm. Most programs should use the interface provided by SHA1Init(), SHA1Update() and SHA1Final() instead of calling SHA1Transform() directly.

In other words, I can at least avoid SHA1_Transform (which takes a fixed 64-byte input block) and use the more commonly available SHA1_Update.

While I can leverage SHA1_Init, what I cannot use is SHA1_Final.
I’ve written a few tests to verify this:

	unsigned char data[64]; // unsigned, to match the OpenSSL SHA1_* prototypes
	unsigned char out[20];
	memset(out, 0, 20);
	memset(data, 42, 64);

	SHA_CTX context;
	SHA1_Init(&context);
	SHA1_Update(&context, data, 64); // here Nl=0x200
	SHA1_Final(out, &context);
	
	printf("\nUpdate+Final");
	for (int i = 0; i < 20; i++) printf("0x%02X ", (uint8_t) out[i]);

	memset(out, 0, 20);
	memset(data, 42, 64);

	SHA1_Init(&context);
	SHA1_Transform(&context, data); // here Nl=0x0
	SHA1_Final(out, &context);

	printf("\nTransform+Final");
	for (int i = 0; i < 20; i++) printf("0x%02X ", (uint8_t) out[i]);

	memset(out, 0, 20);
	memset(data, 42, 64);

	SHA1_Init(&context);
	SHA1_Transform(&context, data); // here Nl=0x0
	memcpy(out, &context.h0, 20);

	printf("\nTransform+plain memcpy");
	for (int i = 0; i < 20; i++) printf("0x%02X ", (uint8_t) out[i]);

	memset(out, 0, 20);
	memset(data, 42, 64);

	SHA1_Init(&context);
	SHA1_Update(&context, data, 64); // here Nl=0x200
	memcpy(out, &context.h0, 20);

	printf("\nUpdate+plain memcpy");
	for (int i = 0; i < 20; i++) printf("0x%02X ", (uint8_t) out[i]);

and got these results

Update+Final0xFE 0x5E 0x8D 0x0F 0x87 0x8E 0x68 0xF7 0x53 0x87 0xFE 0x94 0x77 0xC3 0x82 0x77 0x23 0x52 0xB0 0x1C 
Transform+Final0xE1 0x95 0x64 0xDB 0xEE 0xC6 0x2C 0x25 0xCB 0x15 0xAA 0x84 0xF7 0x52 0x35 0x33 0x86 0x86 0x3E 0xB9 
Transform+plain memcpy0x81 0x71 0x1E 0x1C 0x5F 0xE0 0x67 0x60 0xF7 0x1C 0x6E 0xD4 0x52 0x3C 0x44 0x71 0x31 0xAD 0x27 0x39 
Update+plain memcpy0x81 0x71 0x1E 0x1C 0x5F 0xE0 0x67 0x60 0xF7 0x1C 0x6E 0xD4 0x52 0x3C 0x44 0x71 0x31 0xAD 0x27 0x39

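So in the end, the only thing I actually cannot do with the regular API (and the reason I need the NIF) is access the SHA1 context state and use it as the output: SHA1_Final always processes the padding and length first, so its result never matches the raw state.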
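
One detail about the "plain memcpy" results: memcpy copies the five state words in host byte order, so on a little-endian machine each word comes out byte-reversed compared to how SHA-1 values are normally written. Below is a minimal sketch of serializing the state most-significant byte first instead. It assumes big-endian output is what the consumer of the NIF expects, and `state_to_be_bytes` is a hypothetical helper name, not an OpenSSL function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write H0..H4 as 20 bytes, most significant byte
 * of each word first, regardless of the host's endianness.
 * ASSUMPTION: the caller wants big-endian output. */
static void state_to_be_bytes(const uint32_t state[5], uint8_t out[20])
{
    assert(state != NULL && out != NULL);

    for (int i = 0; i < 5; i++) {
        out[4 * i]     = (uint8_t)(state[i] >> 24);
        out[4 * i + 1] = (uint8_t)(state[i] >> 16);
        out[4 * i + 2] = (uint8_t)(state[i] >> 8);
        out[4 * i + 3] = (uint8_t)(state[i]);
    }
}
```

If the tests ran on a little-endian host, the leading `0x81 0x71 0x1E 0x1C` corresponds to h0 = 0x1C1E7181, which this helper would emit as `0x1C 0x1E 0x71 0x81`.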

vkatsuba · December 9, 2024, 5:46am

Thanks for sharing your insights and detailed findings! It’s always fascinating to see such a thorough investigation and documentation of the intricacies of a fundamental algorithm. The clarity of your comparisons and results highlights exactly where the constraints lie and why the NIF becomes essential for accessing the SHA1 context state.

Your observation about avoiding the memcpy hack by leveraging SHA1_Init is a particularly useful tip, and it’s great to see how this simplifies the process. The distinction between using SHA1_Update and SHA1_Transform is also enlightening, especially the way Nl is handled in each case.

This detailed exploration is not only impressive but also incredibly helpful for anyone navigating similar challenges. Thanks again for generously sharing these “gems” - they’re bound to be valuable to others in the community. Looking forward to hearing more about your work!