Response to Recent Commentary on the CrossBar SPU

Clarifying design choices, security context, and mitigation strategies for CrossBar SPU

Mark Davis

4/27/2026, 8 min read

Recent commentary has raised questions about aspects of the CrossBar SPU design. We welcome this discussion. In this article, we clarify key points, provide technical context, and outline relevant security considerations and mitigations.

We were recently made aware of commentary on the AES block on the CrossBar SPU, posted here.

We wanted to publish a specific response here, and we will also be publishing more detail on our Open Source work in general.

The circuit discussed isn’t actually used. If it were used, there are multiple mitigations that could be invoked. But we agree it would be better if it had more features, including DOM. For this and other reasons, we also like OpenTitan, and you’ll be seeing some news on that shortly.

In Brief

First, we would like to thank the author for reading and making comment. Community examination only works when there is actually a community that actually examines. We have some distance to go to build an effective community, so we want to thank and acknowledge the author for being a pioneering examiner and commenter.

Second, we would like to acknowledge the author’s point that CrossBar has not yet effectively engaged the Open Source community directly. This is true; we are late to engage, for various reasons. Open Source was not initially our main selling point. Also, Bunnie’s community (Baochip) is quite active on a certain set of topics, and we were happy to let Bunnie manage this. But Open Source is an increasingly important part of our “reason for existence”, so we will be rolling out more direct community engagement. We recently released a software package for the ARM as Open Source. We will be adding more to this, including the ability to order several Open Source devkits.

We continue to collaborate very closely with Bunnie, and fully support his community. But certain topics fall upon us, and we need to answer them directly.

Semiconductor masks are expensive, but gates are not. So this favors putting items on the chip even if we aren’t sure what we are going to do with them. Better to have it and not need it, than need it and not have it.

We were aware this was a pretty bare-bones AES implementation, but it was already available, so we included it.

The original purpose of having HW-based AES was for XIP (execute in place) from external memory. However, this wasn’t a core feature with a clear market use case, and it proved to be complicated (not the encryption, but the memory controller part). So XIP from external memory (with or without encryption) was cut from the SPU product features. As such, the HW AES block doesn’t have a clear purpose, and it is not currently used by any application on the PHSM8.

No actual attack has been demonstrated, but even if one accepts hypothetically that such an attack could be demonstrated, it would be an over-generalization to say that the chip is “not secure” and that the PHSM8 should not be used. There are two main reasons:

This block’s presence on the die doesn’t affect any use case that doesn’t invoke it (and no present use case does).
Extracting information from repetitive power measurements is only applicable to some use cases.

In short, even a demonstrated attack would only apply to some situations. And Open Source enables a developer to be more informed about what functions to use in which situation.

Despite the fact that this circuit is not used, it is worth discussing the technical points anyway. This kind of discussion is key to the process of Open Source security (more on that later).

Yes, it is true that the basic S-box masking we used has theoretical vulnerabilities to “glitches.” But that is in theory, and practice can be affected by many things.

First, note that the S-box masking on the SPU itself uses a 229-bit LFSR DRNG that can be reseeded at any time. This is an improvement, but we recognize it doesn’t fully address the main issue raised. But there are other layers of defense.
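As a rough illustration of what a reseedable LFSR DRNG looks like, here is a minimal Python sketch. It is not the SPU's RTL: the 16-bit width, the tap positions (a well-known maximal-length textbook polynomial), and the names ToyLfsr, reseed, next_bits, and period are all assumptions made for illustration. The real generator is 229 bits wide, and its taps are not reproduced here.

```python
class ToyLfsr:
    """Toy 16-bit Fibonacci LFSR (illustrative only, not the SPU's 229-bit design)."""

    WIDTH = 16
    # Right-shift amounts for the feedback taps of x^16 + x^14 + x^13 + x^11 + 1.
    TAP_SHIFTS = (0, 2, 3, 5)

    def __init__(self, seed: int) -> None:
        self.reseed(seed)

    def reseed(self, seed: int) -> None:
        """Load a new state, as software could do at any time."""
        state = seed & ((1 << self.WIDTH) - 1)
        if state == 0:
            raise ValueError("an all-zero state would lock the LFSR")
        self.state = state

    def next_bit(self) -> int:
        """Advance one step and return the bit that was shifted out."""
        out = self.state & 1
        feedback = 0
        for shift in self.TAP_SHIFTS:
            feedback ^= (self.state >> shift) & 1
        self.state = (self.state >> 1) | (feedback << (self.WIDTH - 1))
        return out

    def next_bits(self, count: int) -> int:
        """Collect `count` output bits into an integer, e.g. a fresh mask value."""
        value = 0
        for _ in range(count):
            value = (value << 1) | self.next_bit()
        return value


def period(seed: int) -> int:
    """Count the steps until the LFSR state returns to its starting value."""
    lfsr = ToyLfsr(seed)
    start = lfsr.state
    steps = 0
    while True:
        lfsr.next_bit()
        steps += 1
        if lfsr.state == start:
            return steps
```

Because the generator is deterministic, the same seed always reproduces the same mask stream. That is why the ability to reseed from software matters: it lets an application break up long runs drawn from a single starting state.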

Another mitigation is “chaffing”, which Bunnie posted about several times on his Discord and GitHub; an example is here.

Another defense CrossBar SPU has against this kind of attack in general (on all blocks, and whether the traces are collected by EM or power measurement) is the ability to slightly jitter clocks. This is not in the AES block itself; it is in the clock tree. You can find it at /modules/sysctrl/rtl/sysctrl.sv in the RTL.

This is a 59-bit LFSR DRNG that runs at the rate of fclk (e.g. 800 MHz) and can swallow cycles at a programmable rate, from 1/1 to 1/16. AES is in the 200 MHz domain, so its clock cycle will last anywhere from 4 to (for example) 16 or more fclk cycles. Software can also reseed this DRNG at any time. So mitigations could include changing the rate randomly and reseeding it periodically.
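To make the effect of cycle swallowing more concrete, here is a small Python model. It is a sketch under stated assumptions, not a description of sysctrl.sv: we assume each fclk tick is independently swallowed with some probability, that the AES clock is the gated fclk divided by 4 (800 MHz to 200 MHz), and we use Python's random module as a stand-in for the 59-bit LFSR. The function name aes_cycle_lengths and its parameters are hypothetical.

```python
import random


def aes_cycle_lengths(n_cycles: int, swallow_prob: float,
                      divide: int = 4, seed: int = 1) -> list:
    """Return the length, in fclk ticks, of each simulated AES clock cycle.

    Assumed model: every fclk tick is swallowed with probability swallow_prob,
    and one AES cycle completes after `divide` ticks pass through the gate.
    """
    if not 0.0 <= swallow_prob < 1.0:
        raise ValueError("swallow_prob must be in [0, 1)")
    rng = random.Random(seed)  # stand-in for the hardware LFSR
    lengths = []
    for _ in range(n_cycles):
        ticks = 0    # fclk ticks elapsed during this AES cycle
        passed = 0   # ticks that were not swallowed
        while passed < divide:
            ticks += 1
            if rng.random() >= swallow_prob:
                passed += 1
        lengths.append(ticks)
    return lengths


if __name__ == "__main__":
    lengths = aes_cycle_lengths(n_cycles=10, swallow_prob=0.25)
    print(lengths)  # every entry is at least 4; swallowed ticks stretch some cycles
```

With no swallowing, every AES cycle is exactly 4 fclk ticks long. With swallowing enabled, cycle lengths vary from one cycle to the next, so the same AES operation does not line up at the same point in time across repeated traces.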

Software can also introduce random delays outside of the AES block.

The efficacy of such defenses remains to be seen. But the point is, CrossBar SPU provides hooks to enable many approaches to constructing defense against any demonstrated attack of this type.

And such demonstration is important, because another interesting aspect of the SPU is that it is in a 22 nm process using ReRAM (a novel NVM technology) instead of the typical floating-gate flash NVM technology that is used in the vast majority of secure elements and MCUs. For a more detailed treatment, you can see a whitepaper here.

A main reason for the existence of ReRAM is that it can be fabricated in more advanced process nodes, which floating gate embedded NVM cannot be. This enables us to make CrossBar SPU in a 22 nm TSMC process. The paper cited in the article says the device under test “has been designed using a 0.25 µm CMOS technology.” It builds on the work of another paper, cited therein as reference [11], which also reports measurements on a 0.25 µm device.

Loosely speaking, the efficacy of side channel techniques is proportional to the amount of energy it takes to flip a gate, all else being equal. Energy is C × V², where C is an equivalent gate capacitance (in a simplified view), and V is the logic voltage. For a typical 0.25 µm CMOS process, the gate capacitance is around 15 fF, and a typical voltage for that node is 2.5 V. In our case, a typical gate capacitance is around 0.5 fF, and the logic voltage is 0.8 V (configurable, but that’s the nominal center). So 15 fF × (2.5 V)² is around 94 fJ, whereas 0.5 fF × (0.8 V)² is 0.32 fJ, which is about 300x less energy.
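The arithmetic above can be checked with a few lines of Python. This is only a restatement of the simplified C × V² model using the approximate capacitance and voltage figures quoted in this article; the function name switching_energy_fj is ours.

```python
def switching_energy_fj(cap_ff: float, volts: float) -> float:
    """Simplified switching energy C * V^2; fF times V^2 gives fJ."""
    return cap_ff * volts ** 2


# Approximate figures quoted in the text above.
energy_250nm = switching_energy_fj(15.0, 2.5)  # 0.25 um process: 93.75 fJ
energy_22nm = switching_energy_fj(0.5, 0.8)    # 22 nm process: about 0.32 fJ
ratio = energy_250nm / energy_22nm             # about 293, i.e. roughly 300x

if __name__ == "__main__":
    print(f"{energy_250nm:.2f} fJ vs {energy_22nm:.2f} fJ, ratio {ratio:.0f}x")
```

The exact ratio is sensitive to the capacitance estimates, so the result is best read as "a few hundred times less energy per gate flip" rather than a precise figure.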

These numbers are very approximate, and we use various logic gates in the design, but the point is the two processes differ by more than two orders of magnitude in energy signature. We do not believe this is by itself definitive, because a low signal-to-noise ratio could be overcome with sufficient samples. But it does make everything that much harder. Again, it is defense in depth.
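The remark about overcoming a low signal-to-noise ratio with more samples can be illustrated with a short Python sketch. It assumes the simplest possible model: independent Gaussian noise on each trace and a fixed noise floor. Under that assumption, averaging N traces shrinks the noise by a factor of √N, so a signal that is k times smaller needs about k² times as many traces to reach the same signal-to-noise ratio. The function names averaged_noise_std and trace_multiplier are hypothetical, and real measurements will not follow this model exactly.

```python
import random
import statistics


def averaged_noise_std(n_traces: int, sigma: float = 1.0,
                       trials: int = 2000, seed: int = 7) -> float:
    """Estimate the noise standard deviation left after averaging n_traces samples."""
    rng = random.Random(seed)
    means = [
        sum(rng.gauss(0.0, sigma) for _ in range(n_traces)) / n_traces
        for _ in range(trials)
    ]
    return statistics.pstdev(means)


def trace_multiplier(signal_ratio: float) -> float:
    """Extra traces needed to keep the same SNR when the signal shrinks by signal_ratio.

    Assumes independent noise, so residual noise falls as 1 / sqrt(traces).
    """
    return signal_ratio ** 2


if __name__ == "__main__":
    # Averaging 100 traces should leave roughly 1/10 of the original noise.
    print(averaged_noise_std(100))
    print(trace_multiplier(300))
```

Under this simple model, a signal 300 times weaker would call for roughly 300² = 90,000 times as many traces. That does not make an attack impossible, but it shows why a smaller energy signature raises the cost.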

Another defense against DPA may be the fact that the chip has an internal regulator. This is unusual for a chip of this type, and may serve to reduce access to power signals. However, the power traces must still come out on an external pin for a bypass cap. Also, the chip can be configured to use external power; depending on how the chip is set up, this could be easy or difficult. Also note that although this internal regulator might help stabilize Vdd, it seems unlikely it would provide any protection against EM-based measurements.

Also, the papers cited use long plaintext to extract keys, which may or may not apply to a given situation. The reference product we are currently demonstrating uses Chacha/Poly for encrypted messaging, but if AES were used, the length of messaging (in our case) would be insufficient to perform DPA even with the high energy signature of a 0.25 µm process.

We are not saying that we know this attack is infeasible; we do not. But there are reasons to believe extracting keys by the process outlined in these papers could be quite a bit more difficult than in the cited examples. And if a successful attack is demonstrated, there are many hooks in the chip to construct defenses.

We are actively doing measurements ourselves, and would welcome collaboration from anyone else who wants to make measurements.

And a side-channel attack is only applicable in a situation where a private value is used repeatedly. Thus, applicability depends heavily on the system architecture. Some architecture-dependent exploits have been demonstrated, such as here and here. But we are presently unaware of any case of an actual loss of blockchain funds by a side-channel attack on a hardware wallet. If there is such a case, we would be grateful to be made aware of it.

To summarize the above:

There are multiple layers of defenses possible, and mitigating factors. We look forward to discussing real data.
System context also matters, and a vulnerability may or may not be important depending on system architecture and use case.

We think the cited theory of attack may be difficult to execute in practice on CrossBar SPU, especially with other mitigations applied (as described above). However, we agree it would be better, at least in theory, if the block had DOM, TI, or a similar technique.

Big Picture, and Community Process

Let’s step back a moment and consider why we are doing all this in the first place. It’s a little unusual for a company to spend money on expensive EDA tools and masks to make an Open Source chip at all. Why do we do this?

Broadly speaking, there are two approaches to security — open examination, or closed secrecy. Each approach has merits that may favor it in various situations. But in silicon, there has not been much choice in the “open” method.

This has led to pathological situations where software projects are touted as “open source”, but they run on closed source chips. Running “trustless” software on a chip based on “trust the vendor” seems to defeat the purpose.

A good example is crypto hardware wallets. You could say the entire ethos of blockchain is “trustless.” Aficionados lean towards blockchain as a hedge against trusting centrally controlled financial systems. Phrases like “not your keys, not your coin” emphasize a preference for autonomy.

But the reality is that so-called “self-custody” wallets overwhelmingly depend on the most closed and secret chips in the world. You can find endless stories/rumors of vulnerabilities that are known to a small circle of people, but not published.

Is that really autonomy? Is that really self-custody? Blindly trusting a vendor who has flaws that are known to some vague set of people, but hidden from you?

Centralized trust and closed source have a place, but so does community examination and open source. We thought the latter was under-served in the silicon world. Vitalik Buterin recently spoke eloquently on this topic between 3:05 and 3:10 in this video.

Yes, putting our RTL out there invites criticisms such as the one we are responding to. And some criticisms and vulnerabilities will prove to be valid. But that is an essential part of an open process. You can’t have one without the other.

We would like to invite others to try out, and contribute to, the effort. The only way Open Source works is for people to raise vulnerabilities, measure and discuss, and collaborate on mitigations.

The author’s criticism that CrossBar had not effectively engaged with Open Source communities is a valid one. But we are now demonstrably increasing our Open Source support and engagement.

Taping out chips is extremely expensive, and we have spent and continue spending resources on tangible chips and hardware. So we think we bring a value to this space, and look forward to working with the community to improve the status of Open Source silicon for security.

And finally, the article specifically mentions OpenTitan. We have been collaborating with the founder of the project for some time, both to leverage open source secure silicon IP for our applications and to shape the future of open source silicon more broadly. There are often tradeoffs between various characteristics, and so it makes sense — especially in an advanced process node where gates are cheaper — to provide multiple implementations of the same functions for different applications. We’ll make a public announcement soon describing our open silicon strategy.

Mark Davis

CEO of CrossBar, Inc.

CrossBar Inc.

info@crossbar-inc.com