Conformance Testing for Measurable Goals
[Written in consultation with the Interoperability Working Group chairs]
A recurring topic in the Interoperability Working Group is that of defining short-, medium- and long-term goals or “targets” for interoperability. The topic comes up again every few months, and each time it does, the chairs try to tackle it head-on for a full meeting or two, some emails, and various off-line discussions, but doing so never seems to arrive at a satisfactory or definitive answer. Each time, what feels like a Herculean outlay of effort only gets us a tentative resolution, as if we’d deflected a debt collector with a minimum payment. Dear reader, we would like to refinance, or at least restructure, this debt to our community!
In this two-part overview of our goals for 2021, we would like to survey the landscape of “provable,” testable interoperability targets and give some guidance on what we would like to see happen next in the testable interop world. Then, in a companion article, we will lay out a clear proposal for parallel, distributed work on multiple fronts, such that progress can be distributed across various subcommunities and hubs of open cooperation, each reasonably confident that it is helping the big picture by “zooming in” on one set of problems.
A seemingly uncontroversial target state
“Interoperation” has a deceptively transparent etymology: any two things that perform a function are inter-operating if they can perform their operation together and mutually. Two fax machines interoperate if one sends a fax and the other receives it, which is a far more impressive feat if the two machines in question were designed and built by different companies on different continents, fifteen years apart. This is the kind of target state people usually have in mind when they imagine interoperability for an emerging technology.
This everyday example glosses over a lot of complexity and history, however. Standards bodies had already specified exact definitions of how the “handshake” is established between two unfamiliar fax machines before either model of fax machine was a twinkle in a design team’s eye. In addition to all the plodding, iterative standardization, there has also been a lot of economic trial and error and dead-ends we never heard about. Even acceptable margins of error and variance from the standards have been prototyped, refined, socialized, and normalized, such that generations of fax machines have taken them to heart, while entire factories have been set up to produce standard components like white-label fax chips and fax boards that can be used by many competing brands. On a software level, both design teams probably recycled some of the same libraries and low-level code, whether open-source or licensed.
Decomposing this example into its requirements at various levels shows how many interdependent and complex subsystems have to achieve internal maturity and sustainable relationships. A time-honored strategy is to let each of these subsystems mature in its own distinct, parallel process and combine them gradually over time. The goal, after all, is not a working whole, but a working ecosystem. Architectural alternatives, for example, have to be debated carefully, particularly for disruptive and emerging technologies that redistribute power within business processes or ecosystems. Sharing (or better yet, responsibly open-sourcing and governing) common software libraries and low-level hardware specifications is often a “stitch in time” that saves nine later stitches, since it gives coordinators a stable baseline to work from, sooner.
Naturally, from even fifty or a hundred organizations committed to this target state, a thousand different (mostly sensible) strategies can arise. “How to proceed?” becomes a far-from-trivial question, even when all parties share many common goals and incentives. For instance, almost everyone:
- Wants to grow the pie of adoption and market interest
- Holds privacy and decentralization as paramount commitments
- Strives to avoid repeating the mistakes and assumptions of previous generations of internet technology
And yet, all the same… both strategic and tactical differences arise, threatening to entrench themselves into camps or even schools of thought. How do we achieve the same functional consensus on strategy that we have on principles? Our thesis here is that our short-term roadmaps need testable, provable alignment goals that we can all agree on, so that our little communities and networks of technological thinking can converge gradually. Simply put, we need a few checkpoints and short-term goals towards which we can all work together.
Today’s test suites and interoperability profiles
Perhaps the biggest differences turn out not to be about target state or principles, but about what exactly “conformance” means relative to what has already been specified and standardized. Namely, the core specifications for DIDs and VCs are both data models written at the World Wide Web Consortium (W3C), which put protocols out of scope. This stems partly from the decisions of the groups convened at the W3C, and partly from a classic division of labor in the internet standards world between the W3C, which traditionally governs the data models of web browsers and servers, and the Internet Engineering Task Force (IETF), which traditionally governs the protocols that move data between them.
VCs were specified first, with a preference (but not a requirement) for DIDs; exact parameters for DIDs and DID systems were deferred to a separate data model. Then, to accommodate entrenched and seemingly irreconcilable ideas about how DIDs could best be expressed, the DID data model was made less representationally explicit and turned into an abstract data model. This shift to a representation-agnostic definition of DIDs, combined with the still-tentative and somewhat representation-specific communication and signing protocols defined to date, makes truly agnostic, cross-community data model conformance somewhat difficult to test. This holds back interoperability (and objective claims to conformance)!
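To make “representation-agnostic” a little more concrete, here is a minimal sketch, in TypeScript, of one abstract DID document rendered as plain JSON and as JSON-LD. The property names follow the DID Core data model, but the DID, key identifier, and key material are placeholders invented purely for illustration.

```typescript
// A minimal abstract DID document expressed as a plain TypeScript object.
// The DID, key identifier, and key material below are placeholders for illustration.
const didDocument = {
  id: "did:example:123456789abcdefghi",
  verificationMethod: [
    {
      id: "did:example:123456789abcdefghi#key-1",
      type: "Ed25519VerificationKey2020", // one of several registered verification-method types
      controller: "did:example:123456789abcdefghi",
      publicKeyMultibase: "zPLACEHOLDER-NOT-A-REAL-KEY"
    }
  ],
  authentication: ["did:example:123456789abcdefghi#key-1"]
};

// The JSON-LD representation of the same abstract document simply adds a @context;
// a plain-JSON consumer can ignore it, while a JSON-LD consumer uses it for term expansion.
const didDocumentJsonLd = {
  "@context": "https://www.w3.org/ns/did/v1",
  ...didDocument
};

console.log(JSON.stringify(didDocumentJsonLd, null, 2));
```

The abstract data model stays the same in both cases; what a truly agnostic test suite has to check is that implementations round-trip that shared model faithfully, whichever concrete representation they prefer.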
W3C: Testing the core specifications
The only test suites associated with the core W3C specifications are the VC-HTTP-API test suite for VC data model conformance and the fledgling DID-core test suite, both worked on in the W3C-CCG. The former tests implementations of VC-handling systems against pre-established sample data and test scripts through a deliberately generalized API interface that strives to be minimally opinionated with respect to context- and implementation-specific questions like API authentication. The latter is still taking shape, given that the DID spec has been quite unstable in the home stretch of its editorial process, arriving at Candidate Recommendation (CR) just last week.
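For the VC-HTTP-API suite, a conformance check essentially amounts to sending well-known sample inputs to a small set of HTTP endpoints and inspecting what comes back. The sketch below (TypeScript, runnable on Node 18+ or Deno) shows the general shape of such a check; the base URL is a placeholder, and the endpoint path, request body, and response fields reflect our reading of the draft API, so they may differ from any particular deployment or test-suite version.

```typescript
// Hypothetical smoke test against a VC-HTTP-API-style issuer endpoint.
// The base URL is a placeholder, and the endpoint path, request body, and
// response shape are paraphrased from the draft API; check them against the
// deployment you are actually testing.
const ISSUER_API = "https://issuer.example.com"; // placeholder deployment

// A minimal unsigned credential per the W3C VC data model.
const credential = {
  "@context": ["https://www.w3.org/2018/credentials/v1"],
  type: ["VerifiableCredential"],
  issuer: "did:example:issuer123",                     // placeholder issuer DID
  issuanceDate: new Date().toISOString(),
  credentialSubject: { id: "did:example:subject456" }  // placeholder subject DID
};

async function issueAndCheck(): Promise<void> {
  // Ask the implementation under test to issue (i.e., sign) the credential.
  const response = await fetch(`${ISSUER_API}/credentials/issue`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ credential, options: {} })
  });
  if (!response.ok) {
    throw new Error(`Issuance failed with HTTP ${response.status}`);
  }
  const { verifiableCredential } = await response.json();
  // A real conformance run would now validate the proof block, contexts, and
  // dates against the suite's shared sample data and expectations.
  console.log("Proof type returned:", verifiableCredential?.proof?.type);
}

issueAndCheck().catch(console.error);
```

Note what the suite deliberately does not pin down: how the call is authenticated, where keys live, or how the issuer is deployed; those are exactly the context-specific questions it tries to stay unopinionated about.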
The VC-HTTP-API interface, specified collectively by the first SVIP funding cycle cohort for use in real-world VC systems, has been used in some contexts as a de facto general-purpose architecture profile, even though it is very open-ended on many architectural details traditionally specified in government or compliance profiles. Its authors and editors did not intend it to be a general-purpose profile for the VC data model, but in the absence of comparable alternatives it is sometimes treated as one, taking on a more definitive role than originally intended.
Following the second iteration of the SVIP program and the expansion of the cohort, the API and its test suite are poised to accrue features and coverage that will make them more useful outside of their original context. Core participants have established a weekly public call at the CCG, and a lively re-scoping and documentation effort is currently underway to match the documentation and rationale documents to the diversity of contexts deploying the API.
Aries: End-to-end conformance testing
Other profiles, like the Aries interoperability profile, serve an important role, but it would be misleading to call the latter a VC data model test suite; it is more like an end-to-end test harness for showing successful implementation of the Aries protocols and architecture. Here “interoperability” means interoperability with other Aries systems, and conformance with the shared Aries interpretation of the standard VC data model and with the protocols this community has defined on the basis of that interpretation.
Many matters specified by the W3C data model are abstracted out or addressed by shared libraries in Ursa, so the profile's scope is not exactly coterminous with the W3C data model. Instead, the Aries interoperability profile has its own infrastructural focus: scaling the privacy guarantees of blockchain-based ZKP systems. In many ways, this focus complements rather than supplants that of the W3C test suites.
Many of the trickiest questions on which SSI systems differ are rooted in what the Aries and Trust-over-IP communities conceptualize as “layer 2,” the infrastructural building blocks connecting end-users to VCs and DIDs. As more features become testable across language implementations, and as feature parity is achieved with other systems (such as support for LD VCs), the case for productive complementarity and deep interoperability gets easier and easier to make.
The first wave of local profiles and guidelines
Other specifications for decentralized-identity APIs and wallets, modelled on the work of the W3C CCG and/or extending that of Aries-based infrastructures, are starting to crop up around the world. So far these have all arisen out of ambitious government-funded programs to build infrastructure, often with an eye to local governance or healthy competition. Canada and the European Commission are the most high-profile to date, building on earlier work in Spain, the UK, and elsewhere; Germany and other countries funding next-generation trust frameworks may soon follow suit.
It is important, however, to avoid framing these tentative, sometimes cautious attempts at bridging status quo and new models as universal standards. If anything, these frameworks tend to come with major caveats and maturity disclaimers, on top of having carefully narrowed scopes. After all, they are generally the work of experienced regulators, inheriting decades of work exploring identity and infrastructural technologies through a patchwork of requirements and local profiles that tend to align over time. If they are designed with enough circumspection and dialogue, conformance with one should never make conformance with another impossible. (The authors would here like to extend heartfelt sympathy to all DIF members currently trying to conform to multiple of these at once!)
These profiles test concrete, specific interpretations of a shared data model, providing a benchmark for regulatory “green-lighting” of specific implementations and perhaps even whole frameworks. Particularly when they specify best practices or requirements for security and API design, they create testable standardization by making explicit their opinions and assumptions about (see the sketch after this list):
- approved cryptography,
- auditing capabilities,
- privacy requirements, and
- API access/authentication
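As a purely hypothetical illustration of how a local profile might pin these four areas down, the following TypeScript sketch declares the kinds of constraints such a profile could encode; every field name and value here is invented for illustration and is not drawn from any real framework.

```typescript
// Hypothetical conformance-profile declaration, for illustration only.
// Every field name and value here is invented; real frameworks define their
// own schemas, vocabularies, and legal language.
interface ConformanceProfile {
  approvedSignatureSuites: string[];                           // approved cryptography
  auditLog: { retentionDays: number; tamperEvident: boolean }; // auditing capabilities
  privacy: { selectiveDisclosureRequired: boolean; correlationRiskReview: boolean }; // privacy requirements
  apiAccess: { authentication: "oauth2" | "mutual-tls"; rateLimited: boolean };      // API access/authentication
}

const exampleRegionalProfile: ConformanceProfile = {
  approvedSignatureSuites: ["Ed25519Signature2020"], // placeholder suite name
  auditLog: { retentionDays: 365, tamperEvident: true },
  privacy: { selectiveDisclosureRequired: true, correlationRiskReview: true },
  apiAccess: { authentication: "oauth2", rateLimited: true }
};

console.log("Accepted suites:", exampleRegionalProfile.approvedSignatureSuites.join(", "));
```

Even this toy example shows where hard incompatibilities can come from: two implementations, each conformant to its own regional profile, can still fail to interoperate if, say, their approved signature suites do not overlap.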
These will probably always differ and make a universal abstraction impossible, and that’s not a bad thing! These requirements are always going to be specific to each regulatory context, and without them, innovation (and large-scale investment) is endangered by regulatory uncertainty. Navigating these multiple profiles is going to be a challenge in the coming years, as more of them come online and their differences come into relief as a stumbling block for widely-interoperable protocols with potentially global reach.
The Interoperability Working Group will be tracking them and providing guidance and documentation where possible. Importantly, though, a new DIF Working Group is coming soon: the Wallet Security WG, which will dive deeper into these profiles and requirements, benefiting from a narrower scope and IPR protection that allow its participants to speak more bluntly about the details mentioned above.