Internet-Draft OpenPGP Web of Trust February 2022
Walfield Expires 7 August 2022 [Page]
Workgroup:
openpgp
Internet-Draft:
draft-nhw-web-of-trust-00
Published:
Intended Status:
Informational
Expires:
Author:
N. H. Walfield
Sequoia PGP

OpenPGP Web of Trust

Abstract

The web of trust is a flexible, decentralized trust model created for PGP. PGP and GnuPG include implementations of the web of trust, and OpenPGP defines a number of authentication mechanisms that form the basis of both implementations.

Unfortunately, PGP and GnuPG implement different semantics, neither documents their semantics, and OpenPGP does not specify how a web of trust implementation should work.

This draft defines the semantics of the web of trust as implemented by Sequoia. Sequoia models the web of trust as a flow network, and authentication as a maximum flow problem. Although its semantics differ from both PGP's and GnuPG's semantics, in practice, it is largely compatible with both implementations.

By publishing this draft we hope to save developers of other OpenPGP implementations the time needed to design and specify a web of trust algorithm, and we hope to increase interoperability.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 7 August 2022.

Table of Contents

1. Introduction

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

1.2. Terminology

  • "OpenPGP certificate" or just "certificate" is the data structure that section 11.2 of [RFC4880] defines as a "Transferable Public Key". A certificate is sometimes called a key, but this is confusing, because a certificate contains components that are also called keys.

  • "User ID" is an OpenPGP packet. In this document, this term also encompasses OpenPGP's "User Attribute" packets. A User ID contains an identifier, which is typically a name and an email address.

  • "Binding" is a tuple consisting of a certificate and a User ID.

  • "Self signature" is a cryptographic signature that associates some data, e.g., a User ID or a subkey, with the signing certificate.

  • "Self certification" is a self signature over a User ID.

  • "Third-party certification" is a cryptographic signature that associates a User ID with a certificate different from the one doing the signing. A third-party certification is a type of vouch.

  • "Certification" is either a self certification or a third-party certification of a binding.

  • "Delegation" is like a certification, but it only certifies a certificate; it does not also certify a User ID. A delegation is used to indicate that certifications and delegations made by the target certificate should be considered valid.

  • "Trust root" is a certificate that the user directly relies on to make certifications and delegations. "Trust anchor" is another name for "trust root".

  • "Trusted introducer" or "certification authority" or "CA" is a certificate that is authorized to make certifications, and may be authorized to make delegations. Trust roots and the target of delegations are trusted introducers.

  • "Scope" is a set of constraints.

  • "In scope" is a property of a certificate, certification, User ID, etc. which holds if all constraints on it are satisfied.

  • "Liveness" is a property of a certificate, a certification, etc. An object is considered live with respect to some reference time if, as of the reference time, its creation time is in the past, and it has not expired.

  • "Authentication" is the process of determining whether a binding should be considered authentic.

  • "Trust model" is a process for doing authentication.

  • "Web of trust" is a decentralized trust model, which was created for PGP.

  • "X.509" is a hierarchical trust model. It is the most popular trust model used on the public Internet. It is a subset of the web of trust.

2. Problem Statement

The web of trust was designed for grass root activists who are not always willing to trust a central authority, and whose trust roots and certifications may be private. This is different from X.509, which largely assumes that there are a handful of globally trusted roots, and certifications are public.

Authentication in X.509 is relatively straightforward. A certificate normally includes a trust chain, which is anchored at a well-known trust root. Thus, authentication in X.509 means validating a trust chain.

In the web of trust, every user has their own set of roots, and certifications may not be public. So, authentication in the web of trust means building a certification network using the information that is available locally, and then finding a valid path from the user's trust roots to the binding. Since users need not unconditionally trust a certification authority, it may be necessary to find and combine multiple paths to have sufficient evidence to authenticate a binding.

This draft specifies a path finding algorithm for a web of trust using the mechanisms specified by [RFC4880]. Insofar as authentication mechanisms are specified by [RFC4880], they are used accordingly in this draft. [RFC4880], however, leaves many details unspecified including but not limited to: how to handle different Trust Signatures by the same issuer on multiple User IDs on the same certificate; the semantics of a Trust Signature on a third-party direct-key signature; and whether regular expressions need to match certifications of trusted introducers. This draft fills in the missing details.

The web of trust is a network in which the nodes are certificates, and the edges are certifications. Because a certificate may certify multiple User IDs on the same certificate, a network may include multi edges.

We view the network as a flow network in which an edge's capacity is the corresponding certification's trust amount. In this model, the trust amount parameter can be understood as an amount of evidence. We explicitly don't consider the trust amount to be a probability of correctness. First, humans are not good at reasoning about probability. Second, it is hard to reconcile this model with an adversary who does not make mistakes, but lies when it is to their advantage.

Using this model, authenticating a binding is a question of finding a flow from a set of trust roots to the binding with sufficient capacity. Unfortunately, OpenPGP certifications can impose constraints on the rest of the path. This means that most path finding algorithms cannot be used as-is. This draft describes how to use a variant of Dijkstra's shortest path algorithm to do path finding in this type of network.

3. OpenPGP's Authentication Mechanisms

OpenPGP provides four simple, yet powerful and flexible mechanisms to facilitate authentication. These are third-party certifications, a trust amount parameter, a trust depth parameter, and a regular expression parameter. This section describes the semantics that this specification assigns to these mechanisms.

This specification explicitly ignores the Signer's User ID subpacket, which is not meaningful for authentication.

3.1. Certifications and Delegations

A certification is a special type of OpenPGP Signature packet. It says that the issuer is convinced that the specified binding (User ID and certificate) is correct. When the issuer and the target certificate are the same, the certification is called a self signature or self certification. Otherwise, the certification is referred to as a third-party certification.

OpenPGP distinguishes four types of certifications (signature types 0x10 through 0x13). This specification treats all of these signature types identically. In common practice, a persona certification (signature type 0x11) is often treated as an invalid certification. This specification ignores this distinction.

It is possible to certify a certificate without also certifying a User ID by using a direct key signature (signature type 0x1F) over the primary key of the signature. This specification refers to such certifications as delegations. If the trust depth parameter (described below) is non-zero, this means that the target certificate should be treated as a trusted introducer.

3.2. Trust Amount

The trust amount parameter is controlled by the Trust Signature subpacket. It is the degree to which the issuer of a certification is convinced that the binding is correct. This can vary from 0 to 255. Values that are 120 or larger mean that the issuer is fully convinced. Traditionally, an issuer uses 60 to indicate that they are partially (aka marginally) convinced, however, any value between 1 and 119 can be used. A value of 0 means that the target should not be considered as certified. A certification whose trust amount is 0 should not be ignored: it overrides earlier certifications.

If edges along a path have different trust amounts, then the path's trust amount is the minimum trust amount of any of the edges. Consider the following network:

alice
  | 1/60
  v
 bob
  | 120
  v
carol

alice says that bob is a partially trusted (trust amount = 60) trusted introducer (trust depth = 1). Even though bob has certified carol's key with a trust amount of 120, alice only assigns the path alice - bob - carol a trust amount of 60.

This draft interprets trust amount as an amount of evidence. It assumes that evidence is independent and can be combined linearly. That is, if a trust root partially (trust amount < 120) trusts two certification authorities and they both certify a binding, the two paths can be added together.

3.3. Trust Depth

The trust depth parameter is controlled by the Trust Signature subpacket. It is used to indicate that a certification's target should be considered a trusted introducer.

The trust depth parameter ranges from 0 to 255. A value of 0 means that this certification is just a normal certification, and the target is not a trusted introducer. A value of 1 means that the target is a trusted introducer. A value of 2 means that the target is a trusted introducer and can designate level 1 trusted introducers. In general, a value of n means that the target of a certification can designate level n-1 trusted introducers. The value 255 is special and means infinity (i.e., it does not impose a constraint).

If a certificate designates a level n trusted introducer, but it is only allowed to delegate level m trusted introducers where m < n, then the trust depth parameter is limited to m.

3.3.1. Example

Consider the following network where the number is the certification's trust depth parameter:

alice
  | 2/120
  v
 bob
  | 2/120
  v
carol
  | 2/120
  v
dave
  | 2/120
  v
 ed

alice certifies bob with a trust depth of 2. This means that she considers bob to be a trusted introducer and that he can designate level 1 trusted introducers.

Likewise, bob certifies carol with a trust depth of 2. This means that he considers carol to be a trusted introducer and that she can designate level 1 trusted introducers.

From alice's perspective, however, bob's certification of carol extends too much authority to carol: she has only allowed bob to designate level 1 trusted introducers, but bob has designated carol as a level 2 trusted introducer. Instead of ignoring certifications that extend too much authority, the trust depth of any certification is capped by constraints imposed by any preceding certifications in the path. So, in this case, alice is willing to consider carol to be a level 1 trusted introducer.

carol certifies dave with a trust depth of 2. alice, however, only considers carol to be a level 1 trusted introducer. As with bob, carol's delegation is capped and, from alice's perspective, she is only allowed to certify other bindings. As such, alice considers dave's binding to be authenticated, but she does not consider him to be a trusted introducer.

Finally, dave certifies ed with a trust depth of 2. Clearly, there is a path from alice to ed: alice - bob - carol - dave - ed. However, because alice does not consider dave to be a trusted introducer, this path is not valid, and alice does not consider ed to be authenticated.

3.4. Regular Expressions

The regular expression parameter controls the scope of a delegation. A certification can include zero or more regular expressions. If it includes at least one regular expression, then at least one of them MUST match the User ID of the binding that is being authenticated for the path to be valid. A regular expression does not need to match intermediate trusted introducers.

3.4.1. Example

Regular expressions are a mechanism for a user to make use of a CA in a limited way. For instance, ed might be willing to rely on ca@nsa.gov to certify other nsa.gov User IDs, but doesn't want to rely on ca@nsa.gov to make a statement about any other User IDs.

Consider the following example in which the edges are labeled with the trust depth, trust amount, and optionally a domain, which corresponds to a regular expression that matches email addresses with that domain:

ed@lavabit.com
      | 255/120/nsa.gov
      v
  ca@nsa.gov
      | 1/120
      v
  ca@fbi.gov
      | 0/120
      v
  paul@nsa.gov

ed considers ca@nsa.gov to be a fully trusted (trust amount = 120) trusted introducer (trust depth 255) for User IDs that are in nsa.gov. ca@nsa.gov delegates to ca@fbi.gov, which has certified paul@nsa.gov. Even though the regular expression doesn't match the ca@fbi.gov, it does match the target User ID (paul@nsa.gov) so ed can authenticate paul@nsa.gov.

3.4.2. Rationale

A User ID identifies an entity. Because an entity may have multiple aliases or roles, it is reasonable and possible for a certificate to have multiple valid User IDs.

A certification's trustworthiness depends not on an identity, but on the entity. If an entity acts in conflicting ways depending on their role, then this draft takes the position that either they should not be trusted, or they should have multiple certificates.

4. Authentication

Authenticating a binding is a two-phase process. First, a network is built. Then, one or more paths starting at the trust roots and ending at the binding are located in the network.

4.1. Network

A web of trust network is built with respect to a reference time as follows:

  • A node is created for each non-revoked live certificate.

    • A node MAY be created for a revoked trust root, if the secret key material was not compromised.

  • A directed edge from the issuer to the target certificate is created for each non-revoked live certification and non-revoked live delegation.

    • Self certifications result in self loops.

    • If there are multiple live certifications for the same issuer and binding, or multiple live delegations for the same issuer and target certificate, then an edge is only created for the newest certifications or delegations. If there are multiple such certifications or delegations, then an edge is created for each one of them.

    • A third-party certification is valid even if the certified User ID does not have a self signature.

  • Edges are labeled with their certification's or delegation's parameters. In particular, edges are labelled with the trust amount, the trust depth, and any regular expressions.

    • If there is no trust amount, the trust amount defaults to 120.

    • If the trust amount exceeds 120, the trust amount is lowered to 120.

    • If there is no trust depth, the trust depth defaults to 0.

    • As an exception: self certifications always have a trust depth of 0.

  • The trust roots are set by the user. They are assigned an infinite trust depth, and a trust amount of 120.

4.2. Certification Validity

In addition to being well formed, and cryptographically valid, there are several additional conditions that must hold for a certification or delegation to be considered valid.

These additional conditions are evaluated with respect to a reference time. Usually the reference time is the current time. However, when authenticating a signature, it may make sense to also consider a reference time in the past. Consider a signed message that Alice sent to Bob a while ago, and which Bob is reviewing today. Assume that at the time Bob received the message, Bob found a valid certification path to Alice's certificate, but that that path is no longer valid, because one or more certifications have since expired. In this case, Bob's implementation MAY evaluate the validity of Alice's certificate by falling back to the signature's creation time, but it SHOULD fallback to the the earliest time at which the signature was known to exist, e.g., the time the message was recorded on a trusted storage medium. Using the time that the message was recorded prevents an attacker from backdating a signature to make it appear valid. If a certificate can't be validated now, but can be validated in the past, then the user's implementation SHOULD signal the user that the certificate was valid in the past, but is not valid anymore.

o Alice signs a message and sends it to Bob
|
v
o Bob receives message and validates Alice's certificate
|
v
o A certification that Bob used to validate Alice's certificate expires
|
v
o Bob reviews Alice's message

The additional constraints are:

  • The certification must be valid as of the reference time:

    • The certification's signature creation time (its Signature Creation Time signature subpacket), c, is not later than the reference time, r, i.e., c <= r.

    • The certification's expiration time (its Signature Expiration Time signature subpacket), e, if any, is after the reference time, r, i.e., r < e.

    • The certification has not been revoked either before the reference time or at any other time.

  • The certificate that issued the certification (the issuer) must be valid as of the certification time (the certification's Signature Creation Time signature subpacket):

    • The certificate's creation time (the primary key's Key Creation Time field), c, is not later than the reference time, r, i.e., c <= r.

    • The certificate's expiration time (the Key Expiration Time of the active binding signature as of the certification time), e, if any, is after the reference time, r, i.e., r < e.

    • If the certificate was revoked, and the reason for revocation was either 'Key is superseded' or 'Key is retired and no longer used' (reasons 0x1 and 0x3 in the Reason for Revocation signature subpacket), then the time the revocation was created (the revocation's Signature Creation Time's signature subpacket), v, is after the certification time, c, i.e., c < v.

    • The certificate was not revoked for any other reason at any time.

  • The target of the certification, the target certificate and the target User ID, if any, must be valid:

    • The target of the certification must be valid in the same way as the issuer of the certification, as described above.

    • If the target of the certification includes a User ID, i.e., the certification is being used as a certification and not a delegation, then to also be valid as a certification and not just a delegation, it must:

      • The target of the certification, must be valid in the same way as the issuer of the certification, as described above, but at the reference time.

      • The User ID must not be revoked as of the certification time.

      • The User ID must not be revoked as of the reference time.

      • Note: Unlike certificates and keys, User IDs do not have creation times, do not expire, and do not require a self signature to be considered valid.

4.3. Authentication

To authenticate a binding, it is necessary to find one or more valid paths from the roots to the binding in the network.

A path is valid if it starts from a trust root, ends at the target certificate, the last edge is a certification over the target User ID, all certificates, certifications, and the target User ID are in scope (that is, any trust depth parameters are respected, and for each edge that has regular expressions, at last one regular expression matches the target User ID), and the target User ID is not revoked.

Note: a self certification counts as an edge and thus is only in scope if the certificate is a trusted introducer.

A path SHOULD be minimal in the sense that it should not have any cycles.

A path's trust amount is the minimum trust amount of the trust amount of each edge in the path.

Multiple paths can be combined if they use the same edge in any multi-edges. The trust amount of multiple paths is the maximum flow of the network induced by the paths.

A binding is fully authenticated if the trust amount of the valid paths is at least 120. It is partially authenticated if the trust amount is between 1 and 119.

5. Implementation Strategy

The following text is non-normative. It motivates and describes one possible implementation strategy, which satisfies the above constraints. An implementation is free to implement this draft as it sees fit.

A simple algorithm to find the shortest path in a network is to enumerate all valid paths from the roots to the binding, and then select the best path. This algorithm is in NP (there are an exponential number of paths) however, and is thus only tractable for small networks.

Path finding algorithms like Dijkstra's shortest path algorithm are more efficient. Dijkstra's algorithm computes a shortest-path tree (the shortest distance from one node to every other node in the network) while visiting each node and each edge at most once. Its run time is O((N + E) * log(N)) where N is the number of nodes and E is the number of edges. In practice, this is fast even for large, highly connected graphs.

Unfortunately, Dijkstra's algorithm cannot be used as is. Dijkstra's algorithm assumes that edges do not impose constraints on the rest of the path. This is typically the case for a network of cities and roads. But, edges in a web of trust may have a finite trust depth, which may render some of the paths they are on invalid, and they may include regular expressions, which the target User ID has to match.

Let's say that we are applying Dijkstra's algorithm to a network that looks like this:

      root
        |
        v
       ...
     |     |
     v     v
     s     t
2/120 \   /  3/60
       v v
        u
        |
        v
       ...
        |
        v
      target

Say we are considering the edge t - u, and u's current back pointer is s - u. At this point, we have to decide if we prefer the edge s - u, which has a trust depth of 2 and a trust amount of 120, or the edge t - u, which has a trust depth of 3 and a trust amount of 60. We need to get this decision right now. As explained above, Dijkstra's algorithm only visits each edge once, so we won't have a chance to try the alternative later.

Unfortunately, neither s - u nor t - u is strictly better than the other. s - u has a larger trust amount, but t - u has a higher trust depth.

Let's assume that we choose s - u, the edge with the higher trust amount. As we continue to apply Dijkstra's algorithm, we might find that the only paths to the target are too long for s - u's trust depth. But now it is too late; we can't go back and revise our decision. More importantly, we can't even be sure that there is a valid path.

With a few tricks, however, we can still use Dijkstra's algorithm.

First, we need to limit the search from finding a shortest-path tree to finding a shortest path from a root to the target binding. Then we can easily satisfy any regular expression constraints by simply ignoring edges that have regular expressions that don't match the target User ID.

Second, as shown above, a cost function that prefers edges with a higher trust amount does not always return a path when there is one. But, we can construct a cost function that always returns a path if there is one, and then use the Ford Fulkerson algorithm to find a maximum flow. (The Ford Fulkerson algorithm finds a path, computes a residual network by subtracting that path, and then loops until no paths remain.) In some situations, this may mean that we have more paths than strictly necessary. However, because we have to deal with multiple paths anyway as there is not always a single path that can authenticate a binding, this doesn't actually increase the complexity.

We can actually do better than this. Through the use of a priority queue, Dijkstra's algorithm ensures that when a node is visited, the optimal path to that node is known. Thus, we know the constraints that a path will impose on the following node, and we can use that information to select the best edge.

Consider again the above network. If the path leading to t constrains t to be a level 3 trusted introducer, then it doesn't matter that t certifies u to be a level 3 trusted introducer: the previous path limits t's certification of u to be at most a level 2 trusted introducer. Thus, we can safely prefer the edge s - u, since it has the same effective trust depth.

In fact, this isn't an optimization; we have to consider any path constraints. Otherwise, we may not find a path when there is a valid path. Imagine now that the path leading to t constraints t to be a level 2 trusted introducer. In this case, the edge s - u is strictly better (it's effective trust depth is greater), and preferring it may be necessary to find a valid path to the target.

We recommend running the algorithm backwards, i.e., from the source towards the roots. We refer to this as backwards propagation. This has the advantage that we often don't have to explore as much of the network. Concretely, if the network is divided into multiple components, then only the component with the target needs to be explored. This is more often the case when working backwards, because we don't have to consider any paths via a root. Consider the following network:

          root
       /        \
     v            v
   left         right
  /    \       /     \
 v      v     v       v
...    ...   ...     ...

When running the algorithm forwards we start at the root and we need to explore the whole network. But when running the algorithm backwards we only need to explore the left side or the right side (and often less) as the root does not not connect the two sides.

When using backwards propagation, we use the following cost function: given two path suffixes, we prefer the path suffix that is shorter. This guarantees that if there is a valid path, we will find it. If the two path suffixes are the same length, we prefer the one with the higher trust amount.

When using backwards propagation, we sometimes come up with a better solution than when using forward propagation. Consider the following network:

         root
  255/120 |
          v
          a
  255/1 /   \ 2/120
       v     v
       b     c
255/120 \   / 1/120
         v v
          d
          | 120
          v
        target

When using forward propagation (i.e., starting at root and working towards the target), we set d's backpointer to b, because that path prefix is less constrained (via b the trust depth is unconstrained, but via c, d is only a level 1 trusted introducer). This means that we would find the path root - a - b - d - target, which has a trust amount of 1.

Using backwards propagation (i.e., reversing the edges, starting at the target, and working towards the root), when visiting a, we would see that both possible path suffixes are valid, and the paths are the same length, so we'd choose the path via c, because its trust amount is higher. Thus, backwards propagation would find root - a - c - d - target, which has a trust amount of 120.

But, forward propagation would perform better on this network:

         root
    3/120 |
          v
          a
  2/120 /   \
       v     |
       b     | 1/60
  1/120 \   /
         v v
          c
          | 120
          v
        target

Finally, when using backwards propagation, we recommend not stopping when we visit a root. This is because our cost function does not actually optimize for what we really want to optimize for: we are interested in the valid path with the highest trust amount, but the cost function optimizes for the shortest, valid path. By not stopping when we reach a root, we open up the possibility that we find a longer path with a higher trust amount.

5.1. Example

Consider the following web of trust:

         alice
           | 2/100
           v
          bob
 255/120 /   \
        v     `
      carol   |
255/120 |     | 0/30
        v     |
       dave   ,
   0/120 \   /
          v v
          ed

Let's walk through authenticating ed with alice as the sole trusted root using the algorithm described above.

Dijkstra's algorithm maintains two data structures: a priority queue of nodes that have not yet been visited ordered by their cost (best first); and, a list of back pointers. (Since we are reversing the direction, the back pointers are actually forward pointers in the original network, and that's how we name the variable below.) Initially the priority queue consists of the target.

queue = [ (ed; 0; 120) ];
forward_pointers = [ ];

Each node in the queue includes the cost of the path suffix starting at that node. The cost is the path suffix's length and its trust amount. These values may be updated while the node is in the queue, but once the node is visited, they won't be updated further; at that point we've found the optimal path from that node to the target.

We start with ed, and consider each certification made on ed: dave - ed and bob - ed.

Say we start with dave - ed (the order doesn't matter). Since dave doesn't yet have a forward pointer, we set his forward pointer to ed and add dave to the queue. Then we consider bob - ed. Since bob also doesn't have a forward pointer, we also just set his forward pointer to ed, and we add him to the queue.

queue = [ (dave; 1; 120), (bob; 1; 30) ];
forward_pointers = [ (bob -> ed), (dave -> ed) ];

Next, we pop the certificate with the best path suffix from the queue. Because bob's and dave's current paths are the same length (1), we compare the trust amount of each path suffix. dave's trust amount is 120 whereas bob's is only 30. So, we pop dave.

dave is only certified by carol. Looking at carol, we see that she doesn't yet have a forward pointer so we set her forward pointer to dave, and we add carol to the queue.

queue = [ (bob; 1; 30), (carol; 2; 120) ];
forward_pointers = [ bob -> ed; carol -> dave; dave -> ed ];

The queue now contains bob and carol. We prefer bob, because his current path is shorter (1 vs 2).

bob is certified by alice. Since alice's forward pointer is empty, we set it to point to bob. We don't add alice to the queue, because alice is a root, and we don't consider paths via alice. And, as described above, although we would have a valid path when we visit alice, there may be a path with a higher trust amount, but is longer.

queue = [ (carol; 2; 120) ];
forward_pointers = [ alice -> bob; bob -> ed; carol -> dave;
                     dave -> ed ];

We now pop carol from the queue.

carol is certified by bob. We compare bob's current path to the one via carol.

  • bob's current path: length: 1, amount: 30

  • bob - carol + carol's current path: length: 3, amount: 120

We prefer the existing forward pointer because the path is shorter even though the amount of trust is smaller. If we had taken the longer path, then any forward pointers pointing to bob might become invalid. In fact, that is the case here: the edge alice - bob has a trust depth of 2, which means that alice - bob - carol - dave - ed is not valid. Thus, because we never replace an existing forward pointer with a forward pointer with a longer path, all forward pointers remain---by construction---valid.

We don't add bob to the queue, because bob has already been visited.

queue = [ ];
forward_pointers = [ alice -> bob; bob -> ed; carol -> dave;
                     dave -> ed ];

Since the queue is empty, we must have visited every node reachable from ed. Now we just need to extract the path, which we do by looking at the forward pointers: the best path is alice - bob - ed.

6. Reference implementation

A Rust implementation of this specification is part of Sequoia. See https://gitlab.com/sequoia-pgp/sequoia-wot for the source code.

In practice, this algorithm is able to solve the web of trust problem within milliseconds even for large networks that include large cliques.

7. Security Concerns

This specification assumes that certifications by different certificates are independent. This only holds if an entity has at most a single certificate. But, there are legitimate reasons for this not to be the case. For instance, a user may create a new certificate using newer algorithms, and not revoke their old certificate so that they can continue to communicate with people who use software that can only handle the older certificate.

This can result in the following scenario:

       alice
1/60  /     \ 1/60
     v       v
   bob 1   bob 2
 120 \       / 120
      v     v
       carol

Bob has two certificates and Alice certifies both of them as partially trusted introducers. Now any binding that bob signs with both certificates will be fully trusted by alice. This was not Alice's intent.

Similarly, certifications that use similar verification methods are not actually independent. Consider two Let's Encrypt-like CAs for OpenPGP certificates. If both verify bindings using an email challenge, then the security of both largely relies on the same mechanism. Conceptually, it still makes sense to combine them if the CAs are in different trust domains, but the trust amounts should probably not simply be added together.

One way to improve this situation would be to introduce a set of notations that allow the signer to indicate in a machine-readable way how a binding was verified. Then software could place limits on different types of authentication mechanisms, and better control how they combine.

8. Document Considerations

[ RFC Editor: please remove this section before publication ]

This document is currently edited as markdown. Minor editorial changes can be suggested via merge requests at https://gitlab.com/sequoia-pgp/sequoia-wot or by e-mail to the authors. Please direct all significant commentary to the public IETF OpenPGP mailing list: openpgp@ietf.org

8.1. Document History

This is a first draft that has not been published.

9. Acknowledgements

My thanks go---in particular, but not only---to Justus Winter, Daniel Kahn Gillmor, and Heiko Schaefer for many fruitful discussions about trust models, authentication, and OpenPGP.

10. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC4880]
Callas, J., Donnerhacke, L., Finney, H., Shaw, D., and R. Thayer, "OpenPGP Message Format", RFC 4880, DOI 10.17487/RFC4880, , <https://www.rfc-editor.org/rfc/rfc4880>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, , <https://www.rfc-editor.org/rfc/rfc8174>.

Author's Address

Neal H. Walfield
Sequoia PGP