This post outlines the vision of web3.storage. It reflects where the product will be in Q4 2022. For instance, our new upload API, which natively uses UCAN for auth, is currently in beta, and we will incorporate it into the web3.storage client and website in the coming weeks. In the meantime, some of what is outlined in this blog post might not yet be incorporated into the core web3.storage and NFT.Storage products.

Click here to read Part 1 of our introduction to the web3.storage platform!

The DAG House team is building the web3.storage platform so developers can take full advantage of the Data Layer and build the next generation of experiences for themselves and their users. In this section, we introduce the web3.storage platform “stack”: the protocols of the Data Layer, and the platform products built to make those protocols easy for developers to use.

📣 Today, NFT.Storage uses the web3.storage platform on its path to provide off-chain NFT storage as a public good, so if you use NFT.Storage you can take advantage of the data layer for free!

Protocols we use

The cornerstone of the stack is the protocols of the Data Layer. We gave an overview of these earlier, but going deeper gives a fuller picture of the advantages these protocols provide. This section gets a bit into the weeds with a lot of acronyms, but the concepts themselves are familiar ones, so please do bear with us!

  • IPFS: A peer-to-peer protocol that references data by its unique content identifier (CID)

  • IPFS itself enables networks where users can host or retrieve data using a CID, which is unique to the data, and access it as long as there is at least one peer on the network hosting a copy of the data

  • web3.storage participates in the public IPFS network, which is what most people mean when they talk about IPFS - anyone can get data off the network with its CID

  • IPFS utilizes IPLD under the hood - it allows IPFS to easily link blocks of data together (to represent things like file systems, databases, and JSON, as well as how a single large file gets split up into smaller blocks)

    • As a result, everything in IPFS can be represented as a graph of blocks
    • This can create efficiencies around deduplication, efficient loading and diff’ing of data, and more
  • We generally exchange data in IPFS in the form of CAR files, each of which serializes a set of CIDs and their block data into a single file

    • Since it’s a file, this puts everything one needs to interact with IPLD-structured data into a convenient format that can be sent over any transport (HTTP, etc.)
    • Since client-side CAR generation results in the CID being generated locally, the user can have full confidence no one tampered with it, even when they access it later
  • If you’d like to dive a bit deeper, check this out for the technical details of the underlying data protocols

  • DID: A type of identifier that enables a verifiable, decentralized digital identity (decentralized identifier)

    • Any private-public keypair can generate a unique DID
    • Since DIDs are globally unique, they can be used to represent any actor in a system, opening things up even further
    • For instance, within the web3.storage platform, users, storage accounts, virtual machines, and more can all be viewed as separate actors
  • UCAN: An auth token that self-contains all information for a service to cryptographically verify that a given actor has authorization to perform a certain action

  • UCAN tokens are issued using a DID and a signature from the DID’s private key; this signature can be used to verify the validity of the token

  • This means no external sources of truth are required to authorize services, and the same UCAN token can be used by any service that recognizes that DID for the services it requests

  • Further, UCANs allow for delegations of permissions across DIDs, meaning a user with a certain set of permissions can sign and delegate a subset of its permissions to other DIDs

  • And since anything with a private-public keypair can generate a DID, UCANs can be an auth layer that naturally plugs into existing identity primitives and services (e.g., crypto wallets)

  • With IPFS’s ability to immutably address any data and link it together, and UCAN’s self-contained authorization, these two protocols can also be used in tandem to unlock even more

    • An IPFS graph can contain all data needed for a service to provide an operation, including the UCAN token (e.g., store this data! run this workload on top of this data!)
    • With a standardized structure to make calls across interfaces and using CAR files as the transport to ensure all data needed is there, this opens up RPC layers that are interoperable across services
    • After a workload is run, a proof containing the input, authorization, and output can be generated using IPFS
    • So there’s no centralization of the workload, and anyone can provide the service as long as they’re compatible
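
To make the content-addressing ideas above more concrete, here is a minimal Python sketch. It is not how IPFS, IPLD, or web3.storage are implemented: the "CID" here is just a hex SHA-256 digest, the root block is a newline-separated list of links, and the names `content_id`, `add_file`, and `read_file` are hypothetical. It only illustrates that identical blocks deduplicate and that a reader can verify every block against the identifier it asked for.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Stand-in for a CID (assumption): the hex SHA-256 digest of the bytes."""
    return hashlib.sha256(data).hexdigest()


def add_file(store: dict, data: bytes, chunk_size: int = 4) -> str:
    """Split data into blocks, store each under its content ID,
    then store a root block that links to the chunks in order."""
    links = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        cid = content_id(chunk)
        store[cid] = chunk  # identical chunks share one key (deduplication)
        links.append(cid)
    root = "\n".join(links).encode()
    root_cid = content_id(root)
    store[root_cid] = root
    return root_cid


def read_file(store: dict, root_cid: str) -> bytes:
    """Follow the links from the root, checking every block against its ID."""
    root = store[root_cid]
    if content_id(root) != root_cid:
        raise ValueError("root block does not match its ID")
    out = b""
    for cid in root.decode().splitlines():
        block = store[cid]
        if content_id(block) != cid:
            raise ValueError("block does not match its ID")
        out += block
    return out
```

Adding `b"abcdabcdxyz"` with the default chunk size stores the repeated `abcd` chunk only once, and changing a single byte of the input produces a different root identifier. Real CIDs also encode a version, a codec, and the hash function used, and real IPLD data uses structured encodings rather than a newline-separated list.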

Products in the web3.storage platform

So how does the web3.storage platform use these protocols today? Let’s tick through the products!

  1. w3up: Data storage service

  • Uploaded data is stored on an instance of Elastic IPFS, a cloud-native, scalable, open source implementation of IPFS that we wrote

    • Everything you read on this page is enabled by the reliability and performance that Elastic IPFS provides - this is one of the web3.storage platform’s major innovations!
  • Data stored on w3up is also stored in multiple Filecoin storage deals, with the physical storage itself verifiable using cryptographic proofs and cryptoeconomics

  • w3up implements UCAN-based authorization, which decentralizes authentication and moves us away from web2-style API keys issued by the provider; any DID that registers can store data and delegate that ability to others

  • w3up comes with some awesome clients

    • CLI and SDK: Makes it super easy for you to upload to web3.storage using UCANs and delegate others to upload on your behalf

      • Like the old client library, a CAR file of the data is generated client-side, meaning you can verify that the CID of your data is the correct one
      • Unlike the old client library, CAR generation is done in a streaming manner (speeding things up and alleviating memory constraints for large files)
    • w3ui: Mobile-first front-end JavaScript modules that can be plugged into applications to let users work with UCANs and interact with the web3.storage platform

      • When the new API is implemented in the web3.storage and NFT.Storage websites, the login and upload consoles will be based on w3ui!
  2. w3link: Public IPFS HTTP Gateway that is optimized for performance

  3. w3name: Cryptographically secure mutable references

  • Since IPFS generates a unique content identifier for a given piece of data using a hash function, just changing one bit of that data generates a completely new identifier

  • However, in modern web use cases, you often want the ability to use the same identifier to reference changing data (something we take for granted with HTTP, where we reference files by their filename)

  • You can do this using w3name, which implements the IPNS protocol that allows users to use a public / private keypair to generate and update a static address

  • There are other ways we’re thinking about doing mutability (e.g., mutable buckets with their own DID) that will come in the future!

  • Not yet public: There are other products that we currently consume internally, and we will release public versions of them in the future

    • w3access + w3wallet

      • These libraries are used to generate and manage a keypair associated with different devices and identities (w3wallet) and register and validate identities for web3.storage services (w3access)
      • They are currently embedded in w3up, but will be modularized so that anyone can easily generate / manage keypairs in their preferred way (e.g., with a crypto wallet) and register corresponding DIDs for use with the web3.storage platform
    • w3query: Engine to run workloads on top of content addressed data + on top of other data that has globally unique identifiers, like blockchain data

      • We use w3query internally to process uploads via web3.storage
      • However, it’s an engine that can run on top of CIDs, contract IDs, NFTs, etc. - eventually, we envision a marketplace where folks can query in 3rd party services and use them to run compute workloads on their data from a single place
      • In addition, the output of the workload can be embedded in a verifiable proof that the workload was run, using CIDs to index the output value, the input data, and the query run (the interface for running a unit workload will be called bucket-vm)
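
As a rough illustration of indexing a workload's input, query, and output by content identifiers, here is a small Python sketch. It does not reflect the w3query or bucket-vm interfaces, which are not public; the `WORKLOADS` registry, `run_with_receipt`, and `verify_receipt` are hypothetical names, and the identifiers are plain SHA-256 digests rather than real CIDs. It also assumes the workload is deterministic, so a verifier can check a receipt by re-running it.

```python
import hashlib
import json


def content_id(data: bytes) -> str:
    """Stand-in for a CID (assumption): the hex SHA-256 digest of the bytes."""
    return hashlib.sha256(data).hexdigest()


def count_bytes(data: bytes) -> bytes:
    """A toy deterministic workload: return the input length as text."""
    return str(len(data)).encode()


# Hypothetical registry of workloads, identified here by name.
WORKLOADS = {"count_bytes": count_bytes}


def run_with_receipt(workload_name: str, input_data: bytes):
    """Run a workload and build a receipt that indexes the query,
    the input, and the output by their content IDs."""
    output = WORKLOADS[workload_name](input_data)
    receipt = {
        "query": content_id(workload_name.encode()),
        "input": content_id(input_data),
        "output": content_id(output),
    }
    # The receipt is itself content-addressed, so it can be linked to.
    receipt_id = content_id(json.dumps(receipt, sort_keys=True).encode())
    return output, receipt, receipt_id


def verify_receipt(receipt: dict, workload_name: str, input_data: bytes) -> bool:
    """Re-run the workload and check that it reproduces the receipt."""
    _, expected, _ = run_with_receipt(workload_name, input_data)
    return receipt == expected
```

A receipt produced for one input fails verification against any other input, because the input and output digests no longer match. A production design would presumably also bind the receipt to the authorization (for example, the UCAN) and to a signature from whoever ran the workload; this sketch leaves both out.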

And to reiterate the benefit of UCANs:

  • Anyone can delegate other users to use our services
  • Any other service can use the same open protocols to provide users value (including their own storage, compute service, etc.), making using multiple service providers or switching among them frictionless
  • In addition, if you just want to interact with our service directly but benefit from using others’ services, you can delegate permission for us to interact with them on your behalf
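
The delegation idea can be sketched in a few lines of Python. This is only an illustration of the chain-of-delegation check, not the UCAN token format or any web3.storage API: signature verification is deliberately omitted, the `Delegation` class and `chain_allows` function are hypothetical, and the DIDs and capability names (such as `store/add`) are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delegation:
    issuer: str              # DID of the actor granting the capabilities
    audience: str            # DID of the actor receiving them
    capabilities: frozenset  # placeholder capability names


def chain_allows(chain, root_did: str, invoker_did: str, capability: str) -> bool:
    """Check that a delegation chain starting at root_did lets invoker_did
    use capability. Signatures are NOT checked in this sketch."""
    if not chain or chain[0].issuer != root_did:
        return False
    allowed = chain[0].capabilities
    for prev, link in zip(chain, chain[1:]):
        # Each link must be issued by whoever received the previous one.
        if link.issuer != prev.audience:
            return False
        # A delegate can pass on only a subset of what it holds.
        if not link.capabilities <= allowed:
            return False
        allowed = link.capabilities
    return chain[-1].audience == invoker_did and capability in allowed


# Placeholder DIDs for illustration only.
alice, bob, carol = "did:example:alice", "did:example:bob", "did:example:carol"
chain = [
    Delegation(alice, bob, frozenset({"store/add", "store/list"})),
    Delegation(bob, carol, frozenset({"store/add"})),
]
```

With this chain, `chain_allows(chain, alice, carol, "store/add")` is true, while asking for `"store/list"` is false because Bob delegated only a subset of his permissions to Carol. In a real UCAN, each link would also carry a signature from the issuer's private key, which is what lets a service verify the chain without an external source of truth.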

Next up: The platform in practice

In this post, we introduced the protocols and products within the web3.storage platform. In the next post, we will discuss how the platform is used in practice - how you might build the next killer app on top of it, and what more to expect in the future!