
Part 1: Authentication and Crypto Wallet Management – Ethereum Payment

In this series of articles, I’m doing a walkthrough of a hobby project I’ve worked on, a payment system for the Ethereum blockchain. Go read the intro article for more context about this project.

In this part, I will cover the following topics:

  • Connecting a crypto wallet via a frontend, and detecting states in MetaMask.
  • Backend endpoints needed to manage wallet connections, wallet signature, and JSON Web Tokens.
  • Data models for users.

You will find that there isn’t much complexity in this article. Authenticating a crypto wallet is mostly about calling methods of the wallet API from the frontend, and then passing data between the frontend and the backend for verification.

Connecting a Crypto Wallet

Any software you build that interacts with the Ethereum blockchain, whether it reads data or sends transactions, needs to connect to an Ethereum node to access the blockchain.

For most use cases, you won’t run your own Ethereum node, and you won’t call an Ethereum node directly either. Instead, you’ll use a third party that offers a centralized point of access to the blockchain: you connect to their backend servers, which are in turn connected to the blockchain. It’s counter-intuitive to depend on centralized third-party providers when one of the core tenets of the blockchain is decentralization, but that’s what we have to deal with these days.

As a human user of the blockchain, you need a crypto wallet to initiate various actions and execute smart contracts. Every Ethereum client implements the same JSON-RPC specification, which is the standard that applications rely on to interact with the blockchain.
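To make the JSON-RPC layer a bit more concrete, here is a minimal sketch of the request body a client sends to a node. `eth_blockNumber` is a standard method of the Ethereum JSON-RPC specification, while the helper name `buildRpcRequest` is made up for this illustration.

```javascript
// Hypothetical helper: builds a JSON-RPC 2.0 request body for an Ethereum node.
function buildRpcRequest(method, params = [], id = 1) {
    return { jsonrpc: '2.0', id, method, params };
}

// eth_blockNumber takes no parameters; the node answers with the latest
// block number encoded as a hex string.
const body = JSON.stringify(buildRpcRequest('eth_blockNumber'));
```

This body would be sent in an HTTP POST to whichever node or provider URL you use.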

A very common wallet is MetaMask, available as a browser extension and a phone app. It lets you own a crypto wallet, and own and interact with the assets in that wallet that are stored on the blockchain. This is the crypto wallet that I have chosen to support for this project.

The figure below is a complete representation of the sign-in, verification, and token management part of the payment system.

System diagram of the steps involved in connecting and verifying an Ethereum crypto wallet

For my needs, I will only use a few of the actions offered by MetaMask, and in this article only eth_requestAccounts and personal_sign. MetaMask offers more actions, as well as event listeners for when the user changes things in the MetaMask extension itself while browsing your website. For example, MetaMask can store multiple wallets within the extension, and the user can pick which wallet they want to use for a particular website or interaction. Selecting a wallet triggers the accountsChanged event, which you can listen for in your frontend code so that you can react to it appropriately. The full list of actions and events is in the MetaMask documentation.
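As a rough sketch of what this looks like in frontend code, the function below requests the user’s account and subscribes to account changes. The `request({ method })` and `on(event, handler)` calls are the provider interface MetaMask exposes on `window.ethereum`, and the account method is named `eth_requestAccounts`. The names `connectWallet` and `onAccountChange` are hypothetical.

```javascript
// `provider` is the object MetaMask injects as window.ethereum.
// connectWallet and onAccountChange are hypothetical names for this sketch.
async function connectWallet(provider, onAccountChange) {
    // Prompts the user to choose an account; resolves to an array of addresses.
    const accounts = await provider.request({ method: 'eth_requestAccounts' });

    // Fires when the user selects another account in the extension.
    provider.on('accountsChanged', (nextAccounts) => {
        // An empty array means no account is available to the site anymore.
        onAccountChange(nextAccounts.length > 0 ? nextAccounts[0] : null);
    });

    return accounts[0];
}
```

In a browser you would call `connectWallet(window.ethereum, handler)` from the click handler of the sign-on button, after checking that `window.ethereum` is defined.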

One important thing I want to mention is that via MetaMask, or via any other wallet, you are never interacting directly with the blockchain, or with nodes that directly host blockchain data. MetaMask is a centralized for-profit provider that has created its own infrastructure of backend and caching servers connected to the blockchain. When you use MetaMask, your web browser is calling those privately-owned backend servers.

Authentication and Verification

For a payment system, I would want to verify transactions on the blockchain anyway before giving rights to a user or customer. However, even before doing so, I also want the system to verify that the user actually owns the wallet, as a form of fraud prevention. To do this, the wallet signs a random piece of information provided by the server and sends the signature back to the server for verification. This is one of the core ideas of asymmetric cryptography, which is prevalent in blockchain technology.
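The challenge-and-signature idea can be sketched with Node’s built-in crypto module. Note that this is a stand-in to show the principle: it uses plain ECDSA with SHA-256 on the secp256k1 curve, not Ethereum’s personal_sign scheme, which hashes with Keccak-256 and recovers the signer’s address from the signature instead of checking against a known public key.

```javascript
const crypto = require('node:crypto');

// Stand-in for the wallet: a key pair on secp256k1, the curve Ethereum uses.
const { privateKey, publicKey } = crypto.generateKeyPairSync('ec', { namedCurve: 'secp256k1' });

// Server side: generate a random challenge.
const challenge = crypto.randomBytes(16).toString('hex');

// Wallet side: sign the challenge with the private key, which never leaves the wallet.
const signature = crypto.sign('sha256', Buffer.from(challenge), privateKey);

// Server side: the public key alone is enough to check the signature.
const isOwner = crypto.verify('sha256', Buffer.from(challenge), publicKey, signature);

// The same signature does not validate against a different challenge.
const worksForOtherChallenge = crypto.verify('sha256', Buffer.from('another challenge'), publicKey, signature);
```

Only the holder of the private key can produce a signature that verifies, which is what lets the server trust the claim of ownership.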

Steps 1 to 14 represent this authentication phase. In step 1, the user clicks the sign-on button, which leads the frontend code into step 2: it triggers an action on the local MetaMask browser extension to retrieve the address of the wallet that the user wants to use.

Then in step 3, the frontend client passes this wallet address to the /api/user/wallet_nonce endpoint. In step 4, the endpoint creates an entry for this user in the database if it doesn’t already exist. In step 5, it generates a nonce that the wallet will use for the verification and stores it in the backend database, and in step 6 it returns the nonce to the frontend client. A nonce, or number used once, is a cryptographic term for a random number that is generated for a specific use, generally a single use, and discarded once its purpose has been fulfilled. You can see the source code for this endpoint here.
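Here is a minimal sketch of the logic behind steps 4 to 6, not the actual endpoint from the repository. It assumes an in-memory Map in place of the MongoDB collection, reuses the field names `eth_wallet_address` and `auth.nonce` that appear in the verification code, and swaps in `crypto.randomInt` as the random source. The function name `issueNonce` is hypothetical.

```javascript
const crypto = require('node:crypto');

// In-memory stand-in for the users collection, keyed by lowercase address.
const users = new Map();

// Hypothetical helper mirroring steps 4 to 6 of the diagram.
function issueNonce(address) {
    const key = address.toLowerCase();

    // Step 4: create the user entry on first contact.
    if (!users.has(key)) {
        users.set(key, { eth_wallet_address: key, auth: { nonce: null } });
    }

    // Step 5: generate a fresh nonce from a cryptographically secure source
    // and store it on the user record.
    const nonce = crypto.randomInt(1000000).toString();
    users.get(key).auth.nonce = nonce;

    // Step 6: this value goes back to the frontend client.
    return nonce;
}
```

Calling `issueNonce` again for the same address overwrites the previous nonce, so only the most recent challenge can be answered.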

The User schema in the MongoDB database.

With this nonce, the frontend client can ask the MetaMask wallet to use the cryptographic key for this wallet to create a signature. This signature will be unique to the pair [wallet_address, nonce], and is therefore a trustworthy guarantee that whoever created the signature is the rightful owner of the wallet. Step 7 in the diagram represents the creation of this signature, which the frontend client then sends to the backend via the /api/user/wallet_verify endpoint in step 8.
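Steps 7 and 8 could look roughly like the sketch below on the frontend. `personal_sign` takes the message and the signing address as parameters, and the body fields `address` and `signature` match what the verification endpoint reads. The function names, the hex encoding of the nonce as UTF-8 text, and the use of a JSON POST are assumptions of this sketch.

```javascript
// Hex-encodes a string as UTF-8 bytes with a 0x prefix.
function utf8ToHex(text) {
    const bytes = new TextEncoder().encode(text);
    return '0x' + Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
}

// provider: window.ethereum, fetchFn: window.fetch (both passed in so the
// function can be exercised outside a browser). Hypothetical function name.
async function signAndVerify(provider, fetchFn, address, nonce) {
    // Step 7: MetaMask asks the user to confirm, then returns the signature.
    const signature = await provider.request({
        method: 'personal_sign',
        params: [utf8ToHex(nonce), address],
    });

    // Step 8: send the claimed address and the signature to the backend.
    return fetchFn('/api/user/wallet_verify', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ address, signature }),
    });
}
```

The nonce itself is not sent back: the backend already has it stored and only needs the signature to check it.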

While in the /api/user/wallet_verify endpoint, in step 9, the nonce is retrieved from the database for the wallet address that the user claims to own, and in step 10, the signature is checked against the pair [wallet_address, nonce]. Below is what steps 9 and 10 look like in the source code, which you can also check directly here in the repository.

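// recoverPersonalSignature is assumed to come from the @metamask/eth-sig-util package,
// and toHex is assumed to be a helper that hex-encodes a string.

// Normalize and validate the claimed address before using it in a query
const claimed_address = String(request.body.address ?? '').toLowerCase();
if (!/^0x[0-9a-f]{40}$/.test(claimed_address)) {
    return response.sendStatus(400);
}

// Step 9: retrieve the user and the stored nonce (anchored, case-insensitive match)
let user = await User.findOne({ eth_wallet_address: new RegExp(`^${claimed_address}$`, 'i') });
if (user?.auth?.nonce === null || user?.auth?.nonce === undefined) {
    return response.sendStatus(500);
}

// Step 10: recover the address that produced the signature over the stored nonce
const recovered_address = recoverPersonalSignature({
    data: `0x${toHex( user.auth.nonce )}`,
    signature: request.body.signature,
}).toLowerCase();

// Compare lowercase to lowercase, since a client may send a mixed-case address
if (recovered_address !== claimed_address) {
    return response.sendStatus(500);
}

// Step 11: new nonce to prevent replay attacks. It must be saved to the user
// record before the flow continues. Math.random() is not cryptographically secure.
const generated_nonce = Math.floor(Math.random() * 1000000).toString();

// Here continue the flow of the program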

If step 10 is successful, then in step 11, a new nonce is generated right away and stored in the database. This prevents replay attacks, in which an attacker who has captured traffic between the frontend client and the backend reuses the signature to call the server and pass for the real owner of the wallet.
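The effect of rotating the nonce can be shown with a simplified sketch. It leaves the signature out and assumes that `signedNonce` is the nonce a valid signature was computed over. In the real flow the binding comes from the signature itself: once the stored nonce changes, the address recovered from an old signature no longer matches. The record shape and the name `acceptLogin` are hypothetical.

```javascript
const crypto = require('node:crypto');

// Hypothetical in-memory user record with a pending nonce.
const user = { eth_wallet_address: '0xabc', auth: { nonce: '424242' } };

// signedNonce stands for the nonce that a verified signature was computed over.
function acceptLogin(record, signedNonce) {
    if (record.auth.nonce === null || signedNonce !== record.auth.nonce) {
        return false;
    }
    // Rotate right away so that the same signature cannot be accepted twice.
    record.auth.nonce = crypto.randomBytes(16).toString('hex');
    return true;
}

const firstAttempt = acceptLogin(user, '424242');  // legitimate login
const replayAttempt = acceptLogin(user, '424242'); // captured request sent again
```

The first attempt is accepted and the replay is rejected, because by then the stored nonce has already changed.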

Handling the JSON Web Token

In step 12, the behavior of the application becomes similar to a regular web2 application: the backend generates a JWT, or JSON Web Token, which is used to keep the user logged in and to manage access rights from the client. This is what many applications do after logging users in via a classic login/password or email/password process. From that point on, it’s as if the user had logged in via email, except that as a website owner, you don’t know or store any login, email, or password for the user. You only store their crypto wallet address, and you identify them this way. That’s a huge deal, because it means that crypto wallets could and will eventually replace passwords as the way to authenticate into any system.
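To show what such a token contains, here is a hand-rolled sketch of an HS256 JWT built with Node’s crypto module. It is for illustration only: a real backend would normally rely on a maintained JWT library, and nothing here is taken from the project’s code. Putting the wallet address in the `sub` claim and the function names are assumptions.

```javascript
const crypto = require('node:crypto');

// base64url-encodes a JSON value, as required for the JWT header and payload.
const encodePart = (value) => Buffer.from(JSON.stringify(value)).toString('base64url');

// Hypothetical helper: signs a payload with HMAC-SHA256 (the HS256 algorithm).
function signJwt(payload, secret) {
    const unsigned = `${encodePart({ alg: 'HS256', typ: 'JWT' })}.${encodePart(payload)}`;
    const signature = crypto.createHmac('sha256', secret).update(unsigned).digest('base64url');
    return `${unsigned}.${signature}`;
}

// Returns the payload if the signature is valid and the token has not expired, else null.
function verifyJwt(token, secret, nowSeconds) {
    const [header, body, signature] = token.split('.');
    const expected = crypto.createHmac('sha256', secret).update(`${header}.${body}`).digest('base64url');
    const given = Buffer.from(signature ?? '');
    const wanted = Buffer.from(expected);
    // Constant-time comparison of the two signatures.
    if (given.length !== wanted.length || !crypto.timingSafeEqual(given, wanted)) return null;
    const payload = JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
    return payload.exp > nowSeconds ? payload : null;
}
```

The backend would call something like `signJwt({ sub: walletAddress, exp: nowSeconds + 3600 }, secret)` after a successful verification, and `verifyJwt` on every protected request.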

In step 12, the backend endpoint returns the JWT to the frontend client. At that moment, the client knows that the backend has accepted the wallet as legitimate, and that the appropriate rights have been granted via the JWT. So in step 13, the client stores the JWT and the identified wallet address in the localStorage of the web browser, so that they can be reused for any future request to the backend.
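A small sketch of the client side of this step follows. `setItem` and `getItem` are the standard Web Storage methods of `localStorage`. The storage key names, the function names, and sending the token in an `Authorization: Bearer` header are assumptions, since the article does not say how the token is attached to requests.

```javascript
// storage: window.localStorage in the browser. Key names are hypothetical.
function saveSession(storage, jwt, address) {
    storage.setItem('jwt', jwt);
    storage.setItem('wallet_address', address);
}

// Builds the headers for an authenticated request, or none if there is no session.
function authHeaders(storage) {
    const jwt = storage.getItem('jwt');
    return jwt ? { Authorization: `Bearer ${jwt}` } : {};
}
```

A later request could then be written as `fetch(url, { headers: authHeaders(localStorage) })`.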

Finally, the last sequence of steps is for the client to trigger a polling mechanism for the JWT. Indeed, once the user has been authenticated, the JWT needs to be renewed every hour to make sure that the user keeps access to the website, and after one week of inactivity, the user will be automatically logged out and will have to re-authenticate to access the content or product protected by access rights.
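The renewal and logout rules can be expressed as a small decision function, sketched below. The one-hour and one-week values come from the description above. The function name, the way activity is tracked, and whether the logout is enforced by the client or by the backend refusing to renew are assumptions.

```javascript
const HOUR_MS = 60 * 60 * 1000;
const WEEK_MS = 7 * 24 * HOUR_MS;

// Hypothetical helper: decides what the polling loop should do at time `now`.
// All arguments are timestamps in milliseconds.
function nextSessionAction(now, lastActivityAt, lastRenewalAt) {
    if (now - lastActivityAt >= WEEK_MS) return 'logout'; // one week of inactivity
    if (now - lastRenewalAt >= HOUR_MS) return 'renew';   // token is an hour old
    return 'wait';
}
```

The client could call this from a `setInterval` callback, for instance once a minute, and then either request a new JWT from the backend or clear the stored session.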

What’s next

In the next article, I will cover the data models used to represent products, and to display past and possible products to users. I will also cover how the JWT is used to verify that the user has the access rights they need to access the correct products.


Published in Algorithms and Programming
