Architecting & BUIDLing the GTC Token Distributor

As a developer, one of the beautiful things about open source code is that it tells a story. As you look through the release notes, commit history, features/bugs/issues, a picture emerges. It might not always be a clear picture, but at the very least, you can usually establish a foundation for how and why certain features exist and what they intend to do.

Now that the Gitcoin GTC token drop has come to an end and we are open sourcing a couple of critical components of the system, we thought it would be fun to open source the logic and design process of the Quadratic Lands token distribution too.

In doing so, we hope to supplement the repositories and paint a clearer picture of Gitcoin’s first big step towards progressive decentralization.

We will focus primarily on the backend and web3 components of building the Quadratic Lands and not the UI. If you’d like to learn more about the design of the fantastic Quadratic Lands UI, feel free to let Richard and Octavian know.

Design Philosophy

Gitcoin fosters an open source collaborative development environment that encourages resources, ideas, and input from broad and diverse groups to shape and direct a project.

On the one hand, this can be challenging, especially with the swirl around Gitcoin’s contemporaneous spinout from ConsenSys and its subsequent independence. At times I felt an impalpable sense of chaos working through such a divergent stream of ideas. Features and integrations were proposed, started, worked, then dropped. As an engineer I think a lot about how to minimize tech debt. At times I found our style and my obsession with minimal tech debt at odds.

On the other hand this development style set the stage for a remarkably creative Quadratic Lands. While this is evident in the beautiful UX/UI itself, I also believe it facilitated innovation on the backend and web3 sides. With an open-ended build and design process anchored in feedback and input from diverse groups, the project had room to grow and morph in ways I didn’t anticipate. I will highlight some examples of this as we explore the base engineering requirements.

Exploring The Build Requirements

In the early phases of the project, the goal was to build a one-and-done token distribution experience, not an app that would persist past the token drop. At some point, that changed. I think maybe after we fell in love with the UI. Also, as we added more missions for users to do after the drop itself, it made sense for the QL app to outlive the token claim experience.

In a perfect and decentralized web3 world, the QL interface would exist as a stand-alone decentralized app (DApp), independent of the Web2 Gitcoin.co site. Users would log in with their Ethereum account and broadcast their token claim on-chain through a fully decentralized stack.

Indeed that is what we would have preferred. However, as with any semi-complex engineering puzzle, your choices may impact other parts of the system.

In this case, we were also thinking about how we would be decentralizing other elements of the Gitcoin platform moving forward. We didn’t like the idea of prematurely committing to a web3 tech stack without a better understanding of the next step in the progressive decentralization of Gitcoin.

One of the more critical build requirements that emerged was that roughly 50% of Gitcoin users didn’t have an Ethereum address associated with their profile. This point became a crucial factor in several essential engineering decisions, including our decision to go with the hybrid web2/web3 interface. If we had had an Ethereum address for every eligible recipient, the case for a new and fully decentralized token distribution UI would have made a lot more sense.

As we couldn’t ask users to connect addresses to their accounts without tipping them off that a drop was incoming, we decided that a hybrid web2/web3 interface would be the ideal route.

To stand alone or not stand alone?

From there we faced another critical choice. Should we build a stand-alone app or develop the Quadratic Lands as a Django sub-app on the main Gitcoin.co site?

We weighed out several different factors in this decision. The vital element came down to authentication.

At the time of this writing and the original GTC token drop, Gitcoin requires a GitHub account to log in. That is to say, there is no way to create a Gitcoin account without a GitHub account, and you can’t log in to Gitcoin without your GitHub account.

As the drops would require a user to authenticate into Gitcoin to initiate the claim, from an auth perspective it would be more straightforward to leverage the existing Gitcoin site auth with a new Django sub-app rather than build a new stand-alone app.

With those decisions semi-solidified, we moved on to more exciting stuff like the token and distribution contract. Also, the big question: How can we most securely issue drops to users without having their Ethereum address ahead of time? Before we jump into that though, we should talk about the initial distribution list.

Crafting the Initial Distribution List

From an engineering perspective, the token distribution list is pretty cut and dried. The base distribution list is a CSV file with two required fields: user_id and amount. The user_id maps to an internal user id on Gitcoin.co, and the amount is the token claim amount in wei.

After creating the final list, it’s good to use version control and track the file hash to ensure that all working parties are using the same list. The security of the GTC token drop did not require us to keep the list private (more on that later), but up until this point, we had not released the list publicly.
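To make the list format and the file-hash check concrete, here is a minimal Python sketch. It is not the import script we used. The function names are hypothetical, the sample rows are made up from values that appear elsewhere in this post, and it assumes the amount column holds whole-number wei values for a token with 18 decimals.

```python
import csv
import hashlib
import io
from decimal import Decimal

# Assumption: 18 decimals, so 10**18 of the smallest unit equals one token.
WEI_PER_TOKEN = Decimal(10) ** 18


def file_sha256(data: bytes) -> str:
    """Hex SHA-256 digest of the raw file bytes, for comparing copies of the list."""
    return hashlib.sha256(data).hexdigest()


def load_distribution(data: bytes) -> dict[int, int]:
    """Parse user_id,amount rows into {user_id: amount_in_wei}, rejecting bad rows."""
    claims: dict[int, int] = {}
    reader = csv.DictReader(io.StringIO(data.decode("utf-8")))
    for row in reader:
        user_id = int(row["user_id"])
        amount = int(row["amount"])  # wei values exceed 64 bits; Python ints cope
        if user_id in claims:
            raise ValueError(f"duplicate user_id {user_id}")
        if amount <= 0:
            raise ValueError(f"non-positive amount for user_id {user_id}")
        claims[user_id] = amount
    return claims


def wei_to_tokens(amount_wei: int) -> Decimal:
    """Convert a wei amount to whole tokens for display."""
    return Decimal(amount_wei) / WEI_PER_TOKEN


# Illustrative sample only, not the real distribution list.
sample = b"user_id,amount\n42,65543100000000000000\n49909,9208256099999000035328\n"
claims = load_distribution(sample)
print("sha256:", file_sha256(sample))
print("user 42 can claim", wei_to_tokens(claims[42]), "tokens")
```

Sharing the digest alongside the file lets every working party confirm they hold a byte-identical copy before anything is imported.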

The final list was ratified by a working group, and once ratified, was imported into the Quadratic Lands Postgres DB with a script so that the site knows how much each user is eligible to claim. We also used the list to seed a Merkle tree for the token distributor but we’ll talk more about that later.

While discussing the algorithm used to create the list would also be interesting, it’s probably worth a post of its own and beyond the scope here.

Selecting the GTC Token Contract

The original design premise was straightforward: Establish and distribute an ERC20 token to seed decentralized governance for the Gitcoin ecosystem.

At that point, we didn’t have governance contracts in scope for the project. The idea was that we could use Snapshot for gas-free signature-based voting, and the Gitcoin multisig would execute any proposals that resulted in on-chain actions in good faith.

We decided to fork the UNI token contract as a base for GTC because it had delegation, minting, and we could easily fork the governance contracts too if we decided to add governance down the road (which we did end up doing).

Unfortunately, because we didn’t have Ethereum addresses for each user, we couldn’t cleanly fork and use the battle-tested Uniswap Merkle distributor.

Enter Ethereum Signed Message Service

The going plan at this point was to build the Quadratic Lands app as a sub-app on the main Gitcoin.co website. The GTC token would be a fork of the UNI token contract, and we would use Snapshot for secure voting. We went ahead and implemented the Uniswap governance contracts for on-chain voting and more secure decentralized treasury management.

Next, we set out to structure the token distribution workflow to process claims and issue drops securely from the Quadratic Lands app. Rather than walk through each iteration of the design (it would take a while and probably not provide much value), we’re going to fast forward to our final solution.

Without knowing a user’s address ahead of time, we needed a secure way to verify on-chain that a given claim was valid. At the same time, we didn’t want to rely on trusting the Gitcoin.co web app. If someone found an exploit on Gitcoin.co, we needed to make it as difficult as possible for them to process a fraudulent claim successfully.

We built the Ethereum Signed Message Service (ESMS) to ensure that only token claims issued by Gitcoin could be processed on-chain. In its simplest form, the ESMS is a Flask-based microservice that accepts a token claim request, verifies the request, signs the token claim (EIP-712), and returns the signed claim along with the relevant metadata and Merkle proofs. We did end up adding a Merkle tree to the token distributor to ensure the distributor would only approve claims from a predefined tree. More on that in a bit.

ESMS Deep Dive

Here’s a visual look at how the claim system works:

When a user presses the claim button in the Quadratic Lands app, a POST request is sent to the backend QL app to initiate the claim process. That endpoint runs a series of checks to make sure the requesting user is authenticated, a valid CSRF token is present, the Ethereum addresses are properly checksummed, and so on.

If all checks pass, the claim endpoint on the QL Django app then POSTs a request to the ESMS with the metadata required to receive a signed message claim in return.

Here’s an example of what the ESMS will see coming in:

{
    'user_id': 42,
    'user_address': '0x00000000000000000000000000000000000000',
    'delegate_address': '0x00000000000000000000000000000000000000',
    'user_amount': 65543100000000000000
}

We used IP restrictions to prevent anyone other than the Gitcoin.co server from hitting the ESMS at the network level. We used HMAC signature authentication at the app level to establish the integrity of incoming claim requests. You can see in the QL app where we construct the HMAC signature of the POST body contents and then in the ESMS where the signature is validated.
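For readers who haven’t used HMAC request signing before, the idea can be sketched with Python’s standard library alone. This is an illustration rather than the code in the QL or ESMS repos: the function names, the shared secret, the choice of SHA-256, and the JSON serialization are all assumptions made for the example.

```python
import hashlib
import hmac
import json


def sign_body(secret: bytes, body: bytes) -> str:
    """Sender side: hex HMAC of the exact bytes that will be POSTed."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_body(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Receiver side: recompute the HMAC and compare in constant time."""
    expected = sign_body(secret, body)
    return hmac.compare_digest(expected, received_signature)


# Hypothetical shared secret; a real one would come from configuration.
secret = b"shared-secret-for-illustration-only"

claim = {
    "user_id": 49909,
    "user_address": "0x25468E86ED8eC296de39FcB798C7f212924443AB",
    "delegate_address": "0x4C0a466DF0628FE8699051b3Ac6506653191cc21",
    "user_amount": 9208256099999000035328,
}
# Serialize once and sign those exact bytes, so both sides hash the same input.
body = json.dumps(claim, sort_keys=True, separators=(",", ":")).encode("utf-8")
signature = sign_body(secret, body)

print(verify_body(secret, body, signature))         # untouched request
print(verify_body(secret, body + b" ", signature))  # body altered in transit
```

The important design point is that the receiver verifies the raw request bytes before parsing them, and uses a constant-time comparison so the check does not leak how much of the signature matched.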

The ESMS also runs a host of other checks to ensure the incoming token claim is valid and legit. You can find more information about how the ESMS processes a claim in the ESMS repo here.

Assuming all checks pass, the ESMS will reply to the QL claims endpoint with an object like this:

{     
   "user_address" : <user_address>,     
   "delegate_address" : <delegate_address>,     
   "user_id" : <user_id>,     
   "user_amount" : <claim_amount_in_eth>,     
   "eth_signed_message_hash_hex" : <eth_signed_message_hash_hex>,     
   "eth_signed_signature_hex" : <eth_signed_signature_hex>,     
   "leaf" : <leaf>,     
   "proof" : <proof>     
}

All this happens quickly in the background after users click the claim button. They are then immediately prompted to broadcast a transaction to the Token Distributor contract with the above metadata.

Now that we’ve seen how the ESMS generates a signed token claim, let’s look at how the Token Distributor validates the claim and delivers GTC.

Reversing the Token Distribution Contract

I’m not ashamed to say I have a bit of a crush on the Uniswap Merkle Distributor Contract. The simplicity and elegance are something to see. At the same time, the supporting code for generating the Merkle tree is well documented and easy to follow.

Unfortunately, because we lacked an Ethereum address for all token drop recipients, a fork of the UNI Merkle distributor wouldn’t be a clean lift for us. So, for the GTC token distribution contract, we started from scratch.

Also, after researching the Uniswap airdrop, we noticed that some recipients were confused about what token delegation is and how it works. Specifically, there was frustration because they didn’t know they had to delegate their tokens (even to themselves) before voting.

For this reason, we built delegation into the Quadratic Lands app and executed the delegate function as a part of the token claim process. While this did end up being the single most significant gas cost for the token claim, it also laid the foundation for liquid democracy and the Gitcoin Stewards program. Adding delegation to the drop did require us to refactor major elements of the drop, but it resulted in a strong and positive impact on the initial distribution of the tokens. We hope to see other projects do this in the future.

Let’s take a look at the first successful token drop claim against the GTC token distributor as an example. We can see the incoming claim metadata on Etherscan if we expand “click to see more” and “Decode Input Data.”

0 user_id 49909
1 user_address 0x25468E86ED8eC296de39FcB798C7f212924443AB
2 user_amount 9208256099999000035328
3 delegate_address 0x4C0a466DF0628FE8699051b3Ac6506653191cc21
4 eth_signed_message_hash_hex 0xf2686ee09b7bb6e7fdd8d20ad762a13535bb95bc1db1fdbe257e0afd706b510c
5 eth_signed_signature_hex 0x0e96e69aaef18506779d897725b91852f855cfb8a3b574356423eb859f953e5d0223eebda7225e5f3bb073e3071c5248c935aa379dbe7c24ed5daa451984b22e1b
6 merkleProof 0xc142a41d8cff1e3be16eeb8da966f867efffc3bcaaf993b8d6174e914a6dbd26
0x7332f27fc88e975d15d82aae4aa15ea5d01e9596650866c379fd7656fe514f90
.....
7 leaf    0x6832b71385c4d15b19a35ba4af189bbbfc252090166153387ebaaa70209b8fb3

As we can see, the fields above match what the ESMS returns and what a user broadcasts for a claim.

Let’s walk through each check in the claimTokens function to see how a claim is verified.

// only accept claim if msg.sender address is in signed claim
require(msg.sender == user_address, 'TokenDistributor: Must be msg sender.');

Here we confirm that the address supplied by the user is also the same as msg.sender. If not, the claim will fail. There was a more efficient but convoluted way to run this check as a part of another require statement below but we opted to keep this more readable instead.

Here we make sure a user can only claim once:

// one claim per user      
require(!isClaimed(user_id), 'TokenDistributor: Tokens already claimed.');

We borrowed the logic for isClaimed from the Uniswap Merkle Distributor, but we mapped it to a Gitcoin user_id instead. Once a user claims, they can’t claim again.
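The Uniswap Merkle Distributor tracks claims in a packed bitmap: each 256-bit storage word holds the claimed flags for 256 consecutive indexes. The Python sketch below models that pattern keyed by user_id. It assumes the GTC distributor kept the same layout, and the class and method names are ours for illustration, not the contract’s.

```python
class ClaimBitmap:
    """Python model of a packed claimed-bitmap, one bit per user_id."""

    def __init__(self) -> None:
        # Maps word index -> 256-bit integer, like mapping(uint256 => uint256).
        self._words: dict[int, int] = {}

    def is_claimed(self, user_id: int) -> bool:
        word_index = user_id // 256  # which storage word holds this user
        bit_index = user_id % 256    # which bit inside that word
        mask = 1 << bit_index
        return self._words.get(word_index, 0) & mask == mask

    def set_claimed(self, user_id: int) -> None:
        word_index = user_id // 256
        bit_index = user_id % 256
        self._words[word_index] = self._words.get(word_index, 0) | (1 << bit_index)


bitmap = ClaimBitmap()
print(bitmap.is_claimed(49909))  # nothing claimed yet
bitmap.set_claimed(49909)
print(bitmap.is_claimed(49909))  # the same user cannot pass the check again
print(bitmap.is_claimed(49908))  # neighbouring ids are unaffected
```

Packing flags this way means 256 users share one storage slot, which is cheaper on-chain than a separate boolean per user.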

// claim must provide a message signed by defined <signer>      
require(isSigned(eth_signed_message_hash_hex, eth_signed_signature_hex), 'TokenDistributor: Valid Signature Required.');

Above we check to see if the ESMS issued the claim signature. If not, reject the claim.

// can we reproduce the same hash from the raw claim metadata?      
require(digest == eth_signed_message_hash_hex, 'TokenDistributor: Claim Hash Mismatch.');

Here we take all the provided metadata and hash it together on-chain to confirm/deny that it will produce the same hash. For example, this check will prevent someone from submitting a claim with a valid signature but a different user_id.

The final two checks are related to the Merkle tree proofs, so now would be a good time to talk about what those are for and how they work.

Merkle Tree Generator

At one point during the build process we had a call with Noah from Uniswap. We walked him through our token distribution setup to see if he had any feedback or suggestions. He did. He said that even if we didn’t have user addresses for the claims ahead of time, it might still make sense to use a Merkle tree.

We had ruled out a Merkle tree in favor of the signed message claims early on, and it hadn’t occurred to me that it might be valuable to use both.

By creating a Merkle tree of the hash of user_id and claim_amount, and verifying that leaf exists on our tree on-chain, we could now guarantee that the token distribution contract would only ever process claims that exist on the original final distribution list. This was a win.
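Here is a small, self-contained Python sketch of the idea, not our actual generator. Two things are assumptions for the sake of a runnable example: SHA-256 stands in for keccak256 (which Python’s hashlib does not provide), and each leaf is the hash of user_id and amount encoded as 32-byte big-endian integers. Pairs are sorted before hashing so a proof needs no left/right flags.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Stand-in hash. Ethereum contracts use keccak256, not SHA-256.
    return hashlib.sha256(data).digest()


def make_leaf(user_id: int, amount: int) -> bytes:
    # Assumed encoding: two 32-byte big-endian integers, concatenated.
    return h(user_id.to_bytes(32, "big") + amount.to_bytes(32, "big"))


def hash_pair(a: bytes, b: bytes) -> bytes:
    # Sort the pair so verification does not need to know sibling positions.
    return h(a + b) if a <= b else h(b + a)


def build_layers(leaves: list[bytes]) -> list[list[bytes]]:
    layers = [sorted(leaves)]
    while len(layers[-1]) > 1:
        prev, nxt = layers[-1], []
        for i in range(0, len(prev), 2):
            if i + 1 < len(prev):
                nxt.append(hash_pair(prev[i], prev[i + 1]))
            else:
                nxt.append(prev[i])  # odd node is promoted unchanged
        layers.append(nxt)
    return layers


def get_proof(layers: list[list[bytes]], leaf: bytes) -> list[bytes]:
    index = layers[0].index(leaf)
    proof = []
    for layer in layers[:-1]:
        sibling = index ^ 1  # neighbour within the pair
        if sibling < len(layer):
            proof.append(layer[sibling])
        index //= 2
    return proof


def verify(proof: list[bytes], root: bytes, leaf: bytes) -> bool:
    computed = leaf
    for sibling in proof:
        computed = hash_pair(computed, sibling)
    return computed == root


# Illustrative distribution, not the real list.
distribution = {42: 65543100000000000000, 49909: 9208256099999000035328, 7: 10**18}
layers = build_layers([make_leaf(uid, amt) for uid, amt in distribution.items()])
root = layers[-1][0]

leaf = make_leaf(49909, distribution[49909])
proof = get_proof(layers, leaf)
print(verify(proof, root, leaf))                         # listed claim
print(verify(proof, root, make_leaf(49909, 10**30)))     # inflated amount
```

Only the root needs to live in the contract. Any claim whose user_id and amount were not on the original list produces a leaf that cannot be proven against that root.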

If an attacker were to own Gitcoin.co, and send malicious claims to the ESMS, they would have to iterate through any remaining/unclaimed drops one by one. Not only would that cost them gas, but it would also likely trigger some logging alerts we set up for the ESMS to monitor for suspicious drops. If we caught an attack in progress, we could temporarily suspend the ESMS to prevent loss. This could be considered one small benefit of a non-eth-account based drop.

Adding the Merkle tree would make an attack more expensive and easier to detect, and it would limit the attack surface to unclaimed drops. For these reasons, we decided to go ahead and implement the Merkle tree in the token distributor. Also, the gas cost for verifying the proof on-chain is pretty low, so that wasn’t a problem!

With this configuration, we didn’t have to keep our distribution file a secret while the drop was live, as access to the drop list would not allow an attacker to make fraudulent claims.

You can find our Merkle Tree Generator code here if you’d like to see how that works.

The Audit

Most of the contracts we planned to deploy were forks of battle-tested contracts with minor edits. The token distribution system and the Ethereum Signed Message Service were the exceptions. For this reason, the audit focused on these components of the system.

Working with the ConsenSys Diligence team was excellent. The process was fluid and easy to follow. One of their initial recommendations was to streamline the compiler version across all contracts. Updating the compiler version also required us to upgrade some legacy functions on the battle-tested contracts. We were able to get the compiler version up to 0.6.12.

You can find the full audit report here if you’d like to see detailed information on the audit findings and fixes.

Wrapping Up

From chaos comes beauty. A more standard and streamlined development process undoubtedly makes sense in some cases, for example, if you’re building an enterprise security solution. But if your premise is less structured and the final product is more fluid, there is a lot to be gained from the Owocki yolo style.

After 30 days of the drop being live, we’re pleased to report that we had no security issues with the token drop. We took time before the drop to ensure log collection was set up, and we monitored the logs accordingly. Indeed we saw some clever attempts to circumvent the system, but nothing crazy enough to require action on our part.

From a functional perspective, there is one thing that stands out that we would do differently. After a user broadcasts their claim transaction, there is a short period of time where it can be tricky to track state in the app. The old pending transaction problem.

Anyone who’s dealt with a pending transaction state in an app is probably familiar with this challenge. Once a claim TX is broadcast to the network, it can go into a pending state for an unknown amount of time (especially if the transaction was submitted with a low fee).

Because we didn’t make the QL app smart enough to recognize claims in a pending state, we noticed some users attempting to broadcast redundant claims. While some of these were probably malicious, most of these requests likely stemmed from us not tracking the pending state in the app and preventing users from hitting the claim button again.

In addition to the evidence we see in the ESMS logs of this behavior, we can also see duplicate claim attempts failing on the contract periodically.

Frustratingly, we had this on our radar before launch but didn’t make it a priority. If we did it again, we would certainly do a better job tracking the pending claim state and preventing users from attempting duplicate claims!
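As a rough idea of what that tracking could look like, here is a hypothetical Python sketch. Nothing like this shipped in the QL app. The class names, the three states, and the one-hour timeout are all assumptions. The timeout lets a user retry if a transaction appears to have been dropped, and the contract’s one-claim-per-user check remains the real safeguard.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ClaimState(Enum):
    PENDING = "pending"
    CONFIRMED = "confirmed"
    FAILED = "failed"


@dataclass
class ClaimRecord:
    state: ClaimState
    tx_hash: str
    updated_at: float


class ClaimTracker:
    """In-memory model; a real app would persist this alongside the user."""

    def __init__(self, pending_timeout_seconds: float = 3600.0) -> None:
        self._records: dict[int, ClaimRecord] = {}
        self._timeout = pending_timeout_seconds

    def can_start_claim(self, user_id: int, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        record = self._records.get(user_id)
        if record is None or record.state is ClaimState.FAILED:
            return True
        if record.state is ClaimState.CONFIRMED:
            return False
        # Pending: only allow a retry once the transaction looks stale.
        return now - record.updated_at > self._timeout

    def mark_pending(self, user_id: int, tx_hash: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._records[user_id] = ClaimRecord(ClaimState.PENDING, tx_hash, now)

    def mark_result(self, user_id: int, success: bool, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        record = self._records[user_id]
        record.state = ClaimState.CONFIRMED if success else ClaimState.FAILED
        record.updated_at = now


tracker = ClaimTracker(pending_timeout_seconds=600)
tracker.mark_pending(42, "0xabc", now=0)       # "0xabc" is a placeholder hash
print(tracker.can_start_claim(42, now=300))    # still pending, button disabled
print(tracker.can_start_claim(42, now=601))    # stale, allow a retry
```

With something like this in place, the claim button could be disabled while a transaction is pending, which would likely have removed most of the redundant requests we saw.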

Overall, we’re quite happy with how the token drop went. Hopefully this post will help to add some color to the repositories and provide guidance for others that might want to implement a similar system.