Why is Sharding Hard? NEAR Protocol Founder, Alexander Skidanov. Episode 7: No Sharding Podcast
About Solana: We are a lightning-fast distributed ledger technology for mission-critical decentralized apps. This podcast is a discussion between our staff, community developers, and industry leaders. You can follow us on Twitter @solana or GitHub @solana-labs. Subscribe on Spotify, Apple Podcasts, Google Podcasts, or the direct rss feed.
Anatoly Yakovenko: Welcome to Solana. We are a super-fast block chain project, bringing proof of history and in turn 100,000x speeds to the blockchain ecosystem. This podcast is a discussion between our four staff, industry leaders, and top contributors to our open-source project. Find out more at Solana.com; that’s S-O-L-A-N-A dot com. And you can also follow us on Twitter, @Solana. Now on to the show.
Alexander Skidanov: Hi everyone, this is Alex from Near. I am in this awesome podcast today so let’s see how it goes.
Anatoly Yakovenko: Hi, this is Anatoly and you’re on the No Sharding podcast.
Anatoly Yakovenko: So, Alex, I think we met last year at a conference and you guys were just starting and I think we were just maybe two months ahead of you, right? Solana, we got the company going and I was already kind of building stuff and I met you guys and you were like, “Hey, we are going to do this crazy scaling solution” and I was like, “Hey, we are competing with you to the death.” But it ended up, you guys were cool and friendly and you were working on something totally different than what we were doing. So you want to talk about Near a bit?
Alexander Skidanov: The primary focus of Near, I guess, is to build something that people can actually use. To build a block chain for which you can develop easier, but also so that the applications you build are easy to use. But from the technology perspective, under the hood one of the parts of usability with the blockchain needs to be faster than what is available today. So, the way we make it faster is through sharding. We make it such that not everyone who is operating the blockchain is validating every shard; is validating every transaction, right?
Alexander Skidanov: So if you have 10 million transactions we don’t force every single validator to validate all of them. We split them in some way and make some of the validators validate some of them and some other will validate some others. And then, you tie it all together in such a way that it’s very unlikely that something goes wrong and an invalid transaction goes through.
Anatoly Yakovenko: I mean, that’s really like the primary difference, you know? When I started Solana, I didn’t even think of sharding because I was really focused on this, “I know hardware and I’m going to make it work faster with hardware”, right? To me, it seemed like the only way to do it and I was terrified that everyone else was thinking the same thing and I need to move as fast as I can. But it turns out, no one else is thinking the same thing and we might be the only ones crazy enough to do it.
Anatoly Yakovenko: So why is sharding hard? Why is it such a hard problem that there isn’t this one design that everyone is like, “Oh yeah, that’s the sharding design?”
Alexander Skidanov: First, I think there is sort of the one design that everybody comes up with. Like if you go online, and find some sharded blockchains then load their white papers they will be very similar. But the complexity lays in the fact that because not everybody validates every transaction, you need to be meaningfully certain that if someone at some point did try to approve a transaction which is invalid, that was noticed and the transaction didn’t actually settle on the blockchain.
Alexander Skidanov: So the idea here is that on any blockchain which is not sharded, on any blockchain where everybody validates every transaction, the concept of an invalid transaction doesn’t exist. Like with an invalid finalized transaction. In the sense that, if you’re on Solana or if you’re on BitCoin for that sake, if you’re on a note you will go through the beginning of time and you will validate every single transaction that ever happened, you know the blockchain is valid.
Alexander Skidanov: Though I guess, it becomes a little bit hard even if you’re not running a sharded blockchain, right? If Solana is processing, I don’t know, several hundred thousand transactions per second and it’s actually at capacity, then ten years in replaying all of the transactions will become pretty hard. Right?
Anatoly Yakovenko: It would only take two years to compute at any given time, at most.
Alexander Skidanov: To validate all of the transactions. But the number of transactions grows, right?
Anatoly Yakovenko: Sure, but compute doubles every two years.
Alexander Skidanov: Well, yeah. If that happens then yes.
Anatoly Yakovenko: I mean [inaudible 00:04:06] right because it is truly paralyzable the number of cores will double the amount processing you need.
Alexander Skidanov: Right. But you will never know when it will actually plateau in the amount of computable stuff doubling every two years.
Anatoly Yakovenko: So, single core performance has definitely plateaued but the square area of silicon that people ship is going to continue doubling.
Alexander Skidanov: Well, yeah. I’m by no means an expert there so I cannot argue here. So even today, when people spin off on a theory note, majority of them are not validating all of the transactions from the beginning of time. What they do is they say, “Well, here is the block which I know is the highest today. So presuming I’m not eclipsed in any way, it got to be that it corresponds to the valid chain, or at least if I go one day back from the block that means that either that block as of one day ago is valid or someone for one day controlled more than 51% of miners and was building on top of an invalid block.” Which, for a majority of people it is sufficiently convincing to choose the chain and not to validate everything that happened before that block.
Alexander Skidanov: So sharding kind of uses the same idea. We say that in the moment when a block is produced with some subset of transactions, a subset of all of the validators of all of the block producers in the system they are test to the block. And if the block is invalid you can provide cryptographic proof and you can deprive those people from some value.
Alexander Skidanov: So the idea is that if you are looking at some chain that has been built for a year; if everything is built properly you sort of know that if that chain has anything broken in it, someone would have lost a lot of money.
Anatoly Yakovenko: So you’re still relying on slashing to validate-
Alexander Skidanov: Yes.
Anatoly Yakovenko: To provide security between the shards?
Alexander Skidanov: I think in the blockchain space, sharding historically can mean one of two different things. One is, what we built or what Ethereum is building, where you make an assumption that it is impossible for anyone to validate all of the shards, or at least it’s impossible for a majority of actors. Maybe a particular coin base who does need to be certain that everything is valid, they might buy a huge cluster and invalidate all of the shards. But you expect the majority of validators in a decentralized system and they cannot validate most of the shards.
Alexander Skidanov: And there is another school of sharding with like QuarkChain and CADENA, where the idea is that it is the expectation that the majority of entities are validating a majority of shards. Some of them might be validating on the subset. At least in CADENA’s case, I think in QuarkChain they actually expect everyone to validate all of them. But it is expected that most of them validate all of the shards.
Alexander Skidanov: So in this case, you can still rely on yourself replaying all of the transactions. So that’s called sharding, it’s not relying on slashing, they use proof-of-work they’re not using proof-of-stake. But in the school of sharding that Near and Ethereum are in, you do need to rely on slashing because someone needs to lose something that has a lot of value if they misbehave.
Anatoly Yakovenko: You’re also relying on slashing just for the internal shard validation, right? Because each shard is running its own consensus.
Alexander Skidanov: So not in the latest Near design. So what we have been building since April is a little different.
Anatoly Yakovenko: Oh cool.
Alexander Skidanov: So when we started Near, our initial design was very similar to what many others built, right? Like if you unload any sharding papers, if you unload Harmony for example, or if you unload MultiVAC, they’re all sort of similar and we had the same design. That’s what you would come up with if you just sit with a piece of paper and try to design sharding. And the idea would be something along the lines of let’s have not one blockchain but multiple blockchains. Every blockchain will run its own consensus and there’s going to be one overarching blockchain. It’s usually called beacon-chain, I think it comes from Ethereum and many people just use the term. And on the beacon-chain, the idea is that every time the block is produced on the shard chain, the hash of it goes to the beacon-chain and the fork-choose rule on the shard-chain respects the last blog that was snatched to the beacon-chain.
Alexander Skidanov: So for example, if there is a longer chain that does not have a block which is on the beacon-chain that chain will not be respected. Right and so the idea then is that unless beacon-chain works out, if a block is finalized from the beacon-chain from the shard-chain then the shard-chain can not fork out. They call this concept shard security.
Anatoly Yakovenko: So the beacon-chain is the linearizer for the blocks from the shards? Okay.
Alexander Skidanov: Yeah. That’s the design of sharding that many protocols use. And that design, we spent a lot of time this year, last year, sort of promoting the idea that that design has a huge security hole. Which is, in that design, I guess what is important to mention is that the block producers, the people who produce blocks in every shard, if they choose which shard to produce blocks for, that is obviously insecure.
Alexander Skidanov: Because an adversary with very little cash power or stake power can just throw all of their resources to one shard and just overtake it. Right and so they need to be assigned randomly and here the problem is that while you can do some math and prove that, let’s say if the total population of block producers has less than 25% of malicious actors then every shard will have less than 30 guarantee it if you choose constants properly. Just due to the fact that you sampled from a… if you sample from a collection which has certain distribution, your sample will have a similar distribution. Plus/minus variance, right.
Anatoly Yakovenko: So if you do 10,000 choose 100. That’s a huge number. So the combinatorial that you should be able to choose random 100-slots, right?
Alexander Skidanov: Right.
Anatoly Yakovenko: Out of the 10,000 that represented distribution that is very unlikely to have a-
Alexander Skidanov: Exactly. Yeah. So that’s usually the idea. Every protocol will choose two constants the size of the total population of validators inside of a single shard and then they will run a formula to say “yeah, you know”. If the total population has less than 25%, we are certain that the sample has less than 33.
Anatoly Yakovenko: But don’t you end up with any significant stakeholder basically validating every shard? Because if I have 10% then that means I have a thousand slots, right?
Alexander Skidanov: Yes. And so the idea there is that if you are a significant stakeholder you’re expected to have hardware, you should be capable of validating all of the shards. But in a sufficiently decentralized system those significant stakeholders should not control too large of a percentage. You want them to control less than 30, maybe less than 40% combined and you want a long tail of the smaller stakeholders. So those small stakeholders, they can not validate all of the shards. They have a smaller stake, they have less hardware available to them.
Alexander Skidanov: But then, the security issue in that model, where in practice it probably doesn’t work very well, is that an adversary will not try to control a large percentage of the population before the sampling happens. The adversary will wait for sampling to happen, wait until they know who is validating which shard, and then they will just try to identify people in the specific shard, go to them and try to adaptively corrupt them. And that is going to very cheap if you have hundreds of shards, those people in one of the shards, that’s only 1% of the total stake. Right? And that stake is the only thing that is valuable to people, that’s the only thing they can lose. So you can go there with that stake x2 and just bribe them. Give them the stake multiplied by two. Which will-
Anatoly Yakovenko: Remote code execution, right?
Alexander Skidanov: Well, that is a separate topic. That’s more-
Anatoly Yakovenko: That’s the more likely attack, right? Where we just wrote this code over the last year, coding till 3:00AM drinking a lot of coffee, right? What’s going to happen with all of these chains when they launch?
Alexander Skidanov: Right. So it’s a little different because if there is remote code execution, it’s likely that most of the shards will be attacked. So it’s more likely that the society will be okay with a hard fork. So, at least in my mind, this is less of a risk than if you corrupt a single shard that might not be revealed for a while and it might not be sufficient for the-
Anatoly Yakovenko: But a sophisticated attacker, right? One is anyone with a significant stake has higher levels of security, right? To do some sort of proxy-sider that’s in a sandbox and stuff like that. But if you’re talking about an extremely large distributed network of unsophisticated low-stake nodes, they’re likely storing their keys on their computer that they’re running this, right?
Alexander Skidanov: Yeah.
Anatoly Yakovenko: Right. It’s just a bunch of poorly configured devices and you’re hoping the aggregate security of that is higher.
Alexander Skidanov: Yeah. But I think the bigger point here is that if a single shard gets corrupted, it doesn’t matter if it was corrupted by bribery or remote code execution. You need the system to be able to recover from it. But remote code execution is a slightly different quantum, because if your remote code execution is discovered, you’re doing it hard fork no matter what. Because once you discover the remote code execution for you and attack code zero, so you can continuously attack the system and then the system loses-
Anatoly Yakovenko: But it might not be an attack in your code, right? We’re talking about hard [crosstalk 00:12:35]. Right the entire stack needs to be hardened. We’re booting up our Tour de SOL right now and there’s a set of professional validators that understand that they are storing keys that are custody of a lot of value, right? And those need to be stored properly but there’s not 10 thousand of them. And if you want a network with 10 thousand, you’re talking about poorly configured devices.
Alexander Skidanov: Yeah, that’s a risk. And so from here so let’s say the adversary does manage to corrupt an entire shard, then they can not do a fork to the past from the last block which was snatched onto the beacon chain. But what they can do is they can create an invalid block and if they manage to corrupt two-thirds of the validators, they will sign on it, and they will snatch unto the beacon chain.
Alexander Skidanov: The beacon chain has no way to validate it. The beacon chain doesn’t have capacity to validate all of the shards. And so now we have a block finalized that is completely invalid. And if you do some cross shard communication to another shard, again another shard does not have the capacity to validate all of the shards from which cross-shard communication is happening. So, also they will respect the block, they will check the numerical proofs, they will say, “Yeah, yeah that’s great. Let’s execute it.”
Anatoly Yakovenko: If over 30 or 33% of the stakeholders are the large validators that validate every shard. Then the network should halt, right? Because they’re like, “Oh screw this”, right, “We’re not validating this block.”
Alexander Skidanov: Right. Yes. So if over 33% are validating every shard, yeah, at least it will help. Right. Which is also very bad because you sort-
Anatoly Yakovenko: Not as bad as somebody selling a token, converting it to Z-cash and being gone, right?
Alexander Skidanov: So technically if more than 33% validate all the shards, that should be extremely complex to actually corrupt 66% of a shard. So likely you will corrupt at least a few of the bigger stakeholders in that case, right? So that is not as bad. So that problem is called state validity problem. You need some extra measures to validate state. There are some common solutions to that, the most common solution is you say that there is some period after the block is produced, when everyone can challenge it.
Alexander Skidanov: And so that problem brings us to a second problem which is way more complex problem which is called data viability problem. Which is, if you actually do control 66% of a shard, you will produce then invalid block but you will not send it to anyone. You will only send the header of the block and the numerical proofs that are needed for the tower shard communication you are initiating.
Alexander Skidanov: But you will not send a block to anyone and so the honest actors in the shard, they cannot validate the block, they cannot prove it is wrong and so they can not challenge it, right? So we need some way of ensuring that every block that is produced on the shard chains is available to everyone. So that is significantly more complex problem. And very few protocols are actually actively addressing it, and Ethereum Serenity has a solution, PolkaDot has a solution, and we have a solution, which is almost identical to PolkaDot’s.
Anatoly Yakovenko: So what is the solution?
Alexander Skidanov: Both Ethereum, ours, and PolkaDot’s solution, they all rely on erasure code in one way or another. When the block is produced, it’s erasure coded. The shares of the erasure code is distributed to some large percentage of people. And you somehow make yourself convinced by sampling, that there is sufficient shares to reconstruct the block. And the difference is that in Serenity, it’s called fraud-proofs, it’s a paper by the [inaudible 00:15:53] I don’t remember the second author, they actually do sampling from a wide range of lite clients.
Alexander Skidanov: While in Polkadot and Near, the recipients of the shares are the actual block producers, the whole set. Not just the block producers of a shard but the entire set of block producers. And you need 16% in our case, to reproduce the block. Which means that if you have a block which has 50% [inaudible 00:16:14], if you make an assumption that less than 33% will be concealing their shares, you have enough to reproduce the block.
Anatoly Yakovenko: But you still can’t validate the block, right? You’re just trying to fish for-
Alexander Skidanov: So now, if you have a block which has 50% of signatures of all of the validators of the global set, you know that 16% are available you can get that 16%, now you have the full block. Right?
Anatoly Yakovenko: Sure, but somebody still needs to go look at the block and decide, “Oh, this is a bad block.”
Alexander Skidanov: So this is where, I think, our approach is slightly different from, let’s say, PolkaDot. In fact, in PolkaDot they say that either at least one of the validators on the shard is still honest or there’s an external validator, there’s someone external who just cares about shards who exists, who will challenge it. We formalized it a little more, what we are saying is, we have this set of hidden validators. We have a set of, let’s say, there is 100 shards. So if you have 100 shards there’s 10,000 of them, if it’s 10 shards you have a thousand of them, et cetera. Those validators are unknown.
Alexander Skidanov: What you don’t know which shards they are validating. So for a particular validator at the beginning of an epoch and let’s say epoch is like once per day, they internally run a VRF, which tells them which shards to validate. And during the epoch they will be validating those shards. After validating the shards, the blocks, they are not signing on those particular smaller blocks, they are signing on the whole, let’s call it the Beacon Chain block. Right? They sign on the Beacon Chain block saying, “Whatever shards made it up to the Beacon Chain block, I validated them and everything was good.” All right?
Alexander Skidanov: And there’s a couple more measures there because A, they can be lazy, that’s a common problem with fisherman, that fisherman can be just observing the network and C, if nobody challenges then they will not challenge as well, they will say, “Yeah, probably if the block was invalid some other fisherman would have fished. So because they don’t see anything I will just-” and so we have a measure against that.
Alexander Skidanov: Effectively, first commute, yeah there’s going to be [inaudible 00:18:10] scheme, and there’s a couple more smaller attacks. Like for example for data availability, you might want to sign on the block actually not having a share for one of the shards, because if you don’t sign off the block then you’re not going to get the reward if the block actually does make it in. So it is efficient to sign on every block. And if someone tries to unload your share just to pretend the network is slow or something and not give it, so we have some protection against that.
Alexander Skidanov: Also, we don’t model it as a beacon chain and shard chains because there are some complexities with modeling in this way. You have multiple [inaudible 00:18:47] rules, if an invalid block is detected you need to unroll all of them, the interaction becomes complex.
Anatoly Yakovenko: So what happens if you have an invalid block?
Alexander Skidanov: Right. If you have an invalid block you unroll.
Anatoly Yakovenko: So the beacon chain unrolls?
Alexander Skidanov: The beacon chain unrolls as well, yes. And so the way we model it, we just model it as a single blockchain. Instead of saying there are shard chains on the Beacon Chain we are saying there’s one big blockchain with humongous blocks, well the block contains all the transactions for all the shards logically but physically the block is split into those small parts. So you can think of it as multiple shard chains but they are producing blocks as frequently as the beacon chain. But the modeling of this way is easier because I only have one [inaudible 00:19:24] there’s one blockchain and that [inaudible 00:19:27] rule. Now you can experiment with specific [inaudible 00:19:30] without thinking how it interacts with the shard chains.
Alexander Skidanov: But yeah, the beacon chain can unroll if there is an invalid block. So, if actually you see a block, you either optimistically assume that there is nothing invalid in there because someone will lose a lot of money. Or you wait. There is a certain period beyond which it will not unroll.
Anatoly Yakovenko: How do you guys think about slashing? My view is that you need, eventually, networks need to be at 100% slashing.
Alexander Skidanov: Right. So we were building 100% slashing, we just now started thinking that that’s too harsh.
Anatoly Yakovenko: It’s scary right?
Alexander Skidanov: Yeah. And people can get slashed for… But it also depends slashing for what? Slashing for being offline? Initially we did not plan to slash at all for being offline, we just kick you out after a couple epochs. We are thinking right now whether it should be the case or not, whether there should be some slashing for being offline. The problem with it is that you can not effectively prove that someone was offline. Someone can just get censored, you know?
Anatoly Yakovenko: If the network doesn’t like you then-
Alexander Skidanov: Yeah. But you shouldn’t lose money because the network doesn’t like you.
Anatoly Yakovenko: Right, yeah. Right.
Alexander Skidanov: And so maybe kicking out is just good enough and then just not allowing you to rejoin. Which is sort of like a small slashing but it’s more depriving you from a reward. Which people, I think, psychologically are more okay with than actually losing money, even though it’s kind of the same thing. But then, if you produce a straight-up an invalid chunk, chunk is the term we used for the small part of the block which most of a particular shard, so that’s something which would be a shard block in a more classic design. So if you get slashed, the amount you get slashed for should be pretty large.
Anatoly Yakovenko: That you can prove cryptographically?
Alexander Skidanov: That you can prove cryptographically, yeah. Or double signing, right.
Alexander Skidanov: And there the biggest concern is that if there is a bug in the code you can get slashed. But the idea is if you get slashed due to a bug in the code then the chains would probably pay you back. There should be some process built into the chain where the chain can agree… like simple things like that. Such hard forks, quote-unquote, hard forks they should be on the protocol level, simple to do. Like the society should be able to agree on the fact that someone was slashed due to the bug.
Anatoly Yakovenko: If it’s like a catastrophic bug, I think the danger is that if you have an exploit that somebody is very clever and just kind of siphons off, you know, like the superman 3 attack where they get the rounding error of every transaction, or something like that. But then it becomes very hard to undo that state.
Alexander Skidanov: Right. So, if that happens, if someone just applies some invalid state transition and it slowly builds up something, that, again, should go through the social consensus. The social consensus could be like, “Well, you know? We messed up, let’s keep the money.” Or just do something surgical, like remove money. Just subtract the amount we believe that was clearly maliciously acquired. But there are already proof-of-stake systems running. You have people who already get slashed. People got slashed for double signing. Couple of times, people got slashed for… I don’t think anyone ever got slashed yet for producing invalid blocks. Right? But double signing, you can address it on` a level that even though not very sophisticated block producer would not create double signing.
Alexander Skidanov: For example, in our case we don’t use BFT consensus for blocks. On BFT consensus it’s a little hard. Because what was happening is that the note would sign off something, go offline, go online back, it will not have any on-chain information about their signature and they messed it up and they don’t have local information because it was a different device, and so they signed again. And they got slashed, right? But in reality, in our case we use, what is called a Finality Gadget, it’s a slightly different construction which censors the messages over time in the blockchain. They’re all on-chain, there’s no off-chain communication and so when you recover again you synchronize the chain, you’ll know everything you’ve done-
Anatoly Yakovenko: Well, minus any lost votes.
Alexander Skidanov: Any lost votes. But the votes can only go one block ahead. And so, if you speed up the note, as a safety you can wait for one block and then start participating, right? But it also has its own issues. So if everybody went down, everybody recover what you stole-
Anatoly Yakovenko: Yeah, we have a similar approach. You know what’s interesting about that is that, I don’t know if anyone’s coined this term but I think I heard Saki use it, called a synchronous safety. Like with Cosmos or Algorand if I understand correctly, the blocks are fully finalized from the single round. If you don’t do that, then the attacker observes, kind of, a larger portion of the consensus state and my intuition is that the safety of those networks is lower than 33%. Simply because a dynamic attacker observes a larger portion of everyone’s stake and can kind of pick and choose and force a network to, create more forks than necessary.
Anatoly Yakovenko: So at least with denial of service, it seems obvious to me that if I’m an attacker and observe the network trying to come to a consensus in some forks, I always choose the fork that is least likely to come to a consensus.
Alexander Skidanov: Right, but then it’s not safety it’s liveness, right?
Anatoly Yakovenko: Sure.
Alexander Skidanov: Usually for any consensus or finality gadget you can easily prove the safety to be 33% but liveness, yes, liveness is a bigger problem.
Alexander Skidanov: For example, in the case of tenderment, you can show liveness is also provable. You can show that no matter what sort of delays that adversary tried to cause, a block will eventually get finalized. Assuming parts of the synchronous network which is a very meaningful assumption in practice.
Alexander Skidanov: Finality gadgets are worse in this sense. Not a single finality gadget in existence today has a liveness proof. At least not that I know of. Both Caspers have liveness proofs and Grandpa doesn’t have a liveness proof. They all have what is called plausible liveness. They show that they will never get into a state from which no set of messages can finalize a block, but beyond that they just saying from any state you can finalize a block. But an adversary can be doing something funky, right?
Alexander Skidanov: And so we use a finality gadget. I’m playing with the likeness proof right now. And we don’t have it either so there is a good chance that an adversary can, if they control the network, do a sufficient extent and just continue producing onto different forks. But it’s never been done in practice so the proof-of-stake chain has two competing forks building up concurrently. It could be possible but it was never pulled off yet.
Anatoly Yakovenko: I have done a bunch of simulations but that’s… simulations are only simulations, right? So you guys, do you want to talk about the VDF thing, that you guys are-
Alexander Skidanov: Oh, the proof of… Yeah, we can talk about it. At this point it’s not clear if we are going to use it but we are looking into the proof of space time. Why we’re looking into is because proof of stake has this attack called long-range attack, which is pretty painful. And it feels to be very practical, so many people describe long-range attack as something that is hard to pull off but it feels to me that it’s actually not that hard. Because, all you need to do is acquire 66% of the keys at some particular point in the past. And in the past, those keys cost nothing for the people. And so, yeah, it’s probably hard to do but it doesn’t sound completely impossible to do.
Anatoly Yakovenko: Somebody can run a decentralizing exchange for old keys.
Alexander Skidanov: Right. And so there are two solutions, there are two big solutions that exist today. If a protocol builds a proof-of-stake and actually do care about showing some security they will present to you one of the top projects. One is called weak subjectivity, where effectively you say all that line notes, if they see a fork, which diverge more than some time ago, let’s say four months ago, they will not respect that fork even if it’s heavier. Therefore, if all the people who stake or other, if clients go online more frequently than that period of time, so let’s say they have to go online every two months, then nobody will be get fooled. Because everyone will observe the current chain.
Anatoly Yakovenko: If you use old keys, right, the heaviness goes away because you simply observe a totally new chain.
Alexander Skidanov: Right. There some fractures rule, right? The BFT consensus, per say, can not be the fractures rule anymore because both blocks are finalized in both chains.
Anatoly Yakovenko: Right, so my client would effectively see a new genesis block from their perspective. If I can ask and I am on chain-A, I see effectively an old-school verified certificate chain of validators.
Alexander Skidanov: So Cosmos, for example, will just shut down if there is two competing chains, which is one way to do that. They are effectively saying that we favor of safety, you have two competing chains, safety clearly violated, so let’s just shut down, it’s the safer thing to do. But other blockchains, Ethereum for example, they use this concept of weak subjectivity. They are saying that there is some fractures rule, which is effectively the highest block finalizer by Casper.
Alexander Skidanov: But the adversary, because it’s proof-of-stake, there’s no resource that they spent. They can easily, once they did acquire blocks in the past, sorry the keys in the past, they can build a chain significantly longer than the chain built by honest people because honest people produce blocks every ten seconds, and adversary can produce blocks very frequently. And so their chain, the fracture rule will be favoring their chain.
Alexander Skidanov: But weak subjectivity says that if I saw some chain, and there’s now a heavier chain which diverges more than four months ago, I just disregard the heavier chain because-
Anatoly Yakovenko: So the problem is, if you try to use a VDF and fork-choice rule, you have to have all of your validators run a liquid nitrogen cooled [inaudible 00:28:41].
Alexander Skidanov: But before we go into that idea. The reason why weak subjectivity doesn’t work is because the existing clients will not get fooled but when a new client starts, the new client needs to learn that this odd behavior chain is not respected by anyone. So you need some sort of social consensus. Effectively, you need to have a new genesis block every now and then.
Anatoly Yakovenko: My view is that people that discount these attacks just don’t understand the scope of it. Imagine if you have, right now, let’s say there’s $100 million worth of volume on some chain, it’s a large number to me and you but a relatively small number to the rest of the economy. Now imagine if there’s a $1 trillion worth of volume now, then these attacks become very much worthwhile to pull off, right?
Alexander Skidanov: Yeah, exactly.
Anatoly Yakovenko: If you’re a bunch of smart kids out in Eastern Europe that have nothing better to do, then why not?
Alexander Skidanov: So there is a second solution, which I would rather trust a network which uses the second solution, and the second solution is used by Algorand and Oracles, so Cordano, which is effectively that the binary that is running, the default binary to even load, after the epoch ends destroys the keys. Right? So now, to acquire 66% of keys in the past, at that point in the past 66% needed to alter their binaries not to destroy the keys. Like a reasonable actor would do that, the questions is how many-
Anatoly Yakovenko: But not if there was $1 trillion at stake.
Alexander Skidanov: Yeah. Yeah. Right. For the future benefit of you being able to sell the keys but it’s beneficial for you to compile the binary engine.
Alexander Skidanov: And so because of that, we want to walk away from the proof-of-stake and we want some resource. But proof-of-work is disastrously bad. The environment damage from proof-of-work is disastrous. And then you have all the problems from basic resistance and pulling… because of pulling how many pulls do you need to corrupt today to do a fork on Ethereum? Three or four? Right, so that’s a big problem and it’s very hard to discourage pulling.
Anatoly Yakovenko: But a fork is different from a long-range attack.
Alexander Skidanov: Right, but you want neither.
Anatoly Yakovenko: Even if I take over a 51% of Bitcoin right now, I can produce blocks and it costs me the electricity of 51% of the network. But eventually my money runs out.
Alexander Skidanov: Right. But if you control 51% of cash power short term, it’s sufficient for you to pull off an attack where you pay someone a huge amount of money, wait for whatever number of blocks they think is sufficient, they provide you some service. For example they pay you some cash so it’s an OTC transaction, and then you use your 51% to fork it, right? That’s a pretty big attack, you want to avoid it as well.
Anatoly Yakovenko: Except somebody can also write a transaction based on a block before the attack and bribe the miners to ignore my fork, right? So-
Alexander Skidanov: Yeah. There are many things at play, yes.
Anatoly Yakovenko: So it’s not as simple. I honestly think that proof-of-work strikes this weird balance where we might never see this kind of attack.
Alexander Skidanov: It’s very disappointing that Binance did not publish the keys when they got hacked. When they lost seven thousand bitcoins, if they just published the private key it would have been an awesome social experiment. Just to see if miners would have picked up on it and forked. It’s unfortunate they didn’t do that, that would have been great.
Alexander Skidanov: But anyway, so what we are looking into is there’s this concept of proof-of-space-time, which I think was pioneered by SpaceMesh, but it’s also something that Chia Network is doing. And we are trying to get meetings with both of them to discuss their latest advances. The idea is that proof-of-space-time is similar to proof-of-work in many senses, in the sense that there is some resource which you use to create blocks but it doesn’t burn electricity. Like you occupy a lot of space for a large period of time but the electricity is not getting burned and you are just computing a VDF in the meantime.
Alexander Skidanov: And you don’t really need the VDF, you need proof-of-elapsed-time which is easier than a VDF.
Anatoly Yakovenko: I mean, what do they use Equihash which is like proof-of-RAM, basically?
Alexander Skidanov: Proof-of-RAM? Well I didn’t look into it. But that’s another alternative. Like we want some proof-of-resource which doesn’t burn electricity. I guess Trime is using electricity, right?
Anatoly Yakovenko: Yeah, but-
Alexander Skidanov: Not as much?
Anatoly Yakovenko: Yeah.
Alexander Skidanov: Right. But then designing the full system so all those proof-of-spaces they don’t have the same properties of proof-of-work, it’s not a lottery which is very short-term. You need to do that for a long time. And so the dynamics are very different. And so Chia’s green paper, they are construction, which is pretty sophisticated, it only works if honest actors control more than 61%. Which is worse than 15.
Alexander Skidanov: We have some construction which we are going to be publishing, soon ish. We sort of merged proof-of-space-time and proof-of-stake and create something very meaningful, I think, which doesn’t have long-range attacks but also you can not pull at all, which is a nice benefit. And there’s no lottery so you know how much you are going to make. But this is not set in stone yet, it is unclear if we are going to be using it.
Anatoly Yakovenko: So it’s both using both space and time and stake?
Alexander Skidanov: When you say proof-of-space-time, if you compare it to proof-of-work, proof-of-work in this case would be proof of CPU time. So it uses space over some period of time. Same way as [crosstalk 00:33:37].
Alexander Skidanov: But you need storage. But it also uses proof-of-stake. So there is a finality gadget and so the idea is that short-term to revert the block you need, and by short-term I mean one, two epochs, you need to corrupt 33% of the total stake, which is super expensive. Once the stake is unstaked this is when space-time kicks in. And there, the biggest problem we need to tackle is what I call medium-term attack, where for someone it would be cheaper to buy space short-term. Short-term they control almost 50% of stake, of space, and so they can revert one day worth of blocks.
Anatoly Yakovenko: Do you think these systems that have so many complex pieces can actually ever function?
Alexander Skidanov: So the idea is that it’s actually not that complex. If you look at the design, if I show it to you on the whiteboard, it’s actually pretty simple. It has only two moving parts: the proof-of-space and the proof-of-stake. And the way they interact is not that complex. And it is pretty easy to reason about what happens if adversary has certain amount of stake, or certain amount of space, or a combination of the two and at which thresholds they can perform long-range attack while never short-term attack, you need 33% of stake, at least. Or medium-range attack in which case you need either very large percentage of space or smaller percentage of space but also a large percentage of stake. Right? And the thresholds are very well understood and the interactions is not that complex.
Alexander Skidanov: But the problem there is that proof-of-space-time research is not that advanced yet. For example, proof-of-time is not as resistant and proof-of-space is also not that simple. So there are problems there.
Anatoly Yakovenko: VDFs are hard. I mean the real VDFs. What we’re using is kind of a trick but if you’re talking about constructions that use an RSA group, or anything like that, 2048 bit multiplication is a whole pile of computer science problems.
Alexander Skidanov: But the beauty is that we don’t need the VDF, we need the proof-of-elapsed-time. And proof-of-elapsed-time there’s this construction, again by Chia Network, which is very similar to what you use, where you just compute sequential trials. But you don’t need to present all of them, and validate all of them, you can sample them in a smart way that you can be certain that at least that many were computed. But it is not a VDF in the sense that the adversary might have at some point like messed up one computation stat. And so the algorithm is not unique. But the chief instruction does not allow that adversary to compute them in parallel. They still have to be serial.
Anatoly Yakovenko: You’re relying on some trusted set-up, right?
Alexander Skidanov: No.
Anatoly Yakovenko: You need the input to be secure, I think in Chia. So they are using BQF proofs, or some class group.
Alexander Skidanov: Oh, no, no. You’re thinking of different thing. Yes, Chia has too much research. So, you’re thinking about the one which is the alternative to their simul-duplication. But without the trusted set-up. They use some fancy groups. But that’s an actual VDF because Chia does need the VDF. It’s ironic that they have a great paper and proof-of-elapsed-time that they cannot use themself. They do need the VDF.
Alexander Skidanov: But they have another paper, which is specifically about proof-of-elapsed-time, which is just sequential shards. It’s very similar to what you guys do. So they can put sequential shards, they numericalize them, and there’s a little trick they do where a particular shard doesn’t just depend on the previous shard, it depends on the previous shard and some other thing, some numerical tree which is being built iteratively.
Alexander Skidanov: And while you can not prove that every shard was computed properly, for that you obviously need to compute all of them, and so that’s why the output is not unique. You can prove that, at least, on the order of that many shards were computed serially. Which is all you care for proof-of-elapsed-time. But because the output is not unique it’s not a VDF so it’s not something that Ethereum would be able to use. Or they themselves because they do need the unique output.
Alexander Skidanov: But we don’t care, all we want to show is that time was spent. So we can use that construction. And that construction, again, like you know, it’s not that easy to… I only know it from you. But you’re saying that try is almost as fast as it can get, right?
Anatoly Yakovenko: I mean, within 10x. And to get to 10x would take $100 million and would be really remarkable.
Alexander Skidanov: Yeah. Right. And so, even if someone gets 10x, then they still need to have space which is, I mean now they can do 10% attack, right?
Anatoly Yakovenko: Yeah.
Alexander Skidanov: So that’s something we’re looking into but whether we are going to use it or not isn’t clear yet. The paper we will publish no mater what because, why not? Maybe someone else will find it fascinating and will want to build it, right? So it’s probably happening this or next week. By the time this podcast is out it could already be published.
Anatoly Yakovenko: I mean the space is so weird. Especially being in Silicon Valley because outside of crypto startups have to show growth, not research papers. What they need to show is month-over-month active user growth that’s within 200%.
Alexander Skidanov: Yeah. But you know in [inaudible 00:38:25] day and night, working with people who have made a program who have like 20% growth week-over-week but it is obviously going to plateau soon. It cannot stay forever. And I assume you also you guys do that, right? But we do some business development, we are on board users.
Anatoly Yakovenko: My theory is that this space is not going to grow unless the number of peer-to-peer non-custodial wallets actual grows. Because that’s effectively the size of the current decentralized internet. It’s just the number of those wallets.
Alexander Skidanov: Yeah, yeah, of course. Our hypothesis is that, I think blockchains bring a lot of value. I personally care a lot about decentralized I don’t care a lot about [inaudible 00:39:05]. Though I think [inaudible 00:39:06] is going to be the first biggest breakthrough that is already happening. But there is other value that people can use, the problem is that it is so hard to use blockchain today that majority of people just cannot do that. [inaudible 00:39:17] process.
Anatoly Yakovenko: I tried to use Auger and I couldn’t get it to work.
Alexander Skidanov: We talk to projects a lot that build in Ethereum today, right? And they know that I am happy with Ethereum but they still see this very high percentage drop-off rate, it’s very high. If they monitor from people who come to the webpage to people who actually start using the service the drop-off rate could be between 95-97%. You cannot build a reliable business if only 2% of your users actually stay after going through all of the hurdles. So I think usability is a bigger problem to tackle. The blockchains need to be usable.
Alexander Skidanov: There are many efforts but I think those efforts are separate. Right, so there are efforts to… like gas stations allow you not to pay for gas. There’s like [inaudible 00:40:05] and a couple others who built better on-boarding experience, et cetera. But there is [inaudible 00:40:11] which makes it faster, transactions don’t take one minute to complete because they are running on layer two. But together they work pretty poorly unless some of them start integrating with each other. SO I think a good centralized, and by centralized here I mean from a development perspective, a good centralized effort is needed to build a very usable blockchain. So that’s what we do.
Alexander Skidanov: Scalability is kind of secondary, scalability is more interesting to tackle. And people are more interested to hear about the scalability uses and solutions. But usability is what will actually make the difference at the end of the day.
Anatoly Yakovenko: Yeah, I agree. It’s much easier to do usability if you can use a faster chain.
Alexander Skidanov: Yeah, of course. So that’s why we’re not building-
Anatoly Yakovenko: You’d have to solve another set of computer science problems, right?
Alexander Skidanov: But also I think why we see so many papers and now we see more and more good papers being published. Like Chia Network, every paper they publish is actually pretty good.
Anatoly Yakovenko: Yeah.
Alexander Skidanov: Right. I think in 2017 most of the papers that were published in the blockchain space were not that good. But that’s because the actual space is very new and a lot of problems are not solved. So that way you do expect a lot of papers to be published. While if you go to the database world, if you go to [inaudible 00:41:23] one of the bigger conferences, right? Not many breakthroughs are there because databases has been around for a while and most of the hard problems are already tackled. So there you cannot do much. I think those are not conflicting things, publishing papers is great but acquiring users is also very good.
Anatoly Yakovenko: Yeah, I agree. When are you guys going to launch, do you think?
Alexander Skidanov: So there’s one deadline we have, which is we need a foundation that takes time. We want to do a proper foundation. We wanted to do [inaudible 00:41:52] foundation but optics of [inaudible 00:41:53] foundation are terrible. And it’s not just optics it’s generally a bad ideas so we are doing a Swiss foundation. That takes time, right? But we want to do it properly. So that is likely to be the biggest blocking factor. But I think we are going to be code-complete this year, whether we can launch this year or not that depends-
Anatoly Yakovenko: Famous last words?
Alexander Skidanov: Like in terms of code, we’re very close. Most of the sharding is running. All of the code is open. People can come and see. There’s a couple of things which are not finished and then we need to stabilize it. And we are already talking to the security of a few companies to initiate the security review. Right? So, it’s very close to the launch and it is very easy to monitor where we are.
Anatoly Yakovenko: Yeah, I mean you guys are making a ton of progress. I poke around the Near protocol discords sometimes.
Alexander Skidanov: Yeah, when are you guys going to launch?
Anatoly Yakovenko: Well, we are actually trying to launch.
Alexander Skidanov: As we speak?
Anatoly Yakovenko: Well, we boot the network we’re doing dry runs and if we can crash it, we’ll try it again until we can’t crash it. That’s basically what launching a permission-less network is, right? You try to gt a group of validators across the world to beat your code. And as soon you can’t crash it, it’s done.
Alexander Skidanov: I see. So you’re running an endless game of stakes?
Anatoly Yakovenko: You know, going from Google Cloud to the real internet means you’re dealing with real configuration issues. A bunch of empty u-paths that are at 500 bytes, stuff like that. Fixing those bugs took a couple of weeks. And our last dry run stopped when people went to bed. Which is kind of interesting. So it wasn’t like a code failure it was a human failure.
Alexander Skidanov: Oh they just shut down the node?
Anatoly Yakovenko: Yeah. And when you boot up the network we’re booting everyone with equal stakes, right? So whoever just happens to be there. So enough of them shut down to where we lost about 40% of the stake and then the network’s like, “Hey, I can’t find super majority so I’m going to stop voting,” right?
Alexander Skidanov: Yeah. So one thing we designed is that we make the system to favor availability. So even if you lose 90% of the stake, Near will continue operating. And for a day or two, depends on how we configure the length of the epoch, it will not finalize blocks using the finality gadget but there’s still low-level [inaudible 00:44:14] rule. So blocks still produce from any user, so that’s good enough.
Anatoly Yakovenko: Yeah, we’re still producing blocks but the question is what happens, right?
Alexander Skidanov: But the low-level fractures rule, for example if you’re using LMG Ghost, we didn’t figure out exactly what it’s going to be but probably LMG Ghost. LMG Ghost is very sticky. Meaning that unless you do believe that majority whose offline when they go online will want to behave, you know, weirdly and not respect the chain, that the chain is probably going to stay. And then in a couple of days you start finalizing blocks again. But if you favor consistency you should just stall. Like, for example, in Cosmos everyone will stall at that point.
Anatoly Yakovenko: So, we’re still producing blocks. Just nobody will see a super majority on them. That means when the validators come back online then they actually join the network. Then that entire fork can be fully finalized.
Alexander Skidanov: But the question is, let’s say they never coming back online. Will at any point, will they be removed from the validator set so that there’s a new supermajority?
Anatoly Yakovenko: That is the censorship question, right? Do we allow the network to censor offline validators or not? Right now, we don’t. We basically decide that, I think when we boot… if you look at Cosmos and their up-time, they have 98% up-time for validators.
Alexander Skidanov: But all of their validators are professional validators, right? So 98 even sounds low, you know?
Anatoly Yakovenko: And for our particular approach, since there is no sharding it means the machines are bigger. Effectively The pool of validators we’re going after are the professional validators.
Alexander Skidanov: Yeah. That’s a valid approach. In this case, favoring consistency is very meaningful. Right? We’re more in the Ethereum camp where we’re saying, “Let’s favor reliability and if someone does want to favor consistency they can do it locally.” You can always locally say if the block doesn’t have… if there’s a full epoch without super majority you just don’t respect anything built on top of it. Right? Because the future epochs, they have a smaller validator set so [inaudible 00:46:09].
Alexander Skidanov: So that’s our current philosophy. But I think some sort of, let’s call it Cosmos school of consistent networks. It’s obviously also a valid way to build a network.
Anatoly Yakovenko: Yeah. I mean it’s crazy that we have to make computer science design decisions like every day.
Alexander Skidanov: That’s why blockchain is fun.
Anatoly Yakovenko: Yeah, yeah, yup.
Alexander Skidanov: Cool.
Anatoly Yakovenko: Cool. Do you want to talk about anything else? Do you guys have any announcements you want to make?
Alexander Skidanov: We have a Beta program running. So if you’re building an application, check it out. Nearprotocal.com/beta. We are also running an ambassador program, we want people all over the globe to talk about Near, to promote Near. So if you’re somewhere in the world, check it out. Just google Nearprotocal.com/ambassadorprogram.
Alexander Skidanov: Yeah, that’s where we are.
Anatoly Yakovenko: Cool.
Anatoly Yakovenko: Hey everybody, thanks for listening to this episode. If you have any questions for our guests or want to continue this discussion, please check out our website at Solana.com, that’s S-O-L-A-N-A dot com. There are links to our Discord, where most of our communication happens in the company.
Anatoly Yakovenko: Also you should check out our GetHub page where we post all of our code for you to check out and even help out with. GetHub.com/Solana-Labs.
Anatoly Yakovenko: You can also follow us on Twitter @Solana.
Anatoly Yakovenko: Thanks for listening. See you next week.