I’m still very new to deep learning, and I don’t yet have the skill to experiment with this, but I thought it was a cool idea that someone else might find interesting. Basically, I think the bitcoin blockchain is the perfect set of training data for someone who wants to break the SHA-256 algorithm, and I think deep learning is perfectly suited to this task.
You could use the blockchain as your training data and the SHA-256 algorithm as your loss function. If successful, you would be able to give your model a hash from the blockchain, along with a timestamp and list of transactions, and it should be able to generate the nonce required to mine a block.
If you’re not at least a little familiar with how bitcoin mining works, maybe none of that makes sense. To get a sense of what I’m talking about, you can play with the SHA-256 hash online here:
Type in random stuff, and imagine that you are trying to get an output that starts with 10 zeros in a row. This is basically what’s stored in the blockchain: (hash with 10 leading zeros) + (transactions and timestamp) + (random number) = another hash with 10 leading zeros. These outputs are considered to be random, but because they are based on an algorithm, I suspect that they are not truly random.
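For anyone who would rather poke at this in code than in a web page, here is a minimal sketch of the "find a random number that gives leading zeros" game described above. It is a toy under stated assumptions: the function name `toy_mine`, the payload string, and the way the pieces are concatenated are all made up for illustration. Real Bitcoin mining hashes an 80-byte block header twice with SHA-256 and compares the result to a numeric target, so this only shows the general shape of the search.

```python
import hashlib

def toy_mine(prev_hash: str, payload: str, zeros: int = 4, max_tries: int = 5_000_000):
    """Brute-force a nonce so that SHA-256(prev_hash + payload + nonce)
    starts with `zeros` hex zeros. Hypothetical toy, not the Bitcoin format."""
    prefix = "0" * zeros
    for nonce in range(max_tries):
        data = f"{prev_hash}{payload}{nonce}".encode()
        digest = hashlib.sha256(data).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
    return None  # no nonce found within max_tries

# Made-up "previous hash" and "transactions + timestamp" payload.
result = toy_mine("0" * 64, "alice pays bob 1 coin @ 2018-01-01", zeros=4)
print(result)
```

Each extra hex zero makes the search about 16 times longer on average, which is why 4 zeros finishes almost instantly here while 10 would take on the order of a trillion guesses.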
The blockchain is basically a recursive set of low entropy outputs from the SHA-256 algorithm, which seems like the ideal training data for someone who wants to break this algorithm. I have no animosity towards bitcoin, but I think this would be an amazing accomplishment.
Edit- I considered deleting this because if someone succeeds, my bitcoin will be worthless!
In general, the blockchain algorithm is very interesting, because it could be the future of all kinds of credentials and confirmations. In fact, SHA-256 has already been in use for a long time in many places. I’m thinking of document authentication. Also, hasn’t bitcoin entered a deflationary phase this year?
See this post on Cryptography Stack Exchange: “Computational requirements for breaking SHA-256?”
I don’t see how what you’re proposing gets around the computational limitations of deterministic (non-quantum), even highly parallelized, computers – but I’m interested to hear more.
- Is breaking the hash itself required, or is it something about the way the hash is used that would make it vulnerable to attack?
- Does how the hash is used provide an angle for breaking the hash without brute-forcing, say, 2^2000+ guesses?
Well, presumably if it were possible, someone would have done it. I just wonder if, given enough examples of certain inputs leading to certain types of outputs, could deep learning yield some insight into the nature of the hash which reveals it to not be truly random?
Are the new hashes even related to previous ones? I mean, wouldn’t that be the same as trying to figure out this week’s Euromillions numbers (similar to Powerball in the US?)?
It’s a little different from finding Powerball numbers, because there are many possible “right” answers. You’re looking for a hash with a bunch of leading zeros, and you don’t care what the other characters are, so there are a LOT of right answers.
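To put rough numbers on "a LOT of right answers", here is a small sketch. It assumes the simplified rule used in this thread (a fixed count of leading hex zeros, rather than Bitcoin's real numeric target), and the function names are just illustrative. It counts how many of the 2^256 possible digests qualify, and checks empirically that the hit rate on arbitrary inputs matches what a uniform output would give.

```python
import hashlib

def valid_digest_count(zeros: int) -> int:
    """Number of 256-bit digests that start with `zeros` hex zeros."""
    return 2 ** (256 - 4 * zeros)  # each hex zero fixes 4 bits

def expected_fraction(zeros: int) -> float:
    """Chance that a uniformly random digest starts with `zeros` hex zeros."""
    return 16.0 ** -zeros

def observed_fraction(zeros: int, samples: int = 100_000) -> float:
    """Hash the strings "0", "1", ... and measure how often the prefix appears."""
    prefix = "0" * zeros
    hits = sum(
        hashlib.sha256(str(i).encode()).hexdigest().startswith(prefix)
        for i in range(samples)
    )
    return hits / samples

print(valid_digest_count(10) == 2 ** 216)          # 10 hex zeros still leaves 2^216 digests
print(expected_fraction(2), observed_fraction(2))  # the two should be close
```

So there really are astronomically many winning hashes, but they are spread so thinly that each guess only wins with probability 16^-10 (about one in a trillion) for 10 zeros. The empirical check is only a sanity test on one statistic; it cannot prove the hash has no hidden pattern.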
So the question is, is there some discernible pattern here? Given a hash with specific characteristics (in this case leading zeros) is there a way to know what to add so that I can run it all through SHA256 again and get a hash with the same characteristics?
The reason I think this is an interesting deep learning question is that we already have the perfect set of training data in the blockchain itself. That blockchain required ridiculous amounts of processing power to create, so if there’s a pattern there, we can make use of that massive dataset to maybe gain a deeper understanding of the hash algorithm that underlies it.
I guess I need to study more so I can test this myself.