How does GPT-2 or LSTMs store 40GB in just 800MB?


LOCKSUIT

« on: July 22, 2019, 12:36:12 pm »
GPT-2 was trained on 40GB of text, all stored in its neural network weights (it's a Transformer; LSTMs are another type of net). Anyhow, my trie can't even hold 1GB of text (which, after eliminating repetitious words and phrases, comes to roughly 0.2GB) without ballooning to about 20GB in RAM, when it should really only be around 0.2GB. So GPT-2 basically knows 40GB yet only raises my RAM by about 0.8GB, and GPT-2 354M raises it by roughly twice that, about 1.7GB. That's the opposite of what I see: my project takes about 20x more memory than the data I put in, not less. The parameters hold the data, but what kind of trie is this? How can I emulate such compression?
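
A rough back-of-the-envelope sketch (Python, with assumed figures: ~124M parameters for the smallest GPT-2, 4 bytes per fp32 weight, and a guessed ~20 bytes of per-node overhead in a naive pointer-based trie, picked to match the ~20x blow-up described above) shows why the two numbers land so far apart:

```python
# GPT-2 "small" weight memory: ~124M parameters stored as 32-bit floats.
gpt2_params = 124_000_000          # assumed parameter count
bytes_per_param = 4                # fp32
print(f"GPT-2 small weights: {gpt2_params * bytes_per_param / 1e9:.2f} GB")
# -> roughly 0.5 GB, the same ballpark as the ~0.8 GB RAM increase above.

# A naive pointer-based trie, by contrast, pays heavy per-node overhead:
# assume one node per stored character and ~20 bytes of overhead per node
# (pointers, headers) -- both numbers are rough guesses, not measurements.
text_bytes = 1_000_000_000         # 1 GB of raw text
bytes_per_node = 20                # assumed per-node overhead
print(f"Naive trie: {text_bytes * bytes_per_node / 1e9:.0f} GB")
# -> about 20 GB, i.e. the 20x expansion instead of compression.
```

The key difference: the trie's memory grows with the amount of text it stores, while the network's parameter count is fixed in advance and does not grow at all, no matter how much text it is trained on.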

I suppose it's the layer-1 nodes: certain ones make the next layer light up, and so on... sneaky alien storage compression, correct? Maybe I already know the answer, then. Or is it something else?
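
Something like that, yes: the network doesn't store the text itself, only shared weights that summarize its statistics (lossily). A minimal, hypothetical sketch of the idea, a toy one-layer next-word model in numpy with made-up vocabulary and hidden sizes:

```python
import numpy as np

# Fixed-size weight matrices, shared across every training example.
vocab, hidden = 50_000, 256
W_in = np.random.randn(vocab, hidden) * 0.01    # embedding: the layer-1 "nodes"
W_out = np.random.randn(hidden, vocab) * 0.01   # projection to next-word scores

def next_word_scores(token_id):
    # Certain layer-1 units "light up" for this token and drive the next layer.
    h = W_in[token_id]            # hidden activation for the input token
    return h @ W_out              # scores over the whole vocabulary

scores = next_word_scores(42)     # e.g. predictions after token id 42

# Storage stays at vocab*hidden*2 parameters no matter how much text the
# model sees during training: the corpus is summarized in shared weights,
# not stored verbatim, which is where the "compression" comes from.
print((W_in.size + W_out.size) * 4 / 1e6, "MB of fp32 weights")
```

Unlike a trie, adding more training text here only nudges the same weights again; it never allocates new nodes, which is also why the "compression" is lossy rather than exact storage.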

 

