Ai Dreams Forum

Artificial Intelligence => General AI Discussion => Topic started by: hainaa on May 05, 2019, 01:52:57 pm

Title: How to get fixed size BERT representations for sentences
Post by: hainaa on May 05, 2019, 01:52:57 pm
Hi,

I am trying to get a fixed-size BERT representation for each sentence. However, I am only getting fixed-size BERT representations for individual words.

If I have 7 words in a sentence and the size of the BERT representation for each word is 768, then the size of the BERT representation for my sentence = 7 * 768, which is variable and depends on the number of words in the sentence.
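I have read that people average the token vectors (mean pooling) or take the [CLS] token's vector to get one fixed-size vector per sentence. Here is a minimal sketch of what I understand that to mean; I am assuming a recent version of the Hugging Face transformers library, and the model name is just an example:

Code:
import torch
from transformers import BertModel, BertTokenizer

# Any BERT checkpoint with 768-dim hidden states would do here.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

sentence = "BERT gives one vector per token, not one per sentence."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# One 768-dim vector per token, so the shape varies with sentence length.
token_vectors = outputs.last_hidden_state[0]   # (num_tokens, 768)

# Option 1: mean-pool the token vectors -> fixed (768,) for any sentence.
sentence_vector_mean = token_vectors.mean(dim=0)

# Option 2: take the [CLS] token's vector (position 0) -> also fixed (768,).
sentence_vector_cls = token_vectors[0]

print(token_vectors.shape)         # varies, e.g. torch.Size([13, 768])
print(sentence_vector_mean.shape)  # torch.Size([768])
print(sentence_vector_cls.shape)   # torch.Size([768])

Both options give a 768-dimensional vector regardless of sentence length, but I am not sure whether either is the standard way.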

Can anyone please help me figure out how to get fixed-size BERT representations for sentences?

Thank you,
Title: Re: How to get fixed size BERT representations for sentences
Post by: goaty on May 05, 2019, 04:15:41 pm
I've never heard of BERT before, but I had the idea that when you're reading you store the surroundings of your token on both sides, and when playing back you only use the left side. It's better for collecting both overloads and synonyms! That's because meaning in language can come from presupposition or postsupposition, and if you're only getting the history you're missing out on useful data on the other side!

overloading = one word means two different things.
synonym    = two different words mean the same thing.

they are opposites of each other, and only count as pointless difference reduction in my bible, but if it's better, why not add it.

One more thing to say: if you ever collect a synonym, keep it at the exact same index, but then on the side learn the context of how to use it; it still counts as dead the same. :) That's because you want to keep the learning acceleration you're going to get; don't throw that away.

I=me
are=is... but you might as well keep the basic "are you" of English, and you can still retain the compounding you're going to get.
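
Rough sketch of what I mean, in Python (the names are all made up, it's just the idea of synonyms sharing one index):

Code:
# Toy vocabulary where synonyms share a single index,
# so everything learned about one word compounds onto the other.
token_to_index = {}
next_index = 0

def add_token(token, synonym_of=None):
    """Register a token; a synonym reuses the existing index."""
    global next_index
    if synonym_of is None:
        token_to_index[token] = next_index
        next_index += 1
    else:
        token_to_index[token] = token_to_index[synonym_of]

add_token("I")
add_token("me", synonym_of="I")    # "I" and "me" count as dead the same
add_token("is")
add_token("are", synonym_of="is")  # surface form kept, index shared

assert token_to_index["I"] == token_to_index["me"]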
Title: Re: How to get fixed size BERT representations for sentences
Post by: Art on May 06, 2019, 03:06:33 am
Just putting this in with Goaty's remarks: there are routines that can count the words or sentences used in a paragraph or story, some by watching the punctuation (exclamation point, question mark, or period). How or why you are using this 768 representation per word is beyond my pay grade, but there should be a formula to count the total: words * 768, or the same applied to sentences, etc.
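
Something along these lines, just as an illustration (the 768 is the per-word vector size from the question, and the splitting rule is a simple one):

Code:
import re

text = "How big is this? It depends. Each word maps to a 768-number vector!"

# Count sentences by watching the punctuation (period, question mark, exclamation).
sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
words = text.split()

VECTOR_SIZE = 768  # per-word representation size from the original question
total_numbers = len(words) * VECTOR_SIZE

print(len(sentences), "sentences,", len(words), "words,", total_numbers, "numbers in total")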

Perhaps if you were to explain the purpose or pertinence of that number, it might shed more light for those wishing to help.
Title: Re: How to get fixed size BERT representations for sentences
Post by: goaty on May 06, 2019, 12:52:00 pm
Yes, there's lots of word counting in it.  8)