Ai Dreams Forum
Artificial Intelligence => General AI Discussion => Topic started by: keghn on December 03, 2016, 05:50:28 pm
-
MusicNet:
http://homes.cs.washington.edu/~thickstn/musicnet.html
-
Sweet post... I found this very interesting.
:)
-
I knew it would resonate with you :) and I liked it too.
-
Sounds good to me too! O0
-
How to Make a Simple Tensorflow Speech Recognizer:
https://www.youtube.com/watch?v=u9FPqkuoEJ8
-
Machine Learning is Fun Part 6: How to do Speech Recognition with Deep Learning:
https://medium.com/@ageitgey/machine-learning-is-fun-part-6-how-to-do-speech-recognition-with-deep-learning-28293c162f7a#.yi0kmebpy
-
I-SED: Interactive Sound Event Detector:
http://www.bongjunkim.com/ised/
http://music.cs.northwestern.edu/publications/Kim_Pardo_IUI2017.pdf
-
I really want to understand this, but the guy in the video just speaks fast.
Does the neural network simply find the significant similarities between many sound recordings of the same sound?
Also, I didn't understand which code gets the volume meter level.
-
Well, English is my first language and I had difficulty understanding Bongjun Kim.
His pronunciation of certain words was quite different from the way I (and others) pronounce them.
Unfortunately, languages are not defined only by their words, spellings or usage, but also by pronunciation.
-
The work Mr Kim and Mr Pardo are doing is very close in style to my own, so I need no words to understand it.
It does not use a neural net, just pattern matching of frequencies and volume, SVM style, with FFT logic.
The way it works is to load a sound file, then view the sound wave. Use a mouse to select a sound fragment.
Then give that fragment a label or a name to tag it. Then search for repeating fragments across many hours of recording.
Then the newly found fragments are put into a database with time stamps.
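The search step described above can be sketched in code. This is a minimal illustration of the general idea (label a fragment, then find where it repeats in a long recording), not Kim and Pardo's actual implementation; it uses plain FFT cross-correlation with NumPy, and the function name and threshold are my own assumptions.

```python
import numpy as np

def find_fragment(recording, fragment, threshold=0.9):
    """Return sample offsets where `fragment` repeats inside `recording`.

    Sketch only: normalized cross-correlation computed in the frequency
    domain via FFT, so matching is volume-independent and fast.
    """
    recording = np.asarray(recording, float)
    fragment = np.asarray(fragment, float)
    n = len(recording) + len(fragment) - 1
    # Cross-correlation: IFFT(FFT(recording) * conj(FFT(fragment)))
    spec = np.fft.rfft(recording, n) * np.conj(np.fft.rfft(fragment, n))
    corr = np.fft.irfft(spec, n)[: len(recording) - len(fragment) + 1]
    # Normalise by the energy of the fragment and of each local window,
    # so a perfect match scores 1.0 regardless of playback volume.
    frag_energy = np.sqrt(np.sum(fragment ** 2))
    local_energy = np.sqrt(np.convolve(recording ** 2,
                                       np.ones(len(fragment)), "valid"))
    score = corr / (frag_energy * local_energy + 1e-12)
    return np.where(score >= threshold)[0]
```

Each returned offset could then be written to the database with its time stamp, as the post describes.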
-
@keghn
I wrote a simple project similar to this many years ago.
I normalised the audio and ran the whole audio file through a kind of convolution filter that mapped the power/volume to a simple linear gradient, so I ended up with just… up, up, down, up, up, up (110111), etc.; the exact numbers were irrelevant, just whether it moved up or down since the last value.
Once this was also applied to the selected sample it made searching for similar sound bites much faster.
If I remember correctly, I then applied an FFT to the found sample and applied the same normalisation/convolution filter to the frequencies within each ms of the sample.
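The up/down encoding korrelan describes can be sketched like this. It is a guess at the approach from the post's wording, not the original project: frame size and function names are assumptions, and the direction bits (1 = power rose, 0 = power fell) match the 110111 example in the post.

```python
import numpy as np

def updown_signature(samples, frame=256):
    """Encode a waveform as a bit string of volume direction changes:
    '1' if this frame's mean power rose since the previous frame,
    '0' if it fell. The exact power values are discarded, as described.
    """
    samples = np.asarray(samples, float)
    usable = len(samples) - len(samples) % frame   # drop partial tail frame
    power = (samples[:usable] ** 2).reshape(-1, frame).mean(axis=1)
    return "".join("1" if b > a else "0" for a, b in zip(power, power[1:]))

def search(recording_sig, sample_sig):
    """Fast pre-filter: plain substring search on the two signatures."""
    return recording_sig.find(sample_sig)   # -1 if no candidate match
```

Because the signature is just a short bit string, candidate matches can be found with cheap substring search before any expensive FFT comparison, which is presumably why it "made searching much faster".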
Interesting post.
:)
-
@korrelan I know what you mean. I have applied every algorithm in the world, and more, in this format, with 20,000 hours of thought.
This is the front end of my AGI logic. It is not the way the brain does it, but it is a down-and-dirty way that will work just as well. It has its ups and downs. It is easier to troubleshoot. A real NN has all of its temporal patterns up in weight space: natural encryption by default. If you crack it, it is no big deal, because the next person's will be completely different in weight space.
My front-end AGI logic records sound, images and motor position in Mr Kim's format.
-
Could someone explain those circles shown when he talks about neural networks?
-
Which circles?
Do you mean the actual graphical representation of a neuron?
https://en.wikipedia.org/wiki/Artificial_neural_network
:)
-
Circles are nodes and the lines between them are edges in machine learning. In a neural network, each circle is one artificial neuron.
But since artificial neural networks are a subfield of machine learning, there can be crossover.
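To make the diagram concrete: each circle computes a weighted sum of its incoming edges plus a bias, passed through an activation function. A minimal sketch (the sigmoid activation is just one common choice):

```python
import math

def neuron(inputs, weights, bias):
    """One 'circle' in a network diagram: the incoming lines carry
    inputs, each edge has a weight, and the node outputs a squashed
    weighted sum (here via the sigmoid function)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# With zero weights and bias, the sigmoid of 0 is exactly 0.5
print(neuron([1.0, 1.0], [0.0, 0.0], 0.0))  # → 0.5
```

Whole layers of these circles, wired together by the edges, are what the diagrams on the linked Wikipedia page show.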
-
VoCo: Text-based Insertion and Replacement in Audio Narration:
https://www.youtube.com/watch?v=RB7upq8nzIU
-
A.I. Experiments: Bird Sounds:
https://www.youtube.com/watch?v=31PWjb7Do1s