Merge pull request #2 from mertyildiran/example-video

Add an example to demonstrate the training over an audio-visual data
mertyildiran · Jul 25, 2020 · f2c416a
2 parents 7bac48c + c786b2d
commit f2c416a

Showing 6 changed files with 149 additions and 17 deletions.
diff --git a/.gitignore b/.gitignore
@@ -126,3 +126,6 @@ examples/*_test.py
 
 # Visual Studio Code
 .vscode/
+
+# Training videos
+examples/videos
diff --git a/README.md b/README.md
@@ -4,23 +4,23 @@
 
 ## Core Principles
 
-These are the core principles of **exceptionally bio-inspired**, a revolutionary approach to the artificial neural networks:
+These are the core principles of an **object-oriented** approach to artificial neural networks, inspired by **synaptic plasticity** between **biological** neurons:
 
- - **Neurons** must be **objects** not tensors between matrices.
- - **Neurons** should be **GPU accelerated**. (ideally).
- - **Network** must be **architecture-free** (i.e. adaptive).
- - Network must have a **layerless design**.
+ - Unlike current ANN implementations, **neurons** must be **objects**, not tensors between matrices.
+ - Just like current ANN implementations, **neurons** should be **GPU accelerated** (ideally) to provide the necessary parallelism.
+ - While current ANN implementations can only create special cases, a **Plexus Network** must be **architecture-free** (i.e. adaptive) to create a generalized solution for all machine learning problems.
+ - Instead of having to choose a combination of ANN layers (such as Convolution, Pooling or Recurrent layers), the network must have a **layerless design**.
  - There must be fundamentally two types of neurons: **sensory neuron**, **interneuron**.
- - Input of the network must be made of sensory neurons. Any interneuron can be picked as a **motor neuron** (an element of the output). There is literally no difference between an interneuron and a motor neuron except the intervene of the network for igniting the wick of learning process through the motor neurons. Any non-motor interneuron can be assumed as a **cognitive neuron** which collectively forms the cognition of network.
+ - The input of the network must be made of sensory neurons. Any interneuron can be picked as a **motor neuron** (an element of the output). There is literally no difference between an interneuron and a motor neuron, except that the network intervenes through the motor neurons to ignite the learning process. Any non-motor interneuron can be regarded as a **cognitive neuron**; together, these form the cognition of the network.
  - There can be arbitrary amount of I/O groups in a single network.
- - Forget about batch size, iteration, and epoch concepts, training examples must be fed on time basis with a manner like; *learn first sample for ten seconds, OK done? then learn second sample for twenty seconds*. By this approach, you can assign importance factors to your samples with maximum flexibility.
+ - Instead of batch size, iteration, and epoch concepts, training examples must be fed on a time basis, in a manner like: *learn first sample for X seconds, OK done? then learn second sample for Y seconds*. With this approach, you can assign importance factors to your samples with maximum flexibility.
  - **Network** must be **retrainable**.
  - Network must be **modular**. In other words: You must be able to train a small network and then plug that network into a bigger network (we are talking about some kind of **self-fusing** here).
- - Neurons must exhibit the characteristics of **cellular automata**.
+ - Neurons must exhibit the characteristics of **cellular automata**, as in Conway's Game of Life.
  - **Number of neurons** in the network can be increased or decreased (**scalability**).
  - There must be **no** need for a network-wide **oscillation**. Yet the execution of neurons should follow a path very similar to flow of electric current nevertheless.
- - Network should use **randomness** and/or **uncertainty principle** flawlessly.
- - Most importantly, the network **must and can not iterate** through the whole dataset. Besides that, it's also generally impossible to iterate whole the dataset on real life situations if the system is continuous like in robotics. Because of that; the network must be designed to handle such a **continuous data stream** that literally endless and must be designed to handle that data stream chunk by chunk. Therefore, when you are feeding the network, use a diverse feed but not a grouped feed (*like 123123123123123123 but not like 111111222222333333*).
+ - The network should use **randomness** and/or the **uncertainty principle** flawlessly. Consciousness is an emergent property that arises from the cellular level up to the macro scale of the network. It is also an emergent property of the neuron itself, arising from quantum-level uncertainty up to cellular mechanisms. In this view, **randomness** is the cause of the illusion of consciousness.
+ - Most importantly, the network **must not and cannot iterate** through the whole dataset. Besides, it is generally impossible to iterate over the whole dataset in real-life situations if the system is continuous, as in robotics. Because of that, the network must be designed to handle a **continuous data stream** that is literally endless, and to handle that stream chunk by chunk. Therefore, when you are feeding the network, use a diverse feed rather than a grouped feed (*like 123123123123123123 but not like 111111222222333333*).
 
 ### Activation function
 
@@ -87,26 +87,26 @@ Functionality of a neuron is relative to its type.
 
 **publications** holds literally the mirror data of *subscriptions* in the target neurons. In other words; any subscription creates also a publication reference in the target neuron. Similarly, *publications* is the Plexus Network equivalent of **Axons** in biological neurons.
 
-**potential** is the overall total potential value of all subscriptions multiplied by the corresponding weights. Only in sensory neurons, it is directly assigned by the network. Value of **potential** may only be updated by the neuron's itself and its being calculated by this simple formula each time when the neuron is fired:
+**potential** *`p`* is the sum of the potentials of all subscriptions, each multiplied by its corresponding weight. Only in sensory neurons is it assigned directly by the network. The value of **potential** may only be updated by the neuron itself, and it is calculated by this simple formula each time the neuron is fired:
 
 <p align="center">
   <img src="https://raw.githubusercontent.com/mertyildiran/Plexus/master/docs/img/total_potential.png" alt="Total potential"/>
 </p>
-<!-- LaTeX of above image:  \underline{t}otal = ( \underline{p}otential_{0} \times \underline{w}eight_{0} )\ +\ ( p_{1} \times w_{1} )\ +\ ( p_{2} \times w_{2} )\ +\ ...\ +\ ( p_{n} \times w_{n} )  -->
+<!-- LaTeX of above image:  t = \sum_{i=0}^{n} p_{i} \times w_{i}  -->
 
 <p align="center">
   <img src="https://raw.githubusercontent.com/mertyildiran/Plexus/master/docs/img/apply_activation.png" alt="Apply activation"/>
 </p>
-<!-- LaTeX of above image:  \underline{p}otential =  \varphi (t)  -->
+<!-- LaTeX of above image:  p =  \varphi (t)  -->
 
-**desired_potential** is the ideal value of the neuron's potential that is desired to eventually reach. For sensory neurons, it is meaningless. For motor neurons, it is assigned by the network. If it's **None** then the neuron does not learn anything and just calculates potential when it's fired.
+**desired_potential** *`p'`* is the ideal value that the neuron's potential should eventually reach. For sensory neurons, it is meaningless. For motor neurons, it is assigned by the network. If it's **None** then the neuron does not learn anything and just calculates its potential when it's fired.
 
-**loss** is calculated not just at the output but in every neuron except sensory ones and it is equal to absolute difference (*distance*) between desired potential and current potential.
+**loss** *`l`* is calculated not just at the output but in every neuron except sensory ones, and it is equal to the absolute difference (*distance*) between the desired potential and the current potential.
 
 <p align="center">
   <img src="https://raw.githubusercontent.com/mertyildiran/Plexus/master/docs/img/calc_of_loss.png" alt="Calculation of loss"/>
 </p>
-<!-- LaTeX of above image:  \underline{f} ault = \left | \Delta p \right |  -->
+<!-- LaTeX of above image:  l = \left | \Delta p \right | = \left | p' - p \right |  -->
 
 All numerical values inside a neuron are floating point numbers and all the calculations obey to the precision that given at start.
 
@@ -129,7 +129,7 @@ On the second phase of the network initiation, any non-sensory neurons are force
 
 ## Algorithm
 
-Even so the Python implementation of Plexus Network is easy to understand, it will be helpful for readers to explain the algorithm in pseudocode.
+Even though the Python implementation of Plexus Network is easy to understand, it is helpful for readers to have the algorithm explained in pseudocode:
 
 ### Initiation
 

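Review note: the three formulas touched by the README hunk above (total potential `t`, activation `p = φ(t)`, and loss `l = |p' - p|`) can be sketched in a few lines of Python. This is only an illustrative sketch and not the Plexus implementation. The function names are hypothetical, the sigmoid is an assumed stand-in for φ (this diff does not show the real activation function), and returning `None` for the loss when no desired potential is set is also an assumption.

```python
import math


def sigmoid(t):
    """Assumed activation function; the real one is not shown in this diff."""
    return 1.0 / (1.0 + math.exp(-t))


def fire(potentials, weights, activation=sigmoid):
    """Return p = phi(t), where t is the sum of p_i * w_i over all subscriptions."""
    if len(potentials) != len(weights):
        raise ValueError("each subscription needs exactly one weight")
    total = sum(p * w for p, w in zip(potentials, weights))
    return activation(total)


def loss(desired_potential, potential):
    """Return l = |p' - p|, or None when no desired potential is set."""
    if desired_potential is None:
        return None
    return abs(desired_potential - potential)


# Two subscriptions whose weighted potentials cancel out, so t = 0.
p = fire([0.5, 1.0], [0.2, -0.1])
print(p, loss(1.0, p))
```

With the assumed sigmoid, a total of zero gives a potential of one half, so the distance to a desired potential of 1.0 is also one half.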
diff --git a/docs/img/apply_activation.png b/docs/img/apply_activation.png
diff --git a/docs/img/calc_of_loss.png b/docs/img/calc_of_loss.png
diff --git a/docs/img/total_potential.png b/docs/img/total_potential.png
diff --git a/examples/audio_visual.py b/examples/audio_visual.py
@@ -0,0 +1,129 @@
+import time
+from pathlib import Path
+import cv2
+import numpy as np
+from pydub import AudioSegment
+from pydub.playback import play
+import cplexus as plexus
+
+VIDEO_FILE = 'videos/lower3.mp4'
+
+Path("videos/output/original").mkdir(parents=True, exist_ok=True)
+Path("videos/output/training").mkdir(parents=True, exist_ok=True)
+Path("videos/output/evaluation").mkdir(parents=True, exist_ok=True)
+
+audio = AudioSegment.from_file(VIDEO_FILE, "mp4")
+audio_samples = audio.get_array_of_samples()
+
+print('Normalizing audio...')
+# Peak sample amplitudes (not frequencies), used for min-max scaling
+max_amp = max(audio_samples)
+min_amp = min(audio_samples)
+# Convert to float first so the shift below cannot overflow integer samples
+audio_samples = np.array(audio_samples, dtype=np.float64)
+audio_samples = audio_samples + abs(min_amp)
+audio_samples = np.true_divide(audio_samples, max_amp + abs(min_amp))
+
+cap = cv2.VideoCapture(VIDEO_FILE)
+frameCount = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+frameWidth = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
+frameHeight = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
+
+buf = np.empty((frameHeight, frameWidth, 3), np.dtype('uint8'))
+
+ret = True
+chunk_size = len(audio_samples) // frameCount  # audio samples per video frame
+
+SIZE = chunk_size + frameWidth * frameHeight * 3 + 2048
+INPUT_SIZE = chunk_size
+OUTPUT_SIZE = frameWidth * frameHeight * 3
+CONNECTIVITY = 16 / SIZE
+PRECISION = 3
+
+TRAINING_DURATION = 0.01
+RANDOMLY_FIRE = False
+DYNAMIC_OUTPUT = False
+VISUALIZATION = False
+
+net = plexus.Network(
+    SIZE,
+    INPUT_SIZE,
+    OUTPUT_SIZE,
+    CONNECTIVITY,
+    PRECISION,
+    RANDOMLY_FIRE,
+    DYNAMIC_OUTPUT,
+    VISUALIZATION
+)
+
+print("\n*** LEARNING ***")
+for n in range(1):
+    cap.set(cv2.CAP_PROP_POS_FRAMES, 0)
+    print("Doing iteration {0}.".format(str(n + 1)))
+    fc = 0
+    while (fc < frameCount and ret):
+        print(fc)
+        chunk = audio_samples[fc * chunk_size:(fc + 1) * chunk_size]
+        ret, buf = cap.read()
+        if not ret:  # stop if no frame could be read; buf is None in that case
+            break
+        buf_normalized = np.true_divide(buf, 255).flatten()
+
+        # Load data into network
+        net.load(chunk, buf_normalized)
+
+        cv2.namedWindow('video')
+        cv2.imshow('video', buf)
+
+        output = net.output
+        output = np.array(output) * 255
+        learn = output.reshape((frameHeight, frameWidth, 3))
+        learn = learn.astype(np.uint8)
+        cv2.namedWindow('learn')
+        cv2.imshow('learn', learn)
+
+        cv2.imwrite("videos/output/original/{0}.png".format(str(fc)), buf)
+        cv2.imwrite("videos/output/training/{0}.png".format(str(fc)), learn)
+
+        cv2.waitKey(int(1000 * TRAINING_DURATION))
+        fc += 1
+    net.load(chunk)
+
+cap.release()
+
+print("\n\n*** TESTING ***")
+
+fc = 0
+
+while (fc < frameCount):
+    print(fc)
+    (i, j) = ([fc * chunk_size, (fc + 1) * chunk_size])
+    chunk = audio_samples[i:j]
+
+    # Wait for the data to propagate and get the output
+    net.load(chunk)
+    output = net.output
+
+    output = np.array(output) * 255
+    buf = output.reshape((frameHeight, frameWidth, 3))
+    buf = buf.astype(np.uint8)
+
+    cv2.namedWindow('output')
+    cv2.imshow('output', buf)
+
+    cv2.imwrite("videos/output/evaluation/{0}.png".format(str(fc)), buf)
+
+    cv2.waitKey(int(1000 * TRAINING_DURATION))
+    fc += 1
+
+net.freeze()
+
+print("\n{0} waves are executed throughout the network".format(
+    str(net.wave_counter)
+))
+
+print("\nIn total: {0} times a random non-sensory neuron is fired\n".format(
+    str(net.fire_counter)
+))
+
+print("Exit the program")
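Review note: the audio preprocessing in `examples/audio_visual.py` (scale the samples into the 0 to 1 range, then cut one chunk of samples per video frame) can be tried in isolation, without OpenCV, pydub, or Plexus. The sketch below is a hedged illustration, not code from this pull request. Both helper names are hypothetical, and it uses the standard `(x - min) / (max - min)` form, which matches the script's arithmetic whenever the minimum sample is not positive.

```python
import numpy as np


def normalize_samples(samples):
    """Min-max scale PCM samples into [0, 1] as floats."""
    # Cast to float first so shifting integer samples cannot overflow.
    x = np.asarray(samples, dtype=np.float64)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # A constant signal has no range to scale by.
        return np.zeros_like(x)
    return (x - lo) / (hi - lo)


def chunk_for_frame(samples, frame_index, frame_count):
    """Return the slice of samples that accompanies one video frame."""
    chunk_size = len(samples) // frame_count
    start = frame_index * chunk_size
    return samples[start:start + chunk_size]


# Full-range 16-bit samples, as an array of PCM data might contain.
pcm = np.array([-32768, 0, 32767], dtype=np.int16)
print(normalize_samples(pcm))

# Ten samples split over three frames: each frame gets three samples.
print(chunk_for_frame(np.arange(10), 1, 3))
```

Because the chunk size is rounded down, any samples left over after the last frame are simply never fed to the network, which is also how the script in this pull request behaves.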