LockSuit pointed me to this book "Natural Language Cognitive Architecture" by David Shapiro:
The book focuses on natural-language generative transformers, which Shapiro hails as a milestone in AI. He extrapolates from transformers to a generalized AGI, described in concept by the diagram shown below:
The outer loop is a context-driven process that takes inputs from external influences in an environment. The inner loop is a form of streaming consciousness that can reflect on the influences of the outer loop as well as on the inner loop's own inferences, and both loops share a common database. The book doesn't go into much architectural detail, and Shapiro states that current transformer approaches are best suited to single-threaded designs. He also uses SQL relational databases. While generative transformers are able to confabulate and/or extrapolate contextual patterns from prompts and cues that provide a context or goal, they still suffer from cumbersome codifications because of how neural networks encode information. The whole approach starts from the smallest possible data kernel: the characters of a language! The ANNs then have to build up structural elements within their layers that allow them to identify concepts. Burdening ANNs with such data is, IMO, an inefficient use of both the data and the ANNs!
An alternative approach would be to remove the burden of holding the actual structural data of concepts and use databases to provide that resource. But a SQL database is the wrong tool. Why? Relational databases were designed to remove ambiguity, and as such they form relationships between data that are very brittle. Tables have fixed columns, while language deals with data whose types, fields, and context vary. That's a problem: to make relational databases flexible enough, you'd have to use join tables to represent varying field sets for different concepts, so you'd have a concept table that points to a field table. The problem is that you then have to do very computationally expensive joins, which become a nightmare when associating across varying ideas and contexts, as the sketch below shows.
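To make that join-table workaround concrete, here is a minimal Python/sqlite3 sketch of the pattern. The concept and concept_field tables and their columns are purely hypothetical, invented for illustration, not anything from the book:

```python
import sqlite3

# Minimal sketch of the join-table workaround described above.
# Table and column names are hypothetical.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE concept (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE concept_field (
        concept_id INTEGER REFERENCES concept(id),
        field_name TEXT,
        field_value TEXT
    );
""")
db.execute("INSERT INTO concept VALUES (1, 'bank')")
db.executemany(
    "INSERT INTO concept_field VALUES (?, ?, ?)",
    [(1, 'part_of_speech', 'noun'),
     (1, 'sense', 'financial institution'),
     (1, 'sense', 'river edge')]
)

# Every contextual question becomes a join; chain a few of these across
# related concepts and the query cost grows quickly.
rows = db.execute("""
    SELECT c.name, f.field_name, f.field_value
    FROM concept c JOIN concept_field f ON f.concept_id = c.id
    WHERE c.name = 'bank'
""").fetchall()
print(rows)
```

One join is cheap; the trouble starts when every association between ideas needs another join across these generic tables.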
A better approach is to use an object-oriented data model. Looking at the diagram below:
We can see that a word or concept is described with feature vectors that need to be exposed so that associations and comparisons can be made. This approach thrives on parallelism: we can build hash sets of the feature vectors, giving an O(1) lookup advantage where hundreds of thousands of lookups can happen in parallel! That removes the need for an ANN to form codifications that structure data into relationships. The feature vectors include grammar, context, and a plethora of other concepts, which lets the ANN do a much simpler job: use the structured data and the functions that can search, compare, and process it, learning patterns of calling external functions rather than performing those operations internally. Take GPT-3, for example; it learned to add, but why learn to add when addition is a functional process a machine can already do, and do far more efficiently than an ANN? Wouldn't it be better to train an ANN to use a calculator rather than form internal logic to do arithmetic?
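As a rough illustration of the hash-set idea, here is a toy Python sketch. The Concept class, the (feature, value) pairs, and the feature names are my own invented stand-ins, not anything from the book or from Amanda:

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for an object-oriented concept record whose
# feature vector is exposed as a hashable set of (feature, value) pairs.
@dataclass
class Concept:
    name: str
    features: frozenset = field(default_factory=frozenset)

# Index: each (feature, value) pair maps to the set of concepts carrying it.
# Dict/set lookups are average-case O(1), so each check is cheap and many
# checks can run in parallel across concepts.
index: dict[tuple, set[str]] = {}

def add_concept(c: Concept) -> None:
    for pair in c.features:
        index.setdefault(pair, set()).add(c.name)

def concepts_with(*pairs: tuple) -> set[str]:
    # Intersect the hash-set buckets for each requested feature.
    sets = [index.get(p, set()) for p in pairs]
    return set.intersection(*sets) if sets else set()

add_concept(Concept("bank", frozenset({("pos", "noun"), ("topic", "finance")})))
add_concept(Concept("loan", frozenset({("pos", "noun"), ("topic", "finance")})))
add_concept(Concept("swim", frozenset({("pos", "verb"), ("topic", "water")})))

print(concepts_with(("topic", "finance"), ("pos", "noun")))  # {'bank', 'loan'}
```

No joins, no learned codification of the relationships: the structure lives in the exposed feature vectors, and association is just set intersection.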
Now look at Amanda's AGI approach:
The diagram depicts a concept of time that has varying degrees of depth, which is very similar to what human brains do. Shapiro, by contrast, relies on time-stamping data, which effectively turns events into points in time, and that is not as effective as what nature invented. Time depth gives us a sense of work effort, which biology relies on to conserve energy. Not only that, but Amanda's architecture implements parallelism and cross-platform capability, so when it needs to use a GPU it can, and that function carries descriptors that describe it like any other concept. This lets the search for a function be easily associated with the concepts it processes, again by virtue of hash-set lookups of feature vectors!
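This is not Amanda's actual code; it's just a toy sketch, assuming the same kind of (feature, value) descriptors as above, of how a function could be described and discovered like any other concept. The descriptor names and registry helpers are invented for illustration:

```python
from typing import Callable

# Toy sketch: functions registered with descriptor feature sets so they can
# be discovered by the same hash-set lookups used for any other concept.
registry: dict[frozenset, Callable] = {}

def register(descriptors: set, fn: Callable) -> None:
    registry[frozenset(descriptors)] = fn

def find(*wanted: tuple) -> list[Callable]:
    needed = set(wanted)
    return [fn for desc, fn in registry.items() if needed <= desc]

register({("operation", "add"), ("domain", "arithmetic"), ("device", "cpu")},
         lambda a, b: a + b)
register({("operation", "matmul"), ("domain", "linear_algebra"), ("device", "gpu")},
         lambda a, b: "dispatch to a GPU kernel here")  # placeholder only

adders = find(("operation", "add"))
print(adders[0](2, 3))  # 5
```

Because the GPU routine carries descriptors just like a word does, "find something that can do this" is the same cheap lookup as "find a concept with these features."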
Training an ANN to use functional capabilities such as looking up data, comparing candidate word uses, and evaluating documents removes the need for stored knowledge to be solely the ANN's responsibility, and focuses the ANN on finding patterns of efficacy using functional processes and data stored externally (external to the ANN)! With this approach there is no need to re-train or fine-tune just to memorize new data. Nor does the ANN have to learn gigabytes of data; it learns spontaneously by interacting with its environment and can grow its vocabulary and experiences, while relying on other processes to do the heavy lifting of comparing, weeding out, and assessing best-fit data.
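A hedged toy of what such a training target could look like: the network's only job is to emit a structured tool call, and external code does the arithmetic or the lookup. The policy stub, tool names, and prompts are all invented for illustration; a real system would have a trained network producing the call:

```python
# Minimal sketch of the "train the ANN to use tools, not to memorize" loop.
# The policy here is a hard-coded stub standing in for a trained network.
def policy(prompt: str) -> dict:
    # A trained ANN would produce this structured call itself.
    if "plus" in prompt:
        a, _, b = prompt.split()[-3:]
        return {"tool": "calculator", "args": (int(a), int(b.rstrip("?")))}
    return {"tool": "lookup", "args": (prompt,)}

TOOLS = {
    "calculator": lambda a, b: a + b,                 # exact arithmetic, no ANN needed
    "lookup": lambda q: f"stored facts about: {q}",   # external knowledge store
}

def run(prompt: str) -> str:
    call = policy(prompt)
    result = TOOLS[call["tool"]](*call["args"])
    return f"{call['tool']} -> {result}"

print(run("what is 17 plus 25?"))   # calculator -> 42
print(run("tell me about rivers"))  # lookup -> stored facts about: ...
```

New facts go into the external store, and new capabilities get registered as tools; neither requires touching the network's weights.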
At least, that's the theory; let's hope I'm right...