A research startup to introduce a Dynamic Transformers architecture for Large Language Models.
The research startup idea is to create a Dynamic Transformers architecture in which all weights can change over time depending on the input, so the model can learn continuously, somewhat like the human brain. This approach would "unlock" the static, frozen LLM weights and eliminate the LLM context window issue, since new information could be absorbed into the weights themselves rather than held in a limited context.
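As an illustrative sketch only (the description above does not specify an update rule), one way to make a transformer layer's weights input-dependent is to pair the usual trained weight with a "fast" weight that is updated online from the tokens the layer sees. The class name `DynamicLinear`, the Hebbian-style update, and the hyperparameters below are assumptions for illustration, not the startup's actual method.

```python
# Illustrative sketch only: a linear layer whose weights keep changing at
# inference time based on its inputs. The Hebbian-style update rule and all
# hyperparameters are assumptions, not the proposed architecture itself.
import torch
import torch.nn as nn


class DynamicLinear(nn.Module):
    """Linear layer with a slow (trained) weight and a fast (input-driven) weight."""

    def __init__(self, d_in: int, d_out: int, fast_lr: float = 0.01, decay: float = 0.95):
        super().__init__()
        self.slow = nn.Linear(d_in, d_out)                      # trained as usual
        self.register_buffer("fast", torch.zeros(d_out, d_in))  # updated at inference
        self.fast_lr = fast_lr
        self.decay = decay

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, d_in)
        y = self.slow(x) + x @ self.fast.t()
        # Online update: the fast weights decay and drift toward the outer
        # product of recent outputs and inputs, so the layer "remembers" them.
        with torch.no_grad():
            self.fast.mul_(self.decay).add_(self.fast_lr * y.detach().t() @ x.detach())
        return y


layer = DynamicLinear(d_in=16, d_out=16)
for step in range(3):
    out = layer(torch.randn(4, 16))   # each forward pass also changes layer.fast
    print(step, out.norm().item())
```

The point of the sketch is only that the effective weights differ from call to call, which is the core property the proposed architecture aims for at the scale of a full LLM.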
The current issue with all modern Large Language Models is that they are locked in a static state that does not change after initial training. Such models must be retrained to reflect any change in their weights. Existing approaches such as LoRA, which modify only a small set of low-rank adapter parameters, cannot provide a comprehensive model update: for most internal layers it is still the same frozen model, as the sketch below illustrates.
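For contrast, here is a minimal LoRA-style sketch (names, rank, and scaling are assumptions for illustration, not a specific library's API): the large base weight stays frozen and only two small low-rank matrices are trainable, which is why the bulk of the original model remains unchanged.

```python
# Illustrative LoRA-style layer: the base weight is frozen; only the two
# small low-rank matrices A and B are trainable.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        self.base.weight.requires_grad_(False)   # frozen pretrained weight
        self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)  # trainable
        self.B = nn.Parameter(torch.zeros(d_out, rank))        # trainable
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Only the low-rank product B @ A is learned; the base weight never changes.
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())


layer = LoRALinear(d_in=1024, d_out=1024, rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable} / {total}")  # only a small fraction of parameters can change
```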