Three Ways to Address High Computational Costs
While traditional in-context learning methods suffer quadratic growth in computational cost as external knowledge grows, KBLaM achieves linear growth through the following mechanisms:
- Key-value vector conversion: the knowledge base is converted into key-value vector pairs, so each knowledge vector is stored once instead of being re-encoded repeatedly.
- Rectangular attention mechanism: a modified attention layer structure that activates only the relevant vector regions during knowledge queries.
- Adapter fine-tuning: only lightweight adapters, amounting to roughly 0.1% of the original model's parameters, need to be trained.
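The linear scaling of rectangular attention can be illustrated with a minimal NumPy sketch (an illustration of the general idea, not KBLaM's actual implementation): the L prompt tokens attend over N precomputed knowledge key-value pairs plus their own L keys, so the score matrix is L × (N + L), i.e. rectangular, and cost grows linearly in N rather than quadratically.

```python
import numpy as np

def rectangular_attention(Q, kb_K, kb_V, self_K, self_V):
    """Prompt queries Q (L, d) attend over N precomputed KB key-value
    pairs plus the L self-attention keys. The score matrix is
    L x (N + L), so cost is linear in the knowledge-base size N
    (the N knowledge tokens never attend to each other)."""
    K = np.concatenate([kb_K, self_K], axis=0)   # (N+L, d)
    V = np.concatenate([kb_V, self_V], axis=0)   # (N+L, d)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])      # (L, N+L): rectangular
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                           # (L, d)

# Toy sizes: L=4 prompt tokens, N=10 knowledge entries, d=8 dims.
rng = np.random.default_rng(0)
L, N, d = 4, 10, 8
out = rectangular_attention(rng.normal(size=(L, d)),
                            rng.normal(size=(N, d)), rng.normal(size=(N, d)),
                            rng.normal(size=(L, d)), rng.normal(size=(L, d)))
print(out.shape)  # (4, 8)
```

Doubling N doubles the score matrix, whereas full self-attention over prompt plus knowledge would grow as (N + L)².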
This can be optimized in three steps: 1) precompute knowledge vectors with the generate_kb_embeddings.py script; 2) choose a lightweight embedding model such as all-MiniLM-L6-v2; 3) use incremental encoding when updating the knowledge base (see the official delta_update parameters). Experimental data show that KBLaM saves 83% of computational resources compared with traditional methods when processing 1 million knowledge entries.
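The incremental-update idea in step 3 can be sketched as follows. This is a hypothetical illustration, not the official delta_update API: entries are cached by content hash so that only changed or new knowledge gets re-embedded, and the `embed` function here is a placeholder standing in for a real embedding model such as all-MiniLM-L6-v2.

```python
import hashlib

def embed(text):
    # Placeholder embedding: a real system would call the embedding
    # model here; we derive a tiny fake vector from a hash instead.
    return [float(b) for b in hashlib.sha256(text.encode()).digest()[:4]]

def delta_update(kb, cache):
    """Ensure `cache` (id -> (content_hash, vector)) covers `kb`
    (id -> text), re-encoding only new or changed entries.
    Returns the number of entries that had to be re-embedded."""
    recomputed = 0
    for key, text in kb.items():
        h = hashlib.sha256(text.encode()).hexdigest()
        if key not in cache or cache[key][0] != h:
            cache[key] = (h, embed(text))
            recomputed += 1
    return recomputed

cache = {}
kb = {"q1": "KBLaM stores knowledge as key-value vectors."}
first = delta_update(kb, cache)   # 1: everything encoded on first pass
kb["q2"] = "Rectangular attention scales linearly."
second = delta_update(kb, cache)  # 1: only the new entry is encoded
print(first, second)
```

With a million entries, an update that touches a handful of facts then costs a handful of embedding calls rather than a full re-encoding pass.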
This answer comes from the article "KBLaM: An Open Source Enhanced Tool for Embedding External Knowledge in Large Models".