Bernard_YesZero · 2023/05/27
Can quantum computer substantially boost the LLM training? #300
#ideathon-quantumcomputing #ideathon-quantum #ideathon-LLM
LLMs, e.g. chatGPT really caused a sensation recently. But, the training cost also excludes most companies and players. Can we train LLMs using quantum computers? How about the complexity? In recent arxiv paper (2303.03428, https://arxiv.org/abs/2303.03428v2), the authors propose an efficient solution for generic (or stochastic) gradient descent algorithms, scaling as O(T^2*polylog(n)), where T is the number of iteration and n is the number of parameter, as long as the models are both sufficiently dissipative and sparse, with small learning rates. The authors also benchmark the solution using ResNet with different parameters and find it's possible to obtain quantum enhancement after model pruning. Even though, we do not know about the parameters in GPT-3.5 or GPT-4, but if it's dissipative and sparse just like ResNet, it's possible to REALLY decrease the hugh cost of training. It's really a promising application for quantum computation.