School of Computer Science and Technology and Microsoft Research Asia Held Joint Seminar Focusing on Industry Hotspots
On April 12, a joint seminar between the School of Computer Science and Technology, HUST and Microsoft Research Asia was successfully held. The event was hosted by the School of Computer Science and Technology and presided over by Professor Kun He from the Hopcroft Center for Computer Science and Associate Professor Long Hu from the Data Engineering Institute of the School. Nearly 100 teachers and students from HUST participated in the event.
Dan Feng, Dean of the School, warmly welcomed the visiting scholars at the opening ceremony of the seminar. She pointed out that the seminar would help deepen the cooperative relationship between the School of Computer Science and Technology and Microsoft Research Asia and bring the two sides into closer academic exchange. She hoped the event would foster in-depth cooperation in academic research, allow more scholars and students to benefit from academic exchanges, and spark innovative inspiration.
Fan Yang, a senior researcher at Microsoft Research Asia, introduced the background and development of "Tensor with Sparsity Attribute (TeSA) for End-to-End Deep Learning Model Sparsity". He noted that many network parameters take near-zero weights during deep learning training, indicating that deep learning models have a sparsity attribute. To improve tensor computation performance, he proposed that more than 90% of the weights in a TeSA can be pruned, yielding a significant reduction in computational overhead. Fan Yang also shared valuable experience from his research process and said that their framework generalizes well across different operating systems and has good potential for further development.

Han Hu, a senior researcher at Microsoft Research Asia, gave a presentation on "Unification-oriented Vision and Language Modeling and Learning". He analyzed the development and unification of large models vividly and thoroughly from the perspectives of both computer vision and natural language processing, drawing an analogy between related work and the unification of the four classical mechanics in physics, and introduced the framework and structure of ViT. Through Han Hu's report, the students present gained a deeper understanding of large models in computer vision.

Teng Zhang, an associate professor at the School of Computer Science and Technology, HUST, presented "Learning Methods Based on Margin Distribution Optimization". Machine learning often optimizes models by empirical risk minimization; when the number of samples is large relative to the VC dimension, the generalization bound is tight, so minimizing empirical risk is an effective approximation of minimizing the true risk.
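The sparsity property Fan Yang described, with many weights shrinking toward zero during training, can be illustrated by a minimal magnitude-pruning sketch. This is a hypothetical illustration of weight pruning in general, not the TeSA framework itself:

```python
import numpy as np

rng = np.random.default_rng(0)
# A toy weight tensor; in trained networks many entries are near zero.
weights = rng.laplace(scale=0.1, size=(64, 64))

def magnitude_prune(w, sparsity=0.9):
    """Zero out the smallest-magnitude entries so that roughly
    `sparsity` fraction of the tensor becomes zero."""
    k = int(w.size * sparsity)
    # k-th smallest absolute value serves as the pruning threshold.
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)

pruned = magnitude_prune(weights, sparsity=0.9)
print(f"fraction zero: {np.mean(pruned == 0):.2f}")  # prints "fraction zero: 0.90"
```

Real sparsity-aware systems such as the one described in the talk go further, propagating the sparsity pattern through the compiler stack to turn zeroed weights into actual speedups rather than merely storing zeros.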
Teng Zhang introduced margin distribution optimization and, through detailed mathematical derivation, arrived at the maximum margin principle and a minimum-parameter assumption, which he ingeniously linked to the regularization constraints we are familiar with; the students present listened with great interest.

Professor Yuchong Hu from the School gave a presentation on "Research on the Combination of AI and Storage", describing how to combine hardware storage optimization with AI: using AI to predict which hardware blocks will be read next and adopting a series of optimization methods to accelerate access.

Ran Shu, a principal researcher at Microsoft Research Asia, gave a presentation on "Hardware-based Storage Decoupling". He introduced the coupling present in current hardware and showed the latency gap and mismatch between software and hardware. He pointed out that the CPU currently faces long read cycles to storage hardware, and introduced related research and solutions to such problems.

Tao Ge, a senior researcher at Microsoft Research Asia, traced the development of large language models and detailed the differences among GPT-1, GPT-2, GPT-3, ChatGPT, and GPT-4. The training of large language models has gradually shifted from supervised to unsupervised training, with better performance in zero-shot, one-shot, and few-shot settings. The development of large language models holds unlimited promise and prospects.
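The link Teng Zhang drew between the maximum margin principle and familiar regularization can be made concrete with the classic soft-margin objective (the standard textbook formulation, not necessarily the exact form used in his talk):

```latex
\min_{w,\, b,\, \xi} \;\; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i\left(w^\top x_i + b\right) \ge 1 - \xi_i, \;\; \xi_i \ge 0 .
```

Maximizing the margin $2/\|w\|$ is equivalent to minimizing $\|w\|^2$, which is exactly an L2 regularization term; margin distribution methods refine this by optimizing statistics of the whole margin distribution rather than only the minimum margin.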
This seminar focused on research fields such as intelligent systems, computer storage, and artificial intelligence, promoting academic exchanges and cooperative research between Huazhong University of Science and Technology and Microsoft Research Asia, broadening the international horizons of teachers and students, and laying a good foundation for deep cooperation between the School and Microsoft Research Asia.