Hi, I’m Yuanhao, and this is my first blog post. I want to briefly introduce myself, my story with AI, and my recent experience with Z by HP Workstations, which are powered by Nvidia technology.
I live in Shanghai, China, where I work as an NLP Engineer, and over the past few years I have been an active Kaggler. I’ve participated in more than 20 Kaggle competitions and have won several gold medals. I became a Kaggle Competition Grandmaster in 2019, and my highest Kaggle rank is 36th worldwide.
I can still remember the first time I tried to seriously learn about artificial neural networks, back in 2016. I picked up Geoffrey Hinton’s famous course, Neural Networks for Machine Learning. At that time I had no access to GPUs, yet I managed to finish all the homework on a CPU alone.
In 2017, I graduated from college and finally had access to dedicated GPU resources at work: an HP Z4 workstation with an Nvidia Quadro P4000 GPU. Thanks to that hardware, I began to explore advanced models and joined Kaggle competitions. That Z4 workstation was definitely an important starting point for me.
For even better Kaggle competition performance, I built my own desktop in 2019, with an Intel i7 CPU, 32GB of RAM, and an Nvidia 2080Ti GPU. Its 11GB of VRAM was what enabled me to train most of my models at the time, such as BERT-base and BERT-large. This machine worked pretty well, except for the cooling. I chose a Micro-ATX case, which is very compact but not very good for heat dissipation. In addition, my GPU was a gaming card with a common open-air cooler, which expels hot air directly into the case; the glass side panel felt warm during training. As a result, I had to leave the case open to keep the GPU temperature below its thermal throttling limit.
My previous DIY desktop. I always kept the side panel open.
In recent years, transformer models have made great progress, and the models keep getting bigger. You may have read news articles about the GPT-3 model, which has 175 BILLION parameters. Though we will hardly use such a giant model in either production or Kaggle competitions, winning a Kaggle competition does require larger models now. In the Jigsaw Multilingual Toxic Comment Classification competition, the winners all used a model named XLM-Roberta-Large. This model can handle more than 100 languages, so it has a big vocabulary and a big embedding layer (XLM-Roberta-Large has a vocab size of 250,002, while the vocab size of Roberta is 50,265). It was difficult to train this model on my own 2080Ti GPU due to the VRAM limit.
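To get a sense of why the vocabulary size matters for VRAM, here is a rough back-of-the-envelope sketch. The vocab sizes are from the paragraph above; the hidden size of 1024 is the published value for both large models, and the 4 bytes per parameter assume fp32 weights:

```python
# Rough fp32 memory estimate for the token-embedding matrix alone:
# parameters = vocab_size * hidden_size, bytes = parameters * 4 (fp32)

def embedding_mb(vocab_size: int, hidden_size: int) -> float:
    """Approximate fp32 size of an embedding layer in megabytes."""
    return vocab_size * hidden_size * 4 / 1024**2

# Roberta-large: vocab 50,265, hidden size 1024
roberta = embedding_mb(50_265, 1024)
# XLM-Roberta-large: vocab 250,002, same hidden size 1024
xlmr = embedding_mb(250_002, 1024)

print(f"Roberta-large embeddings:     ~{roberta:.0f} MB")  # roughly 196 MB
print(f"XLM-Roberta-large embeddings: ~{xlmr:.0f} MB")     # roughly 977 MB
```

Of course, training also needs memory for gradients, optimizer states, and activations, so the real footprint is a multiple of the raw parameter size, but the five-fold difference in the embedding layer alone shows why an 11GB card quickly becomes cramped.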
In industry, people use much more powerful machines to train models. For example, my previous company had several DGX-1 servers, each with eight Nvidia V100 GPUs with 32GB of VRAM. Almost any model can be loaded into that much VRAM; however, the price of a V100 GPU is far beyond what an average person can afford.
VRAM will no longer limit me. I became a Z by HP and Nvidia Data Science Ambassador in 2020, and have been using a brand new Z4 Workstation and ZBook Studio provided by the team for a while. It is really amazing to me that the Z Series Workstation came back into my life!
The Workstation, with an i9 10980XE, 128GB of RAM, two RTX 6000 GPUs with 24GB of VRAM each, a 2TB SSD, and a 6TB HDD, is very powerful and well balanced. I think the RTX 6000 GPU is a wise choice for small to medium-sized AI applications. The 24GB of VRAM is very helpful when you want to use big models such as the XLM-Roberta mentioned above, and its price is much more reasonable than the V100’s.
What’s more, I have to say this machine is well built! When you open the case (you can easily remove the side panel with a latch), you can see how compact and beautiful this machine is. The cooling system is very effective: temperature and noise are both well controlled even when both GPUs are running at full load.
This is the HP Z4 workstation. It is very compact, and the cooling system is very effective.
As for the ZBook Studio, it offers me a very different development experience. It often bothered me when I needed to implement or debug a deep learning model with no convenient GPU access. This situation is very common in companies, where the GPU resources live in the cloud and employees are only equipped with a laptop without a GPU. Thanks to the RTX 5000 GPU, I can easily design and debug an NN model on the ZBook Studio, no matter where I am. Believe it or not, this laptop has 16GB of VRAM, even more than my previous 2080Ti desktop!
In addition to the hardware, the preloaded software also surprised me. HP has done a great job helping users make the best use of their products. It took me a lot of time to build up a development environment on my previous DIY machine. Now, the preloaded Data Science Software Stack includes almost everything you could need, such as the GPU driver, CUDA, developer tools, and popular libraries. You can start working on the machine as soon as you receive it. HP also offers a number of useful tools to enhance the productivity of the machine, some of which are exclusive to the Z series.
It has been a while since my last Kaggle competition. This year, with the help of the powerful Z by HP Workstation and ZBook Studio, I plan to turn more of my attention back to this vibrant community. I will also publish more content about Kaggle, NLP, and AI. I have started to work on the RANZCR CLiP - Catheter and Line Position Challenge, and the training speed on the Workstation is very satisfactory. I look forward to keeping you updated on my progress.