NVIDIA has just announced its latest Tesla graphics card based on the Turing GPU. At the GTC Japan 2018 keynote, NVIDIA CEO Jensen Huang unveiled the first Tesla card built on the brand new Turing architecture: the Tesla T4.
The card is designed to accelerate deep learning inference by an order of magnitude and to deliver breakthrough performance for AI video applications over its predecessors. The previous generation could not decode up to 38 full-HD video streams in video processing, but with the new Tesla T4 it is now possible.
NVIDIA’s New Tesla T4 Is Announced to Deliver Multi-TFLOPs of Performance at 75W.
The specifications include a single-slot PCIe form factor and a Turing TU104 GPU with 2560 CUDA cores and 320 Tensor Cores. The card delivers 8.1 TFLOPs of FP32 performance, 65 TFLOPs of FP16 mixed-precision, 130 TOPs of INT8 and 260 TOPs of INT4 performance at just 75W. Multi-TFLOPs of performance at 75W means the card doesn’t require any external power connector; it draws all of its power from the PCIe slot, and the small form factor design allows it to fit in 1U or 4U chassis for large-scale server deployments.
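The jump from FP32 down to INT4 is the heart of the multi-precision story. As a rough illustration of what INT8 inference means in practice, here is a minimal sketch of symmetric per-tensor INT8 quantization in NumPy. This is a generic textbook scheme, not the T4's or TensorRT's actual calibration algorithm; the function names are our own.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor INT8 quantization: map floats onto [-127, 127].

    Illustrative only; real inference stacks use calibrated scales per layer.
    """
    scale = np.abs(x).max() / 127.0          # one scale for the whole tensor
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from the INT8 values."""
    return q.astype(np.float32) * scale

x = np.array([0.5, -1.0, 0.25, 0.9], dtype=np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
# x_hat is close to x; the rounding error is bounded by half the scale step
```

Storing and multiplying 8-bit integers instead of 32-bit floats is what lets hardware like the Turing Tensor Cores trade a small, bounded accuracy loss for much higher throughput.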
The NVIDIA Tesla T4 GPU is the company's most advanced inference accelerator, powered by NVIDIA Turing Tensor Cores. The T4 brings revolutionary multi-precision inference performance to accelerate the diverse applications of modern AI. The package is an energy-efficient 75W, small PCIe form factor, optimized for scale-out servers and announced to deliver inference in real time.
The continuous growth of online video demands solutions that can efficiently search and gain insights from video. The Tesla T4 delivers breakthrough performance for AI video applications, with dedicated transcoding engines that provide double the decoding performance of the previous generation. The T4 can decode up to 38 full-HD video streams, which makes it easy to integrate scalable deep learning into video pipelines and deliver innovative, smart video services.
The card packs 16GB of GDDR6 memory, delivering more than 320 GB/s of bandwidth. The NVIDIA TensorRT Hyperscale Platform is optimized for powerful, highly efficient inference and includes a comprehensive set of hardware and software offerings.
Key Elements of the T4:
- The card features 320 Turing Tensor Cores and 2560 CUDA cores, delivering breakthrough performance with flexible, multi-precision capabilities from FP32 to FP16 to INT8, as well as INT4.
- The package is energy-efficient and offers 65 teraflops of peak performance for FP16.
- NVIDIA TensorRT 5 – an inference optimizer and runtime engine that supports Turing Tensor Cores and expands the set of neural network optimizations. The package also includes the NVIDIA TensorRT Inference Server, microservice software that enables applications to use AI models in data center production.
- Availability on NVIDIA GPU Cloud, support for all popular AI models and frameworks, and Kubernetes and Docker integration.