What does your typical work day as a (software) developer look like?
You probably spend some time writing actual code, a lot of time integrating it with existing systems, and finally weeks delivering it to production.
In this short post I want to emphasize the importance of unified tooling.
By “tooling” I mean everything — from debuggers to monitoring systems.
By “unified” I mean that the tooling stack is explicitly defined, i.e. set as a “standard” across the company.
The working process without “unified tooling” is deplorable. At best you’ll have a long feedback loop and won’t be able to scale development of your product; at worst your product quality will suffer. In my professional experience, if you are trying to build a scalable product, you definitely need unified tooling. …
In this blog post I want to share a quick, one-command way of recompiling the Linux kernel with the PREEMPT_RT patch for the NVIDIA Jetson AGX Xavier.
Supported version: JetPack 4.2.1 (L4T 32.2.1), kernel 4.9.140
Supported hardware: Jetson AGX Xavier
Contributions for new platforms (Nano, TX2, etc.) and new versions are welcome here.
To build the RT kernel for Xavier, execute the following commands on your x86_64 laptop:
git clone https://github.com/r7vme/xavier-base-docker-images
docker build -t xavier-rt-kernel:32.2.1 -f Dockerfile.l4t_32_2_1 …
In part 1 I described how to convert a neural network with supported layers to a TensorRT plan. In this part I’ll describe how to create a custom layer for TensorRT. The example will be the “l2norm_helper” plugin that I created to support TensorFlow’s l2_normalize operation.
Source code: https://github.com/r7vme/tensorrt_l2norm_helper
A TensorRT plugin requires two major parts to be implemented: the plugin class itself (IPluginV2, which holds the layer logic) and a creator that TensorRT uses to instantiate and deserialize it.
This is a reasonable question. First, let’s check what l2_normalize does in TensorFlow.
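As a refresher, tf.nn.l2_normalize divides each element by the L2 norm along the given axis, clamping the squared sum by an epsilon for numerical stability. Here is a minimal NumPy sketch of the same computation (not TensorFlow code; the formula and the 1e-12 epsilon default follow the tf.nn.l2_normalize documentation, and the sample vector is made up):

```python
import numpy as np

def l2_normalize(x, axis=None, epsilon=1e-12):
    """Mimic tf.nn.l2_normalize: x / sqrt(max(sum(x**2, axis), epsilon))."""
    square_sum = np.sum(np.square(x), axis=axis, keepdims=True)
    norm = np.sqrt(np.maximum(square_sum, epsilon))
    return x / norm

v = np.array([3.0, 4.0])
print(l2_normalize(v))  # → [0.6 0.8]
```

The epsilon clamp is what makes the op safe on all-zero inputs, and it is also one of the details a custom TensorRT layer has to reproduce exactly.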
TensorRT is a framework from NVIDIA that can significantly speed up inference of a neural network. TensorRT does this by fusing multiple layers together and selecting optimized CUDA kernels. In addition, lower-precision data types can be used (e.g. float16 or int8). In the end, a 4x–5x performance boost can be achieved, which is critical for real-time applications.
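To illustrate the lower-precision trade-off, here is a toy NumPy demo (not TensorRT code; TensorRT applies the same idea to weights and activations when building an engine): float16 keeps only about three significant decimal digits, so small differences simply vanish.

```python
import numpy as np

# float16 has ~10 bits of mantissa, so its resolution near 1.0 is ~1e-3.
a = np.float16(1.0001)  # the 1e-4 difference is below float16 resolution
b = np.float16(1.0)
print(a == b)           # the two values collapse to the same float16
```

This rounding is usually harmless for neural-network weights, which is why float16 inference tends to match float32 accuracy closely while running much faster on hardware with native half-precision support.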
TensorRT usually comes together with JetPack (NVIDIA’s software for the Jetson series of embedded devices). For non-JetPack installations, check out the TensorRT installation guide.
DISCLAIMER: This post describes specific pitfalls and an advanced example with a custom layer. For “out-of-the-box” TensorRT examples, please check out the tf_to_trt_image_classification repo. …
This post describes the mathematics behind TensorFlow’s tutorial example “Partial differential equations”.
Please find a Jupyter notebook with the blog post under the link below.
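For reference, the TensorFlow tutorial simulates raindrops on a pond via a damped wave equation, which boils down to repeatedly applying a finite-difference Laplacian. A minimal NumPy sketch of that scheme (the five-point stencil is standard; the grid size, step size and damping constant here are arbitrary, not the tutorial’s exact values):

```python
import numpy as np

def laplace(u):
    """Five-point finite-difference Laplacian with periodic boundaries."""
    return (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
            np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4.0 * u)

def step(u, ut, eps=0.03, damping=0.04):
    """One Euler step of the damped wave equation u_tt = lap(u) - damping * u_t."""
    ut = ut + eps * (laplace(u) - damping * ut)
    u = u + eps * ut
    return u, ut

# A single "raindrop" in the middle of a small pond.
u = np.zeros((50, 50))
ut = np.zeros((50, 50))
u[25, 25] = 10.0
for _ in range(100):
    u, ut = step(u, ut)
```

In the tutorial the same Laplacian is expressed as a 2D convolution so that the whole update runs as TensorFlow ops on the GPU.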