A project template for testing C code that makes system calls (see it on GitHub). It uses FFF for mocking and Catch2 as the testing framework. The solution is based on linker…
If you have ever written a low-level library that uses system calls, you know that such code is not easy to test. In this short post I want to share a "turn-key" template, built on the FFF and Catch2 testing frameworks, that makes testing such code easy.
Let's say we have a function that checks whether we are running on a multi-core or single-core system.
What does your typical work day as a (software) developer look like?
You probably spend some time writing actual code, a lot of time integrating it with existing systems, and finally weeks delivering it to production.
In this short post I want to emphasize the importance of unified tooling.
By “tooling” I mean everything — from debuggers to monitoring systems.
By "unified" I mean that the tooling stack is explicitly defined, i.e. adopted as a "standard" across the company.
The working process without “unified tooling” is deplorable. At best you’ll have a long feedback loop and will not be…
In this blog post I want to share a quick, one-command way of recompiling the Linux kernel with the PREEMPT_RT patch for the NVIDIA Jetson AGX Xavier.
Supported version: Jetpack 4.2.1 (L4T 32.2.1), kernel 4.9.140
Supported hardware: Jetson AGX Xavier
New platforms (Nano, TX2, etc.) and new versions are welcome here.
To build the RT kernel for Xavier, execute the following commands on your x86_64 laptop:
git clone https://github.com/r7vme/xavier-base-docker-images
docker build -t xavier-rt-kernel:32.2.1 -f Dockerfile.l4t_32_2_1…
In part 1 I described how to convert a neural network with supported layers to a TensorRT plan. In this part I'll describe how to create a custom layer for TensorRT. The example will be the "l2norm_helper" plugin that I created to support TensorFlow's l2_normalize operation.
Source code: https://github.com/r7vme/tensorrt_l2norm_helper
A TensorRT plugin requires two major parts to be implemented: …
This is a reasonable question. First, let's check what l2_normalize is in TensorFlow.
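For reference, TensorFlow's documentation defines `tf.math.l2_normalize` along a given axis as division by the L2 norm, with a small epsilon (1e-12 by default) guarding against division by zero:

```latex
\mathrm{l2\_normalize}(x)_i = \frac{x_i}{\sqrt{\max\!\left(\sum_j x_j^2,\; \epsilon\right)}}
```

This is the operation the "l2norm_helper" plugin has to reproduce inside TensorRT.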
TensorRT is a framework from NVIDIA that significantly speeds up neural-network inference. It does this by fusing multiple layers together and selecting optimized CUDA kernels. In addition, lower-precision data types can be used (e.g. float16 or int8). In the end, up to a 4x-5x performance boost can be achieved, which is critical for real-time applications.
TensorRT usually comes bundled with JetPack (NVIDIA's software stack for the Jetson series of embedded devices). For non-JetPack installations, check out the TensorRT installation guide.
DISCLAIMER: This post describes specific pitfalls and advanced example with custom layer. For “out-of-the-box” TensorRT examples please check out tf_to_trt_image_classification repo. …
This post describes the mathematics behind TensorFlow's tutorial example "Partial differential equations".
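Assuming the tutorial referenced here is TensorFlow's classic "raindrops on a pond" example, the underlying model is a damped wave equation for the water-surface height u:

```latex
\frac{\partial^2 u}{\partial t^2} = c^2 \,\nabla^2 u - d\,\frac{\partial u}{\partial t}
```

which the tutorial integrates with an explicit Euler step, keeping the height u and its velocity v = ∂u/∂t as separate state:

```latex
u_{n+1} = u_n + \Delta t \, v_n, \qquad
v_{n+1} = v_n + \Delta t \left( c^2 \,\nabla^2 u_n - d\, v_n \right)
```

Here c (wave speed), d (damping), and the discrete Laplacian kernel are parameters of the simulation; this sketch paraphrases the tutorial rather than quoting it.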
Please find a Jupyter notebook accompanying this blog post at the link below.
Systems Software Engineer