The requests library is arguably the most widely used HTTP library for Python. However, I believe most of its users are not aware that its current stable version happily accepts responses whose body is shorter than the length given in the Content-Length header. If you do not check this yourself, you may end up using corrupted data without even noticing. I have witnessed this first-hand, which is the reason for the present blog post. Let’s see why the current requests version does not perform this check (spoiler: it is a feature, not a bug) and how to perform it manually in your scripts.
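A manual check along these lines can be sketched as follows. This is a minimal sketch under stated assumptions, not necessarily the code from the post: the names ensure_complete and IncompleteBodyError are hypothetical, and the function assumes a response to a GET-style request whose body has already been read, so that response.raw.tell() (the byte counter of the underlying urllib3 response) reports how many bytes actually arrived.

```python
class IncompleteBodyError(Exception):
    """Raised when fewer bytes arrived than Content-Length announced."""


def ensure_complete(response):
    """Return `response` if its body is at least as long as Content-Length.

    `response` only needs to look like a requests.Response whose body has
    been consumed: it must provide `.headers` and `.raw.tell()`.
    """
    expected = response.headers.get("Content-Length")
    if expected is None:
        # No declared length (e.g. chunked transfer): nothing to compare.
        return response

    # raw.tell() counts bytes read off the wire, i.e. before any gzip or
    # deflate decoding, which is the quantity Content-Length describes.
    received = response.raw.tell()
    if received < int(expected):
        raise IncompleteBodyError(
            f"expected {expected} bytes, received only {received}"
        )
    return response
```

With requests installed, a call would look like ensure_complete(requests.get(url)); the function is duck-typed so that it can be exercised without a network connection.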
Category Archives: Programming
Implementing multiprocessing.pool.ThreadPool from Python in Rust
In this post, we will implement multiprocessing.pool.ThreadPool from Python in Rust. It represents a thread-oriented version of multiprocessing.Pool, which offers a convenient means of parallelizing the execution of a function across multiple input values by distributing the input data across processes. We will use an existing thread-pool implementation and focus on adjusting its interface to match that of multiprocessing.pool.ThreadPool.
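For reference, the Python interface being mirrored can be exercised roughly as below. This is a small illustrative sketch on the Python side only; the helper names square and parallel_squares are hypothetical, and the Rust port itself is not shown here.

```python
from multiprocessing.pool import ThreadPool


def square(x):
    """The function to apply to every input value."""
    return x * x


def parallel_squares(values, workers=4):
    """Apply `square` to each value using a pool of worker threads."""
    # The context manager terminates the pool when the block exits.
    with ThreadPool(processes=workers) as pool:
        # map() blocks until all results are ready and preserves input order.
        return pool.map(square, values)
```

Because map() returns results in input order, a Rust counterpart with the same interface would be expected to preserve ordering as well.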
When auto Seemingly Deduces a Reference in C++
One of the first things that C++ programmers learn about auto is that bare auto never deduces a reference. In the present post, I show you two examples where it seemingly does. The first one involves proxy objects. The second one concerns structured bindings from C++17.
Ensuring That a Linux Program Is Running at Most Once by Using Abstract Sockets
It is often useful to have a way of ensuring that a program is running at most once (e.g. a system daemon or Cron job). Unfortunately, most commonly used solutions are not without problems. In this post, I show a simple, reliable, Linux-only solution that utilizes Unix domain sockets and the abstract socket namespace. The post includes a sample implementation in the Rust programming language.
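The core idea can be sketched in a few lines. Note that this sketch is in Python rather than the Rust used in the post, that the function name acquire_single_instance_lock is hypothetical, and that it only works on Linux, where a Unix socket address starting with a null byte lives in the abstract namespace.

```python
import errno
import socket


def acquire_single_instance_lock(name):
    """Try to become the only running instance identified by `name`.

    Returns the bound socket on success (keep it open for the lifetime of
    the program) or None if another process already holds the name.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # A leading null byte selects the Linux abstract socket namespace,
        # so no file is created and nothing needs to be cleaned up.
        sock.bind("\0" + name)
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            return None  # another instance is already running
        raise
    return sock
```

The kernel releases the name automatically when the socket is closed or the process exits, which avoids the stale-lock problem of PID files.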
Consuming and Publishing Celery Tasks in C++ via AMQP
Celery is an asynchronous task queue based on distributed message passing. It is written in Python, but the protocol can be implemented in any language. However, there is currently no C++ client that is able to publish (send) and consume (receive) tasks. This is needed when your project is written in a combination of Python and C++, and you would like to process tasks in both of these languages. In the present post, I describe a way of interoperating between Python and C++ workers via the AMQP back-end (RabbitMQ).
Pros and Cons of Alternative Function Syntax in C++
C++11 introduced an alternative syntax for writing function declarations. Instead of putting the return type before the name of the function (e.g. int func()), the new syntax allows us to write it after the parameters (e.g. auto func() -> int). This leads to a couple of questions: Why was such an alternative syntax added? Is it meant to be a replacement for the original syntax? To help you with these questions, the present blog post tries to summarize the advantages and disadvantages of this newly added syntax.
Implementing DBSCAN from Distance Matrix in Rust
We will implement the DBSCAN clustering algorithm in Rust. As its input, the algorithm will take a distance matrix rather than a set of points or feature vectors. This will make the implemented algorithm useful in situations when the dataset is not formed by points or when features cannot be easily extracted. The complete source code, including comments and tests, is available on GitHub.
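To illustrate what taking a distance matrix as input means, here is a compact sketch of the algorithm. It is written in Python rather than Rust and is not the implementation from the post or its GitHub repository; the names dbscan, eps, min_pts, and NOISE are assumptions chosen for illustration.

```python
from collections import deque

NOISE = -1


def dbscan(dist, eps, min_pts):
    """Cluster items given a symmetric distance matrix (list of lists).

    Returns one label per item: 0, 1, ... for clusters, NOISE for noise.
    """
    n = len(dist)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # Includes i itself, because dist[i][i] == 0 <= eps.
        return [j for j in range(n) if dist[i][j] <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border item
            continue
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border item: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core item: keep growing
        cluster += 1
    return labels
```

Only pairwise distances are consulted, so the items themselves never need coordinates or feature vectors.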
Computing Context Triggered Piecewise Hashes in Rust
This post introduces my Rust wrapper for ssdeep by Jesse Kornblum, which is a C library and program for computing context triggered piecewise hashes (CTPH). Also called fuzzy hashes, CTPH can match inputs that have homologies. Such inputs have sequences of identical bytes in the same order, although bytes in between these sequences may be different in both content and length. In contrast to standard hashing algorithms, CTPH can be used to identify files that are highly similar but not identical.
Universal vs Forwarding References in C++
When talking about T&& in C++, you may have heard about universal references and forwarding references. This may make you wonder: Why are there two names for what is apparently the same concept? Is there any difference between them? Which one should I use? Let’s find out.
Auto Type Deduction in Range-Based For Loops
Have you ever wondered which of the following variants you should use in range-based for loops and when? auto, const auto, auto&, const auto&, auto&&, const auto&&, or decltype(auto)? This post tries to present rules of thumb that you can use in day-to-day coding. As you will see, only four of these variants are generally useful.