cpp-bencoding: A New C++ Bencoding Library

Today, I finished implementing a new open-source bencoding library. It is named cpp-bencoding and is written in C++11. In this post, I would like to briefly introduce it.

Bencoding

Bencoding is an encoding used mainly by BitTorrent, a peer-to-peer file sharing system. BitTorrent uses bencoding to store and transmit loosely structured data, such as .torrent files.

The following four data types are supported: strings, integers, lists, and dictionaries. For example, the integer 10 is encoded as i10e, the string “hello” is encoded as 5:hello, and a list containing two integers 1 and 2 is encoded as li1ei2ee. Binary strings, i.e. strings containing non-printable characters, are also supported.
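To make the format concrete, here is a minimal sketch of how the three examples above could be produced by hand. The helper names (encodeInteger, encodeString, encodeList) are hypothetical and exist only for this illustration; they are not part of cpp-bencoding.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helpers for illustration only; not the cpp-bencoding API.

// Integers are written as i<decimal digits>e.
std::string encodeInteger(std::int64_t value) {
    return "i" + std::to_string(value) + "e";
}

// Strings are written as <length in bytes>:<raw bytes>. Because the length
// comes first, the bytes may be arbitrary, including non-printable ones.
std::string encodeString(const std::string &value) {
    return std::to_string(value.size()) + ":" + value;
}

// Lists are written as l<already encoded items>e.
std::string encodeList(const std::vector<std::string> &encodedItems) {
    std::string result = "l";
    for (const auto &item : encodedItems) {
        result += item;
    }
    return result + "e";
}
```

With these helpers, encodeInteger(10) gives i10e, encodeString("hello") gives 5:hello, and encodeList({encodeInteger(1), encodeInteger(2)}) gives li1ei2ee. Dictionaries follow the same idea as lists: a leading d, then alternating keys and values, then a closing e.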

To give a more complex example, a very basic .torrent file may look like this:

d8:announce18:http://tracker.com10:created by14:KTorrent 2.1.413:creation datei1182163277e4:infod6:lengthi6e4:name8:file.txt12:piece lengthi32768e6:pieces12:binary dataee

When decoded, the content of such a file looks like this (the output is as produced by a tool in the developed library):

{
    "announce": "http://tracker.com",
    "created by": "KTorrent 2.1.4",
    "creation date": 1182163277,
    "info": {
        "length": 6,
        "name": "file.txt",
        "piece length": 32768,
        "pieces": "binary data"
    }
}
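To show how such input can be read back, the following is a rough sketch of a recursive-descent decoder that turns bencoded text into a compact, JSON-like string. It is not the decoder from cpp-bencoding: the SketchDecoder class is hypothetical, it prints everything on one line instead of the indented form above, and it skips many checks a real decoder needs (for example, that integers are well-formed and that dictionary keys are strings).

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical sketch; not the cpp-bencoding decoder.
class SketchDecoder {
public:
    explicit SketchDecoder(const std::string &input): in(input) {}

    // Decodes exactly one item and requires that all input is consumed.
    std::string decode() {
        std::string result = decodeItem();
        if (pos != in.size()) throw std::runtime_error("trailing data");
        return result;
    }

private:
    char peek() const {
        if (pos >= in.size()) throw std::runtime_error("unexpected end");
        return in[pos];
    }

    // Returns the text up to (not including) delim and skips delim.
    std::string readUntil(char delim) {
        std::size_t end = in.find(delim, pos);
        if (end == std::string::npos) throw std::runtime_error("no delimiter");
        std::string text = in.substr(pos, end - pos);
        pos = end + 1;
        return text;
    }

    // The first character of an item determines its type.
    std::string decodeItem() {
        char c = peek();
        if (c == 'i') { ++pos; return readUntil('e'); }  // i<number>e
        if (c == 'l') { ++pos; return decodeSequence('[', ']', false); }
        if (c == 'd') { ++pos; return decodeSequence('{', '}', true); }
        if (std::isdigit(static_cast<unsigned char>(c))) return decodeString();
        throw std::runtime_error("unexpected character");
    }

    // <length>:<bytes>
    std::string decodeString() {
        std::size_t length = std::stoul(readUntil(':'));
        if (length > in.size() - pos) throw std::runtime_error("short string");
        std::string text = in.substr(pos, length);
        pos += length;
        return "\"" + text + "\"";
    }

    // Lists and dictionaries are items up to a closing 'e'. In a
    // dictionary, the items alternate between key and value.
    std::string decodeSequence(char open, char close, bool isDict) {
        std::string result(1, open);
        for (std::size_t i = 0; peek() != 'e'; ++i) {
            if (i > 0) result += (isDict && i % 2 == 1) ? ": " : ", ";
            result += decodeItem();
        }
        ++pos;  // Skip the closing 'e'.
        return result + close;
    }

    std::string in;
    std::size_t pos = 0;
};
```

For example, SketchDecoder("d6:lengthi6e4:name8:file.txte").decode() returns {"length": 6, "name": "file.txt"}.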

If you are curious, the specification can be found here.

cpp-bencoding

cpp-bencoding is a C++11 bencoding library supporting both decoding and encoding. It provides a simple and extensible API for decoding, encoding, and pretty-printing of bencoded data.

#include "bencoding/bencoding.h"

// Decode data stored in a std::string.
auto decodedData = bencoding::decode(str);

// Alternatively, decode data directly from a stream.
auto decodedStreamData = bencoding::decode(stream);

// Encode the data into a std::string.
std::string encodedData = bencoding::encode(decodedData);

// Get a pretty representation of the decoded data.
std::string prettyRepr = bencoding::getPrettyRepr(decodedData);

Requirements

The following software is required:

  • A compiler supporting C++11, such as GCC 4.9.
  • CMake to build and install the library.

Optional:

  • Doxygen to generate API documentation.
  • Google Test to build and run tests.
  • LCOV to generate code coverage statistics.

Build, Installation, and Usage

Please see the project page on GitHub.

API Documentation

The API documentation of the library can be generated during the library build and viewed in your favorite web browser. Doxygen is used to write and generate the documentation.

Tests

The library is thoroughly tested. In fact, over 99% of the library source code is covered by unit tests written using the Google Test framework.

Extending the Library

The library is extensible, so you can write your own code for manipulating the decoded data. In more detail, the BItemVisitor class implements the Visitor design pattern. By using it as a base class, you can create your own class that manipulates the bencoded data in any way you want. Two examples of using the BItemVisitor class are the Encoder and PrettyPrinter classes. Let's take a closer look at the first one.

As already mentioned, Encoder inherits from BItemVisitor and implements all of its visitation member functions:

class Encoder: private BItemVisitor {
	// Public interface.
	// ...

private:
	virtual void visit(BDictionary *bDictionary) override;
	virtual void visit(BInteger *bInteger) override;
	virtual void visit(BList *bList) override;
	virtual void visit(BString *bString) override;
};

where the BDictionary, BInteger, BList, and BString classes form a representation of the four supported data types. All of them inherit from the base class BItem.

The encoding of data is simply implemented as follows:

std::string Encoder::encode(std::shared_ptr<BItem> data) {
	data->accept(this);
	return encodedData;
}

where encodedData is a member variable of type std::string, and accept() simply calls the appropriate visit() member function on the encoder (a typical Visitor design pattern implementation). The visitation member functions are implemented as follows:

void Encoder::visit(BDictionary *bDictionary) {
	encodedData += "d";
	for (const auto &item : *bDictionary) {
		item.first->accept(this);
		item.second->accept(this);
	}
	encodedData += "e";
}

void Encoder::visit(BInteger *bInteger) {
	std::string encodedInteger("i" + std::to_string(bInteger->value()) + "e");
	encodedData += encodedInteger;
}

void Encoder::visit(BList *bList) {
	encodedData += "l";
	for (const auto &bItem : *bList) {
		bItem->accept(this);
	}
	encodedData += "e";
}

void Encoder::visit(BString *bString) {
	std::string encodedString(
		std::to_string(bString->length()) + ":" + bString->value()
	);
	encodedData += encodedString;
}

So, if you want to process the decoded data in a custom way, simply subclass BItemVisitor and use an analogous approach to implement the desired functionality.
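As a rough illustration of that approach, here is a self-contained sketch of a custom visitor that counts the items in a decoded structure. To keep it compilable on its own, it uses simplified stand-ins placed in a hypothetical sketch namespace rather than the real library classes: they omit BDictionary and expose far less than the actual BItem hierarchy. The ItemCounter class is likewise made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

namespace sketch {

// Simplified stand-ins for the library classes (assumptions, not the real
// cpp-bencoding declarations). BDictionary is omitted for brevity.
class BInteger;
class BList;
class BString;

class BItemVisitor {
public:
    virtual ~BItemVisitor() = default;
    virtual void visit(BInteger *bInteger) = 0;
    virtual void visit(BList *bList) = 0;
    virtual void visit(BString *bString) = 0;
};

class BItem {
public:
    virtual ~BItem() = default;
    // Each concrete item calls the visit() overload for its own type.
    virtual void accept(BItemVisitor *visitor) = 0;
};

class BInteger: public BItem {
public:
    explicit BInteger(std::int64_t value): v(value) {}
    std::int64_t value() const { return v; }
    void accept(BItemVisitor *visitor) override { visitor->visit(this); }
private:
    std::int64_t v;
};

class BString: public BItem {
public:
    explicit BString(const std::string &value): v(value) {}
    const std::string &value() const { return v; }
    void accept(BItemVisitor *visitor) override { visitor->visit(this); }
private:
    std::string v;
};

class BList: public BItem {
public:
    std::vector<std::shared_ptr<BItem>> items;
    void accept(BItemVisitor *visitor) override { visitor->visit(this); }
};

// A custom visitor: counts integers, strings, and lists, descending into
// nested lists. Like Encoder, it inherits privately from the visitor base.
class ItemCounter: private BItemVisitor {
public:
    std::size_t integers = 0;
    std::size_t strings = 0;
    std::size_t lists = 0;

    void count(BItem &root) { root.accept(this); }

private:
    void visit(BInteger *) override { ++integers; }
    void visit(BString *) override { ++strings; }
    void visit(BList *bList) override {
        ++lists;
        for (const auto &item : bList->items) {
            item->accept(this);
        }
    }
};

} // namespace sketch
```

To use it, build a tree of items, create an ItemCounter, and call count() on the root; the totals are then available in the integers, strings, and lists members. With the real library, the same structure would apply, with one more visit() overload for BDictionary.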

License

The library is distributed under the BSD 3-clause license. See the LICENSE file for more details.

2 Comments

  1. Hi Petr,

    Have you had any experience with (c)making this as a Visual Studio solution? I did so after modifying the CMakeLists.txt slightly, and I successfully built the .lib file (src\Release\bencoding.lib)

    However when I try to link this .lib file to my other Visual Studio project it doesn’t seem to work

    error LNK2019: unresolved external symbol “class std::unique_ptr<class bencoding::BItem,struct std::default_delete<class bencoding::BItem> > …”

    Thanks in advance!

    • Hi Joakim,

      no, I have not tried that. What changes did you make in CMakeLists.txt to make it compilable under VS? Maybe I can incorporate the changes into the project to help other people who use VS.

      With regards to the error, can you please post the complete error message? What version of VS do you use? How do you use the library in your project? Can you post some code?
