Today, I have finished the implementation of a new open-source bencoding library. It is named cpp-bencoding and written in C++11. In the present blog post, I would like to briefly introduce it.
Bencoding
Bencoding is a type of encoding that is mostly used by BitTorrent, a peer-to-peer file sharing system. It uses bencoding to store and transmit loosely structured data, such as .torrent files.
The following four data types are supported: strings, integers, lists, and dictionaries. For example, the integer 10 is encoded as i10e, the string “hello” is encoded as 5:hello, and a list containing two integers 1 and 2 is encoded as li1ei2ee. Binary strings, i.e. strings containing non-printable characters, are also supported.
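The encoding rules above are small enough to sketch in a few lines of C++. The helper names below (encodeInteger, encodeString, encodeList, encodeDictionary) are hypothetical and written only for illustration; they are not part of cpp-bencoding's API. The dictionary helper assumes that keys are strings written in sorted order, which is what the BitTorrent specification requires.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helpers for illustration only; not part of cpp-bencoding.

// Integers are written as i<decimal digits>e.
std::string encodeInteger(long long value) {
    return "i" + std::to_string(value) + "e";
}

// Strings are written as <length>:<bytes>. The length counts bytes, so
// binary data needs no escaping.
std::string encodeString(const std::string &value) {
    return std::to_string(value.size()) + ":" + value;
}

// Lists are written as l<items>e, where every item is already bencoded.
std::string encodeList(const std::vector<std::string> &encodedItems) {
    std::string result = "l";
    for (const auto &item : encodedItems) {
        result += item;
    }
    return result + "e";
}

// Dictionaries are written as d<key><value>...e. The keys are plain strings
// here and the values are already bencoded; std::map iterates in sorted
// key order.
std::string encodeDictionary(
        const std::map<std::string, std::string> &encodedValues) {
    std::string result = "d";
    for (const auto &entry : encodedValues) {
        result += encodeString(entry.first) + entry.second;
    }
    return result + "e";
}
```

For instance, encodeList({encodeInteger(1), encodeInteger(2)}) yields li1ei2ee, matching the list example above.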
To give a more complex example, a very basic .torrent file may look like this:
d8:announce18:http://tracker.com10:created by14:KTorrent 2.1.413:creation datei1182163277e4:infod6:lengthi6e4:name8:file.txt12:piece lengthi32768e6:pieces11:binary dataee
When decoded, the content of such a file looks like this (the output is as produced by a tool in the developed library):
    {
        "announce": "http://tracker.com",
        "created by": "KTorrent 2.1.4",
        "creation date": 1182163277,
        "info": {
            "length": 6,
            "name": "file.txt",
            "piece length": 32768,
            "pieces": "binary data"
        }
    }
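To make the nesting more tangible, here is a minimal sketch of how bencoded input can be walked recursively. It only checks the structure and does not build any decoded values. The functions skipValue and isWellFormed are hypothetical names invented for this example, not part of cpp-bencoding, and the check is deliberately loose: it does not validate the digits of integers or the ordering of dictionary keys.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of cpp-bencoding: returns the index just
// past the bencoded value that starts at pos, or throws on malformed input.
std::size_t skipValue(const std::string &data, std::size_t pos) {
    if (pos >= data.size()) {
        throw std::runtime_error("unexpected end of input");
    }
    const char c = data[pos];
    if (c == 'i') {
        // Integer: i<digits>e. Only the delimiters are checked here.
        const std::size_t end = data.find('e', pos);
        if (end == std::string::npos || end == pos + 1) {
            throw std::runtime_error("malformed integer");
        }
        return end + 1;
    }
    if (c == 'l' || c == 'd') {
        // List: l<items>e. Dictionary: d<key><value>...e.
        ++pos;
        while (pos < data.size() && data[pos] != 'e') {
            if (c == 'd' &&
                    !std::isdigit(static_cast<unsigned char>(data[pos]))) {
                throw std::runtime_error("dictionary key must be a string");
            }
            pos = skipValue(data, pos);      // list item or dictionary key
            if (c == 'd') {
                pos = skipValue(data, pos);  // dictionary value
            }
        }
        if (pos >= data.size()) {
            throw std::runtime_error("unterminated list or dictionary");
        }
        return pos + 1;  // skip the closing 'e'
    }
    if (std::isdigit(static_cast<unsigned char>(c))) {
        // String: <length>:<bytes>.
        std::size_t length = 0;
        while (pos < data.size() &&
                std::isdigit(static_cast<unsigned char>(data[pos]))) {
            length = length * 10 + static_cast<std::size_t>(data[pos] - '0');
            ++pos;
        }
        if (pos >= data.size() || data[pos] != ':') {
            throw std::runtime_error("missing ':' after string length");
        }
        ++pos;
        if (length > data.size() - pos) {
            throw std::runtime_error("string runs past the end of input");
        }
        return pos + length;
    }
    throw std::runtime_error("unexpected character");
}

// True if data consists of exactly one structurally valid bencoded value.
bool isWellFormed(const std::string &data) {
    try {
        return skipValue(data, 0) == data.size();
    } catch (const std::runtime_error &) {
        return false;
    }
}
```

Because strings are length-prefixed, a wrong length shifts everything that follows it, so a check like this rejects input such as 5:hell or a list that is missing its closing e.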
If you are curious, the specification can be found here.
cpp-bencoding
cpp-bencoding is a C++11 bencoding library supporting both decoding and encoding. It provides a simple and extensible API for decoding, encoding, and pretty-printing of bencoded data.
    #include "bencoding/bencoding.h"

    // Decode data stored in a std::string.
    auto decodedData = bencoding::decode(str);

    // Decode data directly from a stream.
    auto decodedData = bencoding::decode(stream);

    // Encode the data into a std::string.
    std::string encodedData = bencoding::encode(decodedData);

    // Get a pretty representation of the decoded data.
    std::string prettyRepr = bencoding::getPrettyRepr(decodedData);
Requirements
The following software is required:
Optional:
- Doxygen to generate API documentation.
- Google Test to build and run tests.
- LCOV to generate code coverage statistics.
Build, Installation, and Usage
Please see the project page on GitHub.
API Documentation
The API documentation of the library can be generated during the library build and viewed in your favorite web browser. Doxygen is used to write and generate the documentation.
Tests
The library is thoroughly tested: over 99% of its source code is covered by unit tests written with the Google Test framework.
Extending the Library
The library is extensible, so you can write your own manipulation of the decoded data. In more detail, the BItemVisitor class implements the Visitor design pattern. By using it as a base class, you can create your own class that manipulates the bencoded data in any way you want. Two examples of using the BItemVisitor class are the Encoder and PrettyPrinter classes. Let's take a closer look at the first one.
As I have already said, it inherits from BItemVisitor and implements all its visitation member functions:
    class Encoder: private BItemVisitor {
        // Public interface.
        // ...

    private:
        virtual void visit(BDictionary *bDictionary) override;
        virtual void visit(BInteger *bInteger) override;
        virtual void visit(BList *bList) override;
        virtual void visit(BString *bString) override;
    };
where the BDictionary, BInteger, BList, and BString classes form a representation of the four supported data types. All of them inherit from the base class BItem.
The encoding of data is simply implemented as follows:
    std::string Encoder::encode(std::shared_ptr<BItem> data) {
        data->accept(this);
        return encodedData;
    }
where encodedData is a member variable of type std::string, and accept() simply calls a proper visit() member function on the encoder (a typical Visitor design pattern implementation). The visitation member functions are implemented as follows:
    void Encoder::visit(BDictionary *bDictionary) {
        encodedData += "d";
        for (auto item : *bDictionary) {
            item.first->accept(this);
            item.second->accept(this);
        }
        encodedData += "e";
    }

    void Encoder::visit(BInteger *bInteger) {
        std::string encodedInteger("i" + std::to_string(bInteger->value()) + "e");
        encodedData += encodedInteger;
    }

    void Encoder::visit(BList *bList) {
        encodedData += "l";
        for (auto bItem : *bList) {
            bItem->accept(this);
        }
        encodedData += "e";
    }

    void Encoder::visit(BString *bString) {
        std::string encodedString(
            std::to_string(bString->length()) + ":" + bString->value()
        );
        encodedData += encodedString;
    }
So, if you want to process the decoded data in a custom way, simply subclass BItemVisitor and use an analogous approach to implement the desired functionality.
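As an illustration of that approach, the following sketch defines a custom visitor that sums all integers found in a decoded structure. To keep the example self-contained, it declares simplified stand-ins for BItem, BItemVisitor, and the four item classes; they mirror the names used above, but they are assumptions made for this sketch rather than the library's real declarations. In particular, the public items members and the use of plain std::string dictionary keys are simplifications, and IntegerSummer is a hypothetical class.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for the library's classes, written for this sketch
// only. The real cpp-bencoding declarations differ in detail.
class BDictionary;
class BInteger;
class BList;
class BString;

class BItemVisitor {
public:
    virtual ~BItemVisitor() = default;
    virtual void visit(BDictionary *bDictionary) = 0;
    virtual void visit(BInteger *bInteger) = 0;
    virtual void visit(BList *bList) = 0;
    virtual void visit(BString *bString) = 0;
};

class BItem {
public:
    virtual ~BItem() = default;
    // Each concrete item calls the visit() overload for its own type.
    virtual void accept(BItemVisitor *visitor) = 0;
};

class BInteger: public BItem {
public:
    explicit BInteger(std::int64_t value): value_(value) {}
    std::int64_t value() const { return value_; }
    void accept(BItemVisitor *visitor) override { visitor->visit(this); }
private:
    std::int64_t value_;
};

class BString: public BItem {
public:
    explicit BString(std::string value): value_(std::move(value)) {}
    const std::string &value() const { return value_; }
    void accept(BItemVisitor *visitor) override { visitor->visit(this); }
private:
    std::string value_;
};

class BList: public BItem {
public:
    std::vector<std::shared_ptr<BItem>> items;
    void accept(BItemVisitor *visitor) override { visitor->visit(this); }
};

class BDictionary: public BItem {
public:
    // Simplification: keys are plain strings rather than BString items.
    std::map<std::string, std::shared_ptr<BItem>> items;
    void accept(BItemVisitor *visitor) override { visitor->visit(this); }
};

// Hypothetical custom visitor: adds up every integer in the structure.
class IntegerSummer: public BItemVisitor {
public:
    std::int64_t sum = 0;

    void visit(BDictionary *bDictionary) override {
        for (auto &entry : bDictionary->items) {
            entry.second->accept(this);
        }
    }
    void visit(BInteger *bInteger) override { sum += bInteger->value(); }
    void visit(BList *bList) override {
        for (auto &bItem : bList->items) {
            bItem->accept(this);
        }
    }
    void visit(BString *) override {}  // strings do not contribute
};
```

To use it, create an IntegerSummer, pass its address to accept() on the root item, and read the sum member afterwards. Keeping the result in a member variable follows the same pattern as encodedData in Encoder.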
License
The library is distributed under the BSD 3-clause license. See the LICENSE file for more details.
Hi Petr,
Have you had any experience with (c)making this as a Visual Studio solution? I did so after modifying the CMakeLists.txt slightly, and I successfully built the .lib file (src\Release\bencoding.lib)
However when I try to link this .lib file to my other Visual Studio project it doesn’t seem to work
error LNK2019: unresolved external symbol “class std::unique_ptr<class bencoding::BItem,struct std::default_delete<class bencoding::BItem> > …”
Thanks in advance!
Hi Joakim,
no, I have not tried that. What changes did you make in CMakeLists.txt to make it compilable under VS? Maybe I can incorporate the changes into the project to help other people who use VS.
With regards to the error, can you please post the complete error message? What version of VS do you use? How do you use the library in your project? Can you post some code?