2014/03/02

Numpy.hpp - Simple Header-only Library for handling .npy files

Introduction

C++ is well suited to large-scale computation, while Python is excellent for plotting and visualizing the results with libraries such as SciPy and matplotlib. For that reason, researchers commonly use both languages: C++ for the heavy computation and Python for the visualization.

What this workflow needs is a C++ library for reading and writing .npy files (the default file format of numpy). Several libraries have been proposed for this, but they require precompiling and do not work well in some environments (e.g. MSVC).

To solve these problems, I implemented a new C++ library called Numpy.hpp. Compared with the existing libraries, Numpy.hpp has two advantages:

  • Header-only: you do not need to precompile Numpy.hpp. All you have to do is add #include "Numpy.hpp" to your source file.
  • No external dependencies: Numpy.hpp uses only the C++03 standard library. This means that Numpy.hpp can work in any environment, including MSVC.

You can download Numpy.hpp from the following URL:
https://gist.github.com/rezoo/5656056

Usage

This section briefly shows how to use Numpy.hpp.

Load .npy files

template<typename Scalar>
void LoadArrayFromNumpy(
    const std::string& filename, std::vector<int>& shape,
    std::vector<Scalar>& data);

To load a .npy file in C++, use the aoba::LoadArrayFromNumpy() functions.

  1. const std::string& filename: the path of the input file.
  2. std::vector<int>& shape: receives the tensor's shape; its size is the rank, and each entry is the length along one axis.
  3. std::vector<Scalar>& data: receives the tensor's elements as a flat array.

Please note that the element type of std::vector<Scalar> must match the data type stored in the .npy file. Numpy.hpp throws an exception if the types differ.

A simple example of the aoba::LoadArrayFromNumpy<T> function is as follows:

#include <iostream>
#include <string>
#include <vector>
#include "Numpy.hpp"

int main() {
    // Assume np.load("file.npy").shape == (4, 5) and dtype == float64
    std::vector<int> s;
    std::vector<double> data;
    aoba::LoadArrayFromNumpy("/path/to/file.npy", s, data);
    std::cout << s[0] << " " << s[1] << std::endl; // 4 5
    std::cout << data.size() << std::endl; // 20
    return 0;
}
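
Because data comes back as a flat vector, the shape is what lets you address individual elements. The following is a minimal sketch, not part of Numpy.hpp: NumElements and At2D are hypothetical helpers, and the indexing assumes the file was stored in row-major (C) order, which is what numpy writes for ordinary C-contiguous arrays.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Numpy.hpp): the number of elements
// implied by a shape, i.e. the product of all its lengths.
inline std::size_t NumElements(const std::vector<int>& shape) {
    std::size_t n = 1;
    for (std::size_t i = 0; i < shape.size(); ++i) {
        n *= static_cast<std::size_t>(shape[i]);
    }
    return n;
}

// Hypothetical helper: element (row, col) of a 2nd-order tensor held in
// a flat vector, ASSUMING row-major (C) order.
template<typename Scalar>
const Scalar& At2D(const std::vector<Scalar>& data,
                   const std::vector<int>& shape, int row, int col) {
    return data[static_cast<std::size_t>(row) * shape[1] + col];
}
```

For the (4, 5) example above, NumElements(s) equals data.size(), and At2D(data, s, 2, 3) reads flat offset 2 * 5 + 3 = 13, which under the row-major assumption corresponds to np.load("file.npy")[2, 3].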

Numpy.hpp also provides several shortcuts that can be used when the order (rank) of the tensor is known in advance.

#include <iostream>
#include <vector>
#include "Numpy.hpp"

int rows, cols;
std::vector<double> data;
aoba::LoadArrayFromNumpy("/path/to/file1.npy", rows, cols, data); // 2nd-order tensor
std::cout << rows << " " << cols << std::endl;

int a0;
aoba::LoadArrayFromNumpy(
    "/path/to/file2.npy", a0, data); // 1st-order tensor
int b0, b1, b2;
aoba::LoadArrayFromNumpy(
    "/path/to/file3.npy", b0, b1, b2, data); // 3rd-order tensor
int c0, c1, c2, c3;
aoba::LoadArrayFromNumpy(
    "/path/to/file4.npy", c0, c1, c2, c3, data); // 4th-order tensor

Save .npy files

template<typename Scalar>
void SaveArrayAsNumpy(
    const std::string& filename, bool fortran_order,
    int n_dims, const int shape[], const Scalar* data);

To save a .npy file, use the aoba::SaveArrayAsNumpy() functions.

  1. const std::string& filename: the path of the output .npy file.
  2. bool fortran_order: set this flag to true if the data is stored in column-major (Fortran) order.
  3. int n_dims: the rank of the tensor.
  4. const int shape[]: an array of n_dims entries holding the length along each axis.
  5. const Scalar* data: a pointer to the raw elements of the tensor.
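
To make the meaning of fortran_order concrete, the sketch below computes where element (i, j) of a rows x cols matrix sits in the flat buffer under each convention. RowMajorOffset and ColMajorOffset are hypothetical helpers, not part of Numpy.hpp, and the sketch assumes the flag only describes the layout of the buffer you pass in rather than reordering it.

```cpp
#include <cstddef>

// Hypothetical helpers, not part of Numpy.hpp.

// Row-major (C order, fortran_order == false):
// the elements of one row are adjacent in memory.
inline std::size_t RowMajorOffset(std::size_t i, std::size_t j,
                                  std::size_t rows, std::size_t cols) {
    (void)rows;  // unused; kept so both helpers share one signature
    return i * cols + j;
}

// Column-major (Fortran order, fortran_order == true):
// the elements of one column are adjacent in memory.
inline std::size_t ColMajorOffset(std::size_t i, std::size_t j,
                                  std::size_t rows, std::size_t cols) {
    (void)cols;  // unused; kept so both helpers share one signature
    return j * rows + i;
}
```

For a 2 x 3 matrix, element (0, 1) sits at offset 1 in row-major order but at offset 2 in column-major order. A buffer filled according to ColMajorOffset would therefore be saved with fortran_order set to true.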

A simple example of the aoba::SaveArrayAsNumpy<T> function is as follows:

#include "Numpy.hpp"

// Now I assume s.size() == 2
aoba::SaveArrayAsNumpy("/path/to/file.npy", false, 2, &s[0], &data[0]);

As with aoba::LoadArrayFromNumpy<T>(), Numpy.hpp also provides several shortcut functions.

// the following functions assume that fortran_order == false.
aoba::SaveArrayAsNumpy("/path/to/file.npy", rows, cols, &data[0]);

aoba::SaveArrayAsNumpy(
    "/path/to/file2.npy", a0, &data[0]); // 1st-order tensor
aoba::SaveArrayAsNumpy(
    "/path/to/file3.npy", b0, b1, b2, &data[0]); // 3rd-order tensor
aoba::SaveArrayAsNumpy(
    "/path/to/file4.npy", c0, c1, c2, c3, &data[0]); // 4th-order tensor

If you have any questions, do not hesitate to ask me (rezoolab at gmail.com) :) Enjoy!
