Note: This article is reproduced from stackoverflow.com.
Tags: concatenation, h5py, hdf5, numpy, python

Custom column names in HDF5 file using h5py

Published on 2020-05-13 15:28:19

I have the following code snippet:

import h5py
import numpy

## Data set with shape (5, 5) and numpy array containing the column names as strings
data = numpy.random.random((5, 5))
column_names = numpy.array(["a", "b", "c", "d", "e"])

## Create file pointer
fp = h5py.File("data_set.HDF5", "w")

## Store data
fp["sub"] = data

## Close file
fp.close()

How do I add the names for the columns in the HDF5 file as indicated by the arrow in the included figure?

Questioner: The Dude
Viewed: 117

Answer by kcw78 (2020-02-29 00:05):

The trick is to use a NumPy dtype to define the field (column) names, then use that dtype to create a record array. A compound dtype also lets you mix types, for example ints, floats and strings in the same row.
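To illustrate the mixed-type remark, here is a minimal NumPy-only sketch. The field names (`id`, `value`, `label`) and the sample rows are made up for illustration and are not part of the question. The string field is a fixed-length byte string (`S8`) because, as far as I know, h5py does not store NumPy Unicode (`U`) dtypes directly.

```python
import numpy as np

# Hypothetical column names and types, chosen only for illustration.
mixed_dt = np.dtype([
    ("id", np.int32),       # integer column
    ("value", np.float64),  # float column
    ("label", "S8"),        # fixed-length byte string column (up to 8 bytes)
])

# One tuple per row; each tuple position fills the field at the same index.
rows = np.array([(1, 0.5, b"first"), (2, 1.5, b"second")], dtype=mixed_dt)

id_column = rows["id"]          # select a whole column by field name
first_label = rows[0]["label"]  # select one field of one row
```

An array like `rows` can be passed to `create_dataset(..., data=rows)` in the same way as the record array in the answer's example.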

Modified example below:

import h5py
import numpy as np

## Data set with shape (5, 5) and list containing the column names as strings
data = np.random.rand(5, 5)
col_names = ["a", "b", "c", "d", "e"]

## Compound dtype: one float field per column name
ds_dt = np.dtype({"names": col_names, "formats": [float] * len(col_names)})

## View each row of 5 floats as one record; rec_arr has shape (5, 1)
rec_arr = np.rec.array(data, dtype=ds_dt)

## Create file pointer and store the record array instead of the plain array
with h5py.File("data_set_2.HDF5", "w") as fp:
    ds1 = fp.create_dataset("sub", data=rec_arr)
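To check that the column names really end up in the file, you can read the dataset back and inspect its dtype. This is only a sketch under some assumptions: it rebuilds the same record array as above and writes to a temporary directory (my choice, so it does not overwrite anything) rather than the working directory.

```python
import os
import tempfile

import h5py
import numpy as np

col_names = ["a", "b", "c", "d", "e"]
data = np.random.rand(5, 5)
ds_dt = np.dtype({"names": col_names, "formats": [float] * len(col_names)})
rec_arr = np.rec.array(data, dtype=ds_dt)

# Hypothetical scratch location so the sketch does not touch existing files.
path = os.path.join(tempfile.mkdtemp(), "data_set_2.HDF5")

with h5py.File(path, "w") as fp:
    fp.create_dataset("sub", data=rec_arr)

with h5py.File(path, "r") as fp:
    ds = fp["sub"]
    stored_names = ds.dtype.names  # field names are part of the dataset dtype
    stored_shape = ds.shape        # one record per original row
    table = ds[...]                # read the whole dataset into memory
    col_c = table["c"]             # pick a single column by its name
```

Because the names are part of the compound dtype, HDF5 viewers can show them as column headers, and `col_c` holds the values that were in the third column of `data`.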