Save gramMatrix to a csv file (in hex.pca.daal.PCA_DAAL_SVD_DenseBatch)

Description

The PCA implementation hex.pca.daal.PCA_DAAL_SVD_DenseBatch (namely its constructor at https://gist.github.com/mathemage/b0ca1760b59dd453973e0e8fada36e97#file-pca_daal_svd_densebatch-java-L31 ) receives its data by reading a file. However, the data for PCA in H2O is passed as an in-memory double array.

Since I haven't figured out how to create an in-memory double array data adapter that implements the interface com.intel.daal.data_management.data_source.DataSource, I suggest a quick hack:

  1. take the in-memory array

  2. store it in a csv file

  3. read from the file using com.intel.daal.data_management.data_source.FileDataSource

Then one can proceed to the actual PCA computation implemented by Intel DAAL.
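
Steps 1 and 2 of the hack could be sketched as follows. This is a minimal sketch that assumes the in-memory data is a rectangular double[][] and that a headerless, comma-separated file is acceptable; the class GramMatrixCsv and its write method are hypothetical names, not part of H2O or Intel DAAL.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of H2O or Intel DAAL.
final class GramMatrixCsv {
    private GramMatrixCsv() {}

    /** Writes each row of the matrix as one comma-separated line, without a header. */
    static void write(double[][] matrix, Path target) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (double[] row : matrix) {
                StringBuilder line = new StringBuilder();
                for (int j = 0; j < row.length; j++) {
                    if (j > 0) {
                        line.append(',');
                    }
                    // Double.toString is locale-independent and round-trips the value.
                    line.append(Double.toString(row[j]));
                }
                out.write(line.toString());
                out.newLine();
            }
        }
    }
}
```

The path of the written file would then be handed to com.intel.daal.data_management.data_source.FileDataSource as in step 3.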

Activity

Олег Кремнёв
February 20, 2018, 9:04 AM
Edited

Hi,

You could try to do it in the following way:

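import com.intel.daal.data_management.data.HomogenNumericTable;
import com.intel.daal.services.DaalContext;

private static DaalContext context = new DaalContext();
// your_data: flat array of size nVectorsHomogen x nFeaturesHomogen
double[] data = your_data;
HomogenNumericTable dataTable = new HomogenNumericTable(context, data, nFeaturesHomogen, nVectorsHomogen);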

See https://github.com/intel/daal/blob/daal_2018_update1/examples/java/com/intel/daal/examples/datasource/DataStructuresHomogen.java for an example.

Please let me know if this method helped you.
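
The constructor above takes a flat double[]. If the matrix on the H2O side is held as a double[][] (an assumption, since the ticket only says "double array"), it would first have to be flattened. A minimal sketch, assuming the table expects row-major order with one row per vector; RowMajor and flatten are hypothetical names, not part of H2O or Intel DAAL:

```java
// Hypothetical helper, not part of H2O or Intel DAAL.
final class RowMajor {
    private RowMajor() {}

    /** Copies a rectangular double[][] into one row-major double[] of length rows * cols. */
    static double[] flatten(double[][] matrix) {
        int rows = matrix.length;
        int cols = rows == 0 ? 0 : matrix[0].length;
        double[] flat = new double[rows * cols];
        for (int i = 0; i < rows; i++) {
            // Reject ragged input, which has no well-defined flat layout.
            if (matrix[i].length != cols) {
                throw new IllegalArgumentException(
                        "row " + i + " has length " + matrix[i].length + ", expected " + cols);
            }
            System.arraycopy(matrix[i], 0, flat, i * cols, cols);
        }
        return flat;
    }
}
```

The result of RowMajor.flatten(matrix) could then be passed as data, with matrix[0].length as nFeaturesHomogen and matrix.length as nVectorsHomogen.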

Assignee

New H2O Bugs

Reporter

Karel Ha

CustomerVisible

No

Priority

Blocker