Blerim Emruli

Senior lecturer

Random indexing of multidimensional data

Authors

  • Fredrik Sandin
  • Blerim Emruli
  • Magnus Sahlgren

Summary, in English

Random indexing (RI) is a lightweight dimension reduction method that is used, for example, to approximate vector semantic relationships in online natural language processing systems. Here we generalise RI to multidimensional arrays, thereby enabling approximation of higher-order statistical relationships in data. The generalised method is a sparse implementation of random projections, which are also the theoretical basis of ordinary RI and other randomisation approaches to dimensionality reduction and data representation. We present numerical experiments, including comparisons with ordinary RI and principal component analysis, which demonstrate that a multidimensional generalisation of RI is feasible. The RI method is well suited to online processing of data streams because relationship weights can be updated incrementally in a fixed-size distributed representation, and inner products can be approximated on the fly at low computational cost. An open source implementation of generalised RI is provided.
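
The incremental update and on-the-fly decoding described in the summary can be illustrated with a minimal two-dimensional sketch. This is not the authors' open source implementation: the names `index_vector` and `RandomIndex2D`, the parameters `dim`, `nonzeros` and `seed`, and the normalisation by `nonzeros ** 2` are assumptions made for illustration only. Each index is mapped to a sparse ternary vector, a weight is added to a fixed-size state through the outer product of two such vectors, and the weight is estimated again by an inner product with the same outer product.

```python
import random


def index_vector(key, dim, nonzeros, seed=0):
    """Return a sparse ternary index vector as {position: +1 or -1}.

    Hypothetical helper: the vector is regenerated deterministically
    from (seed, key), so it never has to be stored.
    """
    rng = random.Random(f"{seed}:{key}")
    positions = rng.sample(range(dim), nonzeros)  # distinct positions
    # Alternate signs so that half the entries are +1 and half are -1.
    return {p: (1 if n % 2 == 0 else -1) for n, p in enumerate(positions)}


class RandomIndex2D:
    """Fixed-size dim x dim state approximating a large sparse 2-D array."""

    def __init__(self, dim=64, nonzeros=4, seed=0):
        self.dim = dim
        self.nonzeros = nonzeros
        self.seed = seed
        self.state = [[0.0] * dim for _ in range(dim)]

    def _vec(self, axis, key):
        # Separate index vectors per axis, so (i, j) differs from (j, i).
        return index_vector((axis, key), self.dim, self.nonzeros, self.seed)

    def add(self, i, j, weight):
        """Incrementally add `weight` to the relationship (i, j)."""
        ri, rj = self._vec(0, i), self._vec(1, j)
        for p, sp in ri.items():
            for q, sq in rj.items():
                self.state[p][q] += weight * sp * sq

    def estimate(self, i, j):
        """Approximate the accumulated weight of (i, j) by an inner product."""
        ri, rj = self._vec(0, i), self._vec(1, j)
        total = sum(
            self.state[p][q] * sp * sq
            for p, sp in ri.items()
            for q, sq in rj.items()
        )
        # The outer product of the two index vectors has squared norm
        # nonzeros ** 2, which is used here as the normalisation (assumption).
        return total / (self.nonzeros ** 2)
```

With a single stored entry, `estimate` returns the stored weight exactly. As more entries share the same state, the index vectors overlap at random positions and the estimates become approximate, while the size of `state` stays fixed regardless of how many distinct indices are seen.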

Publishing year

2017

Language

English

Pages

267-290

Publication/Series

Knowledge and Information Systems

Volume

52

Document type

Journal article

Topic

  • Information Systems, Social aspects (including Human Aspects of ICT)

Status

Published

ISBN/ISSN/Other

  • ISSN: 0219-3116