Skip to content

[RFC]: achieve ndarray API parity with built-in JavaScript arrays #61

Closed
@rxbryan

Description

@rxbryan

Full name

Bryan Elee Atonye

University status

Yes

University name

University of Port Harcourt

University program

Mathematics and Computer Science

Expected graduation

July

Short biography

My name is Bryan Elee. I am in my final year pursuing a degree in Mathematics and Computer Science. I recently completed my final exam and project defense, hence I'm awaiting graduation in the next couple months.
I possess a strong foundation in various programming languages, with over 5 years of programming experience with C/C++, Python, and JavaScript, honed through academic studies, practical projects and internships. I have previously participated in the Google Summer of Code 2022 under the Metacall Organization and in the summer of Bitcoin program last year working under the Ledger Organization. This experiences solidified my ability to work effectively within open-source communities and collaborate with experienced developers.
I'm interested in machine learning, especially the field of reinforcement learning. I did some work on reinforcement learning last year and I am very excited about the possibilities offered by this technology.

Specific Achievements:

  • Summer of Bitcoin: Developed "Resigner," a general-purpose hot signing service in Python for Ledger Organization. This project demonstrates my ability to tackle complex tasks (Miniscript language, cryptographic functionalities) and deliver real-world solutions. Project Link

  • Google Summer of Code: Refactored the Metacall core library to a plugin architecture. This experience showcases my proficiency in C/C++ development Project Documentation:

Timezone

UTC +1

Contact details

rxbryn@gmail.com

Platform

Linux

Editor

Sublime Text

Programming experience

I began programming before university in 2018. I started out writing shell scripts, moved on to C/C++, then the Python programming language, Javascript and NodeJS. I am self taught in the above languages, I was usually motivated by some project I was developing. I have worked on a couple projects but I am most proud of Resigner.

Resigner is an easy to program hot signing service for miniscript policies. The Resigner countersigns transactions (according to some rules (spending conditions), set in advance in the configuration file, for example “no more than 1 million satoshis per day” before the transaction is broadcast to the bitcoin network. It provides the following features:

  • Enforce preset rules (spending conditions) on transactions.
  • Stealing a user's key is not sufficient to steal funds.
  • The user can recover funds if the service is no longer available, after a given period of time (as specified by the locking script).

It acts as a trusted third party in multiparty transactions enforcing previously agreed conditions

JavaScript experience

I have about 3 years of experience writing javascript programs. I have two published npm packages http-date and http-preconditions. I also have some experience doing backend web development using NodeJS, Express.
I have contributed Javascript to a few open source projects such as

My favourite feature in javascript would be function prototypes. While this pattern has fallen out of favour being replaced by the class syntax, the prototype pattern provides an interesting approach for dynamic inheritance of object properties and behaviour.

My least favourite feature in Javascript is the event loop. While the event loop is responsible for the asynchronous behaviour in javascript, it is also makes writing true multithreaded javascript applications very difficult. Any attempt at optimising javascript code requires deep understanding of the nature of the event loop and how it affects the specific code being optimised. This experience is not readily available.

Node.js experience

My experience with NodeJS is quite extensive. I have some experience modifying NodeJS source code and compiling the Library for embedding purposes. Some of my experience developing node native addons and embedding NodeJS comes from contributing to the development of the node loader in metacall core. This draft PR contains a lot of my work in embedding nodejs. It was used as the base for implementing the feature for exporting classes and objects form nodejs to metacall.
I also have some experience developing web applications using nodejs, express.js. I have also published some npm packages as I have elaborated on in the javascript section

C/Fortran experience

The C programming language is the first language I learnt, the second being C++. It is the language that I have clocked the most years of experience. I used C extensively while paticipating in the summer of code 2022 under metacall and I also worked on some personal projects using C.
Some of my contributions to open source projects using C include
metacall/core#289
metacall/core#270
metacall/core#287
metacall/core#298
Some of these merged PRs include C++ code. But still demonstrates my the requisite skill

Interest in stdlib

My interest in Stdlib is twofold.

  • I am a mathematics undergraduate and I'm interested in projects that allow me to utilise my mathematical knowledge. Hence the fact that Stdlib is involved in implementing linear algebra, statistical and numerical analysis routines piqued my interest.
  • Stdlib is a polygot project, utilising three of my favourite programming languages and technologies C, JavaScript, NodeJS in ways that not only test my skill but improves my understanding of them.

Version control

Yes

Contributions to stdlib

Merged contributions

refactor: update blas/ext/base/sapxsumpw to follow current project conventions
refactor: update blas/ext/base/scusumors to follow current project conventions
refactor: update blas/ext/base/scusumpw to follow current project conventions
refactor: update blas/ext/base/sapx to follow current project conventions

Goals

The goal of this project is to achieve API parity for Stdlib native ndarray with built-in JavaScript Array. Of all the existing JavaScript array method only the at and slice methods exist in ndarray.

Each of the APIs is a standalone package in either the @stdlib/ndarray/base or @stdlib/ndarray directory

Each package would have this file structure

Package_name
             |benchmark
             |docs
                       |types
                        repl.txt 
             |examples
             |lib
             |test
             Readme
             package.json

The following APIs will be implemented during the course of this project:

ndarray slice semantics for representing indices

APIs taking an Index or multiple indices will utilise the slice semantics. We shall use the slice API as it is, hence APIs such as fill, copywithin, splice etc shall take a slice object, array of slice objects or a multislice object.

Dimensionality Reduction

In APIs which it would be suitable to support operating over specific axes, we will be utilising approach used by numpy.
A null axis, (the default) is would perform the operation over all the dimensions of the input ndarray. If this is an array of ints, a reduction is performed on multiple axes, instead of a single axis or all the axes as before.
For example, given a three dimentional ndarray, axis = 0 represent reducing along the depth. 1 represents represent reducing along the row and 2 represents represent reducing along the column

Accessors

ndarray APIs taking a callback such as unary implement optimised accessors for dimensions upto the 10d. We shall use this approach while implementing the APIs requiring callbacks

APIs

/**
* Returns an ndarray.
*
* @param {ndarrayLike} x - input ndarray
* @param {ndarrayLike}y - input ndarray
* @throws {TypeError} first argument must be an ndarray
* @throws {TypeError} second argument must be an ndarray
* @returns {ndarray} ndarray view
*/
concat(x, y)
/**
* Returns an ndarray.
*
* @param {ndarrayLike} x - input ndarray 
* @param {...integer} target - Zero-based index at which to copy the sequence to.
* @param  {...*} s - slice arguments: a MultiSlice instance, an array of slice arguments, or slice arguments as separate arguments.
* @throws {TypeError} first argument must be an ndarray
* @throws {TypeError} index arguments must be integers
* @throws {RangeError} number of index arguments must equal the number of dimensions
* @returns {ndarray} target - ndarray view
*/
copywithin(x, target, s)
/**
* Returns an ndarray.
*
* @param {ndarrayLike} target - input ndarray
* @param {Number} value
* @param  {...*} s - slice arguments: a MultiSlice instance, an array of slice arguments, or slice arguments as separate arguments.
* @throws {TypeError} first argument must be an ndarray
* @throws {TypeError} value and index arguments must be integers
* @throws {RangeError} number of index arguments must equal the number of dimensions
* @returns {ndarray} target - ndarray view
*/
fill( target, value[, s] )

APIs that take a callback

/**
* Returns an ndarray.
*
* @param {ndarrayLike} x - input ndarray
* @param {Callback} fcn - A function to execute for each element in the array.
* @param {...*} axis - null or int or array of ints, optional
* @throws {TypeError} first argument must be an ndarray
* @returns {ndarray} target - ndarray view
*/
filter(x, fcn[, axis])
/**
* Returns an ndarray element.
*
* @param {ndarrayLike} x - input ndarray
* @param {Callback} fcn - unary callback
*  @param {...*} axis - null or int or array of ints, optional
* @throws {TypeError} first argument must be an ndarray
* @returns {*} ndarray element
*/
find(x, fcn[, axis])
/**
* Returns an ndarray element.
*
* @param {ndarrayLike} x - input ndarray
* @param {Callback} fcn - A function to execute for each element in the array.
* @param {...*} axis - null or int or array of ints, optional
* @throws {TypeError} first argument must be an ndarray
* @returns {*} ndarray element
*/
findlast(x, fcn[, axis])
/**
* Executes a provided function once for each array element.
* 
* @param {ndarrayLike} x - input ndarray
* @param {Callback} fcn - A function to execute for each element in the array.
* @throws {TypeError} first argument must be an ndarray
* @returns {void} 
*/
foreach(x, fcn)
/**
* Returns an ndarray.
*
* @param {ndarrayLike} x - input ndarray
* @param {Callback} fcn - A function to execute for each element in the array.
* @returns {ndarray} ndarray 
*/
from(x [, fcn])
/**
* Returns an ndarray.
*
* @param {ndarrayLike} x - input ndarray
* @param {Callback} fcn - A function to execute for each element in the array.
*  @param {...*} axis - null or int or array of ints, optional
* @throws {TypeError} first argument must be an ndarray
* @returns {ndarray} output ndarray 
*/
map(x, fcn[, axis])
/**
* Returns the value that results from running the "reducer" callback function to completion over the entire array..
*
* @param {ndarrayLike} x - input ndarray
* @param {Callback} fcn - A function to execute for each element in the array.
* @param {...*} axis - null or int or array of ints, optional
* @returns {Number}  
*/
reduce(x, fcn [, initialvalue, axis])
/**
* Returns an ndarray.
*
* @param {ndarrayLike} x - input ndarray
* @param {Callback} fcn - A function to execute for each element in the array.
* @param {...*} axis - null or int or array of ints, optional
* @returns {ndarray} output ndarray 
*/
reduceright(x, fcn [, initialvalue, axis])
/**
* Returns an ndarray.
*
* @param {ndarrayLike} x - input ndarray
* @param {Callback} [fcn] - optional callback
* @param {...*} axis - null or int or array of ints, optional
* @returns {ndarray} reference to input array
*/
sort(x, [, fcn, axis])
/**
* Returns an ndarray.
*
* @param {ndarrayLike} x - input ndarray
* @param {Callback} [fcn] - optional callback
* @param {...*} axis - null or int or array of ints, optional
* @returns {ndarray} A new array with the elements sorted in ascending order.
*/
tosorted(x [, fcn, axis])
/**
* Returns a boolean .
*
* we don't have support for `bool` in ndarray, hence we flatten the input ndarray and operate over all the single dimension
* 
* @param {ndarrayLike} target - input ndarray
* @param {Number} searchElement - The value to search for. 
 * @param  {...*} s - slice arguments: a MultiSlice instance, an array of slice arguments, or slice arguments as separate arguments.
* @throws {TypeError} first argument must be an ndarray
* @throws {TypeError} index arguments must be integers
* @throws {RangeError} number of index arguments must equal the number of dimensions
* @returns {boolean} 
*/
includes( x, searchElement [, s])
/**
* Returns a string.
*
* @param {ndarrayLike} x - input ndarray
* @param {String} seperator
* @throws {TypeError} first argument must be an ndarray
* @returns {String} 
*/
join(x, seperator)
/**
* Returns an ndarray.
*
* @param {ndarrayLike} x - input ndarray
* @returns {ndarray} reference to input array
*/
reverse(x)
/**
* Returns an ndarray.
*
* @param {ndarrayLike} x - input ndarray
* @param {...integer} [start] - Zero-based index at which to start changing the array, converted to an integer.
* @param {integer} [deletecount] - An integer indicating the number of elements in the array to remove from start.
* @param {*} [ item1, item2, /* …, */ itemN] - The elements to add to the array, beginning from start.
* @returns {ndarray} reference to input array
*/
splice(x, start [, deleteCount, item1, item2, /* …, */ itemN])
/**
* Returns an ndarray.
*
* @param {ndarrayLike} x - input ndarray
* @returns {ndarray} A new array containing the elements in reversed order.
*/
toreversed(x)
/**
* Returns an ndarray.
*
* @param {ndarrayLike} x - input ndarray
* @param {...integer} [start] - Zero-based index at which to start changing the array, converted to an integer.
* @param {integer} [deletecount] - An integer indicating the number of elements in the array to remove from start.
* @param {*} [ item1, item2, /* …, */ itemN] - The elements to add to the array, beginning from start.
* @throws {TypeError} first argument must be an ndarray
* @returns {ndarray} A new array that consists of all elements before start, item1, item2, , itemN, and all elements after start + deleteCount.
*/
tospliced(x, start [, deleteCount, item1, item2, /* …, */ itemN])
/**
* Returns a string.
*
* @param {ndarrayLike} x - input ndarray
* @throws {TypeError} first argument must be an ndarray
* @returns {String} A string representing the elements of the array.
*/
tostring(x)
/**
* Returns A new iterable iterator object.
*
* @param {ndarray} x - input array
* @returns {Iterator} iterator
*/
values(x)
/**
* Returns an ndarray.
*
* @param {ndarrayLike} x - input ndarray
* @param {Callback} fcn - A function to execute for each element in the array.
* @returns {ndarray} ndarray 
*/
flatmap(x , fcn)
/**
* Returns an ndarray.
*
* @param {ndarrayLike} x - input ndarray
* @param {...*} axis - null or int or array of ints, optional
* @param {integer} [depth]  
* @returns {*} ndarray element
*/
flat(x [,depth, axis])

Why this project?

Ndarrays are foundational to working with the stdlib library. They provide an efficient way to work with multi-dimensional numerical data. This project is a high priority for Stdlib for the fore-mentioned reason. It adds APIs that would be utilised in every package in the library.
The Knowledge of working with multi-dimensional numerical data is a highly valuable skill for data science and machine learning, career paths I intend on pursuing. A significant portion of data science and machine learning involves working with numerical data, often organized in multi-dimensional structures like matrices and tensors. These structures represent complex relationships between features and observations. Understanding how to manipulate, analyze, and interpret this data is very important, this project hence affords me first hand experience with the ndarray object.
I also stand to gain knowledge optimal techniques and patterns for iterating multidimensional arrays, possibly other optimisation techniques that might be used during the course the project.

Qualifications

I have completed the course work for a degree in Mathematics and Computer science. The relevant courses to this project would be Linear algebra, Numerical analysis, Data structures and algorithms.

I am also acquainted with the book Algorithms, 4th Edition by Robert Sedgewick and Kevin Wayne. It helped develop my understanding of both data structures and algorithms.

I am also quite familiar with the emcascript specification. The definitions and implementations of the APIs will be informed by it

Prior art

The at and slice methods exist in ndarray. Various ndarray APIs have also being implemented. They will inform and guide our implementation of the project

Commitment

As stated in background section, I recently completed my final exam and project defense. Hence I'm free from any major commitments and will be able to give a ~40hr/week to this project

Schedule

Assuming a 12 week schedule,

  • Community Bonding Period: Though I have become acquainted with the stdlib community in this last month and have contributed a fair amount in its activities, the project turned out to be very complex and some key details regarding the APIs still have to be ironed out. Hence I will spend this time engaging the mentors in other to complete the implementation details

Each of the APIs to be implemented is standalone, and will not be considered implemented without its benchmarks, tests, documentations and examples. So rather than having a week for documentation, tests and so on...I intended to submit PRs to atleast 3 APIs per week.

  • Week 1 - Week 3: start coding findLast, includes, join, reduceRight, toreversed, tosorted, toSpliced, values

  • Week 4 - Week 6: (midterm): implement filter, find , forEach, includes, splice, copywithin, concat,sort, reverse

  • Week 7 - Week 9: Implement the remaining APIs , map, reduce, some, join, toString,

  • Week 10 - Week 12: because of the complex nature of the project, I intend to leave the last two weeks for review because I’m expecting a lot of reviews before we can get this code merged

Related issues

No response

Checklist

  • I have read and understood the Code of Conduct.
  • I have read and understood the application materials found in this repository.
  • I understand that plagiarism will not be tolerated, and I have authored this application in my own words.
  • I have read and understood the patch requirement which is necessary for my application to be considered for acceptance.
  • The issue name begins with [RFC]: and succinctly describes your proposal.
  • I understand that, in order to apply to be a GSoC contributor, I must submit my final application to https://summerofcode.withgoogle.com/ before the submission deadline.

Metadata

Metadata

Assignees

No one assigned

    Labels

    20242024 GSoC proposal.rfcProject proposal.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions