From cea8e6b15a8b0ba8fc60769787fdab708844c9bf Mon Sep 17 00:00:00 2001 From: Namandeep Singh Date: Sat, 5 Aug 2023 02:03:21 +0530 Subject: [PATCH] feat: move usage and api to typedocs --- README.md | 164 +------------------------------------ docs/api-details.md | 127 ++++++++++++++++++++++++++++ docs/general-guidelines.md | 36 ++++++++ 3 files changed, 166 insertions(+), 161 deletions(-) create mode 100644 docs/api-details.md create mode 100644 docs/general-guidelines.md diff --git a/README.md b/README.md index 7ae15e10..c708c73e 100644 --- a/README.md +++ b/README.md @@ -160,169 +160,11 @@ Node versions >= 16.14.x are supported. ## [API](https://poolifier.github.io/poolifier/) -### `pool = new FixedThreadPool/FixedClusterPool(numberOfThreads/numberOfWorkers, filePath, opts)` +[**API Details**](./docs/api-details.md) -`numberOfThreads/numberOfWorkers` (mandatory) Number of workers for this pool -`filePath` (mandatory) Path to a file with a worker implementation -`opts` (optional) An object with the pool options properties described below +## General Guideline -### `pool = new DynamicThreadPool/DynamicClusterPool(min, max, filePath, opts)` - -`min` (mandatory) Same as _FixedThreadPool_/_FixedClusterPool_ numberOfThreads/numberOfWorkers, this number of workers will be always active -`max` (mandatory) Max number of workers that this pool can contain, the new created workers will die after a threshold (default is 1 minute, you can override it in your worker implementation). -`filePath` (mandatory) Path to a file with a worker implementation -`opts` (optional) An object with the pool options properties described below - -### `pool.execute(data, name)` - -`data` (optional) An object that you want to pass to your worker implementation -`name` (optional) A string with the task function name that you want to execute on the worker. Default: `'default'` - -This method is available on both pool implementations and returns a promise with the task function execution response. 
- -### `pool.destroy()` - -This method is available on both pool implementations and will call the terminate method on each worker. - -### `PoolOptions` - -An object with these properties: - -- `messageHandler` (optional) - A function that will listen for message event on each worker -- `errorHandler` (optional) - A function that will listen for error event on each worker -- `onlineHandler` (optional) - A function that will listen for online event on each worker -- `exitHandler` (optional) - A function that will listen for exit event on each worker -- `workerChoiceStrategy` (optional) - The worker choice strategy to use in this pool: - - - `WorkerChoiceStrategies.ROUND_ROBIN`: Submit tasks to worker in a round robin fashion - - `WorkerChoiceStrategies.LEAST_USED`: Submit tasks to the worker with the minimum number of executed, executing and queued tasks - - `WorkerChoiceStrategies.LEAST_BUSY`: Submit tasks to the worker with the minimum tasks total execution and wait time - - `WorkerChoiceStrategies.LEAST_ELU`: Submit tasks to the worker with the minimum event loop utilization (ELU) (experimental) - - `WorkerChoiceStrategies.WEIGHTED_ROUND_ROBIN`: Submit tasks to worker by using a [weighted round robin scheduling algorithm](./src/pools/selection-strategies/README.md#weighted-round-robin) based on tasks execution time - - `WorkerChoiceStrategies.INTERLEAVED_WEIGHTED_ROUND_ROBIN`: Submit tasks to worker by using an [interleaved weighted round robin scheduling algorithm](./src/pools/selection-strategies/README.md#interleaved-weighted-round-robin) based on tasks execution time (experimental) - - `WorkerChoiceStrategies.FAIR_SHARE`: Submit tasks to worker by using a [fair share scheduling algorithm](./src/pools/selection-strategies/README.md#fair-share) based on tasks execution time (the default) or ELU active time - - `WorkerChoiceStrategies.WEIGHTED_ROUND_ROBIN`, `WorkerChoiceStrategies.INTERLEAVED_WEIGHTED_ROUND_ROBIN` and `WorkerChoiceStrategies.FAIR_SHARE` 
strategies are targeted to heavy and long tasks. - Default: `WorkerChoiceStrategies.ROUND_ROBIN` - -- `workerChoiceStrategyOptions` (optional) - The worker choice strategy options object to use in this pool. - Properties: - - - `measurement` (optional) - The measurement to use in worker choice strategies: `runTime`, `waitTime` or `elu`. - - `runTime` (optional) - Use the tasks [median](./src/pools/selection-strategies/README.md#median) runtime instead of the tasks average runtime in worker choice strategies. - - `waitTime` (optional) - Use the tasks [median](./src/pools/selection-strategies/README.md#median) wait time instead of the tasks average wait time in worker choice strategies. - - `elu` (optional) - Use the tasks [median](./src/pools/selection-strategies/README.md#median) ELU instead of the tasks average ELU in worker choice strategies. - - `weights` (optional) - The worker weights to use in weighted round robin worker choice strategies: `{ 0: 200, 1: 300, ..., n: 100 }`. - - Default: `{ runTime: { median: false }, waitTime: { median: false }, elu: { median: false } }` - -- `restartWorkerOnError` (optional) - Restart worker on uncaught error in this pool. - Default: `true` -- `enableEvents` (optional) - Events emission enablement in this pool. - Default: `true` -- `enableTasksQueue` (optional) - Tasks queue per worker enablement in this pool. - Default: `false` - -- `tasksQueueOptions` (optional) - The worker tasks queue options object to use in this pool. - Properties: - - - `concurrency` (optional) - The maximum number of tasks that can be executed concurrently on a worker. - - Default: `{ concurrency: 1 }` - -#### `ThreadPoolOptions extends PoolOptions` - -- `workerOptions` (optional) - An object with the worker options to pass to worker. See [worker_threads](https://nodejs.org/api/worker_threads.html#worker_threads_new_worker_filename_options) for more details. 
- -#### `ClusterPoolOptions extends PoolOptions` - -- `env` (optional) - An object with the environment variables to pass to worker. See [cluster](https://nodejs.org/api/cluster.html#cluster_cluster_fork_env) for more details. - -- `settings` (optional) - An object with the cluster settings. See [cluster](https://nodejs.org/api/cluster.html#cluster_cluster_settings) for more details. - -### `class YourWorker extends ThreadWorker/ClusterWorker` - -`taskFunctions` (mandatory) The task function or task functions object `{ name_1: fn_1, ..., name_n: fn_n }` that you want to execute on the worker -`opts` (optional) An object with these properties: - -- `maxInactiveTime` (optional) - Maximum waiting time in milliseconds for tasks on newly created workers. After this time newly created workers will die. - The last active time of your worker will be updated when it terminates a task. - If `killBehavior` is set to `KillBehaviors.HARD` this value represents also the timeout for the tasks that you submit to the pool, when this timeout expires your tasks is interrupted before completion and removed. The worker is killed if is not part of the minimum size of the pool. - If `killBehavior` is set to `KillBehaviors.SOFT` your tasks have no timeout and your workers will not be terminated until your task is completed. - Default: `60000` - -- `killBehavior` (optional) - Dictates if your worker will be deleted in case a task is active on it. - **KillBehaviors.SOFT**: If `currentTime - lastActiveTime` is greater than `maxInactiveTime` but a task is still executing or queued, then the worker **won't** be deleted. - **KillBehaviors.HARD**: If `currentTime - lastActiveTime` is greater than `maxInactiveTime` but a task is still executing or queued, then the worker will be deleted. - This option only apply to the newly created workers. 
- Default: `KillBehaviors.SOFT` - -#### `YourWorker.hasTaskFunction(name)` - -`name` (mandatory) The task function name - -This method is available on both worker implementations and returns a boolean. - -#### `YourWorker.addTaskFunction(name, fn)` - -`name` (mandatory) The task function name -`fn` (mandatory) The task function - -This method is available on both worker implementations and returns a boolean. - -#### `YourWorker.removeTaskFunction(name)` - -`name` (mandatory) The task function name - -This method is available on both worker implementations and returns a boolean. - -#### `YourWorker.listTaskFunctions()` - -This method is available on both worker implementations and returns an array of the task function names. - -#### `YourWorker.setDefaultTaskFunction(name)` - -`name` (mandatory) The task function name - -This method is available on both worker implementations and returns a boolean. - -## General guidance - -Performance is one of the main target of these worker pool implementations, poolifier team wants to have a strong focus on this. -Poolifier already has a [benchmarks](./benchmarks/) folder where you can find some comparisons. - -### Internal Node.js thread pool - -Before to jump into each poolifier pool type, let highlight that **Node.js comes with a thread pool already**, the libuv thread pool where some particular tasks already run by default. -Please take a look at [which tasks run on the libuv thread pool](https://nodejs.org/en/docs/guides/dont-block-the-event-loop/#what-code-runs-on-the-worker-pool). - -**If your task runs on libuv thread pool**, you can try to: - -- Tune the libuv thread pool size setting the [UV_THREADPOOL_SIZE](https://nodejs.org/api/cli.html#cli_uv_threadpool_size_size). - -and/or - -- Use poolifier cluster pools that are spawning child processes, they will also increase the number of libuv threads since that any new child process comes with a separated libuv thread pool. 
**More threads does not mean more fast, so please tune your application**. - -### Cluster vs Threads worker pools - -**If your task does not run into libuv thread pool** and is CPU intensive then poolifier **thread pools** (_FixedThreadPool_ and _DynamicThreadPool_) are suggested to run CPU intensive tasks, you can still run I/O intensive tasks into thread pools, but performance enhancement is expected to be minimal. -Thread pools are built on top of Node.js [worker_threads](https://nodejs.org/api/worker_threads.html) module. - -**If your task does not run into libuv thread pool** and is I/O intensive then poolifier **cluster pools** (_FixedClusterPool_ and _DynamicClusterPool_) are suggested to run I/O intensive tasks, again you can still run CPU intensive tasks into cluster pools, but performance enhancement is expected to be minimal. -Consider that by default Node.js already has great performance for I/O tasks (asynchronous I/O). -Cluster pools are built on top of Node.js [cluster](https://nodejs.org/api/cluster.html) module. - -If your task contains code that runs on libuv plus code that is CPU intensive or I/O intensive you either split it either combine more strategies (i.e. tune the number of libuv threads and use cluster/thread pools). -But in general, **always profile your application**. - -### Fixed vs Dynamic pools - -To choose your pool consider first that with a _FixedThreadPool_/_FixedClusterPool_ or a _DynamicThreadPool_/_DynamicClusterPool_ your application memory footprint will increase. -By doing so, your application will be ready to execute in parallel more tasks, but during idle time your application will consume more memory. -One good choice from poolifier team point of view is to profile your application using a fixed or dynamic worker pool, and analyze your application metrics when you increase/decrease the number of workers. 
-For example you could keep the memory footprint low by choosing a _DynamicThreadPool_/_DynamicClusterPool_ with a minimum of 5 workers, and allowing it to create new workers until a maximum of 50 workers if needed. This is the advantage of using a _DynamicThreadPool_/_DynamicClusterPool_. -But in general, **always profile your application**. +For general guidelines, please refer to [this document](./docs/general-guidelines.md) ## Contribute diff --git a/docs/api-details.md b/docs/api-details.md new file mode 100644 index 00000000..fdba0f9d --- /dev/null +++ b/docs/api-details.md @@ -0,0 +1,127 @@ +## [API](https://poolifier.github.io/poolifier/) + +### `pool = new FixedThreadPool/FixedClusterPool(numberOfThreads/numberOfWorkers, filePath, opts)` + +`numberOfThreads/numberOfWorkers` (mandatory) Number of workers for this pool +`filePath` (mandatory) Path to a file with a worker implementation +`opts` (optional) An object with the pool options properties described below + +### `pool = new DynamicThreadPool/DynamicClusterPool(min, max, filePath, opts)` + +`min` (mandatory) Same as _FixedThreadPool_/_FixedClusterPool_ numberOfThreads/numberOfWorkers; this number of workers will always be active +`max` (mandatory) Max number of workers that this pool can contain; newly created workers will die after a threshold (default is 1 minute; you can override it in your worker implementation). +`filePath` (mandatory) Path to a file with a worker implementation +`opts` (optional) An object with the pool options properties described below + +### `pool.execute(data, name)` + +`data` (optional) An object that you want to pass to your worker implementation +`name` (optional) A string with the task function name that you want to execute on the worker. Default: `'default'` + +This method is available on both pool implementations and returns a promise with the task function execution response. 
+ +### `pool.destroy()` + +This method is available on both pool implementations and will call the terminate method on each worker. + +### `PoolOptions` + +An object with these properties: + +- `messageHandler` (optional) - A function that will listen for the message event on each worker +- `errorHandler` (optional) - A function that will listen for the error event on each worker +- `onlineHandler` (optional) - A function that will listen for the online event on each worker +- `exitHandler` (optional) - A function that will listen for the exit event on each worker +- `workerChoiceStrategy` (optional) - The worker choice strategy to use in this pool: + + - `WorkerChoiceStrategies.ROUND_ROBIN`: Submit tasks to workers in a round robin fashion + - `WorkerChoiceStrategies.LEAST_USED`: Submit tasks to the worker with the minimum number of executed, executing and queued tasks + - `WorkerChoiceStrategies.LEAST_BUSY`: Submit tasks to the worker with the minimum tasks total execution and wait time + - `WorkerChoiceStrategies.LEAST_ELU`: Submit tasks to the worker with the minimum event loop utilization (ELU) (experimental) + - `WorkerChoiceStrategies.WEIGHTED_ROUND_ROBIN`: Submit tasks to workers by using a [weighted round robin scheduling algorithm](../src/pools/selection-strategies/README.md#weighted-round-robin) based on tasks execution time + - `WorkerChoiceStrategies.INTERLEAVED_WEIGHTED_ROUND_ROBIN`: Submit tasks to workers by using an [interleaved weighted round robin scheduling algorithm](../src/pools/selection-strategies/README.md#interleaved-weighted-round-robin) based on tasks execution time (experimental) + - `WorkerChoiceStrategies.FAIR_SHARE`: Submit tasks to workers by using a [fair share scheduling algorithm](../src/pools/selection-strategies/README.md#fair-share) based on tasks execution time (the default) or ELU active time + + `WorkerChoiceStrategies.WEIGHTED_ROUND_ROBIN`, `WorkerChoiceStrategies.INTERLEAVED_WEIGHTED_ROUND_ROBIN` and `WorkerChoiceStrategies.FAIR_SHARE` 
strategies are targeted at heavy and long tasks. + Default: `WorkerChoiceStrategies.ROUND_ROBIN` + +- `workerChoiceStrategyOptions` (optional) - The worker choice strategy options object to use in this pool. + Properties: + + - `measurement` (optional) - The measurement to use in worker choice strategies: `runTime`, `waitTime` or `elu`. + - `runTime` (optional) - Use the tasks [median](../src/pools/selection-strategies/README.md#median) runtime instead of the tasks average runtime in worker choice strategies. + - `waitTime` (optional) - Use the tasks [median](../src/pools/selection-strategies/README.md#median) wait time instead of the tasks average wait time in worker choice strategies. + - `elu` (optional) - Use the tasks [median](../src/pools/selection-strategies/README.md#median) ELU instead of the tasks average ELU in worker choice strategies. + - `weights` (optional) - The worker weights to use in weighted round robin worker choice strategies: `{ 0: 200, 1: 300, ..., n: 100 }`. + + Default: `{ runTime: { median: false }, waitTime: { median: false }, elu: { median: false } }` + +- `restartWorkerOnError` (optional) - Restart the worker on uncaught error in this pool. + Default: `true` +- `enableEvents` (optional) - Enable events emission in this pool. + Default: `true` +- `enableTasksQueue` (optional) - Enable a per-worker tasks queue in this pool. + Default: `false` + +- `tasksQueueOptions` (optional) - The worker tasks queue options object to use in this pool. + Properties: + + - `concurrency` (optional) - The maximum number of tasks that can be executed concurrently on a worker. + + Default: `{ concurrency: 1 }` + +#### `ThreadPoolOptions extends PoolOptions` + +- `workerOptions` (optional) - An object with the worker options to pass to the worker. See [worker_threads](https://nodejs.org/api/worker_threads.html#worker_threads_new_worker_filename_options) for more details. 
+ +#### `ClusterPoolOptions extends PoolOptions` + +- `env` (optional) - An object with the environment variables to pass to the worker. See [cluster](https://nodejs.org/api/cluster.html#cluster_cluster_fork_env) for more details. + +- `settings` (optional) - An object with the cluster settings. See [cluster](https://nodejs.org/api/cluster.html#cluster_cluster_settings) for more details. + +### `class YourWorker extends ThreadWorker/ClusterWorker` + +`taskFunctions` (mandatory) The task function or task functions object `{ name_1: fn_1, ..., name_n: fn_n }` that you want to execute on the worker +`opts` (optional) An object with these properties: + +- `maxInactiveTime` (optional) - Maximum waiting time in milliseconds for tasks on newly created workers. After this time newly created workers will die. + The last active time of your worker will be updated when it completes a task. + If `killBehavior` is set to `KillBehaviors.HARD`, this value also represents the timeout for the tasks that you submit to the pool; when this timeout expires, your task is interrupted before completion and removed. The worker is killed if it is not part of the minimum size of the pool. + If `killBehavior` is set to `KillBehaviors.SOFT`, your tasks have no timeout and your workers will not be terminated until your task is completed. + Default: `60000` + +- `killBehavior` (optional) - Dictates whether your worker will be deleted when a task is active on it. + **KillBehaviors.SOFT**: If `currentTime - lastActiveTime` is greater than `maxInactiveTime` but a task is still executing or queued, then the worker **won't** be deleted. + **KillBehaviors.HARD**: If `currentTime - lastActiveTime` is greater than `maxInactiveTime` but a task is still executing or queued, then the worker will be deleted. + This option only applies to newly created workers. 
+ Default: `KillBehaviors.SOFT` + +#### `YourWorker.hasTaskFunction(name)` + +`name` (mandatory) The task function name + +This method is available on both worker implementations and returns a boolean. + +#### `YourWorker.addTaskFunction(name, fn)` + +`name` (mandatory) The task function name +`fn` (mandatory) The task function + +This method is available on both worker implementations and returns a boolean. + +#### `YourWorker.removeTaskFunction(name)` + +`name` (mandatory) The task function name + +This method is available on both worker implementations and returns a boolean. + +#### `YourWorker.listTaskFunctions()` + +This method is available on both worker implementations and returns an array of the task function names. + +#### `YourWorker.setDefaultTaskFunction(name)` + +`name` (mandatory) The task function name + +This method is available on both worker implementations and returns a boolean. diff --git a/docs/general-guidelines.md b/docs/general-guidelines.md new file mode 100644 index 00000000..c66bda4f --- /dev/null +++ b/docs/general-guidelines.md @@ -0,0 +1,36 @@ +## General Guidelines +Performance is one of the main targets of these worker pool implementations; the poolifier team wants to have a strong focus on this. +Poolifier already has a [benchmarks](../benchmarks/) folder where you can find some comparisons. + +### Internal Node.js thread pool + +Before jumping into each poolifier pool type, let's highlight that **Node.js comes with a thread pool already**, the libuv thread pool, where some particular tasks already run by default. +Please take a look at [which tasks run on the libuv thread pool](https://nodejs.org/en/docs/guides/dont-block-the-event-loop/#what-code-runs-on-the-worker-pool). + +**If your task runs on the libuv thread pool**, you can try to: + +- Tune the libuv thread pool size by setting the [UV_THREADPOOL_SIZE](https://nodejs.org/api/cli.html#cli_uv_threadpool_size_size). 
+ +and/or + +- Use poolifier cluster pools, which spawn child processes; they also increase the number of libuv threads, since every new child process comes with its own libuv thread pool. **More threads does not mean faster, so please tune your application**. + +### Cluster vs Threads worker pools + +**If your task does not run on the libuv thread pool** and is CPU intensive, then poolifier **thread pools** (_FixedThreadPool_ and _DynamicThreadPool_) are suggested. You can still run I/O intensive tasks in thread pools, but the performance improvement is expected to be minimal. +Thread pools are built on top of the Node.js [worker_threads](https://nodejs.org/api/worker_threads.html) module. + +**If your task does not run on the libuv thread pool** and is I/O intensive, then poolifier **cluster pools** (_FixedClusterPool_ and _DynamicClusterPool_) are suggested. Again, you can still run CPU intensive tasks in cluster pools, but the performance improvement is expected to be minimal. +Consider that by default Node.js already has great performance for I/O tasks (asynchronous I/O). +Cluster pools are built on top of the Node.js [cluster](https://nodejs.org/api/cluster.html) module. + +If your task contains code that runs on libuv plus code that is CPU intensive or I/O intensive, you can either split it or combine several strategies (e.g. tune the number of libuv threads and use cluster/thread pools). +But in general, **always profile your application**. + +### Fixed vs Dynamic pools + +To choose your pool, consider first that with a _FixedThreadPool_/_FixedClusterPool_ or a _DynamicThreadPool_/_DynamicClusterPool_ your application memory footprint will increase. +By doing so, your application will be ready to execute more tasks in parallel, but during idle time your application will consume more memory. 
+One good choice from the poolifier team's point of view is to profile your application using a fixed or dynamic worker pool, and analyze your application metrics when you increase/decrease the number of workers. +For example, you could keep the memory footprint low by choosing a _DynamicThreadPool_/_DynamicClusterPool_ with a minimum of 5 workers, and allowing it to create new workers up to a maximum of 50 workers if needed. This is the advantage of using a _DynamicThreadPool_/_DynamicClusterPool_. +But in general, **always profile your application**. \ No newline at end of file -- 2.34.1