## Derived Dataset

Another common operation is to transform a dataset into another dataset according to some method, such as grouping rows by some criteria. When a dataset undergoes a transformation like that, we call the result a derivative. Dataset currently offers only a few basic derivatives, with plans to add more as they are needed; this section covers groupBy, countBy, and movingAverage.

### GroupBy

A groupBy operation involves splitting the data into groups based on a specific column, applying a function to the rows in each group and combining the results into a single dataset.

For example, when grouping the following dataset by the "state" column:

state | value |
---|---|
MA | 130000 |
MA | 420 |
AZ | 2900 |
AZ | 4 |

The call:

    ds.groupBy("state", ["value"]);

results in:

state | value |
---|---|
MA | 130420 |
AZ | 2904 |

By default, groupBy sums the values of the rows in each group, but you can pass any other method as an options argument.
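To make the split, apply, and combine steps concrete, here is a minimal standalone sketch in plain JavaScript. It is not the library's implementation: the `groupRows` and `sumValues` helpers, the array-of-objects representation, and the way the method is passed are all assumptions made for illustration.

```javascript
// Illustrative stand-in for a dataset: a plain array of row objects.
const stateRows = [
  { state: "MA", value: 130000 },
  { state: "MA", value: 420 },
  { state: "AZ", value: 2900 },
  { state: "AZ", value: 4 },
];

// Default aggregation used in this sketch: add the values up.
const sumValues = (values) => values.reduce((total, v) => total + v, 0);

// Hypothetical helper, not part of the library.
function groupRows(rows, byColumn, columns, method = sumValues) {
  // Split: collect the rows under each distinct value of byColumn.
  const groups = new Map();
  for (const row of rows) {
    const key = row[byColumn];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  // Apply and combine: one output row per group, in first-seen order.
  return Array.from(groups, ([key, members]) => {
    const out = { [byColumn]: key };
    for (const column of columns) {
      out[column] = method(members.map((r) => r[column]));
    }
    return out;
  });
}

// Default method (sum): MA becomes 130420 and AZ becomes 2904.
const summedByState = groupRows(stateRows, "state", ["value"]);

// A different method: keep the largest value in each group instead.
const maxByState = groupRows(stateRows, "state", ["value"], (values) =>
  Math.max(...values)
);
```

Swapping the aggregation only changes the apply step; the grouping itself stays the same.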

### CountBy

A countBy operation involves counting the number of occurrences of each unique value in a specified column. Those counts are then set as another column called "count".

For example, when counting the following dataset by the "state" column:

state | value |
---|---|
MA | 130000 |
MA | 420 |
AZ | 2900 |
AZ | 4 |

The call:

    ds.countBy("state");

results in:

state | count |
---|---|
MA | 2 |
AZ | 2 |
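The counting step can be sketched in a few lines of plain JavaScript. This is only an illustration under assumptions: `countRows` is a hypothetical helper, and the dataset is modeled as an array of row objects rather than a real Dataset instance.

```javascript
// Illustrative stand-in for a dataset: a plain array of row objects.
const salesRows = [
  { state: "MA", value: 130000 },
  { state: "MA", value: 420 },
  { state: "AZ", value: 2900 },
  { state: "AZ", value: 4 },
];

// Hypothetical helper, not part of the library.
function countRows(rows, byColumn) {
  // Tally how many rows carry each distinct value of byColumn.
  const counts = new Map();
  for (const row of rows) {
    const key = row[byColumn];
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Emit one row per distinct value, with the tally in a "count" column.
  return Array.from(counts, ([key, count]) => ({ [byColumn]: key, count }));
}

// MA and AZ each appear twice in the input.
const stateCounts = countRows(salesRows, "state");
```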

### Moving Average

A moving average of size N is a new sequence computed by taking the mean (or applying any other method) over each subsequence of N consecutive terms. For example, consider the moving average with a window size of 3 of the following dataset:

key | value |
---|---|
A | 130000 |
B | 420 |
C | 1000 |
D | 200 |
E | 2900 |
F | 4 |

Calling it like so:

    ds.movingAverage("value");

will result in the following table:

key | value | explanation (not part of the result) |
---|---|---|
C | 43806 | (130000 + 420 + 1000)/3 |
D | 540 | (420 + 1000 + 200)/3 |
E | 1366 | (1000 + 200 + 2900)/3 |
F | 1034 | (200 + 2900 + 4)/3 |

Note that you can also specify multiple columns like so:

    ds.movingAverage(["A", "B", "C"]);

And an alternate method like so:

    ds.movingAverage(["A", "B", "C"], { method : _.sum });
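The windowing itself can be sketched in plain JavaScript as follows. This is an illustration only, not the library's code: `slidingWindow` and `meanOf` are hypothetical names, the default window size of 3 is an assumption taken from the example above, and each output row is keyed by the last row in its window, as in the result table.

```javascript
// Illustrative stand-in for a dataset: a plain array of row objects.
const seriesRows = [
  { key: "A", value: 130000 },
  { key: "B", value: 420 },
  { key: "C", value: 1000 },
  { key: "D", value: 200 },
  { key: "E", value: 2900 },
  { key: "F", value: 4 },
];

// Default method used in this sketch: the arithmetic mean.
const meanOf = (values) =>
  values.reduce((total, v) => total + v, 0) / values.length;

// Hypothetical helper, not part of the library.
function slidingWindow(rows, column, size = 3, method = meanOf) {
  const result = [];
  // Slide a window of `size` rows over the data, one row at a time.
  for (let end = size; end <= rows.length; end++) {
    const windowRows = rows.slice(end - size, end);
    const last = windowRows[windowRows.length - 1];
    result.push({
      key: last.key, // label the window by its final row
      [column]: method(windowRows.map((r) => r[column])),
    });
  }
  return result;
}

// Six input rows and a window of 3 give four output rows (C, D, E, F).
const averagedRows = slidingWindow(seriesRows, "value");

// Same windows, but summing instead of averaging.
const windowSums = slidingWindow(seriesRows, "value", 3, (values) =>
  values.reduce((total, v) => total + v, 0)
);
```

This sketch does not round its results, so the window ending at C comes out as a fraction where the table above shows a whole number.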

### Syncing Behavior

If you are creating a derived dataset from a dataset that is syncable, you can
subscribe to the derived dataset's `change` event.

Because of the inherent nature of a derived dataset, even the smallest change in your original data can cause many changes in your derived dataset. At the moment, those changes are not captured as a set of deltas. Instead, the entire derived dataset is recomputed. This is an expensive operation, but it reduces the code complexity substantially. We are open to discussing a better way of handling this situation, but for now, this works.
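The recompute-on-change approach can be pictured with a small standalone sketch. None of the names below come from the library: `createSource`, `createDerived`, and `totalsByState` are hypothetical, and the hand-rolled listener list merely stands in for a real event system.

```javascript
// Hypothetical observable source: holds rows and notifies listeners on change.
function createSource(initialRows) {
  let rows = initialRows.slice();
  const listeners = [];
  return {
    rows: () => rows,
    onChange: (listener) => listeners.push(listener),
    add(row) {
      rows = rows.concat([row]);
      listeners.forEach((listener) => listener());
    },
  };
}

// Hypothetical derived value: recomputed from scratch on every source change,
// rather than patched with a set of deltas.
function createDerived(source, derive) {
  const derived = { value: derive(source.rows()), recomputes: 0 };
  source.onChange(() => {
    derived.value = derive(source.rows());
    derived.recomputes += 1;
  });
  return derived;
}

// Example derivation: total value per state.
const totalsByState = (rows) =>
  rows.reduce((totals, row) => {
    totals[row.state] = (totals[row.state] || 0) + row.value;
    return totals;
  }, {});

const syncedSource = createSource([
  { state: "MA", value: 130000 },
  { state: "AZ", value: 2900 },
]);
const derivedTotals = createDerived(syncedSource, totalsByState);

// Adding one row triggers one full recomputation of the derived totals.
syncedSource.add({ state: "MA", value: 420 });
```

The trade-off matches the one described above: the derivation function is reused unchanged and there is no delta bookkeeping, at the cost of redoing the whole computation for every change.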