Creating a Dataset

In the quick start guide we created a few basic datasets, but Dataset has a set of powerful facilities for importing and parsing data in a variety of ways. This section explains how Dataset actually gets data and transforms it into a Dataset object you can use.

Instantiating a new Miso.Dataset

To create a new dataset from any source, you must first call new Miso.Dataset. There are a variety of parameters you can pass, all of which are outlined in the API Docs. In this guide, we outline how some of those properties can be used and the general flow Dataset goes through to create a Dataset instance with your data.

The Dataset Workflow

When you initialize a Dataset and then call fetch, a series of operations takes place. It's important to understand them so that you can intercept them where needed.

Specifying your importer and parser

An importer is responsible for getting the data from a source of any kind. Some are very simple and just retrieve it from a local variable; others are more complex and fetch it from a url, for example. To see how to create your own, read the Creating Custom Importers section below.

A parser then takes the data the importer fetched and converts it into the format Dataset expects. All of Dataset's parsers must return the same format, which is how the result can then be converted to a Dataset object. To see how to create your own, read the Creating Custom Parsers section below.

When you initialize your Dataset instance, the appropriate properties are saved and Dataset will pick an importer and parser for you unless you specify them directly as properties. See the Google spreadsheet tutorial for an example of this.

By default, you must provide one of the following:

1. data - A variable (or an actual data structure) representing your data.

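For example, a minimal sketch with inline data (the column names and values are made up for illustration):

    var ds = new Miso.Dataset({
      // each object is a row; its keys become column names
      data: [
        { one: 1, two: 4, three: 7 },
        { one: 2, two: 5, three: 8 }
      ]
    });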

2. url - A url from which the data will be fetched. If the request should be a jsonp request, also set jsonp: true. It's worth noting that url can also be set to a function that returns a string url.

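For example, a sketch of the url variations (the urls themselves are placeholders):

    // plain remote request
    var remote = new Miso.Dataset({
      url: "http://example.com/data.json"
    });

    // jsonp request
    var jsonpDs = new Miso.Dataset({
      url: "http://example.com/data.json",
      jsonp: true
    });

    // url can also be a function that returns a string url
    var dynamic = new Miso.Dataset({
      url: function() {
        return "http://example.com/data-" + Date.now() + ".json";
      }
    });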

Importers and parsers are entirely separate from each other: your data can live on a server or already be loaded in a local variable and still go through the same parser. Because they are independent, you can mix and match them as your application requires.
Available Importers are listed here.
Available Parsers are listed here.
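For example, a sketch of specifying both explicitly, along the lines of the Google spreadsheet tutorial (the spreadsheet key is a placeholder, and the Miso.Dataset.Importers / Miso.Dataset.Parsers namespaces are assumed as in the API docs):

    var ds = new Miso.Dataset({
      importer: Miso.Dataset.Importers.GoogleSpreadsheet,
      parser: Miso.Dataset.Parsers.GoogleSpreadsheet,
      key: "your-spreadsheet-key",
      worksheet: "1"
    });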

Calling fetch

Calling .fetch on your dataset instance actually starts the data preparation process. Until you do, your dataset will not contain any data!

Even if your data is local, you need to call fetch. This is not only for consistency, but also because you might have a mix of datasets (some of which are remote) and want to act on the successful return of them all. fetch gives you a deferred you can use in that case.

Data can be fetched in one of two ways:

1. Pass success/error callbacks:

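A sketch of the callback form (inside the callbacks, this is the dataset instance):

    ds.fetch({
      success: function() {
        // the dataset is now populated
        console.log(this.columnNames());
      },
      error: function() {
        console.log("Could not fetch the data.");
      }
    });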

2. Using Deferreds - Dataset makes use of the Underscore.Deferred extension. Calling fetch() actually returns a deferred object that you can then use, for example, if you have more than one dataset you need to wait on. You can use deferreds as follows:

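For example, a sketch of waiting on two datasets at once (the urls are placeholders):

    var ds1 = new Miso.Dataset({ url: "/data/first.json" });
    var ds2 = new Miso.Dataset({ url: "/data/second.json" });

    // _.when comes from the Underscore.Deferred extension
    _.when(ds1.fetch(), ds2.fetch()).then(function() {
      // both datasets have been fetched and are ready here
    });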

Passing an error callback is important when fetching remote data.

Extracting actual data from importer

When the data is successfully retrieved by the importer, it is passed to the parser. At this point you can intercept the result directly by defining an extract method. This is useful if your data array is actually nested somewhere within the response and you need to retrieve it. For example:

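A sketch, assuming a response in which the rows live under a results property:

    var ds = new Miso.Dataset({
      url: "http://example.com/data.json",
      // given a response like { metadata: {...}, results: [...] },
      // return just the array of rows
      extract: function(data) {
        return data.results;
      }
    });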

How data is added to Dataset

The parser takes the data in and parses it into a standard format. It then decides how the data should be put into the dataset:

  • If this is a new dataset - it just populates it with the data.
  • If this is an existing dataset and new data is being added, it decides based on flags you can set (see the sketch after this list).
    → By default, the data is added as new rows.
    → If you set uniqueAgainst to a column name, only rows in which that column's value is unique will be added.
    → If you set the resetOnFetch flag, on subsequent fetches, the data will be wiped and the new data will be added.
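For example, a sketch of both flags as constructor options (the column name and urls are illustrative):

    // only add rows whose value in the "id" column is not already present
    var unique = new Miso.Dataset({
      url: "/data/rows.json",
      uniqueAgainst: "id"
    });

    // wipe the existing data on every subsequent fetch
    var resetting = new Miso.Dataset({
      url: "/data/rows.json",
      resetOnFetch: true
    });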

Overriding Data Types

Once the data is returned from the parser, it needs to be coerced into the appropriate types.

Dataset supports the following prebuilt data types: number, string, boolean and time.

If no types were specified during instantiation, Dataset will attempt to detect the types of the data. Specifying the types is faster and more reliable. You can specify types like so:

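For example, a sketch with a number column and a time column (the data is made up; the time format string follows moment.js conventions):

    var ds = new Miso.Dataset({
      data: [
        { total: "34", date: "2012/01/18" },
        { total: "92", date: "2012/01/19" }
      ],
      columns: [
        { name: "total", type: "number" },
        { name: "date", type: "time", format: "YYYY/MM/DD" }
      ]
    });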

For more information about types, check out the data types tutorial.

Intercepting data before coercion

Sometimes your data comes in with a strange format. For example, maybe you have a column with dollar amounts in it, but you really want to treat it as a numeric column. Without creating a custom type (which is a great way to create reusable types), you can set a before filter function on your column that will get called before the data gets coerced. For example:

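A sketch of stripping a dollar sign before numeric coercion (the column name and data are illustrative):

    var ds = new Miso.Dataset({
      data: [
        { price: "$12.00" },
        { price: "$14.50" }
      ],
      columns: [
        {
          name: "price",
          type: "number",
          // called before coercion; return the cleaned-up value
          before: function(v) {
            return v.replace("$", "");
          }
        }
      ]
    });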

Sorting the data

Once the data has been parsed and coerced to the proper type, if a comparator function was provided, the data will be sorted according to that function.

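A sketch sorting by one column in ascending order (the comparator receives two rows; the column name is illustrative):

    var ds = new Miso.Dataset({
      data: [ { one: 4 }, { one: 1 }, { one: 3 } ],
      comparator: function(rowA, rowB) {
        if (rowA.one > rowB.one) { return 1; }
        if (rowA.one < rowB.one) { return -1; }
        return 0;
      }
    });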

Ready callback

Once the data has been parsed, coerced, and sorted (if need be), it is ready to be used. You can specify a ready callback when creating your dataset; it will fire before the success callback passed to fetch.

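A sketch of a ready callback (inside it, this is the dataset; the url is a placeholder):

    var ds = new Miso.Dataset({
      url: "/data/rows.json",
      ready: function() {
        // runs after parsing, coercion and sorting,
        // but before fetch's success callback
        console.log(this.columnNames());
      }
    });

    ds.fetch();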

Creating Custom Importers

You may have noticed how easy it is to set a custom importer and parser in the dataset constructor by specifying the importer and parser properties. The import system can also easily be extended for custom data formats and other APIs.

An importer must implement the following interface:
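A minimal sketch of that interface, modeled on the importers in the repository (the name here is hypothetical; the required piece is a fetch method that takes success and error callbacks):

    var MyCustomImporter = function(options) {
      // store whatever options your importer needs
      this.options = options || {};
    };

    _.extend(MyCustomImporter.prototype, {
      fetch: function(options) {
        // retrieve your data however you like, then hand the raw
        // result on to the parser via the success callback...
        var data = {};
        options.success(data);
        // ...or report failure: options.error(someError);
      }
    });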

For examples of the available importers, see the github repo directory containing the available importers.

Creating Custom Parsers

More likely than not, your data is in a format that requires some custom parsing. The easiest way to handle that is to create a parser of your own. To see an example of a custom parser, check out the Github Example.

A custom parser must follow the following structure:
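A minimal sketch, assuming the standard format of column names plus data keyed by column (the name, columns, and values are made up for illustration):

    var MyCustomParser = function(options) {
      // store whatever options your parser needs
      this.options = options || {};
    };

    _.extend(MyCustomParser.prototype, {
      parse: function(data) {
        // convert the raw data into the format Dataset expects:
        // a list of column names, and the values keyed by column
        return {
          columns: ["name", "score"],
          data: {
            name: ["alpha", "beta"],
            score: [1, 2]
          }
        };
      }
    });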
