Creating a Dataset
In the quick start guide we created a few basic datasets, but Dataset has a set of powerful facilities for importing data and parsing it in a variety of ways. This section explains how Dataset actually gets data and transforms it into a Dataset object you can use.
Instantiating a new Miso.Dataset
To create a new dataset from any source, you must first call
new Miso.Dataset. There are a variety of
parameters you can pass, all of which are outlined in the API Docs.
In this guide, we will outline how some of those properties can be used and the general flow Dataset goes through
to create a Dataset instance with your data.
The Dataset Workflow
When you initialize a Dataset and then call fetch, there is a series of operations that take place. It's important to understand them so that you can intercept them where needed.
Specifying your importer and parser
An importer is responsible for getting the data from a source of any kind. Some are very simple and just retrieve it from a local variable. Some are more complex and fetch it from a url, for example. To see how to create your own, read the Creating Custom Importers section below.
A parser then takes the data that the importer fetched and converts it to the format Dataset expects. All of Dataset's parsers must return the same format, which is how the data can then be converted to a Dataset object. To see how to create your own, read the Creating Custom Parsers section below.
When you initialize your Dataset instance, the appropriate properties are saved,
and Dataset will pick an importer and parser for you unless
you specify them directly as properties. See the Google
spreadsheet tutorial for an example of this.
By default, you must provide one of the following:
data - A variable (or actual data structure) representing your data.
url - A url from which the data will be fetched. If the request should be a jsonp request, also
set jsonp: true. It's worth noting that
url can also be set to a function that returns
a string url.
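Putting those options together, here is a minimal sketch of both styles (it assumes Miso and its dependencies are loaded; the url and column names are hypothetical):

```javascript
// Hypothetical sample data; any array of row objects works.
var localData = [
  { city: "Portland", visits: 100 },
  { city: "Denver",   visits: 50 }
];

// 1. From a local variable (the dataset stays empty until fetch is called):
function makeLocalDataset() {
  return new Miso.Dataset({ data: localData });
}

// 2. From a remote url; a function returning a string url also works:
function makeRemoteDataset() {
  return new Miso.Dataset({
    url: function() { return "/data/visits.json"; }  // hypothetical endpoint
  });
}
```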
Importers and parsers are entirely separate from each other, meaning you can have your data on the server
or already loaded as a variable and still use the same parser. This independence means you can mix
and match them as your application requires.
Available Importers are listed here.
Available Parsers are listed here.
Calling .fetch on your dataset instance actually starts the data preparation process.
Until you do, your dataset will not contain any data!
Even if your data is local, you need to call fetch. This is not only for consistency, but also because
you might have a mix of datasets (some of which are remote) and you want to act on the successful return
of them all.
fetch gives you a deferred you can use in that case.
Data can be fetched in one of two ways:
1. Pass success/error callbacks:
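A sketch of the callback style (the dataset `ds` and the data behind it are assumed to exist already):

```javascript
// Fetch with explicit callbacks; inside them, `this` is the dataset itself.
function fetchWithCallbacks(ds) {
  ds.fetch({
    success: function() {
      console.log("Fetched", this.length, "rows");
    },
    error: function() {
      console.log("Failed to fetch data");
    }
  });
}
```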
2. Using Deferreds - Dataset makes use of the
Underscore.Deferred extension. Calling
fetch() actually returns a deferred object that you can then
use, for example, if you have more than one dataset you need to wait on. You can use deferreds as follows:
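For instance, a sketch assuming Miso and Underscore.Deferred are loaded (the urls are hypothetical):

```javascript
function whenAllFetched() {
  var sales  = new Miso.Dataset({ url: "/data/sales.json" });
  var visits = new Miso.Dataset({ url: "/data/visits.json" });

  // fetch() returns a deferred, so _.when can wait on both requests:
  _.when(sales.fetch(), visits.fetch()).then(function() {
    console.log("Both datasets ready:", sales.length, visits.length);
  });
}
```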
Note that the error callback is especially important when fetching remote data, since the request can fail.
Extracting actual data from importer
When the data is successfully retrieved by the
importer, it is passed to the parser. At this point
you can intercept the result directly by defining an
extract method, which is useful if your data
array is actually nested somewhere within the response and you need to retrieve it. For example:
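A sketch, assuming the server wraps the rows inside a hypothetical `results.rows` property:

```javascript
// The extract function receives the raw response and returns the row array.
var extract = function(data) {
  return data.results.rows;
};

function makeUnwrappedDataset() {
  // Assumes Miso is loaded; the url is hypothetical.
  return new Miso.Dataset({
    url: "/data/wrapped.json",
    extract: extract
  });
}
```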
How data is added to Dataset
The parser takes the data in and parses it into a standard format. It then decides how the data should be put into the dataset:
- If this is a new dataset, it is simply populated with the data.
- If this is an existing dataset and new data is being added, the behavior depends on flags you can set:
→ By default, the data is added as new rows.
→ If you set
uniqueAgainst to a column name, only rows in which that column's value is unique will be added.
→ If you set the
resetOnFetch flag, on subsequent fetches the existing data will be wiped and the new data added.
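Both flags are plain constructor options. A sketch (Miso assumed loaded; the url and column name are hypothetical):

```javascript
// Only add rows whose `id` value isn't already in the dataset:
function makeDedupedDataset() {
  return new Miso.Dataset({
    url: "/data/events.json",
    uniqueAgainst: "id"
  });
}

// Wipe the existing rows on each fetch and replace them with the new data:
function makeReplacingDataset() {
  return new Miso.Dataset({
    url: "/data/events.json",
    resetOnFetch: true
  });
}
```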
Overriding Data Types
Once the data is returned from the parser, it needs to be coerced into the appropriate types.
Dataset supports the following prebuilt data types: number, string, boolean and time.
If no types were specified during instantiation, Dataset will attempt to detect the types of the data. Specifying the types is faster and more reliable. You can specify types like so:
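A sketch with hypothetical column names (time columns take a moment.js-style format string):

```javascript
function makeTypedDataset() {
  // Assumes Miso is loaded; data and column names are hypothetical.
  return new Miso.Dataset({
    data: [
      { when: "2012/01/01", amount: "120", active: "true" }
    ],
    columns: [
      { name: "when",   type: "time", format: "YYYY/MM/DD" },
      { name: "amount", type: "number" },
      { name: "active", type: "boolean" }
    ]
  });
}
```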
For more information about types, check out the data types tutorial.
Intercepting data before coercion
Sometimes your data comes in with a strange format. For example, maybe you have a column with dollar amounts in it
but you really want to treat that column as a numeric column. Without creating a custom type (which is a great
way to create reusable types), you can set a
before filter function on your column
type that will get called before the data gets coerced. For example:
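A sketch with a hypothetical `price` column; the before filter runs on each raw value before numeric coercion:

```javascript
// Strip a leading "$" so the value survives coercion to a number.
var stripDollarSign = function(v) {
  return String(v).replace(/\$/g, "");
};

function makePriceDataset() {
  // Assumes Miso is loaded; data and column names are hypothetical.
  return new Miso.Dataset({
    data: [{ item: "widget", price: "$1200" }],
    columns: [
      { name: "price", type: "number", before: stripDollarSign }
    ]
  });
}
```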
Sorting the data
Once the data has been parsed and coerced to the proper type, if a
comparator function was provided
the data will be sorted according to that function.
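A comparator is an ordinary sort function over row objects; a sketch with a hypothetical `amount` column:

```javascript
// Return a negative, zero, or positive number, as with Array.prototype.sort.
var byAmount = function(rowA, rowB) {
  return rowA.amount - rowB.amount;
};

function makeSortedDataset() {
  // Assumes Miso is loaded; rows will be kept sorted by `amount`.
  return new Miso.Dataset({
    data: [{ amount: 3 }, { amount: 1 }, { amount: 2 }],
    comparator: byAmount
  });
}
```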
Once the data has been parsed, coerced and sorted (if need be), it is ready to be used. You can
specify a
ready callback when creating your dataset,
and it will fire before the callback
passed to fetch is called.
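A sketch (Miso assumed loaded; the url is hypothetical):

```javascript
function makeDatasetWithReady() {
  return new Miso.Dataset({
    url: "/data/metrics.json",
    ready: function() {
      // Fires once the data is parsed, coerced and sorted,
      // before fetch's own success callback runs.
      console.log("ready:", this.length, "rows");
    }
  });
}
```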
Creating Custom Importers
You may have noticed how easy it is to set a custom importer and parser in the dataset constructor by specifying
the importer and parser properties. The import system can also easily be extended
for custom data formats and other APIs.
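The built-in importers share a small contract: a constructor that stores its options, plus a fetch method that hands the raw data to options.success (or an error to options.error). A sketch of a hypothetical importer that reads from a plain object:

```javascript
// Hypothetical importer: fetches raw data from a key on a plain object.
var ObjectImporter = function(options) {
  this.store = options.store;
  this.key   = options.key;
};

ObjectImporter.prototype.fetch = function(options) {
  if (this.store && this.key in this.store) {
    options.success(this.store[this.key]);
  } else {
    options.error("No data under key: " + this.key);
  }
};

// Usage sketch (assumes Miso is loaded):
// new Miso.Dataset({ importer: ObjectImporter, store: rawData, key: "rows" });
```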
For examples of the available importers, see the github repo directory containing the available importers.
Creating Custom Parsers
More likely than not, your data might be in a format that requires some custom parsing. The easiest way to handle that is to create a parser of your own. To see an example of a custom parser, check out the Github Example.
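A parser exposes a parse method that returns the column names along with a hash mapping each column name to its array of values. A sketch for a hypothetical pipe-delimited text format:

```javascript
// Hypothetical parser for pipe-delimited text, e.g. "a|b\n1|2\n3|4".
var PipeParser = function(options) {
  this.options = options || {};
};

PipeParser.prototype.parse = function(data) {
  var lines   = data.trim().split("\n");
  var columns = lines[0].split("|");     // first line holds the column names
  var out     = {};

  columns.forEach(function(name) { out[name] = []; });

  lines.slice(1).forEach(function(line) {
    line.split("|").forEach(function(value, i) {
      out[columns[i]].push(value);
    });
  });

  return { columns: columns, data: out };
};

// Usage sketch (assumes Miso is loaded):
// new Miso.Dataset({ data: "a|b\n1|2", parser: PipeParser });
```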