diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..12f2dbf --- /dev/null +++ b/.gitignore @@ -0,0 +1,5 @@ +*~ +/bazel-* +**/__pycache__ +**/.ipynb_checkpoints/ +**/*.swp diff --git a/AUTHORS b/AUTHORS new file mode 100644 index 0000000..9a7328b --- /dev/null +++ b/AUTHORS @@ -0,0 +1,9 @@ +# This is the official list of Bazel rules_closure authors for copyright purposes. +# This file is distinct from the CONTRIBUTORS file. + +# See the latter for an explanation. +# Names should be added to this file as: +# Name or Organization +# The email address is not required for organizations. + +Google Inc. diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..2827b7d --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,27 @@ +Want to contribute? Great! First, read this page (including the small print at the end). + +### Before you contribute +Before we can use your code, you must sign the +[Google Individual Contributor License Agreement] +(https://cla.developers.google.com/about/google-individual) +(CLA), which you can do online. The CLA is necessary mainly because you own the +copyright to your changes, even after your contribution becomes part of our +codebase, so we need your permission to use and distribute your code. We also +need to be sure of various other things, for instance that you'll tell us if you +know that your code infringes on other people's patents. You don't have to sign +the CLA until after you've submitted your code for review and a member has +approved it, but you must do it before we can put your code into our codebase. +Before you start working on a larger contribution, you should get in touch with +us first through the issue tracker with your idea so that we can help out and +possibly guide you. Coordinating up front makes it much easier to avoid +frustration later on. + +### Code reviews +All submissions, including submissions by project members, require review. We +use GitHub pull requests for this purpose.
+ +### The small print +Contributions made by corporations are covered by a different agreement than +the one above, the +[Software Grant and Corporate Contributor License Agreement] +(https://cla.developers.google.com/about/google-corporate). diff --git a/CONTRIBUTORS b/CONTRIBUTORS new file mode 100644 index 0000000..f605467 --- /dev/null +++ b/CONTRIBUTORS @@ -0,0 +1,14 @@ +# People who have agreed to one of the CLAs and can contribute patches. +# The AUTHORS file lists the copyright holders; this file +# lists people. For example, Google employees are listed here +# but not in AUTHORS, because Google holds the copyright. +# +# https://developers.google.com/open-source/cla/individual +# https://developers.google.com/open-source/cla/corporate +# +# Names should be added to this file as: +# Name + +James Wexler +Jimbo Wilson +Justine Tunney diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..d645695 --- /dev/null +++ b/LICENSE @@ -0,0 +1,202 @@ + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. 
+ + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. diff --git a/README.md b/README.md new file mode 100644 index 0000000..c448439 --- /dev/null +++ b/README.md @@ -0,0 +1,59 @@ +# Introduction + +The Facets project contains two visualizations for understanding and analyzing machine learning datasets: Facets Overview and Facets Dive. + +The visualizations are implemented as [Polymer](https://www.polymer-project.org) web components backed by [TypeScript](https://www.typescriptlang.org) code, and can be easily embedded into Jupyter notebooks or webpages. + +## Facets Overview + +![Overview visualization of UCI census data](/img/overview-census.png "Overview visualization of UCI census data - Lichman, M. (2013). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml/datasets/Census+Income]. Irvine, CA: University of California, School of Information and Computer Science") + +Overview gives a high-level view of one or more datasets. It produces a visual feature-by-feature statistical analysis, and can also be used to compare statistics across two or more datasets. The tool can process both numeric and string features, including multiple instances of a number or string per feature.
+ +Overview can help uncover issues with datasets, including the following: + +* Unexpected feature values +* Missing feature values for a large number of examples +* Training/serving skew +* Training/test/validation set skew + +Key aspects of the visualization are outlier detection and distribution comparison across multiple datasets. +Interesting values (such as a high proportion of missing data, or very different distributions of a feature across multiple datasets) are highlighted in red. +Features can be sorted by values of interest such as the number of missing values or the skew between the different datasets. + +Details about Overview usage can be found in its [README](./facets_overview/README.md). + +## Facets Dive + +![Dive visualization of UCI census data](/img/dive-census.png "Dive visualization of UCI census data - Lichman, M. (2013). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml/datasets/Census+Income]. Irvine, CA: University of California, School of Information and Computer Science") + +Dive is a tool for interactively exploring up to tens of thousands of multidimensional data points, allowing users to seamlessly switch between a high-level overview and low-level details. +Each example is represented as a single item in the visualization, and the points can be positioned by faceting/bucketing in multiple dimensions by their feature values. Combining smooth animation and zooming with faceting and filtering, Dive makes it easy to spot patterns and outliers in complex datasets. + +Details about Dive usage can be found in its [README](./facets_dive/README.md). + +# Setup + +## Installation +``` +git clone https://github.com/pairml/facets +cd facets +``` + +## Enabling Usage in Jupyter Notebooks + +Pre-built versions of the visualization code can be found in the facets-dist directory. + +To enable use of these visualizations in Jupyter notebooks: +1. Install the Jupyter Notebook software: http://jupyter.org/install.html +2.
Install the visualizations into Jupyter as an nbextension: ```jupyter nbextension install facets-dist/ --user``` (run from the facets top-level directory). + +## Building the Visualizations + +If you make code changes to the visualization and would like to rebuild them for use in Jupyter notebooks, follow these directions: +1. Install bazel: https://bazel.build/ +2. Build the visualizations: ```bazel build facets:facets_jupyter``` +3. Move the resulting vulcanized html file into the facets-dist directory. +4. Reinstall the facets-dist jupyter extension as in the previous section + +**Disclaimer: This is not an official Google product** diff --git a/WORKSPACE b/WORKSPACE new file mode 100644 index 0000000..f4db04c --- /dev/null +++ b/WORKSPACE @@ -0,0 +1,103 @@ +workspace(name = "ai_google_pair_facets") + +http_archive( + name = "io_bazel_rules_closure", + sha256 = "bc41b80486413aaa551860fc37471dbc0666e1dbb5236fb6177cb83b0c105846", + strip_prefix = "rules_closure-dec425a4ff3faf09a56c85d082e4eed05d8ce38f", + urls = [ + "http://mirror.bazel.build/github.com/bazelbuild/rules_closure/archive/dec425a4ff3faf09a56c85d082e4eed05d8ce38f.tar.gz", # 2017-06-02 + "https://github.com/bazelbuild/rules_closure/archive/dec425a4ff3faf09a56c85d082e4eed05d8ce38f.tar.gz", + ], +) + +load("@io_bazel_rules_closure//closure:defs.bzl", "closure_repositories") +load("@io_bazel_rules_closure//closure:defs.bzl", "web_library_external") +load("@io_bazel_rules_closure//closure:defs.bzl", "filegroup_external") + +closure_repositories() + +http_archive( + name = "org_tensorflow_tensorboard", + sha256 = "b793efe5536b06debcfadfa9ce7e774cadf654e5e9d52f6570ac11060d62e3a7", + strip_prefix = "tensorboard-7b3c93ca9b6aea715cc349dc10fb151c11c70e01", + urls = [ + "http://mirror.bazel.build/github.com/tensorflow/tensorboard/archive/7b3c93ca9b6aea715cc349dc10fb151c11c70e01.tar.gz", # 2017-06-14 + "https://github.com/tensorflow/tensorboard/archive/7b3c93ca9b6aea715cc349dc10fb151c11c70e01.tar.gz", + ], +) + 
+load("@org_tensorflow_tensorboard//third_party:workspace.bzl", "tensorboard_workspace") + +tensorboard_workspace() + +web_library_external( + name = "org_polymer_paper_card", + srcs = ["paper-card.html"], + licenses = ["notice"], # BSD-3-Clause + path = "/paper-card", + sha256 = "daf6f5326501f74811c2e10ca4ca8d2a42613e88f3ac64e218e6a3cf4cc1dac2", + strip_prefix = "paper-card-2.0.0", + urls = [ + "http://mirror.bazel.build/github.com/PolymerElements/paper-card/archive/v2.0.0.tar.gz", + "https://github.com/PolymerElements/paper-card/archive/v2.0.0.tar.gz", + ], + deps = [ + "@org_polymer", + "@org_polymer_iron_flex_layout", + "@org_polymer_iron_image", + "@org_polymer_paper_material_styles", + "@org_polymer_paper_styles", + ], +) + +web_library_external( + name = "org_polymer_paper_material_styles", + srcs = [ + "element-styles/paper-material-styles.html", + ], + licenses = ["notice"], # BSD-3-Clause + path = "/paper-styles", + sha256 = "abd39f4546cf11983ae70a2bb69cbb2af12918874a5fe7d803e447eea77520d6", + strip_prefix = "paper-styles-2.0.0", + urls = [ + "http://mirror.bazel.build/github.com/PolymerElements/paper-styles/archive/v2.0.0.tar.gz", + "https://github.com/PolymerElements/paper-styles/archive/v2.0.0.tar.gz", + ], + deps = [ + "@org_polymer", + "@org_polymer_paper_styles", + ], +) + +web_library_external( + name = "org_polymer_iron_image", + srcs = ["iron-image.html"], + licenses = ["notice"], # BSD-3-Clause + path = "/iron-image", + sha256 = "40c7b2ec941e29a1721c6fb19d6de69308c50a960a3c3319faf2447eed0d4d88", + strip_prefix = "iron-image-2.0.0", + urls = [ + "http://mirror.bazel.build/github.com/PolymerElements/iron-image/archive/v2.0.0.tar.gz", + "https://github.com/PolymerElements/iron-image/archive/v2.0.0.tar.gz", + ], + deps = [ + "@org_polymer", + ], +) + +web_library_external( + name = "org_polymer_iron_validator_behavior", + srcs = ["iron-validator-behavior.html"], + licenses = ["notice"], # BSD-3-Clause + path = "/iron-validator-behavior", + sha256 = 
"0956488f849c0528d66d5ce28bbfb66e163a7990df2cc5f157a5bf34dcb7dfd2", + strip_prefix = "iron-validator-behavior-1.0.2", + urls = [ + "http://mirror.bazel.build/github.com/PolymerElements/iron-validator-behavior/archive/v1.0.2.tar.gz", + "https://github.com/PolymerElements/iron-validator-behavior/archive/v1.0.2.tar.gz", + ], + deps = [ + "@org_polymer", + "@org_polymer_iron_meta", + ], +) diff --git a/facets-dist/facets-jupyter.html b/facets-dist/facets-jupyter.html new file mode 100755 index 0000000..77e020f --- /dev/null +++ b/facets-dist/facets-jupyter.html @@ -0,0 +1,8714 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/facets/BUILD b/facets/BUILD new file mode 100644 index 0000000..eb239f8 --- /dev/null +++ b/facets/BUILD @@ -0,0 +1,37 @@ +package(default_visibility = ["//visibility:public"]) + +load("@org_tensorflow_tensorboard//tensorboard/defs:web.bzl", "ts_web_library") +load("@org_tensorflow_tensorboard//tensorboard/defs:vulcanize.bzl", "tensorboard_html_binary") + +licenses(["notice"]) # Apache 2.0 + +ts_web_library( + name = "visualizations", + srcs = [ + "visualizations.html", + ], + path = "/facets", + deps = [ + "//facets_overview/components/facets_overview", + "//facets_dive/components/facets_dive", + ], +) + +tensorboard_html_binary( + name = "facets", + input_path = "/facets/visualizations.html", + output_path = "/all/visualizations.html", + deps = [":visualizations"], +) + +# Add javascript to undefine the define function when building the vulcanized +# visualizations. This is to avoid issues with require.js dependency loading +# when using the visualizations inside of a Jupyter notebook. 
+# TODO(jwexler): Figure out a cleaner way to get vulcanized visualizations that +# work in Jupyter notebooks. +genrule( + name = "facets_jupyter", + srcs = [":facets"], + outs = ["facets-jupyter.html"], + cmd = "sed 's|||' $(location :facets) > $@" +) \ No newline at end of file diff --git a/facets/visualizations.html b/facets/visualizations.html new file mode 100644 index 0000000..be66112 --- /dev/null +++ b/facets/visualizations.html @@ -0,0 +1,2 @@ + + \ No newline at end of file diff --git a/facets_dive/Dive_demo.ipynb b/facets_dive/Dive_demo.ipynb new file mode 100644 index 0000000..e2a04fa --- /dev/null +++ b/facets_dive/Dive_demo.ipynb @@ -0,0 +1,86 @@ +{ + "cells": [ + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [ + "# Load UCI census and convert to json for sending to the visualization\n", + "import pandas as pd\n", + "features = [\"Age\", \"Workclass\", \"fnlwgt\", \"Education\", \"Education-Num\", \"Martial Status\",\n", + " \"Occupation\", \"Relationship\", \"Race\", \"Sex\", \"Capital Gain\", \"Capital Loss\",\n", + " \"Hours per week\", \"Country\", \"Target\"]\n", + "jsonstr = pd.read_csv(\n", + " \"https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test\",\n", + " names=features,\n", + " sep=r'\\s*,\\s*',\n", + " engine='python',\n", + " skiprows=[0],\n", + " na_values=\"?\").to_json(orient='records')" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "scrolled": false + }, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "# Display the Dive visualization for this data\n", + "from IPython.core.display import display, HTML\n", + "\n", + "HTML_TEMPLATE = \"\"\"\n", + " \n", + " \"\"\"\n", + "html = HTML_TEMPLATE.format(jsonstr=jsonstr)\n", + "display(HTML(html))" + ] + } + ], + "metadata": { + 
"kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.4.3" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/facets_dive/README.md b/facets_dive/README.md new file mode 100644 index 0000000..c830160 --- /dev/null +++ b/facets_dive/README.md @@ -0,0 +1,307 @@ + +Facets Dive is a data visualization for interactively exploring large numbers of records at once—many thousands at a time. +Each record should be an object with key/value pairs representing the features of that record, and the values should be strings or numbers. + +## Getting Started + +In this section, you'll learn how to use Dive embedded in your on page or app. +The two things you need are your own data, and the Dive Polymer element. + +### Providing Data to Dive + +The `` element has many attributes you can set to customize its behavior, but the only one you absolutely must set is `data`. +This should be an array of JavaScript objects, where each object represents a single record. + +For example, say your data is a list of food items. +Each food has a unique name, belongs to a category, and provides calories. +As JSON, your data would look something like this: + +```js +[{ + "name": "apple", + "category": "fruit", + "calories": 95 +},{ + "name": "broccoli", + "category": "vegetable", + "calories": 50 +},{ + ...Many more foods... +}] +``` + +The objects don't all need to have exactly the same set of keys. +If an object is missing keys that are present in another object, that record will still be shown in Dive. + +At this time, Dive only handles numeric and string values. +If the values on your objects are complex (like arrays, or nested objects), these will be cast as strings prior to being visualized. 
+ +### Providing Sprites For Dive to Render + +By default, Dive will render text onto a circle to represent each data point. +However, you can supply a sprite atlas that it can use instead. + +A sprite atlas is one big image containing many tiny images at predictable coordinates. +Starting from the top-left hand corner of the image, sprites proceed across and down, from left-to-right and from top-to-bottom. + +For example, consider a data set with 10,000 data points. +Indexed from zero, they'd be arranged in a sprite atlas like so: + +``` ++---------+---------+---------+- - - - -+---------+ +| | | | | | +| 0 | 1 | 2 | ... | 99 | +| | | | | | ++---------+---------+---------+- - - - -+---------+ +| | | | | | +| 100 | 101 | 102 | ... | 199 | +| | | | | | ++---------+---------+---------+- - - - -+---------+ +| | | | | | +| 200 | 201 | 202 | ... | 299 | +| | | | | | ++---------+---------+---------+- - - - -+---------+ +| | | | | | + . . . . . +| . | . | . | . | . | + . . . . . +| | | | | | ++---------+---------+---------+- - - - -+---------+ +| | | | | | +| 9900 | 9901 | 9902 | ... | 9999 | +| | | | | | ++---------+---------+---------+- - - - -+---------+ +``` + +To specify the URL to an atlas to use, set the `atlasUrl` property of the Dive Polymer Element in JavaScript (or the `atlas-url` attribute in HTML). +If the atlas image is served from a different domain than the visualization, it will have to use [CORS headers](https://developer.mozilla.org/en-US/docs/Web/API/WebGL_API/Tutorial/Using_textures_in_WebGL#Cross-domain_textures) to be useful. +In that case, you'll also have to set the `crossOrigin` property (or `cross-origin` HTML attribute) to be either `anonymous` or `use-credentials` just like you would for an `` or `