 
  
  
  
    A New Way To Create and View Datasets & Collections!
    
  
Dataset Overview Page (Aug. '23) | Dataset Files Table (Dec. '23)
Dataverse Frontend is currently in beta and under active development. While it offers exciting new features, please note that it may not be stable for production use. We recommend sticking to the latest stable Dataverse release for mission-critical applications. If you choose to use this repository in production, be prepared for potential bugs and breaking changes. Always check the official documentation and release notes for updates and proceed with caution.
To stay up to date with the latest changes, join the Google Group.
The Dataverse Project is an open source web application to share, preserve, cite, explore, and analyze research data. It facilitates making data available to others, and allows you to replicate others' work more easily. Researchers, journals, data authors, publishers, data distributors, and affiliated institutions all receive academic credit and web visibility. Read more on the Dataverse Website.
The Dataverse Frontend repository is an initiative, begun in 2023, to modernize the UI and design of the Dataverse Project. It provides a stand-alone interface so that users and organizations can implement their own Dataverse installations and use the JavaScript framework of their choice.
The goals of Dataverse Frontend:
- Modernize the application
- Separate the frontend and backend logic, transitioning away from a monolithic architecture
- Reimagine the current Dataverse backend as a headless, API-first instance
- The Dataverse Frontend becomes a stand-alone SPA (Single Page Application)
- Modularize the UI to allow third-party extension of the base project
- Increase cadence of development, decrease time between release cycles to implement new features
- Introduce testing automation
- Give priority and transparency to coding and design to support Harvard University's commitment to ensuring the highest standards for Accessibility Compliance
- Empower the community to create, contribute, and improve.
New Features:
- Node Application using ReactJS for the project baseline
- Native localization support through the i18n library
- Accessibility compliant code built from the ground-up
- Improved modularity via Web Components
- Cypress testing automation
- Storybook for UI Component Library
- 2023-08-01: View mode of the dataset page
- 2023-12-13: Files table on the dataset page
- 2024-09-13: Collection page, collection and dataset creation and file uploading
The Dataverse Frontend project uses several environments to support different stages of development, testing, and deployment. Each environment serves a specific purpose for stakeholders ranging from developers to end users.
All environments follow an “all-in-one” setup, where the frontend and backend applications run together on a Payara server. Although these environments use the all-in-one setup, the SPA is infrastructure-agnostic and could also be deployed independently on other platforms, such as Docker containers, object storage services (e.g., Amazon S3 buckets), or any static hosting service/CDN, as long as it can communicate with the backend APIs.
The Beta environment provides a remote space for testing the latest changes. GitHub Actions automatically deploy the current develop branches of both the frontend and backend.
- Audience: Development team, QA analysts, project managers, selected users for early feedback
- URL: beta.dataverse.org/spa
The Demo environment showcases the latest officially released version of the SPA, compatible with the latest Dataverse backend release. Deployments target specific tagged releases (e.g., 0.1.0) and are performed on demand.
- Audience: Project managers, curation team, early adoption testers
- URL: demo.dataverse.org/spa
The QA environment is a dedicated, short-lived testing space. It is deployed on demand with feature branches (e.g., feature/xxx), frequently overwritten, and used for validating new features and bug fixes before merging into development.
- Audience: QA analysts, development team
- URL: qa.dataverse.org/spa
Spike Environments are temporary, project-specific deployments tied to individual branches. They are used to prototype new ideas and gather early feedback on unstable or in-progress work. Deployments can occur automatically on merge or on demand.
- Audience: Development team, project managers
- URLs: Unique per project, e.g., project-foo.dataverse.org or project-abc.dataverse.org
- Note: Setting up a Spike Environment requires a working Keycloak instance. See Keycloak Deployment for details.
- The existing Dataverse API will be added to and extended from the present backend architecture while the existing UI and current Dataverse functionalities are preserved.
- The SPA will continue its life as a separate application, supported on its own maintenance schedule.
- When the SPA has matured enough for an official release, we will switch to the new version, and the old backend will move into maintenance mode: no new features will be introduced, and work will focus only on critical bug fixes.
Changes from the original Dataverse JSF application
The design system and frontend in this repo are inspired by the Dataverse Project Style Guide, but the following changes have been made, especially for accessibility.
While JSF refers to "Dataverses" (sometimes called "sub-dataverses" or child dataverses by the Dataverse community), the SPA calls them "Collections".
We added an underline to links to make them accessible.
We now use Bootstrap with a theme, so there is only one definition for the secondary color. Because Bootstrap applies the secondary color to labels automatically, the file label now uses the global secondary color, which is a lighter shade of grey than before.
We changed the citation block to be white with a colored border, to make the text in the box more accessible.
We have updated the breadcrumb navigation UI. In the original JSF application, breadcrumbs did not reflect the user's current location within the site; in the SPA they do. We have also aligned with best practices by positioning all breadcrumbs at the top, before anything else in the UI.
We have also introduced action items as the last item of the breadcrumb, e.g., Collection > Dataset Name > Edit Dataset Metadata
This update gives users a clear indication of their current position within the application's hierarchy.
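The breadcrumb behavior described above can be sketched roughly as follows. This is an illustration only: the `Breadcrumb` and `HierarchyNode` types, the function names, and the example paths are hypothetical and are not taken from the repository.

```typescript
// Hypothetical types for illustration; not the repository's actual model.
interface Breadcrumb {
  label: string;
  path?: string; // omitted for the current location, which is not a link
}

interface HierarchyNode {
  name: string;
  path: string;
}

// Build a trail from the ancestor chain (root first). If an action is given
// (e.g. "Edit Dataset Metadata"), it becomes the last, non-linked item.
// Otherwise the last hierarchy item is treated as the current location.
function buildBreadcrumbs(hierarchy: HierarchyNode[], action?: string): Breadcrumb[] {
  const trail: Breadcrumb[] = hierarchy.map((node) => ({
    label: node.name,
    path: node.path,
  }));
  if (action !== undefined) {
    trail.push({ label: action });
  } else {
    const last = trail.pop();
    if (last) {
      trail.push({ label: last.label }); // current location: no link
    }
  }
  return trail;
}

// Render the trail as plain text, using " > " as the separator.
function renderTrail(trail: Breadcrumb[]): string {
  return trail.map((crumb) => crumb.label).join(" > ");
}

const exampleTrail = buildBreadcrumbs(
  [
    { name: "Collection", path: "/collections/root" },
    { name: "Dataset Name", path: "/datasets/example" },
  ],
  "Edit Dataset Metadata"
);
console.log(renderTrail(exampleTrail));
```

Keeping the action item as a plain label without a path reflects the idea that the final crumb marks where the user is, not somewhere to navigate to.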
Our main goal is to replicate the behavior of the original JSF application in all its functionalities, although during development we have found opportunities to review certain behaviors and apply changes where we find appropriate.
The original Dataset JSF page uses Solr to search for files based on the available filters. Past dataset versions are not indexed in Solr, so the filter option is not available (hidden) for such versions. When a version is indexed, the search text is searched in Solr, and Solr grammar can be applied. When the version is not indexed, the search text is searched in the database.
The new SPA does not use Solr, because the API endpoint it uses performs all queries against the database. Filters and search options are available for all versions in the same way, which makes behavior consistent, although the Solr grammar can no longer be used.
This change rests on the assumption that Solr may not be required for search in the files tab, whose search facets are reduced compared to other in-application searches. If we find evidence that the assumption is incorrect, we will work on extending the search capabilities to support Solr.
We have also introduced infinite scroll pagination here.
The original JSF Dataverses/Datasets/Files list on the home page uses normal paging buttons at the bottom of the list. We have replaced these buttons with infinite scrolling in this list. The goal is to eventually let users switch between normal paging and infinite scrolling via a setting or button.
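As a rough sketch of how the two modes relate, both can be expressed as offset/limit requests against the same list. The `PageRequest` type and function names below are hypothetical, and the sketch assumes the backend reports a total item count, which may not match the actual API.

```typescript
// Hypothetical request shape; the real API parameters may differ.
interface PageRequest {
  offset: number;
  limit: number;
}

// Normal paging: jump directly to any 1-based page number.
function pageRequest(page: number, pageSize: number): PageRequest {
  return { offset: (page - 1) * pageSize, limit: pageSize };
}

// Infinite scrolling: request the next chunk after the items already loaded.
// Returns null once everything has been loaded.
// Assumption: totalCount is known from a previous response.
function nextScrollRequest(
  loadedCount: number,
  totalCount: number,
  pageSize: number
): PageRequest | null {
  if (loadedCount >= totalCount) {
    return null;
  }
  return {
    offset: loadedCount,
    limit: Math.min(pageSize, totalCount - loadedCount),
  };
}
```

Because both modes reduce to the same request shape, a toggle between them would mostly affect how results are accumulated and displayed, not how they are fetched.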
To avoid potential performance issues, we agreed to hide the counts of collections, datasets, and files in the Collection Page Items List.
A feature has been added to suggest an identifier to the user based on the collection name entered.
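One plausible way to derive such a suggestion is sketched below. The exact rules used by the SPA are not stated here, so the rules in this example (lowercasing, stripping accents, replacing other characters with underscores) and the function name `suggestIdentifier` are assumptions for illustration.

```typescript
// Hypothetical rule set, not the SPA's actual implementation:
// 1. decompose accented characters and drop the combining marks
// 2. lowercase everything
// 3. collapse runs of non-alphanumeric characters into a single "_"
// 4. trim leading and trailing underscores
function suggestIdentifier(collectionName: string): string {
  return collectionName
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
}

console.log(suggestIdentifier("  My Research Data! "));
```

A suggestion like this would only pre-fill the field; the user can still edit the identifier before creating the collection.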
Because the SPA currently supports file uploading only through direct upload (S3), the storage selector on the create collection page is disabled. The collection is always created using the default storage, which must be S3.
The Account Page has been updated to remove breadcrumbs, as the page is not part of the main navigation.
Links to share a collection or a dataset via LinkedIn, X or Facebook will now open in a new tab instead of a popup.
A feature has been added that lets users customize their collections by adding a carousel with featured collections, datasets, files, blog posts, news, and other types of content.
Requirements for deploying the SPA
To enable authentication for Dataverse built-in user accounts in the SPA, you must deploy a Keycloak instance as an OIDC authentication provider and broker for the SPA.
However, using Keycloak is not mandatory. If you do not require authentication for Dataverse built-in users, you can use any OIDC provider to handle user authentication in the SPA.
For supporting built-in users, Keycloak must be properly configured and integrated with the Dataverse backend using the Dataverse Built-in Users SPI.
On the SPA side, ensure that the PKCE environment variables are set up to connect to the chosen OIDC provider for authentication.
Additionally, to allow the SPI to authenticate users against the Dataverse database, the Dataverse database must be accessible from the Keycloak service within the deployed infrastructure.
Detailed information on how to configure a remote Keycloak instance is available in the Keycloak Deployment documentation.
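For readers unfamiliar with PKCE, the sketch below shows the verifier/challenge pair that an OIDC client generates during an authorization code flow with PKCE (the S256 method from RFC 7636). It is a generic illustration using Node's `crypto` module, not the SPA's own authentication code, and the function names are hypothetical. In practice an OIDC client library normally handles this step.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Base64url encoding without padding, as required by RFC 7636.
function base64Url(buf: Buffer): string {
  return buf
    .toString("base64")
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

// The code verifier is a high-entropy random string kept by the client.
// 32 random bytes encode to a 43-character verifier.
function createCodeVerifier(): string {
  return base64Url(randomBytes(32));
}

// The code challenge sent to the provider is BASE64URL(SHA256(verifier)).
function codeChallengeS256(verifier: string): string {
  return base64Url(createHash("sha256").update(verifier, "ascii").digest());
}

const verifier = createCodeVerifier();
console.log({ verifier, challenge: codeChallengeS256(verifier) });
```

The client sends the challenge with the authorization request and the verifier with the token request, so the provider can confirm that both came from the same client.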
Interested in what's being developed currently? See the open issues for a full list of proposed features (and known issues), and what we are working on in the currently planned sprint.
We are developing the new Dataverse Frontend in quarterly milestones.
The current milestone for Frontend Development is described in Proposal: SPA Beta Features for Q2 2024.
Keep an eye on the Institute for Quantitative Social Science (IQSS) Dataverse Roadmap at Harvard University for upcoming initiatives for the project.
For more information on the Dataverse re-architecture project, see the original documentation, Restructuring the Dataverse UI as a Single-Page Application.
All notable changes to this project are documented in our CHANGELOG.md. The changelog follows the Keep a Changelog format and adheres to Semantic Versioning.
We also maintain a separate Design System Changelog for component-specific changes.
For Contributors: Please ensure you add appropriate changelog entries for user-facing changes when submitting pull requests. See our Changelog Guidelines for details.
We love PRs! Read the Contributor Guidelines for more info. Any contributions you make are greatly appreciated.
Got questions? Join the conversation on Zulip, or our Google Groups for Developers and Users. You can also attend community meetings hosted by the Global Dataverse Community Consortium to collaborate with the interest groups for Frontend Development and Containerization, and to learn and share with communities around the world!
Thanks to Chromatic for providing the visual testing platform that helps us review UI changes and catch visual regressions.
Distributed under the Apache License, Version 2.0. See LICENSE for more information.
