Cascadia Data Alliance

Data Science Collaborations


The opportunity: The Cascadia region of North America is home to some of the world’s leading technology, research, and medical organizations. The large volume of biomedical research data generated in this region could accelerate research if shared effectively. A robust regional data sharing ecosystem has the potential to position the Pacific Northwest as a global leader in data-driven innovation in biomedical research and healthcare, now and into the future.

The challenge: Cultural, technical, and policy barriers have historically limited cross-organizational data sharing.

The Cascadia Data Alliance will change that.

The Hutch Data Commonwealth’s goal in spearheading the Cascadia Data Alliance is to establish a health research data sharing ecosystem with organizations across the Pacific Northwest region. The Alliance will facilitate creation of shared best practices in data governance and groundbreaking partnerships. Improved governance and collaboration will be used to drive towards a common data platform to accelerate research and innovation across the community.

Cascadia Region of North America

The Cascadia Data Discovery Initiative (CDDI)

The Cascadia Data Discovery Initiative (CDDI) is a first step toward achieving a regional data sharing ecosystem. CDDI aims to accelerate data discovery and subsequent data sharing for biomedical research.

Data Discovery

CDDI will enable sharing of metadata (descriptive information about underlying data) to facilitate data discovery so researchers can find interesting and unique datasets more quickly.

Data Sharing

CDDI will collect and curate governance documentation and will establish additional governance support to accelerate sharing of the underlying data between researchers at participating organizations.

How CDDI Works

Graphic for How the Cascadia Data Discovery Platform works

To achieve the goals of driving data discovery and sharing, CDDI will establish a Cascadia data discovery platform that allows researchers to query metadata and find datasets and resources relevant to their research. When a researcher finds a dataset or resource they would like to learn more about, they can connect to the researcher who uploaded the associated metadata to learn more. The researchers could then use Cascadia governance support tools (e.g. harmonized agreements) to facilitate data sharing and collaboration.
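The discovery workflow described above can be illustrated with a minimal sketch. It assumes metadata is held as simple records separate from the underlying data; the `MetadataRecord` fields, the `search` function, and the sample catalog entries are all hypothetical and do not describe the actual Cascadia platform.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class MetadataRecord:
    """Descriptive information about a dataset (hypothetical schema).

    Only metadata is shared; the underlying data stays with its owner.
    """
    title: str
    data_type: str              # e.g. "genomics", "imaging"
    keywords: Tuple[str, ...]
    institution: str
    contact_email: str          # researcher who uploaded the metadata


def search(catalog: List[MetadataRecord], term: str,
           data_type: Optional[str] = None) -> List[MetadataRecord]:
    """Return records whose title or keywords contain `term` (case-insensitive)."""
    needle = term.lower()
    hits = []
    for record in catalog:
        if data_type is not None and record.data_type != data_type:
            continue
        haystack = [record.title.lower()] + [k.lower() for k in record.keywords]
        if any(needle in text for text in haystack):
            hits.append(record)
    return hits


# Illustrative entries only; institutions and contacts are placeholders.
CATALOG = [
    MetadataRecord("Tumor sequencing cohort", "genomics",
                   ("cancer", "whole exome"), "Institution A",
                   "researcher.a@example.org"),
    MetadataRecord("Chest CT archive", "imaging",
                   ("lung", "cancer screening"), "Institution B",
                   "researcher.b@example.org"),
    MetadataRecord("Regional flu surveillance", "public health",
                   ("influenza",), "Institution C",
                   "researcher.c@example.org"),
]

# A researcher searches the metadata, then contacts the uploader to learn more.
for hit in search(CATALOG, "cancer"):
    print(f"{hit.title} ({hit.institution}): contact {hit.contact_email}")
```

In this sketch the search returns only descriptive records and a contact, which mirrors the idea that access to the underlying data is arranged afterward between the researchers.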

Preserving Privacy to Enable Data Access

Cascadia is also working to enhance sharing of data that often contains sensitive information, in a way that provides strong privacy preservation. Cascadia partners are working together to develop and demonstrate technology solutions that both preserve privacy and facilitate meaningful analysis of health research data. We are collaborating with Microsoft to investigate the potential for an open-source differential privacy platform to improve investigators’ access to health research data while enhancing privacy protections.
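For readers unfamiliar with differential privacy, the following is a minimal sketch of one textbook technique, the Laplace mechanism applied to a counting query. It is not the platform under investigation and is not suitable for production use; the function `private_count`, its parameters, and the sample records are hypothetical and chosen only for illustration.

```python
import random


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of records matching `predicate`.

    Adds Laplace noise with scale sensitivity / epsilon. A counting query
    has sensitivity 1: adding or removing one person changes the true
    count by at most 1. Smaller epsilon means more noise and more privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    rate = epsilon / sensitivity
    # The difference of two independent exponential draws with the same
    # rate follows a Laplace distribution with scale 1 / rate.
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


# Made-up records for illustration only.
PATIENTS = [
    {"age": 34, "condition": "A"},
    {"age": 61, "condition": "B"},
    {"age": 47, "condition": "A"},
    {"age": 58, "condition": "A"},
]

noisy = private_count(PATIENTS, lambda p: p["condition"] == "A", epsilon=0.5)
print(f"Noisy count of condition A: {noisy:.2f}")
```

The analyst sees only the noisy answer, so the presence or absence of any single individual has a bounded effect on what is released. Real systems also track the cumulative privacy budget across queries and guard against floating-point weaknesses in noise sampling, which this sketch omits.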

Wanted: Collaborative partners (organizations or individuals)

We are looking for collaborators to:

  • Contribute metadata to enable data discovery. We are particularly interested in extracting metadata from genomics, biospecimen, imaging, and public health data.
  • Participate in technology or methodology development.
  • Contribute to user testing of new data discovery or governance tools.
  • Test the differential privacy platform with health data.
  • Participate in the governance working group.

Benefits of Participation in Cascadia

Near-Term Benefits for Demonstration Project and Use Case Participants:

  • Subsidized cloud storage and cloud computing costs
  • Cascadia staff support and/or funding for existing lab staff for associated data management, metadata extraction, and data sharing activities
  • Improved ability for participating labs to reuse their own data and share within their home institution
  • Closer ties to regional collaborators with complementary skills and expertise (and access to their broader networks)
  • Opportunity to provide substantive input on the discovery platform and tools design
  • Hands-on data science and computer engineering support

Longer-Term Benefits:

  • Access to the Cascadia data discovery platform to find interesting or unique datasets
  • Reduction in red tape to accelerate data sharing
  • Improved ability to access data for preliminary analysis to support grant applications (e.g. CIHR or NIH grants)
  • Stronger open data ecosystem in the Cascadia region and beyond

Additional initiatives to accelerate access to health research data are in the planning phase and will further support the Cascadia Data Alliance’s broader goal of building a strong regional data sharing ecosystem.

Cascadia Data Alliance Leadership

Brenda Kostelecky, Ph.D.

Director, Cascadia Data Alliance Program

Brenda is a scientific program director experienced in developing and implementing innovative and strategic programs to advance research and discovery. She has several years’ experience leading successful cross-organizational initiatives at the National Cancer Institute and American Association for Cancer Research. She has a scientific background in molecular biology, structural biology and cancer biology research.