Data Privacy Principles for the Enterprise Data Lake

  • Data required for analytics will be brought into the data lake for dashboard development, which preserves the performance and security of the original systems. This data may also be used to support integration with other systems.
  • Security that mimics the original systems will be used within the data lake and applied to analytics resources. Where this is not possible, Data Guardians (system managers) will be consulted to review options and decide how to proceed.
  • Whole tables or systems will be copied into the data lake to eliminate the burden of managing specific views for analytics needs.
  • Raw data from each system is partitioned and stored separately from other university data assets within the data lake.
  • When data pipelines are created during new system implementations, staff assigned to the implementation will not be used, or will be used only minimally, so that the implementation timeline is not affected.
  • Data in the enterprise data lake is refreshed nightly with a full data reload. If more frequent reloads are required, please consult the Office of Data Management, Analytics and Visualization. Live data streams will be supported for processing within the data lake when required by systems.
  • Data transformations will occur in the data lake or prior to entering the data lake rather than within the analytics software to reduce rework and maintain integrity of data definitions.
  • Business analytics needs and KPIs will be determined by Data Guardians and business stakeholders with the consultation of the Office of Data Management, Analytics and Visualization.
  • Any new analytics question or project that requires the use of multiple data domains will need approval from all Data Guardians before the project begins.
  • Business area stakeholders will remain the subject matter experts for the data and will be responsible for assisting in the validation of data analysis when necessary.
  • Business stakeholders for system data will remain the Data Guardians and Data Stewards (as defined by the Data Governance Committee) of the data, regardless of location or copy.
  • Access to datasets within the data lake, or to dashboards containing multiple data domains, will require approval from all Data Guardians before permission is granted.
  • Documentation of the data source will be required from the Data Guardian or Data Steward. It will include a high-level description of the data, contact information for any approvals, and, when applicable, a business glossary of the data.
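
The multi-domain approval principles above can be illustrated with a minimal sketch. It assumes "all Data Guardians" means the Guardian of each data domain a resource draws on. The domain names, guardian identifiers, and the AccessRequest structure are hypothetical and exist only for illustration; this is not the Office's actual approval workflow.

```python
from dataclasses import dataclass, field


@dataclass
class AccessRequest:
    """A request for a dataset or dashboard (hypothetical structure)."""
    requester: str
    domains: set[str]  # data domains the resource draws on
    approvals: set[str] = field(default_factory=set)  # guardians who approved


# Hypothetical mapping of data domain -> responsible Data Guardian.
GUARDIANS = {
    "student": "guardian_student",
    "finance": "guardian_finance",
    "hr": "guardian_hr",
}


def missing_approvals(request, guardians=GUARDIANS):
    """Return the Data Guardians who have not yet approved the request."""
    unknown = request.domains - set(guardians)
    if unknown:
        raise ValueError(f"No Data Guardian on record for: {sorted(unknown)}")
    required = {guardians[domain] for domain in request.domains}
    return required - request.approvals


def can_grant(request, guardians=GUARDIANS):
    """Access is granted only when every required Guardian has approved."""
    return not missing_approvals(request, guardians)


# Example: a dashboard that combines student and finance data.
request = AccessRequest("analyst", {"student", "finance"}, {"guardian_student"})
print(missing_approvals(request))  # the finance guardian is still outstanding
print(can_grant(request))
```

In this sketch a single-domain request goes through the same check with only one required approver, so permission is never granted while any required approval is outstanding.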
     

Definitions:

Data Guardians - typically business area managers or directors who are responsible for ensuring that effective local protocols are in place to guide the appropriate use of their data assets and systems. Access to, and use of, institutional data will generally be administered by the appropriate Data Domain Guardian (or delegated Data Steward).

 

Data Stewards - Data stewards have intimate knowledge of business processes and data within their business-specific data domain. They are integral in adding value to institutional data through defining business glossaries in the data catalog. They are responsible for monitoring compliance, addressing data accuracy issues, and reviewing requests for data access. All data stewards are members of the data steward council, which helps determine a university-wide glossary of terms and brings data issues to the Data Governance Committee for discussion and consideration.


Data Domains - collections of data elements grouped by responsible party for the purposes of data governance. Data domains are typically grouped by business area and need.

Details

Article ID: 12617
Created: Thu 3/9/23 11:25 AM
Modified: Wed 3/15/23 10:32 AM

Related Services / Offerings (1)

A Data Hub is a system to ingest, transform, and store enterprise business data. It is a data exchange with data flow at its core allowing for a “hub and spoke” architecture to manage system integrations.