Health Data Analytics
Clearwater Technology provides data management and analytics facilities to meet health data science requirements. Clearwater aggregates data from heterogeneous sources, and provides data cleaning and alignment processes to feed analytic pipelines. Analysis can be completed collaboratively in a range of programming languages including Python and R.
Clearwater integrates with Clarinet to create a health data science platform for clinical research, natural history studies and academic study. Together, and with support from Certus, Clearwater and Clarinet bring structure and rigour to data analytics projects. In Clearwater, case report form (CRF) design is underpinned by a comprehensive metadata repository. The repository makes data definitions, field definitions and validation rules available within the analytics platform, which reduces complexity and can drive automated data cleaning processes. The metadata repository also generates online documentation automatically, so that a comprehensive data dictionary and CRF manual can be published online and maintained cooperatively.
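As an illustration only, the idea of field definitions driving both validation and documentation can be sketched in a few lines of Python. This is not the Clearwater API: the names FieldDef, FIELDS, validate and data_dictionary, and the two example fields, are hypothetical and chosen purely to show the pattern.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FieldDef:
    """One hypothetical metadata entry: a field definition plus a range rule."""
    name: str
    dtype: type
    description: str
    minimum: Optional[float] = None
    maximum: Optional[float] = None


# Hypothetical example fields; a real repository would hold the CRF definitions.
FIELDS = [
    FieldDef("age_years", int, "Age at assessment, in years", 0, 120),
    FieldDef("fvc_litres", float, "Forced vital capacity, in litres", 0.0, 10.0),
]


def validate(record: dict) -> List[str]:
    """Check one record against every field definition and list the problems."""
    problems = []
    for f in FIELDS:
        value = record.get(f.name)
        if value is None:
            problems.append(f"{f.name}: missing")
            continue
        if not isinstance(value, f.dtype):
            problems.append(
                f"{f.name}: expected {f.dtype.__name__}, got {type(value).__name__}"
            )
            continue
        if f.minimum is not None and value < f.minimum:
            problems.append(f"{f.name}: {value} below minimum {f.minimum}")
        if f.maximum is not None and value > f.maximum:
            problems.append(f"{f.name}: {value} above maximum {f.maximum}")
    return problems


def data_dictionary() -> str:
    """Render the same definitions as a plain-text data dictionary."""
    lines = ["field | type | range | description"]
    for f in FIELDS:
        lines.append(
            f"{f.name} | {f.dtype.__name__} | {f.minimum} to {f.maximum} | {f.description}"
        )
    return "\n".join(lines)
```

Because the validation rules and the documentation are both derived from the same definitions, a change to a field only has to be made in one place.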
Clearwater includes a general data lake into which heterogeneous data are imported from multiple sources. These data can be in any format and any structure. In line with MapReduce patterns, data mappers are constructed to build and populate data tables from the data lake. Using the metadata management facilities, data tables can be coerced automatically to the correct data types, and statistical summaries can be generated in the same way. All data cleaning and removal processes are recorded in an audit log to track changes and maintain full data provenance.
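A minimal sketch of this flow, under stated assumptions, is shown below in Python. It is not Clearwater code: RAW_LAKE, COLUMN_TYPES, map_record, build_table and summarise are hypothetical names, and the sample records are invented for the example. A mapper normalises differently shaped source records into one row layout, the declared column types drive coercion, and any row that is removed is written to an audit log.

```python
import statistics
from datetime import datetime, timezone

# Hypothetical raw records from two sources with different layouts.
RAW_LAKE = [
    {"source": "clinic_a", "payload": {"id": "p1", "age": "34"}},
    {"source": "clinic_b", "payload": {"patient": "p2", "age_years": "n/a"}},
    {"source": "clinic_b", "payload": {"patient": "p3", "age_years": "51"}},
]

# Hypothetical column metadata: target column name -> target type.
COLUMN_TYPES = {"patient_id": str, "age_years": int}


def map_record(raw: dict) -> dict:
    """Mapper: translate one source-specific payload into the common row layout."""
    payload = raw["payload"]
    return {
        "patient_id": payload.get("id") or payload.get("patient"),
        "age_years": payload.get("age") or payload.get("age_years"),
    }


def build_table(lake: list, audit_log: list) -> list:
    """Map every record, coerce each column to its declared type, log removals."""
    table = []
    for index, raw in enumerate(lake):
        row = map_record(raw)
        try:
            typed = {col: COLUMN_TYPES[col](row[col]) for col in COLUMN_TYPES}
        except (TypeError, ValueError) as exc:
            # Record the removal so the change can be traced later.
            audit_log.append({
                "record_index": index,
                "source": raw["source"],
                "action": "removed",
                "reason": str(exc),
                "timestamp": datetime.now(timezone.utc).isoformat(),
            })
            continue
        table.append(typed)
    return table


def summarise(table: list, column: str) -> dict:
    """Basic statistical summary of one numeric column."""
    values = [row[column] for row in table]
    return {"count": len(values), "mean": statistics.mean(values)}
```

In this sketch the second record is dropped because its age cannot be coerced to an integer, and the audit log keeps the source, reason and time of that removal.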
The Clearwater Technology platform includes Jupyter Notebooks to provide a collaborative analysis environment. Analysis within Jupyter can be completed in a range of programming languages, including Python and R.
National Neuromuscular Database (NND)