Overview

...

Note that, like all ClinSpark application assets, it lives within private subnets of the Foundry Health Production Virtual Private Cloud (VPC).  It is not exposed to the internet.  Its data is encrypted at rest, and in transit via SSH.

...

As updates are applied to the Master, the Replica receives them in near real time.  There is a delay, but due to the architecture of AWS RDS Aurora, this delay is typically no greater than 20 milliseconds.  So roughly 20 milliseconds after a change is made to the production database, the customer's Read Replica has this update available.

...

During onboarding, two DNS names and one set of credentials will be provided to you.  One is the DNS name of your dedicated bastion host, which is used to access the replica via SSH.  The other is the DNS name of the read replica itself within the AWS VPC.  For this example, let's pretend these DNS names are as follows:

...

Info

The above approach is appropriate if only one or two trusted users will access the bastion.  It is secure as long as the private keys are secured.  Customers with more users will want to find more scalable ways to access the replica data.  One option is a persistent SSH tunnel set up at the customer site and shared by the users at that site.  AWS also has a variety of options for this.  See AWS VPNs and Direct Connect, which are two mechanisms to securely connect your sites to your own AWS account.  From there a variety of options exist, including AWS PrivateLink to create secure connections between AWS accounts.

It is the customer's responsibility to select and configure any options such as the above.  Foundry Health will assist by suggesting approaches and making required configurations on our end.  But the majority of the setup is in customer-owned infrastructure, and for this the customer is solely responsible.
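
As an illustration of the SSH tunnel option, the following minimal sketch assembles a local port-forward through the bastion using the standard OpenSSH client flags (-N, -i, -L).  The host names, user name, key path and port numbers are placeholders chosen for this example, not values supplied by Foundry Health.  In particular, the database port depends on the database engine in use; 3306 is only an assumption here.

```python
import shlex


def build_tunnel_command(bastion_host, replica_host, user, key_path,
                         local_port=13306, replica_port=3306):
    """Return the argv list for an OpenSSH local port-forward via the bastion.

    All arguments are caller-supplied; the default ports are assumptions.
    """
    return [
        "ssh",
        "-N",            # do not run a remote command, forward ports only
        "-i", key_path,  # private key used to log in to the bastion
        "-L", f"{local_port}:{replica_host}:{replica_port}",
        f"{user}@{bastion_host}",
    ]


if __name__ == "__main__":
    cmd = build_tunnel_command(
        bastion_host="bastion.example.com",        # hypothetical name
        replica_host="replica.example.internal",   # hypothetical name
        user="replica-user",                       # hypothetical user
        key_path="/path/to/private_key",           # hypothetical path
    )
    # Print the command so it can be reviewed before being run by hand
    # or handed to subprocess.run(cmd).
    print(shlex.join(cmd))
```

While such a tunnel is open, a database client on the same machine would connect to localhost on the chosen local port rather than to the replica's DNS name directly.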

Clinical Data Interchange Standards Consortium (CDISC)

CDISC is an open, multidisciplinary, non-profit organization committed to the development of industry standards to support the electronic acquisition, exchange, submission, and archiving of clinical trials data and metadata for medical and biopharmaceutical product development.

Operational Data Model (CDISC ODM)

Here is how the Clinical Data Interchange Standards Consortium describes the ODM in the introduction of the specification:

...

To the maximum extent possible, the ClinSpark data model is based on the CDISC Operational Data Model (ODM) standard.  In fact, a significant portion of the database schema for ClinSpark was generated directly from the ODM XML schema.  ODM scope is limited to the core entities of clinical study data.  The ClinSpark data model includes this but goes far beyond this scope.  As such, the CDISC ODM and its related documentation are excellent sources of information for understanding the subset of the ClinSpark data model which overlaps.  But for concerns outside of CDISC ODM, such as volunteers, lab data and many others, the ODM documentation will not be helpful.

The CDISC ODM documentation can be freely downloaded from here.  We recommend that anyone intending to work with and understand the ClinSpark data model spend some time getting familiar with the ODM, as it provides valuable insights into both design and intended usage.

...

Laboratory Data Model (CDISC LAB)

The CDISC Laboratory Data Team (LAB) has as its mission the development of a standard model for the acquisition and interchange of laboratory data.  The specification and all related documentation can be found on the CDISC LAB homepage.

To the extent possible and practical, the ClinSpark data model adheres to the CDISC LAB specification.  Model entities which come directly from this specification include:

  • Base Specimen
  • Base Battery
  • Base Test
  • Base Result

Additionally, the way that ClinSpark handles the concept of Accession comes directly from the CDISC LAB standard, section 3.4.8.  Here is the guidance from this section:

Accession data identifies the specimen collection kit and the laboratory from where it came. The convention that seems to be standard for all laboratories is that one accession number identifies one accession used at one subject visit. Central Laboratory ID must always be populated or required.

ClinSpark Data Model

To the extent possible and practical, the ClinSpark data model is based on the CDISC ODM 1.3.2 data model.  A few of the benefits of this are:

  • A standards-based model simplifies interoperability.  ClinSpark natively produces and accepts CDISC data, simplifying integration with other products and vendors supporting this standard.
  • Simplified training and transferable knowledge.  CDISC ODM is fairly widely used within the industry.  This makes our data model relatively easy for new users familiar with ODM to comprehend.
  • A high-quality, standards-based data foundation, rather than a proprietary one, makes data more future-proof.

The following section highlights a series of key subject areas within the ClinSpark data model.  This section covers both CDISC and non-CDISC entities, since this is the nature of the ClinSpark data model.  

The entity diagrams and the relationships between them are intended to help readers understand what can be found in tables and also how tables can be joined using SQL.  This is not meant to be comprehensive; it should be used in conjunction with the other schema documentation provided.

Study Design

The following subject areas are involved with study design.  

Note that ClinSpark has made significant extensions to the ODM in a number of areas.  Two examples are device connectivity and importation from the volunteer record.  There is no provision within ODM for either of these features.  To support them, extensions have been made to data within the ItemGroup and ItemRef domains.  These extensions make it possible to flag fields whose values will be populated by direct capture from medical devices, or from a volunteer record within the database.  Other such extensions exist throughout the ClinSpark data model.  As such, domains which originate in the CDISC standard often contain a superset of the data described by CDISC, including additions created by Foundry Health.

Org, Volunteer, Study, Sites, StudySites and Subjects

...


# | Table | From CDISC? | Notes
1 | org | No | An org represents an entity performing clinical research (CRO / CRU). Orgs can have multiple sites that execute studies.
2 | study | Yes | This element collects static structural information about an individual study. A study is related to a given clinical trial protocol.
3 | site | No | A site is a physical place belonging to an organization. An organization having multiple physical clinical sites will have multiple site rows.
4 | study_site | No | An association between a physical site and a study. A study site is different from a physical location. Often, pharma sponsors will specify sites with arbitrary codes, and those codes must pass through at data export time. In addition, this domain encapsulates recruitment efforts for a given study / site.
5 | volunteer | No | A volunteer is someone who indicates that they are interested in participating in clinical research for the given org.
6 | subject | Yes | Someone participating in clinical research within the context of a given study. Creates the glue between the volunteer and the participation.

Form Definition



# | Table | From CDISC? | Notes
1 | StudyMetaData | Yes | StudyMetaData (MetadataVersion in CDISC) defines the types of study events, forms, item groups, and items that form the study data. This is basically an aggregation of all CRF design elements for a study.
2 | Form | Yes | A Form (FormDef in CDISC) describes a type of form that can occur in a study. A form is basically a container for item groups. This class explicitly excludes the required 'repeating' (Yes|No) attribute from the domain, because phase 1 studies are different and most forms are likely to repeat as they are related to a study event. At ODM build time, we check for those forms that repeat and ensure that the repeating attribute is created properly.
3 | ItemGroup | Yes | An ItemGroup (ItemGroupDef in CDISC) describes a type of item group that can occur within a study. It is basically a collection of related items in a given form.
4 | Item | Yes | An Item (ItemDef in CDISC) describes a type of item that can occur within a study. Item properties include name, datatype, measurement units, range or codelist restrictions, and several other properties. It basically represents the definition of a piece of data collected.
5 | ItemGroupRef | Yes | A reference to a given ItemGroup. This reference can hold data about the association.
6 | ItemRef | Yes | A reference to a given Item. This reference can hold data about the association.
7 | Method | Yes | ODM representation that allows a value of an Item to be computed. In ODM, this OID can be found on an ItemRef, meaning that the item is computed by this method via invocation of the formal expression.
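
To make the containment chain in this table concrete, here is a minimal sketch of how a Form resolves to its Items by way of ItemGroupRef and ItemRef.  The class and attribute names (oid, order_number, mandatory) are illustrative assumptions loosely following ODM naming; they are not the actual ClinSpark column names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    oid: str
    name: str
    data_type: str


@dataclass
class ItemRef:
    # The reference carries data about the association, not about the Item.
    item: Item
    order_number: int
    mandatory: bool = False


@dataclass
class ItemGroup:
    oid: str
    name: str
    item_refs: List[ItemRef] = field(default_factory=list)


@dataclass
class ItemGroupRef:
    item_group: ItemGroup
    order_number: int


@dataclass
class Form:
    oid: str
    name: str
    item_group_refs: List[ItemGroupRef] = field(default_factory=list)


def items_in_form(form):
    """Walk Form -> ItemGroupRef -> ItemGroup -> ItemRef -> Item in order."""
    result = []
    for group_ref in sorted(form.item_group_refs, key=lambda r: r.order_number):
        refs = sorted(group_ref.item_group.item_refs, key=lambda r: r.order_number)
        result.extend(ref.item for ref in refs)
    return result


# Hypothetical example data.
vitals = ItemGroup("IG.VITALS", "Vital Signs", [
    ItemRef(Item("I.DIABP", "Diastolic BP", "integer"), order_number=2),
    ItemRef(Item("I.SYSBP", "Systolic BP", "integer"), order_number=1, mandatory=True),
])
vitals_form = Form("F.VS", "Vitals", [ItemGroupRef(vitals, order_number=1)])
print([item.name for item in items_in_form(vitals_form)])
```

The point of the sketch is that ordering and similar properties live on the references, so the same ItemGroup or Item definition can be reused in different places with different association data.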

Activity Plans

An activity plan is a schedule of events for a given cohort.  Activity Plans do not appear in CDISC; Activity Plan, Segment and Scheduled Activity are ClinSpark specific.  ODM does have the notion of a FormRef.  However, this design doesn't fit well with phase 1 trials, where forms are commonly repeated for a given study event (i.e., many PKs in a given day).  As such, FormRef is implicitly available by way of Scheduled Activities that are part of a Segment / Activity Plan.


# | Table | From CDISC? | Notes
1 | Study | Yes | This element collects static structural information about an individual study. A study is related to a given clinical trial protocol.
2 | Activity Plan | No | A schedule of events for a given cohort. Plans can be assigned to multiple cohorts. A timed plan must have a reference time in order to properly provide UI feedback as segments and scheduled activities are set. Untimed Activity Plan: the reference time must be null; there is a single segment, which must be root and must have offset seconds set to zero. Timed Activity Plan: must have a reference time; can have 1-n segments, always sorted by offset seconds; the reference segment must have offset seconds of zero. Activity Plan fills the role of the FormRef in the ODM.
3 | Segment | Partially | Holds a group of scheduled activities in an activity plan. The segment's offset seconds is essentially the time of the reference event; all scheduled activities are relative to this. Modeled somewhat off of CDISC SDM: "Segments are often seen as the basic building blocks of study design. A segment usually specifies a combination of planned observations and interventions, which may or may not involve treatment, during a period of time."
4 | Scheduled Activity | No | Wraps a form, but adds metadata including timing.
5 | Form | Yes | A form is basically a container for item groups.
6 | Study Event | Yes | A study event represents a given 'visit'. In phase 1 trials this will commonly simply refer to a 'day'. When scheduling forms for a given schedule, the builder must associate the study event. Note: there are common study events that are typically reserved for special events: unscheduled, common (AE, CM), etc.
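
The timed and untimed Activity Plan rules listed in the table can be expressed as a small validation routine.  The sketch below is illustrative only: the class names, the explicit timed flag, and the assumption that the root segment is the reference segment are not taken from the ClinSpark schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Segment:
    name: str
    offset_seconds: int
    is_root: bool = False  # assumption: the root segment is the reference segment


@dataclass
class ActivityPlan:
    name: str
    timed: bool
    reference_time: Optional[datetime] = None
    segments: List[Segment] = field(default_factory=list)


def validate_plan(plan):
    """Return a list of rule violations; an empty list means the plan is valid."""
    errors = []
    if plan.timed:
        if plan.reference_time is None:
            errors.append("timed plan must have a reference time")
        if not plan.segments:
            errors.append("timed plan must have at least one segment")
        if any(s.is_root and s.offset_seconds != 0 for s in plan.segments):
            errors.append("reference segment must have offset seconds of zero")
    else:
        if plan.reference_time is not None:
            errors.append("untimed plan must have a null reference time")
        if len(plan.segments) != 1:
            errors.append("untimed plan must have exactly one segment")
        else:
            only = plan.segments[0]
            if not only.is_root:
                errors.append("the single segment must be root")
            if only.offset_seconds != 0:
                errors.append("the single segment must have offset seconds of zero")
    return errors


def ordered_segments(plan):
    """Segments are always presented sorted by their offset seconds."""
    return sorted(plan.segments, key=lambda s: s.offset_seconds)


# Hypothetical example plans.
screening = ActivityPlan("Screening", timed=False,
                         segments=[Segment("Screening", 0, is_root=True)])
dosing = ActivityPlan("Dosing", timed=True,
                      reference_time=datetime(2024, 1, 1, 8, 0),
                      segments=[Segment("Post-dose", 3600),
                                Segment("Dose", 0, is_root=True),
                                Segment("Pre-dose", -1800)])
print(validate_plan(screening), validate_plan(dosing))
print([s.name for s in ordered_segments(dosing)])
```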

...


# | Table | From CDISC? | Notes
1 | epoch | No | An epoch is typically specified in a study protocol and typically signifies a milestone-type event within the trial, e.g. screening, treatment, follow-up, etc.
2 | cohort | No | A cohort is a way to break up epochs into different groupings. Protocols will occasionally indicate that epochs should be broken up (perhaps into different trial arms to test different dose levels, etc.), or the grouping can be purely organizational.
3 | cohort_assignment | No | Binds a volunteer as a subject within a given cohort along with an Activity Plan. We allow assigning different schedules, and we allow for scheduling a subject later.
4 | activity_plan | No | A schedule of events for a given cohort.
6 | subject | Yes | Someone participating in clinical research within the context of a given study.

...


# | Table | From CDISC? | Notes
1 | item_data | Yes | A piece of collected data.
2 | item_group_data | Yes | Aggregates item data.
3 | form_data | Yes | Form data represents data collected for a given subject. Instead of storing the scheduled time on this domain, we leverage the relationship to the encapsulated scheduledActivity domain and thus its relationship to the segment.
4 | study_event_data | Yes | Clinical data for a study event (visit) for a given subject.
5 | study_event | Yes | A study event represents a given 'visit'. In phase 1 trials this will commonly simply refer to a 'day'. When scheduling forms for a given schedule, the builder must associate the study event. Note: there are common study events that are typically reserved for special events: unscheduled, common (AE, CM), etc.
6 | subject | Yes | Someone participating in clinical research within the context of a given study.
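
As a rough illustration of how the collected-data tables above nest, the following sketch joins item_data up through item_group_data, form_data and study_event_data to subject.  It runs against an in-memory SQLite database as a stand-in for the replica.  Only the table names come from this page; every column name (including the foreign keys) is a hypothetical placeholder, so consult the schema documentation for the real ones.

```python
import sqlite3

# Hypothetical, heavily simplified columns; only the table names are real.
SCHEMA = """
CREATE TABLE subject (id INTEGER PRIMARY KEY, screening_number TEXT);
CREATE TABLE study_event_data (id INTEGER PRIMARY KEY, subject_id INTEGER, name TEXT);
CREATE TABLE form_data (id INTEGER PRIMARY KEY, study_event_data_id INTEGER, name TEXT);
CREATE TABLE item_group_data (id INTEGER PRIMARY KEY, form_data_id INTEGER);
CREATE TABLE item_data (id INTEGER PRIMARY KEY, item_group_data_id INTEGER,
                        name TEXT, value TEXT);
"""

HIERARCHY_QUERY = """
SELECT sub.screening_number, sed.name, fd.name, itm.name, itm.value
FROM item_data AS itm
JOIN item_group_data AS igd ON igd.id = itm.item_group_data_id
JOIN form_data AS fd ON fd.id = igd.form_data_id
JOIN study_event_data AS sed ON sed.id = fd.study_event_data_id
JOIN subject AS sub ON sub.id = sed.subject_id
WHERE sub.id = ?
ORDER BY itm.id
"""


def build_clinical_demo_db():
    """Create an in-memory database with one subject and two collected items."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO subject VALUES (1, 'S001')")
    conn.execute("INSERT INTO study_event_data VALUES (10, 1, 'Day 1')")
    conn.execute("INSERT INTO form_data VALUES (100, 10, 'Vitals')")
    conn.execute("INSERT INTO item_group_data VALUES (1000, 100)")
    conn.executemany(
        "INSERT INTO item_data VALUES (?, ?, ?, ?)",
        [(5000, 1000, "SYSBP", "120"), (5001, 1000, "DIABP", "80")],
    )
    return conn


def item_values_for_subject(conn, subject_id):
    """Return (subject, study event, form, item, value) rows for one subject."""
    return conn.execute(HIERARCHY_QUERY, (subject_id,)).fetchall()


if __name__ == "__main__":
    for row in item_values_for_subject(build_clinical_demo_db(), 1):
        print(row)
```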

...


# | Table | From CDISC? | Notes
1 | item_data | Yes | A piece of collected data. Note that lab orders and all results are associated with a given Item Data. You will need to join through Item Data when working with lab data.
2 | base_specimen | Yes (CDISC LAB) | Modeled on CDISC LAB. A specimen is collected from a subject and assigned to a given item data instance. There can be multiple batteries (test groups) associated with a given specimen. Combines Accession Level and Base Specimen from the spec.
3 | base_battery | Yes (CDISC LAB) | A panel related to a specimen; typically this is just a 1:1.
4 | base_test_result | Yes (CDISC LAB) | Combines CDISC LAB BaseTest and BaseResult. These are the results from the lab.
5 | lab_order | No | When specimens are collected, this domain represents that an order is generated. It causes a manifest file (PDF) to be created and potentially an order file to be written to the file system and made available to web services.
6 | lab_interface | No | Encapsulates how to send and receive orders and results from a particular safety lab. Sites may have multiple labs, and if so each of these will have its own lab interface instance.
7 | study_lab_panel | No | Something that can be ordered from item level.
8 | specimen_container | No | When setting up samples or labs, users can optionally choose a container.
9 | lab_repeat | No | A domain that allows for the management of lab repeat workflows.
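
Because lab data hangs off Item Data, a lab query usually starts at item_data and walks down to the results.  The sketch below shows one plausible shape of that join using an in-memory SQLite database as a stand-in for the replica.  The table names come from this page, but all column names and the assumption that results link to batteries (following the CDISC LAB hierarchy) are hypothetical.

```python
import sqlite3

# Hypothetical, simplified columns; only the table names are real.
LAB_SCHEMA = """
CREATE TABLE item_data (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE base_specimen (id INTEGER PRIMARY KEY, item_data_id INTEGER,
                            accession_number TEXT);
CREATE TABLE base_battery (id INTEGER PRIMARY KEY, base_specimen_id INTEGER, name TEXT);
CREATE TABLE base_test_result (id INTEGER PRIMARY KEY, base_battery_id INTEGER,
                               test_name TEXT, result_value TEXT);
"""

LAB_QUERY = """
SELECT sp.accession_number, bat.name, res.test_name, res.result_value
FROM item_data AS itm
JOIN base_specimen AS sp ON sp.item_data_id = itm.id
JOIN base_battery AS bat ON bat.base_specimen_id = sp.id
JOIN base_test_result AS res ON res.base_battery_id = bat.id
WHERE itm.id = ?
ORDER BY res.id
"""


def build_lab_demo_db():
    """Create an in-memory database with one specimen, one battery, two results."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(LAB_SCHEMA)
    conn.execute("INSERT INTO item_data VALUES (1, 'Safety labs')")
    conn.execute("INSERT INTO base_specimen VALUES (10, 1, 'ACC-001')")
    conn.execute("INSERT INTO base_battery VALUES (100, 10, 'Chemistry')")
    conn.executemany(
        "INSERT INTO base_test_result VALUES (?, ?, ?, ?)",
        [(1000, 100, "ALT", "32"), (1001, 100, "AST", "28")],
    )
    return conn


def lab_results_for_item(conn, item_data_id):
    """Return (accession, battery, test, result) rows reached via one item_data row."""
    return conn.execute(LAB_QUERY, (item_data_id,)).fetchall()


if __name__ == "__main__":
    for row in lab_results_for_item(build_lab_demo_db(), 1):
        print(row)
```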

...


# | Table | From CDISC? | Notes
1 | volunteer | No | Someone who indicates that they are interested in participating in clinical research for the given org.
2 | volunteer_medical_condition | No | Associates a given condition with a given volunteer.
3 | volunteer_note | No | A simple note that can be attached to the volunteer record.
4 | volunteer_correspondence | No | Represents a call or text to / from a volunteer by way of Twilio.
5 | volunteer_substance_use | No | We purposely don't track SUOCCUR, it allows us to indicate that the volunteer is not using the substance.
6 | recruitment_appointment | No | Allows for a given volunteer to be assigned to a given time slot.
7 | study_site | No | An association between a study and a site.
8 | subject | Yes | Someone participating in clinical research within the context of a given study.

...