
Overview

ClinSpark supports data exports into SAS transport files (SAS XPT v5 and v8 formats) compatible with FDA reporting guidelines.

Opening SAS XPT Files

The SAS Transport File Format is an openly documented specification maintained by SAS. The files are intended to be opened with a viewer designed to handle SAS data sets.

SAS offers a free ‘universal’ viewer application for Windows users to open SAS transport files. Details about that application and its user guides are available on the SAS support site. Other applications for opening and viewing SAS data sets may also be available, such as (paid) plugins for Microsoft Excel or statistical software such as JMP.

Automatic character conversion

In SAS transport files, all character data are stored as ASCII, regardless of the operating system and XPT file format. ClinSpark natively supports UTF-8 characters in all data collection interfaces. For instance, a lab data result might come into ClinSpark encoded with UTF-8 characters. To comply with the XPT standard when exporting data from ClinSpark, a conversion procedure is applied. This document describes that conversion process in detail.


ClinSpark performs a UTF-8 character conversion and normalizes all UTF-8 characters using the NFD Unicode normalization form. The normalization converts characters with diacritical marks, changes letter case, decomposes ligatures, and converts half-width katakana characters to full-width characters. One part of that processing is to remove accents, which is language and character set specific. See the NFD Unicode normalization form for a detailed specification.

The normalizer also decomposes the original characters into a combination of a base character and a diacritic sign (this could be multiple signs in different languages). For example, á, é and í share the same sign: U+0301, which marks the ˊ (acute) accent. The normalizer finds all such diacritic codes and replaces them with an empty string.
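The decomposition and accent removal described above can be sketched in Python with the standard `unicodedata` module. This is only an illustration and not ClinSpark's implementation; the function name `strip_diacritics` is hypothetical. In the Unicode standard, NFD performs canonical decomposition only, and compatibility mappings (ligatures, half-width katakana) are handled by NFKD, so this sketch demonstrates just the accent-removal step.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose text with NFD, then drop combining marks such as U+0301."""
    # NFD splits "á" into the base letter "a" followed by U+0301.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns 0 for base characters and a
    # non-zero combining class for diacritic marks.
    return "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)


if __name__ == "__main__":
    print(strip_diacritics("\u00e1\u00e9\u00ed"))  # á, é, í
```

Characters that have no canonical decomposition pass through this step unchanged, which is why further mapping and filtering steps are still needed.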

Customer-controlled character conversion

In addition to the automatic conversion, ClinSpark allows customers to specify their own mapping rules. The following table defines the “default” conversion rules:

| Original Character | Desired Outcome |
| --- | --- |
| “β” | “B” |
| “ß” | “B” |
| “µ” | “u” |
| “²” | “2” |
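A mapping table like the one above could be applied as in the following Python sketch. The names `DEFAULT_RULES` and `apply_custom_rules` are hypothetical, and the way ClinSpark actually stores or applies customer rules is not described here; the sketch only assumes a one-character-to-string lookup.

```python
# Hypothetical rule table mirroring the default rules listed above.
DEFAULT_RULES = {
    "\u03b2": "B",  # Greek small letter beta
    "\u00df": "B",  # Latin small letter sharp s
    "\u00b5": "u",  # micro sign
    "\u00b2": "2",  # superscript two
}


def apply_custom_rules(text: str, rules: dict = DEFAULT_RULES) -> str:
    """Replace each mapped character with its configured outcome."""
    # str.maketrans accepts a dict of single characters to replacement strings.
    return text.translate(str.maketrans(rules))


if __name__ == "__main__":
    print(apply_custom_rules("10 \u00b5g/m\u00b2"))
```

Characters that are not in the rule table are left untouched by this step.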

Automatic character filtering

Once the automatic conversion and the customer-defined conversion have taken place, ClinSpark performs some final filtering steps:

  • strips out all non-ASCII characters

  • erases all the ASCII control characters

  • removes non-printable characters from Unicode

  • removes leading and trailing whitespace
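The filtering steps listed above can be approximated with the short Python sketch below. It is an illustration under the assumption that only printable ASCII (0x20 to 0x7E) should survive; the function name `final_filter` is hypothetical and ClinSpark's actual filter may differ in detail.

```python
def final_filter(text: str) -> str:
    """Keep printable ASCII only, then trim surrounding whitespace."""
    # isascii() rejects every non-ASCII character; isprintable() rejects
    # ASCII control characters (including tab and newline) and DEL.
    kept = "".join(ch for ch in text if ch.isascii() and ch.isprintable())
    # Remove leading and trailing whitespace left after filtering.
    return kept.strip()


if __name__ == "__main__":
    print(final_filter("  caf\u00e9\t\x00 ok\u200b "))
```

Checking both conditions in one pass keeps the sketch short; the same result could be reached by running the listed steps one after another.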

Conversion Audit

In cases where a conversion has occurred, ClinSpark generates an audit log. The audit is captured in a log file named according to the following rule:

Code Block
"${domainFileName}-log.csv"

where ${domainFileName} is the reporting data domain defined for each item during study design. The audit file is then included alongside the XPT file. See the following example of the contents of TransferData-XPT_${date}zip:

Code Block
AE.xpt
AE-log.csv
Info

Note: an audit file is generated ONLY if a data conversion took place, i.e. study data were modified during an export.
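The naming rule and the "only if a conversion took place" condition can be expressed as the following Python sketch. The function names and the `conversion_occurred` flag are hypothetical, and the contents of the audit file itself are not modeled because they are not specified here.

```python
from typing import List


def audit_file_name(domain_file_name: str) -> str:
    """Build the audit log name from the reporting data domain."""
    return f"{domain_file_name}-log.csv"


def archive_entries(domain_file_name: str, conversion_occurred: bool) -> List[str]:
    """List the files expected in the export archive for one domain."""
    entries = [f"{domain_file_name}.xpt"]
    # The audit file is present only when study data were modified on export.
    if conversion_occurred:
        entries.append(audit_file_name(domain_file_name))
    return entries


if __name__ == "__main__":
    print(archive_entries("AE", conversion_occurred=True))
```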