Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Table of Contents
minLevel1
maxLevel6
outlinefalse
typelist
printablefalse

Summary

Certain reports available through ClinSpark are provided in CSV (Comma Separated Value) format. CSV is a widely used and flexible format that relies on a delimiter between reported values. CSV is often preferred over XLS (Microsoft Excel) due to inherent limitations with how data can be reported into Excel worksheets. This article will explain these limitations and provide suggestions on how to work with CSV files provided by ClinSpark.

Row and File Size Limits

Excel has a limit to the number of rows and columns that can be used per worksheet, which is exceeded quickly with certain study/device data sets. CSV files however can hold many more rows of data than XLS, and do not have these issues. Additionally there are file size limits with Excel workbooks which are difficult to design and implement certain reports around. Formatting requirements can also cause challenges with mixing various types of data. Customers can learn more about Excel file limits from this Microsoft support article.

Tips for using Excel and CSV files

The default mechanism of simply opening CSV files in Excel through ‘double clicking’ the available file local file folders or a browser directly may be problematic due to some assumed default behavior when handling imported data. One example is how the formatting of dates & times are changed if there are differences in the reported datetime format in CSV, and the interpreted datetime in Excel. Another common issue we see are with handling of numeric values in CSV that contain a leading sequence of numbers such a as zero (0); where upon opening the file in Excel the zeros appear to be stripped or removed. Excel may not always properly handle the import of the CSV contents for expected review based on these behaviors.

...

We recommend that customers follow Microsoft’s suggested practice for importing the contents of CSV files into an existing or new workbook using Import tools. These help articles offer suggestions on how to accomplish this workflow.

https://support.microsoft.com/en-us/office/text-import-wizard-c5b02af6-fda1-4440-899f-f78bafe41857

https://support.microsoft.com/en-us/office/import-or-export-text-txt-or-csv-files-5250ac4c-663c-47ce-937b-339e391393ba

https://support.microsoft.com/en-us/office/import-data-from-a-csv-html-or-text-file-b62efe49-4d5b-4429-b788-e1211b5e90f6https://support.microsoft.com/en-us/office/text-import-wizard-c5b02af6-fda1-4440-899f-f78bafe41857

Example Workflow

The following is an example on how to import a transfer data report set using the Clinical Data Text (delimited) formatreport type, which outputs as a CSV file formatseries of CSV files. This is a unique example, because unlike the CSV file format suggests where the data is separated by comma, the delimiter is a pipe (vertical bar, |).

...

For this report, the data is separated into multiple files. These can be imported/reviewed individually, or together in into the same Excel workbook.

...

Using a basic text editor (not Excel!; examples are Notepad which is built in to Windows, or the more powerful and free open source Notepad++), we can inspect the contents of the files to confirm that they are contain delimited values using a pipe symbol. There are other Other characters, such as quotations, are used in data fields as well.

...

If opening these files CSV in Excel from a local folder by ‘double clicking’ the file, Excel opens automatically modifies the contents, and strips the quotations from one of the fields (STUDYID) and modifies the field contents.

...

Additionally, it Excel has placed all of the file contents of into the file into first cell (A) of the Workbook, instead of separating the file contents into individual cells (as typically expected).

However when using the Import Wizard, steps can be taken to ensure the data from the CSV is properly imported.

...

The import wizard allows you to clearly define what the delimiters are. In this example, it’s necessary to specify that the pipe ( | ) is the delimiter, and the text qualifier ( " ).

...

The last step is important to consider when there are mixed data types in the file. For example, the handling of datetime values may be more appropriate to format change into a region specific date valueformat. Or, leave the datetimes unaltered from the source file and preserve as 'Text'. Each column of the reported data should be reviewed to ensure the proper data format is applied.

...

When the import is complete, the remaining resulting view in Excel is more appropriate than beforesimply opening it from the file browser. Not only has have the CSV contents been properly reviewed during the import process, but correct data types have been set and applied to the workbook, making the data much easier to review.