© 2024 IQVIA - All Rights Reserved

Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 3 Next »

Summary

Certain reports available through ClinSpark are provided in CSV (Comma Separated Value) format. CSV is a widely used and flexible format that relies on a delimiter between reported values. CSV is often preferred over XLS (Microsoft Excel) due to inherent limitations with how data can be reported into Excel worksheets. This article will explain these limitations and provide suggestions on how to work with CSV files provided by ClinSpark.

Row and File Size Limits

Excel has a limit to the number of rows and columns that can be used per worksheet, which is exceeded quickly with certain study/device data sets. CSV files however can hold many more rows of data than XLS, and do not have these issues. Additionally there are file size limits with Excel workbooks which are difficult to design and implement certain reports around. Formatting requirements can also cause challenges with mixing various types of data. Customers can learn more about Excel file limits from this Microsoft support article.

Tips for using Excel and CSV files

The default mechanism of opening CSV files in Excel through local file folders or a browser directly may be problematic due to some assumed default behavior when handling imported data. One example is how the formatting of dates & times are changed if there are differences in the format in CSV, and the interpreted datetime in Excel. Another common issue we see are with handling of numeric values in CSV that contain a leading sequence of numbers such a zero; where upon opening the file in Excel the zeros appear to be stripped or removed. Excel may not always properly handle the import of the CSV contents for expected review based on these behaviors.

We recommend that customers follow Microsoft’s suggested practice for importing the contents of CSV files into an existing or new workbook using Import tools. These help articles offer suggestions on how to accomplish this workflow.

https://support.microsoft.com/en-us/office/text-import-wizard-c5b02af6-fda1-4440-899f-f78bafe41857

https://support.microsoft.com/en-us/office/import-or-export-text-txt-or-csv-files-5250ac4c-663c-47ce-937b-339e391393ba

https://support.microsoft.com/en-us/office/import-data-from-a-csv-html-or-text-file-b62efe49-4d5b-4429-b788-e1211b5e90f6

Example Workflow

The following is an example on how to import a transfer data set using the Clinical Data Text (delimited) report type, which outputs series of CSV files. This is a unique example, because unlike the CSV file format suggests where the data is separated by comma, the delimiter is a pipe (vertical bar).

For this report, the data is separated into multiple files. These can be imported/reviewed individually, or together into the same Excel workbook.

Using a basic text editor (not Excel), we can inspect the contents of the files to confirm that they contain delimited values using a pipe symbol. Other characters, such as quotations, are used in data fields as well.

If opening these CSV in Excel from a local folder by ‘double clicking’ the file, Excel opens automatically modifies the contents, and strips the quotations from one of the fields (STUDYID).

Additionally, Excel has placed all of the file contents into the first cell (A) of the Workbook, instead of separating the file contents into individual cells (as typically expected).

However when using the Import Wizard, steps can be taken to ensure the data from the CSV is properly imported.

To start, users can review the contents of the CSV ‘preview’ and determine if there is a better way to represent the data types other than delimited.

The import wizard allows you to clearly define what the delimiters are. In this example, it’s necessary to specify that the pipe ( | ) is the delimiter, and the text qualifier ( “ ).

The last step is important to consider when there are mixed data types in the file. For example, the handling of datetime values may be more appropriate to change into a region specific date format. Or, leave the datetimes unaltered from the source file and preserve as ‘Text’. Each column of the reported data should be reviewed to ensure the proper data format is applied.

When the import is complete, the remaining view in Excel is more appropriate than simply opening it from the file browser. Not only have the CSV contents been properly reviewed during the import process, but correct data types have been set and applied to the workbook, making the data much easier to review.

  • No labels