Jun 29 2016

Easiest way to import CSV files

Greetings to all readers. In this blog, let us discuss a very commonly faced issue: importing data into a database through a program. Programmers and client organisations equally dread uploading legacy data into the database while installing new software systems or modifying existing ones. Even in day-to-day applications, situations arise where we need to write bulk data to the database. Everyone is interested in easier methods of exporting and importing data, and one commonly used method is csv files.

A csv file contains data that is either exported from a database or to be inserted into a database. The first row contains the column names and the remaining rows contain the data. While uploading the file, we have to validate the file format on the client side as well as the server side. After uploading, we have to read the file and split it into the values that have to be stored in the database. To make these tasks easier, I have started using either DataTables or entities.
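For example, a csv file for the Person records used later in this post might look like this (the values are illustrative):

```
FirstName,LastName,Company,Email,PhoneNo,Website
John,Doe,Acme,john.doe@acme.example,555-0100,acme.example
Jane,Smith,Globex,jane.smith@globex.example,555-0101,globex.example
```

The first row supplies the column names, and every following row supplies one record in that order.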

Let us discuss how to import csv files easily

Using DataTables
Using Model

Both methods have advantages and disadvantages. To insert data into the database, we need to first split the file into rows and then split each row into columns. Then we can write code to insert the values into the database.
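As a minimal sketch of that row-and-column splitting (assuming a simple csv whose fields contain no quoted commas, which is the same assumption the code later in this post makes):

```csharp
// Requires: System.IO
// Read the whole file into rows, then split each row into columns.
// Note: a plain Split(',') does not handle quoted fields that contain commas.
string[] lines = System.IO.File.ReadAllLines(path);
foreach (var line in lines)
{
    string[] columns = line.Split(',');
    // map the column values and insert them into the database here
}
```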

In both methods, most of the uploading steps are the same. First I will explain the common code. Sample code for file upload:


<div class="file-field input-field">
  <div class="btn blue waves-effect waves-light">
    <span>Files</span>
    <input type="file" id="csvFile" accept=".csv">
  </div>
  <div class="file-path-wrapper">
    <input class="file-path validate" type="text">
  </div>
  <span id="fileUploadErrorMsg"></span>
</div>

This code shows a button labelled Files. On clicking it, a window opens from which we can choose a file. Because we give accept=".csv", the browse window shows only csv files.

However, the accept attribute only filters the file picker; the user can still switch it to show all files and select any file type, so we cannot rely on it alone.

We still need to check the format. The File object we append to the FormData exposes its name and MIME type; check these to ensure the file is a .csv. If so, continue; otherwise display a message asking the user to select a csv file.

Code to set client side validation:

var formData = new FormData();
var file = document.getElementById("csvFile").files[0];
formData.append("Doc", file);

if (file && file.name.toLowerCase().endsWith(".csv")) {
    // upload api call
} else {
    alert('Please select a .csv file');
}

This code shows an alert if the selected file is not a csv. If it is a csv file, it calls the upload API, which contains the file upload logic; the logic for reading the csv file and inserting it into the db follows below.

Controller (for file upload)

public JsonResult UploadCsvFile(UploadViewModel vm)
{
    return TryExecute(() =>
    {
        var hpf = vm.Doc;

        // Get the extension of the uploaded file
        string fileExtension = Path.GetExtension(hpf.FileName).ToLower();

        // Continue only if the extension is .csv
        if (fileExtension == ".csv")
        {
            string path = ""; // specify your folder path
            var fileName = Guid.NewGuid() + hpf.FileName;
            var savedFileName = Path.Combine(path, fileName);
            hpf.SaveAs(savedFileName);
            return savedFileName;
        }

        throw new Exception("Please upload a csv file.");
    });
}

The above code checks whether the uploaded file is a csv. If not, it throws an exception; otherwise it saves the file to the specified path and returns the saved file name.

UploadViewModel (which is used by the above method)

UploadViewModel contains the following values.

public class UploadViewModel
{
    public HttpPostedFileBase Doc { get; set; }
}

HttpPostedFileBase comes from System.Web and contains the following members:

public virtual int ContentLength { get; }
public virtual string ContentType { get; }
public virtual string FileName { get; }
public virtual Stream InputStream { get; }
public virtual void SaveAs(string filename);
Now let's focus on the main part of this blog: the logic of reading and inserting data. After uploading the csv, we can read the file and insert its contents into the database. First, let us look at inserting data using a DataTable.

DataTable:

public DataTable CsvReader(string path)
{
    // Requires: System.Data, System.IO, System.Linq
    var oStreamReader = new StreamReader(path);
    DataTable oDataTable = null;
    var rowCount = 0;
    string[] columnNames = null;
    string[] oStreamDataValues = null;

    // Read the stream line by line until the end
    while (!oStreamReader.EndOfStream)
    {
        var readLine = oStreamReader.ReadLine();
        if (readLine == null) continue;

        var oStreamRowData = readLine.Trim();
        if (oStreamRowData.Length <= 0) continue;

        oStreamDataValues = oStreamRowData.Split(',');

        // Because the first row contains the column names, we populate
        // the columns by reading the first row; rowCount == 0 is true only once
        if (rowCount == 0)
        {
            rowCount = 1;
            columnNames = oStreamDataValues;
            oDataTable = new DataTable();

            // Loop through all the column names
            foreach (DataColumn oDataColumn in columnNames.Select(
                csvColumn => new DataColumn(csvColumn.ToUpper(), typeof(string))))
            {
                // Set string.Empty as the default value of the newly created column
                oDataColumn.DefaultValue = string.Empty;

                // Add the newly created column to the table
                oDataTable.Columns.Add(oDataColumn);
            }
        }
        else
        {
            if (oDataTable == null) continue;

            // Create a new DataRow with the same schema as oDataTable
            DataRow oDataRow = oDataTable.NewRow();

            // Copy each value into the column with the matching name;
            // missing trailing values fall back to the empty string
            for (int i = 0; i < columnNames.Length; i++)
            {
                oDataRow[columnNames[i]] =
                    i < oStreamDataValues.Length ? oStreamDataValues[i] : string.Empty;
            }

            // Add the newly created row with its data to oDataTable
            oDataTable.Rows.Add(oDataRow);
        }
    }

    // Close the oStreamReader object and release the resources it used
    oStreamReader.Close();
    oStreamReader.Dispose();

    return oDataTable;
}

While using a DataTable, the order of the columns in the csv file and in the database table need not match. Data can be entered in any order into the csv file; the order depends only on the first row, which contains the column names, and all following rows follow that order.
The disadvantage of using a DataTable is that if we misspell a column name in the csv file, no matching column name can be found in the database, and inserting the value will fail. So we need to use the same column names in the csv file as in the database.

Now let us look at inserting data using Model.


public bool CsvImportForLead(string path)
{
    bool status = false;
    string[] lines = System.IO.File.ReadAllLines(path);

    // Skip the header row (index 0) and read each data row
    for (var i = 1; i < lines.Length; i++)
    {
        var columns = lines[i].Split(',');
        var person = new Person
        {
            FirstName = columns[0],
            LastName = columns[1],
            Company = columns[2],
            Email = columns[3],
            PhoneNo = columns[4],
            Website = columns[5]
        };

        //write create code
    }

    _unitOfWork.Commit();
    status = true;
    return status;
}

Here, the main advantage is that it does not depend on the column names in the csv file. But it also has a disadvantage: the order of the columns in the csv file and in the code must match. Otherwise the code will break and the upload will fail. If we provide the data in the proper order, it executes without issues.
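One way to soften this failure mode is to validate each row before mapping it. The helper below is a hypothetical sketch, not part of the original code; it assumes the six-column Person layout from the example above:

```csharp
// A hypothetical guard that checks a csv row before mapping it to a Person,
// instead of letting a wrong column count or order break the whole upload.
private static bool IsValidPersonRow(string[] columns)
{
    // Person expects exactly six columns:
    // FirstName, LastName, Company, Email, PhoneNo, Website
    if (columns.Length != 6)
        return false;

    // A cheap sanity check that the fourth column looks like an email,
    // which catches files whose columns are in a different order.
    return columns[3].Contains("@");
}
```

Inside the import loop, rows failing this check can be skipped, or their line numbers collected and reported back to the user.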

If we use the Model approach, we assign values by position, so we rely on assumptions about column order. If we use a DataTable, we need to use the column names carefully. With a DataTable, we create a DataTable object, assign values to it, and add it to the database.

I hope this blog will be useful to readers working with csv files.
