Edit: 2011/04/28 As several people have pointed out, the code in tutorial does not create a true CSV parser. We did, however, publish a followup tutorial that is a fully functional CSV parser.
And here it is: C# Tutorial - Using The Built In OLEDB CSV Parser
So you are sitting around and you somehow have 100 Comma Separated Value files (CSV) and you are not quite sure exactly what the best way to read them is. Well, if you are using Visual Studio and C#, you are in quite a bit of luck, because you can read a CSV file quite easily. With one very small function you can spit out a list of values, separated conveniently by rows and columns. Then you can take this list and use it however you want, perhaps in a DataTable or GridView object.
The parser we are going to build today is going to be extremely simple, and will in fact break on more complicated CSV files (files that have commas actually in the data, etc...). But for most CSV files, this will work fine - and look for a tutorial in the near future about building a parser that can easily deal with even the most convoluted of CSV files.
To start off, you need to open up Visual Studio and start a new C# application project, so go ahead and do that, naming the project whatever you want. Once your project is up and ready, you need to find a place to build and call your parser function. If you right click on your Form1.cs, then go to 'View Code' you will get your form1's code. Inside the main Form1 : Form class, under your public Form1() definition is the perfect place for your function for now. Later on you can move it to somewhere more permanent, but for now we will get the function working.
Sadly, one of the namespaces we will be using is not declared by default in the standard 'using' statements at the top of our file. But all you have to do is add it below all the others:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using System.IO; //System.IO is not used by default
Now we can tear up some serious code. The first step is to declare our parser function. It will look something like:
public List<string[]> parseCSV(string path)
{
}
This is a pretty simple function, which will take in a string that represents the path of the CSV file and spit out a List of string arrays. Now you may be asking why not just use a string array of string arrays (string[][])? Well, adding elements to an array is not exactly efficient, but a list can be added to, subtracted from, and is just generally a lot more flexible.
The next thing we need to do is declare our return variable, which is really just one line. So inside the function, as the first line, we have:
List<string[]> parsedData = new List<string[]>();
This is just declaring a List of string arrays that will hold our file
information as we read each line. Now the cool thing is that the
System.IO namespace has this neat class called StreamReader, which can
open a text based file and read it line by line. This gives us the
advantage of just calling a method that reads the file line by line
rather than byte by byte. StreamReader is extremely handy for reading
text files and is a perfect candidate for us in this case.
We are gong to declare our StreamReader with a using statement so it will be disposed once it leaves scope, and when it is disposed it will be closed automatically. So declaring the new StreamReader object will look something like:
using (StreamReader readFile = new StreamReader(path))
{
}
Take notice that the actual declaration is inside the using statement. Inside this statement we will be doing everything involving reading the file and building our list of string arrays. Now all a comma separated values file is is exactly what you would think - a file full of values separated by commas. Each line really corresponds to a row of data, so all we have to do is read the file line by line, then separate the values. Since the StreamReader Class can read a file line by line, all we really need to do is take the line and split it. But first we have to declare some variables, inside the using block of course.
We will need two variables, one to hold the line as it is read and an
array to hold the separated values. We will call these line and row:
using (StreamReader readFile = new StreamReader(path))
{
string line;
string[] row;
Next we have to read the file line by line, which can be done with a very simple while statement. We will be reading the file until the current line is null, which will be the case when there are no more lines to read. To do this we set our line variable to our currently read line, then when the line is null, stop reading the file. It will look like so:
while ((line = readFile.ReadLine()) != null)
{
}
A very simple while loop that runs through the file until you reach a
line that is empty. *Take note that a line is not null if it is space,
newline, or the like. A line is only null if there is truly nothing
there.* Inside this loop, all we need to do is split each line at the
commas, then add the resulting array to our list. Luckily there are many
times you need to split a string, so the basic string class has a method
to do just this. After we split the line, adding it to the list is just
as simple. We call list.add(). So with two lines of code we can do
what we need to. After our additions our while statement will look
like:
while ((line = readFile.ReadLine()) != null)
{
row = line.Split(',');
parsedData.Add(row);
}
Simple yet effective. So simple in fact, that you really don't have to read in just comma separated files, but any file separated by a standard character can be read. all you have to do is change the split() call to whatever character is splitting the file. As mentioned above, the first line sets our row variable to the values of our split string, and the second line adds that string array to our list. Not difficult to understand at all.
The end of the while loop actually means the end of our using block as well. After the using block we have a completely filled list of string arrays, which represent rows of data in our CSV file. All we need to do now is put the whole thing in a Try-Catch block, which will catch any errors we may get when attempting to open or read the file.
We don't actually need anything fancy, in fact we will just catch any exception we get in the using block (since there are a bunch of different kinds that could be thrown). So our final function will look something like:
public List<string[]> parseCSV(string path)
{
List<string[]> parsedData = new List<string[]>();
try
{
using (StreamReader readFile = new StreamReader(path))
{
string line;
string[] row;
while ((line = readFile.ReadLine()) != null)
{
row = line.Split(',');
parsedData.Add(row);
}
}
}
catch (Exception e)
{
MessageBox.Show(e.Message);
}
return parsedData;
}
Notice that we just take the message from the exception and display it with the standard MessageBox class. This will work, and our function will actually just return an empty list on any error, which means that our code actually doesn't break, we just don't get any data. So after we return our parseData list, whether filled or not, our function ends.
A small function that is easy to understand and even easier to build. Even better, since it returns a basic list object, you can use the data returned to do anything from fill a grid to making complex calculations. You can also make this function read any type of separated file, just change the separator in the split statement.

If you would like a full Visual Studio Solution for this tutorial, one can be found here. I hope this tutorial was informative and most of all useful, and I'll be back soon with a tutorial on the more complicated version of this parser (the one that can deal with complex CSV files).
Source Files:
Very cool! Easy to follow and useful. thanks
http://oscar.angress.de/2011/02/17/csv-converter/
Thanks , finally I found what i look for !
But what if one of the fields has a comma in it? *bug*
In the article itself, it says that this is just a simple CSV parser:
Then it's not a csv parser. It's only something that splits lines based on a character. If I build an xml parser that only reads nodes and not attributes, then I can't call it an xml parser can I?
Sorry man, you're only halfway there. Calling something incomplete 'complete' doesn't make it so.
Try renaming your page to "comma-parser" or something.
CSV stands for "Comma Seperated Values" ... idiot! What a moronic comment from somebody try to sound smart. Haha
This code looks good... I read the comments it is not good to comment like this.. I did not see any CSV with quotes...
Can you please tell me how do I refer CSV file with indirect link.
We are using different CSV files. So instead of having hard code can we refer csv through the other directry or other files like xml.
Thanks in advance.
Every one could have done what you have done here. If people are searchig for a csv parser it is just to find the one that can address some complicated csv file with commas in the data, etc.
But thanks for your post.
For those with more complex CSV parsing needs, check out this post: Using The Built In OLEDB CSV Parser
with OLEDB we lose the leading zeros
so, by the comments above, the OLE DB CSV Parser is NOT a CSV Parser, because it does have bugs... :P
When people are stupid, they just are.
Does anybody have done work on complicated CSV parsers. If yaa, please tell me.
What happen , if there is commas and spaces in between ?
Shweta, check out the Using The Built In OLEDB CSV Parser tutorial.
I have built one, for the code go to http://programmersfirststep.orgfree.com/CommaSeperatedValue.cs
my post above,is not a tutorial, but it is the source code, so i assume that you can pretty much make out the information (it does not use the split method, or that other tutorial listed by the author, it is a csv parser that only has one current flaw.... when the line ends and is in a quote it doesnt continue it as a token, but im working on that
This was very informative especially to the ones just starting to learn parsing a csv file...
keep up the good work!
More Power...
very good and helpfull tutorial i just started to parse CSV files and this will help me more .
thanks
How convert CSV to XML file using ASP.NET and VISUALSTUDIO 2008 1. Load my file in a folder in my WebSite 2. Build the XML file (from CSV file, of course)!!! Thanks
?1. Load the CSV into a DataTable using this tutorial.
?2. Serialize the DataTable to XML using .NET's built in XML serialization.
Rich,
Nice work and I like it all except I think you should be using a
in preference of
-
and strings are so limiting...
/
:-)
Simple and effective. But, if you're csv data has comma's embedded in values, this could break.
Here is a one at codeplex that can handle this. http://commonlibrarynet.codeplex.com/
Hello.. Thanks for the good work on this great helpful tutorial. I am still a newbie for C#, as this is the first programming i which to learn. Now still try hard to understand it and play with it.
I need some help or advise, for the DataGrid, is it possible to edit the name of the "column1" ? If can, den how am i going to do that?
Thanks in advance.
:)
This helped me kick off a project for work. Where do I send the check? Thanks for explaining the steps in English. This makes it a great learning exercise. I look forward to finding more solutions here.
Thanks! Really this helped me to start my dream desktop application flying. I was trying some other menthod but everytime I found some obstruction. Now it helped me to anticipate the hurdle.
Good tutorial, one question though.
The csv film I'm reading contains "" which then get displayed as ??
Does anyone know how I can get round this problem?
Many thanks,
Scott.
I get following error.
Cannot serialize the DataTable. DataTable name is not set.
when i tries to do
DataTable.WriteXml()
Good Note
the code is working for me..
thanks
hi
I get the below error when I run the code on a Live server as opposed to my local machine... any ideas? thanks
System.IO.FileNotFoundException: Could not find file 'c:\windows\system32\inetsrv\Test.csv'. File name: 'c:\windows\system32\inetsrv\Test.csv' at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath) at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy) at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options, String msgPath, Boolean bFromProxy) at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options) at System.IO.StreamReader..ctor(String path, Encoding encoding, Boolean detectEncodingFromByteOrderMarks, Int32 bufferSize) at System.IO.StreamReader..ctor(String path, Boolean detectEncodingFromByteOrderMarks) at _Default.parseCSV(String path)
Does the file "c:\windows\system32\inetsrv\Test.csv" exist on the remote server? When you run it on a remote machine, the code is not looking for that file on your local computer.
Where do you specify the text file to be parsed? And how do you bind parsedData to a datagrid? Thanks.
The source code for the working example is provided at the bottom of the post. It should have answers for both of your questions.
You code seems good as far as it goes. But it's possible for values to have embedded commas. These values should be wrapped in double quotes. In addition, it's possible for values to have embedded double quotes. So some additional steps are required to correctly part CSV file.
My code to both read and write CSV files can be viewed at http://www.blackbeltcoder.com/Articles/files/reading-and-writing-csv-files-in-c.
Great Tutorial. Thanks You!!!
Love C#, and your tutorial. This will help me create a Service that reads parameter output files from a SMT Solder Reflow oven at work, then move that data to a SQL Server table as a record of process control. To Anonymous - The file cannot be seen unless you store it in the C:\ drive of your Server. Try changing path in code to read: \\[LocalPCname]\c\$\windows\system32\inetsrv\Test.csv then run on server (your server logon will, of course, need to have permissions to look at that folder on your local PC, with Sharing) or place the CSV file on the Server's C: drive
Thanks mate, this is just what I was looking for.
There“s one mistake in your example code.
The line "dataGridView1.DataSource = newTable";
comes direct after the line newTable.Rows.Add(row);
otherwise the DataGridView will be empty.
/// /// Parses CSV data containing Quoted cells with double quote escape for quotes (Excel "Save as CSV" standard format), "cell,1",cell 2,"Cell ""3""" /// /// CSV data in string format /// public static List> ParseCSVData(String data) { List> csvData = new List>(); // parse the CSV data
char prevChr = ' ';
Boolean inQuotes = false; StringBuilder cellBuilder = new StringBuilder(); List rowBuilder = new List();
foreach (char chr in data.ToCharArray()) { if (chr == '"' && inQuotes) // cell quotes ended { inQuotes = false; } else if (chr == '"' && !inQuotes && prevChr != chr) // cell encapsulated by quotes { inQuotes = true; } else { if (chr == '\r' && !inQuotes) // secondary newline character (ignored) { // do nothing } if (chr == '\n' && !inQuotes) // new line character { rowBuilder.Add(cellBuilder.ToString()); cellBuilder = new StringBuilder(); csvData.Add(rowBuilder); rowBuilder = new List(); } if (chr == ',' && !inQuotes) // cell delimiter { rowBuilder.Add(cellBuilder.ToString()); cellBuilder = new StringBuilder(); } else { cellBuilder.Append(chr); } } prevChr = chr; }
return csvData; }
Thanks for the code, very nicely explained. Everybody should learn from you how to explain code!
Excellent article, great intro to many key concepts. For those folks complaining about needing to remove embdded commas inside quoted fields, here is a method that removes the comma and surrounding quotes. It could be improved - it only expects one field in the line that has quotes with embedded commas. [language] /// /// The Description field in the input line from the CSV file sometimes contains /// an embedded comma. This must be removed in order to easily parse the line /// using split(","), otherwise the Description field gets split into two fields. /// This method removes the embedded comma and surrounding quotes. /// /// The line that may have a Description field with an embedded comma. /// string: the input line, cleaned of the embedded comma and surrounding quotes. private string FixDescriptionField(string line) { const int START_INDEX_ZERO = 0; const int NUMBER_OF_REMOVED_CHARS = 3;
string CleanLine = line;
// Find index of first quote, if any int firstQuoteIdx = line.IndexOf('\"', START_INDEX_ZERO);
if( firstQuoteIdx > 0 ) { // Find the column index of the closing quote int secondQuoteIdx = line.IndexOf('\"', firstQuoteIdx+1);
// If there was a closing quote and sanity check passes... if (secondQuoteIdx != -1 && secondQuoteIdx > firstQuoteIdx) { // Must be a line that has a quoted description and embedded comma. string Description = line.Substring(firstQuoteIdx + 1, secondQuoteIdx - firstQuoteIdx - 1);
// Replace the embedded comma with a space Description = Description.Replace(',', ' ');
// Rebuild the line without the quotes or embedded comma CleanLine = line.Substring(0, firstQuoteIdx); CleanLine += Description;
// Figure out the new line length after removing the 2 quotes and 1 comma. int FinalCleanLineLength = line.Length - NUMBER_OF_REMOVED_CHARS; CleanLine += line.Substring(secondQuoteIdx + 1, FinalCleanLineLength - CleanLine.Length); } }
return CleanLine; } [/language]
Thanks found this a very useful starting ground. Really well explained. Look forward to more.
This method does not pick leading zeros... gochhhaaaa :(
/// /// This method read provided CSV file and return a datatable. /// /// The location to pick the file. private DataTable ReadCSV(string FileName) {
DataTable csvDataTable = new DataTable(); try { //no try/catch - add these in yourselfs or let exception happen String[] csvData = File.ReadAllLines(FileName);
//if no data if (csvData.Length == 0) { return csvDataTable; }
String[] headings = csvData[0].Split(','); //for each heading for (int i = 0; i \< headings.Length; i++) { ////replace spaces with underscores for column names //headings[i] = headings[i].Replace(" ", "_");
//add a column for each heading csvDataTable.Columns.Add(headings[i], typeof(string)); }
//populate the DataTable for (int i = 1; i \< csvData.Length; i++) { //create new rows DataRow row = csvDataTable.NewRow();
for (int j = 0; j \< headings.Length; j++) { //fill them row[j] = csvData[i].Split(',')[j]; }
//add rows to over DataTable csvDataTable.Rows.Add(row); } } catch (Exception ex) { MessageBox.Show( ex.Message.ToString()); } //return the CSV DataTable return csvDataTable;
}
Perfect for beginners, best tutorials out there.
just make a note that using OLEDB to connect and read from CSV will lose the leading zeros if the values have any
Great tutorial thanks
If you are concerned about commas, try using the pike | as the separator. Murach's VB 2010 uses this method, and save the file with a dat extension. The result is the similar.
this is crap
What a piece of shit code, fucking retard, put useful code or don't waste people' time
I have just read the code and all the feedback the good, the bad, the pointless, and the frankly abusive comments that are clearly being posted by a bunch of ungrateful retards. As a highly experienced and very well paid systems development manager, I'm disgusted with the way such retards moan and groan about what is essentially a good and useful post. These abusive posts bo not inspire us talented professionals to want to help any of you. So i would ask that all the 'anonymous' little twats out there get back to surfing porn and stop littering these forums full of thier pathetic tittle tattle!
If Suppose CSV File Column field String Contains charter , then While splitting we will get an Error?Any Solution.
eg :Address column has "RajRoad,Dharwad";
C# Tutorial - Using The Built In OLEDB CSV Parser
great work! keep up the good work!
Works great! Thanks!
Great little routine which works like a dream, yes I have commas between my string delimiters which means adding a bit more code but this post has pointed me in the right direction.
Nice, but this is missing a few details. What happens if a field contains an embedded comma? In that case, this code will fail.
The way programs like Excel handle this is to wrap fields with embedded commas inside double quotes. This logic also requires special handling for fields that contain double quotes (the are duplicated: "").
If you want to see my code for dealing with this, I've posted it at http://www.blackbeltcoder.com/Articles/files/reading-and-writing-csv-files-in-c.
This guy know how to code. Check his link it's definatly a more robust CSV parser :) The original article is very basic for starters but is not solid enough for real life applications.
And others posted code is definatly weird. Thanks Jonathan
http://www.blackbeltcoder.com/Articles/files/reading-and-writing-csv-files-in-c
I forgot, if you ever read this... Setting the encoding for StreamReader file are also a nice addon because sometimes files are not encoded in the format you think :)
Even simpler and more effective, use the one that microsoft provide you.
http://msdn.microsoft.com/en-us/library/microsoft.visualbasic.fileio.textfieldparser.aspx