< Browse > Home / Programming / Blog article: Tutorial Reading A Text File Using LINQ

| RSS

Tutorial Reading A Text File Using LINQ

August 13th, 2008 | 6 Comments | Posted in Programming

Introduction

At times us developers have to deal with delimited text files in our applications. Such files have been around since yonks and I often come across data import/export tasks where delimited files are used. Till now the common way in .NET has been to read each line and then extract data using some sort of creative string functions within for loops. But there is another way by using LINQ. In this tutorial I will show you how to use LINQ to read such data. By the end of tutorial you will appreciate how easy and logical it is to use LINQ for reading data from delimited text files.

Sample Data

I will use a sample file which contains a data about customers. When working with text files we must know the number of columns and the data contained in each column. Below is a list of columns in their right order for our file.

  1. First Name
  2. Last Name
  3. Job Title
  4. City
  5. Country

The file itself will contain this data. I have pulled this out of Employees table in Northwind database.

image

Reading Data

Before we start reading our csv file we will create a class which will hold a record we will read from our csv file. For this I will create a customer class which looks like this. I know I pulled data from Employees table so just imagine that data above is customer data. Employees can also be customers ;)

public class Customer
{
    string Firstname { get; set; }
    string Lastname { get; set; }
    string JobTitle { get; set; }
    string City { get; set; }
    string Country { get; set; }
}
Reading Entire File

Now we are ready to read data from our file using LINQ. Using this code we can read the entire file. I am also using a foreach statement to output the results.

var query =
        from line in File.ReadAllLines(filePath)
        let customerRecord = line.Split(',')
        select new Customer()
            {
                Firstname = customerRecord[0],
                Lastname = customerRecord[1],
                JobTitle = customerRecord[2],
                City = customerRecord[3],
                Country = customerRecord[4]
            };
foreach (var item in query)
{
    Console.WriteLine("{0}, {1}, {2}, {3}, {4}",
        item.Firstname, item.Lastname, item.JobTitle, item.City, item.Country);
}

File.ReadAllLines() returns an array of lines and we then use the split function of array to split it by a comma. Its just that simple.

Reading selected records

We can use this code to read all customers who live in UK.

var query =
        from c in
            (from line in File.ReadAllLines(filePath)
             let customerRecord = line.Split(',')
             select new Customer()
                 {
                     Firstname = customerRecord[0],
                     Lastname = customerRecord[1],
                     JobTitle = customerRecord[2],
                     City = customerRecord[3],
                     Country = customerRecord[4]
                 })

        where c.Country == "UK"
        select c;

This code can be used to read customers who have sales in their job title.

var query =
        from c in
            (from line in File.ReadAllLines(filePath)
             let customerRecord = line.Split(',')
             select new Customer()
                 {
                     Firstname = customerRecord[0],
                     Lastname = customerRecord[1],
                     JobTitle = customerRecord[2],
                     City = customerRecord[3],
                     Country = customerRecord[4]
                 })

        where c.JobTitle.Contains("Sales")
        select c;

I am sure that above queries can be polished by using a bit more syntax sugar but I am just too excited to see LINQ working with a csv file.

Conclusion

LINQ makes it extremely simple to work with delimited text files. Once we have the records from a csv file in an object we can use all the power of LINQ to query our hearts out on such files. This functionality is available to us via LINQ to objects and we do not need another flavour of LINQ to achieve this. Stay tuned for more posts on LINQ.

kick it on DotNetKicks.com

Technorati Tags: ,,






Leave a Reply 4333 views, 19 so far today |
Tags: , ,
Follow Discussion

6 Responses to “Tutorial Reading A Text File Using LINQ”

  1. Mike Borozdin Says:

    Nice :-)!

  2. Deepak Says:

    Thanks Mike

  3. Mike Borozdin Says:

    I’m really fascinated by the fact that LINQ can be used in a great variety of scenarios.

    I remember I was a bit confused when I saw it for the first time. I was primary frustrated by the fact that the first operator was “from” and “select” was the last one. But then I understood the reasons.

  4. Deepak Says:

    Mike,

    The best thing about LINQ is that it comes with such a small learning curve for C# and VB.NET programmers. And if you have written any SQL then after writing few LINQ queries you feel like you knew it all the time.

    In my case I feel that after a long time something (LINQ) excites me so much that I want to learn every little detail about it.

  5. JabberBoxer Says:

    Just wanted to share the code we used to extend the functionality of your read CSV, this will now read your CSV into multiple objects.

    ‘~ Example of CSV file
    ‘~ “InvoiceNumber”,”InvoiceDate”, “FirstName”, “LastName”
    ‘~ “11111″,”9/1/2008″, “Jerrame”, “Hertz”
    ‘~ “22222″,”9/2/2008″, “John”, “Doe”
    ‘~ “33333″,”9/3/2008″, “Jane”, “Doe”

    Imports System.IO

    Module Module1
    Sub Main()
    ReadCSV(”c:\test.csv”)
    End Sub
    End Module

    Module LINQ2CSV
    Public Sub ReadCSV(ByVal FilePath As String)
    ‘~ Load the values from the CSV file.
    Dim Orders = From line In File.ReadAllLines(FilePath) _
    Let OrderRecord = line.Split(”,”) _
    Select New Invoice() _
    With {.Order = New Order( _
    TrimQoutes(OrderRecord(0)), _
    TrimQoutes(OrderRecord(1))), _
    .Customer = New Customer( _
    TrimQoutes(OrderRecord(0)), _
    TrimQoutes(OrderRecord(2)), _
    TrimQoutes(OrderRecord(3)))}

    End Sub

    ‘~ Remove qoutes from string. May be able to avoid this if I use the
    ‘~ parse object instead of the the file object.
    Private Function TrimQoutes(ByVal value As String) As String
    Return value.Trim.Replace(”"”", “”)
    End Function

    End Module
    ‘~ Order object.
    Public Class Invoice
    Public [Order] As Order
    Public [Customer] As Customer
    End Class

    Public Class Order
    Public InvoiceNumber As String
    Public InvoiceDate As String

    Public Sub New(ByVal InvoiceNumber As String, ByVal InvoiceDate As String)
    Me.InvoiceNumber = InvoiceNumber
    Me.InvoiceDate = InvoiceDate
    End Sub
    End Class

    Public Class Customer
    Public InvoiceNumber As String
    Public FirstName As String
    Public LastName As String

    Public Sub New(ByVal InvoiceNumber As String, ByVal FirstName As String, ByVal LastName As String)
    Me.InvoiceNumber = InvoiceNumber
    Me.FirstName = FirstName
    Me.LastName = LastName
    End Sub

    End Class

Trackbacks

  1. Remove Duplicate Lines From A Text File Using LINQ | One .Net Way  

Leave a Reply